
Convert API Documentation

Document processing, OCR, and audio transcription API for extracting text from PDFs, images, documents, and audio files in 60+ languages.

Quick Reference
Base URL: https://api.ainoflow.io
Authentication header: Authorization: Bearer your_api_key_here

Supported File Types

Input Formats

Images and PDFs (OCR): PDF, JPEG, PNG, TIFF, BMP, WebP, GIF

Documents: Word (.doc, .docx), RTF, ODT, TXT

Spreadsheets: Excel (.xls, .xlsx), ODS

Presentations: PowerPoint (.ppt, .pptx), ODP

Audio: WAV, MP3, M4A, MP4, WebM, OGG, FLAC, AAC, Opus

Output Formats

text - Plain text extraction or transcription

pdf - Searchable PDF with OCR layer

Main Endpoints

Submit File for Processing
POST /api/v1/convert/submit-file
Upload a document or audio file directly using multipart/form-data.

Request Body (multipart/form-data):

file (required) - Binary file data to process
languages (required) - Comma-separated language codes (e.g., 'en,de,fr')
outputs (required) - Output formats: text, pdf (comma-separated)
models (optional) - Processing model: auto, tesseract, paddleocr, or a Whisper variant (whispertiny, whisperbase, whispersmall); default: auto
ocr (optional) - OCR control: auto, force, skip (default: auto)
webhookUrl (optional) - URL to receive completion notification
reference (optional) - Your custom reference ID for tracking
jobExpiryInMinutes (optional) - Job expiration time in minutes (default: 1440)
response (optional) - Response mode: polling, direct, webhook, persisted (default: polling)
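The multipart request described above can be assembled with the Python standard library alone. The following is a minimal sketch, not an official client: the helper names `build_multipart` and `build_submit_file_request` are hypothetical, the file part is sent as `application/octet-stream` as an assumption, and the request is only built, not sent.

```python
import io
import urllib.request
import uuid

BASE_URL = "https://api.ainoflow.io"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file part as multipart/form-data.

    Returns (body_bytes, content_type_header_value). Hypothetical helper;
    it does not escape quotes in field names or filenames.
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    # Assumption: a generic content type is acceptable for the file part.
    buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def build_submit_file_request(api_key, filename, file_bytes, languages, outputs, **optional):
    """Build (but do not send) a POST to /api/v1/convert/submit-file."""
    fields = {"languages": languages, "outputs": outputs, **optional}
    body, content_type = build_multipart(fields, "file", filename, file_bytes)
    return urllib.request.Request(
        BASE_URL + "/api/v1/convert/submit-file",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload; optional fields such as `response="polling"` or `reference="doc123"` go in as keyword arguments.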

Response:

{
  "id": "uuid",
  "status": "processing",          // created, accepted, processing, completed, failed, cancelled
  "reference": "doc123",            // Your reference ID
  "models": "auto",                 // Processing model used
  "processingTimeInSeconds": 0.5,
  "responseMode": "polling",        // How results are returned
  
  "content": [{                     // Optional - Direct mode only
    "text": "string",
    "pdf": "string"
  }],
  "files": [{                       // Optional - Persisted mode only
    "models": "tesseract",
    "text": {
      "url": "https://...",
      "expiration": "2024-01-16T10:30:05Z"
    },
    "pdf": {
      "url": "https://...",
      "expiration": "2024-01-16T10:30:05Z"
    }
  }],
  
  "error": {                        // Optional - If processing failed
    "message": "string"
  }
}
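Because `content`, `files`, and `error` are all optional, client code has to check which of them is present. A minimal sketch of that branching follows; the function name `summarize_submit_response` and its return shape are illustrative assumptions, and only the field names come from the response shown above.

```python
def summarize_submit_response(job):
    """Return a (kind, payload) pair describing what a submit response carries.

    `job` is the decoded JSON response as a dict. Hypothetical helper.
    """
    if job.get("error"):
        return ("error", job["error"].get("message", ""))
    if job.get("content"):
        # Direct mode: extracted text is inline in the response.
        return ("content", [item.get("text") for item in job["content"]])
    if job.get("files"):
        # Persisted mode: results sit behind download URLs.
        urls = []
        for entry in job["files"]:
            for fmt in ("text", "pdf"):
                if fmt in entry:
                    urls.append((fmt, entry[fmt]["url"]))
        return ("files", urls)
    # Polling or webhook mode: only the job id is known so far.
    return ("pending", job["id"])
```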
Submit Base64 Document
POST /api/v1/convert/submit-base64
Submit a document or audio file using base64-encoded content.

Request Body (application/json):

documentBase64 (required) - Base64-encoded file content
filename (optional) - Original filename for content type detection
languages (required) - Comma-separated language codes
outputs (required) - Output formats: text, pdf
models (optional) - Processing model (default: auto)
response (optional) - Response mode (default: polling)

Example:

{
  "documentBase64": "JVBERi0xLjQK...",
  "filename": "document.pdf",
  "languages": "en,de",
  "outputs": "text,pdf"
}
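The `documentBase64` value is the standard base64 encoding of the raw file bytes. A small sketch of producing the request body, assuming a hypothetical helper name `build_base64_payload`:

```python
import base64
import json


def build_base64_payload(file_bytes, filename, languages, outputs):
    """Build the JSON-serializable body for /api/v1/convert/submit-base64."""
    return {
        "documentBase64": base64.b64encode(file_bytes).decode("ascii"),
        "filename": filename,
        "languages": languages,
        "outputs": outputs,
    }


# The 9-byte PDF header "%PDF-1.4\n" encodes to "JVBERi0xLjQK",
# the same prefix that appears in the example above.
payload = build_base64_payload(b"%PDF-1.4\n", "document.pdf", "en,de", "text,pdf")
body = json.dumps(payload)
```

In practice `file_bytes` would come from reading the file in binary mode, for example `open(path, "rb").read()`.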
Submit External URL
POST /api/v1/convert/submit-url
Submit a document or audio file from an external URL.

Request Body (application/json):

sourceUrl (required) - URL to download the file from
languages (required) - Comma-separated language codes
outputs (required) - Output formats: text, pdf
models (optional) - Processing model (default: auto)
response (optional) - Response mode (default: polling)

Example:

{
  "sourceUrl": "https://example.com/document.pdf",
  "languages": "en,de",
  "outputs": "text,pdf"
}
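The same JSON body can be wrapped in an authenticated request from Python. This is a sketch using only `urllib`; `build_submit_url_request` is a hypothetical name, and the request is built without being sent.

```python
import json
import urllib.request

BASE_URL = "https://api.ainoflow.io"


def build_submit_url_request(api_key, source_url, languages, outputs, **optional):
    """Build (but do not send) a POST to /api/v1/convert/submit-url."""
    payload = {
        "sourceUrl": source_url,
        "languages": languages,
        "outputs": outputs,
        **optional,  # e.g. models="auto", response="polling"
    }
    return urllib.request.Request(
        BASE_URL + "/api/v1/convert/submit-url",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Calling `urllib.request.urlopen` on the returned request and decoding the body with `json.load` would yield the job object described in the submit response.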
Get Job Status
GET /api/v1/convert/jobs/{jobId}
Check processing status and get download URLs.

Path Parameters:

jobId (required) - Job ID returned from submit endpoint

Response:

{
  "id": "uuid",
  "status": "completed",              // created, accepted, processing, completed, failed, cancelled
  "reference": "doc123",              // Your reference ID
  "models": "tesseract",              // Model used
  "responseMode": "polling",
  "createdAt": "2024-01-15T10:30:00Z",
  "startedAt": "2024-01-15T10:30:01Z", // When processing started
  "completedAt": "2024-01-15T10:30:05Z",
  "expiryAt": "2024-01-16T10:30:00Z",  // When job expires
  "processingTimeInSeconds": 5.2,
  
  "content": [{                       // Optional - Direct mode only
    "text": "string",
    "pdf": "string"
  }],
  "files": [{                         // Download URLs for results
    "models": "tesseract",            // Models used
    "text": {
      "url": "https://...",
      "expiration": "2024-01-16T10:30:05Z"
    },
    "pdf": {
      "url": "https://...",
      "expiration": "2024-01-16T10:30:05Z"
    }
  }]
}
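In polling mode the client repeats this call until the status reaches a final value. The sketch below assumes that `completed`, `failed`, and `cancelled` are the final statuses (the source lists the statuses but does not say which are final), and it takes the HTTP call as an injected function so the loop stays testable. The name `wait_for_job`, the interval, and the timeout are illustrative choices.

```python
import time

# Assumption: these three statuses are final; the others are transitional.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status, job_id, interval_seconds=2.0,
                 timeout_seconds=300.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return the job dict.

    `fetch_status(job_id)` must return the decoded JSON from
    GET /api/v1/convert/jobs/{jobId}. Raises TimeoutError if the job is
    still running once `timeout_seconds` have elapsed.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {job_id} still '{job['status']}' after {timeout_seconds}s"
            )
        sleep(interval_seconds)
```

A caller would pass a function that performs the authenticated GET request and then read `files` from the returned job when its status is `completed`.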

Example Usage

cURL Examples

Submit File (Multipart)

curl -X POST https://api.ainoflow.io/api/v1/convert/submit-file \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "languages=en,de" \
  -F "outputs=text,pdf" \
  -F "response=polling"

Submit Base64

curl -X POST https://api.ainoflow.io/api/v1/convert/submit-base64 \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documentBase64": "JVBERi0xLjQK...",
    "filename": "document.pdf",
    "languages": "en,de",
    "outputs": "text"
  }'

Submit URL

curl -X POST https://api.ainoflow.io/api/v1/convert/submit-url \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sourceUrl": "https://example.com/document.pdf",
    "languages": "en,de",
    "outputs": "text,pdf"
  }'

Transcribe Audio File

curl -X POST https://api.ainoflow.io/api/v1/convert/submit-url \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sourceUrl": "https://example.com/meeting.mp3",
    "languages": "en",
    "outputs": "text"
  }'

Check Job Status

curl https://api.ainoflow.io/api/v1/convert/jobs/YOUR_JOB_ID \
  -H "Authorization: Bearer YOUR_API_KEY"

Common Error Codes

400 Bad Request - Invalid request parameters or unsupported language code
401 Unauthorized - Missing or invalid API key
403 Forbidden - Insufficient permissions
404 Not Found - Resource not found
429 Too Many Requests - Convert API jobs limit reached for the current billing cycle
500 Internal Server Error - An unexpected error occurred
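One way a client might act on these codes is sketched below. The meanings are taken from the list above; the suggested actions are assumptions rather than documented behavior. In particular, 429 here signals a billing-cycle limit, so the sketch does not treat it as something a short retry would fix.

```python
ERROR_MEANINGS = {
    400: "Invalid request parameters or unsupported language code",
    401: "Missing or invalid API key",
    403: "Insufficient permissions",
    404: "Resource not found",
    429: "Convert API jobs limit reached for the current billing cycle",
    500: "An unexpected error occurred",
}


def classify_error(status_code):
    """Return (meaning, suggested_action) for an HTTP error status.

    The action labels are hypothetical client-side conventions.
    """
    meaning = ERROR_MEANINGS.get(status_code, "Undocumented status code")
    if status_code in (400, 404):
        action = "fix_request"
    elif status_code in (401, 403):
        action = "check_credentials"
    elif status_code == 429:
        action = "wait_for_quota"
    elif status_code >= 500:
        action = "retry_later"
    else:
        action = "inspect_response"
    return meaning, action
```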

Additional Features

Supported Languages (60+)
OCR and audio transcription support over 60 languages, including European, Asian, Middle Eastern, and African languages.

Language codes (comma-separated): en, de, fr, es, it, pt, ru, pl, uk, cs, and many more.

Examples: en (English), de (German), fr (French), es (Spanish), ru (Russian), zh-cn (Chinese Simplified), ar (Arabic), ja (Japanese)
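Since `languages` is sent as a single comma-separated string, a small helper can build it from a list. The sketch below lowercases and de-duplicates codes, which is an assumption based on the lowercase examples above (such as `zh-cn`); the name `format_languages` is hypothetical.

```python
def format_languages(codes):
    """Join language codes into the comma-separated form the API expects.

    Strips whitespace, lowercases, and drops duplicates while keeping
    the original order. Raises ValueError if no code remains.
    """
    cleaned = []
    for code in codes:
        normalized = code.strip().lower()
        if normalized and normalized not in cleaned:
            cleaned.append(normalized)
    if not cleaned:
        raise ValueError("at least one language code is required")
    return ",".join(cleaned)
```

For example, `format_languages(["en", " DE", "en", "zh-CN"])` returns `"en,de,zh-cn"`.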

Processing Models
Automatic model selection based on file type
auto (default) - Automatically selects optimal model: PaddleOCR for images, Tesseract for PDFs, Whisper for audio
tesseract - Traditional OCR engine, best for PDFs
paddleocr - AI-powered OCR, best for images
whispersmall - Default audio transcription model (best quality)
whispertiny / whisperbase - Faster audio transcription options
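A client that wants to pass an explicit `models` value instead of `auto` could mirror the mapping listed above. This is a client-side sketch only: how the service itself implements `auto` is not documented here, the function name `suggest_model` is hypothetical, and the `.jpg` and `.tif` aliases are added as assumptions.

```python
import os

# Extensions derived from the supported input formats; .jpg and .tif are
# assumed aliases for JPEG and TIFF.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png", ".tiff", ".tif", ".bmp", ".webp", ".gif"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".mp4", ".webm", ".ogg", ".flac", ".aac", ".opus"}


def suggest_model(filename, fast_audio=False):
    """Pick a `models` value from the file extension.

    Falls back to "auto" for anything that is not a PDF, image, or audio file.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".pdf":
        return "tesseract"
    if ext in IMAGE_EXTENSIONS:
        return "paddleocr"
    if ext in AUDIO_EXTENSIONS:
        # whispersmall is the documented default; whispertiny trades quality for speed.
        return "whispertiny" if fast_audio else "whispersmall"
    return "auto"
```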
Response Modes
Choose how you want to receive processing results
polling (default) - Returns job ID immediately, client polls for status and gets download URLs
direct - Waits for processing and returns file contents directly in response (base64 for PDFs)
webhook - Returns job ID immediately, sends POST notification to your webhook URL when complete
persisted - Waits for processing but returns download URLs instead of file contents
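The four modes differ in whether the submit call waits for processing and in what it returns. A sketch of encoding that distinction on the client side follows; the `ResponseMode` enum and the `response_options` helper are hypothetical names, and the check that webhook mode needs a `webhookUrl` reflects the parameter description rather than a documented validation rule.

```python
from enum import Enum


class ResponseMode(Enum):
    POLLING = "polling"
    DIRECT = "direct"
    WEBHOOK = "webhook"
    PERSISTED = "persisted"

    @property
    def waits_for_processing(self):
        """True for modes where the submit call returns only after processing."""
        return self in (ResponseMode.DIRECT, ResponseMode.PERSISTED)


def response_options(mode, webhook_url=None):
    """Return the request fields for the chosen response mode.

    Raises ValueError for an unknown mode, or for webhook mode without a URL.
    """
    mode = ResponseMode(mode)
    if mode is ResponseMode.WEBHOOK and not webhook_url:
        raise ValueError("webhook mode needs a webhookUrl")
    options = {"response": mode.value}
    if webhook_url:
        options["webhookUrl"] = webhook_url
    return options
```

The returned dict can be merged into the form fields or JSON body of any of the three submit endpoints.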
Webhook Integration
Get notified when processing completes

Set response: "webhook" and provide webhookUrl to receive POST notifications.

The webhook payload contains the complete job status with download URLs, identical to the GET /api/v1/convert/jobs/{jobId} response.
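A receiving endpoint only needs to accept a JSON POST and read the job fields. The sketch below uses Python's built-in `http.server`; the handler names, the port, and the 204 reply are assumptions, since the source does not specify what response the service expects or how it authenticates notifications.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_job_notification(payload):
    """Summarize one webhook payload (same shape as the job status response)."""
    job_id = payload.get("id")
    status = payload.get("status")
    if status == "completed":
        urls = [
            entry[fmt]["url"]
            for entry in payload.get("files", [])
            for fmt in ("text", "pdf")
            if fmt in entry
        ]
        return f"{job_id}: completed, {len(urls)} file(s)"
    return f"{job_id}: {status}"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        print(handle_job_notification(payload))
        self.send_response(204)  # Assumption: any 2xx acknowledges receipt.
        self.end_headers()


def serve(port=8080):
    """Run the receiver until interrupted (hypothetical entry point)."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

The URL that exposes this server is what would be passed as `webhookUrl` when submitting with `response` set to `webhook`.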
