# Batch API

Process large volumes of requests asynchronously at lower cost.

## Overview

The Batch API processes multiple requests asynchronously, ideal for:
- Bulk data processing
- Batch evaluations
- Large-scale embeddings
- Offline content generation
## Create Batch

`POST /v1/batches`

### Step 1: Upload Input File
Create a JSONL file where each line is a request:
```jsonl
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "model-id", "messages": [{"role": "user", "content": "Summarize: AI is transforming healthcare..."}]}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "model-id", "messages": [{"role": "user", "content": "Translate to French: Hello World"}]}}
```

Upload it:
```python
batch_file = client.files.create(
    file=open("batch_input.jsonl", "rb"),
    purpose="batch",
)
```

### Step 2: Create Batch Job
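The input file uploaded in Step 1 is usually generated programmatically rather than written by hand. A minimal sketch using only the standard library (the `prompts` list is purely illustrative):

```python
import json

# Illustrative prompts; in practice these come from your own data.
prompts = [
    "Summarize: AI is transforming healthcare...",
    "Translate to French: Hello World",
]

with open("batch_input.jsonl", "w") as f:
    for i, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"req-{i}",  # unique per line; echoed back in the results
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "model-id",
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        f.write(json.dumps(request) + "\n")
```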
```python
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

print(f"Batch ID: {batch.id}")
print(f"Status: {batch.status}")
```

### Step 3: Monitor Progress
```python
batch = client.batches.retrieve(batch.id)
print(f"Status: {batch.status}")
print(f"Completed: {batch.request_counts.completed}/{batch.request_counts.total}")
```

### Step 4: Retrieve Results
```python
import json

if batch.status == "completed":
    content = client.files.content(batch.output_file_id)
    results = content.text
    for line in results.strip().split("\n"):
        result = json.loads(line)
        print(f"{result['custom_id']}: {result['response']['body']['choices'][0]['message']['content']}")
```

## Batch Status
| Status | Description |
|---|---|
| `validating` | Input file is being validated |
| `in_progress` | Requests are being processed |
| `completed` | All requests processed |
| `failed` | Batch failed |
| `expired` | Exceeded completion window |
| `cancelled` | Cancelled by user |
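A batch stays in a non-terminal status until it completes, fails, expires, or is cancelled, so polling typically loops until one of the terminal statuses above is reached. A sketch of such a helper (`client` is assumed to be the same SDK client used in the earlier steps):

```python
import time

# Terminal statuses from the table above; polling can stop once one is reached.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(client, batch_id, poll_interval=60.0, sleep=time.sleep):
    """Poll until the batch reaches a terminal status, then return it."""
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        sleep(poll_interval)
```

Usage mirrors Step 3: `batch = wait_for_batch(client, batch.id)`, then check `batch.status` before fetching results.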
## Cancel Batch

```python
client.batches.cancel(batch.id)
```
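When consuming the output file from Step 4, note that result lines need not appear in the same order as the input requests, which is why each request carries a `custom_id`. A small helper for indexing results by ID (a sketch, assuming the line format shown earlier):

```python
import json

def parse_batch_output(jsonl_text):
    """Map custom_id -> parsed result line.

    Output order is not guaranteed to match input order, so always
    key results by custom_id rather than by position.
    """
    results = {}
    for line in jsonl_text.strip().split("\n"):
        record = json.loads(line)
        results[record["custom_id"]] = record
    return results
```

With this in place, each original request can be matched to its result regardless of position, e.g. `parse_batch_output(content.text)["req-1"]`.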