
OpenAI Compatible

OpenAI/OpenRouter-style inference enqueue endpoints.

Queue a chat completion request

POST
/v1/chat/completions

Authorization

BearerAuth
Authorization: Bearer <token>

Use Authorization: Bearer <api-key>.

In: header

Request Body

application/json

TypeScript Definitions

Use the request body type in TypeScript.

Response Body

application/json

application/json

curl -X POST "https://loading/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Summarize this document in 5 bullets."
      }
    ]
  }'
{
  "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
  "object": "chat.completion.enqueue",
  "created": 1773504000,
  "status": "queued",
  "surface": "chat",
  "model": "Llama3_8B_Instruct",
  "endpoint_id": "o6r4i5q9j8k7l6",
  "runpod": {
    "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
    "status": "IN_QUEUE"
  }
}
{
  "error": {
    "code": "missing_endpoint_id",
    "message": "No RunPod endpoint configured for surface chat"
  }
}
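All four queue endpoints return the same enqueue envelope shown in the example above. A minimal TypeScript sketch of that shape with a guard for it follows; the field names are taken from the sample responses on this page, and the guard itself is an illustrative helper, not part of any official SDK:

```typescript
// Shape of the enqueue envelope returned by the /v1/* queue endpoints,
// as seen in the sample responses above.
interface EnqueueResponse {
  id: string;
  object: string;            // e.g. "chat.completion.enqueue"
  created: number;           // unix timestamp in seconds
  status: "queued";
  surface: string;           // e.g. "chat", "images", "transcribe", "embeddings"
  model: string;
  endpoint_id: string;
  runpod: { id: string; status: string }; // e.g. "IN_QUEUE"
}

// Narrow an unknown JSON payload to the envelope. A top-level "error"
// key means the request was rejected (see the missing_endpoint_id example).
function asEnqueueResponse(json: unknown): EnqueueResponse | null {
  const r = json as Record<string, unknown>;
  if (!r || typeof r !== "object" || "error" in r) return null;
  if (r.status !== "queued" || typeof r.id !== "string") return null;
  return r as unknown as EnqueueResponse;
}
```

The `runpod.id` in the envelope is what you would hand to the status routes described under RunPod Jobs below.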

Queue an image generation request

POST
/v1/images/generations

Authorization

BearerAuth
Authorization: Bearer <token>

Use Authorization: Bearer <api-key>.

In: header

Request Body

application/json

TypeScript Definitions

Use the request body type in TypeScript.

Response Body

application/json

curl -X POST "https://loading/v1/images/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -d '{
    "prompt": "A moody cyberpunk alley at night with rain reflections, 35mm film look"
  }'
{
  "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
  "object": "image.generation.enqueue",
  "created": 1773504000,
  "status": "queued",
  "surface": "images",
  "model": "Flux1schnell",
  "endpoint_id": "o6r4i5q9j8k7l6",
  "runpod": {
    "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
    "status": "IN_QUEUE"
  }
}
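The image request schema expects `size` as a `WIDTHxHEIGHT` string and `n` between 1 and 8. A hedged TypeScript sketch that validates and builds such a body (the field names follow the request-body table on this page; the helpers themselves are illustrative, and the endpoint is not called here):

```typescript
// Parse a "WIDTHxHEIGHT" size string as described by the request schema.
function parseSize(size: string): { width: number; height: number } | null {
  const m = /^(\d+)x(\d+)$/.exec(size);
  if (!m) return null;
  return { width: Number(m[1]), height: Number(m[2]) };
}

// Build a minimal /v1/images/generations body; n must stay within 1..8
// per the schema, and size is validated before it is attached.
function buildImageRequest(prompt: string, size?: string, n = 1) {
  if (prompt.length < 1) throw new Error("prompt must be non-empty");
  if (n < 1 || n > 8) throw new Error("n must be between 1 and 8");
  if (size !== undefined && parseSize(size) === null) {
    throw new Error("size must be WIDTHxHEIGHT, e.g. 1024x768");
  }
  return { prompt, ...(size ? { size } : {}), n };
}
```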

Queue an audio transcription request

POST
/v1/audio/transcriptions

Authorization

BearerAuth
Authorization: Bearer <token>

Use Authorization: Bearer <api-key>.

In: header

Request Body

application/json

TypeScript Definitions

Use the request body type in TypeScript.

Response Body

application/json

curl -X POST "https://loading/v1/audio/transcriptions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -d '{
    "audioUrl": "https://cdn.example.com/audio/customer-call-2026-03-15.mp3"
  }'
{
  "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
  "object": "audio.transcription.enqueue",
  "created": 1773504000,
  "status": "queued",
  "surface": "transcribe",
  "model": "WhisperLargeV3",
  "endpoint_id": "o6r4i5q9j8k7l6",
  "runpod": {
    "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
    "status": "IN_QUEUE"
  }
}
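Unlike the OpenAI file-upload API, this endpoint takes a publicly reachable `audioUrl` plus an optional BCP-47 `language` hint. A small illustrative builder that enforces those two constraints from the request-body table (nothing here performs the actual request):

```typescript
// Build a /v1/audio/transcriptions body. audioUrl must be a well-formed
// URL (Format: uri in the schema) and language, when given, is a short
// BCP-47 hint such as "en" or "es" (at least 2 characters).
function buildTranscriptionRequest(audioUrl: string, language?: string) {
  new URL(audioUrl); // throws on malformed URLs
  if (language !== undefined && language.length < 2) {
    throw new Error("language hint must be at least 2 characters");
  }
  return { audioUrl, ...(language ? { language } : {}) };
}
```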

Queue an embeddings request

POST
/v1/embeddings

Authorization

BearerAuth
Authorization: Bearer <token>

Use Authorization: Bearer <api-key>.

In: header

Request Body

application/json

TypeScript Definitions

Use the request body type in TypeScript.

Response Body

application/json

curl -X POST "https://loading/v1/embeddings" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -d '{
    "input": "How to optimize cold starts for serverless GPUs"
  }'
{
  "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
  "object": "embedding.enqueue",
  "created": 1773504000,
  "status": "queued",
  "surface": "embeddings",
  "model": "BGE_Large",
  "endpoint_id": "o6r4i5q9j8k7l6",
  "runpod": {
    "id": "f3de27f8-61d5-4d58-aad1-a7d63f8a6e0f",
    "status": "IN_QUEUE"
  }
}
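Per the request-body table, `input` accepts either a single string or an array of strings. A hedged normalization helper so downstream batching code only ever deals with one shape (an illustrative sketch, not part of any official SDK):

```typescript
// The embeddings endpoint accepts a single string or an array of strings.
// Normalize to an array and reject empty inputs early, before enqueueing.
function normalizeEmbeddingInput(input: string | string[]): string[] {
  const items = Array.isArray(input) ? input : [input];
  if (items.length === 0 || items.some((s) => s.length === 0)) {
    throw new Error("input must contain at least one non-empty string");
  }
  return items;
}
```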
Last updated on 21 March 2026

Jobs

Provider-agnostic async status, download, and websocket streaming routes.

RunPod Jobs

Direct provider operational routes for run, runsync, status, and queue control.
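The enqueue envelope carries `runpod.status` (for example `IN_QUEUE`), and a job is only finished once the provider reports a terminal state via the status routes. A minimal sketch of that classification; the status names beyond `IN_QUEUE` follow RunPod's commonly documented job states and are assumptions here, not values confirmed by this page:

```typescript
// Job states as reported by the provider's status route. IN_QUEUE appears
// in the enqueue examples above; the rest are assumed RunPod job states.
type RunPodStatus =
  | "IN_QUEUE"
  | "IN_PROGRESS"
  | "COMPLETED"
  | "FAILED"
  | "CANCELLED"
  | "TIMED_OUT";

// Terminal states: a polling loop can stop once one of these is reached.
function isTerminal(status: RunPodStatus): boolean {
  return (
    status === "COMPLETED" ||
    status === "FAILED" ||
    status === "CANCELLED" ||
    status === "TIMED_OUT"
  );
}
```

In a polling loop you would call the status route with the `runpod.id` from the enqueue response, sleep between attempts, and exit when `isTerminal` returns true.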

Chat completions request body fields:

model? string
Active chat model slug used for OpenAI-compatible chat routing.
Length: 1 <= length

messages* array
Conversation message list in OpenAI chat format.
Items: 1 <= items

temperature? number
Sampling temperature. Lower values are more deterministic.
Range: 0 <= value <= 2

max_tokens? integer
Maximum number of output tokens.
Range: 1 <= value <= 8192

stream? boolean
Request token streaming from the upstream provider when supported.
Default: false

user? string
Caller-provided user identifier for traceability and abuse monitoring.
Length: 1 <= length

webhook_url? string
Optional HTTPS webhook URL for async status updates.
Format: uri

[key: string]? any
Additional provider-specific properties are passed through.
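The numeric constraints in the chat field table above can be enforced client-side before enqueueing. A hedged TypeScript sketch of such a validator (the field names and ranges come from the table; the validator itself is illustrative):

```typescript
interface ChatMessage { role: string; content: string; }

// Validate a /v1/chat/completions body against the constraints in the
// field table: at least one message, temperature in [0, 2],
// max_tokens in [1, 8192].
function validateChatRequest(body: {
  messages: ChatMessage[];
  temperature?: number;
  max_tokens?: number;
}): void {
  if (body.messages.length < 1) {
    throw new Error("messages must contain at least one item");
  }
  if (body.temperature !== undefined &&
      (body.temperature < 0 || body.temperature > 2)) {
    throw new Error("temperature must be in [0, 2]");
  }
  if (body.max_tokens !== undefined &&
      (body.max_tokens < 1 || body.max_tokens > 8192)) {
    throw new Error("max_tokens must be in [1, 8192]");
  }
}
```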
Image generations request body fields:

model? string
Active image model slug used for OpenAI-compatible image routing.
Value in: "Ben2" | "Flux_2_Klein_4B_BF16" | "Ltx2_3_22B_Dist_INT8" | "RealESRGAN_x4" | "ZImageTurbo_INT8"

prompt* string
Natural-language prompt used for image generation.
Length: 1 <= length

negative_prompt? string
Optional content to discourage in the generated output.
Length: 1 <= length

n? integer
Number of images to generate.
Default: 1
Range: 1 <= value <= 8

size? string
Requested output size in WIDTHxHEIGHT format.
Length: 3 <= length

response_format? string
Preferred image payload format when provider supports it.

webhook_url? string
Optional HTTPS webhook URL for async status updates.
Format: uri

[key: string]? any
Additional provider-specific properties are passed through.
Audio transcriptions request body fields:

model? string
Active transcription model slug used for OpenAI-compatible transcription routing.
Value in: "WhisperLargeV3"

audioUrl* string
Publicly accessible audio URL to transcribe.
Format: uri

language? string
Optional BCP-47 language hint, such as en or es.
Length: 2 <= length

prompt? string
Optional prompt to bias transcription output.
Length: 1 <= length

response_format? string
Preferred output format when provider supports custom transcript renderers.

webhook_url? string
Optional HTTPS webhook URL for async status updates.
Format: uri

[key: string]? any
Additional provider-specific properties are passed through.
Embeddings request body fields:

model? string
Active embeddings model slug used for OpenAI-compatible embedding routing.
Value in: "Bge_M3_INT8"

input* string | array
Text to embed: a single string or an array of strings.

dimensions? integer
Optional embedding dimensionality override when supported by the model.
Range: 1 <= value

encoding_format? string
Vector encoding preference when supported by the provider.

user? string
Caller-provided user identifier for tracing and abuse monitoring.
Length: 1 <= length

webhook_url? string
Optional HTTPS webhook URL for async status updates.
Format: uri

[key: string]? any
Additional provider-specific properties are passed through.