Overview
An overview of OneRouter's API
OneRouter's request and response schemas are very similar to the OpenAI Chat API, with a few small differences. At a high level, OneRouter normalizes the schema across models and providers so you only need to learn one.
Quick start
Using the OpenAI SDK
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.onerouter.pro/v1",
    api_key="<API_KEY>",
)

completion = client.chat.completions.create(
    model="claude-3-5-sonnet@20240620",
    messages=[
        {
            "role": "user",
            "content": "What is the meaning of life?"
        }
    ]
)

print(completion.choices[0].message.content)

import OpenAI from 'openai';
const openai = new OpenAI({
  baseURL: 'https://llm.onerouter.pro/v1',
  apiKey: '<API_KEY>',
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: 'claude-3-5-sonnet@20240620',
    messages: [
      {
        role: 'user',
        content: 'What is the meaning of life?',
      },
    ],
  });

  console.log(completion.choices[0].message);
}

main();

Using the OneRouter API directly
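If you prefer not to use an SDK, you can POST to the /v1/chat/completions endpoint directly. A minimal sketch (the Bearer authorization scheme is an assumption based on the OpenAI-compatible SDK examples above):

const response = await fetch('https://llm.onerouter.pro/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <API_KEY>', // assumed Bearer scheme, matching the SDK's apiKey
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'claude-3-5-sonnet@20240620',
    messages: [
      { role: 'user', content: 'What is the meaning of life?' },
    ],
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);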
Requests
Completions Request Format
Here is the request schema as a TypeScript type. This will be the body of your POST request to the /v1/chat/completions endpoint (see the quick start above for an example).
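A partial sketch of that type, limited to the OpenAI-compatible fields used in this guide (the optional parameters shown are assumptions drawn from the OpenAI Chat API, not an exhaustive list):

// Partial sketch; see the Parameters page for the complete set of fields.
type ChatCompletionsRequest = {
  // Required: the model slug, e.g. 'claude-3-5-sonnet@20240620'
  model: string;
  // Required: the conversation so far
  messages: Array<{
    role: 'system' | 'user' | 'assistant' | 'tool';
    content: string;
  }>;
  // Common optional OpenAI-compatible parameters (assumed)
  stream?: boolean;
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
};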
For a complete list of parameters, see the Parameters page.
Headers
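The SDKs set the request headers for you. When calling the API directly, the usual OpenAI-style headers apply (assumed from the SDK behavior; this page does not list them explicitly):

const headers = {
  Authorization: 'Bearer <API_KEY>',   // standard OpenAI-style Bearer auth (assumption)
  'Content-Type': 'application/json',
};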
Assistant Prefill
OneRouter supports asking models to complete a partial response. This can be useful for guiding models to respond in a certain way.
To use this feature, simply include a message with role: "assistant" at the end of your messages array.
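For example (a minimal sketch; the trailing assistant message is the partial response the model continues from):

const completion = await openai.chat.completions.create({
  model: 'claude-3-5-sonnet@20240620',
  messages: [
    { role: 'user', content: 'What is the meaning of life?' },
    // The partial assistant response the model should continue
    { role: 'assistant', content: 'The meaning of life, in short, is' },
  ],
});

console.log(completion.choices[0].message.content);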
Responses
Completions Response Format
OneRouter normalizes the schema across models and providers to comply with the OpenAI Chat API.
This means that choices is always an array, even if the model only returns one completion. Each choice will contain a delta property if a stream was requested and a message property otherwise. This makes it easier to use the same code for all models.
Here's the response schema as a TypeScript type:
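A partial sketch of the normalized shape, based on the fields described on this page (id and model are assumptions from the OpenAI Chat API):

type ChatCompletionsResponse = {
  id: string;
  // Always an array, even if the model returns a single completion
  choices: Array<{
    // Normalized finish reason (see Finish Reason below)
    finish_reason: 'tool_calls' | 'stop' | 'length' | 'content_filter' | 'error';
    // Present when a stream was requested
    delta?: { role?: string; content?: string };
    // Present otherwise
    message?: { role: string; content: string };
  }>;
  model: string;
  // Normalized token counts, returned for non-streaming completions
  usage?: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
};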
Here's an example:
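An illustrative non-streaming response (the id and token counts are placeholder values, not real output):

{
  "id": "gen-xxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "The meaning of life is a deeply personal question..."
      }
    }
  ],
  "model": "claude-3-5-sonnet@20240620",
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 163,
    "total_tokens": 177
  }
}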
Finish Reason
OneRouter normalizes each model's finish_reason to one of the following values: tool_calls, stop, length, content_filter, error.
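Because the values are normalized, you can branch on them with a single code path for every model. A small sketch:

const choice = completion.choices[0];
switch (choice.finish_reason) {
  case 'tool_calls':
    // The model wants to call one or more tools
    break;
  case 'length':
    // Output was truncated at the token limit
    break;
  case 'content_filter':
  case 'error':
    // Handle filtered or failed generations
    break;
  case 'stop':
  default:
    console.log(choice.message?.content);
}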
Querying Cost and Stats
The token counts returned in the completions API response are not produced by the model's native tokenizer. Instead, OneRouter returns a normalized, model-agnostic count (computed with the GPT-4o tokenizer), because some providers do not reliably return native token counts. This behavior is becoming rarer, however, and we may add native token counts to the response object in the future.
Credit usage and model pricing are based on the native token counts (not the 'normalized' token counts returned in the API response).
Note that token counts are also available in the usage field of the response body for non-streaming completions.
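A minimal sketch of reading the normalized counts from a non-streaming completion:

// usage is present on non-streaming responses; counts are normalized, not native
const usage = completion.usage;
if (usage) {
  console.log(
    `prompt: ${usage.prompt_tokens}, completion: ${usage.completion_tokens}, total: ${usage.total_tokens}`
  );
}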