Reasoning

Advanced reasoning capabilities with the Responses API

The Responses API supports advanced reasoning, letting reasoning-capable models work through a problem internally before answering, with a configurable effort level.

Reasoning Configuration

Configure reasoning behavior using the reasoning parameter:

const response = await fetch('https://llm.onerouter.pro/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <<API_KEY>>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'o4-mini',
    input: 'What is the meaning of life?',
    reasoning: {
      effort: 'high'
    },
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);
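The raw response body contains more than just the answer text. As a sketch, assuming an OpenAI-style Responses payload where `output` is an array of items (reasoning items plus one or more `message` items whose `content` parts have type `output_text`), the final text can be pulled out like this; verify the exact shape against the responses your provider returns:

```javascript
// Hedged helper (assumes OpenAI-style Responses shape): collect the text
// from all 'output_text' parts of 'message' items in result.output.
function extractOutputText(result) {
  return (result.output ?? [])
    .filter((item) => item.type === 'message')
    .flatMap((item) => item.content ?? [])
    .filter((part) => part.type === 'output_text')
    .map((part) => part.text)
    .join('');
}
```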

Reasoning Effort Levels

The effort parameter controls how much computational effort the model puts into reasoning:

| Effort Level | Description |
| --- | --- |
| minimal | Basic reasoning with minimal computational effort |
| low | Light reasoning for simple problems |
| medium | Balanced reasoning for moderate complexity |
| high | Deep reasoning for complex problems |
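One way to apply these levels in practice is to map task categories to effort before building the request. The category names below are our own illustration, not part of the API; only the four effort strings come from the table:

```javascript
// Illustrative heuristic: choose a reasoning effort level per task type.
// The task names are hypothetical; the effort values are the API's four levels.
const EFFORT_BY_TASK = {
  lookup: 'minimal',
  summarize: 'low',
  analyze: 'medium',
  prove: 'high',
};

function pickEffort(task) {
  // Fall back to a balanced default for unrecognized tasks.
  return EFFORT_BY_TASK[task] ?? 'medium';
}
```

The chosen value then goes into the `reasoning: { effort: ... }` field of the request body shown above.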

Complex Reasoning Example

For complex mathematical or logical problems:

const response = await fetch('https://llm.onerouter.pro/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <<API_KEY>>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'Was 1995 30 years ago? Please show your reasoning.',
          },
        ],
      },
    ],
    reasoning: {
      effort: 'high'
    },
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);

Reasoning in Conversation Context

Include reasoning in multi-turn conversations:

const response = await fetch('https://llm.onerouter.pro/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <<API_KEY>>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'What is your favorite color?',
          },
        ],
      },
      {
        type: 'message',
        role: 'assistant',
        id: 'msg_abc123',
        status: 'completed',
        content: [
          {
            type: 'output_text',
            text: "I don't have a favorite color.",
            annotations: []
          }
        ]
      },
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'How many Earths can fit on Mars?',
          },
        ],
      },
    ],
    reasoning: {
      effort: 'high'
    },
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);
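Hand-writing the `input` array gets repetitive for longer histories. A minimal sketch of a converter from a simple `{ role, text }` history into the item shape used above (note: real assistant turns may also carry `id`, `status`, and `annotations`, as in the example, which this sketch omits):

```javascript
// Hypothetical convenience: build Responses 'input' items from a flat history.
// Assistant turns use 'output_text' parts; user turns use 'input_text'.
function toInputItems(history) {
  return history.map(({ role, text }) => ({
    type: 'message',
    role,
    content: [
      {
        type: role === 'assistant' ? 'output_text' : 'input_text',
        text,
      },
    ],
  }));
}
```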

Streaming Reasoning

Enable streaming to watch the response develop in real time:

const response = await fetch('https://llm.onerouter.pro/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <<API_KEY>>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'o4-mini',
    input: 'Solve this step by step: If a train travels 60 mph for 2.5 hours, how far does it go?',
    reasoning: {
      effort: 'medium'
    },
    stream: true,
    max_output_tokens: 9000,
  }),
});

const reader = response.body?.getReader();
if (!reader) throw new Error('Response body is not readable');
const decoder = new TextDecoder();

let finished = false;
while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;

  const chunk = decoder.decode(value, { stream: true });
  const lines = chunk.split('\n');

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') {
        finished = true;
        break;
      }

      try {
        const parsed = JSON.parse(data);
        if (parsed.type === 'response.output_text.delta') {
          console.log('Delta:', parsed.delta);
        }
      } catch (e) {
        // Skip lines that are not complete JSON events
      }
    }
  }
}
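One caveat with the loop above: a network chunk can end mid-line, splitting an SSE `data:` event across two reads. A small buffering parser avoids dropping those events; this is a sketch assuming standard newline-terminated `data: ...` lines:

```javascript
// Buffering SSE line parser: accumulates chunks, emits only complete
// 'data: ...' payloads, and keeps any trailing partial line for later.
function createSSEParser(onData) {
  let buffer = '';
  return (chunk) => {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep the trailing partial line
    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = line.slice(6);
        if (data !== '[DONE]') onData(data);
      }
    }
  };
}
```

To use it, call the returned function with each decoded chunk instead of splitting the chunk directly.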

Best Practices

  1. Choose appropriate effort levels: use high for complex problems and low or minimal for simple tasks

  2. Consider token usage: reasoning increases token consumption, so budget max_output_tokens accordingly

  3. Use streaming: for long reasoning chains, streaming provides a better user experience

  4. Include context: provide sufficient context for the model to reason effectively
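For best practice 2, it helps to measure how much of the output budget reasoning actually consumed. A sketch assuming an OpenAI-style `usage` object with `output_tokens_details.reasoning_tokens` (verify this field name against your provider's responses):

```javascript
// Hedged helper: fraction of total tokens spent on reasoning, assuming
// usage.output_tokens_details.reasoning_tokens exists (OpenAI-style).
function reasoningShare(usage) {
  const reasoning = usage.output_tokens_details?.reasoning_tokens ?? 0;
  const total = usage.total_tokens ?? 0;
  return total > 0 ? reasoning / total : 0;
}
```

A consistently high share on simple tasks is a signal to lower the effort level.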

Next Steps

  • Explore Tool Calling with reasoning

  • Review Basic Usage fundamentals