Models

Every Agent ultimately calls an LLM. The SDK abstracts models behind two lightweight interfaces:

  • Model – knows how to make one request against a specific API.
  • ModelProvider – resolves human‑readable model names (e.g. 'gpt-5.4') to Model instances.

In day‑to‑day work you normally only interact with model names and occasionally ModelSettings.

Specifying a model per‑agent
import { Agent } from '@openai/agents';
const agent = new Agent({
  name: 'Creative writer',
  model: 'gpt-5.4',
});

If you don’t specify a model when initializing an Agent, the default model is used. The default is currently gpt-4.1, chosen for compatibility and low latency. If you have access, we recommend moving your agents to gpt-5.4 for higher quality while keeping explicit modelSettings.

If you want to switch to other models like gpt-5.4, there are two ways to configure your agents.

First, if you want to consistently use a specific model for all agents that do not set a custom model, set the OPENAI_DEFAULT_MODEL environment variable before running your agents.

Set OPENAI_DEFAULT_MODEL
export OPENAI_DEFAULT_MODEL=gpt-5.4
node my-awesome-agent.js

Second, you can set a default model for a Runner instance. If you don’t set a model for an agent, this Runner’s default model will be used.

Set a default model for a Runner
import { Runner } from '@openai/agents';
const runner = new Runner({ model: 'gpt-4.1-mini' });

When you use any GPT-5.x model such as gpt-5.4 in this way, the SDK applies default modelSettings tuned to work well for most use cases. To adjust the reasoning effort for the default model, pass your own modelSettings:

Customize GPT-5 default settings
import { Agent } from '@openai/agents';
const myAgent = new Agent({
  name: 'My Agent',
  instructions: "You're a helpful agent.",
  // If OPENAI_DEFAULT_MODEL=gpt-5.4 is set, passing only modelSettings works.
  // It's also fine to pass a GPT-5.x model name explicitly:
  model: 'gpt-5.4',
  modelSettings: {
    reasoning: { effort: 'high' },
    text: { verbosity: 'low' },
  },
});

If latency matters, start with reasoning.effort: "none" on gpt-5.4 and increase it only if your task needs more deliberate reasoning. The gpt-4.1 family (including mini and nano variants) also remains a solid choice for building interactive agent apps.
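
For example, a latency-sensitive agent might look like this (a minimal sketch; 'none' appears among the supported reasoning.effort values listed below):

Low-latency agent
import { Agent } from '@openai/agents';
const fastAgent = new Agent({
  name: 'Quick responder',
  model: 'gpt-5.4',
  // Start with no extra reasoning; raise the effort only if quality demands it.
  modelSettings: { reasoning: { effort: 'none' } },
});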

If you pass a non–GPT-5 model name without custom modelSettings, the SDK reverts to generic modelSettings compatible with any model.


The default ModelProvider resolves names using the OpenAI APIs. It supports two distinct endpoints:

  • Chat Completions – standard chat & function calls. Call setOpenAIAPI('chat_completions').
  • Responses – new streaming-first generative API (tool calls, flexible outputs). Call setOpenAIAPI('responses') (the default).
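
For example, switching the default provider to Chat Completions is a one-line call:

Switch to the Chat Completions API
import { setOpenAIAPI } from '@openai/agents';
setOpenAIAPI('chat_completions');
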
Set default OpenAI key
import { setDefaultOpenAIKey } from '@openai/agents';
setDefaultOpenAIKey(process.env.OPENAI_API_KEY!); // sk-...

You can also plug your own OpenAI client via setDefaultOpenAIClient(client) if you need custom networking settings.
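
As a sketch, assuming your custom networking needs can be expressed as standard OpenAI client options (the gateway URL here is a hypothetical example):

Plug in a custom OpenAI client
import OpenAI from 'openai';
import { setDefaultOpenAIClient } from '@openai/agents';
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: process.env.OPENAI_GATEWAY_URL, // hypothetical internal gateway
});
setDefaultOpenAIClient(client);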

When you use the OpenAI provider with the Responses API, you can send requests over a WebSocket transport instead of the default HTTP transport.

Enable it globally with setOpenAIResponsesTransport('websocket'), or enable it per provider with new OpenAIProvider({ useResponses: true, useResponsesWebSocket: true }).

You do not need withResponsesWebSocketSession(...) or a custom OpenAIProvider just to use the WebSocket transport. If reconnecting for each run/request is acceptable, your existing run() / Runner.run() usage will continue to work after enabling setOpenAIResponsesTransport('websocket').
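
As a minimal sketch, assuming setOpenAIResponsesTransport is exported from the same '@openai/agents' entry point as the other setters:

Enable the WebSocket transport globally
import { Agent, run, setOpenAIResponsesTransport } from '@openai/agents';
setOpenAIResponsesTransport('websocket');
// Existing run() / Runner.run() calls keep working; each run reconnects as needed.
const agent = new Agent({ name: 'Assistant', model: 'gpt-5.4' });
const result = await run(agent, 'Hello over WebSocket!');
console.log(result.finalOutput);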

Transport selection follows model resolution:

  • setOpenAIResponsesTransport('websocket') only affects string model names that are later resolved through the OpenAI provider while using the Responses API.
  • If you pass a concrete Model instance to an Agent or Runner, that instance is used as-is. OpenAIResponsesWSModel stays on WebSocket, OpenAIResponsesModel stays on HTTP, and OpenAIChatCompletionsModel stays on Chat Completions.
  • If you provide your own modelProvider, that provider controls model resolution. Enable WebSocket there instead of relying on the global setter.
  • If you route through a proxy, gateway, or other OpenAI-compatible endpoint, the target must support the WebSocket /responses endpoint. You may also need to set websocketBaseURL explicitly.

Use withResponsesWebSocketSession(...) or a custom OpenAIProvider / Runner only when you want to optimize connection reuse and manage the websocket provider lifecycle more explicitly:

  • withResponsesWebSocketSession(...): convenient scoped lifecycle with automatic cleanup after the callback.
  • Custom OpenAIProvider / Runner: explicit lifecycle control (including shutdown cleanup) in your own app architecture.

Despite the name, withResponsesWebSocketSession(...) is a transport lifecycle helper and is unrelated to the memory Session interface described in the sessions guide.

If you use a websocket proxy or gateway, configure websocketBaseURL on OpenAIProvider or set OPENAI_WEBSOCKET_BASE_URL.

If you instantiate OpenAIProvider yourself, remember that websocket-backed Responses model wrappers are cached by default for connection reuse. Call await provider.close() during shutdown to release those cached connections. withResponsesWebSocketSession(...) exists largely to manage that lifecycle for you: it creates a websocket-enabled provider and runner, passes them to your callback, and always closes the provider afterward. Use providerOptions for the temporary provider and runnerConfig for callback-scoped runner defaults.
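
As a sketch of that scoped lifecycle, assuming the callback receives the websocket-enabled runner and that providerOptions / runnerConfig are passed alongside it as described above (the exact option shapes may differ by SDK version):

Scoped WebSocket session
import { Agent, withResponsesWebSocketSession } from '@openai/agents';
const agent = new Agent({ name: 'Assistant', model: 'gpt-5.4' });
await withResponsesWebSocketSession(
  async ({ runner }) => {
    // Both runs reuse the temporary provider's cached WebSocket connection.
    const first = await runner.run(agent, 'First question');
    const followUp = await runner.run(agent, 'And a follow-up');
    console.log(first.finalOutput, followUp.finalOutput);
  },
  {
    providerOptions: { websocketBaseURL: process.env.OPENAI_WEBSOCKET_BASE_URL },
    runnerConfig: { modelSettings: { reasoning: { effort: 'low' } } },
  },
);
// The temporary provider (and its cached connections) is closed automatically.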

See examples/basic/stream-ws.ts for a full streaming + HITL example using the Responses WebSocket transport.

toolSearchTool(), toolNamespace(), and function tools or hosted MCP tools that set deferLoading: true require the OpenAI Responses API. The Chat Completions provider rejects namespaced or deferred function tools, and the AI SDK adapter does not support deferred Responses tool-loading flows. Use a Responses model directly when you need tool search.

Tool search is available only via the Responses API, on GPT-5.4 and newer model releases that support it.

When a run includes deferred tools, add toolSearchTool() to the same agent and keep modelSettings.toolChoice on 'auto'. The SDK does not let you force the built-in tool_search tool or a deferred function tool by name because the model needs to decide when to load those definitions. See the Tools guide and the official OpenAI tool search guide for the full setup.
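
Putting those pieces together, a sketch of a deferred function tool paired with toolSearchTool() might look like this (the weather tool is purely illustrative):

Deferred tool loading with tool search
import { Agent, tool, toolSearchTool } from '@openai/agents';
import { z } from 'zod';
const getWeather = tool({
  name: 'get_weather',
  description: 'Get the weather for a city.',
  parameters: z.object({ city: z.string() }),
  deferLoading: true, // definition is loaded on demand via tool search
  execute: async ({ city }) => `Sunny in ${city}`,
});
const agent = new Agent({
  name: 'Deferred tools',
  model: 'gpt-5.4', // tool search needs a supporting Responses model
  tools: [toolSearchTool(), getWeather],
  modelSettings: { toolChoice: 'auto' }, // let the model decide when to load
});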


ModelSettings mirrors the OpenAI parameters but is provider‑agnostic.

  • temperature (number) – creativity vs. determinism.
  • topP (number) – nucleus sampling.
  • frequencyPenalty (number) – penalise repeated tokens.
  • presencePenalty (number) – encourage new tokens.
  • toolChoice ('auto' | 'required' | 'none' | string) – see forcing tool use. On OpenAI Responses, toolChoice: 'computer' forces the GA built-in computer tool when available.
  • parallelToolCalls (boolean) – allow parallel function calls where supported.
  • truncation ('auto' | 'disabled') – token truncation strategy.
  • maxTokens (number) – maximum tokens in the response.
  • store (boolean) – persist the response for retrieval / RAG workflows.
  • promptCacheRetention ('in-memory' | '24h' | null) – controls provider prompt-cache retention when supported.
  • reasoning.effort ('none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh') – reasoning effort for gpt-5.x models.
  • reasoning.summary ('auto' | 'concise' | 'detailed') – controls how much reasoning summary the model returns.
  • text.verbosity ('low' | 'medium' | 'high') – text verbosity for gpt-5.x models.
  • providerData (Record<string, any>) – provider-specific passthrough options forwarded to the underlying model.
  • retry (ModelRetrySettings) – runtime-only opt-in retry config. See Model retries.

Attach settings at either level:

Model settings
import { Runner, Agent } from '@openai/agents';
const agent = new Agent({
  name: 'Creative writer',
  // ...
  modelSettings: { temperature: 0.7, toolChoice: 'auto' },
});
// or globally
new Runner({ modelSettings: { temperature: 0.3 } });

Runner‑level settings override any conflicting per‑agent settings. retry is the notable exception: its nested fields are merged across runner and agent settings unless you explicitly clear inherited values with undefined.

Retries are runtime-only and opt-in. The SDK does not retry model requests unless you configure modelSettings.retry and your policy returns a retry decision.

Opt in to model retries
import { Agent, Runner, retryPolicies } from '@openai/agents';
const sharedRetry = {
  maxRetries: 4,
  backoff: {
    initialDelayMs: 500,
    maxDelayMs: 5_000,
    multiplier: 2,
    jitter: true,
  },
  policy: retryPolicies.any(
    retryPolicies.providerSuggested(),
    retryPolicies.retryAfter(),
    retryPolicies.networkError(),
    retryPolicies.httpStatus([408, 409, 429, 500, 502, 503, 504]),
  ),
};
const runner = new Runner({
  modelSettings: {
    retry: sharedRetry,
  },
});
const agent = new Agent({
  name: 'Assistant',
  instructions: 'You are a concise assistant.',
  modelSettings: {
    retry: {
      maxRetries: 2,
      backoff: {
        maxDelayMs: 2_000,
      },
    },
  },
});
await runner.run(agent, 'Summarize exponential backoff in plain English.');

ModelRetrySettings has three fields:

  • maxRetries (number) – number of retry attempts allowed after the initial request.
  • backoff ({ initialDelayMs?, maxDelayMs?, multiplier?, jitter? }) – default delay strategy when the policy retries without returning delayMs.
  • policy (RetryPolicy) – callback that decides whether to retry. This function is runtime-only and is not serialized into persisted run state.

A retry policy receives a RetryPolicyContext with:

  • attempt and maxRetries so you can make attempt-aware decisions.
  • stream so you can branch between streamed and non-streamed behavior.
  • error for raw inspection.
  • normalized facts such as statusCode, retryAfterMs, errorCode, isNetworkError, and isAbort.
  • providerAdvice when the underlying model/provider can supply retry guidance.

The policy can return either of the following (a combined sketch follows this list):

  • true / false for a simple retry decision.
  • { retry, delayMs?, reason? } when you want to override the delay or attach a diagnostic reason for logging.
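
Here is a hedged sketch of a hand-written policy combining both return shapes; the RetryPolicyContext type name follows the description above, and its import path may vary by SDK version:

Hand-written retry policy
import type { RetryPolicyContext } from '@openai/agents';
const rateLimitAware = (ctx: RetryPolicyContext) => {
  if (ctx.isAbort) return false; // aborted requests are never worth retrying
  if (ctx.statusCode === 429) {
    // Override the backoff with the server hint and attach a reason for logging.
    return { retry: true, delayMs: ctx.retryAfterMs ?? 1_000, reason: 'rate limited' };
  }
  return ctx.isNetworkError; // plain boolean decisions are also fine
};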

The SDK exports ready-made helpers on retryPolicies:

  • retryPolicies.never() – always opts out.
  • retryPolicies.providerSuggested() – follows provider retry advice when available.
  • retryPolicies.networkError() – matches transient transport/connectivity failures.
  • retryPolicies.httpStatus([..]) – matches selected HTTP status codes.
  • retryPolicies.retryAfter() – retries only when a retry-after hint is available, using that delay.
  • retryPolicies.any(...) – retries when any nested policy opts in.
  • retryPolicies.all(...) – retries only when every nested policy opts in.

When you compose policies, providerSuggested() is the safest first building block because it preserves provider vetoes and replay-safety approvals when the provider can distinguish them.

Some failures are never retried automatically:

  • Abort errors.
  • Streamed runs after any visible event or raw model event has already been emitted.
  • Provider advice that marks replay as unsafe.

Stateful follow-up requests using previousResponseId or conversationId are also treated more conservatively. For those requests, non-provider predicates such as networkError() or httpStatus([500]) are not enough by themselves. The retry policy must include a replay-safe approval from the provider, typically via retryPolicies.providerSuggested().
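
One way this could be expressed with the helpers above, as a sketch, is to gate transient-error predicates behind provider approval using all(...):

Replay-safe retries for stateful requests
import { retryPolicies } from '@openai/agents';
const statefulSafe = retryPolicies.all(
  retryPolicies.providerSuggested(), // supplies the replay-safe approval
  retryPolicies.any(
    retryPolicies.networkError(),
    retryPolicies.httpStatus([500, 502, 503]),
  ),
);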

retry is deep-merged between runner-level and agent-level modelSettings:

  • An agent can override only retry.maxRetries and still inherit the runner’s policy.
  • An agent can override only part of retry.backoff and keep sibling backoff fields from the runner.
  • If you need to remove an inherited policy or backoff, set that field to undefined explicitly (see the sketch below).
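
For example, a sketch of an agent that drops an inherited runner policy (since retries are opt-in, clearing the policy disables retries for this agent):

Clear an inherited retry policy
import { Agent } from '@openai/agents';
const noRetryAgent = new Agent({
  name: 'No retries',
  modelSettings: {
    retry: {
      policy: undefined, // explicitly clear the runner's inherited policy
    },
  },
});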

See examples/basic/retry.ts and examples/ai-sdk/retry.ts for fuller examples with logging.


Agents can be configured with a prompt parameter, indicating a server-stored prompt configuration that should be used to control the Agent’s behavior. Currently, this option is only supported when you use the OpenAI Responses API.

prompt can be either a static object or a function that returns one at runtime. For the callback shape, see Dynamic prompts.

  • promptId (string) – unique identifier for a prompt.
  • version (string) – version of the prompt you wish to use.
  • variables (object) – key/value pairs of variables to substitute into the prompt. Values can be strings or content input types like text, images, or files.

Agent with prompt
import { parseArgs } from 'node:util';
import { Agent, run } from '@openai/agents';
/*
NOTE: This example will not work out of the box, because the default prompt ID will not
be available in your project.
To use it, please:
1. Go to https://platform.openai.com/chat/edit
2. Create a new prompt variable, `poem_style`.
3. Create a system prompt with the content:
   Write a poem in {{poem_style}}
4. Run the example with the `--prompt-id` flag.
*/
const DEFAULT_PROMPT_ID =
  'pmpt_6965a984c7ac8194a8f4e79b00f838840118c1e58beb3332';
const POEM_STYLES = ['limerick', 'haiku', 'ballad'];
function pickPoemStyle(): string {
  return POEM_STYLES[Math.floor(Math.random() * POEM_STYLES.length)];
}
async function runDynamic(promptId: string) {
  const poemStyle = pickPoemStyle();
  console.log(`[debug] Dynamic poem_style: ${poemStyle}`);
  const agent = new Agent({
    name: 'Assistant',
    prompt: {
      promptId,
      version: '1',
      variables: { poem_style: poemStyle },
    },
  });
  const result = await run(agent, 'Tell me about recursion in programming.');
  console.log(result.finalOutput);
}
async function runStatic(promptId: string) {
  const agent = new Agent({
    name: 'Assistant',
    prompt: {
      promptId,
      version: '1',
      variables: { poem_style: 'limerick' },
    },
  });
  const result = await run(agent, 'Tell me about recursion in programming.');
  console.log(result.finalOutput);
}
async function main() {
  const args = parseArgs({
    options: {
      dynamic: { type: 'boolean', default: false },
      'prompt-id': { type: 'string', default: DEFAULT_PROMPT_ID },
    },
  });
  const promptId = args.values['prompt-id'];
  if (!promptId) {
    console.error('Please provide a prompt ID via --prompt-id.');
    process.exit(1);
  }
  if (args.values.dynamic) {
    await runDynamic(promptId);
  } else {
    await runStatic(promptId);
  }
}
main().catch((error) => {
  console.error(error);
  process.exit(1);
});

Any additional agent configuration, like tools or instructions, will override the values you may have configured in your stored prompt.

When a stored prompt already defines the model, the SDK does not send the agent’s default model unless you explicitly override it. That matters for computerTool(): prompt-managed runs keep the legacy preview wire shape by default for compatibility. To opt into the GA Responses computer tool on a prompt-managed run, explicitly set modelSettings.toolChoice: 'computer' or send an explicit model such as gpt-5.4. See Tools for the surrounding computer-use details.
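
A sketch of that opt-in (the prompt ID is a placeholder, and the computerTool() setup itself is covered in the Tools guide):

GA computer tool on a prompt-managed run
import { Agent } from '@openai/agents';
const agent = new Agent({
  name: 'Computer use',
  prompt: { promptId: 'pmpt_your_prompt_id', version: '1' }, // placeholder ID
  // Either force the GA built-in computer tool...
  modelSettings: { toolChoice: 'computer' },
  // ...or send an explicit model instead, e.g. model: 'gpt-5.4'.
});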


Implementing your own provider is straightforward – implement ModelProvider and Model and pass the provider to the Runner constructor:

Minimal custom provider
import {
  ModelProvider,
  Model,
  ModelRequest,
  ModelResponse,
  ResponseStreamEvent,
} from '@openai/agents-core';
import { Agent, Runner } from '@openai/agents';
class EchoModel implements Model {
  name: string;
  constructor() {
    this.name = 'Echo';
  }
  async getResponse(request: ModelRequest): Promise<ModelResponse> {
    return {
      usage: {},
      output: [{ role: 'assistant', content: request.input as string }],
    } as any;
  }
  async *getStreamedResponse(
    _request: ModelRequest,
  ): AsyncIterable<ResponseStreamEvent> {
    yield {
      type: 'response.completed',
      response: { output: [], usage: {} },
    } as any;
  }
}
class EchoProvider implements ModelProvider {
  getModel(_modelName?: string): Promise<Model> | Model {
    return new EchoModel();
  }
}
const runner = new Runner({ modelProvider: new EchoProvider() });
console.log(runner.config.modelProvider.getModel());
const agent = new Agent({
  name: 'Test Agent',
  instructions: 'You are a helpful assistant.',
  model: new EchoModel(),
  modelSettings: { temperature: 0.7, toolChoice: 'auto' },
});
console.log(agent.model);

If you want every run() call and every newly constructed Runner to use the same provider by default, set it once during app startup:

Set a default model provider
import { setDefaultModelProvider } from '@openai/agents';
setDefaultModelProvider({
  async getModel() {
    // Return any Model implementation here.
    throw new Error('Provide your own model implementation.');
  },
});

This is useful when your app standardizes on a non-OpenAI provider and you do not want to pass a custom Runner everywhere.

If you want to use non-OpenAI models without implementing ModelProvider yourself, see Using any model with Vercel’s AI SDK. That adapter lets you plug an AI SDK model into the Agents runtime directly, which is useful when your app already standardizes on AI SDK providers or you want access to the wider provider ecosystem. It also documents how Agents SDK providerData maps to AI SDK providerMetadata, plus the stream helpers available for AI SDK UI routes.
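
A minimal sketch with that adapter, assuming the @openai/agents-extensions package and an AI SDK provider such as @ai-sdk/openai:

Use an AI SDK model
import { openai } from '@ai-sdk/openai';
import { aisdk } from '@openai/agents-extensions';
import { Agent, run } from '@openai/agents';
const model = aisdk(openai('gpt-4.1-mini'));
const agent = new Agent({ name: 'AI SDK agent', model });
const result = await run(agent, 'Say hello.');
console.log(result.finalOutput);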


Tracing is already enabled by default in supported server runtimes. Use setTracingExportApiKey() only when trace export should use a different credential than the default OpenAI API key:

Set tracing export API key
import { setTracingExportApiKey } from '@openai/agents';
setTracingExportApiKey('sk-...');

This sends traces to the OpenAI dashboard using that credential. For exporter customization such as custom ingest endpoints or retry tuning, see the Tracing guide.