Models
Every Agent ultimately calls an LLM. The SDK abstracts models behind two lightweight interfaces:
- `Model` – knows how to make one request against a specific API.
- `ModelProvider` – resolves human-readable model names (e.g. `'gpt-5.4'`) to `Model` instances.
In day‑to‑day work you normally only interact with model names and occasionally ModelSettings.
```typescript
import { Agent } from '@openai/agents';

const agent = new Agent({
  name: 'Creative writer',
  model: 'gpt-5.4',
});
```

Choosing models
Default model
When you don’t specify a model when initializing an Agent, the default model is used. The default is currently gpt-4.1 for compatibility and low latency. If you have access, we recommend setting your agents to gpt-5.4 for higher quality, along with explicit modelSettings.
If you want to switch to other models like gpt-5.4, there are two ways to configure your agents.
First, if you want to consistently use a specific model for all agents that do not set a custom model, set the OPENAI_DEFAULT_MODEL environment variable before running your agents.
```shell
export OPENAI_DEFAULT_MODEL=gpt-5.4
node my-awesome-agent.js
```

Second, you can set a default model for a Runner instance. If you don’t set a model for an agent, this Runner’s default model will be used.
```typescript
import { Runner } from '@openai/agents';

const runner = new Runner({ model: 'gpt-4.1-mini' });
```

GPT-5.x models
When you use any GPT-5.x model such as gpt-5.4 this way, the SDK applies default modelSettings that work well for most use cases. To adjust the reasoning effort for the default model, pass your own modelSettings:
```typescript
import { Agent } from '@openai/agents';

const myAgent = new Agent({
  name: 'My Agent',
  instructions: "You're a helpful agent.",
  // If OPENAI_DEFAULT_MODEL=gpt-5.4 is set, passing only modelSettings works.
  // It's also fine to pass a GPT-5.x model name explicitly:
  model: 'gpt-5.4',
  modelSettings: {
    reasoning: { effort: 'high' },
    text: { verbosity: 'low' },
  },
});
```

If latency matters, start with reasoning.effort: "none" on gpt-5.4 and increase it only if your task needs more deliberate reasoning. The gpt-4.1 family (including mini and nano variants) also remains a solid choice for building interactive agent apps.
Non-GPT-5 models
If you pass a non-GPT-5 model name without custom modelSettings, the SDK reverts to generic modelSettings compatible with any model.
OpenAI provider configuration
The OpenAI provider
The default ModelProvider resolves names using the OpenAI APIs. It supports two distinct endpoints:
| API | Usage | Call setOpenAIAPI() |
|---|---|---|
| Chat Completions | Standard chat & function calls | setOpenAIAPI('chat_completions') |
| Responses | New streaming‑first generative API (tool calls, flexible outputs) | setOpenAIAPI('responses') (default) |
Authentication
```typescript
import { setDefaultOpenAIKey } from '@openai/agents';

setDefaultOpenAIKey(process.env.OPENAI_API_KEY!); // sk-...
```

You can also plug in your own OpenAI client via setDefaultOpenAIClient(client) if you need custom networking settings.
Responses WebSocket transport
When you use the OpenAI provider with the Responses API, you can send requests over a WebSocket transport instead of the default HTTP transport.
Enable it globally with setOpenAIResponsesTransport('websocket'), or enable it per provider with new OpenAIProvider({ useResponses: true, useResponsesWebSocket: true }).
You do not need withResponsesWebSocketSession(...) or a custom OpenAIProvider just to use the WebSocket transport. If reconnecting for each run/request is acceptable, your existing run() / Runner.run() usage will continue to work after enabling setOpenAIResponsesTransport('websocket').
Transport selection follows model resolution:
- `setOpenAIResponsesTransport('websocket')` only affects string model names that are later resolved through the OpenAI provider while using the Responses API.
- If you pass a concrete `Model` instance to an `Agent` or `Runner`, that instance is used as-is. `OpenAIResponsesWSModel` stays on WebSocket, `OpenAIResponsesModel` stays on HTTP, and `OpenAIChatCompletionsModel` stays on Chat Completions.
- If you provide your own `modelProvider`, that provider controls model resolution. Enable WebSocket there instead of relying on the global setter.
- If you route through a proxy, gateway, or other OpenAI-compatible endpoint, the target must support the WebSocket `/responses` endpoint. You may also need to set `websocketBaseURL` explicitly.
Use withResponsesWebSocketSession(...) or a custom OpenAIProvider / Runner only when you want to optimize connection reuse and manage the websocket provider lifecycle more explicitly:
- `withResponsesWebSocketSession(...)`: convenient scoped lifecycle with automatic cleanup after the callback.
- Custom `OpenAIProvider` / `Runner`: explicit lifecycle control (including shutdown cleanup) in your own app architecture.
Despite the name, withResponsesWebSocketSession(...) is a transport lifecycle helper and is unrelated to the memory Session interface described in the sessions guide.
If you use a websocket proxy or gateway, configure websocketBaseURL on OpenAIProvider or set OPENAI_WEBSOCKET_BASE_URL.
If you instantiate OpenAIProvider yourself, remember that websocket-backed Responses model wrappers are cached by default for connection reuse. Call await provider.close() during shutdown to release those cached connections. withResponsesWebSocketSession(...) exists largely to manage that lifecycle for you: it creates a websocket-enabled provider and runner, passes them to your callback, and always closes the provider afterward. Use providerOptions for the temporary provider and runnerConfig for callback-scoped runner defaults.
See examples/basic/stream-ws.ts for a full streaming + HITL example using the Responses WebSocket transport.
Responses-only deferred tool loading
toolSearchTool(), toolNamespace(), and function tools or hosted MCP tools that set deferLoading: true require the OpenAI Responses API. The Chat Completions provider rejects namespaced or deferred function tools, and the AI SDK adapter does not support deferred Responses tool-loading flows. Use a Responses model directly when you need tool search.
Tool search is supported only on GPT-5.4 and newer model releases that support it in the Responses API.
When a run includes deferred tools, add toolSearchTool() to the same agent and keep modelSettings.toolChoice on 'auto'. The SDK does not let you force the built-in tool_search tool or a deferred function tool by name because the model needs to decide when to load those definitions. See the Tools guide and the official OpenAI tool search guide for the full setup.
Model behavior and prompts
ModelSettings
ModelSettings mirrors the OpenAI parameters but is provider-agnostic.
| Field | Type | Notes |
|---|---|---|
| `temperature` | `number` | Creativity vs. determinism. |
| `topP` | `number` | Nucleus sampling. |
| `frequencyPenalty` | `number` | Penalise repeated tokens. |
| `presencePenalty` | `number` | Encourage new tokens. |
| `toolChoice` | `'auto' \| 'required' \| 'none' \| string` | See forcing tool use. On OpenAI Responses, `toolChoice: 'computer'` forces the GA built-in computer tool when available. |
| `parallelToolCalls` | `boolean` | Allow parallel function calls where supported. |
| `truncation` | `'auto' \| 'disabled'` | Token truncation strategy. |
| `maxTokens` | `number` | Maximum tokens in the response. |
| `store` | `boolean` | Persist the response for retrieval / RAG workflows. |
| `promptCacheRetention` | `'in-memory' \| '24h' \| null` | Controls provider prompt-cache retention when supported. |
| `reasoning.effort` | `'none' \| 'minimal' \| 'low' \| 'medium' \| 'high' \| 'xhigh'` | Reasoning effort for gpt-5.x models. |
| `reasoning.summary` | `'auto' \| 'concise' \| 'detailed'` | Controls how much reasoning summary the model returns. |
| `text.verbosity` | `'low' \| 'medium' \| 'high'` | Text verbosity for gpt-5.x models. |
| `providerData` | `Record<string, any>` | Provider-specific passthrough options forwarded to the underlying model. |
| `retry` | `ModelRetrySettings` | Runtime-only opt-in retry config. See Model retries. |
Attach settings at either level:
```typescript
import { Runner, Agent } from '@openai/agents';

const agent = new Agent({
  name: 'Creative writer',
  // ...
  modelSettings: { temperature: 0.7, toolChoice: 'auto' },
});

// or globally
new Runner({ modelSettings: { temperature: 0.3 } });
```

Runner-level settings override any conflicting per-agent settings. retry is the notable exception: its nested fields are merged across runner and agent settings unless you explicitly clear inherited values with undefined.
Model retries
Retries are runtime-only and opt-in. The SDK does not retry model requests unless you configure modelSettings.retry and your policy returns a retry decision.
```typescript
import { Agent, Runner, retryPolicies } from '@openai/agents';

const sharedRetry = {
  maxRetries: 4,
  backoff: {
    initialDelayMs: 500,
    maxDelayMs: 5_000,
    multiplier: 2,
    jitter: true,
  },
  policy: retryPolicies.any(
    retryPolicies.providerSuggested(),
    retryPolicies.retryAfter(),
    retryPolicies.networkError(),
    retryPolicies.httpStatus([408, 409, 429, 500, 502, 503, 504]),
  ),
};

const runner = new Runner({
  modelSettings: {
    retry: sharedRetry,
  },
});

const agent = new Agent({
  name: 'Assistant',
  instructions: 'You are a concise assistant.',
  modelSettings: {
    retry: {
      maxRetries: 2,
      backoff: {
        maxDelayMs: 2_000,
      },
    },
  },
});

await runner.run(agent, 'Summarize exponential backoff in plain English.');
```

ModelRetrySettings has three fields:
| Field | Type | Notes |
|---|---|---|
| `maxRetries` | `number` | Number of retry attempts allowed after the initial request. |
| `backoff` | `{ initialDelayMs?, maxDelayMs?, multiplier?, jitter? }` | Default delay strategy when the policy retries without returning `delayMs`. |
| `policy` | `RetryPolicy` | Callback that decides whether to retry. This function is runtime-only and is not serialized into persisted run state. |
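To make the backoff fields concrete, here is an illustrative sketch of how they could translate into per-attempt delays. This is not the SDK's internal implementation, and the exact jitter formula is an assumption (full jitter between zero and the computed delay):

```typescript
// Illustrative only: translate { initialDelayMs, maxDelayMs, multiplier, jitter }
// into a delay for a given retry attempt (0-based).
type Backoff = {
  initialDelayMs?: number;
  maxDelayMs?: number;
  multiplier?: number;
  jitter?: boolean;
};

function backoffDelayMs(attempt: number, cfg: Backoff): number {
  const initial = cfg.initialDelayMs ?? 1_000;
  const multiplier = cfg.multiplier ?? 2;
  const cap = cfg.maxDelayMs ?? 30_000;
  const raw = Math.min(initial * multiplier ** attempt, cap); // exponential growth, capped
  return cfg.jitter ? Math.random() * raw : raw; // full jitter when enabled (assumed formula)
}

const cfg = { initialDelayMs: 500, multiplier: 2, maxDelayMs: 5_000 };
// attempt 0 → 500 ms, attempt 1 → 1000 ms, attempt 4 → capped at 5000 ms
```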
A retry policy receives a RetryPolicyContext with:
- `attempt` and `maxRetries` so you can make attempt-aware decisions.
- `stream` so you can branch between streamed and non-streamed behavior.
- `error` for raw inspection.
- `normalized` facts such as `statusCode`, `retryAfterMs`, `errorCode`, `isNetworkError`, and `isAbort`.
- `providerAdvice` when the underlying model/provider can supply retry guidance.
The policy can return either:
- `true` / `false` for a simple retry decision.
- `{ retry, delayMs?, reason? }` when you want to override the delay or attach a diagnostic reason for logging.
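For illustration, here is a minimal hand-written policy using the context and return shapes described above. The local type definitions are paraphrased from this guide, not copied from the SDK:

```typescript
// Local, paraphrased shapes for illustration (not the SDK's exported types).
type RetryPolicyContext = {
  attempt: number;
  maxRetries: number;
  stream: boolean;
  error: unknown;
  normalized: {
    statusCode?: number;
    retryAfterMs?: number;
    errorCode?: string;
    isNetworkError?: boolean;
    isAbort?: boolean;
  };
};

type RetryDecision = boolean | { retry: boolean; delayMs?: number; reason?: string };

// Retry 429/503 responses, honoring a retry-after hint when present.
function rateLimitPolicy(ctx: RetryPolicyContext): RetryDecision {
  if (ctx.normalized.isAbort) return false; // never retry aborts
  const status = ctx.normalized.statusCode;
  if (status === 429 || status === 503) {
    return {
      retry: true,
      delayMs: ctx.normalized.retryAfterMs, // falls back to backoff config when undefined
      reason: `http ${status}`,
    };
  }
  return false;
}
```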
The SDK exports ready-made helpers on retryPolicies:
| Helper | Behavior |
|---|---|
| `retryPolicies.never()` | Always opts out. |
| `retryPolicies.providerSuggested()` | Follows provider retry advice when available. |
| `retryPolicies.networkError()` | Matches transient transport/connectivity failures. |
| `retryPolicies.httpStatus([..])` | Matches selected HTTP status codes. |
| `retryPolicies.retryAfter()` | Retries only when a retry-after hint is available, using that delay. |
| `retryPolicies.any(...)` | Retries when any nested policy opts in. |
| `retryPolicies.all(...)` | Retries only when every nested policy opts in. |
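The `any` / `all` combinator semantics from the table can be sketched with plain boolean-returning policies. This simplified model ignores delay hints and reasons and only captures the opt-in logic; it is not the SDK's implementation:

```typescript
// Simplified: policies are plain predicates over a context object.
type SimplePolicy<Ctx> = (ctx: Ctx) => boolean;

function anyPolicy<Ctx>(...policies: SimplePolicy<Ctx>[]): SimplePolicy<Ctx> {
  return (ctx) => policies.some((p) => p(ctx)); // retries when any nested policy opts in
}

function allPolicy<Ctx>(...policies: SimplePolicy<Ctx>[]): SimplePolicy<Ctx> {
  return (ctx) => policies.every((p) => p(ctx)); // retries only when every policy opts in
}

type Ctx = { statusCode?: number; isNetworkError?: boolean };
const httpStatus = (codes: number[]): SimplePolicy<Ctx> => (ctx) =>
  ctx.statusCode !== undefined && codes.includes(ctx.statusCode);
const networkError = (): SimplePolicy<Ctx> => (ctx) => ctx.isNetworkError === true;

const combined = anyPolicy(networkError(), httpStatus([429, 503]));
// combined({ statusCode: 429 }) opts in; combined({ statusCode: 400 }) opts out.
```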
When you compose policies, providerSuggested() is the safest first building block because it preserves provider vetoes and replay-safety approvals when the provider can distinguish them.
Safety boundaries
Some failures are never retried automatically:
- Abort errors.
- Streamed runs after any visible event or raw model event has already been emitted.
- Provider advice that marks replay as unsafe.
Stateful follow-up requests using previousResponseId or conversationId are also treated more conservatively. For those requests, non-provider predicates such as networkError() or httpStatus([500]) are not enough by themselves. The retry policy must include a replay-safe approval from the provider, typically via retryPolicies.providerSuggested().
Runner and agent merge behavior
retry is deep-merged between runner-level and agent-level modelSettings:
- An agent can override only `retry.maxRetries` and still inherit the runner’s `policy`.
- An agent can override only part of `retry.backoff` and keep sibling backoff fields from the runner.
- If you need to remove an inherited `policy` or `backoff`, set that field to `undefined` explicitly.
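That merge behavior can be sketched on plain objects. This is illustrative only, not the SDK's actual merge code, and it does not model the explicit-`undefined` clearing rule:

```typescript
// Illustrative deep merge of runner- and agent-level retry settings.
type RetrySettings = {
  maxRetries?: number;
  backoff?: { initialDelayMs?: number; maxDelayMs?: number; multiplier?: number; jitter?: boolean };
  policy?: unknown;
};

function mergeRetry(runner: RetrySettings, agent: RetrySettings): RetrySettings {
  return {
    ...runner,
    ...agent, // agent-level scalars (maxRetries, policy) win when set
    backoff: { ...runner.backoff, ...agent.backoff }, // nested backoff merges key by key
  };
}

const effective = mergeRetry(
  { maxRetries: 4, policy: 'sharedPolicy', backoff: { initialDelayMs: 500, maxDelayMs: 5_000 } },
  { maxRetries: 2, backoff: { maxDelayMs: 2_000 } },
);
// effective: maxRetries 2 (agent override), policy 'sharedPolicy' (inherited),
// backoff { initialDelayMs: 500, maxDelayMs: 2000 } (sibling field kept).
```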
See examples/basic/retry.ts and examples/ai-sdk/retry.ts for fuller examples with logging.
Prompt
Agents can be configured with a prompt parameter, indicating a server-stored prompt configuration that should be used to control the Agent’s behavior. Currently, this option is only supported when you use the OpenAI Responses API.
prompt can be either a static object or a function that returns one at runtime. For the callback shape, see Dynamic prompts.
| Field | Type | Notes |
|---|---|---|
| `promptId` | `string` | Unique identifier for a prompt. |
| `version` | `string` | Version of the prompt you wish to use. |
| `variables` | `object` | A key/value pair of variables to substitute into the prompt. Values can be strings or content input types like text, images, or files. |
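Conceptually, placeholders like `{{poem_style}}` in the stored prompt are replaced by the values you pass in `variables`. A toy sketch of that substitution (the real substitution happens on OpenAI's servers and also supports non-string content types):

```typescript
// Toy illustration of server-side variable substitution: replace each
// {{name}} placeholder with its value, leaving unknown placeholders intact.
function substituteVariables(
  template: string,
  variables: Record<string, string>,
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    name in variables ? variables[name] : match,
  );
}

substituteVariables('Write a poem in {{poem_style}}', { poem_style: 'haiku' });
// → 'Write a poem in haiku'
```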
```typescript
import { parseArgs } from 'node:util';
import { Agent, run } from '@openai/agents';

/*
NOTE: This example will not work out of the box, because the default prompt ID
will not be available in your project.

To use it, please:
1. Go to https://platform.openai.com/chat/edit
2. Create a new prompt variable, `poem_style`.
3. Create a system prompt with the content: Write a poem in {{poem_style}}
4. Run the example with the `--prompt-id` flag.
*/

const DEFAULT_PROMPT_ID = 'pmpt_6965a984c7ac8194a8f4e79b00f838840118c1e58beb3332';
const POEM_STYLES = ['limerick', 'haiku', 'ballad'];

function pickPoemStyle(): string {
  return POEM_STYLES[Math.floor(Math.random() * POEM_STYLES.length)];
}

async function runDynamic(promptId: string) {
  const poemStyle = pickPoemStyle();
  console.log(`[debug] Dynamic poem_style: ${poemStyle}`);

  const agent = new Agent({
    name: 'Assistant',
    prompt: {
      promptId,
      version: '1',
      variables: { poem_style: poemStyle },
    },
  });

  const result = await run(agent, 'Tell me about recursion in programming.');
  console.log(result.finalOutput);
}

async function runStatic(promptId: string) {
  const agent = new Agent({
    name: 'Assistant',
    prompt: {
      promptId,
      version: '1',
      variables: { poem_style: 'limerick' },
    },
  });

  const result = await run(agent, 'Tell me about recursion in programming.');
  console.log(result.finalOutput);
}

async function main() {
  const args = parseArgs({
    options: {
      dynamic: { type: 'boolean', default: false },
      'prompt-id': { type: 'string', default: DEFAULT_PROMPT_ID },
    },
  });

  const promptId = args.values['prompt-id'];
  if (!promptId) {
    console.error('Please provide a prompt ID via --prompt-id.');
    process.exit(1);
  }

  if (args.values.dynamic) {
    await runDynamic(promptId);
  } else {
    await runStatic(promptId);
  }
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});
```

Any additional agent configuration, like tools or instructions, will override the values you may have configured in your stored prompt.
When a stored prompt already defines the model, the SDK does not send the agent’s default model unless you explicitly override it. That matters for computerTool(): prompt-managed runs keep the legacy preview wire shape by default for compatibility. To opt into the GA Responses computer tool on a prompt-managed run, explicitly set modelSettings.toolChoice: 'computer' or send an explicit model such as gpt-5.4. See Tools for the surrounding computer-use details.
Advanced providers and observability
Custom model providers
Implementing your own provider is straightforward – implement ModelProvider and Model and pass the provider to the Runner constructor:
```typescript
import {
  ModelProvider,
  Model,
  ModelRequest,
  ModelResponse,
  ResponseStreamEvent,
} from '@openai/agents-core';
import { Agent, Runner } from '@openai/agents';

class EchoModel implements Model {
  name: string;
  constructor() {
    this.name = 'Echo';
  }
  async getResponse(request: ModelRequest): Promise<ModelResponse> {
    return {
      usage: {},
      output: [{ role: 'assistant', content: request.input as string }],
    } as any;
  }
  async *getStreamedResponse(
    _request: ModelRequest,
  ): AsyncIterable<ResponseStreamEvent> {
    yield {
      type: 'response.completed',
      response: { output: [], usage: {} },
    } as any;
  }
}

class EchoProvider implements ModelProvider {
  getModel(_modelName?: string): Promise<Model> | Model {
    return new EchoModel();
  }
}

const runner = new Runner({ modelProvider: new EchoProvider() });
console.log(runner.config.modelProvider.getModel());

const agent = new Agent({
  name: 'Test Agent',
  instructions: 'You are a helpful assistant.',
  model: new EchoModel(),
  modelSettings: { temperature: 0.7, toolChoice: 'auto' },
});
console.log(agent.model);
```

If you want every run() call and every newly constructed Runner to use the same provider by default, set it once during app startup:
```typescript
import { setDefaultModelProvider } from '@openai/agents';

setDefaultModelProvider({
  async getModel() {
    // Return any Model implementation here.
    throw new Error('Provide your own model implementation.');
  },
});
```

This is useful when your app standardizes on a non-OpenAI provider and you do not want to pass a custom Runner everywhere.
AI SDK integration
If you want to use non-OpenAI models without implementing ModelProvider yourself, see Using any model with Vercel’s AI SDK. That adapter lets you plug an AI SDK model into the Agents runtime directly, which is useful when your app already standardizes on AI SDK providers or you want access to the wider provider ecosystem. It also documents how Agents SDK providerData maps to AI SDK providerMetadata, plus the stream helpers available for AI SDK UI routes.
Tracing credentials
Tracing is already enabled by default in supported server runtimes. Use setTracingExportApiKey() only when trace export should use a different credential than the default OpenAI API key:

```typescript
import { setTracingExportApiKey } from '@openai/agents';

setTracingExportApiKey('sk-...');
```

This sends traces to the OpenAI dashboard using that credential. For exporter customization such as custom ingest endpoints or retry tuning, see the Tracing guide.
Next steps
- Explore running agents.
- Give your models super‑powers with tools.
- Add guardrails or tracing as needed.