Skip to content

Quickstart

Realtime agents in the Python SDK are server-side, low-latency agents built on the OpenAI Realtime API over WebSocket transport.

Python SDK boundary

The Python SDK does not provide a browser WebRTC transport. This page only covers Python-managed realtime sessions over server-side WebSockets. Use this SDK for server-side orchestration, tools, approvals, and telephony integrations. See also Realtime transport.

Prerequisites

  • Python 3.10 or higher
  • OpenAI API key
  • Basic familiarity with the OpenAI Agents SDK

Installation

If you haven't already, install the OpenAI Agents SDK:

pip install openai-agents

Create a server-side realtime session

1. Import the realtime components

import asyncio

from agents.realtime import RealtimeAgent, RealtimeRunner

2. Define the starting agent

agent = RealtimeAgent(
    name="Assistant",
    instructions="You are a helpful voice assistant. Keep responses short and conversational.",
)

3. Configure the runner

Prefer the nested audio.input / audio.output session settings shape for new code. For new realtime agents, start with gpt-realtime-2.

runner = RealtimeRunner(
    starting_agent=agent,
    config={
        "model_settings": {
            "model_name": "gpt-realtime-2",
            "audio": {
                "input": {
                    "format": "pcm16",
                    "transcription": {"model": "gpt-4o-mini-transcribe"},
                    "turn_detection": {
                        "type": "semantic_vad",
                        "interrupt_response": True,
                    },
                },
                "output": {
                    "format": "pcm16",
                    "voice": "ash",
                },
            },
        }
    },
)

4. Start the session and send input

runner.run() returns a RealtimeSession. The connection is opened when you enter the session context.

async def main() -> None:
    session = await runner.run()

    async with session:
        await session.send_message("Say hello in one short sentence.")

        async for event in session:
            if event.type == "audio":
                # Forward or play event.audio.data.
                pass
            elif event.type == "history_added":
                print(event.item)
            elif event.type == "agent_end":
                # One assistant turn finished.
                break
            elif event.type == "error":
                print(f"Error: {event.error}")


if __name__ == "__main__":
    asyncio.run(main())

session.send_message() accepts either a plain string or a structured realtime message. For raw audio chunks, use session.send_audio().

What this quickstart does not include

Key settings

Once the basic session works, the settings most people reach for next are:

  • model_name
  • audio.input.format, audio.output.format
  • audio.input.transcription
  • audio.input.noise_reduction
  • audio.input.turn_detection for automatic turn detection
  • audio.output.voice
  • tool_choice, prompt, tracing
  • async_tool_calls, guardrails_settings.debounce_text_length, tool_error_formatter

The older flat aliases such as input_audio_format, output_audio_format, input_audio_transcription, and turn_detection still work, but nested audio settings are preferred for new code.

For manual turn control, use a raw session.update / input_audio_buffer.commit / response.create flow as described in the Realtime agents guide.

For the full schema, see RealtimeRunConfig and RealtimeSessionModelSettings.

Connection options

Set your API key in the environment:

export OPENAI_API_KEY="your-api-key-here"

Or pass it directly when starting the session:

session = await runner.run(model_config={"api_key": "your-api-key"})

model_config also supports:

  • url: Custom WebSocket endpoint
  • headers: Custom request headers
  • call_id: Attach to an existing realtime call. In this repo, the documented attach flow is SIP.
  • playback_tracker: Report how much audio the user has actually heard

If you pass headers explicitly, the SDK will not inject an Authorization header for you.

When connecting to Azure OpenAI, pass a GA Realtime endpoint URL in model_config["url"] and explicit headers. Avoid the legacy beta path (/openai/realtime?api-version=...) with realtime agents. See the Realtime agents guide for details.

Next steps