Voice Agents Quickstart
-
Create a project
In this quickstart we will create a voice agent you can use in the browser. If you want to check out a new project, you can try out
Next.js
orVite
.Terminal window npm create vite@latest my-project --template vanilla-ts -
Install the Agents SDK
Terminal window npm install @openai/agentsAlternatively you can install
@openai/agents-realtime
for a standalone browser package. -
Generate a client ephemeral token
As this application will run in the users browser, we need a secure way to connect to the model through the Realtime API. For this we can use a ephemeral client key that should get generated on your backend server. For testing purposes you can also generate a key using
curl
and your regular OpenAI API key.Terminal window curl -X POST https://api.openai.com/v1/realtime/sessions \-H "Authorization: Bearer $OPENAI_API_KEY" \-H "Content-Type: application/json" \-d '{"model": "gpt-4o-realtime-preview-2025-06-03"}'The response will contain a
client_secret.value
value that you can use to connect later on. Note that this key is only valid for a short period of time and will need to be regenerated. -
Create your first Agent
Creating a new
RealtimeAgent
is very similar to creating a regularAgent
.import { RealtimeAgent } from '@openai/agents-realtime';const agent = new RealtimeAgent({name: 'Assistant',instructions: 'You are a helpful assistant.',}); -
Create a session
Unlike a regular agent, a Voice Agent is continously running and listening inside a
RealtimeSession
that handles the conversation and connection to the model over time. This session will also handle the audio processing, interruptions, and a lot of the other lifecycle functionality we will cover later on.import { RealtimeSession } from '@openai/agents-realtime';const session = new RealtimeSession(agent, {model: 'gpt-4o-realtime-preview-2025-06-03',});The
RealtimeSession
constructor takes anagent
as the first argument. This agent will be the first agent that your user will be able to interact with. -
Connect to the session
To connect to the session you need to pass the client ephemeral token you generated earlier on.
await session.connect({ apiKey: '<client-api-key>' });This will connect to the Realtime API using WebRTC in the browser and automatically configure your microphone and speaker for audio input and output. If you are running your
RealtimeSession
on a backend server (like Node.js) the SDK will automatically use WebSocket as a connection. You can learn more about the different transport layers in the Realtime Transport Layer guide. -
Putting it all together
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';const agent = new RealtimeAgent({name: 'Assistant',instructions: 'You are a helpful assistant.',});const session = new RealtimeSession(agent);// Automatically connects your microphone and audio output// in the browser via WebRTC.await session.connect({apiKey: '<client-api-key>',}); -
Fire up the engines and start talking
Start up your webserver and navigate to the page that includes your new Realtime Agent code. You should see a request for microphone access. Once you grant access you should be able to start talking to your agent.
Terminal window npm run dev
Next Steps
Section titled “Next Steps”From here you can start designing and building your own voice agent. Voice agents include a lot of the same features as regular agents, but have some of their own unique features.
-
Learn how to give your voice agent:
-
Learn more about the different transport layers.