Quickstart

Create a project

In this quickstart, we will create a voice agent that you can use in the browser. If you want to start from a fresh project, you can use Next.js or Vite.
```
npm create vite@latest my-project -- --template vanilla-ts
```
Install the Agents SDK
```
npm install @openai/agents zod@3
```
Alternatively, you can install @openai/agents-realtime to use a standalone browser package.
Generate a client ephemeral token

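Because this application runs in the user's browser, we need a secure way to connect to the model through the Realtime API. For this, use a short-lived client key, which should be generated by your backend server. For testing, you can also generate a key using curl and your regular OpenAI API key.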
```
export OPENAI_API_KEY="sk-proj-...(your own key here)"
curl -X POST https://api.openai.com/v1/realtime/client_secrets \
   -H "Authorization: Bearer $OPENAI_API_KEY" \
   -H "Content-Type: application/json" \
   -d '{
     "session": {
       "type": "realtime",
       "model": "gpt-realtime"
     }
   }'
```
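The response contains a top-level "value" string that starts with the "ek_" prefix. You can use this ephemeral key later to establish a WebRTC connection. Note that the key is only valid for a short time, so you will need to generate a new one once it expires.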
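Since the key is meant to be generated by a backend server, the curl request above can be sketched as a small server-side helper. This is a minimal sketch, not an SDK API: the names `buildClientSecretRequest`, `extractEphemeralKey`, `createClientSecret`, and the `FetchLike` type are hypothetical, and the request simply mirrors the curl command. It assumes a runtime with a `fetch`-compatible function (for example Node.js 18+) that you pass in explicitly.

```typescript
// Hypothetical helpers for minting an ephemeral client key on the server.
// The endpoint, headers, and body mirror the curl command shown above.

// Minimal fetch shape (an assumption, so the sketch stays runtime-agnostic).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function buildClientSecretRequest(apiKey: string) {
  return {
    url: 'https://api.openai.com/v1/realtime/client_secrets',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        session: { type: 'realtime', model: 'gpt-realtime' },
      }),
    },
  };
}

// Pulls the top-level "value" out of the response and checks the "ek_" prefix.
function extractEphemeralKey(payload: unknown): string {
  const value = (payload as { value?: unknown } | null)?.value;
  if (typeof value !== 'string' || !value.startsWith('ek_')) {
    throw new Error('Response did not contain an "ek_"-prefixed "value" string');
  }
  return value;
}

// Sends the request with the regular API key and returns the ephemeral key.
async function createClientSecret(
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<string> {
  const { url, init } = buildClientSecretRequest(apiKey);
  const response = await fetchImpl(url, init);
  if (!response.ok) {
    throw new Error(`Client secret request failed with status ${response.status}`);
  }
  return extractEphemeralKey(await response.json());
}
```

On a server you would call something like `createClientSecret(process.env.OPENAI_API_KEY!, fetch)` inside a route handler and return only the resulting "ek_" value to the browser, so the regular API key never leaves the backend.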

Create your first agent

Creating a new RealtimeAgent is very similar to creating a regular Agent.

```
import { RealtimeAgent } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'Assistant',
  instructions: 'You are a helpful assistant.',
});
```

Create a session

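Unlike a regular agent, a voice agent runs and listens continuously inside a RealtimeSession, which manages the conversation and the connection to the model over time. The session also handles audio processing, interruptions, and many other lifecycle features that we will cover later.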
```
import { RealtimeSession } from '@openai/agents/realtime';

const session = new RealtimeSession(agent, {
  model: 'gpt-realtime',
});
```
The RealtimeSession constructor takes an agent as its first argument. This agent is the first one your user will be able to interact with.
Connect to the session

To connect to the session, pass in the client ephemeral token you generated earlier.
```
await session.connect({ apiKey: 'ek_...(put your own key here)' });
```
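This connects to the Realtime API using WebRTC in the browser and automatically configures your microphone and speaker for audio input and output. If you run RealtimeSession on a backend server (such as Node.js), the SDK automatically uses WebSocket for the connection. You can learn more about the different transport layers in the transport mechanisms guide.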
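Because the ephemeral key is only valid for a short time, one option is to fetch a fresh key immediately before each connection attempt. The sketch below illustrates that idea under stated assumptions: `connectWithFreshKey` and the `Connectable` interface are hypothetical names, `Connectable` models only the `connect({ apiKey })` call shown above, and the "ek_" prefix check is an extra guard added here, not SDK behavior.

```typescript
// Hypothetical wrapper: obtain a fresh ephemeral key, then connect.

// Structural type covering just the connect call used in this guide;
// a RealtimeSession instance is expected to satisfy it.
interface Connectable {
  connect(options: { apiKey: string }): Promise<unknown>;
}

async function connectWithFreshKey(
  session: Connectable,
  getKey: () => Promise<string>,
): Promise<void> {
  const apiKey = await getKey();
  // Guard (our own assumption): refuse anything that is not an ephemeral key,
  // so a regular API key is never used from browser code by accident.
  if (!apiKey.startsWith('ek_')) {
    throw new Error('Expected an ephemeral client key starting with "ek_"');
  }
  await session.connect({ apiKey });
}

// Example usage, assuming a backend route of your own (hypothetical path)
// that responds with JSON of the form { "value": "ek_..." }:
//
//   await connectWithFreshKey(session, async () => {
//     const res = await fetch('/api/realtime-token', { method: 'POST' });
//     return (await res.json()).value;
//   });
```

Passing the key source in as a function keeps the wrapper independent of how your backend exposes the token and makes it easy to exercise with a fake session.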

Putting it all together

```
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

export async function setupCounter(element: HTMLButtonElement) {
  // ....
  // To get started quickly, you can append the following code to the auto-generated TS code.

  const agent = new RealtimeAgent({
    name: 'Assistant',
    instructions: 'You are a helpful assistant.',
  });
  const session = new RealtimeSession(agent);
  // Automatically connects your microphone and audio output in the browser via WebRTC.
  try {
    await session.connect({
      // To get this ephemeral key string, you can run the following command or implement the equivalent on the server side:
      // curl -s -X POST https://api.openai.com/v1/realtime/client_secrets -H "Authorization: Bearer $OPENAI_API_KEY" -H "Content-Type: application/json" -d '{"session": {"type": "realtime", "model": "gpt-realtime"}}' | jq .value
      apiKey: 'ek_...(put your own key here)',
    });
    console.log('You are connected!');
  } catch (e) {
    console.error(e);
  }
}
```

Start the server and begin talking

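Start your web server and navigate to the page that includes your new Realtime Agent code. You should see a request for microphone access. Once you grant access, you can start talking to your agent.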
```
npm run dev
```

Next steps

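From here, you can start designing and building your own voice agent. Voice agents include many of the same features as regular agents, but they also have some unique features of their own.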

Learn how to add the following to your voice agents:
Learn more about the different transport layers: