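Transport Mechanisms
WebRTC connection
The default transport layer uses WebRTC. Audio is recorded from the microphone and played back automatically.
To use your own media stream or audio element, provide an OpenAIRealtimeWebRTC instance when creating the session.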
import {
  RealtimeAgent,
  RealtimeSession,
  OpenAIRealtimeWebRTC,
} from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'Greeter',
  instructions: 'Greet the user with cheer and answer questions.',
});

async function main() {
  // Supply your own microphone stream and output element instead of the defaults.
  const transport = new OpenAIRealtimeWebRTC({
    mediaStream: await navigator.mediaDevices.getUserMedia({ audio: true }),
    audioElement: document.createElement('audio'),
  });

  const customSession = new RealtimeSession(agent, { transport });
}

WebSocket connection
Pass transport: 'websocket' or an instance of OpenAIRealtimeWebSocket when creating the session to use a WebSocket connection instead of WebRTC. This works well for server-side use cases, such as building a phone agent with Twilio.
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'Greeter',
  instructions: 'Greet the user with cheer and answer questions.',
});

// Placeholder for audio you have recorded yourself.
const myRecordedArrayBuffer = new ArrayBuffer(0);

const wsSession = new RealtimeSession(agent, {
  transport: 'websocket',
  model: 'gpt-realtime',
});
await wsSession.connect({ apiKey: process.env.OPENAI_API_KEY! });

wsSession.on('audio', (event) => {
  // event.data is a chunk of PCM16 audio
});

wsSession.sendAudio(myRecordedArrayBuffer);

Use any recording/playback library to handle the raw PCM16 audio bytes.
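Many capture APIs, including the Web Audio API, deliver samples as 32-bit floats in the range [-1, 1], whereas sendAudio in the example above is given PCM16 bytes. The following is a minimal conversion sketch, assuming little-endian 16-bit mono samples. floatToPcm16 and pcm16ToFloat are hypothetical helper names and are not part of the SDK.

```typescript
// Hypothetical helper (not an SDK API): Float32 samples in [-1, 1] -> PCM16 bytes.
function floatToPcm16(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, samples[i] ?? 0));
    // Scale each half separately: -1 maps to -32768 and 1 maps to 32767.
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true); // true = little-endian (assumed)
  }
  return buffer;
}

// Hypothetical helper (not an SDK API): PCM16 bytes -> Float32 samples for playback.
function pcm16ToFloat(data: ArrayBuffer): Float32Array {
  const view = new DataView(data);
  const out = new Float32Array(data.byteLength >> 1);
  for (let i = 0; i < out.length; i++) {
    const v = view.getInt16(i * 2, true);
    out[i] = v < 0 ? v / 0x8000 : v / 0x7fff;
  }
  return out;
}
```

With helpers like these, a recorded Float32Array could be passed as wsSession.sendAudio(floatToPcm16(samples)), and chunks from the audio event could be converted with pcm16ToFloat(event.data) before playback. The sample rate must still match what the session expects.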
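SIP connection
Bridge SIP calls from providers such as Twilio by using the OpenAIRealtimeSIP transport. The transport keeps the Realtime session synchronized with the SIP events emitted by your telephony provider.
- Accept the incoming call by generating an initial session configuration with OpenAIRealtimeSIP.buildInitialConfig(). This ensures the SIP invitation and the Realtime session share identical defaults.
- Attach a RealtimeSession that uses the OpenAIRealtimeSIP transport, and connect with the callId issued by the provider webhook.
- Listen for session events to drive call analytics, transcription, or escalation logic.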
import OpenAI from 'openai';
import {
  OpenAIRealtimeSIP,
  RealtimeAgent,
  RealtimeSession,
  type RealtimeSessionOptions,
} from '@openai/agents/realtime';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
  webhookSecret: process.env.OPENAI_WEBHOOK_SECRET!,
});

const agent = new RealtimeAgent({
  name: 'Receptionist',
  instructions:
    'Welcome the caller, answer scheduling questions, and hand off if the caller requests a human.',
});

// Shared by the initial SIP config and the attached session so both use the same defaults.
const sessionOptions: Partial<RealtimeSessionOptions> = {
  model: 'gpt-realtime',
  config: {
    audio: {
      input: {
        turnDetection: { type: 'semantic_vad', interruptResponse: true },
      },
    },
  },
};

// Step 1: accept the incoming call with a generated initial configuration.
export async function acceptIncomingCall(callId: string): Promise<void> {
  const initialConfig = await OpenAIRealtimeSIP.buildInitialConfig(
    agent,
    sessionOptions,
  );
  await openai.realtime.calls.accept(callId, initialConfig);
}

// Step 2: attach a RealtimeSession to the accepted call.
export async function attachRealtimeSession(
  callId: string,
): Promise<RealtimeSession> {
  const session = new RealtimeSession(agent, {
    transport: new OpenAIRealtimeSIP(),
    ...sessionOptions,
  });

  // Step 3: listen for session events.
  session.on('history_added', (item) => {
    console.log('Realtime update:', item.type);
  });

  await session.connect({
    apiKey: process.env.OPENAI_API_KEY!,
    callId,
  });

  return session;
}

Cloudflare Workers (workerd) notes
Cloudflare Workers and other workerd runtimes cannot open outbound WebSockets with the global WebSocket constructor. Use the Cloudflare transport from the extensions package instead, which performs the fetch()-based upgrade internally.
import { CloudflareRealtimeTransportLayer } from '@openai/agents-extensions';
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'My Agent',
});

// Create a transport that connects to OpenAI Realtime via Cloudflare/workerd's fetch-based upgrade.
const cfTransport = new CloudflareRealtimeTransportLayer({
  url: 'wss://api.openai.com/v1/realtime?model=gpt-realtime',
});

const session = new RealtimeSession(agent, {
  // Set your own transport.
  transport: cfTransport,
});

Custom transport mechanisms
If you want to use a different speech-to-speech API, or you have your own custom transport mechanism, you can create your own transport layer by implementing the RealtimeTransportLayer interface and emitting the RealtimeTransportEventTypes events.
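To illustrate the general shape of a custom transport, here is a self-contained loopback sketch. It does not implement the SDK's actual RealtimeTransportLayer interface, which declares more members than shown here. MinimalTransport, TransportEvents, and LoopbackTransport are hypothetical names, and only the method and event names that appear elsewhere on this page (connect, sendEvent, sendAudio, the * and audio events) are borrowed. Consult the SDK's type definitions for the real contract.

```typescript
// Hypothetical, reduced stand-in for a transport contract (NOT the SDK interface).
type RawEvent = { type: string; [key: string]: unknown };

type TransportEvents = {
  '*': [event: RawEvent];
  audio: [event: { data: ArrayBuffer }];
};

interface MinimalTransport {
  connect(options: { apiKey: string }): Promise<void>;
  sendEvent(event: RawEvent): void;
  sendAudio(audio: ArrayBuffer): void;
  close(): void;
  on<K extends keyof TransportEvents>(
    type: K,
    listener: (...args: TransportEvents[K]) => void,
  ): void;
}

// Echoes everything it is sent, as if a backend had replied with the same payload.
class LoopbackTransport implements MinimalTransport {
  private listeners = new Map<string, Array<(...args: any[]) => void>>();
  private connected = false;

  async connect(_options: { apiKey: string }): Promise<void> {
    // A real transport would open a connection to the speech-to-speech backend here.
    this.connected = true;
  }

  sendEvent(event: RawEvent): void {
    this.assertConnected();
    this.emit('*', event);
  }

  sendAudio(audio: ArrayBuffer): void {
    this.assertConnected();
    this.emit('audio', { data: audio });
  }

  close(): void {
    this.connected = false;
    this.listeners.clear();
  }

  on<K extends keyof TransportEvents>(
    type: K,
    listener: (...args: TransportEvents[K]) => void,
  ): void {
    const list = this.listeners.get(type) ?? [];
    list.push(listener);
    this.listeners.set(type, list);
  }

  private emit<K extends keyof TransportEvents>(
    type: K,
    ...args: TransportEvents[K]
  ): void {
    for (const listener of this.listeners.get(type) ?? []) listener(...args);
  }

  private assertConnected(): void {
    if (!this.connected) throw new Error('Transport is not connected');
  }
}
```

A loopback like this is mainly useful for exercising event handlers in tests. A production transport would replace the echo logic with real network I/O and translate the backend's messages into the events the session expects.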
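Interacting with the Realtime API more directly
If you want to use the OpenAI Realtime API but need more direct access to it, you have two options:
Option 1 - Access the transport layer
If you still want to benefit from all the capabilities of RealtimeSession, you can access your transport layer through session.transport.
The transport layer emits every event it receives under the * event, and you can send raw events with the sendEvent() method.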
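Because every raw event arrives on the single * channel, a small dispatcher keyed on each event's type field can keep per-event logic separate. The following is a minimal sketch. createRawEventRouter is a hypothetical helper, not an SDK function, and it assumes each parsed event is an object with a string type property.

```typescript
// Assumed shape of a parsed event: an object carrying a string `type`.
type RawEvent = { type: string; [key: string]: unknown };
type RawHandler = (event: RawEvent) => void;

// Hypothetical helper (not an SDK API): builds one listener that forwards each
// event to the handler registered for its type, or to an optional fallback.
function createRawEventRouter(
  handlers: Record<string, RawHandler>,
  fallback?: RawHandler,
): RawHandler {
  return (event) => {
    // Only own keys count, so a type such as 'toString' never hits Object.prototype.
    const registered = Object.prototype.hasOwnProperty.call(handlers, event.type)
      ? handlers[event.type]
      : undefined;
    (registered ?? fallback)?.(event);
  };
}
```

The returned function could then be registered once, for example session.transport.on('*', router), with handlers for event types such as response.done. Depending on how the SDK types its raw events, a cast may be needed at the registration site.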
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'Greeter',
  instructions: 'Greet the user with cheer and answer questions.',
});

const session = new RealtimeSession(agent, {
  model: 'gpt-realtime',
});

session.transport.on('*', (event) => {
  // JSON parsed version of the event received on the connection
});

// Send any valid event as JSON. For example, triggering a new response:
session.transport.sendEvent({
  type: 'response.create',
  // ...
});

Option 2 - Use only the transport layer
If you do not need automatic tool execution, guardrails, and similar features, you can also use the transport layer as a "thin" client that only manages the connection and interruptions.
import { OpenAIRealtimeWebRTC } from '@openai/agents/realtime';

const client = new OpenAIRealtimeWebRTC();
const audioBuffer = new ArrayBuffer(0);

await client.connect({
  apiKey: '<api key>',
  model: 'gpt-4o-mini-realtime-preview',
  initialSessionConfig: {
    instructions: 'Speak like a pirate',
    voice: 'ash',
    modalities: ['text', 'audio'],
    inputAudioFormat: 'pcm16',
    outputAudioFormat: 'pcm16',
  },
});

// optionally for WebSockets
client.on('audio', (newAudio) => {});

client.sendAudio(audioBuffer);