
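Transport mechanisms

The default transport layer uses WebRTC. Audio is recorded from the microphone and played back automatically.

To use your own media stream or audio element, provide an OpenAIRealtimeWebRTC instance when creating the session.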

import { RealtimeAgent, RealtimeSession, OpenAIRealtimeWebRTC } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'Greeter',
  instructions: 'Greet the user with cheer and answer questions.',
});

async function main() {
  // Supply your own microphone stream and output element to the WebRTC transport.
  const transport = new OpenAIRealtimeWebRTC({
    mediaStream: await navigator.mediaDevices.getUserMedia({ audio: true }),
    audioElement: document.createElement('audio'),
  });
  const customSession = new RealtimeSession(agent, { transport });
}

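Pass transport: 'websocket' or an OpenAIRealtimeWebSocket instance when creating the session to use a WebSocket connection instead of WebRTC. This works well for server-side use cases, such as building a phone agent with Twilio.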

import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'Greeter',
  instructions: 'Greet the user with cheer and answer questions.',
});

// Placeholder for audio you recorded elsewhere.
const myRecordedArrayBuffer = new ArrayBuffer(0);

const wsSession = new RealtimeSession(agent, {
  transport: 'websocket',
  model: 'gpt-realtime',
});
await wsSession.connect({ apiKey: process.env.OPENAI_API_KEY! });

wsSession.on('audio', (event) => {
  // event.data is a chunk of PCM16 audio
});

wsSession.sendAudio(myRecordedArrayBuffer);

Use any recording/playback library to handle the raw PCM16 audio bytes.
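Many recording libraries (and the Web Audio API) produce Float32 samples rather than PCM16. The following is a minimal sketch of converting between the two. The helper names floatToPcm16 and pcm16ToFloat are hypothetical and not part of the SDK, and the sketch assumes mono, little-endian samples.

```typescript
// Hypothetical helpers, not part of @openai/agents.
// Assumes mono audio with little-endian 16-bit samples.

// Convert Float32 samples in [-1, 1] to a PCM16 ArrayBuffer.
function floatToPcm16(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  samples.forEach((sample, i) => {
    // Clamp out-of-range input so it cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, sample));
    // Scale asymmetrically: -1 maps to -32768 and +1 maps to 32767.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true);
  });
  return buffer;
}

// Convert a PCM16 ArrayBuffer back to Float32 samples in [-1, 1].
function pcm16ToFloat(buffer: ArrayBuffer): Float32Array {
  const view = new DataView(buffer);
  const out = new Float32Array(Math.floor(buffer.byteLength / 2));
  for (let i = 0; i < out.length; i++) {
    const value = view.getInt16(i * 2, true);
    out[i] = value < 0 ? value / 0x8000 : value / 0x7fff;
  }
  return out;
}
```

The result of floatToPcm16 could then be passed to sendAudio(), and chunks received in the 'audio' event could be run through pcm16ToFloat before playback, provided the playback library expects Float32 input.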

Cloudflare Workers (workerd) notes


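Cloudflare Workers and other workerd runtimes cannot open outbound WebSockets with the global WebSocket constructor. Use the Cloudflare transport from the extensions package, which performs the fetch()-based upgrade internally.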

import { CloudflareRealtimeTransportLayer } from '@openai/agents-extensions';
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'My Agent',
});

// Create a transport that connects to OpenAI Realtime via Cloudflare/workerd's fetch-based upgrade.
const cfTransport = new CloudflareRealtimeTransportLayer({
  url: 'wss://api.openai.com/v1/realtime?model=gpt-realtime',
});

const session = new RealtimeSession(agent, {
  // Set your own transport.
  transport: cfTransport,
});
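For background, a fetch()-based upgrade in workerd generally means sending a request with an Upgrade: websocket header, reading the webSocket property of the response, and calling accept() on it. The sketch below illustrates that pattern only; it is not the extension package's actual code. The names toFetchUrl, openSocket, and FetchLike are hypothetical, and the fetch function is injected so the shapes stay explicit.

```typescript
// Minimal shapes for the parts of a workerd fetch() response used here.
interface WorkerSocket {
  accept(): void;
  send(data: string): void;
}
interface UpgradeResponse {
  status: number;
  webSocket?: WorkerSocket | null;
}
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<UpgradeResponse>;

// fetch() takes an http(s) URL, so map ws:// to http:// and wss:// to https://.
function toFetchUrl(wsUrl: string): string {
  return wsUrl.replace(/^ws(s?):\/\//, 'http$1://');
}

// Hypothetical helper: request a WebSocket upgrade through fetch().
async function openSocket(
  wsUrl: string,
  headers: Record<string, string>,
  fetchImpl: FetchLike,
): Promise<WorkerSocket> {
  const response = await fetchImpl(toFetchUrl(wsUrl), {
    headers: { ...headers, Upgrade: 'websocket' },
  });
  const socket = response.webSocket;
  if (!socket) {
    throw new Error(`WebSocket upgrade failed with status ${response.status}`);
  }
  // The socket must be accepted before messages can be sent or received.
  socket.accept();
  return socket;
}
```

When using CloudflareRealtimeTransportLayer you should not need to write any of this yourself; the sketch only shows why the global WebSocket constructor is not involved.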

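If you want to use a different speech-to-speech API, or have your own custom transport mechanism, you can create your own transport layer by implementing the RealtimeTransportLayer interface and emitting the RealtimeTransportEventTypes events.

If you want to use the OpenAI Realtime API but need more direct access to it, you have two options:

If you still want to benefit from all the capabilities of RealtimeSession, you can access the transport layer through session.transport.

The transport layer emits every event it receives under the * event, and you can send raw events with the sendEvent() method.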

import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'Greeter',
  instructions: 'Greet the user with cheer and answer questions.',
});

const session = new RealtimeSession(agent, {
  model: 'gpt-realtime',
});

session.transport.on('*', (event) => {
  // JSON parsed version of the event received on the connection
});

// Send any valid event as JSON. For example triggering a new response
session.transport.sendEvent({
  type: 'response.create',
  // ...
});
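Because the * listener receives every event, it usually helps to dispatch on the event's type field. Below is a minimal sketch of such a dispatcher. RawEvent, Handler, and createRouter are hypothetical names, and the only assumption about the event shape is that it carries a string type.

```typescript
// Hypothetical minimal event shape: a string `type` plus arbitrary fields.
interface RawEvent {
  type: string;
  [key: string]: unknown;
}
type Handler = (event: RawEvent) => void;

// Build one listener that forwards each event to the handler registered
// for its type, or to the optional fallback when no handler matches.
function createRouter(handlers: Map<string, Handler>, fallback?: Handler): Handler {
  return (event) => {
    const handler = handlers.get(event.type) ?? fallback;
    handler?.(event);
  };
}

// Example wiring: record what was seen instead of acting on it.
const seen: string[] = [];
const route = createRouter(
  new Map<string, Handler>([
    ['response.done', () => { seen.push('done'); }],
    ['error', (event) => { seen.push(`error:${String(event.error)}`); }],
  ]),
  (event) => { seen.push(`other:${event.type}`); },
);
```

The resulting route function could then be registered with session.transport.on('*', route), assuming the listener is called with the parsed event as in the example above.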

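Alternatively, if you do not need automatic tool execution, guardrails, and similar features, you can use the transport layer on its own as a "thin" client that only manages the connection and interruptions.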

import { OpenAIRealtimeWebRTC } from '@openai/agents/realtime';

const client = new OpenAIRealtimeWebRTC();
const audioBuffer = new ArrayBuffer(0);

await client.connect({
  apiKey: '<api key>',
  model: 'gpt-4o-mini-realtime-preview',
  initialSessionConfig: {
    instructions: 'Speak like a pirate',
    voice: 'ash',
    modalities: ['text', 'audio'],
    inputAudioFormat: 'pcm16',
    outputAudioFormat: 'pcm16',
  },
});

// optionally for WebSockets
client.on('audio', (newAudio) => {});

client.sendAudio(audioBuffer);
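When sending recorded audio over a WebSocket connection, you may prefer to stream it in small pieces rather than as one large buffer. The sketch below splits a PCM16 buffer into fixed-duration chunks. chunkPcm16 is a hypothetical helper, and the 24 kHz mono sample rate is an assumption that should be checked against the audio format you have configured.

```typescript
// Assumption: PCM16 mono at 24 kHz, i.e. 2 bytes per sample.
const SAMPLE_RATE_HZ = 24000;
const BYTES_PER_SAMPLE = 2;

// Hypothetical helper: split a PCM16 buffer into chunks of about chunkMs each.
// The last chunk may be shorter than the others.
function chunkPcm16(audio: ArrayBuffer, chunkMs: number): ArrayBuffer[] {
  if (chunkMs <= 0) {
    throw new RangeError('chunkMs must be positive');
  }
  const samplesPerChunk = Math.max(1, Math.floor((SAMPLE_RATE_HZ * chunkMs) / 1000));
  const bytesPerChunk = samplesPerChunk * BYTES_PER_SAMPLE;
  const chunks: ArrayBuffer[] = [];
  for (let offset = 0; offset < audio.byteLength; offset += bytesPerChunk) {
    // slice() clamps the end index, so the final chunk is simply shorter.
    chunks.push(audio.slice(offset, offset + bytesPerChunk));
  }
  return chunks;
}
```

Each chunk could then be passed to client.sendAudio() in order, for example inside a for...of loop over chunkPcm16(audioBuffer, 100).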