Guardrails

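Guardrails can run alongside your agents or block execution until they complete, letting you check and validate user input or agent output. For example, you can run a lightweight model as a guardrail before invoking an expensive model. If the guardrail detects malicious usage, it can raise an error and stop the expensive model from running.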

There are two kinds of guardrails:

Input guardrails run on the initial user input.
Output guardrails run on the final agent output.

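Input guardrails

Input guardrails run in three steps:

1. The guardrail receives the same input that was passed to the agent.
2. The guardrail function runs and returns a GuardrailFunctionOutput wrapped inside an InputGuardrailResult.
3. If tripwireTriggered is true, an InputGuardrailTripwireTriggered error is thrown.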

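Note: Input guardrails are meant for user input, so they only run when the agent is the first agent in the workflow. Guardrails are configured on the agent itself because different agents often have different requirements.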
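The three steps above can be sketched in plain TypeScript. This is a minimal illustration, not the SDK's implementation: the interfaces are simplified local stand-ins rather than imports from '@openai/agents', and `SketchInputGuardrail`, `SketchInputTripwireError`, `checkInput`, and `homeworkGuardrail` are hypothetical names introduced only for this example. A keyword check stands in for the lightweight model.

```typescript
// Simplified local shapes assumed for illustration; not imported from the SDK.
interface GuardrailFunctionOutput {
  outputInfo: unknown;
  tripwireTriggered: boolean;
}

interface InputGuardrailResult {
  guardrailName: string;
  output: GuardrailFunctionOutput;
}

interface SketchInputGuardrail {
  name: string;
  execute: (args: { input: string }) => Promise<GuardrailFunctionOutput>;
}

// Hypothetical error class playing the role of InputGuardrailTripwireTriggered.
class SketchInputTripwireError extends Error {
  readonly result: InputGuardrailResult;

  constructor(result: InputGuardrailResult) {
    super(`Input guardrail "${result.guardrailName}" triggered its tripwire`);
    // Keeps instanceof working even when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, SketchInputTripwireError.prototype);
    this.name = 'SketchInputTripwireError';
    this.result = result;
  }
}

async function checkInput(
  guardrail: SketchInputGuardrail,
  input: string,
): Promise<InputGuardrailResult> {
  // Step 1: the guardrail receives the same input as the agent.
  const output = await guardrail.execute({ input });
  // Step 2: the function's output is wrapped in a result object.
  const result: InputGuardrailResult = {
    guardrailName: guardrail.name,
    output,
  };
  // Step 3: a triggered tripwire becomes an error.
  if (output.tripwireTriggered) {
    throw new SketchInputTripwireError(result);
  }
  return result;
}

// A cheap keyword check standing in for a lightweight model.
const homeworkGuardrail: SketchInputGuardrail = {
  name: 'Homework keyword check',
  execute: async ({ input }) => {
    const matched = /solve for x/i.test(input);
    return { outputInfo: { matched }, tripwireTriggered: matched };
  },
};
```

Calling `checkInput(homeworkGuardrail, ...)` with an unrelated question resolves with the wrapped result, while a request containing "solve for x" rejects with the sketch error, whose `result` field carries the guardrail's `outputInfo` for inspection.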

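Execution modes

runInParallel: true (the default) starts the guardrail alongside the LLM/tool calls. This minimizes latency, but the model may already have consumed tokens or run tools by the time the guardrail triggers.
runInParallel: false runs the guardrail before the model is called, so no tokens are spent and no tools run when the guardrail blocks the request. Use it when safety and cost matter more than latency.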
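The trade-off between the two modes can be illustrated with ordinary promises. The sketch below does not use the SDK: `runParallel`, `runBlocking`, and `demoModes` are hypothetical helpers, the timers stand in for a guardrail check and a model call, and the parallel variant makes no attempt to cancel the in-flight model call. It only records the order of events when the guardrail trips.

```typescript
// Hypothetical stand-in for any asynchronous step (guardrail check or model call).
type Step = () => Promise<void>;

async function runParallel(guardrailCheck: Step, modelCall: Step): Promise<void> {
  // Both steps start immediately; a guardrail failure surfaces only after
  // the model call has already begun.
  await Promise.all([guardrailCheck(), modelCall()]);
}

async function runBlocking(guardrailCheck: Step, modelCall: Step): Promise<void> {
  // The model call is never started if the guardrail rejects.
  await guardrailCheck();
  await modelCall();
}

async function demoModes(): Promise<{ parallel: string[]; blocking: string[] }> {
  // Builds a tripping guardrail and a slower model call that log to `events`.
  const makeSteps = (events: string[]) => ({
    guardrailCheck: async (): Promise<void> => {
      events.push('guardrail started');
      await new Promise<void>((resolve) => setTimeout(resolve, 10));
      events.push('guardrail tripped');
      throw new Error('tripwire');
    },
    modelCall: async (): Promise<void> => {
      events.push('model started');
      await new Promise<void>((resolve) => setTimeout(resolve, 30));
      events.push('model finished');
    },
  });

  const parallel: string[] = [];
  const p = makeSteps(parallel);
  await runParallel(p.guardrailCheck, p.modelCall).catch(() => undefined);

  const blocking: string[] = [];
  const b = makeSteps(blocking);
  await runBlocking(b.guardrailCheck, b.modelCall).catch(() => undefined);

  return { parallel, blocking };
}
```

In the parallel log, 'model started' appears before 'guardrail tripped', which is the wasted work the blocking mode avoids; the blocking log contains only the two guardrail events.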

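Output guardrails

Output guardrails run in three steps:

1. The guardrail receives the output produced by the agent.
2. The guardrail function runs and returns a GuardrailFunctionOutput wrapped inside an OutputGuardrailResult.
3. If tripwireTriggered is true, an OutputGuardrailTripwireTriggered error is thrown.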

Note: Output guardrails only run when the agent is the last agent in the workflow. For realtime voice interactions, see the guide on building voice agents.

Tripwires

When a guardrail fails, it signals the failure through a tripwire. As soon as a tripwire is triggered, the runner throws the corresponding error and halts execution.
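To make the halting behavior concrete, here is a small synchronous sketch. It is an assumption-laden illustration rather than the runner's actual logic: `TripwireCheck`, `SketchTripwireError`, `enforceChecks`, `deliver`, and the two checks are hypothetical names, and the checks run one after another here, which may differ from how the SDK schedules guardrails. The point is only that the first triggered tripwire stops everything after it, and that the caller can catch the error and inspect which check failed.

```typescript
// Simplified local shape assumed for this sketch; not an SDK type.
interface TripwireCheck {
  name: string;
  outputInfo: unknown;
  tripwireTriggered: boolean;
}

// Hypothetical error type playing the role of the SDK's tripwire errors.
class SketchTripwireError extends Error {
  readonly check: TripwireCheck;

  constructor(check: TripwireCheck) {
    super(`Tripwire triggered by "${check.name}"`);
    // Keeps instanceof working even when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, SketchTripwireError.prototype);
    this.name = 'SketchTripwireError';
    this.check = check;
  }
}

// Counts how many checks actually ran, to show that execution halts early.
let checksExecuted = 0;

function mathCheck(text: string): TripwireCheck {
  checksExecuted += 1;
  const hasEquation = /=\s*\d/.test(text);
  return { name: 'Math check', outputInfo: { hasEquation }, tripwireTriggered: hasEquation };
}

function lengthCheck(text: string): TripwireCheck {
  checksExecuted += 1;
  const tooLong = text.length > 500;
  return { name: 'Length check', outputInfo: { length: text.length }, tripwireTriggered: tooLong };
}

function enforceChecks(
  text: string,
  checks: Array<(text: string) => TripwireCheck>,
): TripwireCheck[] {
  const passed: TripwireCheck[] = [];
  for (const runCheck of checks) {
    const check = runCheck(text);
    // The first triggered tripwire throws, so later checks never run.
    if (check.tripwireTriggered) {
      throw new SketchTripwireError(check);
    }
    passed.push(check);
  }
  return passed;
}

function deliver(agentOutput: string): string {
  try {
    enforceChecks(agentOutput, [mathCheck, lengthCheck]);
    return agentOutput;
  } catch (e) {
    if (e instanceof SketchTripwireError) {
      // The error carries the failing check, so the caller can react to it.
      return `Blocked by ${e.check.name}`;
    }
    throw e;
  }
}
```

Catching the specific error type and rethrowing anything else keeps unrelated failures visible instead of silently treating them as guardrail blocks.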

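Implementing a guardrail

A guardrail is simply a function that returns a GuardrailFunctionOutput. Below is a minimal example that runs another agent under the hood to check whether the user is asking for help with math homework.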

import {
  Agent,
  run,
  InputGuardrailTripwireTriggered,
  InputGuardrail,
} from '@openai/agents';
import { z } from 'zod';

const guardrailAgent = new Agent({
  name: 'Guardrail check',
  instructions: 'Check if the user is asking you to do their math homework.',
  outputType: z.object({
    isMathHomework: z.boolean(),
    reasoning: z.string(),
  }),
});

const mathGuardrail: InputGuardrail = {
  name: 'Math Homework Guardrail',
  // Set runInParallel to false to block the model until the guardrail completes.
  runInParallel: false,
  execute: async ({ input, context }) => {
    const result = await run(guardrailAgent, input, { context });
    return {
      outputInfo: result.finalOutput,
      tripwireTriggered: result.finalOutput?.isMathHomework ?? false,
    };
  },
};

const agent = new Agent({
  name: 'Customer support agent',
  instructions:
    'You are a customer support agent. You help customers with their questions.',
  inputGuardrails: [mathGuardrail],
});

async function main() {
  try {
    await run(agent, 'Hello, can you help me solve for x: 2x + 3 = 11?');
    console.log("Guardrail didn't trip - this is unexpected");
  } catch (e) {
    if (e instanceof InputGuardrailTripwireTriggered) {
      console.log('Math homework guardrail tripped');
    }
  }
}

main().catch(console.error);

Output guardrails work the same way.

import {
  Agent,
  run,
  OutputGuardrailTripwireTriggered,
  OutputGuardrail,
} from '@openai/agents';
import { z } from 'zod';

// The output by the main agent
const MessageOutput = z.object({ response: z.string() });
type MessageOutput = z.infer<typeof MessageOutput>;

// The output by the math guardrail agent
const MathOutput = z.object({ reasoning: z.string(), isMath: z.boolean() });

// The guardrail agent
const guardrailAgent = new Agent({
  name: 'Guardrail check',
  instructions: 'Check if the output includes any math.',
  outputType: MathOutput,
});

// An output guardrail using an agent internally
const mathGuardrail: OutputGuardrail<typeof MessageOutput> = {
  name: 'Math Guardrail',
  async execute({ agentOutput, context }) {
    const result = await run(guardrailAgent, agentOutput.response, {
      context,
    });
    return {
      outputInfo: result.finalOutput,
      tripwireTriggered: result.finalOutput?.isMath ?? false,
    };
  },
};

const agent = new Agent({
  name: 'Support agent',
  instructions:
    'You are a user support agent. You help users with their questions.',
  outputGuardrails: [mathGuardrail],
  outputType: MessageOutput,
});

async function main() {
  try {
    const input = 'Hello, can you help me solve for x: 2x + 3 = 11?';
    await run(agent, input);
    console.log("Guardrail didn't trip - this is unexpected");
  } catch (e) {
    if (e instanceof OutputGuardrailTripwireTriggered) {
      console.log('Math output guardrail tripped');
    }
  }
}

main().catch(console.error);

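guardrailAgent is used inside the guardrail functions.
The guardrail function receives the agent input or output and returns the result.
Extra information can be included in the guardrail result.
agent defines the actual workflow that the guardrails are applied to.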