Guardrails

Guardrails can run alongside your agents or block execution until they complete, allowing you to perform checks and validations on user input or agent output. For example, you may run a lightweight model as a guardrail before invoking an expensive model. If the guardrail detects malicious usage, it can trigger an error and stop the costly model from running.

There are two kinds of guardrails:

  1. Input guardrails run on the initial user input.
  2. Output guardrails run on the final agent output.

Input guardrails run in three steps:

  1. The guardrail receives the same input passed to the agent.
  2. The guardrail function executes and returns a GuardrailFunctionOutput wrapped inside an InputGuardrailResult.
  3. If tripwireTriggered is true, an InputGuardrailTripwireTriggered error is thrown.

Note: Input guardrails are intended for user input, so they only run if the agent is the first agent in the workflow. Guardrails are configured on the agent itself because different agents often require different guardrails.

The runInParallel option on a guardrail controls when it runs relative to the model:

  • runInParallel: true (default) starts guardrails alongside the LLM/tool calls. This minimizes latency, but if the guardrail later triggers, the model may already have consumed tokens or run tools.
  • runInParallel: false runs the guardrail before calling the model, preventing token spend and tool execution when the guardrail blocks the request. Use this when safety and cost matter more than latency.
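
The trade-off between the two modes can be sketched without the SDK. The block below is a simplified model, not the SDK's runner: runWithInputGuardrail, GuardrailCheck, ModelCall, and the fake model are hypothetical names introduced only for illustration. It counts how often the model is called when the guardrail trips in each mode.

```typescript
// Simplified model of the runInParallel trade-off. All names here are
// hypothetical stand-ins; this is not how the SDK's runner is implemented.
type GuardrailCheck = () => Promise<{ tripwireTriggered: boolean }>;
type ModelCall = () => Promise<string>;

async function runWithInputGuardrail(
  check: GuardrailCheck,
  callModel: ModelCall,
  runInParallel: boolean,
): Promise<string> {
  if (!runInParallel) {
    // Blocking mode: wait for the guardrail, and only then call the model.
    const { tripwireTriggered } = await check();
    if (tripwireTriggered) throw new Error('input guardrail tripwire');
    return callModel();
  }
  // Parallel mode: the model call starts before the guardrail result is
  // known, so it may already have done work when the tripwire fires.
  const modelPromise = callModel();
  // Avoid an unhandled rejection if the model fails after we have thrown.
  modelPromise.catch(() => undefined);
  const { tripwireTriggered } = await check();
  if (tripwireTriggered) throw new Error('input guardrail tripwire');
  return modelPromise;
}

async function demoRunModes(): Promise<void> {
  let modelCalls = 0;
  const alwaysTrips: GuardrailCheck = async () => ({ tripwireTriggered: true });
  const fakeModel: ModelCall = async () => {
    modelCalls += 1;
    return 'model output';
  };
  for (const runInParallel of [true, false]) {
    modelCalls = 0;
    await runWithInputGuardrail(alwaysTrips, fakeModel, runInParallel).catch(
      () => undefined,
    );
    console.log(`runInParallel=${runInParallel}: model calls = ${modelCalls}`);
  }
}

demoRunModes().catch(console.error);
// prints "runInParallel=true: model calls = 1"
// then   "runInParallel=false: model calls = 0"
```

In both modes the caller sees an error, but only the blocking mode guarantees that the model was never invoked.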

Output guardrails run in three steps:

  1. The guardrail receives the output produced by the agent.
  2. The guardrail function executes and returns a GuardrailFunctionOutput wrapped inside an OutputGuardrailResult.
  3. If tripwireTriggered is true, an OutputGuardrailTripwireTriggered error is thrown.

Note: Output guardrails only run if the agent is the last agent in the workflow. For realtime voice interactions, see the voice agents guide.
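
The three output guardrail steps above can be sketched with local stand-in types. This is a minimal sketch under stated assumptions: the interfaces below only mirror the shapes named in the text, and guardrailName, OutputTripwireError, mathOutputGuardrail, and checkFinalOutput are hypothetical names, not SDK exports. The check itself is a plain regular expression rather than a model call.

```typescript
// Local stand-ins mirroring the shapes described above. Only outputInfo and
// tripwireTriggered come from the text; every other name is an assumption.
interface GuardrailFunctionOutput {
  outputInfo: unknown;
  tripwireTriggered: boolean;
}

interface OutputGuardrailResult {
  guardrailName: string;
  output: GuardrailFunctionOutput;
}

class OutputTripwireError extends Error {
  readonly result: OutputGuardrailResult;

  constructor(result: OutputGuardrailResult) {
    super(`Output guardrail "${result.guardrailName}" tripped`);
    this.result = result;
  }
}

// A cheap guardrail: flag output that contains something like "2 * 4".
function mathOutputGuardrail(agentOutput: string): GuardrailFunctionOutput {
  const match = agentOutput.match(/\d+\s*[-+*\/=]\s*\d+/);
  return {
    outputInfo: { matched: match ? match[0] : null },
    tripwireTriggered: match !== null,
  };
}

function checkFinalOutput(agentOutput: string): string {
  // Step 1: the guardrail receives the output produced by the agent.
  // Step 2: its return value is wrapped in a result object.
  const result: OutputGuardrailResult = {
    guardrailName: 'math_output',
    output: mathOutputGuardrail(agentOutput),
  };
  // Step 3: a triggered tripwire becomes a thrown error.
  if (result.output.tripwireTriggered) {
    throw new OutputTripwireError(result);
  }
  return agentOutput;
}
```

Carrying the wrapped result on the error lets the caller inspect outputInfo to see why the guardrail tripped.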

Tool guardrails wrap function tools and let you validate or block tool calls before and after execution. They are configured on the tool itself (via tool() options) and run for every tool invocation.

  • Input tool guardrails run before the tool executes and can reject the call with a message or throw a tripwire.
  • Output tool guardrails run after the tool executes and can replace the output with a rejection message or throw a tripwire.

Tool guardrails return a behavior:

  • allow — continue to the next guardrail or tool execution.
  • rejectContent — short‑circuit with a message (tool call is skipped or output is replaced).
  • throwException — throw a tripwire error immediately.
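
How the three behaviors affect a single tool call can be sketched as a small pipeline. This is an illustrative model only: ToolGuardrailBehavior, callToolWithGuardrails, and the string-based guardrail signatures are hypothetical simplifications, not the SDK's types.

```typescript
// Hypothetical local model of the three behaviors; not the SDK's types.
type ToolGuardrailBehavior =
  | { type: 'allow' }
  | { type: 'rejectContent'; message: string }
  | { type: 'throwException' };

type ToolInputGuardrail = (args: string) => ToolGuardrailBehavior;
type ToolOutputGuardrail = (output: string) => ToolGuardrailBehavior;

function callToolWithGuardrails(
  args: string,
  execute: (args: string) => string,
  inputGuardrails: ToolInputGuardrail[],
  outputGuardrails: ToolOutputGuardrail[],
): string {
  for (const guardrail of inputGuardrails) {
    const behavior = guardrail(args);
    if (behavior.type === 'throwException') {
      throw new Error('tool input tripwire');
    }
    if (behavior.type === 'rejectContent') {
      // The tool call is skipped; the message is returned in its place.
      return behavior.message;
    }
    // allow: continue to the next guardrail.
  }

  const output = execute(args);

  for (const guardrail of outputGuardrails) {
    const behavior = guardrail(output);
    if (behavior.type === 'throwException') {
      throw new Error('tool output tripwire');
    }
    if (behavior.type === 'rejectContent') {
      // The real output is replaced by the rejection message.
      return behavior.message;
    }
  }
  return output;
}
```

In this model a rejectContent result still gives the caller a normal return value, so the run can continue, whereas throwException aborts the call.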

Tool guardrails apply to function tools created with tool(). Hosted tools and local built‑in tools (computerTool, shellTool, applyPatchTool) do not use this guardrail pipeline.

When a guardrail fails, it signals this via a tripwire. As soon as a tripwire is triggered, the runner throws the corresponding error and halts execution.

A guardrail is simply a function that returns a GuardrailFunctionOutput. Below is a minimal example that checks whether the user is asking for math homework help by running another agent under the hood.

Input guardrail example
import {
  Agent,
  run,
  InputGuardrailTripwireTriggered,
  InputGuardrail,
} from '@openai/agents';
import { z } from 'zod';

// A small agent whose only job is to classify the user's request.
const guardrailAgent = new Agent({
  name: 'Guardrail check',
  instructions: 'Check if the user is asking you to do their math homework.',
  outputType: z.object({
    isMathHomework: z.boolean(),
    reasoning: z.string(),
  }),
});

const mathGuardrail: InputGuardrail = {
  name: 'Math Homework Guardrail',
  // Set runInParallel to false to block the model until the guardrail completes.
  runInParallel: false,
  execute: async ({ input, context }) => {
    const result = await run(guardrailAgent, input, { context });
    return {
      outputInfo: result.finalOutput,
      // Trip when the check agent reports math homework; treat a missing
      // output as "not tripped".
      tripwireTriggered: result.finalOutput?.isMathHomework ?? false,
    };
  },
};

const agent = new Agent({
  name: 'Customer support agent',
  instructions:
    'You are a customer support agent. You help customers with their questions.',
  inputGuardrails: [mathGuardrail],
});

async function main() {
  try {
    await run(agent, 'Hello, can you help me solve for x: 2x + 3 = 11?');
    console.log("Guardrail didn't trip - this is unexpected");
  } catch (e) {
    if (e instanceof InputGuardrailTripwireTriggered) {
      console.log('Math homework guardrail tripped');
    }
  }
}

main().catch(console.error);

Output guardrails work the same way.

Output guardrail example
import {
  Agent,
  run,
  OutputGuardrailTripwireTriggered,
  OutputGuardrail,
} from '@openai/agents';
import { z } from 'zod';

// The output by the main agent
const MessageOutput = z.object({ response: z.string() });
type MessageOutput = z.infer<typeof MessageOutput>;

// The output by the math guardrail agent
const MathOutput = z.object({ reasoning: z.string(), isMath: z.boolean() });

// The guardrail agent
const guardrailAgent = new Agent({
  name: 'Guardrail check',
  instructions: 'Check if the output includes any math.',
  outputType: MathOutput,
});

// An output guardrail using an agent internally
const mathGuardrail: OutputGuardrail<typeof MessageOutput> = {
  name: 'Math Guardrail',
  async execute({ agentOutput, context }) {
    const result = await run(guardrailAgent, agentOutput.response, {
      context,
    });
    return {
      outputInfo: result.finalOutput,
      tripwireTriggered: result.finalOutput?.isMath ?? false,
    };
  },
};

const agent = new Agent({
  name: 'Support agent',
  instructions:
    'You are a user support agent. You help users with their questions.',
  outputGuardrails: [mathGuardrail],
  outputType: MessageOutput,
});

async function main() {
  try {
    const input = 'Hello, can you help me solve for x: 2x + 3 = 11?';
    await run(agent, input);
    console.log("Guardrail didn't trip - this is unexpected");
  } catch (e) {
    if (e instanceof OutputGuardrailTripwireTriggered) {
      console.log('Math output guardrail tripped');
    }
  }
}

main().catch(console.error);

Tool input/output guardrails look like this:

Tool guardrails
import {
  Agent,
  ToolGuardrailFunctionOutputFactory,
  defineToolInputGuardrail,
  defineToolOutputGuardrail,
  tool,
} from '@openai/agents';
import { z } from 'zod';

// Runs before the tool: reject calls whose arguments look like they contain a secret.
const blockSecrets = defineToolInputGuardrail({
  name: 'block_secrets',
  run: async ({ toolCall }) => {
    const args = JSON.parse(toolCall.arguments) as { text?: string };
    if (args.text?.includes('sk-')) {
      return ToolGuardrailFunctionOutputFactory.rejectContent(
        'Remove secrets before calling this tool.',
      );
    }
    return ToolGuardrailFunctionOutputFactory.allow();
  },
});

// Runs after the tool: replace output that looks like it contains a secret.
const redactOutput = defineToolOutputGuardrail({
  name: 'redact_output',
  run: async ({ output }) => {
    const text = String(output ?? '');
    if (text.includes('sk-')) {
      return ToolGuardrailFunctionOutputFactory.rejectContent(
        'Output contained sensitive data.',
      );
    }
    return ToolGuardrailFunctionOutputFactory.allow();
  },
});

const classifyTool = tool({
  name: 'classify_text',
  description: 'Classify text for internal routing.',
  parameters: z.object({
    text: z.string(),
  }),
  inputGuardrails: [blockSecrets],
  outputGuardrails: [redactOutput],
  execute: ({ text }) => `length:${text.length}`,
});

const agent = new Agent({
  name: 'Classifier',
  instructions: 'Classify incoming text.',
  tools: [classifyTool],
});
void agent;

Notes on the input and output guardrail examples above:

  1. guardrailAgent is used inside the guardrail functions.
  2. The guardrail function receives the agent input or output and returns the result.
  3. Extra information can be included in the guardrail result.
  4. agent defines the actual workflow where guardrails are applied.