Custom Prompt Check
Implements custom content checks using configurable LLM prompts. This check uses your custom prompt to perform specialized validation, lets you define exactly what constitutes a violation, provides flexibility for business-specific validation rules, and returns structured results based on your prompt design.
Configuration
{
  "name": "Custom Prompt Check",
  "config": {
    "model": "gpt-5",
    "confidence_threshold": 0.7,
    "system_prompt_details": "Determine if the user's request needs to be escalated to a senior support agent. Indications of escalation include: ...",
    "include_reasoning": false,
    "max_turns": 10
  }
}
Parameters
- model (required): Model to use for the check (e.g., "gpt-5")
- confidence_threshold (required): Minimum confidence score required to trigger the tripwire (0.0 to 1.0); see the sketch after this list
- system_prompt_details (required): Custom instructions defining the content detection criteria
- include_reasoning (optional): Whether to include reasoning/explanation fields in the guardrail output (default: false)
  - When false: The LLM only generates the essential fields (flagged and confidence), reducing token generation costs
  - When true: Additionally returns detailed reasoning for its decisions
  - Performance: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on the model) while maintaining detection performance
  - Use case: Keep disabled for production to minimize costs; enable for development and debugging
- max_turns (optional): Maximum number of conversation turns to include for multi-turn analysis (default: 10)
  - Set to 1 for single-turn mode
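The tripwire decision combines the flagged verdict with confidence_threshold. Below is a minimal sketch of that logic; the should_trip helper is hypothetical and only illustrates the assumption that the check trips when the LLM flags the content and its confidence meets or exceeds the configured threshold (the library handles this internally).

# Hypothetical helper illustrating how flagged and confidence_threshold combine.
def should_trip(flagged: bool, confidence: float, confidence_threshold: float) -> bool:
    """Return True when the guardrail tripwire should fire for this check."""
    return flagged and confidence >= confidence_threshold

# Using the sample values from the output shown later on this page:
print(should_trip(flagged=True, confidence=0.85, confidence_threshold=0.7))  # True
print(should_trip(flagged=True, confidence=0.55, confidence_threshold=0.7))  # False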
Implementation Notes
- LLM Required: Uses an LLM for analysis
- Business Scope: system_prompt_details should clearly define your policy and acceptable topics. Effective prompt engineering is essential for optimal LLM performance and detection accuracy; an illustrative example follows below.
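As an illustration only (the policy wording below is made up for demonstration and is not taken from the library), a system_prompt_details value for the escalation scenario in the configuration above might look like this:

# Hypothetical system_prompt_details value for a support-escalation policy.
# The wording is illustrative; define criteria that match your own business rules.
SYSTEM_PROMPT_DETAILS = (
    "Determine if the user's request needs to be escalated to a senior support agent. "
    "Indications of escalation include: an explicit request for a manager or a human agent, "
    "threats of legal action or chargebacks, a safety or security incident, or an issue the "
    "user reports as unresolved after multiple previous contacts. Do not flag routine "
    "questions about billing, shipping, or product features."
)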
What It Returns
Returns a GuardrailResult with the following info dictionary:
{
  "guardrail_name": "Custom Prompt Check",
  "flagged": true,
  "confidence": 0.85,
  "threshold": 0.7,
  "token_usage": {
    "prompt_tokens": 110,
    "completion_tokens": 18,
    "total_tokens": 128
  }
}
- flagged: Whether the custom validation criteria were met
- confidence: Confidence score (0.0 to 1.0) for the validation
- threshold: The confidence threshold that was configured
- reason: Explanation of why the input was flagged (or not flagged) - only included when include_reasoning=true
- token_usage: Token usage details from the LLM call
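Downstream code typically inspects this info dictionary to log or act on the result. A minimal sketch follows, assuming the dictionary shown above is available as a plain Python dict; the summarize helper is hypothetical, and only the dictionary keys listed on this page are taken from the source.

# Minimal sketch of consuming the info dictionary returned by the check.
def summarize(info: dict) -> str:
    """Build a short log line from the check's info dictionary."""
    verdict = "FLAGGED" if info["flagged"] else "ok"
    line = (
        f"{info['guardrail_name']}: {verdict} "
        f"(confidence={info['confidence']:.2f}, threshold={info['threshold']})"
    )
    # The reason field is only present when include_reasoning is true in the config.
    if "reason" in info:
        line += f" reason={info['reason']}"
    return line

info = {
    "guardrail_name": "Custom Prompt Check",
    "flagged": True,
    "confidence": 0.85,
    "threshold": 0.7,
    "token_usage": {"prompt_tokens": 110, "completion_tokens": 18, "total_tokens": 128},
}
print(summarize(info))
# -> Custom Prompt Check: FLAGGED (confidence=0.85, threshold=0.7)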