From Theory Delta | Methodology | Published 2026-03-17
The OpenAI Agents SDK provides a guardrails system that lets you define safety checks on agent inputs and outputs. When a check fails, the guardrail trips a tripwire that is meant to halt the run before the offending content reaches the user. The SDK supports both streaming and non-streaming execution modes.
Guardrails and streaming are architecturally incompatible. When you use streaming mode, content is delivered to the user as it is generated. Guardrails run as a parallel check, but they complete after the content has already been streamed. By the time a guardrail trips, the user has already seen the content it was supposed to block.
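The timing problem can be illustrated with a minimal asyncio sketch. This is not the SDK's internals, just a model of the race it describes: chunks flow to the consumer as they are produced, while the guardrail check only finishes after delivery is complete.

```python
import asyncio

STREAM = ["Here is ", "the blocked ", "content."]

async def generate(queue):
    # Produce chunks as the model emits them.
    for chunk in STREAM:
        await queue.put(chunk)
    await queue.put(None)  # end-of-stream sentinel

async def consume(queue, delivered):
    # Each chunk reaches the "user" the moment it is produced.
    while (chunk := await queue.get()) is not None:
        delivered.append(chunk)

async def guardrail(delivered):
    # Simulate a check that completes only after streaming has finished.
    await asyncio.sleep(0.01)
    return "blocked" in "".join(delivered)

async def main():
    queue = asyncio.Queue()
    delivered = []
    await asyncio.gather(generate(queue), consume(queue, delivered))
    tripped = await guardrail(delivered)
    # The tripwire fires, but every chunk was already delivered.
    return delivered, tripped

delivered, tripped = asyncio.run(main())
```

The guardrail returns `True`, yet `delivered` already holds the full text: tripping after the fact can log or abort the run, but it cannot un-send anything.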
This is not a bug -- it is a design constraint that OpenAI has acknowledged and marked as NOT_PLANNED to fix. The streaming architecture fundamentally cannot support pre-delivery content filtering without buffering the entire response, which would defeat the purpose of streaming.
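The buffering trade-off mentioned above is easy to see in a sketch. The function name and shape here are illustrative, not an SDK API: buffer the whole response, run the check, then deliver all at once.

```python
def buffered_stream(chunks, check):
    # Buffer the full response before anything is shown to the user.
    # The guardrail is now enforced pre-delivery, but the user waits
    # for the entire generation -- streaming's latency benefit is gone.
    full = "".join(chunks)
    if check(full):
        raise ValueError("guardrail tripped; nothing was delivered")
    return full

# Usage: a trivial check that flags the word "blocked".
safe = buffered_stream(["All ", "clear."], lambda text: "blocked" in text)
```

This is the only way to get pre-delivery filtering, which is why a fix inside the streaming path itself was marked NOT_PLANNED.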
Mixed-model handoff pipelines break in a different way. When a reasoning model (like o1 or o3) hands off to a non-reasoning agent, the reasoning model's internal items (chain-of-thought traces) are passed in the conversation context. Non-reasoning agents cannot process these items and crash. This makes heterogeneous agent pipelines -- where you want a reasoning model for planning and a faster model for execution -- unreliable.
The combination means: if you want guardrails, you cannot stream. If you want streaming, your guardrails are decorative. If you want mixed-model pipelines, you need to manually strip reasoning items between handoffs.
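A minimal sketch of the manual stripping workaround, assuming conversation items are dicts carrying a `type` field (adjust the predicate to whatever item representation your pipeline actually uses):

```python
def strip_reasoning_items(items):
    """Drop reasoning items before passing context to a non-reasoning agent.

    Hypothetical helper: assumes each item is a dict with a 'type' field;
    the Agents SDK's real item objects may need a different predicate.
    """
    return [item for item in items if item.get("type") != "reasoning"]

# Usage: filter history at the handoff boundary.
history = [
    {"type": "message", "role": "user", "content": "Plan the refactor."},
    {"type": "reasoning", "content": "<chain-of-thought trace>"},
    {"type": "message", "role": "assistant", "content": "Step 1: ..."},
]
cleaned = strip_reasoning_items(history)
```

Running this at every reasoning-to-non-reasoning handoff keeps the downstream agent from receiving items it cannot process, at the cost of discarding the planner's trace.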
| Tool | Version | Result |
|---|---|---|
| openai-agents-python | v0.11.1 | Guardrails execute after streaming; NOT_PLANNED confirmed |
Confidence: medium -- the streaming/guardrail incompatibility is confirmed through source code review and the NOT_PLANNED label on the GitHub issue. The mixed-model handoff crash is confirmed through issue reports. No runtime reproduction was performed.
Falsification criterion: This claim would be disproved by demonstrating that the OpenAI Agents SDK can enforce guardrails on streamed content before it reaches the user, or by OpenAI removing the NOT_PLANNED label and shipping a fix.
Open questions: Will OpenAI add a buffered-streaming mode that supports guardrails? Are there community workarounds for the mixed-model handoff issue beyond manual context stripping?
Seen different? Contribute your evidence -- theory delta is what makes this knowledge base work.