Usage

The Agents SDK automatically tracks token usage for every run. You can access it from the run context and use it to monitor costs, enforce limits, or record analytics.

What is tracked

  • requests: number of LLM API calls made
  • input_tokens: total input tokens sent
  • output_tokens: total output tokens received
  • total_tokens: input + output
  • details:
      • input_tokens_details.cached_tokens
      • output_tokens_details.reasoning_tokens
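
As one way to use these counters for cost monitoring, the following is a minimal sketch of a cost estimate. The function name estimate_cost_usd and all prices are hypothetical placeholders, not part of the SDK, and the sketch assumes that cached tokens are counted as a subset of input_tokens.

```python
from typing import Optional


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
    cached_input_price_per_million: Optional[float] = None,
) -> float:
    """Estimate spend in USD from token counts (hypothetical helper).

    Prices are supplied by the caller; none are built in. Cached tokens are
    assumed to be included in input_tokens and billed at their own rate.
    """
    # Fall back to the regular input price when no cached price is given.
    if cached_input_price_per_million is None:
        cached_input_price_per_million = input_price_per_million

    # Split input tokens into uncached and cached portions.
    uncached_tokens = max(input_tokens - cached_tokens, 0)

    cost = uncached_tokens * input_price_per_million
    cost += cached_tokens * cached_input_price_per_million
    cost += output_tokens * output_price_per_million
    return cost / 1_000_000
```

With a usage object from a run, the call might look like estimate_cost_usd(usage.input_tokens, usage.output_tokens, usage.input_tokens_details.cached_tokens, ...), with the prices for your model filled in.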

Accessing usage from a run

After Runner.run(...), access usage via result.context_wrapper.usage.

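from agents import Runner

# Assumes `agent` is an existing Agent and that this runs inside an async function.
result = await Runner.run(agent, "What's the weather in Tokyo?")
usage = result.context_wrapper.usage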

print("Requests:", usage.requests)
print("Input tokens:", usage.input_tokens)
print("Output tokens:", usage.output_tokens)
print("Total tokens:", usage.total_tokens)

Usage is aggregated across all model calls during the run (including tool calls and handoffs).
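
Because the total covers the whole run, it can be compared against a budget once the run finishes. The following is a minimal sketch of such a check; TokenBudgetExceeded and check_token_budget are hypothetical names, not SDK APIs, and the check runs after the fact rather than stopping a run in progress.

```python
class TokenBudgetExceeded(RuntimeError):
    """Hypothetical error raised when a run used more tokens than allowed."""


def check_token_budget(total_tokens: int, max_total_tokens: int) -> int:
    """Return the tokens remaining in the budget, or raise if it was exceeded."""
    remaining = max_total_tokens - total_tokens
    if remaining < 0:
        raise TokenBudgetExceeded(
            f"run used {total_tokens} tokens, budget is {max_total_tokens}"
        )
    return remaining
```

After a run, this could be called as check_token_budget(result.context_wrapper.usage.total_tokens, 10_000), where 10_000 is an arbitrary example limit.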

Accessing usage with sessions

When you use a Session (e.g., SQLiteSession), each call to Runner.run(...) returns usage for that run only. The session preserves conversation history between runs, but usage counters do not carry over from one run to the next.

from agents import Runner, SQLiteSession

session = SQLiteSession("my_conversation")

first = await Runner.run(agent, "Hi!", session=session)
print(first.context_wrapper.usage.total_tokens)  # usage for the first run

second = await Runner.run(agent, "Can you elaborate?", session=session)
print(second.context_wrapper.usage.total_tokens)  # usage for the second run

Using usage in hooks

If you’re using RunHooks, the context object passed to each hook contains usage. This lets you log usage at key lifecycle moments.

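from typing import Any
from agents import Agent, RunContextWrapper, RunHooks

class MyHooks(RunHooks):
    async def on_agent_end(self, context: RunContextWrapper, agent: Agent, output: Any) -> None:
        u = context.usage  # usage accumulated so far in this run
        print(f"{agent.name}: {u.requests} requests, {u.total_tokens} total tokens")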
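
For analytics, a structured record is often easier to process than a free-form print. The following is a minimal sketch of a formatter that a hook could call; usage_log_line is a hypothetical helper, not part of the SDK, and the field names in the record are illustrative.

```python
import json


def usage_log_line(
    agent_name: str, requests: int, input_tokens: int, output_tokens: int
) -> str:
    """Serialize usage counters as one JSON line (hypothetical helper)."""
    record = {
        "agent": agent_name,
        "requests": requests,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        # Derived here so the record is self-consistent.
        "total_tokens": input_tokens + output_tokens,
    }
    # sort_keys gives a stable field order, which helps when diffing logs.
    return json.dumps(record, sort_keys=True)
```

Inside on_agent_end, this could be used as print(usage_log_line(agent.name, u.requests, u.input_tokens, u.output_tokens)), or the line could be passed to a logger instead of print.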