
Quickstart

Modern agents work best when they can operate on real files in a filesystem. Sandbox Agents in the Agents SDK give the model a persistent workspace where it can search large document sets, edit files, run commands, generate artifacts, and pick work back up from saved sandbox state.

The SDK gives you that execution harness without making you wire together file staging, filesystem tools, shell access, sandbox lifecycle, snapshots, and provider-specific glue yourself. You keep the normal Agent and Runner flow, then add a Manifest that describes the workspace, capabilities that provide sandbox-native tools, and a sandbox run option that selects where the work runs.

Prerequisites

  • Node.js 22 or higher.
  • Basic familiarity with the OpenAI Agents SDK.
  • A sandbox client. For local development, start with UnixLocalSandboxClient.

This quickstart uses Node.js and npm commands, but the SDK is not limited to Node.js. Sandbox agents can also run on Deno and Bun when your project uses compatible package resolution and runtime APIs.

If you have not already installed the SDK:

npm install @openai/agents

For Docker-backed sandboxes, install Docker locally and use DockerSandboxClient from @openai/agents/sandbox/local.

If you use interactive local PTY sessions with tty: true, the process running the SDK also needs Python 3 available as python3, or through OPENAI_AGENTS_PYTHON. Non-PTY shell commands do not require Python.
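Before enabling tty: true, it can help to confirm that an interpreter is reachable from the process that runs the SDK. The following is a minimal preflight sketch, not SDK code: pythonCommand and pythonIsAvailable are hypothetical helpers, and the sketch assumes OPENAI_AGENTS_PYTHON holds a command name or a path to the interpreter. It does not show how the SDK itself resolves Python.

```typescript
import { spawnSync } from 'node:child_process';

// Hypothetical helper, not part of the SDK. Assumption: OPENAI_AGENTS_PYTHON,
// when set to a non-empty value, names the interpreter; otherwise use python3.
function pythonCommand(env: Record<string, string | undefined>): string {
  const override = env.OPENAI_AGENTS_PYTHON?.trim();
  return override ? override : 'python3';
}

// Returns true when `<command> --version` can be started and exits with status 0.
function pythonIsAvailable(command: string): boolean {
  const result = spawnSync(command, ['--version'], { stdio: 'ignore' });
  return result.error === undefined && result.status === 0;
}

const command = pythonCommand(process.env);
if (!pythonIsAvailable(command)) {
  console.warn(
    `Could not run "${command} --version"; PTY sessions (tty: true) may fail.`,
  );
}
```

Non-PTY shell commands do not need this check, so the sketch only warns instead of exiting.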

This example stages a local repo under repo/, loads local skills lazily, and lets the runner create a Unix-local sandbox session for the run. The agent definition owns the manifest and capabilities, while the run config only chooses the sandbox client for this run.
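The example reads repo/ and skills/ from the directory that contains the script, so both must exist on the host before the run. As a rough sketch of that layout, the hypothetical scaffoldExample helper below creates the two directories and a placeholder repo/task.md. The task text is invented for illustration, and no skill files are written because their format is not covered in this quickstart.

```typescript
import { existsSync, mkdirSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

// Hypothetical helper, not part of the SDK. Creates <baseDir>/repo and
// <baseDir>/skills, plus a placeholder repo/task.md if none exists yet.
function scaffoldExample(baseDir: string): { repoDir: string; skillsDir: string } {
  const repoDir = join(baseDir, 'repo');
  const skillsDir = join(baseDir, 'skills');
  mkdirSync(repoDir, { recursive: true });
  mkdirSync(skillsDir, { recursive: true });

  const taskFile = join(repoDir, 'task.md');
  if (!existsSync(taskFile)) {
    // Placeholder content only; replace it with a real task description.
    writeFileSync(taskFile, '# Task\n\nDescribe the bug to fix and the test to run.\n');
  }
  return { repoDir, skillsDir };
}
```

Calling scaffoldExample with the script's directory gives the same repo/ and skills/ paths that the example computes as hostRepoDir and hostSkillsDir. An existing task.md is left untouched, so the helper is safe to run more than once.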

Create a local sandbox agent
import { run } from '@openai/agents';
import {
  Capabilities,
  Manifest,
  SandboxAgent,
  localDir,
  skills,
} from '@openai/agents/sandbox';
import {
  UnixLocalSandboxClient,
  localDirLazySkillSource,
} from '@openai/agents/sandbox/local';
import { dirname, join } from 'node:path';
import { fileURLToPath } from 'node:url';

// Resolve host paths relative to this file.
const exampleDir = dirname(fileURLToPath(import.meta.url));
const hostRepoDir = join(exampleDir, 'repo');
const hostSkillsDir = join(exampleDir, 'skills');

// Stage the local repo under repo/ in the sandbox workspace.
const manifest = new Manifest({
  entries: {
    repo: localDir({ src: hostRepoDir }),
  },
});

// The agent definition owns the manifest and capabilities.
const agent = new SandboxAgent({
  name: 'Sandbox engineer',
  model: 'gpt-5.5',
  instructions:
    'Read `repo/task.md` before editing files. Load the `$invoice-total-fixer` skill before changing code. Stay grounded in the repository, preserve existing behavior, and mention the exact verification command you ran. If you edit files with apply_patch, paths are relative to the sandbox workspace root.',
  defaultManifest: manifest,
  capabilities: [
    ...Capabilities.default(),
    // Load local skills lazily from the host skills directory.
    skills({
      lazyFrom: localDirLazySkillSource(hostSkillsDir),
    }),
  ],
});

// The run config only chooses the sandbox client for this run.
const result = await run(
  agent,
  'Open `repo/task.md`, fix the issue, run the targeted test, and summarize the change.',
  {
    sandbox: {
      client: new UnixLocalSandboxClient(),
    },
  },
);

console.log(result.finalOutput);
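
Because the run config only chooses the sandbox client, the same agent definition can be pointed at a different backend per environment. The sketch below shows one way to make that choice from an environment variable. Everything in it is hypothetical: SANDBOX_BACKEND is an invented variable name, and sandboxBackendFromEnv and pickClient are illustrative helpers, not SDK functions. The commented wiring assumes both clients can be constructed without arguments, which this guide only shows for UnixLocalSandboxClient.

```typescript
// Hypothetical helpers, not part of the SDK. SANDBOX_BACKEND is an invented
// environment variable name used only for this sketch.
type SandboxBackend = 'unix-local' | 'docker';

// Reads the backend name from the environment, defaulting to unix-local.
function sandboxBackendFromEnv(
  env: Record<string, string | undefined>,
): SandboxBackend {
  const raw = (env.SANDBOX_BACKEND ?? 'unix-local').trim().toLowerCase();
  if (raw === 'unix-local' || raw === 'docker') {
    return raw;
  }
  throw new Error(
    `Unsupported SANDBOX_BACKEND "${raw}"; expected "unix-local" or "docker".`,
  );
}

// Generic over the client type so the sketch stays independent of SDK types.
// Only the factory for the selected backend is called.
function pickClient<T>(
  backend: SandboxBackend,
  factories: Record<SandboxBackend, () => T>,
): T {
  return factories[backend]();
}

// Intended wiring with the clients named in this guide (the DockerSandboxClient
// constructor arguments are an assumption; check the Sandbox clients guide):
//
//   const client = pickClient(sandboxBackendFromEnv(process.env), {
//     'unix-local': () => new UnixLocalSandboxClient(),
//     docker: () => new DockerSandboxClient(),
//   });
//   const result = await run(agent, prompt, { sandbox: { client } });
```

Using factories instead of pre-built instances means the unused backend is never constructed, so a machine without Docker can still run the Unix-local path.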

Once the basic run works, the choices most people reach for next are:

  • defaultManifest: the files, repos, directories, and mounts for fresh sandbox sessions.
  • instructions: short workflow rules that should apply across prompts.
  • baseInstructions: an advanced escape hatch for replacing the SDK sandbox prompt.
  • capabilities: sandbox-native tools such as filesystem editing/image inspection, shell, skills, memory, and compaction.
  • runAs: the sandbox user identity for model-facing tools.
  • sandbox.client: the sandbox backend.
  • sandbox.session, sandbox.sessionState, or sandbox.snapshot: how later runs reconnect to prior work.

For more depth, see these guides:

  • Concepts: understand manifests, capabilities, permissions, snapshots, run config, and composition patterns.
  • Sandbox clients: choose Unix-local, Docker, hosted providers, and mount strategies.
  • Agent memory: preserve and reuse lessons from previous sandbox runs.

If shell access is only an occasional tool, start with hosted shell in the Tools guide. Reach for sandbox agents when workspace isolation, sandbox client choice, or sandbox-session resume behavior is part of the design.