
AI Agent Sandbox

Let an AI agent author Rex scripts from natural language. Cedar enforces the same sandbox regardless of whether a human or an LLM wrote the script.

Rex has no concept of "who" wrote a script at the authorization level: Cedar checks what the script does, not where it came from. You can therefore let an AI agent run scripts on production systems, because the policy is the guardrail, not the agent's judgment. Pair this with a transport model like SSM or PKI to control which endpoints the agent can reach and which policies it operates under.
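
One way to picture this is a policy that never mentions authorship. The sketch below is illustrative, not Rex's shipped policy set: the `path` attribute on the resource and the `/tmp/rex/` sandbox prefix are assumptions, while the `file_system::Action::"delete"` action name is taken from the error output shown later on this page. The principal is left unconstrained, so the rule applies identically to a script written by a person or by an agent.

```cedar
// Hypothetical sketch: allow deletes only under an assumed sandbox prefix.
// `principal` is unconstrained, so authorship plays no part in the decision.
permit (
    principal,
    action == file_system::Action::"delete",
    resource
)
when {
    // `path` is an assumed resource attribute, not a confirmed Rex schema field.
    resource has path && resource.path like "/tmp/rex/*"
};
```

Cedar denies any request that no `permit` policy matches, so a delete anywhere else is rejected without needing a dedicated rule.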

How It Works

1. User sends a natural language prompt

"What files are in /tmp/rex and how much disk space is used?"

The user describes what they want in plain English. They don't write Rhai.

2. Agent generates a Rhai script

let files = ls("/tmp/rex");
info(`Files: ${files}`);

let usage = df();
info(`Disk: ${usage}`);

The agent translates the prompt into Rhai using its knowledge of the Rex API. The script is sent to a Rex endpoint.

3. Rex returns the result

Files: ["hello.txt", "config.ini", "data.csv"]
Disk: [{mount: "/", used: "12G", avail: "38G"}]

Before Rex runs a command from the agent's script, it checks that command against Cedar. If the agent generates something outside the policy, the call is denied, just as it would be for a human.

4. What happens when the agent gets it wrong

# User prompt: "Clean up old temp files to free up disk space."
# Agent script (over-eager):
rm("/etc/passwd");

# Rex output:
error: Permission denied:
  file_system::Action::"delete" on /etc/passwd

The policy doesn't permit deletes outside the designated sandbox. Cedar blocks the call before it reaches the filesystem — no matter what the agent intended.
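
The denial above needs no special rule, since Cedar denies by default when no `permit` matches. If you also want the boundary to survive a later, overly broad `permit`, one option is an explicit `forbid`, which in Cedar overrides any `permit`. Treat the following as a sketch: the `path` attribute and the `/tmp/rex/` prefix are hypothetical stand-ins for the real schema and the designated sandbox; only the action name comes from the output above.

```cedar
// Hypothetical guardrail: deny every delete unless the target sits under
// the assumed sandbox prefix. A matching forbid wins over any permit.
forbid (
    principal,
    action == file_system::Action::"delete",
    resource
)
unless {
    // `path` is an assumed resource attribute; adjust to the real schema.
    resource has path && resource.path like "/tmp/rex/*"
};
```

A prefix match like this is only meaningful if paths are normalized before the check (otherwise `/tmp/rex/../../etc/passwd` would match the pattern). Whether Rex normalizes paths before authorization is not stated here, so verify that before relying on a string-based rule.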