There’s a tension at the core of any AI agent tool: the more autonomy you give it, the more useful it becomes — and the more dangerous.
I use Claude Code every day. It edits files, runs tests, deploys code, installs dependencies. And the problem nobody talks about openly is that in most configurations, it can run literally any command on your system.
The Real Problem
Agent tools have permission systems. Claude Code has six permission modes, allow/deny lists, and an OS-level sandbox. It sounds like enough, until you actually try to configure it.
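For concreteness, those allow/deny lists live in a settings file along these lines (the shape below is from memory, so treat it as a sketch and check the official docs for the exact syntax):

```json
{
  "permissions": {
    "allow": ["Bash(git status)", "Bash(npm run test:*)"],
    "deny": ["Bash(curl:*)"]
  }
}
```

Each entry is a tool name plus a textual pattern over the command. That word "textual" is the whole problem, as we'll see next.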
The fundamental problem is that these systems do string matching, not semantic evaluation. A rule like Bash(rm *) can’t distinguish rm file.txt from rm -rf /. It matches the textual pattern, not what the command does. Users report needing over a hundred rules to cover common cases, and commands still slip through via pipes, shell expansion, and quoted arguments.
When an agent runs git status && rm -rf /, that’s a single string. Glob rules match the whole thing or nothing.
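Both failure modes are easy to demonstrate with plain glob matching (Python's fnmatch here stands in for whatever matcher a given tool uses; the rules are illustrative):

```python
from fnmatch import fnmatch

# A glob rule matches on text, not behavior.
RULE = "rm *"

for cmd in ["rm file.txt", "rm -rf /"]:
    # Both commands match the same pattern: the rule
    # literally cannot tell them apart.
    print(cmd, "->", "matches" if fnmatch(cmd, RULE) else "no match")

# And a compound command is a single string, so a rule written
# for the safe prefix happily matches the dangerous suffix too:
print(fnmatch("git status && rm -rf /", "git status*"))  # True
```

The second case is the worse one: a rule meant to auto-approve git status ends up approving everything chained after it.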
Why This Matters Now
For a long time this was a theoretical problem. Agents ran simple tasks, in controlled environments, with constant supervision. Not anymore.
AI agents are being used in real production workflows: deploying applications, manipulating databases, automating infrastructure. The speed at which this happened left the security layer behind.
A filesystem sandbox is great. But many legitimate workflows need network access (API calls, package installs, deployments) or access to paths outside the workspace (shared configs, SSH keys for git). When you open holes in the sandbox for legitimate use, you lose the protection it provided.
What’s Missing
What needs to exist isn’t another permission list. It’s an evaluation engine that understands what a command does, not just what it looks like.
rm -rf / and rm file.txt use the same binary. The difference is in the arguments. A serious permission system needs to parse the command’s AST, evaluate each segment of a pipeline independently, and understand context — project, directory, session.
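Here is a deliberately tiny sketch of per-segment evaluation. It only splits on top-level operators and applies a toy policy; a real engine would parse the full bash grammar (a library like bashlex, for instance) and handle substitution, subshells, and redirection. Every name and rule below is illustrative:

```python
import shlex

OPERATORS = {"&&", "||", ";", "|"}

def split_pipeline(command: str) -> list[list[str]]:
    """Split a shell command into segments at &&, ||, ; and |."""
    segments, current = [], []
    for tok in shlex.split(command):
        if tok in OPERATORS:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        segments.append(current)
    return segments

def evaluate(segment: list[str]) -> str:
    """Toy policy: deny rm with recursive/force flags, allow the rest."""
    if segment and segment[0] == "rm":
        if any(flag in segment[1:] for flag in ("-r", "-f", "-rf", "-fr")):
            return "deny"
    return "allow"

for seg in split_pipeline("git status && rm -rf /"):
    print(" ".join(seg), "->", evaluate(seg))
# git status -> allow
# rm -rf / -> deny
```

The point is not the policy itself but the shape: each segment gets its own verdict, so the safe prefix can no longer launder the dangerous suffix.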
It also needs to be composable. Named rules, reusable rulesets, presets shared across teams. Not five competing JSON files that replace instead of compose.
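Composition could look something like this (a hypothetical API, sketched to show the merge semantics, with deny winning over allow on conflict):

```python
from dataclasses import dataclass, field

@dataclass
class Ruleset:
    """A named, reusable set of allow/deny rules."""
    name: str
    allow: set = field(default_factory=set)
    deny: set = field(default_factory=set)

    def compose(self, other: "Ruleset") -> "Ruleset":
        # Denies from either side always win; allows are unioned
        # and then pruned of anything a deny covers.
        deny = self.deny | other.deny
        return Ruleset(
            name=f"{self.name}+{other.name}",
            allow=(self.allow | other.allow) - deny,
            deny=deny,
        )

base = Ruleset("base", allow={"git status", "git diff"})
team = Ruleset("team", allow={"docker build"}, deny={"git push --force"})
merged = base.compose(team)
```

Two properties matter here: composition produces a new named ruleset rather than overwriting either input, and the result is auditable because you can always see which parent contributed each rule.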
And it needs to learn. After you approve docker build twenty times in the same project, the system should suggest auto-allowing it there. Static configuration doesn’t scale to real workflows.
The Irony
The irony is that AI agents are already good enough to cause real harm, but the control tooling is still at the prototype stage. The speed of adoption created a dangerous gap.
I don’t think the answer is to slow down agents. The answer is to take the security infrastructure seriously, with the same engineering effort applied to the agent itself.
Brakes don’t make a car slower. They let you drive faster safely.