The Agent Security Stack: From Weekend Project to Docker Partnership in Six Weeks

Six weeks ago, Gavriel Cohen built NanoClaw in a weekend.

The premise was simple. OpenClaw, the open-source coding agent that became the most-starred project in GitHub history (210,000+ stars), had a security problem. Giving agents access to your codebase, your terminal, and your file system inside environments with network access and persistent state is a recipe for disaster. RoguePilot had already demonstrated this: it was a GitHub Codespaces vulnerability in which malicious Copilot instructions could seize control of entire repositories.

Cohen's response was NanoClaw. A security-first alternative built on Claude Code that wraps Anthropic's agent with orchestration, persistent memory, and channel integrations, all running inside isolated containers. No ambient authority. No shared file system access across sessions. Every agent interaction sandboxed.

Six weeks later, NanoClaw has 20,000+ GitHub stars and 100,000+ downloads. And Docker just announced a partnership to integrate NanoClaw with Docker Sandboxes, their MicroVM isolation layer.

The agent security stack went from nonexistent to enterprise-grade in six weeks. This is worth paying attention to, not just for what NanoClaw does, but for the pattern it follows.

The Pattern You've Seen Before

Infrastructure categories don't emerge from corporate planning. They emerge from the same repeating sequence.

A viral project creates a category. Early adopters discover security or reliability gaps. A hardened alternative fills those gaps. An enterprise infrastructure company legitimizes the whole thing.

Docker itself followed this pattern. Linux containers existed for years before Docker made them usable. Docker's ease of use created mass adoption. Mass adoption created security concerns (container escapes, image supply chain attacks). Kubernetes emerged as the orchestration and security layer. Cloud providers legitimized the ecosystem with managed services.

Now the same sequence is playing out in agent deployment.

OpenClaw created the category. A coding agent that could navigate repositories, write code, run tests, and iterate on feedback. 210,000+ stars in weeks. Developers loved it. Security teams did not.

RoguePilot and similar vulnerabilities created the security urgency. When an agent can execute arbitrary commands with your credentials, the attack surface isn't theoretical. It's a demonstrated exploit.

NanoClaw filled the gap. Same agent capabilities, isolated execution. Built on Claude Code's capabilities with a security-first architecture that treats every agent session as potentially hostile.

Docker legitimized the approach. The MicroVM integration gives NanoClaw the same isolation guarantees that Docker provides for traditional containerized workloads. Enterprise security teams can evaluate agent deployment against the same frameworks they use for everything else.

What the Security Stack Looks Like

The emerging agent security stack has four layers, and they're solidifying fast.

Layer 1: The agent runtime. This is the model plus tool-use framework. Claude Code, GPT-5.4 with function calling, or open-weight alternatives. The agent needs to understand code, navigate file systems, execute commands, and iterate on results. This layer is mature.

Layer 2: Orchestration and memory. Agents need to maintain context across sessions, manage multi-step workflows, and coordinate with other agents or human reviewers. NanoClaw adds persistent memory and channel integrations (Slack, GitHub, terminal). This layer is emerging.

Layer 3: Isolation and sandboxing. Every agent session runs in its own container or MicroVM with explicit permissions. No access to the host file system beyond what's explicitly granted. No network access beyond what's whitelisted. No persistence beyond what's explicitly saved. Docker Sandboxes provide this through lightweight MicroVMs that boot in milliseconds. This layer just arrived.

Layer 4: Governance and audit. Logging every agent action, reviewing outputs before they hit production, enforcing organizational policies on what agents can and can't do. This layer barely exists yet. Expect it to be the next battleground.
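
To make the Layer 3 rules concrete, here is a minimal sketch of a default-deny session policy. It is an illustration only: SandboxPolicy and its fields are hypothetical names, not NanoClaw's or Docker's API, and a real sandbox enforces these rules at the mount and network level rather than in application code.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


def _is_under(path: str, roots: frozenset) -> bool:
    # Normalize lexically so "/workspace/../etc" cannot slip past the check.
    # This does not resolve symlinks; it is a sketch, not a security boundary.
    target = PurePosixPath(posixpath.normpath(path))
    return any(target.is_relative_to(PurePosixPath(root)) for root in roots)


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical per-session policy: anything not listed is denied."""
    readable_paths: frozenset = frozenset()
    writable_paths: frozenset = frozenset()
    allowed_hosts: frozenset = frozenset()

    def can_read(self, path: str) -> bool:
        # Writable locations are readable too.
        return _is_under(path, self.readable_paths | self.writable_paths)

    def can_write(self, path: str) -> bool:
        return _is_under(path, self.writable_paths)

    def can_connect(self, host: str) -> bool:
        return host.lower() in self.allowed_hosts


# Example grants (assumed values): read the repo, write one output
# directory, talk to a single host.
policy = SandboxPolicy(
    readable_paths=frozenset({"/workspace"}),
    writable_paths=frozenset({"/workspace/out"}),
    allowed_hosts=frozenset({"api.github.com"}),
)
```

The design choice that matters is the empty default. A session created with no grants can read nothing, write nothing, and connect to nothing, which is the "no ambient authority" property in miniature.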

We tracked agent maturation on March 12, noting the shift from stateless tools to context-aware agents with project-level memory (the "README for agents" pattern in Claude Code and Cursor). NanoClaw's Docker partnership moves the conversation one layer deeper. It's not enough for agents to be smart and contextual. They need to be confined.

The GTC Collision Course

GTC starts Monday. Nvidia is launching NemoClaw, their enterprise agent platform, which directly competes with NanoClaw's positioning.

NemoClaw takes the opposite approach to the stack. Rather than open-source agent plus container isolation, Nvidia is offering an integrated enterprise platform with built-in security, compliance features, and GPU-optimized inference. Think of it as the difference between building your own Kubernetes cluster and buying a managed service.

The competition breaks down along familiar lines.

NanoClaw plus Docker appeals to teams that want control over their agent infrastructure. Open source. Customizable. Runs anywhere Docker runs. The same teams that chose self-hosted Kubernetes in 2017.

NemoClaw appeals to teams that want a turnkey solution from a vendor they already buy hardware from. Integrated security. Managed infrastructure. Premium pricing. The same teams that chose GKE or EKS.

Both will find customers. The interesting question is which approach becomes the default for the middle of the market, the companies that aren't sophisticated enough to self-host but aren't enterprise enough for premium pricing.

What Builders Should Know Right Now

If you're evaluating agent deployment for production use, here's the practical state of things.

Agent security is no longer optional. RoguePilot demonstrated that agents with ambient authority in shared environments are exploitable. If you're running agents without isolation, you're running on borrowed time. The attack surface is real and it's being actively probed.

The isolation patterns are settling. Container-based isolation (NanoClaw's isolated containers), MicroVM isolation (Docker Sandboxes and Firecracker-based sandboxes), and platform-managed isolation (NemoClaw) are the three approaches. Each has trade-offs in complexity, performance, and control. But all three are viable for production.

Don't commit to an architecture before GTC. NemoClaw's launch will change the competitive landscape. If Nvidia subsidizes agent deployment for customers buying their GPUs (likely), the economics of self-hosted agent infrastructure change. Wait for the GTC pricing announcements before signing long-term infrastructure contracts.

Governance is the next gap. Security (keeping agents confined) is being solved. Governance (ensuring agents follow organizational policies, logging decisions for audit, preventing unauthorized actions) is not. If your industry has compliance requirements, you'll need to build the governance layer yourself for now. This is the unsexy but critical work.
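
If you do end up building that layer yourself, the core of it is small: check every proposed agent action against policy and record the decision. The sketch below assumes a made-up action vocabulary ("run_tests", "git_push") and a hypothetical AuditedPolicyGate class. It keeps the log in memory, where a real deployment would want append-only, tamper-evident storage.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass
class AuditedPolicyGate:
    """Hypothetical gate: decide on each proposed action and log the decision."""
    allowed_actions: frozenset
    needs_review: frozenset = frozenset()
    log: list = field(default_factory=list)

    def decide(self, session_id: str, action: str, detail: str) -> str:
        if action in self.needs_review:
            decision = "review"  # hold for a human before it reaches production
        elif action in self.allowed_actions:
            decision = "allow"
        else:
            decision = "deny"    # anything unlisted is refused
        # One JSON line per decision, so the trail is easy to ship and query.
        self.log.append(json.dumps({
            "ts": time.time(),
            "session": session_id,
            "action": action,
            "detail": detail,
            "decision": decision,
        }))
        return decision


gate = AuditedPolicyGate(
    allowed_actions=frozenset({"read_file", "run_tests"}),
    needs_review=frozenset({"git_push"}),
)
```

Calling gate.decide("s1", "git_push", "origin main") returns "review", and an unlisted action such as "delete_branch" returns "deny". Denied and held actions are logged as well as allowed ones, since an auditor needs to see what the agent attempted, not only what it was permitted to do.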

The Bigger Picture

NanoClaw's trajectory from weekend project to Docker partnership in six weeks tells you something about the speed of infrastructure development in the agent era.

The coding agent category barely existed a year ago. OpenClaw mainstreamed it. Security concerns emerged within weeks. A hardened alternative appeared within days. Enterprise legitimization arrived within a month. The full infrastructure lifecycle that took containers 5-7 years (2013-2020) is replaying in 5-7 months.

That speed cuts both ways. It means the agent deployment problem will be solved faster than most people expect. But it also means today's best practices will be obsolete in months, not years. If you're baking agent infrastructure choices into your architecture right now, make them replaceable.
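
One way to keep that option open is to put a thin seam between your code and whichever sandbox you pick. This is a sketch under assumptions: SandboxBackend, ContainerBackend, and MicroVMBackend are hypothetical names, and the run methods are placeholders standing in for calls to a real runtime.

```python
from typing import Protocol


class SandboxBackend(Protocol):
    """Hypothetical seam: the only sandbox surface the rest of the code touches."""

    def run(self, command: list[str]) -> str: ...


class ContainerBackend:
    def run(self, command: list[str]) -> str:
        # Placeholder: a real adapter would call your container runtime here.
        return f"[container] {' '.join(command)}"


class MicroVMBackend:
    def run(self, command: list[str]) -> str:
        # Placeholder: a real adapter would call your MicroVM sandbox here.
        return f"[microvm] {' '.join(command)}"


BACKENDS: dict[str, type] = {
    "container": ContainerBackend,
    "microvm": MicroVMBackend,
}


def make_backend(name: str) -> SandboxBackend:
    # The backend is chosen by configuration, not by imports scattered
    # through the codebase.
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown sandbox backend: {name!r}") from None


def run_agent_step(backend: SandboxBackend, command: list[str]) -> str:
    # Callers depend on the protocol only, so swapping vendors is a
    # one-line config change plus one new adapter class.
    return backend.run(command)
```

With this shape, run_agent_step(make_backend("container"), ["pytest", "-q"]) and the same call with "microvm" go through identical calling code. Adopting next month's tool means writing one adapter and registering it, not rewriting every call site.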

I've run production systems where the "industry standard" tool we chose was deprecated before we finished migrating to it. The agent infrastructure space is moving fast enough that this risk is real. Pick tools that solve today's problems. Make sure you can swap them when next month's tools arrive.