The Week the Rules Stopped Keeping Up
Regulations retreated, deployments accelerated, and the agents we're shipping can't tell truth from a well-ranked lie. Friday is the day to say what everyone's dancing around.
The Synthetic Web — Your AI Agent Trusts the First Result
Shah and Ozgur built synthetic mini-internets with thousands of hyperlinked articles, each ground-truth-labeled for credibility, to stress-test how language agents handle adversarial information environments. The core finding: agent accuracy collapses when high-plausibility misinformation ranks well, even when truthful sources are freely available elsewhere in the index. Agents rarely escalate their searches, and their confidence is severely miscalibrated when sources conflict.
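A minimal sketch of the failure mode, not the authors' harness: a credibility-labeled toy index, a naive agent that answers from the top-ranked result, and the accuracy swing when the ranking is flipped. Document contents, labels, and the ranking are invented for illustration.

```python
# Toy illustration (not the paper's code): a naive agent that trusts the
# first-ranked document, evaluated against ground-truth credibility labels.

def naive_agent_answer(ranked_docs):
    """Return the claim from the top-ranked result, ignoring credibility."""
    return ranked_docs[0]["claim"]

def accuracy(trials, ground_truth):
    correct = sum(naive_agent_answer(docs) == ground_truth for docs in trials)
    return correct / len(trials)

ground_truth = "drug X is not approved for children"

honest_ranking = [
    {"claim": "drug X is not approved for children", "credible": True},
    {"claim": "drug X is approved for children",     "credible": False},
]
adversarial_ranking = list(reversed(honest_ranking))  # misinformation promoted to the top

print(accuracy([honest_ranking], ground_truth))       # 1.0
print(accuracy([adversarial_ranking], ground_truth))  # 0.0 -- accuracy tracks the ranking, not the truth
```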
Every company deploying agentic RAG or web-browsing agents is implicitly trusting that agents can distinguish signal from noise at retrieval time. This paper shows they fundamentally cannot when ranking is adversarial. The real-world attack surface here is already industrialized. SEO manipulation is a billion-dollar industry, and it doesn't need to be retooled to target agents. The gap between deployment ambition (Pentagon-scale AI agents, enterprise agentic workflows) and epistemological robustness is wider than most teams realize. We noted the emergence of safety evaluation frameworks like AutoControl Arena and LieCraft yesterday. This paper targets the failure mode that matters most for agents actually in the field.
Google Fills the Vacuum Anthropic Created — In 24 Hours
Google launched Agent Designer on GenAI.mil, letting the Pentagon's 3 million-plus staffers build custom Gemini agents for unclassified work. This came one day after Anthropic sued the Trump administration over being designated a supply chain risk for refusing surveillance and weapons applications. In a remarkable twist, Jeff Dean (Google's chief scientist) and more than 30 employees from labs including OpenAI and Google DeepMind filed an amicus brief supporting Anthropic's position.
Google is playing both sides of the trust equation with remarkable precision. Its employees publicly back Anthropic's ethical stance while Google's government business unit fills the exact contract vacuum Anthropic's refusal created. This isn't hypocrisy so much as a company large enough to contain genuine contradictions. The Pentagon gets its AI agents regardless. The only variable is whether the provider has redlines. We tracked Anthropic's trust-as-moat strategy yesterday. The prediction that trust differentiation becomes a competitive axis is validated, but with a critical refinement: trust is bifurcating by market segment. Commercial enterprise customers reward it. Government and defense customers punish it.
OpenAI Goes Open (Again) — GPT-OSS Under Apache 2.0
OpenAI released gpt-oss-120b and gpt-oss-20b under Apache 2.0, their first open-weight models since GPT-2 in 2019. The 120b model achieves near-parity with o4-mini on reasoning benchmarks while running on a single 80GB GPU. The 20b model matches o3-mini and runs in 16GB of memory, making edge deployment viable. The Apache 2.0 license is notably more permissive than Meta's Llama license, which carries a 700 million monthly-active-user threshold.
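For a sense of what edge deployment looks like in practice, here is a minimal local-inference sketch using Hugging Face Transformers. It assumes the weights are published on the Hub under openai/gpt-oss-20b (the repo id and its loading behavior are assumptions) and roughly 16GB of GPU memory.

```python
# Sketch: load the 20b open-weight model locally and generate a reply.
# Assumes the checkpoint is available as "openai/gpt-oss-20b" on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place layers on the available GPU(s)
)

messages = [{"role": "user", "content": "Explain KV caching in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```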
OpenAI is now fighting a three-front war: frontier closed models (GPT-5.4), open-weight models (GPT-OSS, competing directly with Llama and DeepSeek), and enterprise lock-in (Excel integration, Amazon cloud exclusivity). The open-source move is ecosystem defense, not generosity. If developers build on GPT-OSS, they stay API-compatible with OpenAI's closed model family. It's the Android strategy applied to AI models. We noted OpenAI's $110B raise and deprecation cycles yesterday. Open-weight is the third front, and it changes the competitive calculus for Anthropic, which has no open-weight play at all.
EU AI Act Delays — Regulation Enters Schrödinger's Cat Territory
The EU's Digital Omnibus proposal pushes high-risk AI compliance deadlines from August 2026 to the end of 2027 for Annex III systems and to August 2028 for Annex I (medical devices, aviation). Enforcement is now conditional: rules only apply after the European Commission confirms that adequate compliance support, meaning published standards and implementation guidelines, actually exists. Non-compliance still carries fines of up to 35 million EUR or 7% of global turnover.
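For scale, a minimal sketch of how that ceiling behaves at different company sizes, reading the two figures as a whichever-is-higher pair; that reading and the turnover numbers are our illustration, not a quotation from the Omnibus text.

```python
# Illustrative penalty ceiling: EUR 35M or 7% of global turnover,
# treated here as "whichever is higher" (an assumption for illustration).
def max_fine_eur(global_turnover_eur: float) -> float:
    return max(35_000_000, 0.07 * global_turnover_eur)

print(f"{max_fine_eur(200_000_000):,.0f}")    # 35,000,000 -> the flat cap dominates for smaller firms
print(f"{max_fine_eur(5_000_000_000):,.0f}")  # 350,000,000 -> the 7% figure dominates at scale
```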
Europe didn't pause the AI Act. It did something worse: it made enforcement unpredictable. Companies now face a regulatory Schrödinger's cat, simultaneously regulated and unregulated until the Commission opens the box. The conditional enforcement mechanism sounds reasonable in isolation, but it creates a perverse incentive structure. Large companies with legal teams can deploy aggressively in the ambiguity gap, while smaller companies without that buffer freeze or over-comply. This uncertainty tax falls hardest on exactly the players Europe claims to want to protect. Meanwhile, across the Atlantic, the Pentagon is deploying AI agents to 3 million staffers. The regulation-deployment gap is widening on both sides.
The Inference Stack Is Converging — And It Matters More Than Any Single Model
Three simultaneous developments this week point in the same direction. LookaheadKV (ICLR 2026) achieves 14.5x faster KV cache eviction through learned prediction of attention importance scores, with negligible runtime overhead. David Patterson's IEEE paper argues that LLM inference is fundamentally memory-bound, not compute-bound, and proposes architectural solutions including high-bandwidth flash and processing-near-memory. And vLLM v0.17.1 shipped with FP8 inference on H100/Blackwell, continuous batching, and Transformers v5 compatibility. The algorithmic, hardware, and framework layers are converging on the same problem.
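To make the algorithmic layer concrete, here is a toy sketch of importance-score-based KV-cache eviction. This is not LookaheadKV's method; the random scores stand in for whatever a learned predictor would output, and the point is only that dropping low-importance entries shrinks cache memory while preserving positional order.

```python
# Toy importance-based KV-cache eviction (illustrative only, not LookaheadKV).
import numpy as np

def evict_kv(keys, values, importance_scores, keep_ratio=0.5):
    """Keep only the cache entries predicted to matter for future attention."""
    n_keep = max(1, int(len(importance_scores) * keep_ratio))
    keep_idx = np.argsort(importance_scores)[-n_keep:]  # highest predicted importance
    keep_idx.sort()                                      # preserve positional order
    return keys[keep_idx], values[keep_idx]

seq_len, d_head = 8, 4
keys = np.random.randn(seq_len, d_head)
values = np.random.randn(seq_len, d_head)
predicted_importance = np.random.rand(seq_len)  # stand-in for a learned prediction

small_k, small_v = evict_kv(keys, values, predicted_importance, keep_ratio=0.25)
print(small_k.shape)  # (2, 4): cache memory shrinks in proportion to the keep ratio
```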
These aren't isolated developments. They're the same insight expressed in three different languages: inference cost is the bottleneck, and the fix is coming from every direction at once. The improvements compound multiplicatively, not additively. A 14x algorithmic gain stacked on a 10x memory-capacity gain and a comparable framework-level gain doesn't add up to roughly 34x; it multiplies into orders of magnitude. And open-source tooling like vLLM and llama.cpp is creating model-switching abstractions that directly undermine vendor lock-in. We've been tracking Mercury's parallel generation approach and the model generational turnover pattern. This week's developments add three more vectors to the same convergence. The inference cost curve may be about to break downward faster than anyone's pricing models assume.
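In rough numbers, with the caveat that only the 14x figure comes from the paper above; the 10x memory and 10x framework factors are the nominal values the 34x comparison implies, not measurements.

```python
# Additive vs. multiplicative stacking of gains at independent layers.
algo, memory, framework = 14, 10, 10  # only the 14x is sourced; the 10x values are illustrative

additive = algo + memory + framework        # 34   -- the wrong mental model
multiplicative = algo * memory * framework  # 1400 -- independent layers compound

print(additive, multiplicative)
```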
On the Radar
Deep Dives
Full analysis from today's coverage.