
AI Builders
Digest

2026-05-13 · 0 builders · 0 tweets · 1 podcast · 1 blog

💰 Startup Success Stories

Waymo's Dmitri Dolgov: 20 Million Rides and the Road to Full Autonomy

The Takeaway: Building safe, fully autonomous vehicles at scale requires extreme persistence through hype cycles, a deep commitment to safety as a non-negotiable foundation, and sophisticated AI architectures that go beyond basic end-to-end models.

Waymo leader Dmitri Dolgov, who has been working on autonomous vehicles since the DARPA challenges over 20 years ago, shares lessons from the company's journey. Starting as a small Google project in 2009 with a dozen people, the team set ambitious goals like driving 100,000 fully autonomous miles and completing difficult 100-mile routes without interventions. In those intense early startup days, everyone worked on everything from hardware to algorithms.

Dolgov explains that AI hype cycles create rapid early progress but don't solve the long tail of real-world deployment. Waymo persisted by focusing on the critical mission—reducing road deaths that occur every 26 seconds globally—and understanding there are no silver bullets. Their Waymo Foundation model powers a driver, simulator, and critic, functioning as a multimodal world action language model that understands physics, agent behaviors, and safe driving principles.

They augment basic end-to-end learning with structured intermediate representations for better validation, training, and safety. The new sixth-generation hardware powers vehicles like the OHAI, designed around rider experience. Scaling has accelerated dramatically: 20 million fully autonomous rides, with 10 million in the last seven months, now operating in 11 cities. The system demonstrates superhuman safety: it is over 13 times safer than human drivers at preventing serious injuries, and at current scale it prevents one serious injury roughly every eight days.
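The idea of exposing structured intermediate representations, rather than mapping sensors straight to controls, can be sketched roughly as follows. This is a hypothetical illustration of the general technique, not Waymo's architecture; all names here (`Scene`, `IntermediateState`, `perceive`, `plan`) are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Scene:
    """Raw sensor input (placeholder): (agent_type, distance_m) pairs."""
    objects: list

@dataclass
class IntermediateState:
    """Structured intermediate representation, inspectable on its own."""
    detected_agents: list = field(default_factory=list)
    min_clearance_m: float = float("inf")

def perceive(scene: Scene) -> IntermediateState:
    # Instead of going straight from pixels to controls, the model first
    # emits a representation of what it believes it sees, which can be
    # validated and supervised independently of the final driving output.
    state = IntermediateState(detected_agents=[t for t, _ in scene.objects])
    if scene.objects:
        state.min_clearance_m = min(d for _, d in scene.objects)
    return state

def plan(state: IntermediateState) -> str:
    # Downstream planning consumes the structured state; safety checks
    # can gate on it before any control command is issued.
    if state.min_clearance_m < 15.0:
        return "yield"
    return "proceed"

scene = Scene(objects=[("pedestrian", 12.0), ("car", 30.0)])
state = perceive(scene)
assert "pedestrian" in state.detected_agents  # perception checked separately
print(plan(state))  # → yield
```

The point of the intermediate layer is that perception errors become visible and testable before they turn into control errors, which is harder to achieve with a purely end-to-end mapping.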

"The status quo is not okay," Dolgov emphasizes, highlighting how real-world events like preventing accidents with superhuman reaction times make the work rewarding. His daily use with family underscores the product's maturity.
View original →

🛠️ Developer Tools & Tips

Anthropic Engineering: Claude Code auto mode: a safer way to skip permissions

Anthropic Engineering introduces Claude Code auto mode, a new permission system that strikes a balance between manual approvals and the risky --dangerously-skip-permissions flag. It uses model-based classifiers to automatically approve safe actions while blocking dangerous ones, reducing approval fatigue without sacrificing too much safety.

The system employs two layers: a server-side prompt-injection probe that scans tool outputs and a two-stage transcript classifier (powered by Sonnet 4.6) that evaluates actions against user intent and risk rules. It targets overeager behaviors, honest mistakes, and potential exfiltrations while allowing routine coding tasks. Performance on real traffic shows a low 0.4% false positive rate after the full pipeline, with solid recall on dangerous actions.
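The two-layer shape described above can be sketched as a minimal pipeline. Everything here is an invented stand-in: the real system uses model-based classifiers (not substring matching), and the function names, marker lists, and verdict strings are illustrative only.

```python
# Hypothetical two-layer auto-approval sketch; not Anthropic's implementation.
SUSPICIOUS_MARKERS = ("ignore previous instructions", "exfiltrate")
DANGEROUS_PATTERNS = ("rm -rf", "curl | sh")

def injection_probe(tool_output: str) -> bool:
    """Layer 1: flag tool outputs that look like prompt injections."""
    lowered = tool_output.lower()
    return any(marker in lowered for marker in SUSPICIOUS_MARKERS)

def transcript_classifier(action: str, user_intent: str) -> str:
    """Layer 2: stand-in for the model-based transcript classifier,
    which evaluates the action against user intent and risk rules."""
    if any(p in action for p in DANGEROUS_PATTERNS):
        return "block"
    if action.split()[0] not in user_intent:
        return "escalate"  # fall back to manual approval
    return "approve"

def auto_mode(action: str, tool_output: str, user_intent: str) -> str:
    if injection_probe(tool_output):
        return "block"
    return transcript_classifier(action, user_intent)

print(auto_mode("pytest tests/", "all tests passed", "run pytest"))  # approve
print(auto_mode("rm -rf /", "", "clean the build"))                  # block
```

The three-way verdict matters: routine actions are auto-approved, clearly dangerous ones are blocked, and ambiguous ones escalate to the human, which is what keeps the false-positive rate low without reverting to approving everything manually.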

Key design choices include stripping assistant messages and tool results for the classifier to prevent gaming and injection attacks, plus deny-and-continue logic so blocked actions prompt safer alternatives rather than halting sessions. Customizable slots let users define their trusted environment, block categories, and exceptions.
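The deny-and-continue behavior might look like this minimal sketch, with invented names throughout: a blocked action returns structured guidance to the agent instead of raising an error and ending the session.

```python
# Hypothetical deny-and-continue sketch; names are illustrative only.
def run_agent_step(action, execute, classify):
    """On a blocked action, return a hint instead of raising, so the
    agent session continues and can propose a safer alternative."""
    verdict = classify(action)
    if verdict == "block":
        return {"status": "denied",
                "hint": f"{action!r} was blocked; propose a narrower command."}
    return {"status": "ok", "result": execute(action)}

# Toy classifier/executor standing in for the model-based pieces.
classify = lambda a: "block" if "rm -rf" in a else "approve"
execute = lambda a: f"ran {a}"

print(run_agent_step("rm -rf build/", execute, classify)["status"])  # denied
print(run_agent_step("git status", execute, classify)["result"])     # ran git status
```

Returning a denial as data rather than an exception is what lets the agent recover in-session, e.g. by retrying with a scoped delete instead of a recursive one.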

This approach makes autonomous agent workflows far more practical for developers while maintaining guardrails against common failure modes like mass deletions, credential hunting, or unauthorized deploys. Users can get started via the docs, with conservative defaults that improve over time as classifier coverage grows. The post details extensive threat modeling and evaluation results, marking a meaningful step toward safer agentic coding tools.

Direct link: https://www.anthropic.com/engineering/claude-code-auto-mode
View original →