Baseten CEO on AI Inference Crunch, Custom Models, and Inference Cloud
The Takeaway: The AI inference market is exploding, with custom models dominating workloads and severe compute supply constraints shaping the industry's future. Baseten CEO Tuhin Srivastava shares how his company scaled 30x in a year by serving AI-native apps directly and enterprises indirectly, and now expects over $1B in revenue. Most inference today (95%+) involves custom or post-trained models rather than vanilla open-source weights. Enterprises are still early in adoption, but the shift to specialized models built on proprietary workflows and user signals will sustain an independent application layer. On Chinese models, Srivastava sees them as high-quality with few real security issues in practice, arguing that US access accelerates innovation while the US must strengthen its own open-source efforts. Capacity is the biggest bottleneck, with high utilization, multi-cloud strategies, and long-term contracts becoming standard. Inference software layers are sticky, and building loops between inference, post-training, and tools like sandboxes will drive the next phase.
"Intelligence being embedded in more places" is the core opportunity as costs drop and agents handle longer horizons.