GPT-5.5 Sparks Discussion: First F-Zero Test Pass, Big Enterprise Gains
Peter Yang's F-Zero test succeeded for the first time: GPT-5.5 + Codex was the only combo that built a playable game. Dan Shipper noted that GPT-5.5 no longer hesitates and simply executes its plans. Box CEO Aaron Levie shared enterprise evals showing roughly a 10-point accuracy jump across financial, healthcare, and public-sector workloads (financial 83% vs 64%, healthcare 78% vs 61%).
View original →
State of the AI Coding Wars: 2026 Coding Agents Break Containment
From an Unsupervised Learning x Latent Space crossover: Anthropic and OpenAI are each at roughly $2B ARR from coding products alone. Swyx argues that 2025 was the year of coding agents, and 2026 is when they break containment and start doing everything else. The market is still in a capability-exploration phase that rewards the crazy and creative.
View original →
AI Makes You Work More? Box CEO on the Work Paradox
Aaron Levie argues AI won't automatically reduce work because work isn't static. AI lowers the cost of exploration, so people take on far more tasks that previously went undone for being too time-consuming. Small things can quickly eat 3 hours: agents make them easy to start, but finishing still requires human effort.
View original →
Anthropic Postmortem on Claude Code Quality: Three Issues Fixed
Anthropic's engineering team investigated user reports of degraded Claude Code quality and found three separate issues: the default reasoning effort had changed from high to medium (reducing intelligence), a caching bug cleared the thinking history on every conversation turn (causing forgetfulness and repetition), and a system-prompt change meant to reduce verbosity hurt coding quality. All three were resolved by April 20, and usage limits were reset for all subscribers.
View original →
Replit CEO Pushes Back on Chinese Distillation Scaremongering
Amjad Masad criticized US politicians for scaremongering about "Chinese distillation," pointing out that Chinese scientists are openly sharing real AI breakthroughs; these advances have nothing to do with data and benefit everyone, including US labs. He also noted that DeepSeek v4 had just been released.
View original →