🌐 双语
Archive

AI Builders
Digest

2026-05-08 17 builders · 34 tweets · 1 podcasts · 0 blogs

🔥 热点话题

OpenAI董事会成员Zico Kolter谈前沿AI安全风险OpenAI Board Member Zico Kolter on Frontier AI Safety Risks

OpenAI董事会成员兼安全与安全委员会主席、卡内基梅隆大学机器学习系主任Zico Kolter分享了AI治理实践。核心观点是:模型不会仅通过规模化自动变得更安全,需要明确的针对性训练、多层防御(如瑞士奶酪模型)和持续治理努力。OpenAI的Preparedness框架针对生物、cyber等灾难性风险设定阈值和保障措施。

Kolter将AI风险分为四类:模型错误(如幻觉和prompt injection)、有害使用、社会心理影响以及失控情景。他强调安全工作必须覆盖所有维度,而非只关注单一方面。关于doomer vs accelerationist辩论,他认为这些标签过于简化,95%以上的研究者都认可AI的巨大潜力同时需警惕风险。

他还讨论了jailbreak研究(如GCG论文)和现代防御:输入/输出分类器、安全训练以及运营监控。代理系统扩大了攻击面,尤其是prompt injection,但通过沙箱和适当权限仍可安全用于生产。

引用:"You can't just sort of trust models to get safer by getting bigger."
OpenAI board member and chair of the Safety and Security Committee, Carnegie Mellon ML department head Zico Kolter shares insights on AI governance in practice. The core message: models do not automatically become safer with scale. Explicit safety training, multi-layered defenses (Swiss cheese model), and ongoing governance are required. OpenAI's Preparedness framework sets thresholds and safeguards for catastrophic risks like biological and cyber misuse.

Kolter categorizes AI risks into four types: model mistakes (hallucinations, prompt injection), harmful use, societal/psychological effects, and loss of control. Safety efforts must address all, not just one. He dismisses doomer/accelerationist labels as oversimplifications, noting most researchers see both promise and risks.

He covers jailbreak research (GCG paper) and modern defenses: input/output classifiers, safety training, and operational monitoring. Agents expand attack surfaces via prompt injection but can be production-ready with sandboxes and proper permissions.

Quote: "You can't just sort of trust models to get safer by getting bigger."
查看原文 →

Sam Altman:AI让开发者变超级英雄,需加强安全Sam Altman: AI Turns Developers into Superheroes, Urgent Need for Security

OpenAI CEO Sam Altman表示,帮助软件开发者通过AI进化成超级英雄比取代他们更有价值。一个真正优秀的人现在能做到的事情令人难以置信。同时,他强调公司应尽快开始帮助自身加强安全工作,这一点非常重要。
OpenAI CEO Sam Altman says it's way cooler to help software developers pokemon-evolve into superheroes than to replace them. It is insane what one really good person can do now. He also notes OpenAI wants to help companies secure themselves and believes it's important to start this work quickly.
查看原文 →查看原文 →

💰 创业成功案例

Madhu Guru离开Google,回顾Gemini从追赶到前沿Madhu Guru Departs Google, Reflects on Gemini Journey

Google Gemini产品领导者Madhu Guru宣布离开公司。他帮助从零构建了两项业务,包括Gemini。从三年前OpenAI和Anthropic领先,到构建模型 playbook、客户反馈飞轮和企业业务,Gemini 3标志着这些系统整合成功。团队从underdogs成长为前沿竞争者。
Google Gemini product leader Madhu Guru announced his departure. He helped build two businesses from zero, including Gemini. Three years ago OpenAI and Anthropic led; the team built the playbook for models, customer feedback flywheel, and enterprise business. Gemini 3 was the moment systems came together, turning underdogs into frontier competitors.
查看原文 →

🛠️ 开发者工具与技巧

Claude集成Microsoft Office全家桶Claude Integrates with Microsoft Office Suite

Anthropic的Claude for Excel、PowerPoint和Word现已全面可用,Outlook版进入公测。Claude可在Microsoft应用间保持完整对话上下文。
Anthropic's Claude for Excel, PowerPoint, and Word are now generally available, with Outlook in public beta. Claude carries full conversation context as it moves between Microsoft apps.
查看原文 →

Cursor和代理工具更新:从想法到合并Cursor and Agent Tools Updates: Idea to Merge

Cursor设计者Ryo Lu展示了使用Cursor从想法到代码合并的全流程。Peter Steinberger分享了Claw与Molty代理相互通信、委派cron任务的能力。Garry Tan发布了GStack v1.28,新增浏览器下载、headed模式和llms.txt支持。
Cursor designer Ryo Lu demonstrated going from idea to merge entirely in Cursor. Peter Steinberger shared claws talking to each other and Molty delegating cron jobs. Garry Tan dropped GStack v1.28 with browser downloads, headed mode, anti-bot detection, and llms.txt for better agent skills.
查看原文 →查看原文 →查看原文 →

Claude帮助Firefox团队创纪录修复安全bugClaude Helps Firefox Team Set Bug-Fixing Record

借助Claude Mythos Preview,Firefox团队在四月修复的安全bug数量超过过去15个月的总和。
With help from Claude Mythos Preview, the Firefox team fixed more security bugs in April than in the past 15 months combined.
查看原文 →

Aaron Levie谈AI自动化后的差异化Aaron Levie on Differentiation After AI Automation

Box CEO Aaron Levie指出,当AI让某件事变得容易时,每个人都能做到,竞争力量会将资源转向销售、营销和客户成功等差异化领域。问自己:如果大家都用这项技术做同样的事,我如何脱颖而出?
Box CEO Aaron Levie notes that when AI makes one thing easy, everyone can do it, so competitive forces shift resources to sales, marketing, and customer success for differentiation. Ask: if everyone does exactly what I do with this technology, how will I stand out?
查看原文 →

🌍 其他动态

Replit托管史上最病毒式请愿Replit Hosts Most Viral Petition in History

Replit CEO Amjad Masad分享:一个关于Mbappe的请愿在Replit上成为史上最病毒式传播(Replit对此无立场)。
Replit CEO Amjad Masad notes they're calling it the most viral petition in history, hosted on Replit (Replit has no opinion on Mbappe).
查看原文 →

AI安全与代理生产就绪性AI Security and Agent Production Readiness

Zico Kolter确认在适当guardrails、沙箱和权限控制下,AI代理已可用于生产。Sam Altman强调帮助公司加强安全的重要性。
Zico Kolter confirms AI agents are production-ready with proper guardrails, sandboxing, and permissions. Sam Altman emphasizes the importance of helping companies secure themselves.
查看原文 →查看原文 →

🌍 其他动态

Zico Kolter的现代AI入门课程Zico Kolter's Intro to Modern AI Course

卡内基梅隆大学Zico Kolter推出免费在线Intro to Modern AI课程,学生从零构建LLM(约200-300行Python代码),包括PyTorch、RL和工具调用,强调AI系统本质上非常简单,其复杂性主要来自训练数据。
Carnegie Mellon’s Zico Kolter launched a free online Intro to Modern AI course where students build an LLM from scratch (~200-300 lines of Python), covering PyTorch, RLHF with tool calls. He emphasizes modern AI systems are incredibly simple; complexity emerges from training data.
查看原文 →

AI行业人员动态AI Industry Moves

Google Gemini产品领导者Madhu Guru离开公司;多位AI构建者分享代理工具、浏览器集成和个人生产力提升。
Google Gemini product leader Madhu Guru is moving on; multiple AI builders share agent tools, browser integrations, and productivity gains.