Baseten CEO on AI Inference Crunch, Custom Models, and Inference Cloud
The Takeaway: The AI inference market is exploding, with custom models dominating workloads and severe compute supply constraints shaping the industry's future. Baseten CEO Tuhin Srivastava shares how his company scaled 30x in a year by serving AI-native apps directly and enterprises indirectly, and now expects over $1B in revenue. Most inference today (95%+) involves custom or post-trained models rather than vanilla open-source weights. Enterprises are still early in adoption, but the shift to specialized models built on proprietary workflows and user signals will sustain an independent application layer. On Chinese models, Srivastava sees them as high-quality with few real security issues in practice, arguing that US access accelerates innovation while the US must strengthen its own open-source efforts. Capacity is the biggest bottleneck, with high utilization, multi-cloud strategies, and long-term contracts becoming standard. Inference software layers are sticky, and building loops between inference, post-training, and tools like sandboxes will drive the next phase.
"Intelligence being embedded in more places" is the core opportunity as costs drop and agents handle longer horizons.