185 results
July 17, 2025 / Gemini
Veo 3, Google’s latest AI video generation model, is now available in paid preview via the Gemini API and Google AI Studio. Unveiled at Google I/O 2025, Veo 3 can generate both video and synchronized audio, including dialogue, background sounds, and even animal noises. This model delivers realistic visuals, natural lighting, and physics, with accurate lip syncing and sound that matches on-screen action.
July 16, 2025 / AI
The `logprobs` feature, now officially available in the Gemini API on Vertex AI, provides insight into the model's decision-making by showing probability scores for chosen and alternative tokens. This step-by-step guide walks you through how to enable and interpret the feature and apply it to use cases such as confident classification, dynamic autocomplete, and quantitative RAG evaluation.
July 16, 2025 / Cloud
The Marin project aims to expand the definition of 'open' in AI to include the entire scientific process, not just the model itself, by making the complete development journey accessible and reproducible. Powered by the JAX framework and its Levanter tool, this effort makes it possible to scrutinize, trust, and build upon foundation models, fostering a more transparent future for AI research.
July 16, 2025 / Gemini
The updated Agent Development Kit (ADK) simplifies and accelerates the process of building AI agents by giving the CLI a deep, cost-effective understanding of the ADK framework. Developers can quickly ideate, generate, test, and improve functional agents through conversational prompts, which removes friction and keeps them in a productive "flow" state.
July 14, 2025 / Gemini
The Gemini Embedding text model is now generally available in the Gemini API and Vertex AI. This versatile model has consistently ranked #1 on the MTEB Multilingual leaderboard since its experimental launch in March, supports over 100 languages, accepts up to 2,048 input tokens, and is priced at $0.15 per 1M input tokens.
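July 10, 2025 / Gemini
GenAI Processors is a new open-source Python library from Google DeepMind. It provides a consistent "Processor" interface for every step from input handling through model calls to output processing, enabling seamless chaining and concurrent execution. The goal is to simplify AI application development, especially for applications that handle multimodal input and need real-time responsiveness.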
July 10, 2025 / Cloud
Updates in Firebase Studio include new Agent modes, foundational support for the Model Context Protocol (MCP), and Gemini CLI integration, all designed to redefine AI-assisted development and allow developers to create full-stack applications from a single prompt and integrate powerful AI capabilities directly into their workflow.
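July 9, 2025 / Gemma
T5Gemma is a new family of encoder-decoder LLMs developed by converting and adapting pretrained decoder-only models based on the Gemma 2 framework. Compared with their decoder-only counterparts, these models offer better performance and efficiency, particularly on tasks that require deep input understanding, such as summarization and translation.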
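July 7, 2025 / Gemini
The new batch mode in the Gemini API is designed for high-throughput AI workloads that are not latency-sensitive. It simplifies large jobs by handling scheduling and processing, and makes tasks such as data analysis, bulk content creation, and model evaluation more cost-effective and scalable, so developers can process large volumes of data efficiently.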
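June 26, 2025 / Gemma
Gemma 3n is now fully released. It builds on the success of previous Gemma models and brings advanced on-device multimodal capabilities to edge devices with unprecedented performance. Explore Gemma 3n's innovations, including its mobile-first architecture, MatFormer technology, Per-Layer Embeddings, KV cache sharing, and new audio and MobileNet-V5 vision encoders, and learn how developers can start building with it today.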