Simon Willison · 2d ago · 6 · tool api update agent

Datasette 1.0a30 introduced a new makeJumpSections() JavaScript plugin hook that datasette-agent leverages to add agent chat functionality directly into the Jump to menu interface. This represents a practical integration pattern for embedding AI agents into existing tools, though it's specific to the Datasette ecosystem rather than broadly applicable.

Latent Space · 4d ago · 6 · agent workflow api update

Industry shift from models as primary product to agents as integrated systems combining models, harnesses, UI, and workflows. Major players (OpenAI, AI21, DeepSeek) are building dedicated agent teams and reducing standalone model focus, with concrete shipping examples like OpenAI's Codex updates and Claude's auto-mode expansion showing product differentiation moving beyond model quality alone.

r/MachineLearning · 4d ago · 6 · api update inference deployment

Analysis of AI lab profitability models (Anthropic, xAI, OpenAI) and their implications for API pricing and developer costs. The article examines divergent strategies: Anthropic's enterprise lock-in approach with claimed 77% margins versus xAI's aggressive subsidy-driven approach, with direct impact on token pricing through Q3.

Latent Space · 7d ago · 9 · new model api update agent workflow

Google released Gemini 3.5 Flash (GA immediately) with 1M context window, 65k max output, and agentic/coding capabilities, plus the new Gemini Omni multimodal family for video/audio generation and editing. The stack includes expanded Antigravity agents across desktop/CLI/SDK/API, with Google reporting 3.2 quadrillion tokens/month processed and 900M+ monthly users.

Simon Willison · 7d ago · 9 · new model api update deployment inference

Google released Gemini 3.5 Flash to general availability with 1M input/65K output tokens, integrated into billions of consumer products, but at 3-6x higher pricing than previous Flash models ($1.50/$9 per million tokens). The release includes a new Interactions API (beta) for server-side history management and demonstrates industry-wide trend of pricing increases for new model releases across OpenAI, Anthropic, and Google.

Anthropic Blog · 8d ago · 8 · tool agent api update deployment

Anthropic acquired Stainless, the company behind SDK generation and MCP server tooling that powers Claude integrations. This acquisition strengthens agent connectivity by consolidating SDK/CLI generation and Model Context Protocol infrastructure, directly impacting how developers build tool-calling capabilities for AI agents.

OpenAI Blog · 9d ago · 6 · deployment agent api update

OpenAI and Dell are partnering to enable on-premise deployment of Codex for enterprise environments, addressing secure AI coding in hybrid setups. This allows software engineers to integrate AI coding capabilities within their own infrastructure while maintaining data privacy and control.

DeepMind Blog · 9d ago · 8 · new model api update inference

Google released Gemini Omni Flash, a multimodal generative model that creates and edits video from text, image, audio, and video inputs with consistent physics and character continuity. The model supports iterative natural language editing and reasoning about real-world physics, now rolling out to Gemini app, Google Flow, and YouTube Shorts with plans to add image and audio generation.

DeepMind Blog · 10d ago · 7 · new model agent api update workflow

Google launches Gemini for Science, a collection of experimental AI tools (Co-Scientist, Alpha Evolve, Empirical Research Assistance, NotebookLM) designed to accelerate scientific research workflows by automating complex tasks like literature analysis and data synthesis. Enterprise versions are already in private preview with companies like BASF and Bayer, with validation papers published in Nature.

DeepMind Blog · 10d ago · 6 · tool api update deployment

Google is expanding SynthID digital watermarking and C2PA Content Credentials verification across its products (Search, Gemini, Chrome, Pixel) to help detect AI-generated vs. authentic content. The verification tools have already been used 50 million times and are rolling out to more platforms, with industry partners like OpenAI and ElevenLabs adopting SynthID for their generated content.

Latent Space · 12d ago · 7 · tool agent api update workflow

GitHub and OpenAI released significant updates to coding agent tooling: GitHub's new Copilot App provides an agent-first desktop environment for parallel workflows, while OpenAI expanded Codex into mobile with remote execution, SSH management, and programmatic automation hooks. VS Code added multi-agent/multi-project support with browser/mobile access via vscode.dev/agents and token-efficiency features.

OpenAI Blog · 12d ago · 5 · workflow api update

Article describes using Codex (OpenAI's code model) to automate documentation generation for data science workflows, converting raw work inputs into structured business outputs like briefs and analytics specs. Practical for engineers integrating LLMs into data pipelines, though focuses more on business process automation than novel technical implementation.

r/LocalLLaMA · 12d ago · 6 · tool workflow api update

VS Code's AI Toolkit extension now supports agent-first development with configurable language models optimized for different tasks, including reasoning models with adjustable thinking effort levels. The article covers model selection strategies (fast vs. reasoning models), tool-calling support for agents, and how to configure API keys for custom models.

OpenAI Blog · 13d ago · 6 · api update tool workflow

OpenAI's Codex integration in the ChatGPT mobile app enables remote code generation and task monitoring across devices. This expands practical access to AI-assisted coding workflows beyond desktop environments, useful for developers managing remote infrastructure or mobile-first development pipelines.

OpenAI Blog · 13d ago · 5 · api update

OpenAI has implemented safety updates to ChatGPT that improve contextual understanding of sensitive conversations and risk detection patterns. While the safety mechanisms are interesting from an AI safety perspective, the practical technical details and implementation methods are not disclosed, limiting direct applicability for engineers building with AI.

Anthropic Blog · 13d ago · 6 · api update workflow agent

Anthropic launched Claude for Small Business, a package of pre-built agentic workflows and connectors that integrate Claude into tools like QuickBooks, HubSpot, and Google Workspace for small business automation tasks. The offering includes 15 ready-to-run workflows across finance, sales, and operations, plus emphasis on data security and AI training partnerships.

r/MachineLearning · 13d ago · 8 · open source tool inference deployment api update

Scenema Audio releases open-source diffusion-based TTS model weights and inference code that decouples emotional performance from voice identity through prompt-based control. Key technical advantages include more natural emotional delivery than autoregressive TTS, support for audio-first video generation workflows, optimized diffusion (8 steps), and Docker/REST API deployment with automatic VRAM management. Practical trade-offs noted: stochastic quality requiring post-editing workflow, sensitivity to detailed prompting, and phonetic spelling for complex words.

Latent Space · 14d ago · 7 · fine tuning benchmark agent open source research api update

OpenAI is deprecating fine-tuning APIs, shifting the AI engineering landscape toward open models, longer context windows, and agentic systems. The piece covers emerging research benchmarks (FrontierMath, medical evals), agentic breakthroughs in math/physics/coding, and the practical move away from proprietary model fine-tuning toward prompt engineering and open-source RLFT alternatives.

Simon Willison · 14d ago · 9 · api update inference tool

OpenAI's reasoning-capable models now use a new /v1/responses endpoint instead of /v1/chat/completions, enabling interleaved reasoning across tool calls for GPT-5 class models. Developers can now view summarized reasoning tokens in their prompts with new command flags (-R/--hide-reasoning) to control visibility.