Anthropic Research · 12h ago · 8 · open source tool benchmark research

Petri 3.0, an open-source alignment testing toolbox that evaluates LLMs for misaligned behaviors such as deception and sycophancy, has been transferred to the nonprofit Meridian Labs. The tool uses separate auditor and judge models to test alignment systematically across scenarios, and it is now part of a broader evaluation stack alongside Inspect and Scout for independent, credible model assessment.

Anthropic Research · 12h ago · 9 · research tool open source workflow

Anthropic introduces Natural Language Autoencoders (NLAs), a method that converts neural network activations into human-readable text explanations, enabling direct interpretation of what language models are thinking. The approach trains models to explain their own activations and reconstruct them from text, with applications to safety testing and reliability improvements. Code and an interactive frontend are released for researchers to build on this interpretability technique.

r/MachineLearning · 12h ago · 7 · tutorial inference deployment

Manning is releasing 'Quantization and Fast Inference' by Kalyan Aranganathan, a practical guide covering PTQ, QAT, and production deployment trade-offs for efficient model inference. The book addresses real-world quantization challenges such as activation outliers in LLMs, KV cache optimization, and hardware-specific behavior, moving beyond theory to operational constraints.
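To make the activation-outlier problem concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is not taken from the book; the function names and the toy values are assumptions chosen for illustration. It shows how a single large value stretches the scale and degrades precision for every other value in the tensor.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: one scale maps max |v| to 127."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def roundtrip(values):
    """Quantize, then dequantize, returning the reconstructed floats."""
    quantized, scale = quantize_int8(values)
    return [q * scale for q in quantized]


def mean_abs_error(original, reconstructed):
    """Average absolute reconstruction error over paired values."""
    pairs = list(zip(original, reconstructed))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical activations in [-1, 1], then the same values plus one outlier.
    small = [0.1 * i for i in range(-10, 11)]
    with_outlier = small + [100.0]

    err_plain = mean_abs_error(small, roundtrip(small))
    err_outlier = mean_abs_error(small, roundtrip(with_outlier)[: len(small)])
    print(f"error without outlier: {err_plain:.5f}")
    print(f"error with outlier:    {err_outlier:.5f}")
```

Per-channel scales or outlier-aware schemes are the usual responses to this effect; the sketch only demonstrates the failure mode, not a remedy.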

Simon Willison · 14h ago · 6 · new model benchmark

Simon Willison shares a retrospective analysis of Gemini 3.1 Flash-Lite, comparing the March preview version to the now-released production model. The writeup covers the technical characteristics of this lightweight variant in Google's Gemini 3.1 lineup and is useful for understanding model capabilities and trade-offs across deployment scenarios.

Simon Willison · 16h ago · 8 · workflow research tool agent

Mozilla used Claude Mythos preview to systematically identify and fix hundreds of Firefox security vulnerabilities, relying on improved AI-guided techniques for steering, scaling, and filtering model outputs. The approach found 423 security bugs in April 2026 (vs. 20-30 per month previously), demonstrating a practical application of advanced LLMs for security auditing at scale.

r/MachineLearning · 16h ago · 6 · research prompt engineering

A developer proposes using diffusion models that operate on abstract syntax trees (ASTs) to guarantee syntactically correct code generation by constraining the search space to valid program structures rather than token sequences. The post suggests this approach could reduce training data requirements by exploiting the finite combinatorial space of valid ASTs with a fixed node count.
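The property the proposal relies on, that anything built from AST nodes is syntactically valid by construction, can be sketched with Python's standard `ast` module. This is not a diffusion model and not the poster's method; the random sampler below is a stand-in for any generator that emits nodes instead of tokens, and `random_expr` and `sample_source` are hypothetical names.

```python
import ast
import random

# A tiny node vocabulary: integer constants and three binary operators.
OPERATORS = [ast.Add, ast.Sub, ast.Mult]


def random_expr(rng, depth):
    """Recursively sample an arithmetic expression tree of bounded depth."""
    if depth == 0 or rng.random() < 0.3:
        return ast.Constant(value=rng.randint(0, 9))
    return ast.BinOp(
        left=random_expr(rng, depth - 1),
        op=rng.choice(OPERATORS)(),
        right=random_expr(rng, depth - 1),
    )


def sample_source(seed, depth=3):
    """Sample a tree and render it to source text (requires Python 3.9+)."""
    rng = random.Random(seed)
    tree = ast.Expression(body=random_expr(rng, depth))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    for seed in range(5):
        source = sample_source(seed)
        # Every sample parses, because it was never assembled from raw tokens.
        ast.parse(source, mode="eval")
        print(source)
```

Syntactic validity says nothing about semantic correctness, which is where a learned model would still have to do the work.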

r/MachineLearning · 17h ago · 6 · research workflow benchmark

A software engineer shares a technical approach using Jensen-Shannon divergence (JSD) to detect narrative shifts in AI news before sentiment aggregates register them, comparing rolling 7-day windows across vocabulary distributions and an 8-category narrative frame taxonomy. The core challenge is establishing reliable baselines and trigger thresholds at short time horizons where existing semantic change literature (typically longer-term) may not directly apply, raising questions about window sizing, distance metrics, and frame granularity for daily news regime detection.
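As a rough illustration of the window comparison described above, the sketch below computes base-2 Jensen-Shannon divergence between the vocabulary distributions of two document windows. The whitespace tokenization, the function names, and the sample headlines are assumptions, not details from the post; a real pipeline would also need smoothing choices and the baseline thresholds the author is asking about.

```python
import math
from collections import Counter


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


def tokenize(docs):
    """Naive lowercase whitespace tokenization (an assumption)."""
    return [token for doc in docs for token in doc.lower().split()]


def window_jsd(previous_docs, current_docs):
    """JSD between the word distributions of two document windows."""
    prev_counts = Counter(tokenize(previous_docs))
    curr_counts = Counter(tokenize(current_docs))
    if not prev_counts or not curr_counts:
        raise ValueError("each window needs at least one token")
    vocab = sorted(set(prev_counts) | set(curr_counts))
    prev_total = sum(prev_counts.values())
    curr_total = sum(curr_counts.values())
    p = [prev_counts[w] / prev_total for w in vocab]
    q = [curr_counts[w] / curr_total for w in vocab]
    return jsd(p, q)


if __name__ == "__main__":
    # Hypothetical headlines standing in for two consecutive 7-day windows.
    last_week = ["chip export ban widens", "regulators debate chip export rules"]
    this_week = ["startup funding round closes", "chip startup raises funding"]
    print(f"JSD between windows: {window_jsd(last_week, this_week):.3f}")
```

The same `jsd` function applies unchanged to the 8-category frame distributions, since it only needs two aligned probability vectors.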

r/MachineLearning · 19h ago · 6 · inference deployment workflow

A practitioner explores the viability of ROCm for model training on AMD GPUs (RX 7900 XTX) as an alternative to NVIDIA RTX 3090s, noting that PyTorch supports it but that concrete user reports on training performance and ecosystem maturity are hard to find. The comparison focuses on the card's FP16 throughput advantage, and the poster is seeking real-world validation of ROCm's production-readiness for training workflows.

r/MachineLearning · 21h ago · 8 · tool tutorial open source

An interactive dataflow visualization tool for understanding transformer architectures from first principles, covering attention mechanisms (MLA, hybrid attention, RoPE), routing methods (MoE), and model variants from GPT-2 to Qwen 3.6. Useful for engineers who need to understand architectural differences and implementations across modern LLM families.

r/MachineLearning · 21h ago · 6 · inference workflow

A software engineer asks about reproducibility of video diffusion models across different GPU architectures, questioning whether identical weights, prompts, and noise seeds produce perceptually similar outputs despite floating-point arithmetic differences. This technical question touches on practical concerns for deterministic inference and model deployment consistency.
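One well-known source of such differences is that floating-point addition is not associative, so hardware or kernels that reduce values in a different order can produce slightly different sums from identical inputs. The snippet below shows the effect on the CPU with Python floats; it is a generic illustration and makes no claim about how any particular GPU or video diffusion model behaves.

```python
import math


def sum_left_to_right(values):
    """Accumulate in list order, like a simple sequential reduction."""
    total = 0.0
    for value in values:
        total += value
    return total


def sum_pairwise(values):
    """Accumulate by recursive halving, like a tree-style parallel reduction."""
    if len(values) == 1:
        return values[0]
    middle = len(values) // 2
    return sum_pairwise(values[:middle]) + sum_pairwise(values[middle:])


if __name__ == "__main__":
    # Same three numbers, two groupings, two different results.
    grouped_left = (0.1 + 0.2) + 0.3
    grouped_right = 0.1 + (0.2 + 0.3)
    print(grouped_left == grouped_right)             # False
    print(math.isclose(grouped_left, grouped_right))  # True

    values = [0.1] * 1000
    print(sum_left_to_right(values), sum_pairwise(values))
```

Whether such last-bit differences stay perceptually invisible after many denoising steps is exactly the open question in the post.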

r/LocalLLaMA · 22h ago · 7 · tool inference open source

This PR adds MiMo V2.5 model support to llama.cpp with text-to-text inference capabilities, including proper FP8 dequantization handling and attention value scale fixes for better transformer compatibility. The implementation addresses weight sharding complexities and unfuses attention components to maintain compatibility with existing MiMo V2 inference paths.

OpenAI Blog · 23h ago · 6 · agent deployment api update

Parloa is a platform that uses OpenAI's models to build voice-based customer service agents with simulation and deployment capabilities. While it demonstrates practical application of LLMs for enterprise use, it's primarily a SaaS product rather than a new technical capability or tool that directly impacts daily AI engineering workflows.

OpenAI Blog · 1d ago · 9 · api update new model inference

OpenAI has released new realtime voice models in their API supporting reasoning, translation, and transcription capabilities. This enables building voice applications with lower latency and more natural interactions, expanding the technical possibilities for voice-based AI products and integrations.

Simon Willison · 1d ago · 6 · tool api update

Simon Willison built a tool that fetches GitHub repository statistics (commits, etc.) via REST/GraphQL API to work around missing metrics on GitHub's mobile site. The tool demonstrates practical API usage for extracting repository metadata that engineers might find useful when evaluating projects.

Latent Space · 1d ago · 7 · api update agent tool inference

Anthropic announced Claude Code feature updates (doubled rate limits, removed peak-hour restrictions) and new agent platform capabilities at its developer event, plus a SpaceX compute partnership that enables immediate product improvements. Although there is no new model release, the practical updates to Claude Code and the emerging multi-agent orchestration patterns are useful for engineers building with Claude.