News Nug
SentryCode: Real-time Auditor + Honeytokens for AI Coding Agents [P]
r/MachineLearning · 2d ago · 5
Making LLMs Better at Creative Writing using Entropy
r/LocalLLaMA · 2d ago · 5
[D] Self-Promotion Thread
r/MachineLearning · 2d ago · 5
Making Optimization Work When Labels Are Scarce [R]
r/MachineLearning · 2d ago · 5
Autoresearch: The feedback loop behind self-improving agents
Latent Space · 2d ago · 5
I extended Gemma4-31B to 44B (88 layers) — since Google won't give us anything bigger than 31B
r/LocalLLaMA · 2d ago · 5
Hamiltonian Neural Networks from a Differential Geometry Perspective [D]
r/MachineLearning · 2d ago · 5
Senior SWE Bench: a new benchmark focussed on realistically underspecified feature tasks
r/LocalLLaMA · 2d ago · 5
New PyMuPDF release, supports Markdown [N]
r/MachineLearning · 2d ago · 5
ZCode: New Agentic Code Editor from the Makers of GLM
r/LocalLLaMA · 2d ago · 5
How to describe a model that has higher accuracy with fewer #param and FLOPs? [D]
r/MachineLearning · 2d ago · 5
How Cursor deploys AI inside the enterprise
Latent Space · 2d ago · 5
ACL ARR May 2026 [D]
r/MachineLearning · 3d ago · 5
Couldn't hold back
r/LocalLLaMA · 3d ago · 5
Moth-Retrieval: Graph-Free Multi-Hop Retrieval via Query-Time Orchestration (Beating Graph-Based Systems on HotpotQA) [P]
r/MachineLearning · 3d ago · 5
Open Models - June 2026
r/LocalLLaMA · 3d ago · 5
gemma-4-31B on Cerebras is better than ChatGPT voice mode
r/LocalLLaMA · 3d ago · 5
[D] Simple Questions Thread
r/MachineLearning · 3d ago · 5
SWE-rebench leaderboard update: GLM-5.2, Qwen3.6-27B, Qwen3.6-35B-A3B, Gemma 4 31B and more + improved UI
r/LocalLLaMA · 3d ago · 5
🔬 The Coolest Diffusion Research Isn't in LLMs — Evan Feinberg & Sergey Edunov, Genesis Molecular AI
Latent Space · 3d ago · 5