News Nug
Gemma 4 QAT 31B responds better to KV cache quantization too
r/LocalLLaMA · 7h ago · 5
GLM-5.2 is on DeepSWE
r/LocalLLaMA · 16h ago · 5
sqlite-utils 4.0rc1 adds migrations and nested transactions
Simon Willison · 18h ago · 5
sqlite-utils 4.0rc1
Simon Willison · 18h ago · 5
Local LLM Inference Optimization: The Complete Guide
r/LocalLLaMA · 18h ago · 5
Samsung Electronics brings ChatGPT and Codex to employees
OpenAI Blog · 18h ago · 5
Temporary Cloudflare Accounts for AI agents
Simon Willison · 19h ago · 5
[ECCV 2026] Paper Decision Appeals Discussion [D]
r/MachineLearning · 21h ago · 5
An Update on Matrix Recurrent Units, an Attention Alternative [R]
r/MachineLearning · 22h ago · 5
Data-centric debugging for teams training neural nets [P]
r/MachineLearning · 1d ago · 5
Best current methods for finetuning whisper on domain specific vocabulary? [P]
r/MachineLearning · 1d ago · 5
EMA on LoRA ? [R]
r/MachineLearning · 1d ago · 5
A slightly improved DVD-JEPA demo [P]
r/MachineLearning · 1d ago · 5
[Exclusive] $250 off AI Engineer tix til Monday
Latent Space · 1d ago · 5
Tokenomics
r/LocalLLaMA · 1d ago · 5
When can I start applying for jobs? [D]
r/MachineLearning · 1d ago · 5
8-16 MI50s Minimax M3 @19 tps TG (peak)
r/LocalLLaMA · 1d ago · 5
I released a softmax-free attention model at GPT-2 Medium scale (~354M params, 11.5B tokens): structural sparsity + tile-skipping kernels for long-context VRAM savings. Open weights + custom Triton kernels [R]
r/MachineLearning · 1d ago · 5
Gemma 4 QAT seems to respond significantly better to KV cache quantization
r/LocalLLaMA · 1d ago · 5
Vercel CEO: "Almost shocked" by how good GLM-5.2 is at coding
r/LocalLLaMA · 1d ago · 5