r/MachineLearning · 2d ago · 7 · research open source inference

An engineer demonstrates language-model-based source code compression using n-gram models plus arithmetic coding, achieving 82.4% compression (0.176x ratio) on the Flask codebase: 33% better than zlib but 1600× slower. The work shows how token-level modeling captures syntactic patterns better than byte-level compressors, with practical implications for downstream transformer/LSTM approaches and batch optimization.
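The post's pipeline isn't reproduced here, but the underlying principle can be sketched (hypothetical code, not the author's): an adaptive n-gram model's cross-entropy on the token stream is the size an arithmetic coder can approach, so measuring it estimates the achievable ratio. This sketch assumes a vocabulary agreed up front between encoder and decoder.

```python
import math
from collections import Counter, defaultdict

def ngram_bits(tokens, n=3):
    """Estimate compressed size in bits of `tokens` under an adaptive
    order-(n-1) n-gram model with add-one smoothing. An arithmetic coder
    approaches this cross-entropy bound to within ~2 bits total."""
    vocab = set(tokens)          # assumed shared between encoder and decoder
    counts = defaultdict(Counter)
    bits = 0.0
    history = ()
    for tok in tokens:
        ctx = counts[history]
        # Add-one smoothed probability of this token given its context.
        p = (ctx[tok] + 1) / (sum(ctx.values()) + len(vocab))
        bits += -math.log2(p)
        counts[history][tok] += 1            # adaptive update: decoder mirrors this
        history = (history + (tok,))[-(n - 1):]
    return bits

source = ("def add(a, b):\n    return a + b\n" * 50).split()
print(ngram_bits(source, n=3) / 8)           # estimated compressed bytes
```

On repetitive code like this, the estimate drops far below the raw byte count, which is the effect the post exploits at token level.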

r/MachineLearning · 2d ago · 6 · workflow api update library

A developer encounters a breaking change in the Hugging Face Transformers library: the 'question-answering' pipeline task has been deprecated, and they are seeking alternatives for zero-shot extractive QA on plain text. The post highlights a practical workflow issue: code that previously used `pipeline('question-answering')` no longer works, and the available alternatives, such as 'document-question-answering', don't fit text-only use cases.
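Whatever replacement the thread settles on, the extractive decoding step itself is small: pick the (start, end) pair that maximizes the summed start/end logits a QA head emits. A library-agnostic sketch with toy logits (not a Transformers API call):

```python
import math

def best_span(start_logits, end_logits, max_len=30):
    """Pick the (start, end) token span maximizing start_logit + end_logit,
    subject to start <= end and a maximum span length: the standard
    decoding step behind BERT-style extractive QA heads."""
    best, best_score = (0, 0), -math.inf
    for s, s_logit in enumerate(start_logits):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best

# Toy logits over a 6-token context: the answer peaks at tokens 2..3.
start = [0.1, 0.2, 3.0, 0.1, 0.0, 0.0]
end   = [0.0, 0.1, 0.5, 2.5, 0.2, 0.0]
print(best_span(start, end))  # (2, 3)
```

Running any QA-capable encoder and feeding its logits through this decode reproduces the old pipeline's core behavior on plain text.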

r/MachineLearning · 2d ago · 7 · research rag fine tuning inference

Experimental work on augmenting frozen transformers with lightweight external memory for in-context adaptation without weight updates. Uses forward-pass derived correction vectors to enable one-shot binding of new facts while maintaining context separation, with results showing 80%+ accuracy on same-context recall but degraded generalization to new contexts.
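The post's exact mechanism isn't reproduced, but the general pattern of forward-pass-derived corrections can be sketched (all names and thresholds here are hypothetical): store (key, correction) pairs at write time, retrieve by similarity at read time, and add the correction to the hidden state while weights stay frozen. The same-context/new-context split in the results falls out of the similarity threshold:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-9)

class CorrectionMemory:
    """Toy external memory: bind a correction vector to a key (e.g. a
    hidden state from a forward pass) and later add it back to similar
    states, leaving model weights frozen. Illustrative, not the post's code."""
    def __init__(self, threshold=0.8):
        self.entries = []       # list of (key, correction) pairs
        self.threshold = threshold

    def write(self, key, correction):
        self.entries.append((key, correction))

    def read(self, query):
        # Return query plus the best-matching correction, if similar enough.
        if not self.entries:
            return query
        key, corr = max(self.entries, key=lambda kv: cosine(kv[0], query))
        if cosine(key, query) < self.threshold:
            return query        # new context: no correction applied
        return [q + c for q, c in zip(query, corr)]

mem = CorrectionMemory()
mem.write([1.0, 0.0], [0.0, 0.5])   # one-shot binding of a "fact"
print(mem.read([0.9, 0.1]))         # similar context: correction applied
print(mem.read([0.0, 1.0]))         # different context: untouched
```

The degraded generalization the post reports matches this picture: queries outside the write-time neighborhood retrieve nothing.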

r/MachineLearning · 3d ago · 8 · workflow tutorial

A discussion thread addressing the common blocker of content consumption without practical application—exploring how to transition from learning AI concepts to independently building systems. The conversation likely covers project-based learning strategies, determining necessary depth in math/theory, and developing the problem-solving mindset needed for real-world engineering rather than tutorial-following.

r/MachineLearning · 3d ago · 8 · research agent open source benchmark inference

A minimal research implementation of Meta AI's test-time compute scaling paper (PDR+RTV pipeline) for agentic coding tasks, enabling developers to experiment with the approach using Gemini 3.1 Pro on SWE-bench. This is the first public implementation of the paper's core techniques, making it immediately useful for engineers exploring advanced reasoning strategies in coding agents.

r/MachineLearning · 3d ago · 5 · research

Discussion thread exploring practical applications of physics-informed neural networks (PINNs) and physics-informed AI beyond academia. The post raises valid questions about deployment in real industries but is primarily a question seeking examples rather than showcasing actual technical implementations or breakthroughs.
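For readers unfamiliar with the technique under discussion: a PINN's defining ingredient is a physics-residual term added to the usual data-fit loss. A toy sketch for the ODE u'(x) = -u(x), using finite differences in place of the autodiff a real PINN would use (illustrative only):

```python
import math

def pinn_loss(u, xs, data, lam=1.0, h=1e-4):
    """Composite PINN-style loss for the toy ODE u'(x) = -u(x):
    a data-fit term on observed points plus a physics-residual term
    enforced at collocation points."""
    # Data term: squared error against the few labeled observations.
    data_loss = sum((u(x) - y) ** 2 for x, y in data) / len(data)
    # Physics term: the residual u'(x) + u(x) should vanish everywhere.
    def residual(x):
        du = (u(x + h) - u(x - h)) / (2 * h)   # central difference
        return du + u(x)
    phys_loss = sum(residual(x) ** 2 for x in xs) / len(xs)
    return data_loss + lam * phys_loss

xs = [i / 10 for i in range(11)]     # collocation points on [0, 1]
data = [(0.0, 1.0)]                  # a single observation: u(0) = 1
print(pinn_loss(lambda x: math.exp(-x), xs, data))  # exact solution: near zero
print(pinn_loss(lambda x: 1.0, xs, data))           # wrong candidate: penalized
```

The industrial appeal the thread is probing comes from that physics term: it substitutes for labeled data in regimes where measurements are scarce.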

r/MachineLearning · 3d ago · 7 · open source agent rag tutorial

OpenVidya is an open-source multi-agent AI system for curriculum-aware lesson generation tailored to Indian education (NCERT/CBSE), featuring concept dependency graphs, exam-pattern grounding, and five pedagogical modes with mode-specific prompting. The project demonstrates practical application of agentic AI and RAG patterns for domain-specific education, with structured curriculum integration as a reusable architecture pattern.
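At its simplest, the concept-dependency-graph pattern the project describes reduces to ordering lessons so prerequisites come first. A minimal sketch with a hypothetical graph fragment (Python 3.9+ stdlib `graphlib`; the concept names are invented, not from the project):

```python
from graphlib import TopologicalSorter

# Hypothetical fragment of a curriculum concept dependency graph:
# each concept maps to the prerequisites it depends on.
prereqs = {
    "fractions": {"whole numbers"},
    "decimals": {"fractions"},
    "percentages": {"fractions", "decimals"},
    "ratios": {"fractions"},
}

# A lesson plan must teach prerequisites first; a topological order
# gives one valid teaching sequence.
order = list(TopologicalSorter(prereqs).static_order())
print(order)
```

OpenVidya layers exam-pattern grounding and mode-specific prompting on top, but this ordering constraint is the reusable core of the architecture pattern.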

Simon Willison · 3d ago · 7 · tool workflow prompt engineering

Developer built a complete web app entirely on mobile using Claude Code, demonstrating a practical AI-assisted workflow: created a Python CLI tool, set up Git scraping automation, and generated a JavaScript frontend with a single LLM prompt. Shows how Claude can handle multi-layer full-stack development from local tooling to cloud-hosted APIs.

r/MachineLearning · 3d ago · 8 · open source benchmark research

A researcher has assembled and open-sourced a 103.1B-token Usenet corpus (1980-2013) with comprehensive metadata, deduplication, and cleaning, representing a rare, temporally coherent pretraining dataset spanning 33 years of language evolution largely free of modern web contamination. The dataset includes 408M posts across diverse hierarchies with 96.6% English coverage plus 100+ other languages, complete with a published data card and processing methodology on Hugging Face.

r/LocalLLaMA · 4d ago · 8 · library tool open source inference deployment

AutoRound is a mature quantization toolkit for LLMs/VLMs achieving 2-4 bit quantization with minimal accuracy loss using sign-gradient descent, now integrated into major frameworks like vLLM, SGLang, and Transformers. Recent updates include block-wise FP8, mixed-precision schemes, and GGUF format support, making it practical for production deployment with fast quantization times (~10 min for 7B models).
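For context on what the toolkit automates: AutoRound's contribution is learning the rounding decision via sign-gradient descent, refining the baseline round-to-nearest (RTN) step sketched below. This is an illustration of plain group-wise RTN, not AutoRound's algorithm or API; consult the project's docs for actual usage.

```python
def quantize_group(weights, bits=4):
    """Asymmetric round-to-nearest quantization of one weight group.
    (AutoRound learns the rounding decision with sign-gradient descent;
    this sketch shows only the baseline RTN step it improves on.)"""
    qmax = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0          # guard against an all-equal group
    q = [min(qmax, max(0, round((w - lo) / scale))) for w in weights]
    dequant = [v * scale + lo for v in q]    # what the model actually computes with
    return q, dequant

w = [0.31, -0.12, 0.07, 0.44, -0.29, 0.18, 0.02, -0.05]
q, w_hat = quantize_group(w, bits=4)
mse = sum((a - b) ** 2 for a, b in zip(w, w_hat)) / len(w)
print(q)
print(mse)
```

At 4 bits the per-element error is bounded by half the group's scale; AutoRound's learned rounding tightens exactly this reconstruction error, which is why its accuracy loss stays small at 2-4 bits.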

r/MachineLearning · 4d ago · 6 · tool workflow

A discussion thread about open-source PDF-to-Markdown conversion tools, focused on handling complex tables in financial documents. The user compares existing solutions (docling, marker, granite-docling) against paid alternatives like LandingAI, seeking recommendations for robust table parsing.

r/MachineLearning · 4d ago · 8 · tool open source inference deployment

Phosphene is a free macOS desktop app that wraps Lightricks' LTX 2.3 video generation model on Apple Silicon, notable for generating synced audio and video in a single forward pass rather than via post-processing. It offers multiple generation modes (text→video, image→video, frame interpolation), three quality tiers gated honestly by available RAM, and local prompt rewriting via Gemma 3 12B, making it a practical tool for engineers building local video generation workflows.

Latent Space · 4d ago · 7 · new model agent tool inference deployment

OpenAI released GPT-5.5 with strong cyber-task performance (71.4% pass rate on multi-step attack simulations) and expanded Codex into a general-purpose agent for non-coding computer work, with 42% faster inference, dynamic UI routing, and integrations with Microsoft, Google, Salesforce, and creative tools. Anthropic launched Claude Security for code review and expanded creative tool support, while the broader narrative shows AI agents becoming increasingly capable of autonomous task execution across diverse domains.

r/LocalLLaMA · 4d ago · 9 · new model open source inference deployment

Google DeepMind released Gemma 4 26B IT, an open multimodal model supporting text, images, and video, with a 256K context window and a hybrid attention mechanism for efficient inference on consumer GPUs. The NVIDIA-quantized NVFP4 version enables frontier-level performance for reasoning, coding, and agentic workflows, and the model ships under Apache 2.0, permitting both commercial and non-commercial use.

Simon Willison · 4d ago · 7 · agent tool workflow

Codex CLI 0.128.0 introduces a /goal feature that implements agentic looping similar to the Ralph pattern, automatically re-prompting until goal completion or token budget exhaustion. The implementation uses injected continuation and budget-limit prompts, demonstrating a practical approach to autonomous agent workflows with built-in resource constraints.
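The feature's actual injected prompts aren't shown in the post, so here is a generic sketch of the Ralph-style loop with a token budget; the sentinel string, prompts, and model callable are all hypothetical stand-ins:

```python
def goal_loop(model, goal, budget=2000, continue_prompt="Continue toward the goal."):
    """Ralph-style agentic loop: keep re-prompting until the model reports
    the goal is done or the token budget is exhausted. `model` is any
    callable mapping prompt -> (reply, tokens_used); these prompts are
    illustrative, not Codex's actual injected text."""
    used, transcript = 0, []
    prompt = f"Goal: {goal}"
    while used < budget:
        reply, tokens = model(prompt)
        used += tokens
        transcript.append(reply)
        if "GOAL COMPLETE" in reply:       # hypothetical completion sentinel
            return transcript, used
        prompt = continue_prompt           # injected continuation prompt
    transcript.append("[stopped: token budget exhausted]")
    return transcript, used

# Stub model that finishes on its third call, 500 tokens per turn.
calls = iter(["working...", "still working...", "GOAL COMPLETE"])
steps, spent = goal_loop(lambda p: (next(calls), 500), "refactor module")
print(steps, spent)   # three replies, 1500 tokens spent
```

The budget check is what distinguishes this from a naive while-true agent loop: the resource constraint is enforced by the harness, not left to the model.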

Simon Willison · 4d ago · 7 · benchmark new model research

UK AI Security Institute evaluated GPT-5.5's cybersecurity capabilities, finding it comparable to Claude Mythos for vulnerability detection with broader availability. This is a direct model capability assessment relevant to engineers evaluating LLMs for security applications.