llm.c is a high-performance C/CUDA implementation of LLM pretraining that drops heavy dependencies (PyTorch, Python) while training roughly 7% faster than PyTorch Nightly. It provides clean reference implementations for reproducing GPT-2/GPT-3 models with both GPU (CUDA) and CPU code paths, making it valuable for understanding the mechanics of model training and CUDA optimization.
HN AI Stories · 756d ago · 9 · open source · library · inference · tutorial
HN AI Stories · 887d ago · 8 · tool · open source · deployment · inference
The llamafile 0.10.0 update from Mozilla.ai lets developers distribute and run open LLMs as single-file executables across platforms with no installation required, now better aligned with recent llama.cpp versions and supporting newer models. The release also includes whisperfile, a single-file speech-to-text tool, making local LLM deployment significantly more accessible.