Best Local Runner
Updated DailyRankings use category fit, feature coverage, pricing signals, public reception, and recency. Affiliate relationships do not affect scores.
No tags available
LM Studio is a revolutionary desktop application that simplifies running large language models locally. It provides a user-friendly interface for downloading, configuring, and deploying various open-s...
The premier all-in-one local LLM runner with built-in model download, management, and inference. The benchmark for local AI.
vLLM is primarily a high-throughput serving engine, but its ability to run models locally makes it invaluable for developers building local AI services. It implements advanced techniques like PagedAtt...
While not a dedicated IDE plugin, utilizing the Hugging Face Transformers library directly within a Python script allows developers to load and run the absolute latest, state-of-the-art models locally...
Text Generation WebUI is a highly popular open-source LLM inference web interface built around the llama.cpp library. Its renowned for its extensive feature set, including support for various quantiza...
Mixtral is famous for its Mixture-of-Experts (MoE) architecture, allowing it to achieve performance rivaling much larger models while maintaining reasonable inference speeds when self-hosted. Running...
Continue is a powerful VS Code/JetBrains extension that excels at providing a chat-like interface directly within the IDE, allowing you to interact with various local backends (like Ollama or llama.cp...
llama.cpp-mac is a highly optimized port of the llama.cpp library specifically tailored for Apple Silicon Macs. Its designed to deliver exceptional inference performance, particularly with GGUF quanti...
Jan AI aims to provide a polished, standalone desktop application experience for running local LLMs. It balances the ease of use of LM Studio with a more polished, integrated feel, making it accessibl...
This package provides Python bindings directly to the highly optimized llama.cpp core. It is the preferred method for developers who want the raw speed and efficiency of llama.cpp but need to interact...
DeepSeek Coder models are specifically trained on massive, high-quality code datasets, giving them a distinct edge in code generation accuracy across multiple languages. When run locally, they provide...
GPT4All provides a streamlined way to run LLMs on CPUs. It's designed for users who dont have access to powerful GPUs, offering a surprisingly capable experience with optimized models. It focuses on...
StarCoder2, trained by DeepMind and Hugging Face, is a highly respected, academically validated model for code generation. It excels at understanding the context provided by surrounding code blocks an...
Microsoft's Phi-3 Mini is celebrated for achieving surprisingly high performance on complex tasks despite its relatively small parameter count. When run locally, it offers incredibly fast inference sp...
While often marketed for creative writing and roleplaying, KoboldAI provides a robust local inference engine that can be adapted for coding tasks. Its strength lies in its highly configurable text gen...
Code Llama, Meta's dedicated coding model, remains a foundational and highly stable choice for local development. It benefits from Meta's massive resources and is specifically tuned for coding tasks....
GPT-Engineer is an agentic framework designed to take a high-level prompt and generate a complete, multi-file project structure. When adapted to use local models via Ollama or llama.cpp, it becomes a...
You're in. We'll email you when new Local Runner land.