Best Local LLM
Updated Daily
Rankings use category fit, feature coverage, pricing signals, public reception, and recency. Affiliate relationships do not affect scores.
Easiest way to run various local LLMs. Works with JetBrains via Continue, Tabby, or custom scripts for code completion and chat.
Continue AI is a highly flexible, open-source extension designed to act as a universal AI coding copilot. Its standout feature is its ability to connect to virtually any LLM (local, cloud, or private), mak...
vLLM is primarily a high-throughput serving engine, but its ability to run models locally makes it invaluable for developers building local AI services. It implements advanced techniques like PagedAtt...
vLLM is primarily known for its high-throughput serving capabilities, utilizing advanced techniques like PagedAttention. While it's often used for cloud deployment, running it locally allows developer...
A feature-rich web interface for Ollama, providing a ChatGPT-like experience. Can be paired with LM Studio for model management.
Llama 3 8B represents a massive leap in general reasoning and instruction following for local models. While not exclusively a coding model, its superior coherence and ability to follow complex, multi-...
This combination represents the gold standard for accessible local coding assistance. Ollama provides a simple, robust API layer, while CodeLlama offers specialized performance on code tasks. It is hi...
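As a rough illustration of the "API layer" described above, the sketch below builds a request for Ollama's `/api/generate` endpoint using only the Python standard library. It assumes Ollama is running on its default local port (11434) and that a model tagged `codellama` has already been pulled. The helper names `build_generate_request` and `complete` are hypothetical, chosen for this example.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="codellama"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, model="codellama"):
    """Send the request and return the generated text.

    Requires a running Ollama server with the model available locally.
    """
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `complete("Write a Python function that reverses a string.")` should return the model's answer as a plain string. Editor plugins that talk to Ollama make requests of this general shape.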
Mixtral provides massive effective parameter count and superior context handling due to its Mixture-of-Experts (MoE) architecture. This makes it phenomenal for understanding very large codebases or co...
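To make the Mixture-of-Experts idea concrete, here is a toy sketch of top-2 routing: a router scores every expert for a token, only the two highest-scoring experts run, and their outputs are mixed using softmax weights. The experts, router scores, and function names are illustrative assumptions, not Mixtral's actual implementation.

```python
import math


def top_k_route(router_scores, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(
        range(len(router_scores)),
        key=lambda i: router_scores[i],
        reverse=True,
    )[:k]
    exps = [math.exp(router_scores[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts on x and return the weighted sum of their outputs."""
    return sum(weight * experts[i](x) for i, weight in top_k_route(router_scores, k))


# Eight toy "experts", each just a scaling function
# (hypothetical stand-ins for feed-forward blocks).
experts = [lambda x, s=s: s * x for s in range(1, 9)]

# Made-up router logits for a single token.
scores = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.3, 1.2]

routing = top_k_route(scores)               # which experts run, and their weights
output = moe_forward(3.0, experts, scores)  # weighted mix of the two selected experts
```

Only two of the eight experts execute for this token, which is why an MoE model's total parameter count can be much larger than the number of parameters active on any single forward pass.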
Mistral-Instruct 7B delivers impressive code generation and conversational abilities within JetBrains IDEs. Its instruction tuning makes it highly responsive to developer prompts, providing accurate s...
DeepSeek Coder is highly regarded in academic circles for its strong performance across a wide array of programming languages. It often provides superior accuracy in understanding niche or complex lan...
Phi-3 Mini is a remarkably efficient and powerful local LLM, designed for developers seeking a lightweight solution for code completion and natural language processing. Its 3.8 billion parameters deliv...
Jan AI aims to provide a polished, standalone desktop application experience for running local LLMs. It balances the ease of use of LM Studio with a more integrated feel, making it accessibl...
StarCoder2, available through local inference frameworks, is a powerful open-source code generation model specifically trained on a massive dataset of code. Its architecture is designed for efficient...
Microsoft's Phi-3 Mini is renowned for achieving surprisingly high performance given its small parameter count. When run via Ollama, it offers excellent reasoning capabilities in a very lightweight pa...
OpenHermes 2.5 Mistral is a refined version of the Mistral 7B model, specifically optimized for conversational AI. It boasts enhanced dialogue capabilities and improved code generation performance com...
PrivateGPT is a powerful tool for building private AI assistants that leverage local LLMs and vector databases. It allows you to index your own documents, enabling the model to answer questions based...
This model remains a benchmark for code generation specifically. The 13B variant offers a significant step up in code quality and complexity handling compared to the 7B version. It excels at generatin...
MLC-LLM is a powerful, hardware-agnostic framework designed to run machine learning models efficiently across various platforms, including mobile and edge devices. For local AI, it offers a unique adv...
This specific mode of Cursor allows advanced users to bypass cloud APIs entirely by connecting it to a locally running LLM via Ollama. This provides the highest level of data privacy and control, ensu...
MLC-LLM focuses on compiling and optimizing models specifically for the target hardware (CPU, GPU, Metal). This deep-level optimization can sometimes yield performance gains that general runners miss,...
Google's Gemma models provide a strong, open-weights alternative backed by Google's research. The 2B variant is extremely efficient, making it highly portable. While its coding specialization might tr...
Mistral Large, accessible through LM Studio, represents a significant leap in local LLM performance. Its 7B parameter Mixture of Experts architecture delivers exceptional code generation capabilities...
For the absolute minimum resource requirement, TinyLlama is unmatched. It runs incredibly fast, even on low-power CPUs, making it perfect for simple, real-time autocomplete suggestions where latency i...