Best Local LLM

Updated Daily
inventory_2 23 items
trending_up Scored across 12 criteria

Rankings use category fit, feature coverage, pricing signals, public reception, and recency. Affiliate relationships do not affect scores.

Filter by Tags
0.0 - 10.0
Best 1 Ollama (General Platform)
Ollama (General Platform)

Easiest way to run various local LLMs. Works with JetBrains via Continue, Tabby, or custom scripts for code completion and chat.

8.98 Excellent
Visit
2 Continue AI
Continue AI
Free Plan Available From Free (with paid tiers available for increased usage)

Continue AI is a highly flexible, open-source extension designed to act as a universal AI coding copilot. Its standout feature is its ability to connect to virtually any LLMlocal, cloud, or privatemak...

8.69 Excellent
Visit
3 vLLM (Local Deployment)
vLLM (Local Deployment)

vLLM is primarily a high-throughput serving engine, but its ability to run models locally makes it invaluable for developers building local AI services. It implements advanced techniques like PagedAtt...

8.67 Excellent
Visit
4 vLLM (API Serving)
vLLM (API Serving)

vLLM is primarily known for its high-throughput serving capabilities, utilizing advanced techniques like PagedAttention. While it's often used for cloud deployment, running it locally allows developer...

8.49 Excellent
Visit
5 Ollama Web UI (Open WebUI)
Ollama Web UI (Open WebUI)

A feature-rich web interface for Ollama, providing a ChatGPT-like experience. Can be paired with LM Studio for model management.

8.48 Excellent
Visit
6 Llama 3 8B (via Ollama)
Llama 3 8B (via Ollama)

Llama 3 8B represents a massive leap in general reasoning and instruction following for local models. While not exclusively a coding model, its superior coherence and ability to follow complex, multi-...

8.38 Excellent
Visit
7 Ollama with CodeLlama-7B
Ollama with CodeLlama-7B
Free Plan Available

This combination represents the gold standard for accessible local coding assistance. Ollama provides a simple, robust API layer, while CodeLlama offers specialized performance on code tasks. It is hi...

8.22 Excellent
Visit
8 Mixtral 8x7B (via Ollama)
Mixtral 8x7B (via Ollama)

Mixtral provides massive effective parameter count and superior context handling due to its Mixture-of-Experts (MoE) architecture. This makes it phenomenal for understanding very large codebases or co...

8.18 Excellent
Visit
9 Mistral-Instruct-7B (via LM Studio)
Mistral-Instruct-7B (via LM Studio)

Mistral-Instruct 7B delivers impressive code generation and conversational abilities within JetBrains IDEs. Its instruction tuning makes it highly responsive to developer prompts, providing accurate s...

8.15 Excellent
Visit
10 DeepSeek Coder (via Ollama)
DeepSeek Coder (via Ollama)

DeepSeek Coder is highly regarded in academic circles for its strong performance across a wide array of programming languages. It often provides superior accuracy in understanding niche or complex lan...

8.13 Excellent
Visit
11 Phi-3 Mini
Phi-3 Mini

Phi-3 Mini is a remarkably efficient and powerful local LLM, designed for developers seeking a lightweight solution for code completion and natural language processing. Its 8 billion parameters deliv...

8.05 Excellent
12 Jan AI
Jan AI

Jan AI aims to provide a polished, standalone desktop application experience for running local LLMs. It balances the ease of use of LM Studio with a more polished, integrated feel, making it accessibl...

8.01 Excellent
Visit
13 StarCoder2 (via Local Inference)
StarCoder2 (via Local Inference)

StarCoder2, available through local inference frameworks, is a powerful open-source code generation model specifically trained on a massive dataset of code. Its architecture is designed for efficient...

7.94 Very Good
Visit
14 Microsoft Phi-3 Mini (via Ollama)
Microsoft Phi-3 Mini (via Ollama)

Microsoft's Phi-3 Mini is renowned for achieving surprisingly high performance given its small parameter count. When run via Ollama, it offers excellent reasoning capabilities in a very lightweight pa...

7.94 Very Good
Visit
15 OpenHermes 2.5 Mistral
OpenHermes 2.5 Mistral

OpenHermes 2.5 Mistral is a refined version of the Mistral 7B model, specifically optimized for conversational AI. It boasts enhanced dialogue capabilities and improved code generation performance com...

7.89 Very Good
Visit
16 PrivateGPT
PrivateGPT

PrivateGPT is a powerful tool for building private AI assistants that leverage local LLMs and vector databases. It allows you to index your own documents, enabling the model to answer questions based...

7.85 Very Good
Visit
17 CodeLlama-13B (via Ollama)
CodeLlama-13B (via Ollama)

This model remains a benchmark for code generation specifically. The 13B variant offers a significant step up in code quality and complexity handling compared to the 7B version. It excels at generatin...

7.85 Very Good
Visit
18 MLC-LLM
MLC-LLM

MLC-LLM is a powerful, hardware-agnostic framework designed to run machine learning models efficiently across various platforms, including mobile and edge devices. For local AI, it offers a unique adv...

7.53 Very Good
19 Cursor IDE (Local LLM Mode)
Cursor IDE (Local LLM Mode)

This specific mode of Cursor allows advanced users to bypass cloud APIs entirely by connecting it to a locally running LLM via Ollama. This provides the highest level of data privacy and control, ensu...

7.36 Very Good
Visit
20 MLC-LLM (Model Compilation)
MLC-LLM (Model Compilation)

MLC-LLM focuses on compiling and optimizing models specifically for the target hardware (CPU, GPU, Metal). This deep-level optimization can sometimes yield performance gains that general runners miss,...

7.33 Very Good
21 Google Gemma 2B (via Ollama)
Google Gemma 2B (via Ollama)

Google's Gemma models provide a strong, open-weights alternative backed by Google's research. The 2B variant is extremely efficient, making it highly portable. While its coding specialization might tr...

7.26 Very Good
22 Mistral Large (via LM Studio)
Mistral Large (via LM Studio)

Mistral Large, accessible through LM Studio, represents a significant leap in local LLM performance. Its 7B parameter Mixture of Experts architecture delivers exceptional code generation capabilities...

7.14 Very Good
Visit
23 TinyLlama-1.1B (via Ollama)
TinyLlama-1.1B (via Ollama)

For the absolute minimum resource requirement, TinyLlama is unmatched. It runs incredibly fast, even on low-power CPUs, making it perfect for simple, real-time autocomplete suggestions where latency i...

6.87 Good
Visit
You've reached the end — 23 items

Save to your list

Create your first list and start tracking the tools that matter to you.

Track favorites
Get updates
Compare scores

Already have an account? Sign in

Compare Items

See how they stack up against each other

Comparing
VS
Select 1 more item to compare