LLM Models
Compare local LLMs and find the GPU you need to run them: VRAM requirements, quantization options, and hardware compatibility.
All 346 models are listed below.
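The Min VRAM figures in the table follow a consistent rule of thumb: roughly 0.66 GB per billion parameters, which corresponds to 4-bit quantized weights (0.5 bytes per parameter) plus about 32% runtime overhead for the KV cache and activations. A minimal sketch of that estimate, assuming those two factors (the overhead multiplier is inferred from the table's own figures, not published by any vendor):

```python
def estimate_min_vram_gb(params_billion: float,
                         bytes_per_param: float = 0.5,
                         overhead: float = 1.32) -> float:
    """Estimate minimum VRAM for 4-bit quantized weights plus runtime overhead.

    0.5 bytes/param models 4-bit weights; the 1.32 multiplier approximates
    KV-cache and activation overhead at a short (4K) context.
    """
    return round(params_billion * bytes_per_param * overhead, 2)

# Reproduces the table's figures:
print(estimate_min_vram_gb(8.03))   # Llama-3.1-8B row: 5.3
print(estimate_min_vram_gb(70.55))  # Llama-3.1-70B-Instruct row: 46.56
```

Longer contexts or higher-precision weights (FP8, BF16) raise the requirement well above this floor, which is why the same parameter count appears with different VRAM figures across quantized variants below.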
| Model | Developer | Params | Context | Min VRAM | Use Cases | Compatible GPUs |
|---|---|---|---|---|---|---|
| AceReason-Nemotron-14B | nvidia | 14.77B | 4K | 9.75GB | text-generation | 108 |
| AI21-Jamba-Large-1.5 | ai21labs | 398.56B | 4K | 263.04GB | text-generation | 0 |
| AI21-Jamba-Mini-1.5 | ai21labs | 51.57B | 4K | 34.03GB | text-generation | 37 |
| AI21-Jamba-Mini-1.6 | ai21labs | 51.57B | 4K | 34.03GB | text-generation | 37 |
| Athene-70B-Preview | Nexusflow | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Athene-V2-Agent | Nexusflow | 72.70B | 4K | 47.98GB | text-generation | 36 |
| bigscience-small-testing | bigscience | 0.02B | 4K | 0.01GB | text-generation | 169 |
| bitnet-b1.58-2B-4T-bf16 | microsoft | 2.41B | 4K | 1.59GB | text-generation | 169 |
| bloom-1b1 | bigscience | 1.07B | 4K | 0.7GB | text-generation | 169 |
| bloom-1b7 | bigscience | 1.72B | 4K | 1.13GB | text-generation | 169 |
| bloom-3b | bigscience | 3.00B | 4K | 1.98GB | text-generation | 169 |
| bloom-560m | bigscience | 0.56B | 4K | 0.37GB | text-generation | 169 |
| bloom-7b1 | bigscience | 7.07B | 4K | 4.66GB | text-generation | 163 |
| bloomz | bigscience | 176.25B | 4K | 116.33GB | text-generation | 5 |
| bloomz-1b7 | bigscience | 1.72B | 4K | 1.13GB | text-generation | 169 |
| bloomz-3b | bigscience | 3.00B | 4K | 1.98GB | text-generation | 169 |
| bloomz-560m | bigscience | 0.56B | 4K | 0.37GB | text-generation | 169 |
| bloomz-7b1 | bigscience | 7.07B | 4K | 4.66GB | text-generation | 163 |
| bloomz-7b1-mt | bigscience | 7.07B | 4K | 4.66GB | text-generation | 163 |
| Bolmo-1B | allenai | 1.47B | 4K | 0.97GB | text-generation | 169 |
| codegemma-2b | google | 2.51B | 4K | 1.65GB | text-generation | 169 |
| CodeLlama-13b-Instruct-hf | meta-llama | 13.02B | 4K | 8.59GB | text-generation | 108 |
| CodeLlama-7b-Instruct-hf | meta-llama | 6.74B | 4K | 4.44GB | text-generation | 163 |
| deep-ignorance-unfiltered | EleutherAI | 6.86B | 4K | 4.52GB | text-generation | 163 |
| deepseek-coder-33b-base | deepseek-ai | 33.34B | 4K | 22.01GB | text-generation | 54 |
| deepseek-coder-33b-instruct | deepseek-ai | 33.34B | 4K | 22.01GB | text-generation | 54 |
| deepseek-coder-6.7b-base | deepseek-ai | 6.74B | 4K | 4.44GB | text-generation | 163 |
| deepseek-coder-6.7b-instruct | deepseek-ai | 6.74B | 4K | 4.44GB | text-generation | 163 |
| deepseek-coder-7b-base-v1.5 | deepseek-ai | 6.91B | 4K | 4.57GB | text-generation | 163 |
| deepseek-coder-7b-instruct-v1.5 | deepseek-ai | 6.91B | 4K | 4.57GB | text-generation | 163 |
| DeepSeek-Coder-V2-Instruct | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-Coder-V2-Instruct-0724 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-Coder-V2-Lite-Base | deepseek-ai | 15.71B | 4K | 10.36GB | text-generation | 106 |
| deepseek-moe-16b-base | deepseek-ai | 16.38B | 4K | 10.81GB | text-generation | 106 |
| deepseek-moe-16b-chat | deepseek-ai | 16.38B | 4K | 10.81GB | text-generation | 106 |
| DeepSeek-R1-0528 | deepseek-ai | 684.53B | 4K | 451.79GB | text-generation | 0 |
| DeepSeek-R1-0528-NVFP4 | nvidia | 396.77B | 4K | 261.87GB | text-generation | 0 |
| DeepSeek-R1-0528-NVFP4-v2 | nvidia | 393.63B | 4K | 259.8GB | text-generation | 0 |
| DeepSeek-R1-0528-Qwen3-8B | deepseek-ai | 8.19B | 4K | 5.4GB | text-generation | 163 |
| DeepSeek-R1-0528-Qwen3-8B-MLX-4bit | lmstudio-community | 1.28B | 4K | 0.85GB | text-generation | 169 |
| DeepSeek-R1-0528-Qwen3-8B-MLX-8bit | lmstudio-community | 2.30B | 4K | 1.52GB | text-generation | 169 |
| DeepSeek R1 Distill 14B | DeepSeek | 14.00B | 65K | 9.5GB | reasoning, math, coding, analysis | 108 |
| DeepSeek-R1-Distill-Qwen-14B | deepseek-ai | 14.77B | 4K | 9.75GB | text-generation | 108 |
| DeepSeek-R1-Distill-Qwen-32B | deepseek-ai | 32.76B | 4K | 21.63GB | text-generation | 54 |
| DeepSeek-R1-Distill-Qwen-7B | deepseek-ai | 7.62B | 4K | 5.03GB | text-generation | 163 |
| DeepSeek-R1-NVFP4 | nvidia | 396.77B | 4K | 261.87GB | text-generation | 0 |
| DeepSeek-V2 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2.5 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2-Chat | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2-Chat-0628 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2-Lite | deepseek-ai | 15.71B | 4K | 10.36GB | text-generation | 106 |
| DeepSeek-V2-Lite-Chat | deepseek-ai | 15.71B | 4K | 10.36GB | text-generation | 106 |
| Deepseek-V2 Pro | DeepSeek AI | 70.00B | 131K | 45.02GB | chat, code, reasoning | 36 |
| DeepSeek-V3-0324 | deepseek-ai | 684.53B | 4K | 451.79GB | text-generation | 0 |
| DeepSeek-V3-0324-NVFP4 | nvidia | 396.77B | 4K | 261.87GB | text-generation | 0 |
| DeepSeek-V3.1-NVFP4 | nvidia | 393.63B | 4K | 259.8GB | text-generation | 0 |
| DeepSeek-V3.2 | DeepSeek AI | 70.00B | 131K | 77.47GB | reasoning, agentic workflows | 15 |
| DeepSeek-V3.2-NVFP4 | nvidia | 394.50B | 4K | 260.37GB | text-generation | 0 |
| DialoGPT-small | microsoft | 0.18B | 4K | 0.12GB | text-generation | 169 |
| distilgpt2 | distilbert | 0.09B | 4K | 0.06GB | text-generation | 169 |
| dolphin-2.9.1-yi-1.5-34b | dphn | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Dolphin-Mistral-24B-Venice-Edition | dphn | 23.57B | 4K | 15.55GB | text-generation | 81 |
| ELM | Joaoffg | 0.90B | 4K | 0.59GB | text-generation | 169 |
| falcon-11B | tiiuae | 11.10B | 4K | 7.33GB | text-generation | 141 |
| falcon-7b-instruct | tiiuae | 7.22B | 4K | 4.76GB | text-generation | 163 |
| Falcon-H1-0.5B-Base | tiiuae | 0.52B | 4K | 0.34GB | text-generation | 169 |
| Falcon-H1-0.5B-Instruct | tiiuae | 0.52B | 4K | 0.34GB | text-generation | 169 |
| Falcon-H1-1.5B-Base | tiiuae | 1.55B | 4K | 1.02GB | text-generation | 169 |
| Falcon-H1-1.5B-Instruct | tiiuae | 1.55B | 4K | 1.02GB | text-generation | 169 |
| Falcon-H1-34B-Base | tiiuae | 33.64B | 4K | 22.21GB | text-generation | 54 |
| Falcon-H1-34B-Instruct | tiiuae | 33.64B | 4K | 22.21GB | text-generation | 54 |
| Falcon-H1-3B-Base | tiiuae | 3.15B | 4K | 2.08GB | text-generation | 169 |
| Falcon-H1-3B-Instruct | tiiuae | 3.15B | 4K | 2.08GB | text-generation | 169 |
| Falcon-H1-7B-Base | tiiuae | 7.59B | 4K | 5GB | text-generation | 163 |
| Falcon-H1-7B-Instruct | tiiuae | 7.59B | 4K | 5GB | text-generation | 163 |
| Falcon-H1-Tiny-90M-Instruct | tiiuae | 0.09B | 4K | 0.06GB | text-generation | 169 |
| falcon-mamba-7b-instruct | tiiuae | 7.27B | 4K | 4.8GB | text-generation | 163 |
| falcon-mamba-tiny-dev | tiiuae | 0.01B | 4K | 0.01GB | text-generation | 169 |
| Falcon3-10B-Base | tiiuae | 10.31B | 4K | 6.8GB | text-generation | 141 |
| Falcon3-1B-Instruct | tiiuae | 1.67B | 4K | 1.1GB | text-generation | 169 |
| Falcon3-3B-Base | tiiuae | 3.23B | 4K | 2.13GB | text-generation | 169 |
| Falcon3-3B-Instruct | tiiuae | 3.23B | 4K | 2.13GB | text-generation | 169 |
| Falcon3-7B-Base | tiiuae | 7.46B | 4K | 4.92GB | text-generation | 163 |
| Falcon3-7B-Instruct | tiiuae | 7.46B | 4K | 4.92GB | text-generation | 163 |
| Flex-reddit-2x7B-1T | allenai | 11.63B | 4K | 7.68GB | text-generation | 141 |
| gemma-1.1-2b-it | google | 2.51B | 4K | 1.65GB | text-generation | 169 |
| gemma-1.1-7b-it | google | 8.54B | 4K | 5.63GB | text-generation | 163 |
| gemma-2-27b-it | google | 27.23B | 4K | 17.97GB | text-generation | 58 |
| gemma-2-9b-it | google | 9.24B | 4K | 6.11GB | text-generation | 141 |
| GLM-4.7-Flash-MLX-6bit | lmstudio-community | 6.56B | 4K | 4.32GB | text-generation | 163 |
| GLM-4.7-Flash-MLX-8bit | lmstudio-community | 8.43B | 4K | 5.57GB | text-generation | 163 |
| gpt-neo-1.3B | EleutherAI | 1.37B | 4K | 0.9GB | text-generation | 169 |
| gpt-neo-125m | EleutherAI | 0.15B | 4K | 0.1GB | text-generation | 169 |
| gpt-neo-2.7B | EleutherAI | 2.72B | 4K | 1.79GB | text-generation | 169 |
| gpt-oss-120b | openai | 120.41B | 4K | 79.48GB | text-generation | 15 |
| gpt-oss-120b-Eagle3-long-context | nvidia | 0.22B | 4K | 0.14GB | text-generation | 169 |
| gpt-oss-20b | openai | 21.51B | 4K | 14.2GB | text-generation | 81 |
| gpt2 | openai-community | 0.14B | 4K | 0.09GB | text-generation | 169 |
| gpt2-large | openai-community | 0.81B | 4K | 0.54GB | text-generation | 169 |
| gpt2-medium | openai-community | 0.38B | 4K | 0.25GB | text-generation | 169 |
| gpt2-mini | erwanf | 0.04B | 4K | 0.02GB | text-generation | 169 |
| h2ovl-mississippi-2b | h2oai | 2.15B | 4K | 1.42GB | text-generation | 169 |
| h2ovl-mississippi-800m | h2oai | 0.83B | 4K | 0.55GB | text-generation | 169 |
| Hermes-2-Pro-Llama-3-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Hermes-2-Pro-Mistral-7B | NousResearch | 7.24B | 4K | 4.79GB | text-generation | 163 |
| Hermes-2-Theta-Llama-3-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Hermes-3-Llama-3.1-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Hermes-4-14B | NousResearch | 14.77B | 4K | 9.75GB | text-generation | 108 |
| internlm2_5-7b | internlm | 7.74B | 4K | 5.1GB | text-generation | 163 |
| internlm2-chat-1_8b | internlm | 1.89B | 4K | 1.24GB | text-generation | 169 |
| internlm2-chat-20b | internlm | 19.86B | 4K | 13.11GB | text-generation | 81 |
| internlm2-chat-7b-sft | internlm | 7.74B | 4K | 5.1GB | text-generation | 163 |
| Jan-v3-4B-base-instruct | janhq | 4.41B | 4K | 2.92GB | text-generation | 169 |
| japanese-gpt-neox-small | rinna | 0.20B | 4K | 0.13GB | text-generation | 169 |
| LFM2-24B-A2B | LiquidAI | 23.84B | 4K | 15.74GB | text-generation | 81 |
| LFM2.5-1.2B-Instruct | LiquidAI | 1.17B | 4K | 0.77GB | text-generation | 169 |
| LFM2.5-1.2B-Instruct-MLX-4bit | lmstudio-community | 0.18B | 4K | 0.12GB | text-generation | 169 |
| LFM2.5-1.2B-Instruct-MLX-6bit | lmstudio-community | 0.26B | 4K | 0.17GB | text-generation | 169 |
| LFM2.5-1.2B-Instruct-MLX-8bit | lmstudio-community | 0.33B | 4K | 0.22GB | text-generation | 169 |
| LFM2-8B-A1B | LiquidAI | 8.34B | 4K | 5.5GB | text-generation | 163 |
| Llama-2-7b-hf | meta-llama | 6.74B | 4K | 4.44GB | text-generation | 163 |
| Llama-3.1-405B-Instruct | meta-llama | 405.85B | 4K | 267.86GB | text-generation | 0 |
| Llama-3.1-405B-Instruct-FP8 | meta-llama | 405.87B | 4K | 267.87GB | text-generation | 0 |
| Llama 3.1 70B | Meta | 70.00B | 131K | 44GB | chat, coding, reasoning | 36 |
| Llama-3.1-70B-Instruct | meta-llama | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Llama 3.1 8B | Meta | 8.00B | 131K | 5.5GB | chat, coding, summarization | 163 |
| Llama-3.1-8B-Instruct-FP8 | nvidia | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Llama-3.1-Tulu-3-8B-SFT | allenai | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Llama-3.2-1B | meta-llama | 1.24B | 4K | 0.81GB | text-generation | 169 |
| Llama-3.2-1B-Instruct-FP8 | RedHatAI | 1.50B | 4K | 0.99GB | text-generation | 169 |
| Llama-3.2-1B-Instruct-FP8-dynamic | RedHatAI | 1.50B | 4K | 0.99GB | text-generation | 169 |
| Llama-3.2-3B | meta-llama | 3.21B | 4K | 2.12GB | text-generation | 169 |
| llama-3.3-70b-instruct-awq | casperhansen | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Llama-3_3-Nemotron-Super-49B-v1 | nvidia | 49.87B | 4K | 32.91GB | text-generation | 37 |
| Llama-3_3-Nemotron-Super-49B-v1_5-FP8 | nvidia | 49.87B | 4K | 32.91GB | text-generation | 37 |
| Llama-3_3-Nemotron-Super-49B-v1_5-NVFP4 | nvidia | 28.97B | 4K | 19.12GB | text-generation | 58 |
| Llama-3_3-Nemotron-Super-49B-v1-FP8 | nvidia | 49.87B | 4K | 32.91GB | text-generation | 37 |
| llama-300M-v3-original | deqing | 0.32B | 4K | 0.21GB | text-generation | 169 |
| Llama-Guard-3-8B | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Llama-Guard-3-8B-INT8 | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| LlamaGuard-7b | meta-llama | 6.74B | 4K | 4.44GB | text-generation | 163 |
| llm-jp-3-3.7b-instruct | llm-jp | 3.78B | 4K | 2.5GB | text-generation | 169 |
| LocoOperator-4B | LocoreMind | 4.02B | 4K | 2.65GB | text-generation | 169 |
| maira-2 | microsoft | 6.88B | 4K | 4.54GB | text-generation | 163 |
| MediPhi-Clinical | microsoft | 3.82B | 4K | 2.52GB | text-generation | 169 |
| MediPhi-Instruct | microsoft | 3.82B | 4K | 2.52GB | text-generation | 169 |
| Meta-Llama-3.1-70B-Instruct | NousResearch | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Meta-Llama-3.1-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3.1-8B-Instruct | unsloth | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3.1-8B-Instruct-bnb-4bit | unsloth | 8.25B | 4K | 5.45GB | text-generation | 163 |
| Meta-Llama-3.1-8B-Instruct-FP8 | RedHatAI | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3-70B-Instruct | meta-llama | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Meta-Llama-3-8B | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3-8B-Instruct | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-Guard-2-8B | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| MiniMax-M2.5 | MiniMaxAI | 228.70B | 4K | 150.94GB | text-generation | 1 |
| MiniMax-M2-AWQ | QuantTrio | 228.69B | 4K | 150.93GB | text-generation | 1 |
| Mistral 7B | Mistral | 7.00B | 32K | 5GB | chat, instruction-following, translation | 163 |
| Mistral-7B-Instruct-v0.2 | mistralai | 7.24B | 4K | 4.79GB | text-generation | 163 |
| mistral-7b-v0.3-bnb-4bit | unsloth | 7.47B | 4K | 4.93GB | text-generation | 163 |
| Mistral-NeMo-Minitron-8B-Instruct | nvidia | 8.41B | 4K | 5.56GB | text-generation | 163 |
| Mistral-Small-24B-Instruct-2501-AWQ | stelterlab | 23.57B | 4K | 15.55GB | text-generation | 81 |
| Mixtral-8x7B-Instruct-v0.1-GPTQ | TheBloke | 46.71B | 4K | 30.83GB | text-generation | 40 |
| Nanbeige4.1-3B | Nanbeige | 3.93B | 4K | 2.6GB | text-generation | 169 |
| Nanbeige4.1-3B-heretic | heretic-org | 3.93B | 4K | 2.6GB | text-generation | 169 |
| Nemotron-Flash-3B | nvidia | 2.75B | 4K | 1.81GB | text-generation | 169 |
| Nemotron-H-4B-Base-8K | nvidia | 4.49B | 4K | 2.96GB | text-generation | 169 |
| Nemotron-H-4B-Instruct-128K | nvidia | 4.49B | 4K | 2.96GB | text-generation | 169 |
| Nous-Hermes-2-Mistral-7B-DPO | NousResearch | 7.24B | 4K | 4.79GB | text-generation | 163 |
| Nous-Hermes-2-Mixtral-8x7B-DPO | NousResearch | 46.70B | 4K | 30.82GB | text-generation | 40 |
| Nous-Hermes-2-SOLAR-10.7B | NousResearch | 10.73B | 4K | 7.08GB | text-generation | 141 |
| Nous-Hermes-llama-2-7b | NousResearch | 6.74B | 4K | 4.44GB | text-generation | 163 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16 | nvidia | 31.58B | 4K | 20.85GB | text-generation | 54 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | nvidia | 31.58B | 4K | 20.85GB | text-generation | 54 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 | nvidia | 31.58B | 4K | 20.85GB | text-generation | 54 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 | nvidia | 18.24B | 4K | 12.03GB | text-generation | 81 |
| NVIDIA-Nemotron-Nano-9B-v2 | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| NVIDIA-Nemotron-Nano-9B-v2-Base | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| NVIDIA-Nemotron-Nano-9B-v2-FP8 | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| NVIDIA-Nemotron-Nano-9B-v2-Japanese | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| OLMo-1B | allenai | 1.18B | 4K | 0.78GB | text-generation | 169 |
| OLMo-1B-0724-hf | allenai | 1.28B | 4K | 0.85GB | text-generation | 169 |
| OLMo-1B-hf | allenai | 1.18B | 4K | 0.78GB | text-generation | 169 |
| OLMo-2-0325-32B | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| OLMo-2-0325-32B-Instruct | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| OLMo-2-0425-1B | allenai | 1.48B | 4K | 0.98GB | text-generation | 169 |
| OLMo-2-0425-1B-Instruct | allenai | 1.48B | 4K | 0.98GB | text-generation | 169 |
| OLMo-2-0425-1B-RLVR1 | allenai | 1.48B | 4K | 0.98GB | text-generation | 169 |
| OLMo-2-1124-13B-Instruct | allenai | 13.72B | 4K | 9.05GB | text-generation | 108 |
| OLMo-2-1124-7B-Instruct | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3.1-32B-Think | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| Olmo-3.1-7B-RL-Zero-Math | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-1025-7B | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-1125-32B | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| Olmo-3-32B-Think | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| Olmo-3-7B-Instruct | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Instruct-DPO | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Instruct-SFT | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Think | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Think-DPO | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Think-SFT | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| OLMo-7B-0724-hf | allenai | 6.89B | 4K | 4.54GB | text-generation | 163 |
| OLMo-7B-hf | allenai | 6.89B | 4K | 4.54GB | text-generation | 163 |
| Olmo-Hybrid-Instruct-DPO-7B | allenai | 7.43B | 4K | 4.91GB | text-generation | 163 |
| OLMoE-1B-7B-0125 | allenai | 6.92B | 4K | 4.57GB | text-generation | 163 |
| OLMoE-1B-7B-0125-Instruct | allenai | 6.92B | 4K | 4.57GB | text-generation | 163 |
| OLMoE-1B-7B-0924-Instruct | allenai | 6.92B | 4K | 4.57GB | text-generation | 163 |
| phi-1 | microsoft | 1.42B | 4K | 0.94GB | text-generation | 169 |
| phi-1_5 | microsoft | 1.42B | 4K | 0.94GB | text-generation | 169 |
| phi-2 | microsoft | 2.78B | 4K | 1.84GB | text-generation | 169 |
| Phi-3-medium-4k-instruct | microsoft | 13.96B | 4K | 9.22GB | text-generation | 108 |
| Phi-3-mini-4k-instruct-gptq-4bit | kaitchup | 3.82B | 4K | 2.52GB | text-generation | 169 |
| Phi-3-small-8k-instruct | microsoft | 7.39B | 4K | 4.88GB | text-generation | 163 |
| Phi-mini-MoE-instruct | microsoft | 7.65B | 4K | 5.05GB | text-generation | 163 |
| Phi-tiny-MoE-instruct | microsoft | 3.76B | 4K | 2.48GB | text-generation | 169 |
| polyglot-ko-1.3b | EleutherAI | 1.43B | 4K | 0.95GB | text-generation | 169 |
| polyglot-ko-12.8b | EleutherAI | 13.06B | 4K | 8.62GB | text-generation | 108 |
| polyglot-ko-5.8b | EleutherAI | 6.00B | 4K | 3.96GB | text-generation | 169 |
| pythia-1.4b | EleutherAI | 1.52B | 4K | 1GB | text-generation | 169 |
| pythia-1.4b-deduped | EleutherAI | 1.41B | 4K | 0.94GB | text-generation | 169 |
| pythia-12b | EleutherAI | 12.00B | 4K | 7.92GB | text-generation | 141 |
| pythia-14m | EleutherAI | 0.01B | 4K | 0.01GB | text-generation | 169 |
| pythia-14m-deduped | EleutherAI | 0.04B | 4K | 0.02GB | text-generation | 169 |
| pythia-160m-deduped | EleutherAI | 0.21B | 4K | 0.14GB | text-generation | 169 |
| pythia-160m-seed1 | EleutherAI | 0.21B | 4K | 0.14GB | text-generation | 169 |
| pythia-1b | EleutherAI | 1.08B | 4K | 0.72GB | text-generation | 169 |
| pythia-2.8b-deduped | EleutherAI | 2.91B | 4K | 1.93GB | text-generation | 169 |
| pythia-31m | EleutherAI | 0.03B | 4K | 0.02GB | text-generation | 169 |
| pythia-31m-deduped | EleutherAI | 0.06B | 4K | 0.03GB | text-generation | 169 |
| pythia-410m | EleutherAI | 0.51B | 4K | 0.33GB | text-generation | 169 |
| pythia-410m-deduped | EleutherAI | 0.51B | 4K | 0.33GB | text-generation | 169 |
| pythia-410m-v0 | EleutherAI | 0.51B | 4K | 0.33GB | text-generation | 169 |
| pythia-6.9b | EleutherAI | 6.99B | 4K | 4.61GB | text-generation | 163 |
| pythia-70m-deduped | EleutherAI | 0.10B | 4K | 0.07GB | text-generation | 169 |
| Qwen 2.5 72B | Alibaba | 72.00B | 131K | 45.02GB | chat, code, reasoning | 36 |
| Qwen 2.5 72B Instruct | Qwen | 72.00B | 131K | 45.02GB | chat, code, reasoning | 36 |
| Qwen1.5-110B-Chat-AWQ | Qwen | 111.21B | 4K | 73.4GB | text-generation | 15 |
| Qwen2-0.5B-Instruct | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2-1.5B-Instruct | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-0.5B | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2.5-0.5B-Instruct | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2.5-1.5B | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-1.5B-Instruct | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-1.5B-Instruct-AWQ | Qwen | 1.78B | 4K | 1.18GB | text-generation | 169 |
| Qwen2.5-1.5B-quantized.w8a8 | RedHatAI | 1.78B | 4K | 1.18GB | text-generation | 169 |
| Qwen2.5-14B-Instruct-AWQ | Qwen | 14.77B | 4K | 9.75GB | text-generation | 108 |
| Qwen2.5-32B | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-32B-Instruct-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-3B | Qwen | 3.09B | 4K | 2.04GB | text-generation | 169 |
| Qwen2.5-3B-Instruct | Qwen | 3.09B | 4K | 2.04GB | text-generation | 169 |
| Qwen2.5-72B-Instruct | Qwen | 72.71B | 4K | 47.98GB | text-generation | 36 |
| Qwen2.5-72B-Instruct-AWQ | Qwen | 72.96B | 4K | 48.15GB | text-generation | 16 |
| Qwen2.5-7B | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-7B-Instruct | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Coder-0.5B-Instruct | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2.5-Coder-1.5B-Instruct | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-Coder-14B-Instruct | Qwen | 14.77B | 4K | 9.75GB | text-generation | 108 |
| Qwen2.5-Coder-32B-Instruct | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-Coder-32B-Instruct-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-Coder-7B-Instruct | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Coder-7B-Instruct-AWQ | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Coder-7B-Instruct-GPTQ-Int4 | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Math-1.5B | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-VL-7B-Instruct-NVFP4 | nvidia | 5.44B | 4K | 3.59GB | text-generation | 169 |
| Qwen2 72B | Qwen | 72.00B | 65K | 45.02GB | chat, code, reasoning | 36 |
| Qwen2-7B-Instruct | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen3-0.6B | Qwen | 0.75B | 4K | 0.5GB | text-generation | 169 |
| Qwen3-0.6B-FP8 | Qwen | 0.75B | 4K | 0.5GB | text-generation | 169 |
| Qwen3-1.7B-Base | Qwen | 1.72B | 4K | 1.13GB | text-generation | 169 |
| Qwen3-14B-Instruct | OpenPipe | 14.77B | 4K | 9.75GB | text-generation | 108 |
| Qwen3-14B-NVFP4 | nvidia | 8.99B | 4K | 5.93GB | text-generation | 163 |
| Qwen3-235B-A22B | Qwen | 235.09B | 4K | 155.17GB | text-generation | 1 |
| Qwen3-235B-A22B-Instruct-2507-FP8 | Qwen | 235.11B | 4K | 155.17GB | text-generation | 1 |
| Qwen3-235B-A22B-NVFP4 | nvidia | 132.81B | 4K | 87.65GB | text-generation | 8 |
| Qwen3-30B-A3B-Instruct-2507-FP8 | Qwen | 30.53B | 4K | 20.15GB | text-generation | 54 |
| Qwen3-30B-A3B-NVFP4 | nvidia | 17.45B | 4K | 11.52GB | text-generation | 100 |
| Qwen3-32B-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen3-32B-NVFP4 | nvidia | 19.11B | 4K | 12.62GB | text-generation | 81 |
| Qwen3-4B-AWQ | Qwen | 4.02B | 4K | 2.65GB | text-generation | 169 |
| Qwen3-4B-Instruct-2507-FP8 | Qwen | 4.41B | 4K | 2.92GB | text-generation | 169 |
| Qwen3-4B-SafeRL | Qwen | 4.02B | 4K | 2.65GB | text-generation | 169 |
| Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled | Jackrong | 27.78B | 4K | 18.34GB | text-generation | 58 |
| Qwen3.5-27B-Text-NVFP4-MTP | osoleve | 16.67B | 4K | 11GB | text-generation | 106 |
| Qwen3.5-4B-Safety-Thinking | MerlinSafety | 4.21B | 4K | 2.77GB | text-generation | 169 |
| Qwen3.5-9B-abliterated | lukey03 | 8.95B | 4K | 5.91GB | text-generation | 163 |
| Qwen3-8B-AWQ | Qwen | 8.19B | 4K | 5.4GB | text-generation | 163 |
| Qwen3-8B-Base | Qwen | 8.19B | 4K | 5.4GB | text-generation | 163 |
| Qwen3-8B-FP8 | nvidia | 8.19B | 4K | 5.4GB | text-generation | 163 |
| Qwen3-8B-NVFP4 | nvidia | 5.15B | 4K | 3.4GB | text-generation | 169 |
| Qwen3-Coder-30B-A3B-Instruct-FP8 | Qwen | 30.53B | 4K | 20.15GB | text-generation | 54 |
| Qwen3-Coder-Next | Qwen | 79.67B | 4K | 52.58GB | text-generation | 16 |
| Qwen3-Coder-Next-8bit | NexVeridian | 22.41B | 4K | 14.79GB | text-generation | 81 |
| Qwen3-Coder-Next-AWQ-4bit | bullpoint | 14.44B | 4K | 9.54GB | text-generation | 108 |
| Qwen3-Coder-Next-Base | Qwen | 79.67B | 4K | 52.58GB | text-generation | 16 |
| Qwen3-Coder-Next-FP8 | Qwen | 79.68B | 4K | 52.59GB | text-generation | 16 |
| Qwen3-Next-80B-A3B-Instruct | Qwen | 81.32B | 4K | 53.67GB | text-generation | 16 |
| Qwen3-Next-80B-A3B-Instruct-FP8 | Qwen | 81.33B | 4K | 53.68GB | text-generation | 16 |
| Qwen3-VL-30B-A3B-Instruct-AWQ | QuantTrio | 31.07B | 4K | 20.5GB | text-generation | 54 |
| Qwen3Guard-Gen-0.6B | Qwen | 0.75B | 4K | 0.5GB | text-generation | 169 |
| Qwen3Guard-Gen-4B | Qwen | 4.41B | 4K | 2.92GB | text-generation | 169 |
| Qwen3Guard-Gen-8B | Qwen | 8.19B | 4K | 5.4GB | text-generation | 163 |
| QwQ-32B-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| recurrentgemma-2b | google | 2.68B | 4K | 1.77GB | text-generation | 169 |
| saiga_llama3_8b | IlyaGusev | 8.03B | 4K | 5.3GB | text-generation | 163 |
| SmolLM-135M-Instruct | HuggingFaceTB | 0.13B | 4K | 0.09GB | text-generation | 169 |
| SmolLM2-135M | HuggingFaceTB | 0.13B | 4K | 0.09GB | text-generation | 169 |
| SmolLM2-135M-Instruct | HuggingFaceTB | 0.13B | 4K | 0.09GB | text-generation | 169 |
| SOLAR-10.7B-v1.0 | upstage | 10.73B | 4K | 7.08GB | text-generation | 141 |
| StableBeluga-13B | stabilityai | 13.02B | 4K | 8.59GB | text-generation | 108 |
| stablelm-2-1_6b | stabilityai | 1.64B | 4K | 1.09GB | text-generation | 169 |
| stablelm-2-zephyr-1_6b | stabilityai | 1.64B | 4K | 1.09GB | text-generation | 169 |
| stablelm-3b-4e1t | stabilityai | 2.80B | 4K | 1.85GB | text-generation | 169 |
| stablelm-base-alpha-7b-v2 | stabilityai | 6.89B | 4K | 4.54GB | text-generation | 163 |
| stablelm-zephyr-3b | stabilityai | 2.80B | 4K | 1.85GB | text-generation | 169 |
| starchat-alpha | HuggingFaceH4 | 15.52B | 4K | 10.24GB | text-generation | 106 |
| Starling-LM-7B-beta | Nexusflow | 7.24B | 4K | 4.79GB | text-generation | 163 |
| steerling-8b | guidelabs | 8.39B | 4K | 5.54GB | text-generation | 163 |
| Step-3.5-Flash | stepfun-ai | 199.38B | 4K | 131.59GB | text-generation | 3 |
| stories15M_MOE | ggml-org | 0.04B | 4K | 0.02GB | text-generation | 169 |
| Strand-Rust-Coder-14B-v1 | Fortytwo-Network | 14.77B | 4K | 9.75GB | text-generation | 108 |
| tiny-aya-global | CohereLabs | 3.35B | 4K | 2.21GB | text-generation | 169 |
| tiny-random-Gemma2ForCausalLM | hmellor | 0.01B | 4K | 0.01GB | text-generation | 169 |
| TinyLlama-1.1B-Chat-v0.3-GPTQ | TheBloke | 1.10B | 4K | 0.73GB | text-generation | 169 |
| TinyLlama-1.1B-Chat-v1.0 | TinyLlama | 1.10B | 4K | 0.73GB | text-generation | 169 |
| tulu-2-dpo-70b | allenai | 68.98B | 4K | 45.53GB | text-generation | 36 |
| txgemma-2b-predict | google | 2.61B | 4K | 1.73GB | text-generation | 169 |
| vaultgemma-1b | google | 1.04B | 4K | 0.68GB | text-generation | 169 |
| wildguard | allenai | 7.25B | 4K | 4.79GB | text-generation | 163 |
| Yi-1.5-34B | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-34B-32K | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-34B-Chat | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-34B-Chat-16K | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-6B | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-1.5-6B-Chat | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-1.5-9B | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-1.5-9B-32K | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-1.5-9B-Chat | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-1.5-9B-Chat-16K | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-6B | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-6B-200K | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-6B-Chat | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-9B | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-9B-200K | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-Coder-9B | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-Coder-9B-Chat | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| zephyr-7b-beta | HuggingFaceH4 | 7.24B | 4K | 4.79GB | text-generation | 163 |
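To answer the page's central question (which models fit your GPU?), the table can be filtered by a VRAM budget. A minimal sketch, using a few (name, Min VRAM) pairs copied from the rows above; the `MODELS` list and `models_that_fit` helper are illustrative names, not part of any published API:

```python
# A few (model name, min VRAM in GB) pairs taken from the table above.
MODELS = [
    ("Llama-3.2-3B", 2.12),
    ("Qwen2.5-7B-Instruct", 5.03),
    ("DeepSeek-R1-Distill-Qwen-14B", 9.75),
    ("Qwen2.5-32B", 21.63),
    ("Llama-3.1-70B-Instruct", 46.56),
]

def models_that_fit(vram_gb: float) -> list[str]:
    """Return the names of models whose minimum VRAM fits the given budget."""
    return [name for name, need in MODELS if need <= vram_gb]

# On a 24 GB card, everything up to the 32B tier fits;
# the 70B model does not.
print(models_that_fit(24.0))
```

The same filter applied to the full table yields the "Compatible GPUs" counts in the last column: smaller models fit more of the 169 GPUs in the comparison set, while the largest dense and MoE models fit few or none without multi-GPU setups.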