LLM Models
Compare local LLMs and find the GPU you need to run them: VRAM requirements, quantization options, and hardware compatibility.
All 346 models are listed below.
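The Min VRAM figures in the table follow a consistent rule of thumb: roughly 0.66 GB per billion parameters, which corresponds to 4-bit quantized weights (0.5 bytes per parameter) plus about 32% runtime overhead for the KV cache and activations. A minimal sketch of that estimate, assuming those two factors (the overhead multiplier is inferred from the table's own figures, not published by any vendor):

```python
def estimate_min_vram_gb(params_billion: float,
                         bytes_per_param: float = 0.5,
                         overhead: float = 1.32) -> float:
    """Estimate minimum VRAM for 4-bit quantized weights plus runtime overhead.

    0.5 bytes/param models 4-bit weights; the 1.32 multiplier approximates
    KV-cache and activation overhead at a short (4K) context.
    """
    return round(params_billion * bytes_per_param * overhead, 2)

# Reproduces the table's figures:
print(estimate_min_vram_gb(8.03))   # Llama-3.1-8B row: 5.3
print(estimate_min_vram_gb(70.55))  # Llama-3.1-70B-Instruct row: 46.56
```

Longer contexts or higher-precision weights (FP8, BF16) raise the requirement well above this floor, which is why the same parameter count appears with different VRAM figures across quantized variants below.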
| Model | Developer | Params | Context | Min VRAM | Use Cases | Compatible GPUs |
|---|---|---|---|---|---|---|
| AceReason-Nemotron-14B | nvidia | 14.77B | 4K | 9.75GB | text-generation | 108 |
| AI21-Jamba-Large-1.5 | ai21labs | 398.56B | 4K | 263.04GB | text-generation | 0 |
| AI21-Jamba-Mini-1.5 | ai21labs | 51.57B | 4K | 34.03GB | text-generation | 37 |
| AI21-Jamba-Mini-1.6 | ai21labs | 51.57B | 4K | 34.03GB | text-generation | 37 |
| Athene-70B-Preview | Nexusflow | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Athene-V2-Agent | Nexusflow | 72.70B | 4K | 47.98GB | text-generation | 36 |
| bigscience-small-testing | bigscience | 0.02B | 4K | 0.01GB | text-generation | 169 |
| bitnet-b1.58-2B-4T-bf16 | microsoft | 2.41B | 4K | 1.59GB | text-generation | 169 |
| bloom-1b1 | bigscience | 1.07B | 4K | 0.7GB | text-generation | 169 |
| bloom-1b7 | bigscience | 1.72B | 4K | 1.13GB | text-generation | 169 |
| bloom-3b | bigscience | 3.00B | 4K | 1.98GB | text-generation | 169 |
| bloom-560m | bigscience | 0.56B | 4K | 0.37GB | text-generation | 169 |
| bloom-7b1 | bigscience | 7.07B | 4K | 4.66GB | text-generation | 163 |
| bloomz | bigscience | 176.25B | 4K | 116.33GB | text-generation | 5 |
| bloomz-1b7 | bigscience | 1.72B | 4K | 1.13GB | text-generation | 169 |
| bloomz-3b | bigscience | 3.00B | 4K | 1.98GB | text-generation | 169 |
| bloomz-560m | bigscience | 0.56B | 4K | 0.37GB | text-generation | 169 |
| bloomz-7b1 | bigscience | 7.07B | 4K | 4.66GB | text-generation | 163 |
| bloomz-7b1-mt | bigscience | 7.07B | 4K | 4.66GB | text-generation | 163 |
| Bolmo-1B | allenai | 1.47B | 4K | 0.97GB | text-generation | 169 |
| codegemma-2b | google | 2.51B | 4K | 1.65GB | text-generation | 169 |
| CodeLlama-13b-Instruct-hf | meta-llama | 13.02B | 4K | 8.59GB | text-generation | 108 |
| CodeLlama-7b-Instruct-hf | meta-llama | 6.74B | 4K | 4.44GB | text-generation | 163 |
| deep-ignorance-unfiltered | EleutherAI | 6.86B | 4K | 4.52GB | text-generation | 163 |
| deepseek-coder-33b-base | deepseek-ai | 33.34B | 4K | 22.01GB | text-generation | 54 |
| deepseek-coder-33b-instruct | deepseek-ai | 33.34B | 4K | 22.01GB | text-generation | 54 |
| deepseek-coder-6.7b-base | deepseek-ai | 6.74B | 4K | 4.44GB | text-generation | 163 |
| deepseek-coder-6.7b-instruct | deepseek-ai | 6.74B | 4K | 4.44GB | text-generation | 163 |
| deepseek-coder-7b-base-v1.5 | deepseek-ai | 6.91B | 4K | 4.57GB | text-generation | 163 |
| deepseek-coder-7b-instruct-v1.5 | deepseek-ai | 6.91B | 4K | 4.57GB | text-generation | 163 |
| DeepSeek-Coder-V2-Instruct | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-Coder-V2-Instruct-0724 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-Coder-V2-Lite-Base | deepseek-ai | 15.71B | 4K | 10.36GB | text-generation | 106 |
| deepseek-moe-16b-base | deepseek-ai | 16.38B | 4K | 10.81GB | text-generation | 106 |
| deepseek-moe-16b-chat | deepseek-ai | 16.38B | 4K | 10.81GB | text-generation | 106 |
| DeepSeek-R1-0528 | deepseek-ai | 684.53B | 4K | 451.79GB | text-generation | 0 |
| DeepSeek-R1-0528-NVFP4 | nvidia | 396.77B | 4K | 261.87GB | text-generation | 0 |
| DeepSeek-R1-0528-NVFP4-v2 | nvidia | 393.63B | 4K | 259.8GB | text-generation | 0 |
| DeepSeek-R1-0528-Qwen3-8B | deepseek-ai | 8.19B | 4K | 5.4GB | text-generation | 163 |
| DeepSeek-R1-0528-Qwen3-8B-MLX-4bit | lmstudio-community | 1.28B | 4K | 0.85GB | text-generation | 169 |
| DeepSeek-R1-0528-Qwen3-8B-MLX-8bit | lmstudio-community | 2.30B | 4K | 1.52GB | text-generation | 169 |
| DeepSeek R1 Distill 14B | DeepSeek | 14.00B | 65K | 9.5GB | reasoning, math, coding, analysis | 108 |
| DeepSeek-R1-Distill-Qwen-14B | deepseek-ai | 14.77B | 4K | 9.75GB | text-generation | 108 |
| DeepSeek-R1-Distill-Qwen-32B | deepseek-ai | 32.76B | 4K | 21.63GB | text-generation | 54 |
| DeepSeek-R1-Distill-Qwen-7B | deepseek-ai | 7.62B | 4K | 5.03GB | text-generation | 163 |
| DeepSeek-R1-NVFP4 | nvidia | 396.77B | 4K | 261.87GB | text-generation | 0 |
| DeepSeek-V2 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2.5 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2-Chat | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2-Chat-0628 | deepseek-ai | 235.74B | 4K | 155.58GB | text-generation | 1 |
| DeepSeek-V2-Lite | deepseek-ai | 15.71B | 4K | 10.36GB | text-generation | 106 |
| DeepSeek-V2-Lite-Chat | deepseek-ai | 15.71B | 4K | 10.36GB | text-generation | 106 |
| Deepseek-V2 Pro | DeepSeek AI | 70.00B | 131K | 45.02GB | chat, code, reasoning | 36 |
| DeepSeek-V3-0324 | deepseek-ai | 684.53B | 4K | 451.79GB | text-generation | 0 |
| DeepSeek-V3-0324-NVFP4 | nvidia | 396.77B | 4K | 261.87GB | text-generation | 0 |
| DeepSeek-V3.1-NVFP4 | nvidia | 393.63B | 4K | 259.8GB | text-generation | 0 |
| DeepSeek-V3.2 | DeepSeek AI | 70.00B | 131K | 77.47GB | reasoning, agentic workflows | 15 |
| DeepSeek-V3.2-NVFP4 | nvidia | 394.50B | 4K | 260.37GB | text-generation | 0 |
| DialoGPT-small | microsoft | 0.18B | 4K | 0.12GB | text-generation | 169 |
| distilgpt2 | distilbert | 0.09B | 4K | 0.06GB | text-generation | 169 |
| dolphin-2.9.1-yi-1.5-34b | dphn | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Dolphin-Mistral-24B-Venice-Edition | dphn | 23.57B | 4K | 15.55GB | text-generation | 81 |
| ELM | Joaoffg | 0.90B | 4K | 0.59GB | text-generation | 169 |
| falcon-11B | tiiuae | 11.10B | 4K | 7.33GB | text-generation | 141 |
| falcon-7b-instruct | tiiuae | 7.22B | 4K | 4.76GB | text-generation | 163 |
| Falcon-H1-0.5B-Base | tiiuae | 0.52B | 4K | 0.34GB | text-generation | 169 |
| Falcon-H1-0.5B-Instruct | tiiuae | 0.52B | 4K | 0.34GB | text-generation | 169 |
| Falcon-H1-1.5B-Base | tiiuae | 1.55B | 4K | 1.02GB | text-generation | 169 |
| Falcon-H1-1.5B-Instruct | tiiuae | 1.55B | 4K | 1.02GB | text-generation | 169 |
| Falcon-H1-34B-Base | tiiuae | 33.64B | 4K | 22.21GB | text-generation | 54 |
| Falcon-H1-34B-Instruct | tiiuae | 33.64B | 4K | 22.21GB | text-generation | 54 |
| Falcon-H1-3B-Base | tiiuae | 3.15B | 4K | 2.08GB | text-generation | 169 |
| Falcon-H1-3B-Instruct | tiiuae | 3.15B | 4K | 2.08GB | text-generation | 169 |
| Falcon-H1-7B-Base | tiiuae | 7.59B | 4K | 5GB | text-generation | 163 |
| Falcon-H1-7B-Instruct | tiiuae | 7.59B | 4K | 5GB | text-generation | 163 |
| Falcon-H1-Tiny-90M-Instruct | tiiuae | 0.09B | 4K | 0.06GB | text-generation | 169 |
| falcon-mamba-7b-instruct | tiiuae | 7.27B | 4K | 4.8GB | text-generation | 163 |
| falcon-mamba-tiny-dev | tiiuae | 0.01B | 4K | 0.01GB | text-generation | 169 |
| Falcon3-10B-Base | tiiuae | 10.31B | 4K | 6.8GB | text-generation | 141 |
| Falcon3-1B-Instruct | tiiuae | 1.67B | 4K | 1.1GB | text-generation | 169 |
| Falcon3-3B-Base | tiiuae | 3.23B | 4K | 2.13GB | text-generation | 169 |
| Falcon3-3B-Instruct | tiiuae | 3.23B | 4K | 2.13GB | text-generation | 169 |
| Falcon3-7B-Base | tiiuae | 7.46B | 4K | 4.92GB | text-generation | 163 |
| Falcon3-7B-Instruct | tiiuae | 7.46B | 4K | 4.92GB | text-generation | 163 |
| Flex-reddit-2x7B-1T | allenai | 11.63B | 4K | 7.68GB | text-generation | 141 |
| gemma-1.1-2b-it | google | 2.51B | 4K | 1.65GB | text-generation | 169 |
| gemma-1.1-7b-it | google | 8.54B | 4K | 5.63GB | text-generation | 163 |
| gemma-2-27b-it | google | 27.23B | 4K | 17.97GB | text-generation | 58 |
| gemma-2-9b-it | google | 9.24B | 4K | 6.11GB | text-generation | 141 |
| GLM-4.7-Flash-MLX-6bit | lmstudio-community | 6.56B | 4K | 4.32GB | text-generation | 163 |
| GLM-4.7-Flash-MLX-8bit | lmstudio-community | 8.43B | 4K | 5.57GB | text-generation | 163 |
| gpt-neo-1.3B | EleutherAI | 1.37B | 4K | 0.9GB | text-generation | 169 |
| gpt-neo-125m | EleutherAI | 0.15B | 4K | 0.1GB | text-generation | 169 |
| gpt-neo-2.7B | EleutherAI | 2.72B | 4K | 1.79GB | text-generation | 169 |
| gpt-oss-120b | openai | 120.41B | 4K | 79.48GB | text-generation | 15 |
| gpt-oss-120b-Eagle3-long-context | nvidia | 0.22B | 4K | 0.14GB | text-generation | 169 |
| gpt-oss-20b | openai | 21.51B | 4K | 14.2GB | text-generation | 81 |
| gpt2 | openai-community | 0.14B | 4K | 0.09GB | text-generation | 169 |
| gpt2-large | openai-community | 0.81B | 4K | 0.54GB | text-generation | 169 |
| gpt2-medium | openai-community | 0.38B | 4K | 0.25GB | text-generation | 169 |
| gpt2-mini | erwanf | 0.04B | 4K | 0.02GB | text-generation | 169 |
| h2ovl-mississippi-2b | h2oai | 2.15B | 4K | 1.42GB | text-generation | 169 |
| h2ovl-mississippi-800m | h2oai | 0.83B | 4K | 0.55GB | text-generation | 169 |
| Hermes-2-Pro-Llama-3-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Hermes-2-Pro-Mistral-7B | NousResearch | 7.24B | 4K | 4.79GB | text-generation | 163 |
| Hermes-2-Theta-Llama-3-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Hermes-3-Llama-3.1-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Hermes-4-14B | NousResearch | 14.77B | 4K | 9.75GB | text-generation | 108 |
| internlm2_5-7b | internlm | 7.74B | 4K | 5.1GB | text-generation | 163 |
| internlm2-chat-1_8b | internlm | 1.89B | 4K | 1.24GB | text-generation | 169 |
| internlm2-chat-20b | internlm | 19.86B | 4K | 13.11GB | text-generation | 81 |
| internlm2-chat-7b-sft | internlm | 7.74B | 4K | 5.1GB | text-generation | 163 |
| Jan-v3-4B-base-instruct | janhq | 4.41B | 4K | 2.92GB | text-generation | 169 |
| japanese-gpt-neox-small | rinna | 0.20B | 4K | 0.13GB | text-generation | 169 |
| LFM2-24B-A2B | LiquidAI | 23.84B | 4K | 15.74GB | text-generation | 81 |
| LFM2.5-1.2B-Instruct | LiquidAI | 1.17B | 4K | 0.77GB | text-generation | 169 |
| LFM2.5-1.2B-Instruct-MLX-4bit | lmstudio-community | 0.18B | 4K | 0.12GB | text-generation | 169 |
| LFM2.5-1.2B-Instruct-MLX-6bit | lmstudio-community | 0.26B | 4K | 0.17GB | text-generation | 169 |
| LFM2.5-1.2B-Instruct-MLX-8bit | lmstudio-community | 0.33B | 4K | 0.22GB | text-generation | 169 |
| LFM2-8B-A1B | LiquidAI | 8.34B | 4K | 5.5GB | text-generation | 163 |
| Llama-2-7b-hf | meta-llama | 6.74B | 4K | 4.44GB | text-generation | 163 |
| Llama-3.1-405B-Instruct | meta-llama | 405.85B | 4K | 267.86GB | text-generation | 0 |
| Llama-3.1-405B-Instruct-FP8 | meta-llama | 405.87B | 4K | 267.87GB | text-generation | 0 |
| Llama 3.1 70B | Meta | 70.00B | 131K | 44GB | chat, coding, reasoning | 36 |
| Llama-3.1-70B-Instruct | meta-llama | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Llama 3.1 8B | Meta | 8.00B | 131K | 5.5GB | chat, coding, summarization | 163 |
| Llama-3.1-8B-Instruct-FP8 | nvidia | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Llama-3.1-Tulu-3-8B-SFT | allenai | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Llama-3.2-1B | meta-llama | 1.24B | 4K | 0.81GB | text-generation | 169 |
| Llama-3.2-1B-Instruct-FP8 | RedHatAI | 1.50B | 4K | 0.99GB | text-generation | 169 |
| Llama-3.2-1B-Instruct-FP8-dynamic | RedHatAI | 1.50B | 4K | 0.99GB | text-generation | 169 |
| Llama-3.2-3B | meta-llama | 3.21B | 4K | 2.12GB | text-generation | 169 |
| llama-3.3-70b-instruct-awq | casperhansen | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Llama-3_3-Nemotron-Super-49B-v1 | nvidia | 49.87B | 4K | 32.91GB | text-generation | 37 |
| Llama-3_3-Nemotron-Super-49B-v1_5-FP8 | nvidia | 49.87B | 4K | 32.91GB | text-generation | 37 |
| Llama-3_3-Nemotron-Super-49B-v1_5-NVFP4 | nvidia | 28.97B | 4K | 19.12GB | text-generation | 58 |
| Llama-3_3-Nemotron-Super-49B-v1-FP8 | nvidia | 49.87B | 4K | 32.91GB | text-generation | 37 |
| llama-300M-v3-original | deqing | 0.32B | 4K | 0.21GB | text-generation | 169 |
| Llama-Guard-3-8B | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Llama-Guard-3-8B-INT8 | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| LlamaGuard-7b | meta-llama | 6.74B | 4K | 4.44GB | text-generation | 163 |
| llm-jp-3-3.7b-instruct | llm-jp | 3.78B | 4K | 2.5GB | text-generation | 169 |
| LocoOperator-4B | LocoreMind | 4.02B | 4K | 2.65GB | text-generation | 169 |
| maira-2 | microsoft | 6.88B | 4K | 4.54GB | text-generation | 163 |
| MediPhi-Clinical | microsoft | 3.82B | 4K | 2.52GB | text-generation | 169 |
| MediPhi-Instruct | microsoft | 3.82B | 4K | 2.52GB | text-generation | 169 |
| Meta-Llama-3.1-70B-Instruct | NousResearch | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Meta-Llama-3.1-8B | NousResearch | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3.1-8B-Instruct | unsloth | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3.1-8B-Instruct-bnb-4bit | unsloth | 8.25B | 4K | 5.45GB | text-generation | 163 |
| Meta-Llama-3.1-8B-Instruct-FP8 | RedHatAI | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3-70B-Instruct | meta-llama | 70.55B | 4K | 46.56GB | text-generation | 36 |
| Meta-Llama-3-8B | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-3-8B-Instruct | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| Meta-Llama-Guard-2-8B | meta-llama | 8.03B | 4K | 5.3GB | text-generation | 163 |
| MiniMax-M2.5 | MiniMaxAI | 228.70B | 4K | 150.94GB | text-generation | 1 |
| MiniMax-M2-AWQ | QuantTrio | 228.69B | 4K | 150.93GB | text-generation | 1 |
| Mistral 7B | Mistral | 7.00B | 32K | 5GB | chat, instruction-following, translation | 163 |
| Mistral-7B-Instruct-v0.2 | mistralai | 7.24B | 4K | 4.79GB | text-generation | 163 |
| mistral-7b-v0.3-bnb-4bit | unsloth | 7.47B | 4K | 4.93GB | text-generation | 163 |
| Mistral-NeMo-Minitron-8B-Instruct | nvidia | 8.41B | 4K | 5.56GB | text-generation | 163 |
| Mistral-Small-24B-Instruct-2501-AWQ | stelterlab | 23.57B | 4K | 15.55GB | text-generation | 81 |
| Mixtral-8x7B-Instruct-v0.1-GPTQ | TheBloke | 46.71B | 4K | 30.83GB | text-generation | 40 |
| Nanbeige4.1-3B | Nanbeige | 3.93B | 4K | 2.6GB | text-generation | 169 |
| Nanbeige4.1-3B-heretic | heretic-org | 3.93B | 4K | 2.6GB | text-generation | 169 |
| Nemotron-Flash-3B | nvidia | 2.75B | 4K | 1.81GB | text-generation | 169 |
| Nemotron-H-4B-Base-8K | nvidia | 4.49B | 4K | 2.96GB | text-generation | 169 |
| Nemotron-H-4B-Instruct-128K | nvidia | 4.49B | 4K | 2.96GB | text-generation | 169 |
| Nous-Hermes-2-Mistral-7B-DPO | NousResearch | 7.24B | 4K | 4.79GB | text-generation | 163 |
| Nous-Hermes-2-Mixtral-8x7B-DPO | NousResearch | 46.70B | 4K | 30.82GB | text-generation | 40 |
| Nous-Hermes-2-SOLAR-10.7B | NousResearch | 10.73B | 4K | 7.08GB | text-generation | 141 |
| Nous-Hermes-llama-2-7b | NousResearch | 6.74B | 4K | 4.44GB | text-generation | 163 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16 | nvidia | 31.58B | 4K | 20.85GB | text-generation | 54 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | nvidia | 31.58B | 4K | 20.85GB | text-generation | 54 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 | nvidia | 31.58B | 4K | 20.85GB | text-generation | 54 |
| NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 | nvidia | 18.24B | 4K | 12.03GB | text-generation | 81 |
| NVIDIA-Nemotron-Nano-9B-v2 | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| NVIDIA-Nemotron-Nano-9B-v2-Base | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| NVIDIA-Nemotron-Nano-9B-v2-FP8 | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| NVIDIA-Nemotron-Nano-9B-v2-Japanese | nvidia | 8.89B | 4K | 5.86GB | text-generation | 163 |
| OLMo-1B | allenai | 1.18B | 4K | 0.78GB | text-generation | 169 |
| OLMo-1B-0724-hf | allenai | 1.28B | 4K | 0.85GB | text-generation | 169 |
| OLMo-1B-hf | allenai | 1.18B | 4K | 0.78GB | text-generation | 169 |
| OLMo-2-0325-32B | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| OLMo-2-0325-32B-Instruct | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| OLMo-2-0425-1B | allenai | 1.48B | 4K | 0.98GB | text-generation | 169 |
| OLMo-2-0425-1B-Instruct | allenai | 1.48B | 4K | 0.98GB | text-generation | 169 |
| OLMo-2-0425-1B-RLVR1 | allenai | 1.48B | 4K | 0.98GB | text-generation | 169 |
| OLMo-2-1124-13B-Instruct | allenai | 13.72B | 4K | 9.05GB | text-generation | 108 |
| OLMo-2-1124-7B-Instruct | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3.1-32B-Think | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| Olmo-3.1-7B-RL-Zero-Math | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-1025-7B | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-1125-32B | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| Olmo-3-32B-Think | allenai | 32.23B | 4K | 21.27GB | text-generation | 54 |
| Olmo-3-7B-Instruct | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Instruct-DPO | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Instruct-SFT | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Think | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Think-DPO | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| Olmo-3-7B-Think-SFT | allenai | 7.30B | 4K | 4.82GB | text-generation | 163 |
| OLMo-7B-0724-hf | allenai | 6.89B | 4K | 4.54GB | text-generation | 163 |
| OLMo-7B-hf | allenai | 6.89B | 4K | 4.54GB | text-generation | 163 |
| Olmo-Hybrid-Instruct-DPO-7B | allenai | 7.43B | 4K | 4.91GB | text-generation | 163 |
| OLMoE-1B-7B-0125 | allenai | 6.92B | 4K | 4.57GB | text-generation | 163 |
| OLMoE-1B-7B-0125-Instruct | allenai | 6.92B | 4K | 4.57GB | text-generation | 163 |
| OLMoE-1B-7B-0924-Instruct | allenai | 6.92B | 4K | 4.57GB | text-generation | 163 |
| phi-1 | microsoft | 1.42B | 4K | 0.94GB | text-generation | 169 |
| phi-1_5 | microsoft | 1.42B | 4K | 0.94GB | text-generation | 169 |
| phi-2 | microsoft | 2.78B | 4K | 1.84GB | text-generation | 169 |
| Phi-3-medium-4k-instruct | microsoft | 13.96B | 4K | 9.22GB | text-generation | 108 |
| Phi-3-mini-4k-instruct-gptq-4bit | kaitchup | 3.82B | 4K | 2.52GB | text-generation | 169 |
| Phi-3-small-8k-instruct | microsoft | 7.39B | 4K | 4.88GB | text-generation | 163 |
| Phi-mini-MoE-instruct | microsoft | 7.65B | 4K | 5.05GB | text-generation | 163 |
| Phi-tiny-MoE-instruct | microsoft | 3.76B | 4K | 2.48GB | text-generation | 169 |
| polyglot-ko-1.3b | EleutherAI | 1.43B | 4K | 0.95GB | text-generation | 169 |
| polyglot-ko-12.8b | EleutherAI | 13.06B | 4K | 8.62GB | text-generation | 108 |
| polyglot-ko-5.8b | EleutherAI | 6.00B | 4K | 3.96GB | text-generation | 169 |
| pythia-1.4b | EleutherAI | 1.52B | 4K | 1GB | text-generation | 169 |
| pythia-1.4b-deduped | EleutherAI | 1.41B | 4K | 0.94GB | text-generation | 169 |
| pythia-12b | EleutherAI | 12.00B | 4K | 7.92GB | text-generation | 141 |
| pythia-14m | EleutherAI | 0.01B | 4K | 0.01GB | text-generation | 169 |
| pythia-14m-deduped | EleutherAI | 0.04B | 4K | 0.02GB | text-generation | 169 |
| pythia-160m-deduped | EleutherAI | 0.21B | 4K | 0.14GB | text-generation | 169 |
| pythia-160m-seed1 | EleutherAI | 0.21B | 4K | 0.14GB | text-generation | 169 |
| pythia-1b | EleutherAI | 1.08B | 4K | 0.72GB | text-generation | 169 |
| pythia-2.8b-deduped | EleutherAI | 2.91B | 4K | 1.93GB | text-generation | 169 |
| pythia-31m | EleutherAI | 0.03B | 4K | 0.02GB | text-generation | 169 |
| pythia-31m-deduped | EleutherAI | 0.06B | 4K | 0.03GB | text-generation | 169 |
| pythia-410m | EleutherAI | 0.51B | 4K | 0.33GB | text-generation | 169 |
| pythia-410m-deduped | EleutherAI | 0.51B | 4K | 0.33GB | text-generation | 169 |
| pythia-410m-v0 | EleutherAI | 0.51B | 4K | 0.33GB | text-generation | 169 |
| pythia-6.9b | EleutherAI | 6.99B | 4K | 4.61GB | text-generation | 163 |
| pythia-70m-deduped | EleutherAI | 0.10B | 4K | 0.07GB | text-generation | 169 |
| Qwen 2.5 72B | Alibaba | 72.00B | 131K | 45.02GB | chat, code, reasoning | 36 |
| Qwen 2.5 72B Instruct | Qwen | 72.00B | 131K | 45.02GB | chat, code, reasoning | 36 |
| Qwen1.5-110B-Chat-AWQ | Qwen | 111.21B | 4K | 73.4GB | text-generation | 15 |
| Qwen2-0.5B-Instruct | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2-1.5B-Instruct | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-0.5B | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2.5-0.5B-Instruct | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2.5-1.5B | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-1.5B-Instruct | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-1.5B-Instruct-AWQ | Qwen | 1.78B | 4K | 1.18GB | text-generation | 169 |
| Qwen2.5-1.5B-quantized.w8a8 | RedHatAI | 1.78B | 4K | 1.18GB | text-generation | 169 |
| Qwen2.5-14B-Instruct-AWQ | Qwen | 14.77B | 4K | 9.75GB | text-generation | 108 |
| Qwen2.5-32B | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-32B-Instruct-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-3B | Qwen | 3.09B | 4K | 2.04GB | text-generation | 169 |
| Qwen2.5-3B-Instruct | Qwen | 3.09B | 4K | 2.04GB | text-generation | 169 |
| Qwen2.5-72B-Instruct | Qwen | 72.71B | 4K | 47.98GB | text-generation | 36 |
| Qwen2.5-72B-Instruct-AWQ | Qwen | 72.96B | 4K | 48.15GB | text-generation | 16 |
| Qwen2.5-7B | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-7B-Instruct | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Coder-0.5B-Instruct | Qwen | 0.49B | 4K | 0.33GB | text-generation | 169 |
| Qwen2.5-Coder-1.5B-Instruct | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-Coder-14B-Instruct | Qwen | 14.77B | 4K | 9.75GB | text-generation | 108 |
| Qwen2.5-Coder-32B-Instruct | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-Coder-32B-Instruct-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen2.5-Coder-7B-Instruct | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Coder-7B-Instruct-AWQ | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Coder-7B-Instruct-GPTQ-Int4 | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen2.5-Math-1.5B | Qwen | 1.54B | 4K | 1.02GB | text-generation | 169 |
| Qwen2.5-VL-7B-Instruct-NVFP4 | nvidia | 5.44B | 4K | 3.59GB | text-generation | 169 |
| Qwen2 72B | Qwen | 72.00B | 65K | 45.02GB | chat, code, reasoning | 36 |
| Qwen2-7B-Instruct | Qwen | 7.62B | 4K | 5.03GB | text-generation | 163 |
| Qwen3-0.6B | Qwen | 0.75B | 4K | 0.5GB | text-generation | 169 |
| Qwen3-0.6B-FP8 | Qwen | 0.75B | 4K | 0.5GB | text-generation | 169 |
| Qwen3-1.7B-Base | Qwen | 1.72B | 4K | 1.13GB | text-generation | 169 |
| Qwen3-14B-Instruct | OpenPipe | 14.77B | 4K | 9.75GB | text-generation | 108 |
| Qwen3-14B-NVFP4 | nvidia | 8.99B | 4K | 5.93GB | text-generation | 163 |
| Qwen3-235B-A22B | Qwen | 235.09B | 4K | 155.17GB | text-generation | 1 |
| Qwen3-235B-A22B-Instruct-2507-FP8 | Qwen | 235.11B | 4K | 155.17GB | text-generation | 1 |
| Qwen3-235B-A22B-NVFP4 | nvidia | 132.81B | 4K | 87.65GB | text-generation | 8 |
| Qwen3-30B-A3B-Instruct-2507-FP8 | Qwen | 30.53B | 4K | 20.15GB | text-generation | 54 |
| Qwen3-30B-A3B-NVFP4 | nvidia | 17.45B | 4K | 11.52GB | text-generation | 100 |
| Qwen3-32B-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| Qwen3-32B-NVFP4 | nvidia | 19.11B | 4K | 12.62GB | text-generation | 81 |
| Qwen3-4B-AWQ | Qwen | 4.02B | 4K | 2.65GB | text-generation | 169 |
| Qwen3-4B-Instruct-2507-FP8 | Qwen | 4.41B | 4K | 2.92GB | text-generation | 169 |
| Qwen3-4B-SafeRL | Qwen | 4.02B | 4K | 2.65GB | text-generation | 169 |
| Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled | Jackrong | 27.78B | 4K | 18.34GB | text-generation | 58 |
| Qwen3.5-27B-Text-NVFP4-MTP | osoleve | 16.67B | 4K | 11GB | text-generation | 106 |
| Qwen3.5-4B-Safety-Thinking | MerlinSafety | 4.21B | 4K | 2.77GB | text-generation | 169 |
| Qwen3.5-9B-abliterated | lukey03 | 8.95B | 4K | 5.91GB | text-generation | 163 |
| Qwen3-8B-AWQ | Qwen | 8.19B | 4K | 5.4GB | text-generation | 163 |
| Qwen3-8B-Base | Qwen | 8.19B | 4K | 5.4GB | text-generation | 163 |
| Qwen3-8B-FP8 | nvidia | 8.19B | 4K | 5.4GB | text-generation | 163 |
| Qwen3-8B-NVFP4 | nvidia | 5.15B | 4K | 3.4GB | text-generation | 169 |
| Qwen3-Coder-30B-A3B-Instruct-FP8 | Qwen | 30.53B | 4K | 20.15GB | text-generation | 54 |
| Qwen3-Coder-Next | Qwen | 79.67B | 4K | 52.58GB | text-generation | 16 |
| Qwen3-Coder-Next-8bit | NexVeridian | 22.41B | 4K | 14.79GB | text-generation | 81 |
| Qwen3-Coder-Next-AWQ-4bit | bullpoint | 14.44B | 4K | 9.54GB | text-generation | 108 |
| Qwen3-Coder-Next-Base | Qwen | 79.67B | 4K | 52.58GB | text-generation | 16 |
| Qwen3-Coder-Next-FP8 | Qwen | 79.68B | 4K | 52.59GB | text-generation | 16 |
| Qwen3-Next-80B-A3B-Instruct | Qwen | 81.32B | 4K | 53.67GB | text-generation | 16 |
| Qwen3-Next-80B-A3B-Instruct-FP8 | Qwen | 81.33B | 4K | 53.68GB | text-generation | 16 |
| Qwen3-VL-30B-A3B-Instruct-AWQ | QuantTrio | 31.07B | 4K | 20.5GB | text-generation | 54 |
| Qwen3Guard-Gen-0.6B | Qwen | 0.75B | 4K | 0.5GB | text-generation | 169 |
| Qwen3Guard-Gen-4B | Qwen | 4.41B | 4K | 2.92GB | text-generation | 169 |
| Qwen3Guard-Gen-8B | Qwen | 8.19B | 4K | 5.4GB | text-generation | 163 |
| QwQ-32B-AWQ | Qwen | 32.76B | 4K | 21.63GB | text-generation | 54 |
| recurrentgemma-2b | google | 2.68B | 4K | 1.77GB | text-generation | 169 |
| saiga_llama3_8b | IlyaGusev | 8.03B | 4K | 5.3GB | text-generation | 163 |
| SmolLM-135M-Instruct | HuggingFaceTB | 0.13B | 4K | 0.09GB | text-generation | 169 |
| SmolLM2-135M | HuggingFaceTB | 0.13B | 4K | 0.09GB | text-generation | 169 |
| SmolLM2-135M-Instruct | HuggingFaceTB | 0.13B | 4K | 0.09GB | text-generation | 169 |
| SOLAR-10.7B-v1.0 | upstage | 10.73B | 4K | 7.08GB | text-generation | 141 |
| StableBeluga-13B | stabilityai | 13.02B | 4K | 8.59GB | text-generation | 108 |
| stablelm-2-1_6b | stabilityai | 1.64B | 4K | 1.09GB | text-generation | 169 |
| stablelm-2-zephyr-1_6b | stabilityai | 1.64B | 4K | 1.09GB | text-generation | 169 |
| stablelm-3b-4e1t | stabilityai | 2.80B | 4K | 1.85GB | text-generation | 169 |
| stablelm-base-alpha-7b-v2 | stabilityai | 6.89B | 4K | 4.54GB | text-generation | 163 |
| stablelm-zephyr-3b | stabilityai | 2.80B | 4K | 1.85GB | text-generation | 169 |
| starchat-alpha | HuggingFaceH4 | 15.52B | 4K | 10.24GB | text-generation | 106 |
| Starling-LM-7B-beta | Nexusflow | 7.24B | 4K | 4.79GB | text-generation | 163 |
| steerling-8b | guidelabs | 8.39B | 4K | 5.54GB | text-generation | 163 |
| Step-3.5-Flash | stepfun-ai | 199.38B | 4K | 131.59GB | text-generation | 3 |
| stories15M_MOE | ggml-org | 0.04B | 4K | 0.02GB | text-generation | 169 |
| Strand-Rust-Coder-14B-v1 | Fortytwo-Network | 14.77B | 4K | 9.75GB | text-generation | 108 |
| tiny-aya-global | CohereLabs | 3.35B | 4K | 2.21GB | text-generation | 169 |
| tiny-random-Gemma2ForCausalLM | hmellor | 0.01B | 4K | 0.01GB | text-generation | 169 |
| TinyLlama-1.1B-Chat-v0.3-GPTQ | TheBloke | 1.10B | 4K | 0.73GB | text-generation | 169 |
| TinyLlama-1.1B-Chat-v1.0 | TinyLlama | 1.10B | 4K | 0.73GB | text-generation | 169 |
| tulu-2-dpo-70b | allenai | 68.98B | 4K | 45.53GB | text-generation | 36 |
| txgemma-2b-predict | google | 2.61B | 4K | 1.73GB | text-generation | 169 |
| vaultgemma-1b | google | 1.04B | 4K | 0.68GB | text-generation | 169 |
| wildguard | allenai | 7.25B | 4K | 4.79GB | text-generation | 163 |
| Yi-1.5-34B | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-34B-32K | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-34B-Chat | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-34B-Chat-16K | 01-ai | 34.39B | 4K | 22.69GB | text-generation | 54 |
| Yi-1.5-6B | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-1.5-6B-Chat | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-1.5-9B | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-1.5-9B-32K | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-1.5-9B-Chat | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-1.5-9B-Chat-16K | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-6B | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-6B-200K | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-6B-Chat | 01-ai | 6.06B | 4K | 4GB | text-generation | 169 |
| Yi-9B | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-9B-200K | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-Coder-9B | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| Yi-Coder-9B-Chat | 01-ai | 8.83B | 4K | 5.83GB | text-generation | 163 |
| zephyr-7b-beta | HuggingFaceH4 | 7.24B | 4K | 4.79GB | text-generation | 163 |
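To answer the page's central question (which models fit your GPU?), the table can be filtered by a VRAM budget. A minimal sketch, using a few (name, Min VRAM) pairs copied from the rows above; the `MODELS` list and `models_that_fit` helper are illustrative names, not part of any published API:

```python
# A few (model name, min VRAM in GB) pairs taken from the table above.
MODELS = [
    ("Llama-3.2-3B", 2.12),
    ("Qwen2.5-7B-Instruct", 5.03),
    ("DeepSeek-R1-Distill-Qwen-14B", 9.75),
    ("Qwen2.5-32B", 21.63),
    ("Llama-3.1-70B-Instruct", 46.56),
]

def models_that_fit(vram_gb: float) -> list[str]:
    """Return the names of models whose minimum VRAM fits the given budget."""
    return [name for name, need in MODELS if need <= vram_gb]

# On a 24 GB card, everything up to the 32B tier fits;
# the 70B model does not.
print(models_that_fit(24.0))
```

The same filter applied to the full table yields the "Compatible GPUs" counts in the last column: smaller models fit more of the 169 GPUs in the comparison set, while the largest dense and MoE models fit few or none without multi-GPU setups.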