Models: 41,024

Active filters: 4-bit

0xSero/GLM-4.7-REAP-50-W4A16

Text Generation • 2B • Updated 4 days ago • 1.69k • 47

0xSero/MiniMax-M2.1-REAP-50-W4A16

Text Generation • 17B • Updated 5 days ago • 1.4k • 26

mlx-community/GLM-4.7-REAP-50-mxfp4

Text Generation • 185B • Updated 6 days ago • 1.13k • 21

unsloth/gemma-3-12b-it-bnb-4bit

Any-to-Any • 13B • Updated May 12, 2025 • 5.81k • 18

Intel/GLM-4.7-int4-mixed-AutoRound

Text Generation • 2B • Updated 9 days ago • 179 • 24

mlx-community/IQuest-Coder-V1-40B-Loop-Instruct-4bit

Text Generation • 40B • Updated 1 day ago • 1.24k • 8

QuantTrio/GLM-4.7-AWQ

Text Generation • 358B • Updated 11 days ago • 17.2k • 18

LiquidAI/LFM2.5-1.2B-Instruct-MLX-4bit

Text Generation • 0.2B • Updated 2 days ago • 118 • 5

0xSero/GLM-4.7-REAP-40-W4A16

Text Generation • 2B • Updated 5 days ago • 2.29k • 4

LiquidAI/LFM2.5-1.2B-JP-MLX-4bit

Text Generation • 0.2B • Updated 2 days ago • 58 • 4

Disty0/Z-Image-Turbo-SDNQ-uint4-svd-r32

Text-to-Image • Updated Dec 3, 2025 • 56.9k • 51

QuantTrio/MiniMax-M2.1-AWQ

Text Generation • 229B • Updated 10 days ago • 4.3k • 8

mlx-community/Youtu-LLM-2B-mlx-4bit

Text Generation • 0.3B • Updated 6 days ago • 100 • 3

mlx-community/Falcon-H1R-7B-4bit

Text Generation • 1B • Updated 4 days ago • 138 • 3

Disty0/LTX-2-SDNQ-4bit-dynamic

Updated about 17 hours ago • 16 • 3

MaziyarPanahi/gemma-7b-GGUF

Text Generation • 9B • Updated Feb 29, 2024 • 1.34k • 14

MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF

Text Generation • 7B • Updated May 22, 2024 • 133k • 131

hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4

Text Generation • 8B • Updated Aug 7, 2024 • 149k • 84

gaunernst/gemma-3-4b-it-int4-awq

Image-Text-to-Text • Updated Apr 6, 2025 • 37.9k • 5

stelterlab/DeepSeek-R1-0528-Qwen3-8B-AWQ

Text Generation • 8B • Updated Jun 4, 2025 • 5.61k • 4

mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit

Text Generation • Updated Sep 12, 2025 • 5.55k • 23

unsloth/Qwen3-VL-8B-Instruct-unsloth-bnb-4bit

Image-Text-to-Text • 9B • Updated Oct 31, 2025 • 56k • 15

uqer1244/MLX-z-image

Text-to-Image • Updated 15 days ago • 63 • 4

tencent/HY-MT1.5-7B-GPTQ-Int4

Translation • 8B • Updated 8 days ago • 554 • 7

mlx-community/Youtu-LLM-2B-4bit

Text Generation • 0.3B • Updated 8 days ago • 188 • 3

zimengxiong/WeDLM-8B-Instruct-MLX-4bit

Text Generation • 1B • Updated 7 days ago • 197 • 2

jjjssjs/HyperCLOVAX-SEED-Think-32B-4bit

33B • Updated 5 days ago • 573 • 2

Intel/Qwen3-VL-30B-A3B-Instruct-int4-AutoRound

1B • Updated 3 days ago • 94 • 2

TheBloke/WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GPTQ

Text Generation • 33B • Updated Sep 27, 2023 • 144 • 86

TheBloke/BigTranslate-13B-GPTQ

Text Generation • 13B • Updated Aug 21, 2023 • 1.2k • 20