Models

3,833

Active filters: compressed-tensors

mratsim/Strawberrylemonade-L3-70B-v1.1-NVFP4

Text Generation • 41B • Updated Oct 26, 2025 • 76

CPU-Hybrid-MoE/GLM-4.5-Air-GPU-weight

32B • Updated Oct 24, 2025 • 5

AngelSlim/Glm4_6-fp8_static

353B • Updated Oct 29, 2025 • 13

CPU-Hybrid-MoE/DeepSeek-V3-0324-GPU-FP8-GPTQ4

Text Generation • 106B • Updated Nov 12, 2025 • 4

reinforce20001/Sakura-GalTransl-14B-v3.8-NVFP4

9B • Updated Oct 24, 2025 • 6

reinforce20001/Sakura-GalTransl-14B-v3.8-W8A8-Int8

15B • Updated Oct 24, 2025 • 2

reinforce20001/Sakura-GalTransl-14B-v3.8-NVFP4-FP8

10B • Updated Oct 24, 2025 • 6

mratsim/Wayfarer-Large-70B-NVFP4A16

Text Generation • 41B • Updated Oct 26, 2025 • 9

Benasd/Qwen3-30B-A3B-Instruct-2507-FP8-BLOCK

31B • Updated Oct 24, 2025 • 2

Benasd/Qwen3-30B-A3B-Instruct-2507-FP8

31B • Updated Oct 24, 2025 • 3

Benasd/Qwen3-30B-A3B-Instruct-2507-FP8-MIXED

31B • Updated Oct 24, 2025 • 5

Benasd/Qwen3-30B-A3B-Instruct-2507-FP8-BLOCK-MIXED

31B • Updated Oct 24, 2025 • 4

Benasd/Qwen3-30B-A3B-Instruct-2507-NVFP4-BF16-MIXED

18B • Updated Oct 26, 2025 • 6

ig1/Qwen3-30B-A3B-NVFP4

17B • Updated Oct 29, 2025 • 5

Benasd/Qwen3-30B-A3B-Instruct-2507-NVFP4A16

17B • Updated Oct 24, 2025 • 3

Firworks/Qwen3-4B-Instruct-2507-nvfp4

3B • Updated Oct 25, 2025 • 59

Firworks/Cassiopeia-70B-nvfp4

41B • Updated Oct 24, 2025 • 3

Firworks/Llama-3.3-70B-Vulpecula-r1-nvfp4

41B • Updated Oct 25, 2025 • 3

Firworks/Qwen3-Coder-30B-A3B-Instruct-nvfp4

17B • Updated Oct 25, 2025 • 102 • 1

deepcogito/cogito-671b-v2.1-FP8

Text Generation • 671B • Updated Nov 21, 2025 • 56 • 9

Firworks/Cydonia-24B-v4.2.0-nvfp4

14B • Updated Oct 25, 2025 • 21

apolloparty/Qwen3-4B-Instruct-2507-NVFP4

3B • Updated Oct 25, 2025 • 6

Firworks/FAPO-32B-nvfp4

19B • Updated Oct 26, 2025 • 2

CPU-Hybrid-MoE/DeepSeek-R1-0528-GPU-FP8-GPTQ4

Text Generation • 106B • Updated Nov 12, 2025 • 2

Firworks/Dolphin-Mistral-24B-Venice-Edition-nvfp4

14B • Updated Oct 26, 2025 • 32

Benasd/Qwen3-30B-A3B-Instruct-2507-NVFP4-BF16-MIXED-2

18B • Updated Oct 26, 2025 • 6

mratsim/Nova-70B-NVFP4A16

Text Generation • 41B • Updated Oct 26, 2025 • 4

mratsim/Anubis-70B-v1.1-NVFP4A16

Text Generation • 41B • Updated Oct 26, 2025 • 5

mratsim/GoldDiamondGold-L33-70B-NVFP4A16

Text Generation • 41B • Updated Nov 8, 2025 • 2

mratsim/L3.3-Ignition-v0.1-70B-NVFP4A16

Text Generation • 41B • Updated Nov 8, 2025 • 3