Models: 3,803

Active filters: compressed-tensors

iproskurina/Qwen2.5-1.5B-awq-int4

0.6B • Updated Oct 31, 2025 • 2

iproskurina/Qwen2.5-7B-awq-int4

2B • Updated Oct 31, 2025 • 3

iproskurina/gemma-7b-awq-int4

3B • Updated Oct 31, 2025 • 3

iproskurina/gemma-2b-awq-int4

Text Generation • 1B • Updated Oct 31, 2025 • 3

cyankiwi/Kimi-Linear-48B-A3B-Instruct-AWQ-8bit

15B • Updated Nov 27, 2025 • 351 • 3

Iker/Latxa-Llama-3.1-8B-Instruct-w4a4_nvfp4

Text Generation • 5B • Updated Oct 31, 2025 • 9

Iker/Latxa-Llama-3.1-8B-Instruct-w4a16_nvfp4

Text Generation • 2B • Updated Oct 31, 2025 • 3

Iker/Latxa-Llama-3.1-8B-Instruct-w8a8_fp8

Text Generation • 8B • Updated Oct 31, 2025 • 1

iproskurina/Llama-3.1-8B-awq-int4

Text Generation • 2B • Updated Oct 31, 2025 • 1

iproskurina/Llama-3.2-3B-awq-int4

Text Generation • 1B • Updated Oct 31, 2025 • 2

iproskurina/pythia-6.9b-awq-int4

Text Generation • 1B • Updated Oct 31, 2025 • 1

iproskurina/Mistral-7B-v0.3-awq-int4

Text Generation • 1B • Updated Oct 31, 2025 • 1

iproskurina/aya-expanse-8b-awq-int4

Text Generation • 3B • Updated Oct 31, 2025 • 5

iproskurina/granite-3.3-8b-base-awq-int4

Text Generation • 1B • Updated Oct 31, 2025 • 2

iproskurina/gemma-2-9b-awq-int4

3B • Updated Oct 31, 2025 • 3

iproskurina/gemma-2-2b-awq-int4

1B • Updated Oct 31, 2025 • 3

SicariusSicariiStuff/Hebrew_Nemo_FP8

Text Generation • 12B • Updated Oct 31, 2025 • 3

Iker/Latxa-Llama-3.1-70B-Instruct-w4a4_nvfp4

Text Generation • 41B • Updated Oct 31, 2025 • 16

Iker/Latxa-Llama-3.1-70B-Instruct-w8a8_fp8

Text Generation • 71B • Updated Oct 31, 2025 • 2

Iker/Latxa-Llama-3.1-70B-Instruct-w4a16_nvfp4

Text Generation • 11B • Updated Oct 31, 2025 • 3

LuisMonAN/Gemma3-12B-AWQ-W4A16-CausalLM

3B • Updated Nov 21, 2025 • 10

nm-testing/Kimi-Linear-48B-A3B-Instruct-FP8-DYNAMIC

49B • Updated Oct 31, 2025 • 32

Firworks/Kimi-Linear-48B-A3B-Instruct-nvfp4

28B • Updated Nov 27, 2025 • 230 • 9

Letinapx/ymodel-4bit-try

Text Generation • 2B • Updated Oct 31, 2025 • 3

skozachuk/Dolphin-Mistral-24B-Venice-Edition-AWQ

4B • Updated Jul 17, 2025 • 20

iproskurina/opt-350m-awq-int4

94.1M • Updated Nov 1, 2025 • 2

iproskurina/Llama-3.2-1B-VLLM-GPTQ-W4A16-G128

0.7B • Updated Nov 1, 2025 • 5

NeoChen1024/gemma-3-12b-it-NVFP4

Image-Text-to-Text • 8B • Updated Nov 2, 2025 • 27

NeoChen1024/gemma-3-27b-it-NVFP4

Image-Text-to-Text • 18B • Updated Nov 2, 2025 • 331

dqubit/Vikhr-YandexGPT-5-Lite-8B-it-W4A16-AWQ

2B • Updated Nov 2, 2025 • 2