Models that run well on a *standalone* RTX A6000's 48 GB of VRAM.
Ben Kelly PRO
YellowjacketGames
Recent Activity
upvoted a changelog about 6 hours ago: Sort Models by Parameter Size
published an article 1 day ago: Flux 2 Dev on a single RTX A6000 48gb GPU
updated a collection 1 day ago: RTX a6000 48gb
RTX A6000 96gb [NVLink]
- unsloth/Nemotron-3-Nano-30B-A3B-GGUF
  Text Generation • 32B • Updated • 102k • 229
- unsloth/GLM-4.7-Flash-GGUF
  Text Generation • 30B • Updated • 112k • 235
- black-forest-labs/FLUX.2-dev
  Image-to-Image • Updated • 106k • 1.27k
- unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
  Image-to-Text • 108B • Updated • 21k • 129
CPU Only Ryzen 64c-1024gb + RTX A6000 96gbNVL
TPS can be as low as 1.0, seriously. It's SLOW.
- unsloth/GLM-4.7-GGUF
  Text Generation • 358B • Updated • 130k • 179
- unsloth/DeepSeek-R1-0528-GGUF
  Text Generation • 671B • Updated • 4.44k • 193
- unsloth/Llama-4-Maverick-17B-128E-Instruct-GGUF
  Image-to-Text • 401B • Updated • 5.73k • 42
- unsloth/MiniMax-M2.1-GGUF
  Text Generation • 229B • Updated • 148k • 148
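A rough back-of-the-envelope sketch of why the models in this collection end up mostly on the CPU side. It is not a benchmark. The parameter counts come from the listings above; the 4.5 bits per weight (roughly a 4-bit quant), the 4 GiB overhead for KV cache and buffers, and treating the NVLink pair as one 96 GiB pool are all assumptions made for illustration, and the function names are hypothetical.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def placement(params_billion: float, bits_per_weight: float,
              vram_gib: float, overhead_gib: float = 4.0) -> str:
    """Say whether weights plus an assumed overhead fit in the VRAM budget."""
    need = weights_gib(params_billion, bits_per_weight) + overhead_gib
    return "fits in VRAM" if need <= vram_gib else "spills to system RAM"


if __name__ == "__main__":
    # Parameter counts (in billions) as shown in the collection listings.
    models = {
        "Nemotron-3-Nano-30B-A3B": 32,
        "Llama-4-Scout-17B-16E": 108,
        "GLM-4.7": 358,
        "DeepSeek-R1-0528": 671,
    }
    for name, size in models.items():
        gib = weights_gib(size, 4.5)  # assumed ~4-bit quant
        print(f"{name}: ~{gib:.0f} GiB, {placement(size, 4.5, 96)}")
```

Under these assumptions the 358B and 671B models need far more than 96 GiB for weights alone, so most layers sit in system RAM, which is consistent with the low TPS noted above.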
RTX a6000 48gb
Models that run well on a *standalone* RTX A6000's 48 GB of VRAM.
GTX 1660 Super 6gb
The best little card under 100 euros. Full Precision vs Quants not benchmarked. This card is so much better at running inference than you realize.
Micro Models for Bottom 10%
Toaster Tier but not iGPU
Image Generation Stack
The stuff we actually use, pruned on an ongoing basis.