"Not all quantized model perform good", serving framework ollama uses NVIDIA gpu, llama.cpp uses CPU with AVX & AMX
v1k
xbruce22
AI & ML interests
None yet
Recent Activity
liked a model about 2 hours ago: 0xSero/MiniMax-M2.1-REAP-50-W4A16
liked a model about 20 hours ago: unsloth/MiniMax-M2.1-GGUF
liked a model 17 days ago: tencent/HunyuanWorld-1