Why is Q8 just 13 GiB?

#36
by zvwgvx - opened

Why is Q8 just 13 GiB?

Unsloth AI org

Why is Q8 just 13 GiB?

The original model is already very small

Thanks. I didn't know before that the model uses mixed quantization (MXFP4 and BF16), so my size calculation didn't match.
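For anyone who runs into the same mismatch, here is a rough sketch of the arithmetic. The parameter counts are made up for illustration and are not the real tensor breakdown of this model. The only fixed figure is the MXFP4 cost: 4-bit elements plus one shared 8-bit scale per block of 32 weights, which works out to 4.25 bits per weight. The "naive" estimate assumes every weight takes a flat 8 bits, which is presumably what a quick back-of-the-envelope Q8 calculation does.

```python
GIB = 1024 ** 3

# MXFP4: 4-bit elements plus one shared 8-bit scale per block of 32 weights,
# so the effective cost is 4 + 8/32 = 4.25 bits per weight.
MXFP4_BITS = 4 + 8 / 32
BF16_BITS = 16
NAIVE_Q8_BITS = 8  # assumption: a flat 8 bits per weight, ignoring block scales


def size_gib(num_params, bits_per_weight):
    """Storage in GiB for num_params weights at the given bits per weight."""
    return num_params * bits_per_weight / 8 / GIB


# Hypothetical split, for illustration only (not this model's real numbers).
mxfp4_params = 19e9  # weights assumed to stay in MXFP4
bf16_params = 2e9    # weights assumed to stay in BF16

naive = size_gib(mxfp4_params + bf16_params, NAIVE_Q8_BITS)
mixed = size_gib(mxfp4_params, MXFP4_BITS) + size_gib(bf16_params, BF16_BITS)

print(f"naive 8-bit estimate:        {naive:.1f} GiB")
print(f"mixed MXFP4 + BF16 estimate: {mixed:.1f} GiB")
```

If most of the weights sit at about 4.25 bits instead of 8, the file ends up much smaller than the flat 8-bit estimate, even though the remaining weights cost 16 bits each.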
