---
language: en
license: apache-2.0
tags:
- mlx
- quantized
- 8-bit
- diffusion
- wedlm
library_name: mlx
pipeline_tag: text-generation
base_model: tencent/WeDLM-8B-Instruct
model_type: wedlm
quantized_by: zimengxiong
model_relations:
- type: quantization
  base_model_id: tencent/WeDLM-8B-Instruct
---

# WeDLM-8B-Instruct-MLX-8bit

This is an 8-bit quantized [MLX](https://github.com/ml-explore/mlx) version of [tencent/WeDLM-8B-Instruct](https://huggingface.co/tencent/WeDLM-8B-Instruct) for efficient inference on Apple Silicon.

It currently does not work well and does not provide a meaningful speedup, due to the lack of precompilation.

Conversion and inference code: https://github.com/ZimengXiong/WeDLM-MLX/tree/main

## Related Models

| Variant | HuggingFace |
|---------|-------------|
| 4-bit | [zimengxiong/WeDLM-8B-Instruct-MLX-4bit](https://huggingface.co/zimengxiong/WeDLM-8B-Instruct-MLX-4bit) |
| 8-bit (this model) | [zimengxiong/WeDLM-8B-Instruct-MLX-8bit](https://huggingface.co/zimengxiong/WeDLM-8B-Instruct-MLX-8bit) |
| fp16 | [zimengxiong/WeDLM-8B-Instruct-MLX](https://huggingface.co/zimengxiong/WeDLM-8B-Instruct-MLX) |

## License

This model inherits the license from the base model [tencent/WeDLM-8B-Instruct](https://huggingface.co/tencent/WeDLM-8B-Instruct).
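
## Usage

A minimal sketch of loading the model with the standard `mlx-lm` Python API is shown below. This is an assumption, not a tested recipe: because `model_type` is the custom `wedlm` (a diffusion LM), stock `mlx-lm` may not recognize the architecture, in which case the inference code from the WeDLM-MLX repository linked above is required instead.

```python
# Hedged sketch: generic mlx-lm loading. The custom "wedlm" model type may
# not be supported by stock mlx-lm; if loading fails, use the code from
# https://github.com/ZimengXiong/WeDLM-MLX instead.
from mlx_lm import load, generate

# Download and load the 8-bit quantized weights from the Hugging Face Hub.
model, tokenizer = load("zimengxiong/WeDLM-8B-Instruct-MLX-8bit")

# Format the prompt with the model's chat template (this is an instruct model).
messages = [{"role": "user", "content": "Explain diffusion language models in one paragraph."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate a completion.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
```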