# HY-MT1.5-1.8B-MNN
This is a 4-bit quantized MNN version of Tencent's HY-MT1.5-1.8B translation model, optimized for Apple Silicon (iOS/macOS) edge inference.
## Model Description
HY-MT1.5-1.8B is a lightweight version of the HY-MT1.5 series, specifically designed for edge devices:
- 36 Language Support: Extended language coverage
- Edge Optimized: Designed for mobile and edge deployment
- Terminology Intervention: Custom terminology control during translation
- Context-Aware Translation: Improved accuracy with context understanding
- Industry-Leading Performance: Best-in-class for its parameter size
## Quantization Details
| Property | Value |
|---|---|
| Original Model | tencent/HY-MT1.5-1.8B |
| Original Size | ~3.8 GB |
| Quantized Size | 1.07 GB |
| Size Reduction | ~72% (3.8 GB → 1.07 GB) |
| Quantization Type | 4-bit (q4_k_m) |
| Block Size | 64 |
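The on-disk size above can be sanity-checked with a back-of-envelope estimate: 4 bits per weight plus per-block metadata. The assumption of one fp16 scale and one fp16 zero point per 64-weight block is a simplification of how q4_k_m actually lays out data, so treat this as a rough cross-check, not the exact format.

```python
# Rough size estimate for 4-bit block quantization (block size 64).
# Assumes one fp16 scale + one fp16 zero point per block; the real
# q4_k_m layout differs in detail, so this is only a sanity check.
def quantized_size_gb(n_params: float, bits: int = 4, block: int = 64) -> float:
    overhead_bits = (16 + 16) / block      # metadata amortized per weight
    total_bits = n_params * (bits + overhead_bits)
    return total_bits / 8 / 1e9            # decimal GB

est = quantized_size_gb(1.8e9)
print(f"~{est:.2f} GB")  # lands near the 1.07 GB observed on disk
```

The small gap to the actual 1.07 GB is plausibly explained by layers kept at higher precision (e.g. embeddings), which this sketch ignores.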
## Hardware Acceleration
Optimized for Apple Silicon with:
- ✅ INT8 Dot Product (i8sdot)
- ✅ FP16 Operations
- ✅ INT8 Matrix Multiply (i8mm)
- ✅ Scalable Matrix Extension 2 (sme2)
- ✅ Metal GPU Acceleration
- ✅ Apple Neural Engine (ANE) compatible
## Files
```
├── llm.mnn            # Model structure (576 KB)
├── llm.mnn.weight     # Quantized weights (1.07 GB)
├── tokenizer.txt      # Tokenizer vocabulary
├── llm_config.json    # MNN runtime config
├── config.json        # Model config
├── model_info.json    # Model metadata
└── export_args.json   # Conversion parameters
```
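Before pointing a runtime at the model directory, it can be worth verifying that all seven files listed above made it through the download. A minimal check (the helper name is mine; the file names come from the listing):

```python
import os

# Files expected in the model directory, per the listing above.
EXPECTED = [
    "llm.mnn", "llm.mnn.weight", "tokenizer.txt",
    "llm_config.json", "config.json", "model_info.json", "export_args.json",
]

def missing_files(model_dir: str) -> list[str]:
    """Return the expected files that are absent from model_dir."""
    return [f for f in EXPECTED if not os.path.exists(os.path.join(model_dir, f))]
```

An empty return value means the directory is complete; `llm.mnn.weight` is the file most commonly truncated by an interrupted download, so checking its size against 1.07 GB is also worthwhile.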
## Usage
### With MNN LLM Demo
```bash
# Clone MNN and build llm_demo
git clone https://github.com/alibaba/MNN.git
cd MNN && mkdir build && cd build
cmake .. -DMNN_BUILD_LLM=ON -DMNN_LOW_MEMORY=ON
make -j8 llm_demo

# Run inference from the model directory (llm_demo is built in MNN/build)
cd /path/to/HY-MT1.5-1.8B-MNN
/path/to/MNN/build/llm_demo ./
```
### Example
User: Translate into English: 今天天氣很好
Assistant: The weather is very nice today.
## Prompt Templates
```
# Basic translation
Translate into {language}:
{text}

# With terminology
Translate into {language}, using terms: {terms}
{text}

# With context
Context: {context}
Translate into {language}:
{text}
```
## Performance
| Metric | Value |
|---|---|
| Model Load Time | ~1 s |
| Inference Speed | 40-60 tokens/s |
| Target Device | iOS / Apple Silicon |
| Memory Usage | < 2 GB |
## iOS Integration
This model is ideal for iOS apps. Example using MNN iOS SDK:
```swift
import MNN

let llm = LLM(modelPath: "HY-MT1.5-1.8B-MNN")
let result = llm.generate("Translate into English: 今天天氣很好")
print(result) // "The weather is very nice today."
```
## Conversion Info
- Tool: MNN llmexport.py
- MNN Version: 3.0.0
- Conversion Date: 2025-12-31
- Source Format: HuggingFace safetensors
## Related Models
- HY-MT1.5-7B-MNN - Larger version for higher quality
- Hunyuan-MT-7B-MNN - Original WMT25 version
## Why Choose 1.8B?
| Feature | 1.8B | 7B |
|---|---|---|
| Size | 1.07 GB | 4.47 GB |
| Speed | 40-60 tok/s | 20-30 tok/s |
| iOS Compatible | ✅ Yes | ⚠️ Mac only |
| Quality | Good | Excellent |
- Choose **1.8B** for mobile apps, real-time translation, and resource-constrained devices.
- Choose **7B** for desktop apps, the highest translation quality, and batch processing.
## License
This model inherits the license from the original HY-MT1.5-1.8B model.
## Acknowledgments
- Tencent Hunyuan Team for the original model
- Alibaba MNN Team for the inference framework