HY-MT1.5-1.8B-MNN

This is a 4-bit quantized MNN version of Tencent's HY-MT1.5-1.8B translation model, optimized for Apple Silicon (iOS/macOS) edge inference.

Model Description

HY-MT1.5-1.8B is the lightweight member of the HY-MT1.5 series, designed specifically for edge devices:

  • 36-Language Support: Translation across 36 languages
  • Edge Optimized: Designed for mobile and edge deployment
  • Terminology Intervention: Custom terminology control during translation
  • Context-Aware Translation: Improved accuracy through context understanding
  • Industry-Leading Performance: Best-in-class translation quality for its parameter size

Quantization Details

Property            Value
------------------  ----------------------
Original Model      tencent/HY-MT1.5-1.8B
Original Size       ~3.8 GB
Quantized Size      1.07 GB
Size Reduction      ~72% (1 - 1.07/3.8)
Quantization Type   4-bit (q4_k_m)
Block Size          64

Hardware Acceleration

Optimized for Apple Silicon with:

  • ✅ INT8 Dot Product (i8sdot)
  • ✅ FP16 Operations
  • ✅ INT8 Matrix Multiply (i8mm)
  • ✅ Scalable Matrix Extension 2 (sme2)
  • ✅ Metal GPU Acceleration
  • ✅ Apple Neural Engine (ANE) compatible
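
These capabilities are selected through the runtime config rather than in application code. A minimal config.json sketch (key names follow MNN's documented LLM runtime options; treat them as assumptions and check them against your MNN version):

{
  "llm_model": "llm.mnn",
  "llm_weight": "llm.mnn.weight",
  "backend_type": "metal",
  "thread_num": 4,
  "precision": "low",
  "memory": "low"
}

Setting "backend_type" to "metal" routes compute to the GPU, while "precision": "low" allows FP16 kernels where the hardware supports them; a "cpu" backend with low precision instead exercises the i8sdot/i8mm paths. JSON permits no comments, so the caveat lives here: verify these keys against your MNN release.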

Files

├── llm.mnn              # Model structure (576 KB)
├── llm.mnn.weight       # Quantized weights (1.07 GB)
├── tokenizer.txt        # Tokenizer vocabulary
├── llm_config.json      # MNN runtime config
├── config.json          # Model config
├── model_info.json      # Model metadata
└── export_args.json     # Conversion parameters

Usage

With MNN LLM Demo

# Clone MNN and build llm_demo
git clone https://github.com/alibaba/MNN.git
cd MNN && mkdir build && cd build
cmake .. -DMNN_BUILD_LLM=ON -DMNN_LOW_MEMORY=ON
make -j8 llm_demo

# Run inference (llm_demo takes the path to the model's runtime config)
cd /path/to/HY-MT1.5-1.8B-MNN
/path/to/MNN/build/llm_demo ./config.json
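
llm_demo can also read prompts from a text file passed as a second argument, which is convenient for batch translation; the exact CLI has varied across MNN releases, so check the demo's usage output for your build:

# Batch mode: one prompt per line in prompt.txt (hypothetical file name)
/path/to/MNN/build/llm_demo ./config.json prompt.txt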

Example

User: Translate into English: 今天天氣很好
A: The weather is very nice today.

Prompt Templates

# Basic translation
Translate into {language}:
{text}

# With terminology
Translate into {language}, using terms: {terms}
{text}

# With context
Context: {context}
Translate into {language}:
{text}
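
As a concrete illustration, here is the terminology template filled in with a made-up glossary entry that pins the rendering of a product name:

Translate into English, using terms: 混元=Hunyuan
混元模型支持36種語言

With the constraint applied, the output keeps the pinned term, e.g. "The Hunyuan model supports 36 languages."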

Performance

Metric            Value
----------------  --------------------
Model Load Time   ~1 s
Inference Speed   40-60 tokens/s
Target Device     iOS / Apple Silicon
Memory Usage      < 2 GB

iOS Integration

This model is well suited to iOS apps. An illustrative example using an MNN-based Swift wrapper (class and method names are indicative rather than a verified MNN API):

import MNN

// Load the exported model bundle (the directory containing llm.mnn,
// llm.mnn.weight, and config.json).
let llm = LLM(modelPath: "HY-MT1.5-1.8B-MNN")

// Run a single request using the basic prompt template.
let result = llm.generate("Translate into English: 今天天氣很好")
print(result) // "The weather is very nice today."

Conversion Info

  • Tool: MNN llmexport.py
  • MNN Version: 3.0.0
  • Conversion Date: 2025-12-31
  • Source Format: HuggingFace safetensors
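
The export can be reproduced with MNN's llmexport.py. A sketch of the invocation (flag names are assumptions based on recent MNN llmexport versions; confirm them with python llmexport.py --help):

# Hypothetical invocation; verify flag names against your MNN checkout
python llmexport.py \
  --path tencent/HY-MT1.5-1.8B \
  --export mnn \
  --quant_bit 4 \
  --quant_block 64 \
  --dst_path HY-MT1.5-1.8B-MNN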

Related Models

Why Choose 1.8B?

Feature          1.8B          7B
---------------  ------------  ------------
Size             1.07 GB       4.47 GB
Speed            40-60 tok/s   20-30 tok/s
iOS Compatible   ✅ Yes        ⚠️ Mac only
Quality          Good          Excellent

Choose 1.8B for: Mobile apps, real-time translation, resource-constrained devices

Choose 7B for: Desktop apps, highest translation quality, batch processing

License

This model inherits the license from the original HY-MT1.5-1.8B model.

Acknowledgments

Thanks to Tencent for the original HY-MT1.5-1.8B model and to the Alibaba MNN team for the inference framework and export tooling.
