# 🎵 Tamil AI DJ Radio - MLX Optimized

**Blazing-fast Qwen 2.5-0.5B for Apple Silicon (M1/M2/M3/M4)**
MLX-optimized model for generating energetic Tanglish (Tamil-English) radio DJ commentary. Fused LoRA weights for maximum performance on Apple Silicon.
## 🎯 Model Overview
- **Base Model:** Qwen/Qwen2.5-0.5B-Instruct (4-bit quantized)
- **Model Type:** MLX fused (LoRA weights merged)
- **Training Data:** 5,027 Tanglish DJ commentary examples
- **Best Checkpoint:** Iteration 2900 (validation loss: 1.856)
- **Model Size:** 276MB
- **Framework:** MLX (Apple Silicon optimized)
## ⚡ Performance

### Speed (M1 Mac)
- **Loading:** ~2 seconds
- **Inference:** ~3 seconds for 150 tokens
- **Memory:** <2GB RAM
- **Latency:** ~20ms per token
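
These numbers are straightforward to sanity-check on your own machine. Below is a minimal timing sketch (repo id taken from this card; exact figures will vary with hardware and mlx-lm version):

```python
import time
from mlx_lm import load, generate

model, tokenizer = load("felixmanojh/DJ-AI-Radio-MLX")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hype up a party track"}],
    tokenize=False,
    add_generation_prompt=True,
)

# Time a single 150-token generation, then derive per-token latency.
start = time.perf_counter()
response = generate(model, tokenizer, prompt=prompt, max_tokens=150, verbose=False)
elapsed = time.perf_counter() - start

n_tokens = len(tokenizer.encode(response))
print(f"{elapsed:.2f}s total, ~{1000 * elapsed / max(n_tokens, 1):.0f}ms per token")
```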
### Why MLX?
- 🚀 3-5x faster than Transformers on Mac
- 💾 Lower memory usage with unified memory
- 🔋 Better power efficiency on Apple Silicon
- 🎯 Native Metal acceleration
## 🚀 Quick Start

### Installation

```bash
pip install mlx mlx-lm
```
### Simple Usage

```python
from mlx_lm import load, generate

# Load the MLX-optimized model
model, tokenizer = load("felixmanojh/DJ-AI-Radio-MLX")

# Generate DJ commentary
messages = [
    {"role": "system", "content": "You are a Tamil AI radio DJ who speaks energetic Tanglish."},
    {"role": "user", "content": "Hype up a party track"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=150, verbose=False)
print(response)
```
**Example Output:**

```text
Party mode activate! Friday night ah Saturday night mode activate!
Club vibes high-energy vibes! Dance floor crowded! Everyone jumping!
Party starter! Energy maximum! Music energizing!
```
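
To trade energy off against coherence, you can adjust the sampling settings. A sketch assuming a recent mlx-lm release, which exposes `make_sampler` (older releases accepted `temp=`/`top_p=` directly on `generate`):

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("felixmanojh/DJ-AI-Radio-MLX")

prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": "You are a Tamil AI radio DJ who speaks energetic Tanglish."},
        {"role": "user", "content": "Introduce a chill beach track"},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

# Higher temperature -> wilder commentary; lower -> safer but more repetitive.
sampler = make_sampler(temp=0.8, top_p=0.9)
response = generate(model, tokenizer, prompt=prompt, max_tokens=150, sampler=sampler)
print(response)
```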
## 💡 Why Use MLX Version?

### ✅ Advantages
| Feature | MLX | Transformers |
|---|---|---|
| Speed (M1 Mac) | ~3s | ~10s |
| Memory | <2GB | ~4GB |
| Loading | 2s | 10s |
| Platform | macOS only | Cross-platform |
| Power Usage | Low | Higher |
### Use This Version For
- 🍎 Mac development (M1/M2/M3/M4)
- 🚀 Local inference with maximum speed
- 💻 Demos and prototypes on Apple Silicon
- 🔬 Research with fast iteration cycles
- 📱 macOS apps with native performance
## 📊 Training Details
| Parameter | Value |
|---|---|
| LoRA Rank | 8 (fused into base) |
| LoRA Alpha | 16 |
| Training Iterations | 6,000 (best @ 2900) |
| Validation Loss | 1.856 |
| Training Data | 5,027 examples, 68 themes |
| Framework | MLX (Apple) |
## 📚 Supported Vibes

The model generates commentary for a range of moods (a looping example follows these lists):

**High Energy:**
- 🎉 Party anthems
- 💪 Workout motivation
- 🎮 Gaming streams
**Chill Vibes:**
- 🌊 Beach relaxation
- 📚 Study focus
- 🌧️ Rain moods
**Themed:**
- 🎬 Movie/Cinema vibes
- 🍜 Street food sessions
- 🚗 Road trips
- 🎆 Festival celebrations
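
Any of these moods can be requested with a short instruction in the user turn. A minimal sketch looping over a few hypothetical vibe prompts:

```python
from mlx_lm import load, generate

model, tokenizer = load("felixmanojh/DJ-AI-Radio-MLX")
system = "You are a Tamil AI radio DJ who speaks energetic Tanglish."

# Hypothetical vibe prompts -- phrase any mood as a short instruction.
vibes = [
    "Hype up a workout track",
    "Introduce a rainy-day chill song",
    "Kick off a road trip playlist",
]

for vibe in vibes:
    prompt = tokenizer.apply_chat_template(
        [{"role": "system", "content": system}, {"role": "user", "content": vibe}],
        tokenize=False,
        add_generation_prompt=True,
    )
    print(f"--- {vibe} ---")
    print(generate(model, tokenizer, prompt=prompt, max_tokens=100, verbose=False))
```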
## 🎓 Intended Use

### ✅ Recommended
- Local development on Mac
- Fast prototyping and demos
- macOS applications
- Real-time commentary generation (see the streaming sketch below)
- Educational demonstrations
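
For real-time use, mlx-lm can stream tokens as they are produced instead of returning the full string at once. A sketch assuming a recent mlx-lm release, where `stream_generate` yields response objects with a `.text` field (older releases yielded plain strings):

```python
from mlx_lm import load, stream_generate

model, tokenizer = load("felixmanojh/DJ-AI-Radio-MLX")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hype up a festival anthem"}],
    tokenize=False,
    add_generation_prompt=True,
)

# Print each chunk as soon as it arrives, e.g. to feed a TTS pipeline.
for chunk in stream_generate(model, tokenizer, prompt, max_tokens=150):
    print(chunk.text, end="", flush=True)
print()
```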
### ❌ Not Recommended
- Linux/Windows deployment (use the merged model instead)
- Production servers (use the Transformers version instead)
- Non-Mac platforms
## 🔄 Model Variants

**📌 Choose Your Format:**
| Model | Format | Size | Platform | Speed |
|---|---|---|---|---|
| DJ-AI-Radio-LoRA | LoRA adapter | 17MB | Any | Medium |
| DJ-AI-Radio | Merged (HF) | 276MB | Any | Medium |
| DJ-AI-Radio-MLX (this) | Fused (MLX) | 276MB | Mac only | Fast |
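
If you need the cross-platform path, the merged variant loads with standard `transformers`. A minimal sketch (repo id `felixmanojh/DJ-AI-Radio` assumed from the table above; requires `pip install transformers torch`):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("felixmanojh/DJ-AI-Radio")
tokenizer = AutoTokenizer.from_pretrained("felixmanojh/DJ-AI-Radio")

messages = [{"role": "user", "content": "Hype up a party track"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

# Decode only the newly generated tokens, skipping the prompt.
outputs = model.generate(inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```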
## 🌐 Live Demo

Try the complete AI radio station:

**Features:**
- LLM-generated commentary (this model's merged version)
- Voice cloning with Coqui XTTS
- AI-generated music playback
## 📝 Citation

```bibtex
@software{tamil_ai_dj_radio_mlx_2025,
  author = {Felix Manojh},
  title = {Tamil AI DJ Radio - MLX Optimized},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/felixmanojh/DJ-AI-Radio-MLX},
  note = {MLX-optimized Qwen 2.5-0.5B for Tanglish DJ commentary}
}
```
## 📄 License
Apache 2.0 (inherits from Qwen 2.5)
## 🙏 Acknowledgments
- **Base Model:** Qwen Team (Alibaba Cloud)
- **MLX Framework:** Apple ML Research
- **Training Data:** Claude API (Anthropic)
- **Inspiration:** Tamil music culture
Built with ❤️ for the Tamil-speaking community