Devstral 2 Collection A couple of agentic LLMs for software engineering tasks, excelling at using tools to explore codebases, edit multiple files, and power SWE Agents. • 3 items • Updated 27 days ago • 38
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 135
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 81
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular 18 days ago • 94
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published 20 days ago • 99
VTP Collection Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated 20 days ago • 39
Article Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models 21 days ago • 104
Article Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models Nov 19, 2025 • 33