✨ Big wave of foundation models: still scaling, but efficiency, reasoning, and deployment now matter more than size
- DeepSeek-V3.2
- Z.ai GLM-4.7
- MiniMax-M2.1
- Xiaomi: MiMo-V2-Flash
✨ Multimodal reasoning is now the default
- Z.ai GLM-4.6V
- Z.ai AutoGLM-Phone 9B
- Bytedance: Dolphin-v2
Only a year into open source, MiniMax is already making a great impact, not only through solid models and products but also through how well the team uses community platforms like Hugging Face: HF Teams, blogs, Daily Papers, Spaces as project pages, and constant experimentation with new ways to engage. Super impressive!
The list of hands-on notebooks (some beginner-friendly!) to get started with fine-tuning using TRL keeps growing!!
• SFT
• GRPO
• Tool calling & agents
• RL environments with OpenEnv
• LLMs and VLMs
✨ Many run on FREE Colab, making it super easy to get started fast! (see the minimal SFT sketch below)
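For context, here's roughly what the SFT entry point in TRL looks like. This is a minimal sketch assuming a recent `trl` release; the model and dataset names are illustrative picks, chosen small enough to fit a free Colab GPU.

```python
# Minimal supervised fine-tuning (SFT) sketch with TRL.
# Model/dataset choices here are illustrative, not prescriptive.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# A small chat dataset used in the TRL docs/examples.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",              # ~0.5B params fits a free Colab GPU
    args=SFTConfig(output_dir="sft-qwen-capybara"),
    train_dataset=dataset,
)
trainer.train()
```

The notebooks linked in the post cover the same pattern plus GRPO, tool calling, and OpenEnv, swapping in the matching trainer class for each method.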
The Christmas holidays are here! 🎄 Thinking about learning something new in AI?
@huggingface offers 12 FREE courses covering all the relevant topics, at every level of experience. A great challenge for the holidays (and worth saving for later 🙄)
Following up on LLaDA 2.0, the paper is now out on Daily Papers 🔥 It has sparked a lot of discussion in the community for showing how discrete diffusion LLMs can scale to 100B parameters and run faster than traditional autoregressive (AR) models. LLaDA2.0: Scaling Up Diffusion Language Models to 100B (2512.15745)
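For intuition on why diffusion decoding can beat AR on speed: instead of emitting one token at a time, a masked diffusion LM starts from an all-[MASK] sequence and unmasks many positions per forward pass. Below is a toy sketch of that parallel-unmasking loop, not LLaDA's actual sampler; `model` is a hypothetical callable returning per-position logits.

```python
# Toy sketch of masked-diffusion decoding: commit the most confident
# predictions each pass, keep the rest masked, repeat for a few steps.
import torch

def diffusion_decode(model, length=32, steps=8, mask_id=0):
    # Start from a fully masked sequence.
    tokens = torch.full((1, length), mask_id, dtype=torch.long)
    for step in range(steps):
        logits = model(tokens)                        # (1, length, vocab_size)
        conf, pred = logits.softmax(-1).max(-1)       # best token per position
        still_masked = tokens == mask_id
        conf = conf.masked_fill(~still_masked, -1.0)  # only rank masked slots
        k = int(still_masked.sum()) // (steps - step) # unmask a share per pass
        idx = conf.topk(k, dim=-1).indices
        tokens.scatter_(1, idx, pred.gather(1, idx))  # commit confident tokens
    return tokens
```

With 8 passes over 32 positions, that's 8 forward calls instead of 32, which is where the speedup over token-by-token AR decoding comes from.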
Nvidia is on a roll lately. Nemotron 3 Nano is my new fav local model, but here's the real flex: they published the entire evaluation setup. Configs, prompts, logs, all of it. This is how you do open models 🔥
✨ Built from real enterprise data (Enron + financial institutions), not synthetic tasks
✨ Tests end-to-end finance workflows
✨ Multimodal & cross-file reasoning
✨ Expert-annotated (700+ hours) and genuinely challenging