AI & ML interests

None defined yet.

Recent Activity

nouamanetazi 
posted an update 2 months ago
view post
Post
4210
After training 𝐒𝐦𝐨𝐥𝐋𝐌𝟑 on 𝟑𝟖𝟒 𝐇𝟏𝟎𝟎𝐬 for nearly a month, I've come to realize something most people overlook: 𝐢𝐧𝐟𝐫𝐚𝐬𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞 𝐢𝐬 𝐭𝐡𝐞 𝐦𝐚𝐤𝐞-𝐨𝐫-𝐛𝐫𝐞𝐚𝐤 𝐟𝐚𝐜𝐭𝐨𝐫 𝐢𝐧 𝐋𝐋𝐌 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠. 🔥

Everyone talks about model architecture and data quality. And yes, those matter immensely. But here's what nobody tells you: when your training run fails at 2 AM because of mysterious 𝐍𝐂𝐂𝐋 𝐞𝐫𝐫𝐨𝐫𝐬, or when your expensive GPU cluster is running at 𝟔𝟎% 𝐞𝐟𝐟𝐢𝐜𝐢𝐞𝐧𝐜𝐲, the problem isn't your model. It's most probably a 𝐦𝐢𝐬𝐮𝐬𝐞 𝐨𝐟 𝐭𝐡𝐞 𝐡𝐚𝐫𝐝𝐰𝐚𝐫𝐞. 🛠️

Questions that seemed simple but had no clear answers: Why is 𝐌𝐨𝐄 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐬𝐥𝐨𝐰𝐞𝐫 𝐭𝐡𝐚𝐧 𝐝𝐞𝐧𝐬𝐞 𝐦𝐨𝐝𝐞𝐥𝐬? Which 𝐍𝐂𝐂𝐋 𝐟𝐥𝐚𝐠𝐬 should we actually set? How often should we checkpoint without killing throughput?

That's why we built 𝐓𝐡𝐞 𝐒𝐦𝐨𝐥 𝐓𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐏𝐥𝐚𝐲𝐛𝐨𝐨𝐤 📖: a complete guide covering everything from model architecture and data curation to the SmolLM3 training marathon, post-training techniques, and crucially, the 𝐢𝐧𝐟𝐫𝐚𝐬𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞 𝐥𝐚𝐲𝐞𝐫 that most teams get wrong.

We validated real vs theoretical bandwidth across the entire stack: 𝐇𝐁𝐌𝟑 𝐡𝐢𝐭𝐭𝐢𝐧𝐠 𝟑 𝐓𝐁/𝐬, 𝐍𝐕𝐋𝐢𝐧𝐤 𝟒.𝟎 𝐫𝐞𝐚𝐜𝐡𝐢𝐧𝐠 𝟕𝟖𝟔 𝐆𝐁/𝐬, 𝐏𝐂𝐈𝐞 𝐆𝐞𝐧𝟒 𝐚𝐭 𝟏𝟒.𝟐 𝐆𝐁/𝐬. Then we ran collective operations across 𝟏𝟐𝟖 𝐆𝐏𝐔𝐬 (16 nodes, 8xH100s each) and measured how performance degrades at scale: all-reduce drops from 𝟒𝟖𝟎 𝐆𝐁/𝐬 on a single node to 𝟑𝟐𝟎-𝟑𝟓𝟎 𝐆𝐁/𝐬 across 16 nodes.

If you've ever wondered why your training runs are slower than they should be, or you're planning to scale up and want to avoid expensive mistakes, this guide might save you weeks of debugging.

𝐓𝐡𝐞 𝐒𝐦𝐨𝐥 𝐓𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐏𝐥𝐚𝐲𝐛𝐨𝐨𝐤: https://lnkd.in/e5MKXUHS

Shared with ❤️ by the HuggingFace team
lbourdois 
posted an update 3 months ago
clem 
posted an update 5 months ago
clem 
posted an update 7 months ago
clem 
posted an update 8 months ago
view post
Post
7815
Today, we're unveiling two new open-source AI robots! HopeJR for $3,000 & Reachy Mini for $300 🤖🤖🤖

Let's go open-source AI robotics!
·
clem 
posted an update 8 months ago
view post
Post
3885
It's just become easier to share your apps on the biggest AI app store (aka HF spaces) for unlimited storage, more visibility and community interactions.

Just pick a React, Svelte, or Vue template when you create your space or add app_build_command: npm run build in your README's YAML and app_file: build/index.html in your README's YAML block.

Or follow this link: https://huggingface.co/new-space?sdk=static

Let's build!
  • 1 reply
·
clem 
posted an update 8 months ago
view post
Post
4083
Playing with Veo3 this morning. Share your prompt if you want me to create videos for you (bonus point if they funnily reference HF/open-source). These videos are "a cat on the moon rapping "I love Hugging Face""!
·
clem 
posted an update 8 months ago
view post
Post
3240
Very cool to see
pytorch
contributing on Hugging Face. Time to follow them to see what they're cooking!
  • 2 replies
·
clem 
posted an update 8 months ago
clem 
posted an update 8 months ago
view post
Post
4118
What are you using to evaluate models or AI systems? So far we're building lighteval & leaderboards on the hub but still feels early & a lot more to build. What would be useful to you?
·
clem 
posted an update 8 months ago
clem 
posted an update 8 months ago
view post
Post
1584
The
meta-llama
org just crossed 40,000 followers on Hugging Face. Grateful for all their impact on the field sharing the Llama weights openly and much more!

We need more of this from all other big tech to make the AI more open, collaborative and beneficial to all!
clem 
posted an update 9 months ago
view post
Post
4094
Energy is a massive constraint for AI but do you even know what energy your chatGPT convos are using?

We're trying to change this by releasing ChatUI-energy, the first interface where you see in real-time what energy your AI conversations consume. Great work from @jdelavande powered by spaces & TGI, available for a dozen of open-source models like Llama, Mistral, Qwen, Gemma and more.

jdelavande/chat-ui-energy

Should all chat interfaces have this? Just like ingredients have to be shown on products you buy, we need more transparency in AI for users!
  • 3 replies
·