VST Collection A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities. • 5 items • Updated Nov 12, 2025 • 6
Cosmos-Predict2 Collection ⚠️ This collection is archived. https://huggingface.co/collections/nvidia/cosmos-predict25 • 13 items • Updated 2 days ago • 33
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published Jan 7, 2025 • 81
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 23 items • Updated 15 days ago • 103
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 123
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated 15 days ago • 85
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129
Theia Collection Distilling Diverse Vision Foundation Models for Robot Learning • 6 items • Updated Sep 30, 2024 • 9
Article Metric and Relative Monocular Depth Estimation: An Overview. Fine-Tuning Depth Anything V2 • Jul 10, 2024 • 91
3D-VLA: A 3D Vision-Language-Action Generative World Model Paper • 2403.09631 • Published Mar 14, 2024 • 11
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 15 days ago • 62
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13, 2024 • 32
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks Paper • 2408.03615 • Published Aug 7, 2024 • 31
Achieving Human Level Competitive Robot Table Tennis Paper • 2408.03906 • Published Aug 7, 2024 • 28
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model Paper • 2312.13252 • Published Dec 20, 2023 • 27