QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models Paper • 2512.19526 • Published 17 days ago • 11
MatSpray: Fusing 2D Material World Knowledge on 3D Geometry Paper • 2512.18314 • Published 19 days ago • 8
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 20 days ago • 25
MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments Paper • 2512.19432 • Published 17 days ago • 12
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 21 days ago • 111
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published 15 days ago • 53