OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification Paper • 2512.10756 • Published 30 days ago • 34
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published 30 days ago • 46
Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning Paper • 2512.10534 • Published about 1 month ago • 31
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 259
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 259
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning Paper • 2507.16814 • Published Jul 22, 2025 • 21
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning Paper • 2507.16814 • Published Jul 22, 2025 • 21
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26, 2025 • 104
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26, 2025 • 104
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning Paper • 2504.08672 • Published Apr 11, 2025 • 55
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning Paper • 2504.08672 • Published Apr 11, 2025 • 55
$φ$-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation Paper • 2503.13288 • Published Mar 17, 2025 • 51
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation Paper • 2503.13288 • Published Mar 17, 2025 • 51
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published Dec 27, 2024 • 87