PromptBridge: Cross-Model Prompt Transfer for Large Language Models Paper • 2512.01420 • Published Dec 1, 2025 • 9
M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark Paper • 2511.17729 • Published Nov 21, 2025 • 16
R-WoM: Retrieval-augmented World Model For Computer-use Agents Paper • 2510.11892 • Published Oct 13, 2025 • 21
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29, 2025 • 140
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning Paper • 2509.22576 • Published Sep 26, 2025 • 134
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28, 2025 • 63
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_nocl_global_step_100 8B • Updated May 26, 2025 • 2
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_cl_global_step_100 8B • Updated May 25, 2025 • 3
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_nocl_global_step_50 8B • Updated May 24, 2025 • 4
ztwang/Qwen2.5-Coder-7B_combined_logic_longseq_combinedcodecontests_cl_global_step_50 8B • Updated May 24, 2025 • 6