pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-planning 8B • Updated 2 days ago • 9
pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-reasoning-strategies 8B • Updated 2 days ago • 18
pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-self-correct 8B • Updated 2 days ago • 8
Nemotron-Cascade (Collection) • Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 18 items • Updated 6 days ago • 44
pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-from-rl 8B • Updated 23 days ago • 13
pittawat/qwen2.5-14b-instruct-still-3-1k-grpo-with-length-0.1-cot-prompt-v6 15B • Updated Dec 3, 2025 • 3
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-124 8B • Updated Dec 3, 2025 • 3
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-93 8B • Updated Dec 3, 2025 • 6
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-62 8B • Updated Dec 3, 2025 • 3