JameSand/qwen3-1.7b-base-svd-muon-adam-1e-6-bs128-kl0.0-global_step_200 2B • Updated about 5 hours ago
JameSand/llama-muon-muonlr1e-4-spectral_norm-muonadamlr1e-6-20260110_005142-global_step_200 4B • Updated 13 days ago • 7
JameSand/Llama-3.2-3B-Instruct-muon-2e-2-muonadamlr1e-6-muonadjustlrNone-iter_0000200 Text Generation • 3B • Updated 23 days ago • 10
JameSand/Llama-3.2-3B-Instruct-muon-2e-2-muonadamlr1e-6-muonadjustlrrms_norm-iter_0000200 Text Generation • 3B • Updated 23 days ago • 10