lsteno/Qwen3-4B-Instruct-2507-RLM-RLVR-depth2-recursive-r64-a128-lr1e-5-adapter Reinforcement Learning • Updated 9 days ago • 15
lsteno/Qwen3-4B-Instruct-2507-RLM-RLVR-FullFT-lr5e-6-depth1-v1 Text Generation • 4B • Updated about 1 month ago • 146