-
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 124 -
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Paper • 2505.03318 • Published • 94 -
CodeGoat24/UnifiedReward-Think-qwen35-9b
9B • Updated • 121 -
CodeGoat24/UnifiedReward-Think-qwen35-27b
3.05M • Updated • 387
Collections
Collections including paper arxiv:2503.05236
-
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 124 -
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Paper • 2505.03318 • Published • 94 -
CodeGoat24/UnifiedReward-Think-7b
8B • Updated • 7 • 10 -
CodeGoat24/UnifiedReward-7b-v1.5
8B • Updated • 9.79k • 7
-
LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation
Paper • 2508.03694 • Published • 52 -
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
Paper • 2508.05629 • Published • 191 -
Improving Video Generation with Human Feedback
Paper • 2501.13918 • Published • 53 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 124
-
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 124 -
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Paper • 2505.03318 • Published • 94 -
mradermacher/UnifiedReward-qwen-32b-i1-GGUF
33B • Updated • 182 • 1 -
mradermacher/UnifiedReward-Think-qwen-7b-i1-GGUF
8B • Updated • 269
-
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
Paper • 2503.10615 • Published • 17 -
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
Paper • 2503.10630 • Published • 6 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 39 -
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
Paper • 2503.07536 • Published • 88
-
OmniGen2: Exploration to Advanced Multimodal Generation
Paper • 2506.18871 • Published • 78 -
OmniGen: Unified Image Generation
Paper • 2409.11340 • Published • 115 -
Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation
Paper • 2502.05415 • Published • 20 -
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Paper • 2408.12528 • Published • 51
-
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders
Paper • 2503.03601 • Published • 233 -
Transformers without Normalization
Paper • 2503.10622 • Published • 172 -
RWKV-7 "Goose" with Expressive Dynamic State Evolution
Paper • 2503.14456 • Published • 154 -
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Paper • 2503.11647 • Published • 148
-
RuCCoD: Towards Automated ICD Coding in Russian
Paper • 2502.21263 • Published • 133 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 124 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27