CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents • Paper • 2603.24440 • Published Mar 25 • 98
RLinf-USER: A Unified and Extensible System for Real-World Online Policy Learning in Embodied AI • Paper • 2602.07837 • Published Feb 8 • 57
RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation • Paper • 2509.15965 • Published Sep 19, 2025 • 19
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning • Paper • 2602.04634 • Published Feb 4 • 100
MARSHAL Collection • MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs • 🎉 Accepted by ICLR 2026 • 6 items • Updated Feb 14 • 2