Huining Yuan

HuiningYuan

AI & ML interests

Reinforcement learning, LLM Agents, World models

Recent Activity

updated a collection 1 day ago

VPR

Verifiable Process Rewards for Agentic Reasoning • 4 items • Updated 1 day ago

updated a model 3 days ago

nics-efc/VPR-Tic-Tac-Toe

Text Generation • 4B • Updated 3 days ago • 13

updated a collection 3 days ago

VPR

Verifiable Process Rewards for Agentic Reasoning • 4 items • Updated 1 day ago

published a model 3 days ago

nics-efc/VPR-Tic-Tac-Toe

Text Generation • 4B • Updated 3 days ago • 13

updated a model 3 days ago

nics-efc/VPR-Sudoku

Text Generation • 4B • Updated 3 days ago • 13

updated a collection 3 days ago

VPR

Verifiable Process Rewards for Agentic Reasoning • 4 items • Updated 1 day ago

published a model 3 days ago

nics-efc/VPR-Sudoku

Text Generation • 4B • Updated 3 days ago • 13

updated a model 3 days ago

nics-efc/VPR-Minesweeper

Text Generation • 4B • Updated 3 days ago • 16

published a model 3 days ago

nics-efc/VPR-Minesweeper

Text Generation • 4B • Updated 3 days ago • 16

upvoted a paper about 2 months ago

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Paper • 2603.24440 • Published Mar 25 • 98

updated 5 models 3 months ago

nics-efc/MARSHAL-Mini-Hanabi-Qwen3-4B

Text Generation • 4B • Updated Feb 11 • 6

nics-efc/MARSHAL-Kuhn-Poker-Qwen3-4B

Text Generation • 4B • Updated Feb 11 • 62

nics-efc/MARSHAL-Tic-Tac-Toe-Qwen3-4B

Text Generation • 4B • Updated Feb 11 • 14

nics-efc/MARSHAL-Generalist-Qwen3-8B

Text Generation • 8B • Updated Feb 11 • 9

nics-efc/MARSHAL-Generalist-Qwen3-4B

Text Generation • 4B • Updated Feb 11 • 32

upvoted 3 papers 3 months ago

RLinf-USER: A Unified and Extensible System for Real-World Online Policy Learning in Embodied AI

Paper • 2602.07837 • Published Feb 8 • 57

RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation

Paper • 2509.15965 • Published Sep 19, 2025 • 19

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Paper • 2602.04634 • Published Feb 4 • 100

upvoted a collection 5 months ago

MARSHAL

MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs 🎉 Accepted by ICLR 2026 • 6 items • Updated Feb 14 • 2