DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents Paper • 2605.04808 • Published 14 days ago • 20
DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents Paper • 2605.04808 • Published 14 days ago • 20
PRBench: End-to-end Paper Reproduction in Physics Research Paper • 2603.27646 • Published Mar 29 • 29
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning Paper • 2602.10090 • Published Feb 10 • 53
Grounding and Enhancing Informativeness and Utility in Dataset Distillation Paper • 2601.21296 • Published Jan 29 • 19
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published Jan 27 • 81
Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning Paper • 2511.19900 • Published Nov 25, 2025 • 49
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published Nov 20, 2025 • 110
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 60
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation Paper • 2511.01163 • Published Nov 3, 2025 • 32