    Repositories list

    • Python
      Apache License 2.0
      0000 Updated May 6, 2026
    • AcademiClaw: When Students Set Challenges for AI Agents — a bilingual benchmark of 80 university student-sourced academic tasks.
      C
      Other
      11410 Updated May 5, 2026
    • AlphaEval

      Public
      Python
      MIT License
      03400 Updated May 4, 2026
    • SII-CLI

      Public
      03510 Updated May 1, 2026
    • lab-site

      Public
      Website of GAIR Lab
      JavaScript
      0100 Updated Apr 17, 2026
    • Python
      Apache License 2.0
      15264600 Updated Apr 17, 2026
    • [ACL 2026] This is the repo of Data Darwinism.
      12500 Updated Apr 16, 2026
    • Python
      1982k212 Updated Apr 11, 2026
    • Apache License 2.0
      1214400 Updated Mar 31, 2026
    • OpenSWE

      Public
      Python
      Other
      1617900 Updated Mar 16, 2026
    • Python
      22900 Updated Mar 15, 2026
    • Med

      Public
      [ICML 2026] What Does Vision Tool-Use Reinforcement Learning Really Learn? Disentangling Tool-Induced and Intrinsic Effects for Crop-and-Zoom
      Python
      Other
      01900 Updated Mar 10, 2026
    • daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently
      Python
      MIT License
      33910 Updated Feb 4, 2026
    • [ICLR 2026] InnovatorBench: Evaluating Agents' Ability to Conduct Innovative LLM Research
      Jupyter Notebook
      Apache License 2.0
      11700 Updated Feb 3, 2026
    • MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
      Python
      Apache License 2.0
      712000 Updated Feb 2, 2026
    • Go
      Apache License 2.0
      65511 Updated Jan 31, 2026
    • [ICLR 2026] SR-Scientist: Scientific Equation Discovery With Agentic AI
      Python
      04300 Updated Jan 27, 2026
    • [ACL 2026 Main] AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts
      Python
      MIT License
      48150 Updated Jan 23, 2026
    • Python
      0100 Updated Jan 20, 2026
    • ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry
      Python
      44921 Updated Jan 5, 2026
    • LiveTalk

      Public
      Python
      Other
      2830390 Updated Jan 2, 2026
    • ASI-Arch

      Public
      AlphaGo Moment for Model Architecture Discovery.
      Python
      Apache License 2.0
      2111.2k90 Updated Dec 3, 2025
    • 2429310 Updated Nov 6, 2025
    • Scaling Deep Research via Reinforcement Learning in Real-world Environments.
      Python
      Apache License 2.0
      5274791 Updated Oct 15, 2025
    • LIMI

      Public
      LIMI: Less is More for Agency
      Python
      716160 Updated Oct 14, 2025
    • [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
      Python
      48020 Updated Oct 9, 2025
    • DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery
      Python
      Apache License 2.0
      02000 Updated Sep 24, 2025
    • Python
      MIT License
      1300 Updated Sep 9, 2025
    • [ICLR 2026] Efficient Agent Training for Computer Use
      Python
      MIT License
      814200 Updated Sep 5, 2025
    • LIMO

      Public
      [COLM 2025] LIMO: Less is More for Reasoning
      Python
      551.1k60 Updated Jul 30, 2025