@gso-bench

GSO

GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

GSO (Global Software Optimization) is a benchmark of over 100 optimization tasks across codebases and languages.
Agents are tasked with optimizing software against precise performance tests, and their results are judged against expert developer commits.
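
As a rough illustration of what judging an agent against an expert commit could look like, the sketch below compares an agent's speedup over the unoptimized baseline with the expert's speedup. This is not the GSO harness: the helper names (`measure_seconds`, `speedup`, `matches_expert`) and the `fraction` threshold are hypothetical and chosen only for this example.

```python
import timeit
from typing import Callable


def measure_seconds(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the best-of-`repeats` wall-clock time for a single call of `fn`."""
    return min(timeit.repeat(fn, number=1, repeat=repeats))


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Speedup of a candidate over the baseline (values above 1.0 mean faster)."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / candidate_seconds


def matches_expert(
    baseline_seconds: float,
    agent_seconds: float,
    expert_seconds: float,
    fraction: float = 0.95,  # hypothetical threshold, not taken from GSO
) -> bool:
    """True if the agent reaches at least `fraction` of the expert's speedup."""
    agent_gain = speedup(baseline_seconds, agent_seconds)
    expert_gain = speedup(baseline_seconds, expert_seconds)
    return agent_gain >= fraction * expert_gain
```

In practice, the timings would come from running the same performance test against the baseline code, the agent's patch, and the expert commit, for example with `measure_seconds(lambda: workload())` where `workload` is whatever the test exercises.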

This organization contains the source code for the GSO benchmark, including:

  • GSO, a benchmark for evaluating AI systems on real-world software optimization tasks.
  • Experiments, execution logs, trajectories, and results from evaluation runs on GSO.
  • Example Usage, an example of running GSO with your own agent (in this case, OpenHands) to generate GSO solutions.

Pinned

  1. gso (Public)

    [NeurIPS '25] GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

    Python · 62 stars · 3 forks

  2. gso-experiments (Public)

    Open-source execution logs, trajectories, and results from evaluation runs on GSO

    Python · 1 star

Repositories

Showing 5 of 5 repositories
