This organization contains the source code for the GSO benchmark, including:
- GSO, a benchmark for evaluating AI systems on real-world GitHub issues.
- Experiments, execution logs, trajectories, and results from evaluation runs on GSO.
- Example Usage, a walkthrough of running your own agent (in this case OpenHands) on GSO to generate solutions.