A static website generator for comparing different LLM models and their benchmark scores.
- Responsive layout with filtering options
- Support for multiple benchmark types (LMSYS, MMLU, MT-Bench, etc.)
- Easy deployment to GitHub Pages
- Local development server
- Automated data updates via GitHub Actions
To get started, clone the repository and install the package:

```bash
# Clone the repository
git clone https://github.com/abhigna/llmbench.git
cd llmbench

# Install the package in development mode
pip install -e .
```

To run a local development server:
```bash
# Build the static site
llmbench build

# Start a local server
llmbench serve
```

This will build the site into the `./docs` directory and serve it at http://localhost:5000.
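For a rough idea of what the build step involves, the sketch below copies the page template, static assets, and JSON data into `./docs` so they can be served as plain files, with filtering handled client-side by `static/js/main.js`. It is a simplified illustration based on the project layout described later in this README, not the actual `llmbench build` implementation.

```python
# Simplified sketch only -- the real `llmbench build` command may work differently.
# Assumes the static/, templates/, data/, and docs/ layout described in this README.
import shutil
from pathlib import Path


def build_site(root: Path = Path(".")) -> None:
    """Copy the page template, static assets, and benchmark data into ./docs."""
    docs = root / "docs"
    docs.mkdir(exist_ok=True)

    # The main page is copied as-is; filtering and data loading happen in the browser.
    shutil.copy(root / "templates" / "index.html", docs / "index.html")

    # Copy CSS/JS and the JSON benchmark data alongside the page so the
    # client-side code can fetch them with relative paths.
    for subdir in ("static", "data"):
        shutil.copytree(root / subdir, docs / subdir, dirs_exist_ok=True)


if __name__ == "__main__":
    build_site()
```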
The package includes tools to update benchmark data:
```bash
# Update all benchmarks
update-benchmarks --benchmark all

# Update a specific benchmark
update-benchmarks --benchmark lmsys
```

To add a new benchmark:

- Create a new JSON file in the `data/` directory with your benchmark scores
- Update the benchmark selector in `templates/index.html`
- Add an update function in `src/llmbench/update.py` (see the sketch after this list)
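As a concrete illustration of the first and last steps, here is one way such an update function might look. The function name, source URL, JSON field names, and the `data/mybenchmark.json` output path are all placeholders; mirror the existing functions in `src/llmbench/update.py` and the format of the existing files in `data/`.

```python
# Hypothetical example for src/llmbench/update.py -- the URL, field names, and
# output format are placeholders; follow the existing update functions instead.
import json
from pathlib import Path

import requests

DATA_DIR = Path("data")


def update_mybenchmark() -> None:
    """Fetch scores for a hypothetical 'mybenchmark' and write data/mybenchmark.json."""
    # Placeholder endpoint -- replace with the benchmark's real data source.
    response = requests.get("https://example.com/mybenchmark/leaderboard.json", timeout=30)
    response.raise_for_status()
    raw = response.json()

    # Assumed shape: a list of {model, score} records under a benchmark key.
    records = [{"model": entry["model_name"], "score": entry["score"]} for entry in raw]
    out_path = DATA_DIR / "mybenchmark.json"
    out_path.write_text(json.dumps({"benchmark": "mybenchmark", "results": records}, indent=2))
```

You would then register the new function with the `update-benchmarks` command so that `--benchmark all` (or `--benchmark mybenchmark`) picks it up.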
This project is configured to automatically deploy to GitHub Pages using GitHub Actions. The workflow will:
- Update benchmark data daily
- Build the static site
- Deploy to the `gh-pages` branch
You can also manually trigger a deployment from the Actions tab in your GitHub repository.
The repository is laid out as follows:

```
llmbench/
├── static/          # CSS, JS files
├── templates/       # HTML templates
├── data/            # JSON data files
├── docs/            # GitHub Pages deployment target
├── src/llmbench/    # Python package
├── setup.py         # Package setup script
└── README.md        # This file
```
To add a new model, update the `models.json` file in the `data/` directory with the model details.
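For instance, a small helper like the one below could append an entry; the field names (`name`, `provider`, `release_date`) are guesses, so match whatever schema the existing entries in `data/models.json` already use.

```python
# Hypothetical helper -- the field names are assumptions; copy the schema used by
# the existing entries in data/models.json (assumed here to be a top-level list).
import json
from pathlib import Path

MODELS_FILE = Path("data/models.json")


def add_model(name: str, provider: str, release_date: str) -> None:
    """Append a model entry to data/models.json, keeping existing entries intact."""
    models = json.loads(MODELS_FILE.read_text())
    models.append({"name": name, "provider": provider, "release_date": release_date})
    MODELS_FILE.write_text(json.dumps(models, indent=2))


if __name__ == "__main__":
    add_model("example-model-7b", "Example Labs", "2024-01-01")
```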
The site layout is defined in the following files:
- `templates/index.html` - Main layout structure
- `static/css/style.css` - Styling
- `static/js/main.js` - Filtering and data loading logic