A static website generator for comparing different LLM models and their benchmark scores.
- Responsive layout with filtering options
- Support for multiple benchmark types (LMSYS, MMLU, MT-Bench, etc.)
- Easy deployment to GitHub Pages
- Local development server
- Automated data updates via GitHub Actions
To get started, clone the repository and install the package:

```bash
# Clone the repository
git clone https://github.com/abhigna/llmbench.git
cd llmbench

# Install the package in development mode
pip install -e .
```

To run a local development server:
```bash
# Build the static site
llmbench build

# Start a local server
llmbench serve
```

This will build the site into the `./docs` directory and serve it at http://localhost:5000.
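For a rough idea of what the build step involves, the sketch below copies the page template, static assets, and JSON data into `./docs` so they can be served as plain files, with filtering handled client-side by `static/js/main.js`. It is a simplified illustration based on the project layout described later in this README, not the actual `llmbench build` implementation.

```python
# Simplified sketch only -- the real `llmbench build` command may work differently.
# Assumes the static/, templates/, data/, and docs/ layout described in this README.
import shutil
from pathlib import Path


def build_site(root: Path = Path(".")) -> None:
    """Copy the page template, static assets, and benchmark data into ./docs."""
    docs = root / "docs"
    docs.mkdir(exist_ok=True)

    # The main page is copied as-is; filtering and data loading happen in the browser.
    shutil.copy(root / "templates" / "index.html", docs / "index.html")

    # Copy CSS/JS and the JSON benchmark data alongside the page so the
    # client-side code can fetch them with relative paths.
    for subdir in ("static", "data"):
        shutil.copytree(root / subdir, docs / subdir, dirs_exist_ok=True)


if __name__ == "__main__":
    build_site()
```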
The package includes tools to update benchmark data:
```bash
# Update all benchmarks
update-benchmarks --benchmark all

# Update a specific benchmark
update-benchmarks --benchmark lmsys
```

To add a new benchmark:

- Create a new JSON file in the `data/` directory with your benchmark scores
- Update the benchmark selector in `templates/index.html`
- Add an update function in `src/llmbench/update.py` (see the sketch after this list)
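As a concrete illustration of the first and last steps, here is one way such an update function might look. The function name, source URL, JSON field names, and the `data/mybenchmark.json` output path are all placeholders; mirror the existing functions in `src/llmbench/update.py` and the format of the existing files in `data/`.

```python
# Hypothetical example for src/llmbench/update.py -- the URL, field names, and
# output format are placeholders; follow the existing update functions instead.
import json
from pathlib import Path

import requests

DATA_DIR = Path("data")


def update_mybenchmark() -> None:
    """Fetch scores for a hypothetical 'mybenchmark' and write data/mybenchmark.json."""
    # Placeholder endpoint -- replace with the benchmark's real data source.
    response = requests.get("https://example.com/mybenchmark/leaderboard.json", timeout=30)
    response.raise_for_status()
    raw = response.json()

    # Assumed shape: a list of {model, score} records under a benchmark key.
    records = [{"model": entry["model_name"], "score": entry["score"]} for entry in raw]
    out_path = DATA_DIR / "mybenchmark.json"
    out_path.write_text(json.dumps({"benchmark": "mybenchmark", "results": records}, indent=2))
```

You would then register the new function with the `update-benchmarks` command so that `--benchmark all` (or `--benchmark mybenchmark`) picks it up.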
This project is configured to automatically deploy to GitHub Pages using GitHub Actions. The workflow will:
- Update benchmark data daily
- Build the static site
- Deploy to the `gh-pages` branch
You can also manually trigger a deployment from the Actions tab in your GitHub repository.
The repository is laid out as follows:

```
llmbench/
├── static/          # CSS, JS files
├── templates/       # HTML templates
├── data/            # JSON data files
├── docs/            # GitHub Pages deployment target
├── src/llmbench/    # Python package
├── setup.py         # Package setup script
└── README.md        # This file
```
To add a new model, update the `models.json` file in the `data/` directory with the model details.
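For instance, a small helper like the one below could append an entry; the field names (`name`, `provider`, `release_date`) are guesses, so match whatever schema the existing entries in `data/models.json` already use.

```python
# Hypothetical helper -- the field names are assumptions; copy the schema used by
# the existing entries in data/models.json (assumed here to be a top-level list).
import json
from pathlib import Path

MODELS_FILE = Path("data/models.json")


def add_model(name: str, provider: str, release_date: str) -> None:
    """Append a model entry to data/models.json, keeping existing entries intact."""
    models = json.loads(MODELS_FILE.read_text())
    models.append({"name": name, "provider": provider, "release_date": release_date})
    MODELS_FILE.write_text(json.dumps(models, indent=2))


if __name__ == "__main__":
    add_model("example-model-7b", "Example Labs", "2024-01-01")
```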
The site layout is defined in the following files:
- `templates/index.html` - Main layout structure
- `static/css/style.css` - Styling
- `static/js/main.js` - Filtering and data loading logic