Supernova Event Dataset: Official Code Repository

Official repository for the research paper "Supernova Event Dataset: Interpreting Large Language Models' Personality through Critical Event Analysis" by

Pranav Agarwal¹ · Ioana Ciucă²

¹Mila, Quebec AI Institute · ²Stanford University

Actionable Interpretability Workshop at ICML 2025

📄 Paper | 🌐 Project Page | 🤗 Dataset | 📊 Demo

Overview

In this work, we interpret the personality traits of Large Language Models (LLMs) using our proposed Supernova Event Dataset, which covers historical events, biographies, news events, and scientific discoveries. We benchmark models on how they identify and rank the key events in a life or discovery, a complex task that requires causal reasoning. A second LLM acts as a judge and infers each model’s personality from its event selection and interpretation. Our analysis shows distinct traits, such as emotional reasoning in Orca 2 and analytical framing in Qwen 2.5, which makes model behavior easier to interpret and to trust.

Quick Start

Prerequisites

Python 3.8 or higher
API keys for OpenAI, Anthropic, and/or Gemini (depending on models used)

Installation

# Clone the repository
git clone https://github.com/pranaval/supernova-event-dataset.git
cd supernova-event-dataset

# Create virtual environment (any Python 3.8+ interpreter works)
python3 -m venv myenv
source myenv/bin/activate  # On Windows: myenv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Configuration

Create a .env file in the root directory:

OPENAI_API_KEY=your_openai_key_here
ANTHROPIC_API_KEY=your_anthropic_key_here
GOOGLE_API_KEY=your_google_key_here
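
A minimal sketch of how a script might read these keys, assuming the .env file uses plain KEY=value lines as shown above. The helper names `load_env_file` and `missing_keys` are hypothetical and not part of this repository; the repository's own scripts may load the file differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required key names that are absent or empty."""
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    env = load_env_file()
    # Variables already exported in the shell override the file (a design assumption).
    env.update({k: v for k, v in os.environ.items() if k.endswith("_API_KEY")})
    wanted = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"]
    print("Missing keys:", missing_keys(env, wanted))
```

A missing key is only a problem for the provider you intend to call, since the prerequisites list the keys as depending on the models used.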

Dataset

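The Supernova Event Dataset contains 592 carefully curated Wikipedia articles across biographies, news events, and historical events, plus 25 scientific-discovery articles (see the note below the table):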

Domain	Count	Description
🎭 Biographies	192	Life stories of influential figures
📰 News Events	200	Major contemporary events
📚 Historical Events	200	Significant historical occurrences
🔬 Scientific Discoveries	25*	Comprehensive discovery narratives

*Scientific discoveries use Google Gemini 2.5 Pro with Deep Research for comprehensive articles
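
As a quick arithmetic check, the sketch below tallies the per-domain counts from the table. It assumes that the 592 figure refers to the three Wikipedia-sourced domains only, which is what the numbers suggest; the variable names are illustrative.

```python
# Counts copied from the table above.
DOMAIN_COUNTS = {
    "Biographies": 192,
    "News Events": 200,
    "Historical Events": 200,
    "Scientific Discoveries": 25,
}

# Assumption: this domain is the one covered by the footnote, not by the 592 figure.
FOOTNOTED_DOMAINS = {"Scientific Discoveries"}

wikipedia_total = sum(
    count for domain, count in DOMAIN_COUNTS.items() if domain not in FOOTNOTED_DOMAINS
)
overall_total = sum(DOMAIN_COUNTS.values())

print(wikipedia_total, overall_total)  # prints: 592 617
```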

Download Dataset

# Extract all datasets (the archives are xz-compressed, so use -J rather than -z)
tar -xJvf Dataset/biographies.tar.xz
tar -xJvf Dataset/historical-events.tar.xz
tar -xJvf Dataset/major-news-events.tar.xz
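
The same extraction can be done from Python with the standard-library `tarfile` module, which is convenient on systems without a `tar` binary. This is a sketch that assumes the archives are xz-compressed tar files, as the .tar.xz extension suggests, and that extracting into the current directory is what you want; `extract_archive` is a hypothetical helper, not a function from this repository.

```python
import tarfile
from pathlib import Path

# Archive paths as listed in the shell commands above.
ARCHIVES = [
    "Dataset/biographies.tar.xz",
    "Dataset/historical-events.tar.xz",
    "Dataset/major-news-events.tar.xz",
]


def extract_archive(archive_path, dest="."):
    """Extract one xz-compressed tar archive into dest and return its member names."""
    with tarfile.open(archive_path, mode="r:xz") as tar:
        names = tar.getnames()
        tar.extractall(path=dest)
    return names


if __name__ == "__main__":
    for archive in ARCHIVES:
        if Path(archive).exists():
            print(archive, "->", len(extract_archive(archive)), "entries")
        else:
            print("not found:", archive)
```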

Usage

1️⃣ Extract Critical Events

Run event extraction for each domain:

# Biographies
python biography_dataset.py --model orca-2

# Historical Events  
python history_dataset.py --model phi-4

# News Events
python news_dataset.py --model orca-2

# Movie Scripts (optional additional domain)
python movies_dataset.py --model qwen-2.5
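
To run several models over several domains in one go, the commands above can be generated programmatically. The sketch below uses the script names and the `--model` flag shown above; it assumes that every script accepts every model identifier, which this README does not confirm, and it leaves out the optional movie-script domain. `build_commands` and `run_all` are hypothetical helpers.

```python
import subprocess
import sys

# Script names are taken from the commands above.
DOMAIN_SCRIPTS = {
    "biographies": "biography_dataset.py",
    "history": "history_dataset.py",
    "news": "news_dataset.py",
}


def build_commands(models, scripts=DOMAIN_SCRIPTS):
    """Return one argv list per (script, model) pair."""
    return [
        [sys.executable, script, "--model", model]
        for script in scripts.values()
        for model in models
    ]


def run_all(models, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_commands(models):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the batch at the first failing script.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(["orca-2", "phi-4", "qwen-2.5"])
```

The default is a dry run, so the commands are only printed until you call `run_all(..., dry_run=False)`.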

2️⃣ Analyze Model Personality

Consolidate results and extract personality patterns:

# Generate personality analysis for all models
python extract_personality.py
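
To illustrate the judge step described in the Overview, here is a sketch of how a judge prompt could be assembled from one model's ranked events. It is not the prompt or data schema used by extract_personality.py or the paper: the `RankedEvent` fields, the `build_judge_prompt` function, and the prompt wording are all assumptions made for illustration, and no LLM is called.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RankedEvent:
    """One critical event chosen by the model under study (hypothetical schema)."""
    rank: int
    description: str
    rationale: str


def build_judge_prompt(model_name: str, article_title: str, events: List[RankedEvent]) -> str:
    """Assemble a text prompt asking a judge LLM to describe the model's personality."""
    lines = [
        f"The model '{model_name}' read the article '{article_title}'",
        "and ranked the following events as most critical:",
        "",
    ]
    # Present events in rank order regardless of input order.
    for event in sorted(events, key=lambda e: e.rank):
        lines.append(f"{event.rank}. {event.description} (reason given: {event.rationale})")
    lines += [
        "",
        "Based only on which events were selected and how they were justified,",
        "describe the personality traits this model displays.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration only; not real model output.
    demo = [
        RankedEvent(2, "<second event>", "<why the model chose it>"),
        RankedEvent(1, "<first event>", "<why the model chose it>"),
    ]
    print(build_judge_prompt("orca-2", "<article title>", demo))
```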

3️⃣ Visualize Results

Create personality visualizations:

# Generate radar plots and semantic space mapping
python plot_personality.py

Citation

If you find our work useful, please cite:

@article{agarwal2025supernova,
  title={Supernova Event Dataset: Interpreting Large Language Models' Personality through Critical Event Analysis},
  author={Agarwal, Pranav and Ciucă, Ioana},
  journal={arXiv preprint arXiv:2506.12189},
  year={2025}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Wikipedia for article content
Model providers (OpenAI, Anthropic, Google) for API access
Fundamental of Ollama
