🎓 LLMSELECTOR: Which models to use for your compound AI systems?

Researchers and developers are increasingly invoking multiple LLM calls in a compound AI system to solve complex tasks. But which LLM should one select for each call?

LLMSELECTOR is a framework that automatically optimizes model selection for compound AI systems!

TL;DR: You only need to design your compound system's workflow; LLMSELECTOR takes care of selecting which LLM to use for each call.

🚀 What does LLMSELECTOR offer?

Figure 1: Comparison of using any fixed model and LLMSELECTOR for different compound AI systems. We find that, perhaps surprisingly, allocating different models to different modules can improve the overall performance by 5-70%.

Compound AI systems that involve multiple LLM calls are widely studied and developed in academia and industry. But does calling different LLMs in these systems make a difference? As suggested in Figure 1, the difference can be significant, and no LLM is the universally best choice. This leads to an important question: which LLM should be selected for each call in a compound system? The search space grows exponentially with the number of calls, so exhaustive search quickly becomes impractical.
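To make the size of the search space concrete, here is a minimal sketch that counts and enumerates model allocations. The model names come from the quickstart in this README; the three module names are hypothetical placeholders and are not taken from the LLMSELECTOR codebase.

```python
from itertools import product

# Model names from the quickstart; module names are hypothetical placeholders.
models = ["gpt-4o-2024-05-13", "claude-3-5-sonnet-20240620", "gemini-1.5-pro"]
modules = ["generator", "critic", "refiner"]


def count_allocations(num_models: int, num_modules: int) -> int:
    """Number of ways to assign one model to each module."""
    return num_models ** num_modules


# Enumerate every allocation explicitly for this small case.
allocations = [
    dict(zip(modules, choice))
    for choice in product(models, repeat=len(modules))
]

print(len(allocations))           # 27 allocations for 3 models and 3 modules
print(count_allocations(10, 10))  # 10000000000 for 10 models and 10 modules
```

Each allocation would need its own evaluation pass over the training data, which is why enumerating all of them stops being practical as systems grow.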

LLMSELECTOR automates LLM selection in compound AI systems. As shown in Figure 1, LLMSELECTOR offers substantial performance gains on popular compound systems (such as self-refine and multi-agent-debate): perhaps surprisingly, it can offer a 5-70% performance improvement over using any single fixed LLM for all modules in a compound system.

Here, we provide a tutorial on how to use LLMSELECTOR, including the installation, a few examples, and guidance on how to extend it for customized compound AI systems.

💻 How to Use LLMSELECTOR?

🔧 Installation

You can install LLMSELECTOR by running the following commands:

git clone https://github.com/LLMSELECTOR/LLMSELECTOR
cd LLMSELECTOR
pip install -e ./llmselector

💡 Quickstart (No API key needed)

To start, let us first set up the environment.

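import os
import llmselector

# Download the quickstart database once (the `!` shell escape requires Jupyter/IPython).
if not os.path.exists('../cache/db_livecodebench.sqlite'):
    !wget -P ../cache https://github.com/LLMSELECTOR/LLMSELECTOR/releases/download/0.0.1/db_livecodebench.sqlite
# Point LLMSELECTOR at the downloaded database.
llmselector.config.config(db_path="../cache/db_livecodebench.sqlite")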
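The `!wget` line above only works inside a Jupyter/IPython session. If you run the quickstart as a plain Python script, a sketch like the following could replace it. The helper name `ensure_file` is hypothetical and not part of LLMSELECTOR; it uses only the Python standard library.

```python
import os
import urllib.request

# Release URL copied from the quickstart above.
DB_URL = (
    "https://github.com/LLMSELECTOR/LLMSELECTOR/releases/download/"
    "0.0.1/db_livecodebench.sqlite"
)


def ensure_file(url: str, dest: str) -> str:
    """Download url to dest unless dest already exists; return dest."""
    if not os.path.exists(dest):
        # Create the parent directory (e.g. ../cache) if it is missing.
        os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
        urllib.request.urlretrieve(url, dest)
    return dest
```

Calling `ensure_file(DB_URL, "../cache/db_livecodebench.sqlite")` and passing the returned path as `db_path` to `llmselector.config.config` should then mirror the notebook cell.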

Next, let us load the livecodebench dataset.

from llmselector.data_utils.livecodebench import DataLoader_livecodebench
from sklearn.model_selection import train_test_split

Mydataloader = DataLoader_livecodebench()
q_data = Mydataloader.get_query_df()
# Hold out half of the queries for testing; the seed makes the split reproducible.
train_df, test_df = train_test_split(q_data, test_size=0.5, random_state=2025)

Let us first evaluate self-refine systems using fixed models.

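# SelfRefine import path assumed from the compoundai/module/selfrefine location referenced later in this README.
from llmselector.compoundai.module.selfrefine import SelfRefine
from llmselector.compoundai.optimizer import OptimizerFullSearch
from llmselector.compoundai.metric import Metric, compute_score

model_list = ['gpt-4o-2024-05-13', 'claude-3-5-sonnet-20240620', 'gemini-1.5-pro']
Agents_SameModel = {}
for name in model_list:
    # Restricting the search to a single model pins every module to that model.
    Agents_SameModel[name] = SelfRefine()
    Opt0 = OptimizerFullSearch(model_list=[name])
    Opt0.optimize(train_df, Metric('em'), Agents_SameModel[name])
# Score each fixed-model system on the held-out test split.
results = compute_score(Agents_SameModel, test_df, Metric('em'))
print(results)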

The expected output is

Name	Mean_Score
gpt-4o-2024-05-13	0.862500
claude-3-5-sonnet-20240620	0.891667
gemini-1.5-pro	0.866667

Now, let us use LLMSELECTOR to optimize the system.

from llmselector.compoundai.optimizer import OptimizerLLMDiagnoser

LLMSELECTOR = SelfRefine()
Optimizer = OptimizerLLMDiagnoser()
# Let the optimizer choose a model for each module using the training split.
Optimizer.optimize(train_df, Metric('em'), LLMSELECTOR)
results = compute_score({"LLMSELECTOR": LLMSELECTOR}, test_df, Metric('em'))
print(results)

The expected output should be

Name	Mean_Score
LLMSELECTOR	0.954167

That is, LLMSELECTOR scores 0.954 versus 0.892 for the best fixed model (claude-3-5-sonnet-20240620), a notable gain of about 6 percentage points over always using any single model.
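The gain quoted above can be checked directly from the two expected outputs. The short sketch below copies the reported test scores and computes the absolute and relative improvement over the best fixed model; it does not call LLMSELECTOR itself.

```python
# Test scores copied from the expected outputs above.
fixed_scores = {
    "gpt-4o-2024-05-13": 0.862500,
    "claude-3-5-sonnet-20240620": 0.891667,
    "gemini-1.5-pro": 0.866667,
}
selector_score = 0.954167

# Best fixed-model baseline.
best_name = max(fixed_scores, key=fixed_scores.get)
best_score = fixed_scores[best_name]

absolute_gain = selector_score - best_score  # difference in mean score
relative_gain = absolute_gain / best_score   # improvement relative to the baseline

print(best_name)                                           # claude-3-5-sonnet-20240620
print(f"absolute gain: {absolute_gain * 100:.2f} points")  # absolute gain: 6.25 points
print(f"relative gain: {relative_gain * 100:.1f}%")        # relative gain: 7.0%
```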

📖 More examples (No API key needed)

More examples can be found in examples/.

🌐 Customized systems and tasks (API keys needed)

To use LLMSELECTOR for your own compound AI systems and tasks, it is as easy as creating the systems and tasks and then invoking LLMSELECTOR.

Create your system: create the components and pipelines, similar to SelfRefine defined in compoundai/module/selfrefine.
Create your task: create a DataLoader object similar to those in data_utils.
Invoke LLMSELECTOR: call Optimizer.optimize(train_df, Metric('em'), your_compound_system).

Note that you will need to set up API keys for your own systems. To do so, you can simply use

llmselector.config.config(
    db_path="cache.sqlite",
    openai_api_key="YOUR_OPENAI_KEY",
    anthropic_api_key="YOUR_ANTHROPIC_KEY",
    together_ai_api_key="YOUR_TOGETHERAI_KEY",
    gemini_api_key="YOUR_GEMINI_KEY")
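To avoid hardcoding keys in notebooks or scripts, one option is to read them from environment variables and pass them on. The sketch below is an illustration only: the helper `build_config_kwargs` and the environment variable names are hypothetical, and it assumes `llmselector.config.config` accepts a subset of the keyword arguments shown above, which this README does not confirm.

```python
import os

# Keyword names come from the config call above;
# the environment variable names are hypothetical choices.
ENV_VARS = {
    "openai_api_key": "OPENAI_API_KEY",
    "anthropic_api_key": "ANTHROPIC_API_KEY",
    "together_ai_api_key": "TOGETHER_AI_API_KEY",
    "gemini_api_key": "GEMINI_API_KEY",
}


def build_config_kwargs(db_path="cache.sqlite", environ=None):
    """Collect config keyword arguments from whichever variables are set."""
    environ = os.environ if environ is None else environ
    kwargs = {"db_path": db_path}
    for key, var in ENV_VARS.items():
        value = environ.get(var)
        if value:  # skip providers whose key is unset or empty
            kwargs[key] = value
    return kwargs


# Usage (requires llmselector to be installed):
#   import llmselector
#   llmselector.config.config(**build_config_kwargs())
```

Keeping the keys in the environment means the same script can be shared or committed without exposing credentials.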

✨ Can I request features and contribute?

Yes! We are happy to hear from you. Please feel free to open an issue for any feature request.

If you are interested in contributing, we would also be happy to coordinate on ongoing efforts! Please send an email to Lingjiao (lingjiao [at] stanford [dot] edu).

📣 Updates & Changelog

🔹 2025.02.21 - The project is live now!

✅ Release the codebase, relevant examples, and demos

🎯 Reference

If you find LLMSELECTOR useful, we would appreciate it if you cited our work as follows:

@article{chen2025llmselector,
  title={Optimizing Model Selection for Compound AI Systems},
  author={Chen, Lingjiao and Davis, Jared and Hanin, Boris and Bailis, Peter and Zaharia, Matei and Zou, James and Stoica, Ion},
  journal={arXiv},
  year={2025}
}
