This is a Python testing framework for validating AI model responses across different providers through the SEMOSS API. The framework runs standardized tests against multiple models and uses OpenAI models to confirm response quality.
- Create an `.env` file based on the `.env.example` provided in the root directory.
- Open Docker Desktop and run `docker-compose up --build` in the root directory of this project to start the server and frontend services. Make sure your SEMOSS instance is running.
- If you are developing and want to see code changes reflected, rebuild the Docker containers by running `docker-compose up --build` again after making changes.
- The server will be available at `http://localhost:8888` and the frontend at `http://localhost:3000`.

Required Environment Variables (in `.env`):
- `SEMOSS_ACCESS_KEY` - Access key for SEMOSS API
- `SEMOSS_SECRET_KEY` - Secret key for SEMOSS API
- `SEMOSS_BASE_URL` - Base URL for SEMOSS instance (e.g., `http://localhost:9090/Monolith/api`)
- `OPENAI_API_KEY` - OpenAI API key for confirmation testing

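A quick way to catch a missing variable before starting the server is a small check like the sketch below. It uses the variable names listed above; the helper name `missing_env_vars` is hypothetical and not part of this project, and it assumes the values from `.env` have already been loaded into the process environment.

```python
import os

# Variable names taken from the list of required environment variables.
REQUIRED_VARS = (
    "SEMOSS_ACCESS_KEY",
    "SEMOSS_SECRET_KEY",
    "SEMOSS_BASE_URL",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {
    "SEMOSS_ACCESS_KEY": "abc",
    "SEMOSS_BASE_URL": "http://localhost:9090/Monolith/api",
}
print(missing_env_vars(example_env))
```

Calling `missing_env_vars()` with no argument checks the real environment, so it could run at startup and fail fast with a readable message.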
Install server dependencies in the root directory. First create a virtual environment with `uv venv`, then install the required packages with `uv sync`. If that doesn't work, use `uv pip install -r pyproject.toml`.

Alternatively, you can set up the environment using pip:
- Create and activate a virtual environment: `python -m venv venv`, then `source venv/bin/activate` (on Windows, use `venv\Scripts\activate`).
- Install dependencies: `pip install -r pyproject.toml`

Install frontend dependencies:
Run `cd client`, then `npm install`.

To run the server, use the following command: `python server.py`
- The server will start on port 8888.

To run the frontend, use the following command in the `client` directory: `npm run dev`

Then open `http://localhost:3000` in your web browser.

- `src/`: Contains all source code.
  - `runners/`: Logic for executing tests against selected models.
  - `tests/`: Standardized test cases and response models.
  - `utils/`: Utility functions and model definitions.
  - `confirmations/`: Logic for confirming test responses using OpenAI models.
  - `pixels/`: Pixel factory class for creating pixel calls.

To add a new model, update the models list in `src/utils/models.py` with the new model's details.

To add a new test:
- Create a new method in `src/tests/standard_tests.py`, or create a new file/class with the method.
- (If required) Update the Pixel Maker class to include any new parameters needed for the test.
- Update the `TestSelections` class in `src/runners/runners.py` to include the new test option.
- Update the `run_selected_tests` function in `src/runners/runners.py` to execute the new test when selected.
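The steps for adding a test can be sketched roughly as follows. This is a minimal illustration only: the real `TestSelections` class and `run_selected_tests` function in `src/runners/runners.py` may be shaped differently, and the test functions, field names, and registry dictionary here are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical stand-ins for test methods in src/tests/standard_tests.py.
def echo_test(model_name: str) -> str:
    return f"{model_name}: echo_test ran"


def json_output_test(model_name: str) -> str:
    # The newly added test.
    return f"{model_name}: json_output_test ran"


@dataclass
class TestSelections:
    # One boolean flag per available test (assumed layout).
    echo: bool = False
    json_output: bool = False  # new option for the new test


def run_selected_tests(model_name: str, selections: TestSelections) -> List[str]:
    # Map each selection flag to the test it enables.
    registry: Dict[str, Callable[[str], str]] = {
        "echo": echo_test,
        "json_output": json_output_test,  # wire in the new test
    }
    return [
        test(model_name)
        for flag, test in registry.items()
        if getattr(selections, flag)
    ]


results = run_selected_tests("example-model", TestSelections(json_output=True))
print(results)
```

Keeping the flag-to-test mapping in one place means a new test touches three spots: the test function, the selection flag, and the registry entry.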

Planned improvements:
- Ability to add models through UI: Update the code to read models from a JSON file so that we can add models through the UI instead of hardcoding them in `models.py`.
- Full Capabilities Test: Eventually, when we have more tests built, I want the ability to add a model, run the full test suite, and return a table of the capabilities of the model.
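For the first item, reading models from a JSON file might look something like the sketch below. The file layout (a JSON array of objects), the required `name` key, and the `load_models` / `add_model` helpers are all assumptions made for illustration; the fields should mirror whatever `models.py` currently stores.

```python
import json
from pathlib import Path


def load_models(path):
    """Read a list of model entries from a JSON file and validate the basics."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("models file must contain a JSON array")
    for entry in data:
        # "name" is a hypothetical required key.
        if not isinstance(entry, dict) or "name" not in entry:
            raise ValueError(f"invalid model entry: {entry!r}")
    return data


def add_model(path, entry):
    """Append one entry and write the file back; a UI endpoint could call this."""
    models = load_models(path) if Path(path).exists() else []
    models.append(entry)
    Path(path).write_text(json.dumps(models, indent=2), encoding="utf-8")
    return models
```

With this shape, the UI only needs to send a new entry to the server, and no code change is required to register a model.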
