Open Agent Spec
A command-line tool for generating AI agent projects based on Open Agent Spec YAML files. The OA CLI supports multiple LLM engines including OpenAI, Anthropic, local models, and custom LLM routers.
```bash
pip install open-agent-spec
```

```bash
# Show help
oas --help

# Initialize a new agent project
oas init --spec path/to/spec.yaml --output path/to/output

# Preview what would be created without writing files
oas init --spec path/to/spec.yaml --output path/to/output --dry-run

# Create a base working agent with minimal spec
oas init --template minimal --output path/to/output

# Generate with verbose output
oas init --spec path/to/spec.yaml --output path/to/output --verbose
```

The spec file should be in YAML format with the following structure. Each section is explained in detail below:
```yaml
open_agent_spec: "1.0.7"  # OA specification version (canonical field)

agent:
  name: "hello-world-agent"  # Unique identifier for the agent
  description: "A simple agent that responds with a greeting"  # Human-readable description
  role: "chat"  # Agent role (schema enum: analyst, reviewer, chat, retriever, planner, executor)

intelligence:
  engine: "openai"  # LLM engine: openai, anthropic, grok, local, or custom
  endpoint: "https://api.openai.com/v1"  # API endpoint URL
  model: "gpt-4"  # Model name/identifier
  config:  # Engine-specific configuration
    temperature: 0.7
    max_tokens: 150
  module: "CustomRouter.CustomRouter"  # For custom engines: module.class format

tasks:
  greet:  # Task name (will become function name)
    description: "Say hello to a person by name"  # Task description
    timeout: 30  # Task timeout in seconds
    input:  # Input schema (JSON Schema format)
      type: "object"
      properties:
        name:
          type: "string"
          description: "The name of the person to greet"
          minLength: 1
          maxLength: 100
      required: ["name"]
    output:  # Output schema (JSON Schema format)
      type: "object"
      properties:
        response:
          type: "string"
          description: "The greeting response"
          minLength: 1
      required: ["response"]
    metadata:  # Optional task metadata
      category: "communication"
      priority: "normal"

behavioural_contract:  # Optional behavioural contract
  version: "0.1.2"
  description: "Simple contract requiring a greeting response"
  behavioural_flags:
    conservatism: "moderate"
    verbosity: "compact"
  response_contract:
    output_format:
      required_fields: ["response"]
```

The OA CLI supports multiple LLM engines through the `intelligence.engine` field:
Use OpenAI's API for LLM interactions.

```yaml
intelligence:
  engine: "openai"
  endpoint: "https://api.openai.com/v1"  # OpenAI API endpoint
  model: "gpt-4"  # OpenAI model (gpt-4, gpt-3.5-turbo, etc.)
  config:
    temperature: 0.7  # Response randomness (0.0-2.0)
    max_tokens: 150  # Maximum response length
```

Requirements:
- OpenAI API key in environment variable `OPENAI_API_KEY`
- Valid OpenAI account and API access
Use Anthropic's Claude models for LLM interactions.

```yaml
intelligence:
  engine: "anthropic"
  endpoint: "https://api.anthropic.com"  # Anthropic API endpoint
  model: "claude-3-sonnet-20240229"  # Claude model name
  config:
    temperature: 0.7
    max_tokens: 150
```

Requirements:
- Anthropic API key in environment variable `ANTHROPIC_API_KEY`
- Valid Anthropic account and API access
Use xAI's Grok models for LLM interactions via an OpenAI-compatible API.

```yaml
intelligence:
  engine: "grok"
  endpoint: "https://api.x.ai/v1"  # xAI API endpoint
  model: "grok-3-latest"  # Grok model (grok-3-latest, grok-3-20241219)
  config:
    temperature: 0.7  # Response randomness (0.0-2.0)
    max_tokens: 1500  # Maximum response length
```

Requirements:
- xAI API key in environment variable `XAI_API_KEY`
- Valid xAI account and API access
- Uses the OpenAI-compatible client library
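Because the Grok engine speaks the OpenAI-compatible chat format, its request bodies follow the Chat Completions shape. A minimal sketch of constructing such a payload from the spec's `model` and `config` values (the defaults shown are assumptions for illustration, not the CLI's actual defaults):

```python
import json


def build_chat_payload(model: str, prompt: str, config: dict) -> str:
    """Build an OpenAI-compatible chat-completions request body as a JSON string."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": config.get("temperature", 0.7),
        "max_tokens": config.get("max_tokens", 1500),
    }
    return json.dumps(body)


payload = build_chat_payload("grok-3-latest", "Summarize this alert.", {"max_tokens": 1500})
```

The same payload shape works for the `openai` engine, which is why a single client library can serve both.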
Use the Cortex intelligence engine for advanced reasoning and multi-layered analysis.

```yaml
intelligence:
  engine: "cortex"
  model: "cortex-intelligence"
  config:
    enable_layer3: true
    enable_onnx: false
    openai_api_key: ${OPENAI_API_KEY}
    claude_api_key: ${CLAUDE_API_KEY}
    temperature: 0.2
    max_tokens: 1500
```

Requirements:
- Cortex intelligence package: `cortex-intelligence`
- OpenAI API key in environment variable `OPENAI_API_KEY`
- Claude API key in environment variable `CLAUDE_API_KEY`
- Valid OpenAI and Anthropic accounts with API access
Cortex-Specific Features:
- Layer 3 Intelligence: Advanced reasoning capabilities
- ONNX Runtime: Optional optimization for performance
- Multi-Engine Integration: Combines OpenAI and Claude capabilities
- Advanced Analysis: Deep problem breakdown and creative solution generation
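The `${OPENAI_API_KEY}`-style placeholders in the Cortex config use shell-style variable syntax. One way such values could be resolved at load time (a sketch of the idea, not necessarily how the CLI implements it) is with Python's `os.path.expandvars`:

```python
import os
import os.path


def resolve_env_placeholders(config: dict) -> dict:
    """Expand ${VAR} references in string config values against the environment."""
    return {
        key: os.path.expandvars(value) if isinstance(value, str) else value
        for key, value in config.items()
    }


os.environ["OPENAI_API_KEY"] = "sk-example"  # set here for demonstration only
resolved = resolve_env_placeholders({"openai_api_key": "${OPENAI_API_KEY}", "temperature": 0.2})
```

Keeping keys as `${VAR}` references in the spec file avoids committing secrets to version control.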
Use locally hosted LLM models (placeholder for future implementation).

```yaml
intelligence:
  engine: "local"
  endpoint: "http://localhost:8000"  # Local model server endpoint
  model: "llama-2-7b"  # Local model identifier
  config:
    temperature: 0.7
    max_tokens: 150
```

Note: Local engine support is planned for future releases.
Use custom LLM routers for specialized use cases, custom APIs, or proprietary models.

```yaml
intelligence:
  engine: "custom"
  endpoint: "http://localhost:1234/invoke"  # Custom endpoint
  model: "my-custom-model"  # Model identifier
  config: {}  # Custom configuration
  module: "CustomLLMRouter.CustomLLMRouter"  # Python module.class to import
```

Custom Router Requirements:
- A Python class with an `__init__(endpoint, model, config)` method
- A `run(prompt, **kwargs)` method that returns a JSON string
- The class must be importable from the specified module path
Example Custom Router:

```python
# CustomLLMRouter.py
import json


class CustomLLMRouter:
    def __init__(self, endpoint: str, model: str, config: dict):
        self.endpoint = endpoint
        self.model = model
        self.config = config

    def run(self, prompt: str, **kwargs) -> str:
        # Your custom LLM logic here
        # Must return a JSON string matching the task's output schema
        return json.dumps({
            "response": f"Custom response to: {prompt}"
        })
```

`open_agent_spec`
- Purpose: Version of the OA specification being used (canonical field name; the schema and code use this, not `spec_version`)
- Format: String (e.g., "1.0.7")
- Required: Yes
- Note: Ensures compatibility with the CLI version
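For illustration, a compatibility gate on the version string might compare parsed components rather than raw strings (a hypothetical sketch; the CLI's actual check may differ):

```python
def parse_version(version: str) -> tuple:
    """Split a dotted version string into comparable integer components."""
    return tuple(int(part) for part in version.split("."))


# String comparison would wrongly rank "1.0.10" below "1.0.9";
# tuple comparison handles multi-digit components correctly.
MIN_SUPPORTED = parse_version("1.0.4")
compatible = parse_version("1.0.7") >= MIN_SUPPORTED
```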
`agent`
- Purpose: Defines the agent's identity and characteristics

`agent.name`
- Purpose: Unique identifier for the agent
- Format: String (kebab-case recommended)
- Required: Yes
- Examples: "hello-world-agent", "financial-analyst"

`agent.description`
- Purpose: Human-readable description of what the agent does
- Format: String
- Required: Yes
- Example: "A friendly agent that greets people by name"

`agent.role`
- Purpose: Defines the agent's role type (must match the schema enum for validation)
- Format: String (enum)
- Required: No (optional)
- Options (schema): "analyst", "reviewer", "chat", "retriever", "planner", "executor"
`intelligence`
- Purpose: Configures the LLM engine and model settings

`intelligence.engine`
- Purpose: Specifies which LLM engine to use
- Format: String (enum)
- Required: Yes
- Options: "openai", "anthropic", "grok", "cortex", "local", "custom"

`intelligence.endpoint`
- Purpose: API endpoint URL for the LLM service
- Format: Valid URI string
- Required: Yes
- Examples:
  - OpenAI: "https://api.openai.com/v1"
  - Anthropic: "https://api.anthropic.com"
  - Custom: "http://localhost:1234/invoke"

`intelligence.model`
- Purpose: Model name or identifier to use
- Format: String
- Required: Yes
- Examples: "gpt-4", "claude-3-sonnet-20240229", "my-custom-model"

`intelligence.config`
- Purpose: Engine-specific configuration parameters
- Format: Object (key-value pairs)
- Required: No (optional)
- Common fields:
  - `temperature`: Response randomness (0.0-2.0)
  - `max_tokens`: Maximum response length
  - `top_p`: Nucleus sampling parameter
  - `frequency_penalty`: Frequency penalty for repetition

`intelligence.module`
- Purpose: For custom engines, specifies the Python module and class to import
- Format: String ("module.class")
- Required: Only for `engine: "custom"`
- Example: "CustomLLMRouter.CustomLLMRouter"
`tasks`
- Purpose: Defines the agent's capabilities and functions

Each task becomes a function in the generated agent code. Task names should be descriptive and use kebab-case.

Each task supports the following fields:
- `description`: Human-readable description of what the task does
- `timeout`: Maximum time (seconds) the task can run
- `input`: JSON Schema defining the task's input parameters
- `output`: JSON Schema defining the task's expected output
- `metadata`: Optional metadata for categorization and organization
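To make the task-to-function mapping concrete, the `greet` task from the example spec might surface as a function of roughly this shape (a hypothetical sketch, not the CLI's literal output; the real generated code renders a prompt template and calls the configured engine, which is stubbed here):

```python
import json


def call_engine(prompt: str) -> str:
    # Stand-in for the configured LLM engine; must return a JSON string.
    return json.dumps({"response": f"Hello, {prompt.split()[-1]}!"})


def greet(name: str) -> dict:
    """Sketch of a function generated from the 'greet' task."""
    # Mirror the input schema's minLength/maxLength constraints
    if not (1 <= len(name) <= 100):
        raise ValueError("name must be 1-100 characters")
    raw = call_engine(f"Say hello to {name}")
    result = json.loads(raw)
    # Mirror the output schema's required fields
    if "response" not in result:
        raise ValueError("missing required field: response")
    return result
```

Input validation, the engine call, and output validation each correspond to a section of the task definition.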
`tasks.<name>.input` / `tasks.<name>.output`
- Purpose: Define the structure and validation rules for task inputs and outputs
- Format: JSON Schema (Draft 2020-12)
- Features:
- Type validation (string, number, boolean, object, array)
- Required field specification
- Field descriptions
- Min/max length for strings
- Min/max values for numbers
- Enum values
- Nested object structures
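In a real agent, a JSON Schema library would enforce these rules; as a minimal hand-rolled sketch of the checks listed above (required fields, type, and string length only), applied to the `greet` input schema:

```python
def validate_input(payload: dict, schema: dict) -> list:
    """Return a list of validation errors for a flat object schema (sketch only)."""
    errors = []
    for field in schema.get("required", []):
        if field not in payload:
            errors.append(f"missing required field: {field}")
    for field, rules in schema.get("properties", {}).items():
        if field not in payload:
            continue
        value = payload[field]
        if rules.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{field}: expected string")
        elif isinstance(value, str):
            if "minLength" in rules and len(value) < rules["minLength"]:
                errors.append(f"{field}: shorter than minLength")
            if "maxLength" in rules and len(value) > rules["maxLength"]:
                errors.append(f"{field}: longer than maxLength")
    return errors


greet_input = {
    "type": "object",
    "properties": {"name": {"type": "string", "minLength": 1, "maxLength": 100}},
    "required": ["name"],
}
```

This covers only the flat, single-level case; nested objects, enums, and numeric bounds are what a full Draft 2020-12 validator adds.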
`behavioural_contract`
- Purpose: Defines behavioural constraints and response requirements
- Format: Object with behavioural contract specification
- Required: No (optional)
- Note: This is separate from the behavioural contracts repository and focuses on specification rather than enforcement
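For example, the `required_fields` part of the contract in the spec above amounts to a post-hoc check on the engine's JSON output. A sketch of that check (the field name comes from the example contract; this is an illustration, not the contract library's API):

```python
import json


def check_response_contract(raw_output: str, required_fields: list) -> bool:
    """Return True if the raw JSON output contains every contract-required field."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False  # non-JSON output always fails the contract
    return isinstance(parsed, dict) and all(f in parsed for f in required_fields)
```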
```
output/
├── agent.py               # Main agent implementation with all tasks
├── prompts/               # Jinja2 prompt templates
│   ├── greet.jinja2       # Task-specific prompt template
│   └── agent_prompt.jinja2  # Fallback prompt template
├── requirements.txt       # Python dependencies
├── .env.example           # Environment variables template
├── README.md              # Generated documentation
└── CustomLLMRouter.py     # Custom router (if using custom engine)
```
The OA CLI includes ready-to-use templates for common use cases:

```bash
# Basic single-task agent
oas init --template minimal --output my-agent/

# Multi-task agent with parallel execution
oas init --template minimal-multi-task --output my-multi-agent/

# Agent with tool usage capabilities
oas init --template minimal-agent-tool-usage --output my-tool-agent/
```

Advanced security templates demonstrate multi-engine support and behavioral contracts:

```bash
# Security threat analyzer (Claude/Anthropic powered)
oas init --spec oas_cli/templates/security-threat-analyzer.yaml --output threat-analyzer/

# Security risk assessor (Claude/Anthropic powered)
oas init --spec oas_cli/templates/security-risk-assessor.yaml --output risk-assessor/

# Security incident responder (OpenAI powered)
oas init --spec oas_cli/templates/security-incident-responder.yaml --output incident-responder/

# Grok security analyzer (xAI Grok powered)
oas init --spec oas_cli/templates/grok-security-analyzer.yaml --output grok-analyzer/
```

Security Templates Features:
- Multi-Engine Support: Templates for Claude/Anthropic, OpenAI, and xAI Grok
- Advanced Behavioral Contracts: Security-focused validation and safety checks
- Real-World Use Cases: SOC automation, threat hunting, incident response
- Agent-to-Agent Workflows: Designed for DACP orchestration
- Production Ready: Comprehensive logging, error handling, and compliance features
See SECURITY_TEMPLATES.md for detailed documentation and usage examples.
```bash
# Clone the repository
git clone https://github.com/aswhitehouse/open-agent-spec.git
cd open-agent-spec

# Install development dependencies
pip install -e ".[dev]"
```

```bash
# Run all tests with basic reporting
pytest

# Run with comprehensive reporting
pytest tests/ -v --cov=oas_cli --cov-report=html --cov-report=term

# Run specific test categories
pytest -m contract tests/      # Behavioral contract validation
pytest -m multi_engine tests/  # Multi-engine compatibility
pytest -m generator tests/     # Generator functionality tests

# Generate detailed HTML report
pytest tests/ --html=test-report.html --self-contained-html

# Generate Allure report (requires allure-pytest)
pytest tests/ --alluredir=allure-results
allure serve allure-results
```

- Coverage Reports: HTML and terminal coverage reports
- Test Categories: Organized by markers (contract, multi_engine, generator)
- Allure Reports: Beautiful interactive test reports
- CI Integration: Automatic reporting in GitHub Actions
- Artifact Upload: Test results and coverage reports saved
- Generator Tests: Validate code generation, file creation, and template rendering
- Contract Tests: Ensure behavioral contracts work correctly across engines
- Multi-Engine Tests: Verify OpenAI and Claude/Anthropic compatibility
- Integration Tests: End-to-end validation of agent generation
```bash
python -m build
```

To create a new release:

1. Update the version number in `pyproject.toml`
2. Commit and push your changes
3. Create and push a new tag

```bash
# Update version in pyproject.toml, then:
git add pyproject.toml
git commit -m "Bump version to v1.0.8"
git push origin main

# Create and push the tag
git tag v1.0.8
git push origin v1.0.8
```

The GitHub Actions workflow will automatically:
- Run all tests
- Build the package
- Publish to PyPI
Your package will be available on PyPI within a few minutes.
This project is licensed under the MIT License. You are free to use, modify, and distribute the software, including in proprietary projects, provided the copyright notice and license text are included.
See LICENSE for the full text.