A simple, keyboard-driven TUI for running a two-party conversation workflow: audio is recorded on both sides, transcribed to text with ASR, and an LLM generates assistant replies.
- Start input recording (microphone)
- Stop and transcribe (role: `user`)
- Start output recording (the other side)
- Stop and transcribe (role: `friend`)
- Send the conversation to an LLM to generate an assistant response
- Loop
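For orientation, one turn of that loop can be sketched in plain Python. The function names below are stand-ins, not the actual chatpilot API; the real app drives the same sequence from the curses TUI via the keyboard shortcuts listed further down.

```python
# Hypothetical sketch of one conversation turn; function bodies are stubs.
def record_clip(source: str) -> str:
    """Stub: record audio from `source` and return a WAV path."""
    return f"{source}.wav"

def transcribe(wav_path: str) -> str:
    """Stub: run ASR on the clip."""
    return f"(transcript of {wav_path})"

def generate_reply(messages: list[dict]) -> str:
    """Stub: call the LLM with the conversation so far."""
    return "(assistant reply)"

messages: list[dict] = []
for _ in range(1):  # the real app loops until 'q' is pressed
    messages.append({"role": "user", "content": transcribe(record_clip("microphone"))})
    messages.append({"role": "friend", "content": transcribe(record_clip("other_side"))})
    messages.append({"role": "assistant", "content": generate_reply(messages)})
print(messages)
```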
- Clear module architecture (audio, ASR, LLM, conversation, app)
- Pluggable ASR/LLM providers (Gemini/OpenAI for LLM; OpenAI for ASR; room for local ASR)
- Simple conversation persistence to JSON
- Curses-based minimal TUI and keyboard controls
- Python 3.10+
- Packages
- `openai` SDK, used for both OpenAI and Gemini (via an OpenAI-compatible `base_url`); no extra SDK required (see the client sketch after this list)
- sounddevice, soundfile (for audio I/O)
- pyyaml
- numpy
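Because a single `openai` client works for both providers, switching between them is mostly a matter of `api_key` and `base_url`. A minimal sketch, assuming the environment variables described in Setup below (the model name is illustrative):

```python
import os
from openai import OpenAI

# Gemini via Google's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ["GOOGLE_API_KEY"],
    base_url=os.environ.get(
        "GEMINI_OPENAI_BASE_URL",
        "https://generativelanguage.googleapis.com/v1beta/openai/",
    ),
)
# Plain OpenAI instead: client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

reply = client.chat.completions.create(
    model="gemini-2.0-flash",  # illustrative model name; use whatever your config specifies
    messages=[{"role": "user", "content": "Hello!"}],
)
print(reply.choices[0].message.content)
```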
- Create a virtual environment and install dependencies.

```bash
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install pyyaml openai sounddevice soundfile numpy
```

- Set your API keys.

```bash
# For Gemini (LLM & ASR) via the OpenAI-compatible endpoint
export GOOGLE_API_KEY=your-gemini-key
# Optional: override the base URL (defaults to Google's v1beta OpenAI bridge)
# export GEMINI_OPENAI_BASE_URL="https://generativelanguage.googleapis.com/v1beta/openai/"

# If switching the ASR/LLM provider to OpenAI
# export OPENAI_API_KEY=sk-...
```

- Optional: generate a default config file `chatpilot.yaml` at the project root.
```yaml
# chatpilot.yaml
audio:
  sample_rate: 16000
  channels: 1
  dtype: float32
  input_device: null
  output_device: null
asr:
  provider: openai
  model: gpt-4o-mini-transcribe
  language: null
llm:
  provider: openai
  model: gpt-4o-mini
  temperature: 0.3
  system_prompt: "You are a helpful friend in a two-party conversation. Respond concisely."
history_path: ./conversation_history.json
```
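As a sketch of how such a file could map onto dataclasses: the class and field names below are illustrative (the real loader lives in `chatpilot/config.py` and may differ), and only the `asr` section and `history_path` are shown.

```python
from dataclasses import dataclass, field
from pathlib import Path

import yaml


@dataclass
class ASRConfig:
    provider: str = "openai"
    model: str = "gpt-4o-mini-transcribe"
    language: str | None = None


@dataclass
class AppConfig:
    asr: ASRConfig = field(default_factory=ASRConfig)
    history_path: str = "./conversation_history.json"


def load_config(path: str = "chatpilot.yaml") -> AppConfig:
    # A missing file or missing sections fall back to the defaults above.
    data = yaml.safe_load(Path(path).read_text()) if Path(path).exists() else {}
    data = data or {}
    return AppConfig(
        asr=ASRConfig(**data.get("asr", {})),
        history_path=data.get("history_path", "./conversation_history.json"),
    )
```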
Run the app:

```bash
python -m chatpilot.app
```

Keyboard shortcuts inside the TUI:
- `s`: start input recording (microphone)
- `e`: end input recording, transcribe as `user`, auto-start output recording
- `o`: start output recording (manual)
- `p`: end output recording, transcribe as `friend`, call the LLM, show status
- `q`: quit
The conversation is saved to `conversation_history.json` by default.
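If you want to inspect the saved history outside the app, something like the following works, assuming the file holds a JSON list of role/content records (verify the actual schema in `chatpilot/conversation.py`):

```python
import json
from pathlib import Path

# Assumption: a JSON list of {"role": ..., "content": ...} records;
# check chatpilot/conversation.py before relying on this shape.
for msg in json.loads(Path("conversation_history.json").read_text()):
    print(f"{msg.get('role', '?')}: {msg.get('content', '')}")
```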
- `chatpilot/config.py`: Load YAML config into dataclasses
- `chatpilot/audio.py`: Recorder for WAV capture (start/stop); see the sketch below this list
- `chatpilot/asr/*`: ASR interfaces and OpenAI implementation
- `chatpilot/llm/*`: LLM interfaces and Gemini/OpenAI implementations
- `chatpilot/conversation.py`: Message storage and OpenAI payload conversion
- `chatpilot/app.py`: Curses TUI orchestrating the full loop
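To illustrate the start/stop recording idea from `chatpilot/audio.py`, here is a minimal sketch (not the project's actual `Recorder`) that accumulates blocks from a `sounddevice` input stream and writes them out with `soundfile`:

```python
import numpy as np
import sounddevice as sd
import soundfile as sf


class SketchRecorder:
    """Illustrative start/stop WAV recorder; the real Recorder may differ."""

    def __init__(self, sample_rate: int = 16000, channels: int = 1):
        self.sample_rate = sample_rate
        self.channels = channels
        self._blocks = []
        self._stream = None

    def start(self):
        # Each callback invocation appends one block of float32 samples.
        self._blocks = []
        self._stream = sd.InputStream(
            samplerate=self.sample_rate,
            channels=self.channels,
            dtype="float32",
            callback=lambda indata, frames, time, status: self._blocks.append(indata.copy()),
        )
        self._stream.start()

    def stop(self, path: str = "capture.wav") -> str:
        self._stream.stop()
        self._stream.close()
        audio = np.concatenate(self._blocks) if self._blocks else np.zeros((0, self.channels))
        sf.write(path, audio, self.sample_rate)
        return path
```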
- Recording relies on your system audio devices. Use `arecord -l` (Linux) to verify devices. You can list devices programmatically via `Recorder.list_input_devices()`.
- If you need a local/offline ASR, add a new class implementing `ASRBase` and configure `asr.provider` accordingly. The same applies to the LLM (see the sketch below this list).
- The TUI shows high-level status; for detailed logs, add print/logging statements as needed.
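As an example of such a plug-in, a local Whisper backend might look roughly like this. The `ASRBase` interface shown is an assumption (a single `transcribe(path) -> str` method); check the real base class under `chatpilot/asr/` and import it instead of the stand-in.

```python
# Hypothetical offline ASR backend using the openai-whisper package
# (pip install openai-whisper).
from abc import ABC, abstractmethod

import whisper


class ASRBase(ABC):  # stand-in for the project's base class; import the real one instead
    @abstractmethod
    def transcribe(self, audio_path: str) -> str: ...


class WhisperLocalASR(ASRBase):
    def __init__(self, model_name: str = "base", language: str | None = None):
        self.model = whisper.load_model(model_name)
        self.language = language

    def transcribe(self, audio_path: str) -> str:
        result = self.model.transcribe(audio_path, language=self.language)
        return result["text"].strip()
```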
MIT