A completely offline, air-gapped video-to-text transcription system with AI-powered summarization. Designed for sensitive data that requires maximum security with zero internet access.
- 🔒 Completely Air-Gapped - Runs with `--network none`, zero internet access at runtime
- 🌐 Web Interface - Drag-and-drop video uploads with real-time progress
- ✨ AI Summarization - Generate bullet points and paragraph summaries using Llama 2 7B
- 🚀 Dual Engines - Choose between faster-whisper (Python) or whisper.cpp (C++)
- 📦 All Models Bundled - Whisper models + Llama 2 7B (~27 GB total)
- 💾 Settings Persistence - Your preferences are remembered across sessions
- ⚙️ Performance Tuning - Adjust model size, compute type, threads, and more
- 📝 Multiple Formats - Outputs TXT, SRT, VTT, and AI-generated summaries
- 💻 Cross-Platform - Works on Mac (Intel & Apple Silicon) and Linux
- Backend: FastAPI + Uvicorn (single-worker for resource control)
- Transcription Engines:
- faster-whisper (CTranslate2, CPU-optimized)
- whisper.cpp (GGML, optimized for Apple Silicon)
- Summarization: Llama 2 7B Chat via llama.cpp (CPU-only)
- Models:
- Whisper: 5 sizes × 2 engines = 10 variants
- Llama: 1 quantized model (Q4_K_M, ~4 GB)
- Security: No network access, non-root user, read-only filesystem (except /data)
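Concretely, that hardening corresponds to Docker flags along these lines (a sketch only; the project's run script may differ, and the `--read-only`/`--tmpfs` options here are assumptions):

```bash
# Illustrative hardened launch; --read-only and --tmpfs are assumed,
# not confirmed flags from the project's own scripts.
docker run -d \
  --network none \
  --read-only \
  --tmpfs /tmp \
  -v "$(pwd)/data:/data" \
  -p 7860:7860 \
  --name silent-scribe \
  silent-scribe:latest
```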
- 🎬 Demo - START HERE! Simple walkthrough for beginners
- 🚀 Quick Start - Get started in 3 steps
- 🔧 Troubleshooting - Solutions to common problems
- ⚠️ Edge Cases - How scripts handle edge cases automatically
- 🔗 User Flow - Visual diagrams and flowcharts
- 📋 Implementation - Architecture and technical details
- ✅ Test Checklist - Comprehensive testing guide
- Docker (or Docker Desktop for Mac)
- Disk Space: ~30 GB free (final image is ~27 GB)
- RAM:
- 8 GB minimum for transcription only
- 16 GB recommended for transcription + summarization
- CPU: Multi-core recommended (4+ cores ideal)
- Time: First build takes 30-60 minutes (one-time setup)
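Before building, you can verify free space with standard commands:

```bash
df -h .            # free space on the current filesystem
docker system df   # disk used by Docker images, containers, and volumes
```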
1. Clone or navigate to the project:

   ```bash
   cd silent-scribe
   ```

2. Build the Docker image (this downloads and bundles all models):

   ```bash
   ./build
   ```

   ⚠️ Note: Building takes 30-60 minutes and requires ~15-20 GB disk space. This only needs to be done once.

3. Start the application:

   ```bash
   ./start
   ```

4. Open the web UI:
   - Navigate to http://localhost:7860 in your browser

5. Stop the application:

   ```bash
   ./stop  # or just press Ctrl+C
   ```
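To confirm the container is up after starting (a standard Docker check):

```bash
docker ps --filter "name=silent-scribe"
```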
1. Upload a Video
   - Drag and drop a video file (MP4, MOV, MKV, AVI, etc.) or click to browse
   - Supports audio files too (WAV, MP3, etc.)

2. Configure Settings
   - Engine: faster-whisper (recommended) or whisper.cpp
   - Model Size:
     - `tiny` - Fastest, lowest quality
     - `base` - Fast, decent quality
     - `small` - Recommended - Good balance
     - `medium` - Slower, better quality
     - `large-v3` - Slowest, best quality
   - Compute Type (faster-whisper only):
     - `int8` - Fastest, lower quality
     - `int8_float16` - Recommended - Good balance
     - `int16` - Higher quality
     - `float32` - Highest quality, slowest
   - Threads: Number of CPU cores to use
   - Language: Auto-detect or specify (en, es, fr, de, etc.)

3. Start Transcription
   - Click "Start Transcription"
   - Watch the progress bar
   - When complete, view the transcript and download TXT/SRT/VTT files

4. Generate AI Summary (✨ NEW!)
   - After transcription completes, click "✨ Generate Summary"
   - Wait 30-90 seconds (depending on transcript length)
   - View bullet points + paragraph summary in the Summary tab
   - Download summary files separately

5. Stop the Container:

   ```bash
   make stop  # or ./scripts/stop.sh, or just press Ctrl+C
   ```
Silent Scribe now includes local AI-powered summarization using Llama 2 7B Chat.
- Complete a transcription first
- Click the "✨ Generate Summary" button
- Wait while the AI processes your transcript (30-90 seconds)
- Get two summary formats:
- Bullet Points: 5-10 key facts, decisions, and action items
- Paragraph Summary: 150-250 word overview
- 🔒 Completely Offline: Uses local Llama 2 model, no API calls
- 🚀 On-Demand: Only generated when you click the button
- 🧠 Smart Chunking: Handles long transcripts with a map-reduce pass (sketched below)
- 💾 Persistent: Summaries are saved and reload with the page
- 🔒 Concurrent-Safe: Only one task (transcription OR summarization) runs at a time
- Short transcripts (<1000 words): 15-30 seconds
- Medium transcripts (1000-5000 words): 30-60 seconds
- Long transcripts (>5000 words): 60-120 seconds
- Uses CPU only (works on all platforms)
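The map-reduce pass summarizes the transcript chunk by chunk, then summarizes the partial summaries. A conceptual sketch follows; the app does this internally via llama.cpp, so the CLI binary, flags, and chunk size below are illustrative assumptions, not the project's actual code:

```bash
# Illustrative map-reduce summarization; Silent Scribe implements this
# internally. llama-cli, its flags, and the 200-line chunk size are
# assumptions for the sketch.

# Map: split the transcript into chunks and summarize each one.
split -l 200 transcript.txt chunk_
for f in chunk_*; do
  ./llama-cli -m llama-2-7b-chat.Q4_K_M.gguf \
    -p "Summarize the key points of this transcript excerpt: $(cat "$f")" \
    >> partial_summaries.txt
done

# Reduce: merge the partial summaries into one final summary.
./llama-cli -m llama-2-7b-chat.Q4_K_M.gguf \
  -p "Combine these partial summaries into a single overview: $(cat partial_summaries.txt)"
```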
All your settings are now automatically saved:
- Engine preference (faster-whisper/whisper.cpp)
- Model size (tiny/base/small/medium/large-v3)
- Compute type
- Thread count
- Language selection
- Speaker detection preference
The default language is now English instead of auto-detect.
Run detached (background):

```bash
docker run -d \
  -p 7860:7860 \
  -v "$(pwd)/data:/data" \
  --name silent-scribe \
  --network none \
  silent-scribe:latest
```

Run interactively on an alternate port:

```bash
docker run --rm -it \
  -p 8080:7860 \
  -v "$(pwd)/data:/data" \
  --name silent-scribe \
  --network none \
  silent-scribe:latest
```

Inspect the output:

```bash
# List results
ls -la data/results/

# View a transcript
cat data/results/<job-id>/transcript.txt
```

This application is designed for maximum security with sensitive data:
- ✅ No Network Access: Container runs with `--network none`
- ✅ All Models Bundled: No runtime downloads, all models pre-downloaded at build time
- ✅ Non-Root User: Application runs as an unprivileged user
- ✅ Offline-First: `HF_HUB_OFFLINE=1` and `TRANSFORMERS_OFFLINE=1` environment variables set
- ✅ No External Resources: Web UI has no CDN dependencies, fonts, or external scripts
- ✅ Local Processing Only: All data stays on your machine
On Linux:

```bash
# Check network namespaces while container is running
docker inspect silent-scribe | grep NetworkMode
# Should show: "NetworkMode": "none"
```

On Mac:

```bash
# Container should not be able to resolve DNS
docker exec silent-scribe ping -c 1 google.com
# Should fail with "network unreachable"
```
- faster-whisper (CTranslate2 format):
  - tiny (~75 MB)
  - base (~150 MB)
  - small (~500 MB)
  - medium (~1.5 GB)
  - large-v3 (~3 GB)
- whisper.cpp (GGML format):
  - ggml-tiny.bin (~75 MB)
  - ggml-base.bin (~150 MB)
  - ggml-small.bin (~500 MB)
  - ggml-medium.bin (~1.5 GB)
  - ggml-large-v3.bin (~3 GB)
- Llama 2 7B Chat (✨ NEW! for summarization):
  - llama-2-7b-chat.Q4_K_M.gguf (~4 GB, quantized)
  - CPU-optimized with OpenBLAS
  - Runs via llama.cpp (same as whisper.cpp)
- TXT: Plain text transcript
- SRT: SubRip subtitle format (for video players)
- VTT: WebVTT subtitle format (for web players)
- Summary (Bullets): AI-generated key points (✨ NEW!)
- Summary (Paragraph): AI-generated overview (✨ NEW!)
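For reference, SRT output consists of numbered cues with start/end timestamps. An illustrative sample (not real output):

```
1
00:00:00,000 --> 00:00:04,200
Welcome to the quarterly planning meeting.

2
00:00:04,200 --> 00:00:08,500
Let's start with a review of last month's numbers.
```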
Problem: Build fails with "network timeout"

```bash
# Solution: Retry the build
make build
```

Problem: Out of disk space

```bash
# Solution: Clean up Docker
docker system prune -a
```

Problem: Container won't start

```bash
# Check if port is in use
lsof -i :7860

# Use a different port
docker run -p 8080:7860 ... silent-scribe:latest
```

Problem: Transcription fails
- Check that the video file is valid (try playing it first)
- Try a smaller model (tiny or base)
- Check available RAM (`docker stats silent-scribe`)
Problem: Out of memory
- Use a smaller model (tiny, base, or small)
- Reduce threads setting
- Close other applications
Problem: Apple Silicon performance
- The image includes optimizations for both Intel and ARM
- whisper.cpp is particularly fast on Apple Silicon
- faster-whisper works well but is CPU-only
Problem: Docker Desktop memory limit
- Go to Docker Desktop → Settings → Resources
- Increase Memory to at least 8 GB (16 GB recommended)
Problem: Permission denied on /data

```bash
# Fix permissions
sudo chown -R $(id -u):$(id -g) data/
```
1. Choose the Right Model:
   - For speed: `tiny` or `base`
   - For quality: `medium` or `large-v3`
   - For balance: `small` ✅

2. Optimize Threads:
   - Set to the number of CPU cores (see the commands below)
   - Don't exceed physical cores

3. Compute Type (faster-whisper):
   - `int8_float16` is the best balance ✅
   - `int8` for maximum speed
   - `float32` for maximum quality

4. Engine Choice:
   - faster-whisper: Better progress tracking, more tunable
   - whisper.cpp: Faster on Apple Silicon
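To check your core count for the threads setting (standard commands on the platforms this project supports):

```bash
nproc                      # Linux: logical CPU count
sysctl -n hw.ncpu          # macOS: logical CPU count
sysctl -n hw.physicalcpu   # macOS: physical core count
```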
```
silent-scribe/
├── app/
│   ├── backend/
│   │   ├── main.py                # FastAPI application
│   │   ├── config.py              # Configuration
│   │   ├── job_manager.py         # Job management
│   │   └── engines/
│   │       ├── faster_whisper_engine.py
│   │       └── whisper_cpp_engine.py
│   └── templates/
│       └── index.html             # Web UI
├── docker/
│   └── Dockerfile                 # Multi-stage build
├── scripts/
│   ├── build.sh                   # Build script
│   ├── run.sh                     # Run script
│   ├── stop.sh                    # Stop script
│   └── fetch_models.py            # Model fetcher (build-time)
├── data/                          # Mounted volume
│   ├── uploads/                   # Uploaded videos
│   └── results/                   # Transcription results
├── requirements.txt               # Python dependencies
├── Makefile                       # Make shortcuts
└── README.md                      # This file
```
This is a self-contained, air-gapped application. Modifications should maintain:
- Zero runtime network access
- All dependencies bundled at build time
- Security-first design
This project uses the following open-source components:
- faster-whisper (MIT License)
- whisper.cpp (MIT License)
- llama.cpp (MIT License)
- FastAPI (MIT License)
- OpenAI Whisper models (MIT License)
- Llama 2 (Meta's Llama 2 Community License Agreement)
Note on Llama 2 Usage: This project uses Llama 2 for offline summarization. The model is downloaded at build time and runs completely locally. Usage complies with Meta's license for internal tooling and offline applications.
Built on top of excellent open-source projects:
- faster-whisper by Guillaume Klein
- whisper.cpp by Georgi Gerganov
- llama.cpp by Georgi Gerganov
- OpenAI Whisper by OpenAI
- Llama 2 by Meta AI
Silent Scribe - Transcribing in silence, secure and offline. 🤫🔒