fredh2006/zinc

Zinc - AI Video Editor

🏆 Third Place at HackAI x Stan

AI-powered video editor with automatic transcription, timeline editing, and advanced effects like text-behind-person overlays.

Overview

Zinc provides a complete video editing workflow:

  • Automatic Transcription using faster-whisper with word-level timestamps
  • Timeline-Based Editing with visual clip management
  • AI-Assisted Editing using Claude/Gemini for intelligent suggestions
  • LangGraph Agent for orchestrating complex video editing workflows
  • Text Behind Person effect using MediaPipe segmentation
  • Color Grading with preset looks
  • Remotion Integration for advanced video compositions

Features

  • Transcription: Upload videos and get accurate transcripts with word-level timing
  • Timeline Editor: Visual timeline for arranging clips with drag-and-drop
  • Text Behind Person: Overlay text that appears behind the subject using real-time person segmentation
  • Color Grades: Apply cinematic color presets to your videos
  • Streaming Subtitles: Word-by-word subtitle rendering synced to speech
  • Audio Tracks: Add background music with volume control
  • Transitions: Fade, dissolve, and other transition effects between clips
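
As a rough illustration of how the Transitions and Audio Tracks features could be expressed with FFmpeg, the sketch below builds `-filter_complex` strings from FFmpeg's `xfade`, `volume`, and `amix` filters. The function names and default values are hypothetical, and nothing here is confirmed to be how Zinc implements these features.

```python
def xfade_filter(first_duration: float, transition: str = "fade",
                 duration: float = 0.5) -> str:
    """Video crossfade between inputs 0 and 1.

    The transition starts `duration` seconds before the first clip ends,
    so the offset is the first clip's length minus the transition length.
    """
    offset = max(first_duration - duration, 0.0)
    return (f"[0:v][1:v]xfade=transition={transition}"
            f":duration={duration:g}:offset={offset:g}[v]")


def music_mix_filter(music_volume: float = 0.3) -> str:
    """Lower the background track (input 1) and mix it under input 0's audio."""
    return (f"[1:a]volume={music_volume:g}[bg];"
            f"[0:a][bg]amix=inputs=2:duration=first[a]")
```

Either string would be passed to `ffmpeg -filter_complex`, with the labelled output (`[v]` or `[a]`) selected through `-map`.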

Tech Stack

  • Frontend: Next.js 16, React 19, Tailwind CSS, Remotion Player
  • Backend: FastAPI, faster-whisper, FFmpeg, MediaPipe
  • AI: Anthropic Claude SDK, Google Gemini SDK
  • Workflow: LangGraph for orchestrating video editing operations
  • Video Processing: FFmpeg for compositing, Remotion for effects

LangGraph Agent

The video editing pipeline is orchestrated by a LangGraph agent that manages the entire workflow:

┌─────────────────────────────────────────────────────────────────┐
│                      LangGraph Workflow                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐         │
│   │   START     │───▶│  Extract    │───▶│ Concatenate │         │
│   │             │    │   Clips     │    │             │         │
│   └─────────────┘    └─────────────┘    └─────────────┘         │
│                            │                   │                │
│                            ▼                   ▼                │
│                    ┌───────────────┐   ┌─────────────┐          │
│                    │ For each clip │   │  Mix Audio  │          │
│                    │  (parallel):  │   │  Tracks     │          │
│                    │ • FFmpeg cut  │   └─────────────┘          │
│                    │ • Color grade │          │                 │
│                    │ • Subtitles   │          ▼                 │
│                    │ • Overlays    │   ┌─────────────┐          │
│                    │ • Transitions │   │   Cleanup   │───▶ END  │
│                    └───────────────┘   └─────────────┘          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Nodes:

  • Extract Clips: Processes each clip in parallel, applying FFmpeg cuts, color grading, streaming subtitles, text-behind-person overlays, and speed adjustments
  • Concatenate: Joins all processed clips with transitions and mixes in audio tracks
  • Cleanup: Removes temporary files
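
The order of the three nodes can be sketched without LangGraph at all. The example below is a plain standard-library stand-in, not the project's agent: it cuts clips in parallel with a thread pool, joins them with FFmpeg's concat demuxer, and removes the temporary directory. `Clip`, `cut_command`, `concat_command`, and `run_pipeline` are hypothetical names, and color grading, subtitles, overlays, and transitions are left out.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Clip:
    source: str   # path to the source video
    start: float  # seconds
    end: float    # seconds


def cut_command(clip: Clip, out_path: str) -> List[str]:
    """FFmpeg arguments that cut [start, end) from the source (default codecs)."""
    duration = clip.end - clip.start
    return ["ffmpeg", "-y", "-ss", f"{clip.start:.3f}", "-i", clip.source,
            "-t", f"{duration:.3f}", out_path]


def concat_command(list_file: str, out_path: str) -> List[str]:
    """FFmpeg arguments for the concat demuxer (stream copy, no transitions)."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", out_path]


def run_pipeline(clips: Sequence[Clip], out_path: str,
                 run: Callable[[List[str]], None]) -> List[str]:
    """Extract clips in parallel, concatenate them, then clean up."""
    workdir = tempfile.mkdtemp(prefix="zinc_")
    parts = [os.path.join(workdir, f"part_{i:03d}.mp4")
             for i in range(len(clips))]

    # Extract Clips: one FFmpeg cut per clip, run concurrently.
    with ThreadPoolExecutor() as pool:
        list(pool.map(lambda pair: run(cut_command(*pair)), zip(clips, parts)))

    # Concatenate: the concat demuxer reads a text file listing the parts.
    list_file = os.path.join(workdir, "parts.txt")
    with open(list_file, "w") as fh:
        fh.writelines(f"file '{p}'\n" for p in parts)
    run(concat_command(list_file, out_path))

    # Cleanup: remove the temporary directory and everything in it.
    shutil.rmtree(workdir)
    return parts
```

In real use, `run` could be `lambda cmd: subprocess.run(cmd, check=True)`. Passing it in as a parameter keeps the sketch testable without FFmpeg installed.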

Quick Start

Prerequisites

  • Python 3.9+
  • Node.js 18+
  • FFmpeg

1. Clone and Setup

git clone https://github.com/fredh2006/zinc.git
cd zinc

2. Backend Setup

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start the server
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

3. Frontend Setup

# In a new terminal, from the repository root
cd frontend
npm install

# Configure environment (optional)
cp .env.local.example .env.local

# Start dev server
npm run dev

4. Install FFmpeg

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html

Visit http://localhost:3000 to start editing!

Usage

1. Upload Videos

Upload one or more video files. They'll be automatically transcribed with word-level timestamps.
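
For orientation, word-level timestamps can be obtained from faster-whisper roughly as follows. This is a minimal sketch, not the backend's actual code: `transcribe_words` and `flatten_words` are hypothetical helpers, and the output dictionary keys are an assumption.

```python
from typing import Any, Dict, Iterable, List


def flatten_words(segments: Iterable[Any]) -> List[Dict[str, Any]]:
    """Collect the per-word timestamps of all segments into one flat list."""
    words = []
    for segment in segments:
        # `words` is only populated when word_timestamps=True was requested.
        for word in segment.words or []:
            words.append({"text": word.word.strip(),
                          "start": word.start,
                          "end": word.end})
    return words


def transcribe_words(path: str, model_size: str = "base") -> List[Dict[str, Any]]:
    """Transcribe a media file and return one dict per spoken word."""
    # Imported lazily so flatten_words can be used without the dependency.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="auto", compute_type="auto")
    segments, _info = model.transcribe(path, word_timestamps=True)
    return flatten_words(segments)
```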

2. Edit on Timeline

  • Drag clips to rearrange
  • Trim clips by adjusting start/end times
  • Select transcript segments to create clips
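
Creating a clip from a transcript selection amounts to taking the start time of the first selected word and the end time of the last. The sketch below assumes words are dictionaries with `start` and `end` keys in seconds. `clip_from_selection` and its `padding` parameter are hypothetical and not taken from the Zinc codebase.

```python
from typing import Dict, Sequence, Tuple


def clip_from_selection(words: Sequence[Dict[str, float]], first: int,
                        last: int, padding: float = 0.1) -> Tuple[float, float]:
    """Return (start, end) in seconds covering words[first] through words[last].

    A little padding on each side avoids clipping the first and last syllable;
    the start is clamped so it never goes below zero.
    """
    if not 0 <= first <= last < len(words):
        raise IndexError("selection is outside the transcript")
    start = max(words[first]["start"] - padding, 0.0)
    end = words[last]["end"] + padding
    return start, end
```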

3. Add Effects

Text Behind Person: Add text overlays that appear behind the subject using AI-powered person segmentation. Customize:

  • Text content and position
  • Font size and color
  • Feathering and threshold for segmentation quality

Color Grading: Apply preset color grades to change the mood of your video.
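
One plausible way to express a preset look is as a set of parameters for FFmpeg's `eq` filter. The preset names and values below are invented for illustration and are not Zinc's actual presets.

```python
# Hypothetical presets: each maps to options of FFmpeg's `eq` video filter.
PRESETS = {
    "warm": {"contrast": 1.05, "saturation": 1.2, "gamma_r": 1.1, "gamma_b": 0.9},
    "noir": {"contrast": 1.3, "saturation": 0.0},
}


def grade_filter(name: str) -> str:
    """Build the `-vf` filter string for a named preset."""
    params = PRESETS[name]
    return "eq=" + ":".join(f"{key}={value:g}" for key, value in params.items())
```

The result would be used as `ffmpeg -i in.mp4 -vf "<filter>" out.mp4`.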

Subtitles: Enable streaming subtitles that highlight words as they're spoken.
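
The word-by-word behavior can be derived from the transcript's word timestamps alone. The following sketch assumes each word is a dictionary with `text`, `start`, and `end` keys. `subtitle_state` is a hypothetical helper, and the real renderer may group or style words differently.

```python
from bisect import bisect_right
from typing import Any, Dict, Optional, Sequence, Tuple


def subtitle_state(words: Sequence[Dict[str, Any]],
                   t: float) -> Tuple[str, Optional[int]]:
    """Return the text revealed so far and the index of the word being spoken.

    Words must be sorted by start time. The active index is None during
    pauses between words and before the first word.
    """
    starts = [w["start"] for w in words]
    shown = bisect_right(starts, t)  # number of words with start <= t
    if shown == 0:
        return "", None
    active = shown - 1 if t < words[shown - 1]["end"] else None
    return " ".join(w["text"] for w in words[:shown]), active
```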

4. Export

Process your edit plan and download the final video.

Project Structure

zinc/
├── backend/                    # FastAPI backend
│   ├── app/
│   │   ├── main.py            # API endpoints (transcription, editing)
│   │   ├── video_editor.py    # FFmpeg-based video processing
│   │   ├── ffmpeg_compositor.py
│   │   └── person_segmentation.py
│   └── requirements.txt
│
├── frontend/                   # Next.js frontend
│   ├── app/
│   │   ├── page.tsx           # Main editor interface
│   │   ├── video-editor/      # Video editor page
│   │   ├── components/        # React components
│   │   ├── lib/               # Utilities and API client
│   │   └── api/               # API routes
│   └── package.json
│
├── remotion-renderer/          # Remotion video effects
│   └── src/
│
├── packages/                   # Shared packages
│   └── remotion-shared/
│
└── Documentation/              # Additional docs

API Endpoints

| Endpoint          | Method | Description                   |
|-------------------|--------|-------------------------------|
| /health           | GET    | Health check                  |
| /transcribe       | POST   | Transcribe a video/audio file |
| /transcribe/batch | POST   | Transcribe multiple files     |
| /edit/process     | POST   | Process an edit plan          |
| /audio/upload     | POST   | Upload background audio       |
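
A small standard-library client for two of these endpoints might look like the sketch below. The request and response formats are not documented here, so the JSON body for `/edit/process` and the JSON response parsing are assumptions, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

DEFAULT_BASE_URL = "http://localhost:8000"


def health_request(base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build a GET request for the health check endpoint."""
    return urllib.request.Request(base_url.rstrip("/") + "/health", method="GET")


def edit_request(plan: Dict[str, Any],
                 base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build a POST request for /edit/process (assumes a JSON edit plan)."""
    body = json.dumps(plan).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/edit/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send a request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(health_request())` would perform the health check.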

Environment Variables

Backend (backend/.env)

WHISPER_MODEL=base          # tiny, base, small, medium, large-v2, large-v3
WHISPER_DEVICE=auto         # auto, cpu, cuda
WHISPER_COMPUTE_TYPE=auto   # auto, int8, float16, float32
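
On the Python side, these variables could be read along the following lines. This is a sketch that assumes the defaults match the values shown above and that the variables are already present in the process environment; Python does not load a `.env` file on its own. `WhisperSettings` and `load_whisper_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WhisperSettings:
    model: str
    device: str
    compute_type: str


def load_whisper_settings(env: Optional[Mapping[str, str]] = None) -> WhisperSettings:
    """Read the Whisper settings, falling back to the documented example values."""
    env = os.environ if env is None else env
    return WhisperSettings(
        model=env.get("WHISPER_MODEL", "base"),
        device=env.get("WHISPER_DEVICE", "auto"),
        compute_type=env.get("WHISPER_COMPUTE_TYPE", "auto"),
    )
```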

Frontend (frontend/.env.local)

NEXT_PUBLIC_API_URL=http://localhost:8000

Text Behind Person Effect

The text-behind-person feature uses MediaPipe's selfie segmentation to create a mask of the person in each video frame. The text is drawn over the background, and the masked person pixels are then layered back on top, so the text appears to sit behind the subject.

Parameters:

  • threshold: Segmentation confidence threshold (0-1)
  • feather_px: Edge softening in pixels
  • temporal_smoothing: Frame-to-frame smoothing for stable masks
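
To make the roles of `threshold` and `temporal_smoothing` concrete, here is a dependency-free sketch on tiny nested lists. It assumes `temporal_smoothing` is a weight between 0 and 1 applied as an exponential moving average, which the source does not confirm. Feathering (`feather_px`) is omitted, since it would require blurring the mask and blending instead of switching layers per pixel. Both function names are hypothetical.

```python
from typing import List, Optional

Mask = List[List[float]]  # per-pixel person confidence in [0, 1]


def smooth_mask(previous: Optional[Mask], current: Mask,
                temporal_smoothing: float) -> Mask:
    """Blend this frame's mask with the previous smoothed mask (moving average)."""
    if previous is None:
        return current
    k = temporal_smoothing
    return [[k * p + (1.0 - k) * c for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def composite(frame: List[list], text_layer: List[list],
              mask: Mask, threshold: float) -> List[list]:
    """Place text behind the person.

    `text_layer` holds None where there is no text and a pixel value elsewhere.
    Text is shown only where the person confidence is below the threshold.
    """
    output = []
    for frame_row, text_row, mask_row in zip(frame, text_layer, mask):
        row = []
        for pixel, text, confidence in zip(frame_row, text_row, mask_row):
            if confidence >= threshold or text is None:
                row.append(pixel)  # person pixel, or background without text
            else:
                row.append(text)   # text over background
        output.append(row)
    return output
```

A production version would typically run the same logic on NumPy arrays rather than Python lists.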

Development

Run Backend Tests

cd backend
pytest

Format Code

# Backend
cd backend
black app/
isort app/

# Frontend
cd frontend
npm run lint

License

MIT

Contributing

Contributions welcome! Please open an issue first to discuss what you'd like to change.
