Video Transcriber 🎥➡️📝

一个强大的短视频转文本工具，支持抖音、B站等主流平台的视频链接转录。

✨ 特性

🎯 多平台支持: 支持抖音、B站等主流短视频平台
📱 抖音扫码登录: 一键扫码获取cookies，解决反爬虫限制
🤖 高精度转录: 基于OpenAI Whisper，准确率95%+
🔒 隐私保护: 本地处理，数据不外泄
🌐 多种接口: 命令行、Web API、WebSocket
⚡ 批量处理: 支持多个视频同时转录
🎵 智能音频: 自动提取和优化音频质量
📝 多种格式: 支持JSON、TXT、SRT、VTT输出
🔄 实时状态: WebSocket实时显示处理进度

🚀 快速开始

环境要求

Python 3.8+
FFmpeg (用于音视频处理)
4GB+ RAM (推荐8GB以上)
GPU (可选，用于加速)

安装

克隆项目

git clone https://github.com/yourusername/video-transcriber.git
cd video-transcriber

安装依赖

# 创建虚拟环境
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 安装Python依赖
pip install -r requirements.txt

# 安装浏览器（用于抖音扫码登录）
playwright install

# 安装FFmpeg (Ubuntu/Debian)
sudo apt update
sudo apt install ffmpeg

# 安装FFmpeg (macOS)
brew install ffmpeg

# 安装FFmpeg (Windows)
# 下载并安装: https://ffmpeg.org/download.html

首次运行

# 命令行使用
python main.py --url "https://v.douyin.com/xxxxx"

# 启动Web服务
python -m uvicorn api.main:app --reload

📖 使用方法

命令行使用

# 基础转录
python main.py --url "https://v.douyin.com/xxxxx"

# 指定Whisper模型
python main.py --url "https://v.douyin.com/xxxxx" --model small

# 包含时间戳
python main.py --url "https://v.douyin.com/xxxxx" --timestamps

# 批量处理
python main.py --batch urls.txt

# 指定输出格式
python main.py --url "https://v.douyin.com/xxxxx" --format srt

Web API使用

# 启动API服务
uvicorn api.main:app --host 0.0.0.0 --port 8000

# 访问API文档
# http://localhost:8000/docs

import requests

# 转录视频
response = requests.post("http://localhost:8000/api/v1/transcribe", json={
    "url": "https://v.douyin.com/xxxxx",
    "options": {"model": "small", "with_timestamps": True}
})

result = response.json()
print(result["data"]["transcription"]["text"])

Web界面使用

访问 http://localhost:8000 使用简洁的Web界面进行转录。

抖音扫码登录

为了更好地解析抖音视频，建议先进行扫码登录：

在Web界面点击“开始扫码登录”
使用手机抖音APP扫描二维码
在手机上确认登录
系统自动保存cookies，提升视频解析成功率

详细使用指南请参考：docs/douyin_qr_login_guide.md

🛠️ 配置选项

Whisper模型选择

模型	大小	速度	准确率	推荐场景
tiny	39MB	最快	一般	快速预览
base	74MB	快	良好	日常使用
small	244MB	中等	很好	推荐
medium	769MB	慢	优秀	高质量需求
large	1550MB	最慢	最佳	专业场景

环境变量配置

创建 .env 文件:

# 服务配置
HOST=0.0.0.0
PORT=8000
DEBUG=false

# Whisper配置
DEFAULT_MODEL=small
ENABLE_GPU=true

# 文件配置
TEMP_DIR=./temp
MAX_FILE_SIZE=100MB
CLEANUP_AFTER=3600

# 日志配置
LOG_LEVEL=INFO
LOG_FILE=./logs/app.log

📁 项目结构

video-transcriber/
├── 📄 README.md
├── 📄 requirements.txt
├── 📄 main.py                  # 命令行入口
├── 📁 api/                     # Web API
│   ├── 📄 main.py             # FastAPI应用
│   ├── 📄 routes.py           # API路由
│   └── 📄 websocket.py        # WebSocket处理
├── 📁 core/                    # 核心模块
│   ├── 📄 __init__.py
│   ├── 📄 engine.py           # 核心引擎
│   ├── 📄 parser.py           # 链接解析
│   ├── 📄 downloader.py       # 视频下载
│   └── 📄 transcriber.py      # 语音转录
├── 📁 models/                  # 数据模型
│   ├── 📄 __init__.py
│   └── 📄 schemas.py          # Pydantic模型
├── 📁 utils/                   # 工具函数
│   ├── 📄 __init__.py
│   ├── 📄 logger.py           # 日志工具
│   └── 📄 helpers.py          # 辅助函数
├── 📁 web/                     # Web前端
│   ├── 📄 index.html
│   ├── 📄 style.css
│   └── 📄 script.js
├── 📁 tests/                   # 测试文件
│   ├── 📄 test_core.py
│   ├── 📄 test_api.py
│   └── 📄 test_integration.py
├── 📁 docs/                    # 文档
│   ├── 📄 technical_specification.md
│   └── 📄 api_documentation.md
└── 📁 docker/                  # Docker配置
    ├── 📄 Dockerfile
    └── 📄 docker-compose.yml

🎯 支持的平台

平台	域名	状态	备注
抖音	douyin.com	✅	支持扫码登录
B站	bilibili.com	✅	完全支持
快手	kuaishou.com	🚧	开发中
小红书	xiaohongshu.com	📋	计划中

⚡ 性能指标

处理速度 (基于Whisper Small模型)

短视频 (0-1分钟): ~10-20秒
中等视频 (1-5分钟): ~30-60秒
长视频 (5-10分钟): ~1-3分钟

准确率

中文: 95%+
英文: 97%+
中英混合: 92%+

资源消耗

CPU: 2-4核推荐
内存: 4GB+ (Small模型)
GPU: 可选，3倍加速效果

🔧 开发指南

开发环境搭建

# 安装开发依赖
pip install -r requirements.txt

# 安装pre-commit钩子
pre-commit install

# 运行测试
pytest

# 代码格式化
black .
isort .

# 类型检查
mypy .

添加新平台支持

在 core/parser.py 中添加新的解析器
更新 models/schemas.py 中的平台枚举
添加对应的测试用例
更新文档

🐛 故障排除

常见问题

1. FFmpeg未找到

# 确认FFmpeg已安装
ffmpeg -version

# Ubuntu/Debian安装
sudo apt install ffmpeg

# 添加到PATH环境变量
export PATH=$PATH:/path/to/ffmpeg

2. 视频下载失败

检查网络连接
确认视频链接有效
更新yt-dlp版本: pip install -U yt-dlp

3. 转录准确率低

尝试更大的Whisper模型
检查音频质量
确认语言设置正确

4. 内存不足

使用更小的Whisper模型 (tiny/base)
分段处理长视频
增加系统内存

性能优化

1. GPU加速

# 安装CUDA支持的PyTorch
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118

2. 模型缓存

# 预下载模型
import whisper
model = whisper.load_model("small")

🤝 贡献指南

欢迎贡献代码！请遵循以下步骤：

Fork 项目
创建特性分支 (git checkout -b feature/AmazingFeature)
提交改动 (git commit -m 'Add some AmazingFeature')
推送到分支 (git push origin feature/AmazingFeature)
开启 Pull Request

📄 许可证

本项目采用 MIT 许可证 - 查看 LICENSE 文件了解详情。

🙏 致谢

OpenAI Whisper - 强大的语音识别模型
yt-dlp - 视频下载工具
FastAPI - 现代Web框架

📞 联系方式

项目链接: https://github.com/yourusername/video-transcriber
问题反馈: Issues

如果这个项目对你有帮助，请给个 ⭐ Star 支持一下！

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
api		api
core		core
docker		docker
docs		docs
models		models
tests		tests
utils		utils
web		web
.env.example		.env.example
.gitignore		.gitignore
FINAL_TEST_REPORT.md		FINAL_TEST_REPORT.md
PROJECT_SUMMARY.md		PROJECT_SUMMARY.md
README.md		README.md
cookies.txt		cookies.txt
cookies_example.txt		cookies_example.txt
debug_user_video.py		debug_user_video.py
main.py		main.py
pytest.ini		pytest.ini
requirements.txt		requirements.txt
run_tests.py		run_tests.py
start_api.py		start_api.py
test_api.py		test_api.py
test_douyin_simulation.py		test_douyin_simulation.py
test_report.md		test_report.md
test_user_video_real.py		test_user_video_real.py
test_with_cookies.py		test_with_cookies.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Video Transcriber 🎥➡️📝

✨ 特性

🚀 快速开始

环境要求

安装

📖 使用方法

命令行使用

Web API使用

Web界面使用

抖音扫码登录

🛠️ 配置选项

Whisper模型选择

环境变量配置

📁 项目结构

🎯 支持的平台

⚡ 性能指标

处理速度 (基于Whisper Small模型)

准确率

资源消耗

🔧 开发指南

开发环境搭建

添加新平台支持

🐛 故障排除

常见问题

性能优化

🤝 贡献指南

📄 许可证

🙏 致谢

📞 联系方式

About

Uh oh!

Releases

Packages

Languages

Acclerate/video-transcriber

Folders and files

Latest commit

History

Repository files navigation

Video Transcriber 🎥➡️📝

✨ 特性

🚀 快速开始

环境要求

安装

📖 使用方法

命令行使用

Web API使用

Web界面使用

抖音扫码登录

🛠️ 配置选项

Whisper模型选择

环境变量配置

📁 项目结构

🎯 支持的平台

⚡ 性能指标

处理速度 (基于Whisper Small模型)

准确率

资源消耗

🔧 开发指南

开发环境搭建

添加新平台支持

🐛 故障排除

常见问题

性能优化

🤝 贡献指南

📄 许可证

🙏 致谢

📞 联系方式

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages