A full-stack automated podcast service that collects the top 3 trending Hugging Face papers every morning, summarizes them with Gemini Pro, converts the summaries to speech with Google TTS, uploads the audio to Google Cloud Storage, and makes it available for playback and download on a web platform.
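The stage order described above (collect, summarize, synthesize, upload) can be sketched as a small orchestration function. This is only an illustration under stated assumptions: `Episode`, `run_pipeline`, and the four stage callables are hypothetical names, not the project's actual API, and the stand-in stages make no network calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Episode:
    """Hypothetical container for one paper's processed output."""
    title: str
    summary: str
    audio_url: str


def run_pipeline(
    collect: Callable[[int], List[str]],
    summarize: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    upload: Callable[[str, bytes], str],
    top_n: int = 3,
) -> List[Episode]:
    """Run collect -> summarize -> synthesize -> upload for each paper."""
    episodes = []
    for title in collect(top_n):
        summary = summarize(title)            # text summary of the paper
        audio = synthesize(summary)           # audio bytes for the summary
        url = upload(f"{title}.mp3", audio)   # public or signed URL of the MP3
        episodes.append(Episode(title, summary, url))
    return episodes


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any external service.
    demo = run_pipeline(
        collect=lambda n: [f"paper-{i}" for i in range(1, n + 1)],
        summarize=lambda title: f"Summary of {title}",
        synthesize=lambda text: text.encode("utf-8"),
        upload=lambda name, data: f"gs://example-bucket/{name}",
    )
    for episode in demo:
        print(episode.title, episode.audio_url)
```

Passing the stages in as callables keeps each step replaceable in tests, which is one reasonable way to structure such a pipeline; the real project may organize it differently.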
- Automatic collection: collects the top 3 trending Hugging Face papers every day at 6 AM (KST)
- AI summarization: generates Korean summaries with Google Gemini Pro
- Three-line summary: condenses the key points of each paper into three lines
- TTS conversion: generates high-quality speech with Google Cloud Text-to-Speech
- Cloud storage: uploads MP3 files to Google Cloud Storage
- Full-stack web platform: FastAPI backend + Next.js frontend
- Responsive UI: user interface optimized for mobile and desktop
- Advanced audio player: play/pause, volume control, seeking
- Paper viewer: direct ArXiv PDF links and embed support
- Smart links: paper detail pages, links to the original paper, episode navigation
- Full automation: unattended operation via GitHub Actions
- Slack notifications: success/failure notifications that include a link to the web page
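As an illustration of the Slack notification item, the sketch below builds the JSON body for a Slack Incoming Webhook, which accepts a `text` field. The function name, the message wording, and the example URL usage are assumptions; the project's actual notification format is not shown in this README. The sketch only builds the payload and does not send it.

```python
import json


def build_slack_payload(success: bool, page_url: str, detail: str = "") -> dict:
    """Build a minimal Incoming Webhook payload (hypothetical message format)."""
    status = "succeeded" if success else "failed"
    text = f"PaperCast daily run {status}. Web page: {page_url}"
    if detail:
        # Optional extra line, for example an error message on failure.
        text += f"\n{detail}"
    return {"text": text}


if __name__ == "__main__":
    payload = build_slack_payload(True, "https://papercast.vercel.app")
    # This JSON string would be POSTed to the URL stored in SLACK_WEBHOOK_URL.
    print(json.dumps(payload))
```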
In Repository → Settings → Secrets and variables → Actions, set the following Secrets:
| Secret Name | Description |
|---|---|
| GEMINI_API_KEY | Google Gemini API key |
| GCP_SERVICE_ACCOUNT_KEY | GCP Service Account JSON (base64) |
| GCP_PROJECT_ID | Google Cloud project ID |
| GCS_BUCKET_NAME | Google Cloud Storage bucket name |
| DATABASE_URL | PostgreSQL database URL |
| VERCEL_TOKEN | Vercel API token |
| VERCEL_ORG_ID | Vercel organization ID |
| VERCEL_PROJECT_ID | Vercel project ID |
| SLACK_WEBHOOK_URL | Slack webhook URL (optional) |
- Daily at 6 AM KST: automatic run
- Manual run: Repository → Actions → Daily Podcast Generation → Run workflow
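GitHub Actions evaluates `schedule` cron expressions in UTC, so a 6 AM KST run has to be expressed as a UTC time. The sketch below shows the conversion; the helper name is hypothetical and the workflow's actual cron line is not shown in this README. KST is modeled as a fixed UTC+9 offset, since Korea does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# KST is a fixed offset of UTC+9 (no daylight saving time).
KST = timezone(timedelta(hours=9), name="KST")


def kst_to_utc_cron(hour_kst: int, minute: int = 0) -> str:
    """Return a five-field daily cron expression in UTC for a KST wall-clock time."""
    # The date is arbitrary because the offset never changes.
    local = datetime(2024, 1, 1, hour_kst, minute, tzinfo=KST)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


print(kst_to_utc_cron(6))  # 0 21 * * *
```

Note that 6 AM KST falls on the previous calendar day in UTC (21:00), which matters if a day-of-week field is ever added to the expression.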
- Frontend: https://papercast.vercel.app
- Backend: https://papercast-backend-xxx-uc.a.run.app
- API docs: https://papercast-backend-xxx-uc.a.run.app/docs
- Python 3.12 or later
- Node.js 18 or later
- uv (Python package manager)
- npm or yarn (Node.js package manager)
- Google Cloud Platform account
- GitHub account
- Clone the repository:
git clone https://github.com/hanseungsoo13/papercast.git
cd papercast
- Install uv (if not already installed):
# Linux/Mac
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
# Or via pip
pip install uv
- Install dependencies with uv:
# Automatically create the virtual environment and install dependencies
uv sync
# Or install with development dependencies included
uv sync --dev
- Configure environment:
Create a .env file in the project root:
# Create the .env file
touch .env
Enter the following in the .env file:
# Google Gemini API Key (required)
# Get one at: https://makersuite.google.com/app/apikey
GEMINI_API_KEY=your_gemini_api_key_here
# Google Cloud Service Account (required)
# Create a Service Account in the GCP Console, then download its JSON key
GOOGLE_APPLICATION_CREDENTIALS=./credentials/service-account.json
# Google Cloud Storage Bucket Name (required)
GCS_BUCKET_NAME=papercast-podcasts
# Optional: other settings
TZ=Asia/Seoul
LOG_LEVEL=INFO
PAPERS_TO_FETCH=3
PODCAST_TITLE_PREFIX=Daily AI Papers
Save the Service Account JSON key:
# Create the credentials directory
mkdir -p credentials
# Save the JSON key downloaded from the GCP Console
# (e.g. service-account.json)
cp ~/Downloads/your-service-account-key.json credentials/service-account.json
- Verify the configuration (recommended):
# Verify the configuration with uv
uv run python check_config.py
# Or run directly
python check_config.py
- Run locally:
Run the full-stack development servers:
# Combined launch script (recommended)
./scripts/run-fullstack.sh
# Or run each server separately
# API server (terminal 1)
uv run uvicorn api.main:app --host 0.0.0.0 --port 8001 --reload
# Frontend server (terminal 2)
cd frontend && npm run dev
Run the podcast generation pipeline:
# Run with uv (recommended)
uv run python src/main.py
# Or run as a module
uv run python -m src.main
Tip (development environment): use ./scripts/run-fullstack.sh for full-stack development.
Tip (podcast generation): use uv run python src/main.py.
# Run unit tests with uv
uv run pytest tests/unit/ -v
# With coverage
uv run pytest tests/unit/ -v --cov=src --cov-report=html
# Run contract tests (real API calls or mocks)
uv run pytest tests/contract/ -v --run-contract-tests
# Skip contract tests (default)
uv run pytest tests/contract/ -v
# Run integration tests
uv run pytest tests/integration/ -v
# Run only the full-pipeline test
uv run pytest tests/integration/test_pipeline.py::TestPipelineIntegration::test_full_pipeline_end_to_end -v
# Run all tests (excluding contract tests)
uv run pytest -v
# Run all tests, including contract tests
uv run pytest -v --run-contract-tests
After running the tests, open htmlcov/index.html in a browser to view the coverage report.
Detailed setup guides:
- API key setup guide - Gemini API and Service Account setup
- GCP setup guide - enabling the Text-to-Speech and Storage APIs
- GitHub Actions setup guide - configuring automatic runs
- Slack notification setup guide - GitHub Actions → Slack notifications
Using the automated setup script:
# Run the script with uv
uv run ./setup_env.sh
# Or run directly
./setup_env.sh
This script automatically creates the .env file and the required directories.
- GEMINI_API_KEY: Google Gemini API key
- GOOGLE_APPLICATION_CREDENTIALS: Path to GCP service account JSON
- GCS_BUCKET_NAME: Google Cloud Storage bucket name
- TZ: Timezone (default: Asia/Seoul)
- LOG_LEVEL: Logging level (default: INFO)
- PODCAST_TITLE_PREFIX: Podcast title prefix
- PAPERS_TO_FETCH: Number of papers to fetch (default: 3)
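A minimal sketch of how these variables might be read and validated in Python follows. `Settings` and `load_settings` are hypothetical names, not the project's actual config module, and the fallback for PODCAST_TITLE_PREFIX is an assumption taken from the sample .env value rather than a documented default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables the README marks as required.
REQUIRED = ("GEMINI_API_KEY", "GOOGLE_APPLICATION_CREDENTIALS", "GCS_BUCKET_NAME")


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; not the project's real config class."""
    gemini_api_key: str
    google_application_credentials: str
    gcs_bucket_name: str
    tz: str
    log_level: str
    papers_to_fetch: int
    podcast_title_prefix: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        google_application_credentials=env["GOOGLE_APPLICATION_CREDENTIALS"],
        gcs_bucket_name=env["GCS_BUCKET_NAME"],
        tz=env.get("TZ", "Asia/Seoul"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        papers_to_fetch=int(env.get("PAPERS_TO_FETCH", "3")),
        # Assumed fallback, copied from the sample .env value.
        podcast_title_prefix=env.get("PODCAST_TITLE_PREFIX", "Daily AI Papers"),
    )
```

Accepting a mapping instead of reading `os.environ` directly makes the loader easy to exercise in unit tests without touching the real environment.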
In GitHub Repository → Settings → Secrets and variables → Actions, add the following Secrets:
| Secret Name | Description | How to Get |
|---|---|---|
| GEMINI_API_KEY | Google Gemini API key | Issue one in Google AI Studio |
| GCP_SERVICE_ACCOUNT_KEY | GCP Service Account JSON (base64 encoded) | Create a Service Account in the GCP Console, download its key, and encode it with base64 -w 0 < key.json |
| GCS_BUCKET_NAME | Google Cloud Storage bucket name | e.g. papercast-podcasts |
| SLACK_WEBHOOK_URL | Slack Webhook URL (optional) | Create an Incoming Webhook in Slack API |
Grant the following roles to the GCP Service Account:
- Cloud Storage Admin: upload MP3 files and metadata
- Cloud Text-to-Speech Admin: speech conversion
If a JSON decoding error occurs:
- Check the base64 encoding:
cat service-account-key.json | base64 -w 0
- Test locally:
export GCP_SERVICE_ACCOUNT_KEY="your_base64_encoded_key"
python test_credentials.py
- Detailed troubleshooting: see the Credential troubleshooting guide
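To narrow down where a decoding error comes from, the two failure points (invalid base64 versus invalid JSON) can be checked separately. The sketch below is an illustration only: it is not the contents of test_credentials.py, the function name is hypothetical, and the list of expected fields is an assumption based on the usual layout of a service account key file.

```python
import base64
import binascii
import json

# Fields normally present in a service account key file (assumed check list).
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def decode_service_account_key(encoded: str) -> dict:
    """Decode a base64-encoded service account JSON and report which step failed."""
    try:
        raw = base64.b64decode(encoded.strip(), validate=True)
    except binascii.Error as exc:
        raise ValueError(f"Value is not valid base64: {exc}") from exc
    try:
        info = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"Decoded value is not valid JSON: {exc}") from exc
    if not isinstance(info, dict):
        raise ValueError("Decoded JSON is not an object")
    missing = [field for field in EXPECTED_FIELDS if field not in info]
    if missing:
        raise ValueError("Key file is missing fields: " + ", ".join(missing))
    return info
```

A common cause of the base64 failure is line wrapping in the encoded value, which is why the command above uses `-w 0`.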
- Automatic run: runs automatically every day at 6 AM (KST)
- Manual run:
  - GitHub Repository → Actions → Daily Podcast Generation
  - Click the "Run workflow" button
# If you use the GitHub CLI
gh secret list
- In the Actions tab, click the failed workflow
- Check the logs for each step
- "API key not valid"
  - Check that the Gemini API key is correct
  - Check the API key restriction settings
- "Permission denied" (GCS)
  - Check the Service Account permissions
  - Check that the bucket name is correct
- "Quota exceeded"
  - Check the API quota
  - Check the free-tier limits
# All tests with uv
uv run pytest
# Specific test types
uv run pytest tests/unit/ -m unit
uv run pytest tests/integration/ -m integration
uv run pytest tests/contract/ -m contract
# With coverage
uv run pytest --cov=src --cov-report=html
# Format code with uv
uv run black src/ tests/
# Lint with uv
uv run pylint src/
# Type check with uv
uv run mypy src/
# Or run the development tools from the dev dependency group
uv run --group dev black src/ tests/
uv run --group dev pylint src/
uv run --group dev mypy src/
papercast/
├── src/                          # Core Python modules
│   ├── models/                   # Data models (Paper, Podcast, ProcessingLog)
│   ├── services/                 # Core services
│   │   ├── collector.py          # Hugging Face paper collection
│   │   ├── summarizer.py         # Gemini Pro summarization
│   │   ├── tts.py                # Google TTS conversion
│   │   ├── uploader.py           # GCS upload
│   │   └── generator.py          # Static site generation
│   ├── utils/                    # Utilities (logger, retry, config)
│   └── main.py                   # Main pipeline
├── api/                          # FastAPI backend
│   ├── routes/                   # API endpoints
│   │   ├── health.py             # Health check endpoints
│   │   └── episodes.py           # Episode endpoints
│   ├── schemas.py                # Pydantic response schemas
│   ├── repository.py             # Data access layer
│   ├── dependencies.py           # FastAPI dependencies
│   └── main.py                   # FastAPI app
├── frontend/                     # Next.js frontend
│   ├── src/
│   │   ├── components/           # React components
│   │   ├── pages/                # Next.js pages
│   │   ├── services/             # API client
│   │   └── styles/               # CSS styles
│   ├── package.json              # Node.js dependencies
│   └── next.config.js            # Next.js configuration
├── tests/                        # Test suite
│   ├── unit/                     # Unit tests
│   ├── integration/              # Integration tests
│   ├── contract/                 # Contract tests
│   └── api/                      # API tests
├── scripts/                      # Utility scripts
│   ├── run-fullstack.sh          # Full-stack development server
│   ├── run-api.sh                # API server only
│   └── dev-regenerate.py         # Site regeneration
├── .github/workflows/
│   └── daily-podcast.yml         # GitHub Actions workflow
├── static-site/                  # Generated static site
└── data/
    ├── papers/                   # Collected papers
    └── podcasts/                 # Generated podcasts
MIT
Contributions are welcome! Please read CONTRIBUTING.md for details.