A GitHub App that automatically processes pull request synchronize events, analyzes repository structure, and posts summary comments.
Features:
- 🤖 Automated PR processing on new commits
- 📊 Repository structure analysis (file/directory counting)
- 💾 Token caching for efficiency (5-minute buffer)
- 🧹 Automatic cleanup after processing
- ✅ Comprehensive test coverage
- 🔄 Background processing with ThreadPoolExecutor
Table of contents:
- Environment Variables
- Quick Start
- Architecture
- GitHub App Permissions
- Webhook Handler
- Testing
- Development Commands
- Troubleshooting
- Resources
Required environment variables:
| Variable | Description | Example |
|---|---|---|
| `GITHUB_APP_ID` | Your GitHub App ID | `123456` |
| `GITHUB_APP_PRIVATE_KEY` | Base64-encoded private key | `LS0tLS1CRUdJTi...` |
| `GITHUB_WEBHOOK_SECRET` | Webhook secret (use a UUID) | `550e8400-e29b-41d4-a716-446655440000` |
| `PORT` | Server port (optional) | `8000` (default) |
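As an illustration of how these variables might be read at startup, here is a minimal sketch. The `Settings` container and `load_settings` helper are hypothetical names, not taken from the project's source; only the variable names and the `8000` default come from the table above.

```python
import os
from dataclasses import dataclass

# Variables the table lists as required; PORT is optional.
REQUIRED = ("GITHUB_APP_ID", "GITHUB_APP_PRIVATE_KEY", "GITHUB_WEBHOOK_SECRET")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the app's configuration."""
    app_id: str
    private_key_b64: str
    webhook_secret: str
    port: int = 8000


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ) and fail fast."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        app_id=env["GITHUB_APP_ID"],
        private_key_b64=env["GITHUB_APP_PRIVATE_KEY"],
        webhook_secret=env["GITHUB_WEBHOOK_SECRET"],
        port=int(env.get("PORT", "8000")),
    )
```

Failing at startup with the full list of missing names tends to be easier to debug than a later authentication error.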
To encode the private key, run `cat path/to/private-key.pem | base64`.
- Open https://smee.io in your browser
- Click "Start a new channel"
- Copy the Webhook Proxy URL (e.g., `https://smee.io/abc123...`)
- Update `.envrc` with this channel URL
- Go to: Settings → Developer settings → GitHub Apps → New GitHub App
- Fill in the required fields:
  - Name: Your app name
  - Homepage URL: Your repository URL
  - Webhook URL: Your Smee.io URL from step 1
  - Webhook secret: Generate a UUID for security (e.g., `550e8400-e29b-41d4-a716-446655440000`)
Set the following Repository permissions:
| Permission | Access | Purpose |
|---|---|---|
| Contents | Read | Clone repository and analyze files |
| Pull requests | Read and write | Read PR data and post comments |
| Metadata | Read | Repository metadata (automatic) |
Under "Subscribe to events", check:
- ✅ Pull request
This enables the app to receive pull_request.synchronize webhook events.
- In your GitHub App settings, generate a private key
- Download the `.pem` file
- Encode it to base64: `cat path/to/private-key.pem | base64`
- Add to your environment variables
- Install the app in your chosen repository or organization
Build and run with docker-compose: `mk docker-compose-up`
src/
├── app.py # Main FastAPI application and webhook handler
├── model.py # Data models (PullRequestPayload)
├── cache.py # Token caching (TokenCache class)
├── repo.py # Repository operations (RepositoryManager class)
├── utils.py # Utility functions
└── constants.py # Configuration constants
tests/
├── test_app.py # Application tests
├── test_cache.py # Cache functionality tests
├── test_model.py # Data model tests
├── test_repo.py # Repository manager tests
└── test_utils.py # Utility function tests
- Thread-safe token caching
- 5-minute expiration buffer
- Automatic token refresh
- Handles timezone-aware and naive datetimes
- Clone repository with authentication
- Checkout PR branch
- Post comments to PRs
- Automatic cleanup after processing
- Structured webhook payload parsing
- PR validation logic
- Extracts all relevant PR data
Webhook Event → Validate → Cache Token → Clone Repo → Analyze → Comment → Cleanup
↓ ↓ ↓ ↓ ↓ ↓ ↓
handle_pr is_valid() get_token() setup() analyze() post() cleanup()
- Token Caching: Installation tokens cached with 5-minute buffer to reduce API calls
- Background Processing: Uses ThreadPoolExecutor for non-blocking webhook responses
- Automatic Cleanup: Repositories deleted after processing to save disk space
- Repository Analysis: Recursively counts files and directories (excluding .git)
- Unique Clone Directories: Format: `/tmp/{repo}-{pr_number}-{short_sha}`
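The token caching behaviour listed above (thread-safe, 5-minute expiration buffer, naive and timezone-aware datetimes) can be sketched roughly as follows. This is not the project's `TokenCache` from `src/cache.py`; the method names and the choice to treat naive datetimes as UTC are assumptions made for illustration.

```python
import threading
from datetime import datetime, timedelta, timezone

# Treat a token as expired this long before its real expiry.
EXPIRY_BUFFER = timedelta(minutes=5)


class TokenCache:
    """Illustrative in-memory cache keyed by installation ID."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tokens = {}  # installation_id -> (token, expires_at)

    def set(self, installation_id, token, expires_at):
        # Assumption: a naive datetime is interpreted as UTC.
        if expires_at.tzinfo is None:
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        with self._lock:
            self._tokens[installation_id] = (token, expires_at)

    def get(self, installation_id, now=None):
        """Return the cached token, or None if absent or inside the buffer."""
        now = now or datetime.now(timezone.utc)
        with self._lock:
            entry = self._tokens.get(installation_id)
        if entry is None:
            return None
        token, expires_at = entry
        if now >= expires_at - EXPIRY_BUFFER:
            return None  # caller should fetch a fresh token
        return token
```

A `None` result signals the caller to request a new installation token and store it with `set`, which is what keeps API calls down while avoiding tokens that expire mid-clone.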
The app requires the following GitHub App permissions to function:
| Permission | Access Level | Why Required |
|---|---|---|
| Contents | Read | • Clone repository • Read files and directory structure • Analyze repository contents |
| Pull requests | Read & Write | • Receive PR webhook events • Read PR metadata (branch, SHA, state) • Post comments on PRs |
| Metadata | Read | • Repository information • Clone URLs • Default branch info (Automatically included) |
Subscribe to the following events in your GitHub App settings:
- ✅ Pull request - Receives `pull_request.synchronize` events
After configuring permissions:
1. Verify Installation Token:
   - Check logs for a successful token fetch
   - Should see: "Fetching new token for installation {id}"
   - Should see: "Token cached, expires at {timestamp}"
2. Test Repository Access:
   - Push to a PR branch
   - Should see in logs: "Cloning repository to /tmp/..."
   - Should see: "Successfully cloned and checked out to branch {name}"
3. Verify Comment Posting:
   - Check that the bot comment appears on the PR
   - Comment should include file/directory counts
   - Comment should have the 🤖 bot indicator
| Error | Cause | Solution |
|---|---|---|
| `403: Resource not accessible` | Insufficient permissions | Check app has required permissions enabled |
| `404: Not Found` | App not installed | Install app on repository |
| `Authentication failed` | Invalid token | Verify private key is correct |
| `Invalid signature` | Wrong webhook secret | Update `GITHUB_WEBHOOK_SECRET` to match app |
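For context on the `Invalid signature` row: GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal check might look like the sketch below; `verify_signature` is a hypothetical helper, not necessarily how `src/app.py` does it.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Compare the X-Hub-Signature-256 header against our own HMAC of the body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking timing information
    return hmac.compare_digest(expected, signature_header)
```

The digest must be computed over the exact bytes received, so verification should happen before the body is parsed or re-serialized.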
Before deploying, ensure:
- Repository permissions set: Contents (Read), Pull requests (Read & Write)
- Subscribed to "Pull request" events
- App installed on target repository/organization
- Webhook secret configured and matches environment variable
- Private key properly encoded and set in environment
The handle_pr method in src/app.py is the main webhook handler that processes pull request synchronize events.
- Event: `pull_request.synchronize`
- When: Triggered when new commits are pushed to an existing pull request
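To show how a handler can acknowledge the webhook quickly and do the heavy work in the background, here is a standard-library sketch. `on_webhook` and `process_pull_request` are hypothetical stand-ins, the pool size is an assumption, and the real handler is the FastAPI route in `src/app.py`.

```python
from concurrent.futures import ThreadPoolExecutor

# Pool size is an assumption; tune it to the expected webhook volume.
executor = ThreadPoolExecutor(max_workers=4)


def process_pull_request(payload: dict) -> str:
    """Placeholder for the clone / analyze / comment / cleanup pipeline."""
    return f"processed PR #{payload['number']}"


def on_webhook(event: str, payload: dict):
    """Return a response immediately; run the pipeline on a worker thread."""
    if event != "pull_request" or payload.get("action") != "synchronize":
        return {"status": "ignored"}, None
    future = executor.submit(process_pull_request, payload)
    # The future is returned only so callers and tests can observe the result.
    return {"status": "accepted"}, future
```

Responding before the clone finishes matters because a slow response can cause the delivery to be reported as failed even though processing would have succeeded.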
Extracts structured data from the GitHub webhook payload using the PullRequestPayload model.
Captured fields:
- `action`: Event action type (e.g., "synchronize")
- `install_id`: GitHub App installation ID
- `repository`: Full repository name (owner/repo)
- `branch`: PR branch name
- `commit_sha`: Latest commit SHA
- `sender_login`: GitHub username who triggered the event
- `default_branch`: Repository's default branch
- `number`: Pull request number
- `state`: PR state (open, closed)
- `merged_at`: Merge timestamp (None if not merged)
- `closed_at`: Close timestamp (None if not closed)
- `clone_url`: HTTPS clone URL for the repository
Validates the PR is in a valid state for processing via is_valid_for_processing().
Requirements:
- State must be "open"
- Not merged (`merged_at` is None)
- Not closed (`closed_at` is None)
Logs a warning and exits early if validation fails.
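The extraction and validation steps above could be modelled along these lines. The field names and `is_valid_for_processing()` come from this README; the `from_webhook` constructor is a hypothetical name, and the actual dataclass in `src/model.py` may differ. The payload keys used are those of GitHub's `pull_request` webhook event.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequestPayload:
    action: str
    install_id: int
    repository: str
    branch: str
    commit_sha: str
    sender_login: str
    default_branch: str
    number: int
    state: str
    merged_at: Optional[str]
    closed_at: Optional[str]
    clone_url: str

    @classmethod
    def from_webhook(cls, event: dict) -> "PullRequestPayload":
        """Hypothetical constructor: pick fields out of the webhook JSON."""
        pr = event["pull_request"]
        repo = event["repository"]
        return cls(
            action=event["action"],
            install_id=event["installation"]["id"],
            repository=repo["full_name"],
            branch=pr["head"]["ref"],
            commit_sha=pr["head"]["sha"],
            sender_login=event["sender"]["login"],
            default_branch=repo["default_branch"],
            number=event["number"],
            state=pr["state"],
            merged_at=pr.get("merged_at"),
            closed_at=pr.get("closed_at"),
            clone_url=repo["clone_url"],
        )

    def is_valid_for_processing(self) -> bool:
        """Open, not merged, and not closed."""
        return (
            self.state == "open"
            and self.merged_at is None
            and self.closed_at is None
        )
```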
- Retrieves cached installation token or fetches new one
- Tokens cached for efficiency (5-minute buffer before expiration)
- Automatic refresh when expired
- Clones the repository to `/tmp/{repo}-{pr_number}-{short_sha}`
- Uses GitHub App installation token for authentication
- Automatically checks out the PR branch
- Unique directory per commit (uses first 7 chars of SHA)
- Recursively analyzes repository structure
- Counts files and directories
- Excludes `.git` directory from analysis
- Logs complete directory tree structure
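The counting step can be illustrated with `os.walk`. This is a sketch of the general technique, with `count_files_and_dirs` as a hypothetical name; the project's own implementation may differ, for example in how it logs the tree.

```python
import os


def count_files_and_dirs(root: str) -> tuple[int, int]:
    """Return (file_count, directory_count) under root, skipping .git."""
    files = 0
    dirs = 0
    for _current, dirnames, filenames in os.walk(root):
        # Removing an entry from dirnames in place stops os.walk descending into it.
        if ".git" in dirnames:
            dirnames.remove(".git")
        dirs += len(dirnames)
        files += len(filenames)
    return files, dirs
```

Pruning `.git` before counting keeps the numbers stable across clones, since the object store grows with history rather than with the working tree.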
Posts an automated bot comment to the PR with:
- 🤖 Bot indicator
- Clone directory location
- Branch name
- File count
- Directory count
- UTC timestamp
Comment template can be customized in src/constants.py.
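A comment body with the fields listed above might be assembled like this. The template text and both function names are hypothetical (the real template lives in `src/constants.py`); the endpoint is GitHub's REST route for issue comments, which is also used for general comments on pull requests.

```python
from datetime import datetime, timezone

# Hypothetical template; the project's wording in src/constants.py may differ.
COMMENT_TEMPLATE = (
    "🤖 Automated repository analysis\n"
    "\n"
    "- Clone directory: {clone_dir}\n"
    "- Branch: {branch}\n"
    "- Files: {file_count}\n"
    "- Directories: {dir_count}\n"
    "- Generated at: {timestamp}\n"
)


def build_comment(clone_dir, branch, file_count, dir_count, now=None):
    """Fill the template, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return COMMENT_TEMPLATE.format(
        clone_dir=clone_dir,
        branch=branch,
        file_count=file_count,
        dir_count=dir_count,
        timestamp=now.strftime("%Y-%m-%d %H:%M:%S UTC"),
    )


def comment_endpoint(repository: str, number: int) -> str:
    """URL to POST {"body": ...} to, authenticated with the installation token."""
    return f"https://api.github.com/repos/{repository}/issues/{number}/comments"
```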
- Automatically deletes cloned repository after processing
- Ensures disk space is conserved
- Runs even if errors occur (finally block)
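The cleanup guarantee comes from a `try`/`finally` around the work. A small sketch, with hypothetical helper names, and assuming `{repo}` in the directory format means the repository name without the owner:

```python
import os
import shutil
import tempfile


def clone_dir_for(repository: str, number: int, commit_sha: str, base=None) -> str:
    """Build {base}/{repo}-{pr_number}-{short_sha}; base defaults to the temp dir."""
    base = base or tempfile.gettempdir()
    repo_name = repository.split("/")[-1]  # assumption: owner is dropped
    return os.path.join(base, f"{repo_name}-{number}-{commit_sha[:7]}")


def process_with_cleanup(clone_dir: str, work):
    """Create the directory, run work(clone_dir), and always remove the directory."""
    os.makedirs(clone_dir)  # stand-in for the real git clone
    try:
        return work(clone_dir)
    finally:
        # Runs on success and on failure alike.
        shutil.rmtree(clone_dir, ignore_errors=True)
```

Because the exception still propagates after the `finally` block runs, the caller can log the failure while the disk is left clean.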
See src/model.py for the PullRequestPayload dataclass structure.
Message templates are stored in src/constants.py for easy customization.
- Run all tests: `pytest tests/ -v`
- Run a single test file:
  - `pytest tests/test_utils.py -v`
  - `pytest tests/test_cache.py -v`
  - `pytest tests/test_model.py -v`
  - `pytest tests/test_repo.py -v`
- If you don't have pytest-cov installed: `pytest tests/ -v --no-cov`
- Generate an HTML coverage report: `pytest tests/ --cov=src --cov-report=html`, then open `htmlcov/index.html` in your browser to view it.
- test_utils.py: 29 tests covering utility functions
- test_cache.py: 16 tests covering token caching
- test_model.py: 12 tests covering data models
- test_repo.py: 20 tests covering repository operations
- Activate virtual environment: `pipenv shell`
- Install dependencies: `pipenv sync`
- Lock dependencies: `pipenv lock`
- Freeze requirements: `pip freeze > requirements.txt`
- Install from requirements: `pip install --no-cache-dir -r requirements.txt`
- Show virtual environment path: `pipenv --venv`
- Run all tests: `pytest tests/ -v`
- Run with coverage: `pytest tests/ --cov=src`
- Run specific test class: `pytest tests/test_cache.py::TestTokenCache -v`
- Run specific test: `pytest tests/test_utils.py::TestDecodeBase64Key::test_decode_valid_base64_key -v`
- Build and run with docker-compose: `mk docker-compose-up`
- Stop containers: `mk docker-compose-down`
- View logs: `docker-compose logs -f`
Problem: Failed to decode base64 key
- Solution: Ensure private key is properly base64 encoded without newlines
- Check: Run `cat key.pem | base64 | tr -d '\n'`
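A decoder that tolerates wrapped base64 output makes this failure less likely. The sketch below is illustrative: the name `decode_base64_key` is inferred from the test class `TestDecodeBase64Key`, and the signature and error messages are assumptions.

```python
import base64
import binascii


def decode_base64_key(encoded: str) -> str:
    """Decode a base64-encoded PEM key, ignoring line wraps and stray spaces."""
    cleaned = "".join(encoded.split())  # base64 output is often wrapped at 76 chars
    try:
        raw = base64.b64decode(cleaned, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("Failed to decode base64 key") from exc
    pem = raw.decode("utf-8")
    if "-----BEGIN" not in pem:
        raise ValueError("Decoded value does not look like a PEM key")
    return pem
```

Stripping whitespace before decoding means a key encoded with plain `base64` (which wraps lines on many systems) works the same as one piped through `tr -d '\n'`.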
Problem: Token expired errors
- Solution: Check token cache expiration (5-minute buffer)
- Action: Clear cache by restarting the application
Problem: Authentication failed for repository
- Solution: Verify GitHub App has repository access
- Check: Installation permissions - requires Contents: Read
- Verify: App is installed on the repository
- See: GitHub App Permissions section
Problem: 403: Resource not accessible by integration
- Solution: Missing required permissions
- Check: Repository permissions include Contents (Read) and Pull requests (Read & Write)
- Fix: Update permissions in GitHub App settings → Permissions & events
Problem: Directory already exists
- Solution: This shouldn't happen with unique SHA-based naming
- Debug: Check cleanup logic in `repo.py`
Problem: Cannot post comments to PR
- Solution: Missing Pull requests write permission
- Check: GitHub App has "Pull requests: Read and write" enabled
- Verify: App is installed with correct permissions
Problem: No module named 'pytest-cov'
- Solution: Install pytest-cov: `pip install pytest-cov`
- Alternative: Run with `pytest --no-cov`
Problem: Tests pass locally but fail in CI
- Solution: Check environment variables are set
- Verify: Python version compatibility (3.14+)
Problem: Webhook not triggering
- Solution: Check Smee.io channel is running
- Verify: Webhook secret matches in both GitHub and `.envrc`
- Check: GitHub App is installed and has correct permissions
Problem: Invalid signature errors
- Solution: Webhook secret mismatch
- Fix: Update `GITHUB_WEBHOOK_SECRET` to match GitHub App settings
- Create GitHub App
- Smee.io - Webhook proxy for local development
- How to get tokens
- Probot Smee - Alternative Smee client
See LICENSE file for details.
Contributions are welcome! Please ensure:
- All tests pass: `pytest tests/ -v`
- Code follows existing style
- New features include tests
- Documentation is updated