A lightweight, self-hosted headless browser automation platform. Designed as an alternative to Browserless, built for speed, privacy, and scalability.

sascer/HeadlessX


🚀 HeadlessX v1.3.0

Advanced Anti-Detection Web Scraping API with Comprehensive Fingerprinting Control

🎯 Unified Solution: Website + API on a single domain
🛡️ Advanced Anti-Detection: Canvas/WebGL/Audio spoofing, behavioral simulation
🧠 Human-like Behavior: Bezier mouse movements, keyboard dynamics, natural scrolling
🚀 Deploy Anywhere: Docker, Node.js+PM2, or Development


🗺️ What's Coming Next?

🚀 HeadlessX v2.0 - Full-Stack AI-Powered Platform

The future of intelligent web scraping is here

🎯 Revolutionary Features Coming:

  • 🤖 AI-Powered Admin Panel - Intelligent task management & automation
  • 🎨 Modern React Frontend - Sleek, responsive dashboard interface
  • 🧠 Smart Automation - AI-driven scraping strategies & optimization
  • 📊 Advanced Analytics - Real-time insights & performance metrics
  • 🔄 Workflow Builder - Visual scraping pipeline creation
  • 🎛️ Enterprise Controls - Advanced user management & permissions

Transform your web scraping experience with the next generation of HeadlessX


✨ v1.3.0 Key Features

🛡️ Advanced Anti-Detection Engine

  • Canvas Fingerprinting Control - Dynamic noise injection with consistent seeds
  • WebGL Spoofing - GPU vendor/model spoofing with realistic profiles
  • Audio Context Manipulation - Hardware audio fingerprint database
  • WebRTC Leak Prevention - Complete IP leak protection
  • Hardware Fingerprint Spoofing - CPU, memory, and performance masking
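
As an illustration of "noise injection with consistent seeds", here is a minimal Python sketch. It is not HeadlessX's implementation; the function name `add_seeded_noise` and the pixel-list input are assumptions made for this example. The idea it shows is that one seed always yields the same perturbation, so a fingerprint stays stable for a given profile while differing between profiles.

```python
import random


def add_seeded_noise(pixels, seed, amplitude=2):
    """Return a copy of 8-bit pixel values with small, seed-determined noise.

    Hypothetical helper: the same (pixels, seed) pair always gives the same
    output, so repeated reads within one profile look identical.
    """
    rng = random.Random(seed)  # private generator; global random state is untouched
    noisy = []
    for value in pixels:
        delta = rng.randint(-amplitude, amplitude)
        noisy.append(max(0, min(255, value + delta)))  # clamp to the valid byte range
    return noisy


pixels = [0, 128, 255, 64, 200]
first = add_seeded_noise(pixels, seed="profile-42")
second = add_seeded_noise(pixels, seed="profile-42")
print(first == second)  # the same seed reproduces the same noise
```

Deriving the seed from the profile identifier (rather than from the clock) is what keeps repeated canvas reads consistent within a session.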

🧠 Human-like Behavioral Simulation

  • Bezier Mouse Movement - Natural acceleration and deceleration patterns
  • Keyboard Dynamics - Realistic dwell time and flight time variations
  • Natural Scroll Patterns - Reader, scanner, browser behavioral profiles
  • Attention Model Simulation - Human-like focus and interaction patterns
  • Micro-movement Injection - Sub-pixel accuracy for maximum realism
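
The Bezier movement idea can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the project's code: `cubic_bezier`, `ease_in_out`, and `mouse_path` are hypothetical names, and the control points are chosen by the caller. The easing function makes the pointer start slowly, speed up, and slow down again near the target.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def ease_in_out(s):
    """Smoothstep easing: slow start, faster middle, slow end."""
    return s * s * (3.0 - 2.0 * s)


def mouse_path(start, end, control1, control2, steps=20):
    """Sample steps + 1 points from start to end along an eased Bezier curve."""
    points = []
    for i in range(steps + 1):
        t = ease_in_out(i / steps)
        points.append(cubic_bezier(start, control1, control2, end, t))
    return points


path = mouse_path((0, 0), (400, 300), (120, 20), (300, 260))
print(len(path), path[0], path[-1])
```

Because the samples are evenly spaced in time but eased in `t`, consecutive points are close together at both ends of the path and farther apart in the middle.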

🌐 WAF Bypass Capabilities

  • Cloudflare Bypass - Advanced challenge solving and TLS fingerprinting
  • DataDome Evasion - Resource blocking and behavioral pattern matching
  • Incapsula/Akamai - Generic WAF bypass with adaptive techniques
  • HTTP/2 Fingerprinting - Stream prioritization and header ordering

📊 Comprehensive Device Profiles

  • 50+ Chrome Profiles - Desktop, mobile, and tablet configurations
  • Hardware Consistency - CPU, GPU, memory, and sensor correlation
  • Geolocation Intelligence - Timezone, language, and locale matching
  • Profile Validation - Real-time consistency checking and scoring
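
To make "consistency checking and scoring" concrete, the sketch below scores a device profile against three simple rules. Everything here is an assumption for illustration: the profile keys, the `PLATFORM_HINTS` mapping, and the `check_profile` function are hypothetical and do not describe HeadlessX's actual profile schema.

```python
# Hypothetical mapping from navigator.platform values to a user-agent substring.
PLATFORM_HINTS = {"Win32": "Windows", "MacIntel": "Macintosh", "iPhone": "iPhone"}


def check_profile(profile):
    """Return (score, failures) for a hypothetical device-profile dict.

    The score is the fraction of consistency rules that pass.
    """
    failures = []

    hint = PLATFORM_HINTS.get(profile["platform"])
    if hint is None or hint not in profile["user_agent"]:
        failures.append("platform does not match user agent")

    if profile["is_mobile"] != (profile["max_touch_points"] > 0):
        failures.append("touch support does not match device type")

    if profile["locale"] not in profile["languages"]:
        failures.append("locale missing from language list")

    total_rules = 3
    return (total_rules - len(failures)) / total_rules, failures


desktop = {
    "platform": "Win32",
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "is_mobile": False,
    "max_touch_points": 0,
    "locale": "en-US",
    "languages": ["en-US", "en"],
}
print(check_profile(desktop))
```

A profile that claims a desktop platform but reports touch points, or a locale absent from its language list, would lose points under these rules.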

Choose your deployment:

| Method | Command | Best For |
| --- | --- | --- |
| 🐳 Docker | docker-compose up -d | Production, easy deployment |
| 🔧 Auto Setup | chmod +x scripts/setup.sh && sudo ./scripts/setup.sh | VPS/Server with full control |
| 💻 Development | npm install && npm start | Local development, testing |

Access your HeadlessX v1.3.0:

🌐 Website:  https://your-subdomain.yourdomain.com
🔗 API:      https://your-subdomain.yourdomain.com/api
🛡️ Stealth:  https://your-subdomain.yourdomain.com/api/render/stealth
🧪 Testing:  https://your-subdomain.yourdomain.com/api/test-fingerprint
📱 Profiles: https://your-subdomain.yourdomain.com/api/profiles
🔧 Health:   https://your-subdomain.yourdomain.com/api/health
📊 Status:   https://your-subdomain.yourdomain.com/api/status?token=YOUR_AUTH_TOKEN

πŸ—οΈ Enhanced Anti-Detection Architecture v1.3.0

HeadlessX v1.3.0 introduces advanced anti-detection capabilities with comprehensive fingerprinting control, behavioral simulation, and WAF bypass techniques while maintaining the modular architecture from v1.2.0.

v1.3.0 Key Enhancements:

  • 🛡️ Advanced Anti-Detection: Canvas, WebGL, Audio, WebRTC fingerprinting control
  • 🎭 Behavioral Simulation: Human-like mouse movement with Bezier curves and keyboard dynamics
  • 🌐 WAF Bypass: Cloudflare, DataDome, and advanced evasion techniques
  • 📱 Device Profiling: Comprehensive desktop and mobile device profiles with hardware spoofing
  • 🧪 Testing Framework: Comprehensive anti-detection testing and validation
  • 🔧 Separation of Concerns: Enhanced modules for fingerprinting, behavioral, and evasion services
  • 🚀 Better Performance: Optimized browser management with intelligent profile-based pooling
  • 🛠️ Developer Experience: Development tools, profile generators, and interactive testing
  • 📦 Production Ready: Enhanced error handling, detection analytics, and profile validation
  • 🔒 Security: Advanced authentication, profile management, and secure fingerprint storage
  • 📊 Monitoring: Real-time detection monitoring, success rate analytics, and performance benchmarking

v1.3.0 Architecture Overview:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Routes        │───▶│   Controllers   │───▶│   Services      │
│   (api.js)      │    │   (rendering.js)│    │   (browser.js)  │
│   (admin.js)    │    │   (profiles.js) │    │   (stealth.js)  │
└─────────────────┘    │   (detection.js)│    │ (interaction.js)│
         │             └─────────────────┘    └─────────────────┘
         ▼                       │                       │
┌─────────────────┐              ▼                       ▼
│   Middleware    │    ┌─────────────────┐    ┌─────────────────┐
│   (auth.js)     │    │   Utils         │    │   Config        │
│   (error.js)    │    │   (logger.js)   │    │   (index.js)    │
│   (analyzer.js) │    │   (helpers.js)  │    │   (browser.js)  │
└─────────────────┘    │   (validator.js)│    │   (profiles/)   │
         │             └─────────────────┘    └─────────────────┘
         ▼                       │                       │
┌─────────────────┐              ▼                       ▼
│ Fingerprinting  │    ┌─────────────────┐    ┌─────────────────┐
│ (canvas-spoof)  │    │   Behavioral    │    │    Evasion      │
│ (webgl-spoof)   │    │ (mouse-movement)│    │ (cloudflare)    │
│ (audio-context) │    │ (keyboard-dyn)  │    │ (datadome)      │
│ (webrtc-ctrl)   │    │ (scroll-pattern)│    │ (waf-bypass)    │
│ (hardware-noise)│    │ (attention-mod) │    │ (tls-fingerpr)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Testing      │    │  Development    │    │    Profiles     │
│ (test-framework)│    │   (dev-tools)   │    │ (chrome-prof)   │
│ (detection-test)│    │ (profile-gen)   │    │ (mobile-prof)   │
│ (performance)   │    │ (fingerpr-test) │    │ (firefox-prof)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘

Migration from v1.2.0:

  • All v1.2.0 functionality preserved with enhanced anti-detection capabilities
  • New environment variables for fingerprint control and stealth configuration
  • Enhanced API endpoints for profile management and detection testing
  • Backward compatible with all existing configurations and scripts

📖 Detailed Documentation: MODULAR_ARCHITECTURE.md


🚀 Deployment Guide

🐳 Docker Deployment (Recommended)

# Install Docker (if needed)
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER

# Deploy HeadlessX
git clone https://github.com/SaifyXPRO/HeadlessX.git
cd HeadlessX
cp .env.example .env
nano .env  # Configure DOMAIN, SUBDOMAIN, AUTH_TOKEN

# Start services
docker-compose up -d

# Optional: Set up SSL (both commands need root privileges)
sudo apt install -y certbot python3-certbot-nginx
sudo certbot --nginx -d your-subdomain.yourdomain.com

Docker Management:

docker-compose ps              # Check status
docker-compose logs headlessx  # View logs
docker-compose restart         # Restart services
docker-compose down            # Stop services

🔧 Node.js + PM2 Deployment

# Automated setup (recommended)
git clone https://github.com/SaifyXPRO/HeadlessX.git
cd HeadlessX
cp .env.example .env
nano .env  # Configure environment
chmod +x scripts/setup.sh
sudo ./scripts/setup.sh  # Installs dependencies, builds website, starts PM2

🌐 Nginx Configuration (Auto-handled by setup script):

The setup script automatically configures nginx, but if you need to manually configure:

# Copy and configure nginx site
sudo cp nginx/headlessx.conf /etc/nginx/sites-available/headlessx

# Replace placeholders with your actual domain
sudo sed -i 's/SUBDOMAIN.DOMAIN.COM/your-subdomain.yourdomain.com/g' /etc/nginx/sites-available/headlessx

# Enable the site
sudo ln -sf /etc/nginx/sites-available/headlessx /etc/nginx/sites-enabled/
sudo rm -f /etc/nginx/sites-enabled/default

# Test and reload nginx
sudo nginx -t && sudo systemctl reload nginx

Manual setup (if not using setup script):

sudo apt update && sudo apt upgrade -y
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs build-essential
npm install && npm run build
sudo npm install -g pm2
npm run pm2:start

PM2 Management:

npm run pm2:status     # Check status
npm run pm2:logs       # View logs
npm run pm2:restart    # Restart server
npm run pm2:stop       # Stop server

💻 Development Setup

git clone https://github.com/SaifyXPRO/HeadlessX.git
cd HeadlessX
cp .env.example .env
nano .env  # Set AUTH_TOKEN, DOMAIN=localhost, SUBDOMAIN=headlessx

# Make scripts executable
chmod +x scripts/*.sh

# Install dependencies
npm install
cd website && npm install && npm run build && cd ..

# Start development server
npm start  # Access at http://localhost:3000

🌐 API Routes & Structure

HeadlessX Routes:
├── /favicon.ico        → Favicon
├── /robots.txt         → SEO robots file
├── /api/health         → Health check (no auth required)
├── /api/status         → Server status (requires token)
├── /api/render         → Full page rendering
├── /api/html           → HTML extraction
├── /api/content        → Clean text extraction
├── /api/screenshot     → Screenshot generation
├── /api/pdf            → PDF generation
└── /api/batch          → Batch URL processing

🔄 Request Flow:

  1. Nginx receives request on port 80/443
  2. Proxies to Node.js server on port 3000
  3. Server routes based on path:
    • /api/* → API endpoints
    • /* → Website files (built Next.js app)
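
The path-based split in step 3 can be sketched as a small Python function. This is only an illustration of prefix routing; `route_request` is a hypothetical name, and the real server is an Express app rather than Python.

```python
def route_request(path):
    """Classify a request path as the request flow describes: API or website."""
    # Match "/api" itself or anything below it, but not look-alikes such as "/apidocs".
    if path == "/api" or path.startswith("/api/"):
        return "api"
    return "website"


for path in ("/api/health", "/api/render/stealth", "/", "/apidocs"):
    print(path, "->", route_request(path))
```

Matching on `/api/` with the trailing slash keeps a website page such as `/apidocs` from being mistaken for an API call.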

🚀 API Examples & HTTP Integrations

Quick Health Check (No Auth)

curl https://your-subdomain.yourdomain.com/api/health

🔧 cURL Examples

🛡️ v1.3.0 Anti-Detection Rendering (Maximum Stealth)

curl -X POST "https://your-subdomain.yourdomain.com/api/render/stealth?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "profile": "desktop-chrome",
    "stealthMode": "maximum",
    "behaviorSimulation": true,
    "timeout": 30000
  }'

📱 Mobile Device Simulation

curl -X POST "https://your-subdomain.yourdomain.com/api/render?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "profile": "iphone-14-pro",
    "geolocation": {"latitude": 40.7128, "longitude": -74.0060},
    "behaviorSimulation": true
  }'

🧪 Test Anti-Detection Capabilities

curl -X POST "https://your-subdomain.yourdomain.com/api/test-fingerprint?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "profile": "desktop-chrome",
    "testCanvas": true,
    "testWebGL": true,
    "testAudio": true
  }'

📊 Get Available Device Profiles

curl "https://your-subdomain.yourdomain.com/api/profiles?token=YOUR_AUTH_TOKEN"

🎭 Behavioral Simulation with WAF Bypass

curl -X POST "https://your-subdomain.yourdomain.com/api/render?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "profile": "desktop-firefox",
    "cloudflareBypass": true,
    "datadomeBypass": true,
    "mouseMovement": "natural",
    "keyboardDynamics": "human",
    "timeout": 45000
  }'

Extract HTML Content

curl -X POST "https://your-subdomain.yourdomain.com/api/html?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "timeout": 30000}'

Generate Screenshot

curl "https://your-subdomain.yourdomain.com/api/screenshot?token=YOUR_AUTH_TOKEN&url=https://example.com&fullPage=true" \
  -o screenshot.png
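
When the target URL travels in the query string, as in the GET request above, it should be percent-encoded so that its own `?` and `&` characters are not parsed as HeadlessX parameters. A minimal Python sketch, assuming the `token`, `url`, and `fullPage` parameter names from the cURL example; the helper name `screenshot_url` is hypothetical.

```python
from urllib.parse import urlencode


def screenshot_url(base_url, token, target_url, full_page=True):
    """Build a /api/screenshot GET URL with a safely encoded target URL."""
    query = urlencode({
        "token": token,
        "url": target_url,  # urlencode percent-encodes ':', '/', '?', '&' and '='
        "fullPage": "true" if full_page else "false",
    })
    return f"{base_url}/api/screenshot?{query}"


print(screenshot_url(
    "https://your-subdomain.yourdomain.com",
    "YOUR_AUTH_TOKEN",
    "https://example.com/search?q=a&b=1",
))
```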

Extract Text Only

curl -X POST "https://your-subdomain.yourdomain.com/api/content?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "waitForSelector": "main"}'

Generate PDF

curl -X POST "https://your-subdomain.yourdomain.com/api/pdf?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "format": "A4"}' \
  -o document.pdf

🤖 Make.com (Integromat) Integration

HTTP Request Module Configuration:

{
  "url": "https://your-subdomain.yourdomain.com/api/html",
  "method": "POST",
  "headers": {
    "Content-Type": "application/json"
  },
  "qs": {
    "token": "YOUR_AUTH_TOKEN"
  },
  "body": {
    "url": "{{url_to_scrape}}",
    "timeout": 30000,
    "waitForSelector": "{{optional_selector}}"
  }
}

⚡ Zapier Integration

Webhooks by Zapier Setup:

  • URL: https://your-subdomain.yourdomain.com/api/html?token=YOUR_AUTH_TOKEN
  • Method: POST
  • Headers: Content-Type: application/json
  • Body:
{
  "url": "{{url_from_trigger}}",
  "timeout": 30000,
  "humanBehavior": true
}

🔗 n8n Integration

HTTP Request Node:

{
  "url": "https://your-subdomain.yourdomain.com/api/html",
  "method": "POST",
  "authentication": "queryAuth",
  "query": {
    "token": "YOUR_AUTH_TOKEN"
  },
  "headers": {
    "Content-Type": "application/json"
  },
  "body": {
    "url": "={{$json.url}}",
    "timeout": 30000,
    "humanBehavior": true
  }
}

HeadlessX is also available as an n8n community node.

🐍 Python Example

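import requests

def scrape_with_headlessx(url, token):
    """Fetch rendered HTML for `url` through the HeadlessX /api/html endpoint."""
    response = requests.post(
        "https://your-subdomain.yourdomain.com/api/html",
        params={"token": token},
        json={
            "url": url,
            "timeout": 30000,  # render timeout in milliseconds
            "humanBehavior": True
        },
        timeout=60,  # client-side timeout in seconds, kept above the render timeout
    )
    response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return response.json()

# Usage
result = scrape_with_headlessx("https://example.com", "YOUR_TOKEN")
print(result['html'])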

🟨 JavaScript/Node.js Example

const axios = require('axios');

async function scrapeWithHeadlessX(url, token) {
  try {
    const response = await axios.post(
      `https://your-subdomain.yourdomain.com/api/html?token=${token}`,
      {
        url: url,
        timeout: 30000,
        humanBehavior: true
      }
    );
    return response.data;
  } catch (error) {
    console.error('Scraping failed:', error.message);
    throw error;
  }
}

// Usage
scrapeWithHeadlessX('https://example.com', 'YOUR_TOKEN')
  .then(result => console.log(result.html))
  .catch(error => console.error(error));

🔄 Batch Processing Example

curl -X POST "https://your-subdomain.yourdomain.com/api/batch?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://example1.com",
      "https://example2.com",
      "https://example3.com"
    ],
    "timeout": 30000,
    "humanBehavior": true
  }'

Batch Processing

curl -X POST "https://your-subdomain.yourdomain.com/api/batch?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com", "https://httpbin.org"],
    "format": "text",
    "options": {"timeout": 30000}
  }'
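
For larger URL lists, the request body can be generated rather than written by hand. The Python sketch below builds payloads in the shape of the first batch example (`urls`, `timeout`, `humanBehavior`), removing duplicates and splitting the list into chunks. The chunk size of 10 is an assumption, not a documented limit, and `build_batch_payloads` is a hypothetical helper.

```python
def build_batch_payloads(urls, batch_size=10, timeout_ms=30000):
    """Deduplicate URLs (keeping order) and split them into /api/batch request bodies."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)

    return [
        {
            "urls": unique[start:start + batch_size],
            "timeout": timeout_ms,
            "humanBehavior": True,
        }
        for start in range(0, len(unique), batch_size)
    ]


payloads = build_batch_payloads(
    ["https://example1.com", "https://example2.com", "https://example1.com"],
    batch_size=2,
)
print(payloads)
```

Each returned dictionary can be sent as the JSON body of one POST to `/api/batch`.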

📁 Project Structure

HeadlessX v1.3.0 - Enhanced Anti-Detection Architecture/
├── 📂 src/                         # Modular application source
│   ├── 📂 config/                  # Configuration management
│   │   ├── index.js               # Main configuration loader
│   │   └── browser.js             # Browser-specific settings
│   ├── 📂 utils/                   # Utility functions
│   │   ├── errors.js              # Error handling & categorization
│   │   ├── logger.js              # Structured logging
│   │   └── helpers.js             # Common utilities
│   ├── 📂 services/                # Business logic services
│   │   ├── browser.js             # Browser lifecycle management
│   │   ├── stealth.js             # Anti-detection techniques
│   │   ├── interaction.js         # Human-like behavior
│   │   └── rendering.js           # Core rendering logic
│   ├── 📂 middleware/              # Express middleware
│   │   ├── auth.js                # Authentication
│   │   └── error.js               # Error handling
│   ├── 📂 controllers/             # Request handlers
│   │   ├── system.js              # Health & status endpoints
│   │   ├── rendering.js           # Main rendering endpoints
│   │   ├── batch.js               # Batch processing
│   │   └── get.js                 # GET endpoints & docs
│   ├── 📂 routes/                  # Route definitions
│   │   ├── api.js                 # API route mappings
│   │   └── static.js              # Static file serving
│   ├── app.js                     # Main application setup
│   ├── server.js                  # Entry point for PM2
│   └── rate-limiter.js            # Rate limiting implementation
├── 📂 website/                     # Next.js website (unchanged)
│   ├── app/                        # Next.js 13+ app directory
│   ├── components/                 # React components
│   ├── .env.example               # Website environment template
│   ├── next.config.js             # Next.js configuration
│   └── package.json               # Website dependencies
├── 📂 scripts/                     # Deployment & management scripts
│   ├── setup.sh                   # Automated installation (updated)
│   ├── update_server.sh           # Server update script (updated)
│   ├── verify-domain.sh           # Domain verification
│   └── test-routing.sh            # Integration testing
├── 📂 nginx/                       # Nginx configuration
│   └── headlessx.conf             # Nginx proxy config
├── 📂 docker/                      # Docker deployment (updated)
│   ├── Dockerfile                 # Container definition
│   └── docker-compose.yml         # Docker Compose setup
├── ecosystem.config.js            # PM2 configuration (moved to root)
├── .env.example                   # Environment template (updated)
├── package.json                   # Server dependencies (updated)
├── docs/
│   └── MODULAR_ARCHITECTURE.md   # Architecture documentation
└── README.md                      # This file

🛠️ Development

Local Development

# 1. Install dependencies
npm install

# 2. Build website
cd website
npm install
npm run build
cd ..

# 3. Set environment variables
export AUTH_TOKEN="development_token_123"
export DOMAIN="localhost"
export SUBDOMAIN="headlessx"

# 4. Start server
npm start  # Uses src/app.js

# 5. Access locally
# Website: http://localhost:3000
# API: http://localhost:3000/api/health

Testing Integration

# Test server and website integration
bash scripts/test-routing.sh localhost

# Test with environment variables
bash scripts/verify-domain.sh

⚙️ Configuration

🌐 Environment Variables (.env)

Create your .env file from the template:

cp .env.example .env
nano .env

Required configuration:

# Security Token (Generate a secure random string)
AUTH_TOKEN=your_secure_token_here

# Domain Configuration  
DOMAIN=yourdomain.com
SUBDOMAIN=headlessx

# Optional: Browser Settings
BROWSER_TIMEOUT=60000
MAX_CONCURRENT_BROWSERS=5

# Optional: Server Settings
PORT=3000
NODE_ENV=production
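
One way to generate the "secure random string" for AUTH_TOKEN is Python's standard `secrets` module. This is a sketch of one option, not a project requirement; the function name `generate_auth_token` and the 32-byte length are choices made for this example.

```python
import secrets


def generate_auth_token(num_bytes=32):
    """Return a URL-safe random token suitable for a query string or header."""
    return secrets.token_urlsafe(num_bytes)


# Print a line that can be pasted into .env
print(f"AUTH_TOKEN={generate_auth_token()}")
```

A URL-safe alphabet matters here because the token is also accepted as a `?token=` query parameter.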

🌐 Nginx Domain Setup

Option 1: Automatic (Recommended)

# The setup script automatically replaces domain placeholders
sudo ./scripts/setup.sh

Option 2: Manual Configuration

# Copy nginx configuration
sudo cp nginx/headlessx.conf /etc/nginx/sites-available/headlessx

# Replace domain placeholders (replace with your actual domain)
sudo sed -i 's/SUBDOMAIN.DOMAIN.COM/headlessx.yourdomain.com/g' /etc/nginx/sites-available/headlessx

# Example: If your domain is "api.example.com"
sudo sed -i 's/SUBDOMAIN.DOMAIN.COM/api.example.com/g' /etc/nginx/sites-available/headlessx

# Enable site and reload nginx
sudo ln -sf /etc/nginx/sites-available/headlessx /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

Your final URLs will be:

  • Website: https://your-subdomain.yourdomain.com
  • API Health: https://your-subdomain.yourdomain.com/api/health
  • API Endpoints: https://your-subdomain.yourdomain.com/api/*

📊 API Reference

🔧 Core Endpoints

| Endpoint | Method | Description | Auth Required |
| --- | --- | --- | --- |
| /api/health | GET | Health check | ❌ |
| /api/status | GET | Server status | ✅ |
| /api/render | POST | Full page rendering (JSON) | ✅ |
| /api/html | GET/POST | Raw HTML extraction | ✅ |
| /api/content | GET/POST | Clean text extraction | ✅ |
| /api/screenshot | GET | Screenshot generation | ✅ |
| /api/pdf | GET | PDF generation | ✅ |
| /api/batch | POST | Batch URL processing | ✅ |

🔑 Authentication

All endpoints (except /api/health) require a token via:

  • Query parameter: ?token=YOUR_TOKEN
  • Header: X-Token: YOUR_TOKEN
  • Header: Authorization: Bearer YOUR_TOKEN
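
The three token placements listed above can be expressed as a small Python helper that returns a URL and header set for each. The helper name `auth_options` is hypothetical; the parameter and header names are the ones listed above.

```python
from urllib.parse import urlencode


def auth_options(base_url, path, token):
    """Return (url, headers) pairs for the three documented ways to send the token."""
    url = f"{base_url}{path}"
    query = urlencode({"token": token})
    return {
        "query": (f"{url}?{query}", {}),
        "x_token": (url, {"X-Token": token}),
        "bearer": (url, {"Authorization": f"Bearer {token}"}),
    }


options = auth_options("https://your-subdomain.yourdomain.com", "/api/status", "YOUR_TOKEN")
for name, (url, headers) in options.items():
    print(name, url, headers)
```

Query strings commonly end up in web server access logs, so the header forms may be preferable when the token needs to stay out of log files.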

📖 Complete Documentation

Visit your HeadlessX website for full API documentation with examples.


📊 Monitoring & Troubleshooting

🔍 Health Checks

curl https://your-subdomain.yourdomain.com/api/health
curl "https://your-subdomain.yourdomain.com/api/status?token=YOUR_TOKEN"
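
After a deploy or restart, the health endpoint can be polled until the server responds. The Python sketch below uses only the standard library. It assumes a healthy server answers `/api/health` with HTTP 200, which this README does not state explicitly; `http_status` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except OSError:
        return None  # connection refused, DNS failure, timeout, ...


def wait_until_healthy(url, attempts=5, delay=2.0, fetch=http_status, sleep=time.sleep):
    """Poll url until it returns 200 (assumed to mean healthy) or attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try, but not after the last one
    return False
```

A call such as `wait_until_healthy("https://your-subdomain.yourdomain.com/api/health")` returns `True` once the endpoint answers, or `False` after the last attempt. The `fetch` and `sleep` parameters exist so the loop can be exercised without a network.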

📋 Log Management

# PM2 logs
npm run pm2:logs
pm2 logs headlessx --lines 100

# Docker logs
docker-compose logs -f headlessx

# Nginx logs
sudo tail -f /var/log/nginx/access.log

🔄 Updates

git pull origin main
npm run build          # Rebuild website
npm run pm2:restart     # PM2
# OR
docker-compose restart  # Docker

🔧 Common Issues

"npm ci" Error (missing package-lock.json):

chmod +x scripts/generate-lockfiles.sh
./scripts/generate-lockfiles.sh  # Generate lock files
# OR
npm install --production  # Use install instead

"Cannot find module 'express'":

npm install  # Install dependencies

System dependency errors (Ubuntu):

sudo apt update && sudo apt install -y \
  libatk1.0-0t64 libatk-bridge2.0-0t64 libcups2t64 \
  libatspi2.0-0t64 libasound2t64 libxcomposite1

PM2 not starting:

sudo npm install -g pm2
chmod +x scripts/setup.sh  # Make script executable
pm2 start ecosystem.config.js  # PM2 config lives in the project root
pm2 logs headlessx  # Check errors

Script permission errors:

# Make all scripts executable
chmod +x scripts/*.sh

# Or use the quick setup
chmod +x scripts/quick-setup.sh && ./scripts/quick-setup.sh

Playwright browser installation errors:

# Use dedicated Playwright setup script
chmod +x scripts/setup-playwright.sh
./scripts/setup-playwright.sh

# Or install manually:
sudo apt update && sudo apt install -y \
  libgtk-3-0t64 libpangocairo-1.0-0 libcairo-gobject2 \
  libgdk-pixbuf-2.0-0 libdrm2 libxss1 libxrandr2 \
  libasound2t64 libatk1.0-0t64 libnss3

# Install only Chromium (most stable)
npx playwright install chromium

# Alternative: Use Docker (avoids dependency issues)
docker-compose up -d

🔐 Security Features

  • Token Authentication: Secure API access with custom tokens
  • Rate Limiting: Nginx-level request throttling
  • Security Headers: XSS, CSRF, and clickjacking protection
  • Bot Protection: Common attack vector blocking
  • SSL/TLS: Automatic HTTPS with Let's Encrypt

🤝 Contributing

We welcome contributions from the community! Whether you're fixing bugs, adding features, improving documentation, or sharing ideas, your input is valuable.

Ways to Contribute

  1. 🐛 Report Bugs: Create a bug report
  2. 💡 Suggest Features: Share your ideas
  3. 📖 Improve Docs: Help make our documentation better
  4. 💻 Submit Code: Fork, code, and create a pull request

Development Workflow

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Please read CONTRIBUTING.md for detailed guidelines.


Community

Join our growing community of developers, data scientists, and automation enthusiasts!

💬 Discussions

Categories: General, Q&A, Ideas, Show & Tell

Get Help & Share Knowledge

Community Guidelines

  • Be respectful and inclusive
  • Help others learn and grow
  • Share knowledge and experiences
  • Report issues constructively
  • Follow our Code of Conduct

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🆘 Support & Resources

| Resource | Description | Link |
| --- | --- | --- |
| 📖 Documentation | Complete API reference & guides | View Docs |
| 🐛 Bug Reports | Found a bug? Report it here | Report Bug |
| 💡 Feature Requests | Suggest new features | Request Feature |
| 🔒 Security | Report security vulnerabilities | Security Policy |
| 💬 Discussions | Community Q&A & discussions | Join Discussions |
| 📊 Project Board | Track development progress | View Board |
| 📝 Changelog | See what's new | View Changes |
| 🗺️ Roadmap | Future plans & features | View Roadmap |

🎯 Built with ❤️ by SaifyXPRO

HeadlessX v1.3.0 - The most advanced open-source anti-detection web scraping solution.

Star us on GitHub! ⭐


Made with 🚀 by developers, for developers

Made with ❤️ for the developer community.
