GlassFlow is an open-source ETL tool that enables real-time data processing from Kafka to ClickHouse. GlassFlow pipelines can perform the following operations:
- Deduplicate: Remove duplicate records based on configurable keys and time windows. Use this when you need to ensure data uniqueness.
- Join: Perform temporal joins between multiple Kafka topics. Use this when combining related data streams with time-based matching.
- Deduplicate & Join: Combine both deduplication and joining in a single pipeline
- Ingest only: Direct data transfer from Kafka to ClickHouse without transformations
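To make the first two operations concrete, here is a minimal in-memory sketch of key-based deduplication within a time window and of a temporal join between two streams. It is only an illustration of the concepts, not GlassFlow's implementation or API: the names `WindowedDeduplicator` and `temporal_join`, the event shape (plain dictionaries with numeric timestamps in seconds), and the batch-style join are all assumptions made for this example.

```python
from typing import Any, Dict, Hashable, Iterable, List, Tuple

Event = Dict[str, Any]
TimedEvent = Tuple[float, Event]  # (timestamp in seconds, event payload)


class WindowedDeduplicator:
    """Hypothetical helper: drop events whose key was already accepted
    less than `window_seconds` ago."""

    def __init__(self, key_field: str, window_seconds: float) -> None:
        self.key_field = key_field
        self.window_seconds = window_seconds
        # Timestamp at which each key was last accepted.
        # A real state store would also evict expired keys; this sketch does not.
        self._accepted_at: Dict[Hashable, float] = {}

    def accept(self, event: Event, timestamp: float) -> bool:
        """Return True if the event should pass through, False if it is a duplicate."""
        key = event[self.key_field]
        accepted_at = self._accepted_at.get(key)
        if accepted_at is not None and timestamp - accepted_at < self.window_seconds:
            return False
        self._accepted_at[key] = timestamp
        return True


def temporal_join(
    left: Iterable[TimedEvent],
    right: Iterable[TimedEvent],
    key_field: str,
    max_gap_seconds: float,
) -> List[Event]:
    """Hypothetical helper: merge each left event with every right event that
    shares its key and whose timestamp is within `max_gap_seconds`."""
    right_by_key: Dict[Hashable, List[TimedEvent]] = {}
    for ts, event in right:
        right_by_key.setdefault(event[key_field], []).append((ts, event))

    joined: List[Event] = []
    for ts, event in left:
        for right_ts, right_event in right_by_key.get(event[key_field], []):
            if abs(ts - right_ts) <= max_gap_seconds:
                # Right-hand fields overwrite left-hand fields on name clashes.
                joined.append({**event, **right_event})
    return joined


if __name__ == "__main__":
    dedup = WindowedDeduplicator(key_field="order_id", window_seconds=60)
    stream = [(0.0, {"order_id": 1}), (10.0, {"order_id": 1}), (20.0, {"order_id": 2})]
    unique = [(ts, ev) for ts, ev in stream if dedup.accept(ev, ts)]
    print(unique)
```

A streaming system would process both topics incrementally and bound its state by the window size; the batch-style join above trades that away for brevity.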
To get started with GlassFlow, you can:
- Try the Live Demo: Experience GlassFlow running on a live cluster at demo.glassflow.dev
- Install on Kubernetes: Follow our Kubernetes Installation Guide for production deployment
- Learn More: Explore our Usage Guide to start creating pipelines
GlassFlow is open source and can be self-hosted on Kubernetes. It works with any managed Kubernetes service, such as AWS EKS, GKE, and AKS.
| Method | Use Case | Docs Link |
|---|---|---|
| ☸️ Kubernetes with Helm | Production and development deployment | Kubernetes Helm Guide |
Log in to demo.glassflow.dev to see a working demo of GlassFlow running on a GCP cluster. You will see a Grafana dashboard and the setup that we used.
For detailed documentation, visit docs.glassflow.dev.
Check out our public roadmap to see what's coming next in GlassFlow. We're actively working on new features and improvements based on community feedback.
Want to suggest a feature? We'd love to hear from you! Please use our GitHub Discussions to share your ideas and help shape the future of GlassFlow.
- Streaming deduplication and joins over windows of up to 7 days through a built-in state store
- ClickHouse sink with a native protocol for high performance
- Built-in Kafka connector supporting SASL, SSL, and more, compatible with nearly all Kafka providers
- Dead-Letter Queue for handling failed events
- Field mapping from your Kafka topic fields to ClickHouse table columns
- Prometheus metrics and OpenTelemetry logs for comprehensive observability
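As a rough illustration of how field mapping and a dead-letter queue fit together, the sketch below converts events into rows keyed by column name and sets aside any event that fails conversion. This is not GlassFlow's configuration format or code: the `FieldMapping` type, the `map_event` and `process` functions, and the example field names are all hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Event = Dict[str, Any]
# Hypothetical mapping: source field -> (target column, converter function)
FieldMapping = Dict[str, Tuple[str, Callable[[Any], Any]]]


def map_event(event: Event, mapping: FieldMapping) -> Dict[str, Any]:
    """Build one output row from an event, converting each mapped field."""
    row: Dict[str, Any] = {}
    for source_field, (column, convert) in mapping.items():
        row[column] = convert(event[source_field])
    return row


def process(
    events: Iterable[Event], mapping: FieldMapping
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Return (rows, dead_letters). Events with missing or unconvertible
    fields go to dead_letters along with the error that caused the failure."""
    rows: List[Dict[str, Any]] = []
    dead_letters: List[Dict[str, Any]] = []
    for event in events:
        try:
            rows.append(map_event(event, mapping))
        except (KeyError, TypeError, ValueError) as exc:
            dead_letters.append({"event": event, "error": repr(exc)})
    return rows, dead_letters


if __name__ == "__main__":
    mapping: FieldMapping = {
        "user_id": ("user_id", int),
        "amount": ("amount_usd", float),
    }
    events = [
        {"user_id": "7", "amount": "1.5"},   # valid
        {"user_id": "x", "amount": "2"},     # user_id is not an integer
        {"amount": "3"},                     # user_id is missing
    ]
    rows, dead_letters = process(events, mapping)
    print(rows)
    print(dead_letters)
```

Keeping failed events together with their error lets them be inspected and replayed later instead of blocking the rest of the stream.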
This project is licensed under the Apache License 2.0.
