RudderStack is a customer data pipeline tool for collecting, routing and processing data from your websites, apps, cloud tools, and data warehouse.
More information on RudderStack can be found here.
$ git clone git@github.com:rudderlabs/rudderstack-helm.git
$ cd rudderstack-helm/charts/rudderstack
$ helm dependency build
$ helm install my-release ./ --set rudderWorkspaceToken="<workspace token from the dashboard>"

The RudderStack Helm chart creates a RudderStack deployment on a Kubernetes cluster using the Helm package manager.
- kubectl installed and connected to your Kubernetes cluster
- Helm installed
- Workspace token from the RudderStack dashboard. Set up your account and copy your workspace token from the top of the home page.
To install the chart with the release name my-release, from the root directory of this repo:
$ helm install my-release ./ --set rudderWorkspaceToken="<workspace token from the dashboard>"

The command deploys RudderStack on the default Kubernetes cluster configured with kubectl.
The configuration section lists the most significant parameters that can be configured during
deployment.
To update the configuration or the version of the images used, change the configuration and run:
$ helm upgrade my-release ./ --set rudderWorkspaceToken="<workspace token from the dashboard>"

To uninstall/delete the my-release deployment:
$ helm uninstall my-release

This removes all the components created by this chart.
To check whether the proposed changes would apply cleanly, run a dry run:
$ helm template ./ | kubectl apply --dry-run=client -f -

There are three options for providing the Postgres dependency:
- Deploying it as a sidecar in the same stateful resource
- Deploying a new StatefulSet with Postgres
- Providing an external Postgres
To enable the sidecar mode, specify:
postgresql:
  mode: sidecar
  statefulset_enabled: false

To enable the statefulset mode, specify:
postgresql:
  mode: statefulset
  statefulset_enabled: true

Horizontal Pod Autoscaling (HPA) is available when resource efficiency is a requirement. It is only recommended with the PostgreSQL sidecar mode enabled. Currently, it is only supported with backend.controlPlaneJSON set to true, since the pre-stop hook reads from the local config to make sure that all events have reached their destinations, so no event is lost while scaling down. To enable it, specify:
backend:
  terminationGracePeriodSeconds: xx
  lifecycleSleepTime: xx
  hpa:
    enabled: true

Also, make sure that lifecycleSleepTime and terminationGracePeriodSeconds are both greater than BatchRouter.uploadFreqInS; otherwise Kubernetes will kill the pods before they flush their data to the destinations.
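
As a rough illustration of the ordering described above, the values below assume that BatchRouter.uploadFreqInS is 30 seconds and that lifecycleSleepTime is expressed in seconds. Both are assumptions, so check rudder-config.yaml and values.yaml for the real figures. Kubernetes counts the pre-stop hook against the termination grace period, so the grace period here is set larger than the sleep time.

```yaml
# Hypothetical values override; the numbers are illustrative only.
# Assumes BatchRouter.uploadFreqInS = 30 (seconds) in rudder-config.yaml.
postgresql:
  mode: sidecar                      # autoscaling is only recommended with sidecar mode
  statefulset_enabled: false
backend:
  controlPlaneJSON: true             # autoscaling is currently only supported with this set
  lifecycleSleepTime: 60             # assumed to be seconds; greater than uploadFreqInS (30)
  terminationGracePeriodSeconds: 90  # greater than lifecycleSleepTime, so the sleep can finish
  hpa:
    enabled: true                    # nesting under backend is assumed; confirm in values.yaml
```

If uploadFreqInS is changed later, both timing values need to be raised with it to keep the same ordering.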
If you are using the open-source config-generator UI, you need to set the parameter backend.controlPlaneJSON to true in
the values.yaml file. Export the workspace config from the config-generator and copy/paste its contents into
the workspaceConfig.json file.
$ helm install my-release ./ --set backend.controlPlaneJSON=true

Since the chart is published under the {{ TBC by the RudderStack team }} page, you can extend it by adding it as a dependency of your own chart, so there is no need to git clone this repo to deploy open-source RudderStack into your infrastructure. For example:
apiVersion: v2
name: rudderstack
description: Customer Data Pipeline tool for collecting, routing and processing data.
maintainers:
  - name: Data Platform
    email: xxxx@xxxx.com
version: 0.4.5
appVersion: 1.16.0
dependencies:
  # https://github.com/rudderlabs/rudderstack-helm
  - name: rudderstack
    version: 0.4.5
    repository: https://TBC.github.io/rudderstack-helm # To Be Confirmed by the RudderStack team

If you are using Google Cloud Storage or Google BigQuery in any of the following cases, replace the contents of the file rudder-google-application-credentials.json with your service account credentials:
- GCS as a destination
- GCS for dumping jobs
- BigQuery as a warehouse destination
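
For the dependency setup shown in the Chart.yaml above, Helm expects overrides for a dependency to sit under a top-level key named after that dependency in the parent chart's values.yaml. The following is a minimal sketch that reuses only parameter names from this README; the token is a placeholder and the chosen overrides are just examples.

```yaml
# values.yaml of the parent chart (the one whose Chart.yaml declares the dependency)
rudderstack:                      # must match the dependency name in Chart.yaml
  rudderWorkspaceToken: "<workspace token from the dashboard>"
  backend:
    image:
      version: v0.1.6             # example override; any chart parameter can go here
```

Running `helm dependency build` in the parent chart fetches the dependency before installing.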
The following table lists the configurable parameters of the RudderStack chart and their default values.

| Parameter | Description | Default |
|---|---|---|
| rudderWorkspaceToken | Workspace token from the dashboard | - |
| backend.image.repository | Container image repository for the backend | rudderlabs/rudder-server |
| backend.image.version | Container image tag for the backend. Available versions | v0.1.6 |
| backend.image.pullPolicy | Container image pull policy for the backend image | Always |
| transformer.image.repository | Container image repository for the transformer | rudderlabs/transformer |
| transformer.image.version | Container image tag for the transformer. Available versions | v0.1.2 |
| transformer.image.pullPolicy | Container image pull policy for the transformer image | Always |
| backend.extraEnvVars | Extra environment variables to be used by the backend in the deployments | Refer to the values.yaml file |
| backend.controlPlaneJSON | If true, the backend will read its config from the workspaceConfig.json file | false |

Each of these parameters can be changed in values.yaml. Alternatively, specify each parameter using
the --set key=value[,key=value] argument to helm install. For example:
$ helm install my-release \
  --set backend.image.version=v0.1.6 \
  ./

Note: Configuration specific to:
- Backend can be edited in rudder-config.yaml.
- PostgreSQL can be edited in pg_hba.conf and postgresql.conf.
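
For more than a couple of overrides, a separate values file is often easier to maintain than a long list of --set flags. The sketch below uses a hypothetical file name, my-values.yaml, and only the parameters listed in the table above. It restates the documented defaults, so change just the entries you need. It could be passed with `helm install my-release ./ -f my-values.yaml`.

```yaml
# my-values.yaml (hypothetical file name)
rudderWorkspaceToken: "<workspace token from the dashboard>"
backend:
  image:
    repository: rudderlabs/rudder-server
    version: v0.1.6        # documented default; replace with the tag you want
    pullPolicy: Always
transformer:
  image:
    repository: rudderlabs/transformer
    version: v0.1.2        # documented default; replace with the tag you want
    pullPolicy: Always
```

Values given with --set on the command line take precedence over those in a file passed with -f.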
Installing this Helm chart will deploy the following pods and containers in the configured cluster:
- rudderstack-backend
- rudderstack-telegraf-sidecar
- rudderstack-postgresql-sidecar
- transformer
For any queries related to using the RudderStack Helm Chart, feel free to start a conversation on our Slack channel.