
A collection of hands-on examples, helper utilities, Jupyter notebooks, and data workflows to explore the OKDP Platform.


OKDP/okdp-examples


License: Apache 2.0

A collection of hands-on examples, helper utilities, Jupyter notebooks, and data workflows showcasing how to work with the OKDP Platform. This repository is meant to help you explore OKDP capabilities around compute, object storage, data catalog, SQL engines, Spark, and analytics.

Over time, these examples will be extended with lakehouse-oriented features, such as:

  • Open table formats (e.g. Apache Iceberg and/or Delta Lake).
  • Shared metadata with stronger schema enforcement and evolution.
  • Snapshot-based table management (time travel, retention, cleanup).
  • Incremental processing and analytics-ready datasets.

Notebooks

The notebooks analyze datasets stored as Parquet on S3-compatible storage (MinIO). The same underlying dataset is queried using Trino and Spark.
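As a rough illustration of the Spark side, the sketch below collects the Hadoop S3A settings that Spark typically needs to read `s3a://` paths from an S3-compatible store such as MinIO. The endpoint, credentials, bucket, and dataset path are placeholders, not values taken from this repository, and the notebooks may configure Spark differently.

```python
def s3a_spark_conf(endpoint, access_key, secret_key):
    """Return Spark settings for reading s3a:// paths from an S3-compatible store."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # MinIO is usually addressed as endpoint/bucket rather than bucket.endpoint.
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


# Placeholder values (assumptions, not real OKDP settings).
conf = s3a_spark_conf("http://minio.example.svc:9000", "EXAMPLE_KEY", "EXAMPLE_SECRET")

# With PySpark installed, the settings could be applied like this:
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("okdp-example")
# for key, value in conf.items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
# df = spark.read.parquet("s3a://example-bucket/example-dataset/")
```

Keeping the settings in a plain dictionary makes them easy to inspect or override before a session is created.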

An index.ipynb notebook is also provided as an entry point.

Trino notebooks

The following notebooks query data using Trino:

  • Querying data using Trino (Python/SQLAlchemy).
  • Querying data using Trino (SQL engine).

These notebooks use Trino external tables defined over Parquet data stored in object storage and registered via a metadata service.
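To make the idea of an external table concrete, here is a minimal sketch that renders a `CREATE TABLE` statement pointing at Parquet files in object storage. It assumes a Hive-style Trino catalog, where the `external_location` and `format` table properties apply. The catalog, schema, table, column names, and bucket path are hypothetical, and the statements actually used by OKDP may differ.

```python
def external_table_ddl(catalog, schema, table, columns, location):
    """Render a Trino CREATE TABLE statement over existing Parquet files.

    `columns` is a list of (name, sql_type) pairs. Assumes a Hive-style catalog.
    """
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {catalog}.{schema}.{table} (\n"
        f"  {cols}\n"
        f") WITH (\n"
        f"  external_location = '{location}',\n"
        f"  format = 'PARQUET'\n"
        f")"
    )


# Hypothetical names for illustration only.
ddl = external_table_ddl(
    "hive", "demo", "trips",
    [("trip_id", "bigint"), ("fare", "double")],
    "s3a://example-bucket/trips/",
)
print(ddl)
```

The table only records where the files live and how to read them, so dropping it leaves the Parquet data in object storage untouched.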

Superset

Use Apache Superset (SQL Lab) to query Trino and build visualizations/dashboards on top of the same datasets.
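Superset connects to databases through SQLAlchemy URIs, and the Trino Python client uses the `trino://user@host:port/catalog/schema` form. The helper below is a small sketch that assembles such a URI. The user, host, port, catalog, and schema shown are placeholders, and authentication options (passwords, TLS) are left out.

```python
from urllib.parse import quote


def trino_sqlalchemy_uri(user, host, port, catalog, schema=None):
    """Build a SQLAlchemy URI for Trino, optionally including a schema."""
    # Percent-encode the user so characters such as "@" do not break the URI.
    uri = f"trino://{quote(user, safe='')}@{host}:{port}/{catalog}"
    if schema:
        uri += f"/{schema}"
    return uri


# Placeholder connection details (assumptions).
uri = trino_sqlalchemy_uri("analyst", "trino.example.svc", 8080, "hive", "demo")
print(uri)  # trino://analyst@trino.example.svc:8080/hive/demo
```

The same string could be passed to `sqlalchemy.create_engine` in a notebook, provided the `trino` package with its SQLAlchemy dialect is installed.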

Running the examples

Using okdp-ui, deploy the following components:

About the datasets

At deployment time, the Helm chart:

  1. Downloads public datasets.
  2. Uploads them into object storage.
  3. Creates the corresponding Trino external tables.
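The three steps above can be sketched as a small function. This is not the Helm chart's actual implementation, which is not shown here. It is a hypothetical outline in which the download, upload, and SQL execution steps are passed in as callables, so the flow can be read (and tested) without a cluster. The object key layout, the `s3a://` location scheme, and the `{location}` placeholder in the DDL template are all assumptions.

```python
def load_dataset(name, source_url, bucket, ddl_template, download, upload, run_sql):
    """Download one dataset, upload it, then register an external table for it.

    `download(url)` returns bytes, `upload(bucket, key, data)` stores them, and
    `run_sql(statement)` executes SQL. All three are supplied by the caller.
    """
    data = download(source_url)                 # 1. download the public dataset
    key = f"{name}/{name}.parquet"              # hypothetical object key layout
    upload(bucket, key, data)                   # 2. upload it into object storage
    location = f"s3a://{bucket}/{name}/"
    run_sql(ddl_template.format(location=location))  # 3. create the external table
    return location


# Usage with stand-in callables that only record what would happen.
log = []
location = load_dataset(
    "trips",
    "https://example.org/trips.parquet",
    "example-bucket",
    "CREATE TABLE trips (...) WITH (external_location = '{location}')",
    download=lambda url: b"parquet-bytes",
    upload=lambda bucket, key, data: log.append(("upload", bucket, key)),
    run_sql=lambda statement: log.append(("sql", statement)),
)
```

Because the data is fetched at deployment time, a real implementation would also need to handle download failures and repeated deployments. Both are omitted from this sketch.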

ℹ️ NOTE

The datasets are not bundled in this repository and are not baked into container images.
