I'm a Data Engineering & Analytics professional with 2+ years of experience building robust data pipelines and automations and turning complex data into actionable insights. I hold a B.Tech in Mechanical Engineering (2021) and a PGP in Data Science and Engineering from Great Learning.
What I do:
- 🔧 Design and build end-to-end data pipelines on Databricks
- ⚡ Optimize PySpark & Spark SQL for large-scale data processing
- 🖥️ Build Databricks Apps and Streamlit applications for data tools
- 📊 Create interactive dashboards with Power BI
- 🤖 Automate workflows using Python & Selenium
What drives me: The challenge of building data systems that teams can depend on - from raw ingestion to polished dashboards. There's something satisfying about transforming chaotic data into clean pipelines and clear insights.
Deepening my expertise in building production-grade data systems:
| Area | Focus |
|---|---|
| Data Pipelines | Delta Live Tables, Structured Streaming, Batch & Real-time Ingestion |
| Databricks Platform | Performance Tuning, Unity Catalog, Workflows Orchestration, Databricks Apps |
| Delta Lake | Z-ordering, Compaction, CDC, Liquid Clustering, VACUUM |
| Spark Optimization | Partitioning, Caching, Broadcast Joins, Adaptive Query Execution |
├── Data Engineering
│   ├── PySpark & Spark SQL
│   ├── ETL/ELT Pipeline Design
│   ├── Delta Lake & Data Lakehouse
│   ├── Data Ingestion (Event Hub, Blob Storage)
│   └── Databricks Workflows & Jobs
│
├── Data Processing
│   ├── Large-scale Transformations
│   ├── Data Quality & Validation
│   ├── Performance Optimization
│   └── Structured Streaming
│
├── Apps & Automation
│   ├── Databricks Apps
│   ├── Streamlit Applications
│   ├── Selenium Web Automation
│   └── Python Scripting
│
└── Analytics & Visualization
    ├── Power BI Dashboards
    ├── SQL Modeling & Optimization
    ├── Python Data Analysis (Pandas, NumPy)
    └── Data Storytelling & Reporting
- Data Pipeline Design - Building reliable ETL/ELT workflows on Databricks
- Databricks Apps & Streamlit - Building internal data tools and applications
- SQL Modeling & Optimization - Complex queries, performance tuning, data modeling
- Data Visualization & BI - Power BI dashboards, Python visualization libraries
- Automation - Selenium web automation, Python scripting to reduce manual work
- Cloud - Azure (Event Hub, Blob Storage), Databricks, Snowflake
- PGP in Data Science and Engineering - Great Learning
- B.Tech in Mechanical Engineering - 2021
- Preparing for: Databricks Data Engineer Associate Certification
- 🔧 Data Pipeline Projects - Building scalable ETL/ELT solutions
- ⚡ Spark & Delta Lake - Performance tuning and lakehouse patterns
- 🖥️ Databricks Apps - Building data tools and applications
- 📊 Analytics & Dashboards - BI solutions and data visualization
- Open Source - Contributing to data engineering projects
When I'm not building pipelines, you can find me:
- 🌿 Gardening - Growing vegetables and experimenting with plant care
- 🧁 Baking - Creating desserts and exploring new recipes
- 💎 Diamond Painting - Relaxing with some sparkle therapy
"From raw data to reliable pipelines to clear insights." 🔧📊
