PushpakVootla21/README.md

About Me 👋

Hi, I'm Vootla Pushpak, a Data Engineer with 2 years of experience building data pipelines and working with different data systems. I currently work at Ensono, where I design and implement data pipelines to drive business insights.

Interests 🔭

  • Data Engineering: I'm fascinated by the challenges of handling large-scale data and building systems that can process, store, and analyze it efficiently.

Skills 🦾

  • Programming languages: Python, SQL
  • Database Management Systems: MySQL, Microsoft SQL Server
  • Cloud Services: Azure (Data Lake, Databricks, Data Factory (ADF), Synapse Analytics, SQL Database, Key Vault); AWS (EC2, S3, RDS, Elastic Beanstalk, DynamoDB, Lambda)
  • Big Data Technologies: Apache Spark, HDFS, Delta Lake

Experience 👨‍💻

Data Engineer, Ensono (Oct 2022 - Current)

  • Built a metadata-driven ingestion framework using Azure Data Factory to migrate operational data from on-prem SQL Server to Azure SQL Database, processing over 1 million records/day across 8+ tables.

  • Designed parameterized ADF pipelines driven by JSON configurations stored in ADLS Gen2, enabling dynamic table selection and reducing manual intervention by 40%.

  • Implemented incremental load logic using watermark columns, optimizing performance and reducing data volume by over 80%.

  • Applied data validation techniques such as row count checks and checksums to ensure 100% data consistency across source and target systems.

  • Engineered a Medallion Architecture in Azure Databricks using PySpark and Delta Lake, transforming data across bronze, silver, and gold layers to support analytics and reporting.

  • Utilized Delta Lake features like schema evolution, ACID compliance, and time travel to manage complex transformations and data lineage.

  • Tuned Spark jobs using partitioning, caching, and cluster resource configuration, reducing runtime by 30% and lowering compute costs by 15%.

  • Delivered curated Gold-layer datasets to analytics teams, enabling a 25% improvement in sales forecast accuracy and 15% reduction in inventory overhead.

  • Integrated Databricks notebooks into CI/CD pipelines using Azure DevOps, automating version-controlled deployments across dev, test, and prod environments.

  • Azure Databricks & Spark Optimization: Utilized Azure Databricks to process large-scale retail data, applying Spark optimizations that improved performance. This led to a 25% increase in sales forecast accuracy and a 15% reduction in inventory costs through advanced data modeling and analysis.
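The watermark-based incremental load and the row count/checksum validation mentioned above can be sketched roughly as follows. This is a minimal illustration only, not the actual project code: it uses Python's built-in `sqlite3` as a stand-in for SQL Server and Azure SQL Database, and the `orders` table, its columns, and the function names are all hypothetical.

```python
import hashlib
import sqlite3


def incremental_load(source, target, table, watermark_col, last_watermark):
    """Copy rows newer than last_watermark; return (rows_copied, new_watermark)."""
    # Assumption: table/column names come from trusted config, so they are
    # interpolated directly; the watermark value is bound as a parameter.
    rows = source.execute(
        f"SELECT id, amount, {watermark_col} FROM {table} "
        f"WHERE {watermark_col} > ? ORDER BY {watermark_col}",
        (last_watermark,),
    ).fetchall()
    target.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    target.commit()
    # The highest watermark seen becomes the starting point for the next run.
    new_watermark = rows[-1][2] if rows else last_watermark
    return len(rows), new_watermark


def table_fingerprint(conn, table):
    """Return (row_count, sha256 checksum) over all rows in a stable order."""
    digest = hashlib.sha256()
    count = 0
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY id"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def demo():
    # Hypothetical source and target databases, both in memory.
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    ddl = "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT)"
    source.execute(ddl)
    target.execute(ddl)
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02"), (3, 30.0, "2024-01-03")],
    )
    source.commit()

    # First run: load everything after an initial low watermark.
    copied_first, watermark = incremental_load(
        source, target, "orders", "modified_at", "1900-01-01"
    )

    # A new row arrives; the second run picks up only that row.
    source.execute("INSERT INTO orders VALUES (4, 40.0, '2024-01-04')")
    source.commit()
    copied_second, watermark = incremental_load(
        source, target, "orders", "modified_at", watermark
    )

    # Validation: row counts and checksums must match across both systems.
    consistent = table_fingerprint(source, "orders") == table_fingerprint(target, "orders")
    return copied_first, copied_second, watermark, consistent
```

Calling `demo()` runs two loads back to back: the second one transfers only the row added after the stored watermark, and the final fingerprint comparison confirms that source and target hold the same data.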

πŸ§‘β€πŸ”§ Projects

Certifications 👨‍🎓

Get in Touch 📩

Feel free to reach out to me on GitHub or LinkedIn if you'd like to discuss data engineering, collaborate on a project, or simply say hello!

Popular repositories

  1. Job_Portal_Web_Application Public

    Java 1 2

  2. data-engineering-zoomcamp Public

    Forked from DataTalksClub/data-engineering-zoomcamp

    Free Data Engineering course!

    Jupyter Notebook 1

  3. PushpakVootla21 Public

    Readme file for my GitHub profile

    1

  4. pyspark-cheatsheet Public

    Forked from cartershanklin/pyspark-cheatsheet

    PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster

    Python

  5. Retail_Data_Engineering_Project Public

    Retail Data Engineering Project using Data Factory & Databricks

    Jupyter Notebook