Available for Senior Roles

Senior Data Engineer building scalable pipelines.

Specialising in Azure Databricks, Delta Lake, and PySpark. Architecting high-performance lakehouse systems.

600+ Jobs Migrated
25 TB Data / Month
80M Customer Records
10x Query Speedup

Personal Statement

Driving value through Data Excellence.

Senior Data Engineer with 4 years at Celebal Technologies building production-grade pipelines on Azure Databricks. Delivered end-to-end medallion architecture lakehouse solutions for a large-scale conglomerate, migrated 600+ legacy ETL jobs for a leading private-sector bank, and re-implemented Oracle PL/SQL sales datamarts at 500M+ record scale.

Comfortable owning the full pipeline lifecycle, from Autoloader ingestion and PySpark transformation to Delta Lake optimisation, ADF orchestration, and CI/CD deployment via Databricks Asset Bundles.

Impactful
Scale-Obsessed
Cloud-Native

Proven Experience

Enterprise Impact Timeline

Large-Scale Conglomerate

Enterprise Data Platform (EDP)

2024 - Present

Designed and delivered the end-to-end data ingestion layer for a large-scale Enterprise Data Platform (EDP). Owned the full pipeline lifecycle from heterogeneous source systems through the Bronze layer to the Silver layer following a strict medallion architecture.

Key Achievements
  • 60% reduction in source-onboarding effort via a parameterised PySpark framework.
  • Built an Autoloader stream for incremental ADLS-to-Bronze ingestion.
  • Packaged pipeline configurations as Databricks Asset Bundles (DAB).
Tech Stack
Azure Databricks PySpark Delta Lake Autoloader Kafka DAB Unity Catalog
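The parameterised onboarding framework above can be sketched as a small config-driven helper: one dataclass per source, mapped to the Auto Loader (`cloudFiles`) reader options. The option keys are real Databricks Auto Loader options; the `SourceConfig` fields and paths are illustrative assumptions, not the production framework.

```python
from dataclasses import dataclass

@dataclass
class SourceConfig:
    """Hypothetical per-source config used to parameterise Bronze onboarding."""
    name: str          # logical source name, e.g. "sales"
    landing_path: str  # ADLS landing folder for this source
    fmt: str = "parquet"

def autoloader_options(cfg: SourceConfig, schema_root: str) -> dict:
    """Build the cloudFiles reader options for one source."""
    return {
        "cloudFiles.format": cfg.fmt,
        "cloudFiles.schemaLocation": f"{schema_root}/{cfg.name}",
        "cloudFiles.inferColumnTypes": "true",
    }

# On Databricks the options would feed the streaming reader, e.g.:
# (spark.readStream.format("cloudFiles")
#      .options(**autoloader_options(cfg, "/mnt/bronze/_schemas"))
#      .load(cfg.landing_path))
```

With one config object per source system, adding a new feed becomes a config entry rather than a new pipeline, which is where the onboarding-effort reduction comes from.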
Leading Private-Sector Bank

Credit Risk Analytics Migration

2021 - 2023
600+ Jobs Migrated
10x Query Speed
25 TB Data / Month
30% Cost Saved

Migrated 600+ Pentaho ETL jobs from on-prem Cloudera Hadoop to Azure Databricks. Handled 80M unique customer records for the bank's credit risk departments.

Key Achievements
  • Converted Impala SQL and Pentaho JavaScript steps into high-performance PySpark.
  • Redesigned 600+ workflows into ADF Pipelines.
  • Optimized Delta tables with Z-Ordering.
Tech Stack
ADF REST API Cloudera Hadoop Impala Azure DevOps Synapse Power BI
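The Z-Ordering optimisation noted above is applied with Delta Lake's `OPTIMIZE ... ZORDER BY` SQL command, which co-locates related values to cut files scanned per query. A minimal helper that emits the statement (table and column names are hypothetical examples, not the bank's actual schema):

```python
def zorder_sql(table: str, cols: list[str]) -> str:
    """Emit a Delta Lake OPTIMIZE ... ZORDER BY statement for a table."""
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(cols)})"

# On Databricks this would run via spark.sql(...):
stmt = zorder_sql("risk.customer_exposure", ["customer_id", "as_of_date"])
print(stmt)  # OPTIMIZE risk.customer_exposure ZORDER BY (customer_id, as_of_date)
```

Z-Ordering on the columns most often used in filters (here, a customer key and snapshot date) is typically what drives the kind of query speedup cited above, because data skipping can prune far more files.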
Global Industrial Tech

FY24 Sales Dataset Build

2023 - 2024

Converted a large body of Oracle PL/SQL procedures and Unix shell scripts into production-grade PySpark, re-implementing 5 regional sales datamarts on Delta Lake handling 500M+ records.

Key Achievements
  • 100% conversion of complex PL/SQL into Spark logic.
  • Established automated reconciliation framework.
Tech Stack
Oracle PL/SQL Unix Shell AWS Redshift SAP HANA Databricks Workflows

Technical Arsenal

Core Competencies

Cloud & Platform

Azure Databricks ADLS Gen2 Azure Data Factory Azure DevOps Azure Synapse

Processing

PySpark Spark SQL Structured Streaming Autoloader Apache Kafka

Storage & Lakehouse

Delta Lake Unity Catalog Medallion Architecture HDFS

Education

September 2021

PG Diploma in Big Data

CDAC | Pune, Maharashtra

Grade: B

April 2016

B.Tech in Computer Science

Centurion University of Technology

CGPA: 9.2 / 10

Ready to scale?

Let's build your next Lakehouse.

Currently open to Senior Data Engineer roles and complex cloud migration projects.