Work Experience
Data Engineer
BT Group • November 2023 - Present • Bengaluru, Karnataka
• Migrated an ETL pipeline from Compute Engine to Cloud Composer, improving performance, data flow visibility, and parallelization, and reducing data ingestion time from 4 hours to 45 minutes.
• Orchestrated data pipelines with Apache Airflow to load incremental data from a Cloud Storage landing zone into BigQuery, transforming the data with PySpark and applying tokenization to mask sensitive/PII data.
• Deployed Oracle VPD with BigQuery AEAD and row-level access policies to enforce data protection, leading to a 70% reduction in data access violations and improving the visibility of sensitive data for compliance audits.
• Built a custom data refresh solution for non-production BigQuery environments using ThreadPoolExecutor, reducing the operations team's manual work by 30%.
• Transformed raw data using Spark and built data products for business use, following a Data Mesh architecture.
• Engineered Terraform scripts to automate infrastructure provisioning in GCP.
Data Engineer
Pluto7, GCP Premier Partner • November 2020 - October 2023 • Bengaluru, Karnataka
• Engineered an inventory tracking solution for shipments with PySpark on Dataproc and Composer, enhancing stakeholder alerts for overdue milestones and reducing delay report turnaround from one day to near real-time.
• Created ELT pipelines to transfer data from SAP ECC to BigQuery using Data Fusion.
• Replaced Excel reports with Power BI dashboards, reducing report generation time by 60% and improving data quality.
• Developed a BigQuery-based solution for inventory management, streamlining data access, reducing manual effort, and boosting planning efficiency by 45%.
• Wrote high-performance SQL scripts to translate business logic for UI and dashboard display, and implemented SCD Type 2 to maintain historical data in the data warehouse for reporting.
• Designed and developed automated ELT/ETL pipelines to replicate data from SQL Server, PostgreSQL, SAP ECC, and APIs into BigQuery, using Dataproc and Data Fusion, with Cloud Composer for orchestration.
• Worked on data pipelines for different load types, such as historical, incremental, and append-only delta loads.
• Handled pipeline monitoring, package upgrades, issue resolution, and enhancements.
• Participated in requirements gathering, data modeling, and creating technical design documents (TDD) for customer use cases.
Education
SJB Institute Of Technology
Bachelor of Engineering in Computer Science • August 2016 - August 2020 • CGPA: 8.3