Nagarjun K

India

@Nagarjunk

Cloud Data Engineer

Badges

Problem Solving
Java
Python
Days of Code
Days of JS
SQL
C language

Certifications

Work Experience

  • Data Engineer

    BT Group • November 2023 - Present • Bengaluru, Karnataka

    • Migrated an ETL pipeline from Compute Engine to Cloud Composer, improving performance, data flow visibility, and parallelization, and cutting data ingestion time from 4 hours to 45 minutes.
    • Orchestrated data pipelines with Apache Airflow to load incremental data from a Cloud Storage landing zone into BigQuery, transforming the data with PySpark and applying tokenization to mask sensitive/PII data.
    • Deployed Oracle VPD with BigQuery AEAD and row-level access policies to enforce data protection, leading to a 70% reduction in data access violations and better visibility of sensitive data for compliance audits.
    • Built a custom data refresh solution for non-production BigQuery environments using ThreadPoolExecutor, reducing the operations team's manual work by 30%.
    • Transformed raw data using Spark and built data products for business use, following a Data Mesh architecture.
    • Engineered Terraform scripts to automate infrastructure provisioning in GCP.

  • Data Engineer

    Pluto7, GCP Premier Partner • November 2020 - October 2023 • Bengaluru, Karnataka

    • Engineered an inventory tracking solution for shipments with PySpark on Dataproc and Composer, enhancing stakeholder alerts for overdue milestones and reducing delay report turnaround from one day to near real-time.
    • Created ELT pipelines to transfer data from SAP ECC to BigQuery using Data Fusion.
    • Replaced Excel reports with Power BI dashboards, reducing report generation time by 60% and improving data quality.
    • Developed a BigQuery-based solution for inventory management, streamlining data access, reducing manual effort, and boosting planning efficiency by 45%.
    • Wrote high-performance SQL scripts to translate business logic into queries for UI and dashboard display, and implemented SCD Type 2 to maintain historical data in the data warehouse for reporting.
    • Designed and developed automated ELT/ETL pipelines to replicate data from SQL Server, PostgreSQL, SAP ECC, and APIs into BigQuery using Dataproc and Data Fusion, with Cloud Composer for orchestration.
    • Worked on data pipelines for different load types, including historical, incremental, and append-only delta loads.
    • Handled pipeline monitoring, package upgrades, issue resolution, and enhancements.
    • Participated in requirements gathering, data modeling, and creating technical design documents (TDDs) for customer use cases.

Education

  • SJB Institute of Technology

    Bachelor of Engineering in Computer Science • August 2016 - August 2020 • CGPA: 8.3

Skills

Apache Airflow
Terraform
Git
GitHub
GitLab
Tableau
Looker Studio
BigQuery
MS SQL Server
Cloud SQL
MySQL
FastAPI
Python
SQL
PySpark
Shell Scripting
Spark
GCP
Linux
Data warehousing
DSA
Data Structures
Python (Advanced)
Data modelling
HDFS
Databricks
Hive