Work Experience
Data Engineer - II
Verizon • September 2021 - June 2023 • India
Developed and optimized more than 25 Python-based ETL pipelines that extract, transform, and load data from databases such as Oracle, SQL Server, and Teradata, improving forecast accuracy by 12%. Led the migration of 160+ workflows and datasets from Hadoop to Google Cloud Platform (GCP), using Dataproc, Cloud Storage, and BigQuery to cut data processing time and storage costs by 43%. Modernized 30+ Hadoop workloads by moving them to cost-effective, scalable GCP solutions, yielding an 18% gain in pipeline execution efficiency. Engineered 21 Apache Hive and Oozie pipelines for near-real-time data feeds, with Jenkins-based CI/CD to automate deployment.
Systems Engineer - I
Verizon • July 2019 - September 2021 • India
Led the design and development of a summarized fact table layer in the Enterprise Data Warehouse, integrating 500 key metrics from 30+ Teradata source tables using SQL, shell scripting, and Teradata utilities. This layer streamlined data access, enabling marketing teams to deliver personalized experiences at scale. Also implemented a data replication process that synchronizes more than 80 Teradata tables with Hadoop clusters and cloud platforms, keeping data consistent and accessible across environments.
Education
Virginia Polytechnic Institute and State University (Virginia Tech)
Computer Science, MEng • August 2023 - December 2024 • CGPA: 3.95
Coursework: Big Data with Machine Learning, Statistical Machine Learning, Data Visualization, Software Engineering, Neural Networks Design, Social Media Analytics, Information Security, AI Tools for Software Development
Project Works:
o Cardiovascular Disease Risk Dashboard: Designed and implemented an analytics dashboard using Plotly and Dash to explore cardiovascular disease risk factors.
o Machine Learning Blogging Site: Developed a blog using GitHub Pages and Quarto, authoring five posts that demonstrate supervised and unsupervised machine learning algorithms on real-world datasets.
o Neural Network Implementations: Created custom Multi-Layer Perceptron and Radial-Basis neural networks, trained with a momentum-optimized Backpropagation algorithm, without using deep learning libraries.
o Social Media Analytics Project: Analyzed user reviews of Yelp businesses with sentiment analysis and topic modeling to suggest areas of improvement. Built an item-based collaborative filtering model to recommend businesses to users and performed community detection on Yelp user networks to generate insights for targeted recommendations.