Work Experience
Data Engineer
ProducePay, Inc•  October 2023 - Present•  Remote
Optimized data operations by implementing partitioning strategies in DynamoDB, cutting I/O operations by 40% and saving $10k per year in transfer costs. Managed Tableau integration for 12 data sources, including SQL databases, cloud services, and APIs, while collaborating with 20 data scientists, PMs, and engineers to meet diverse data needs. Led automation to detect changes in source data, catching errors early in the ETL process and saving 10 days of work per month. Developed a near-real-time ETL process with Change Data Capture (CDC) using Streams and Tasks, and tuned query performance, resulting in a 5-day reduction in data refresh latency every week. Implemented a comprehensive data quality framework in Snowflake, using dbt to automate transformations and rectify 75% of data anomalies during ingestion.
Data Engineer Intern
SwitchPitch•  June 2022 - September 2022•  Washington, D.C.
Transformed market research data into Tableau dashboards that helped the CEO gain insights and make informed decisions. Expanded market presence across targeted sectors by analyzing financial metrics from 12k startups using web scraping and NLP, identifying financial risks and guiding acquisition strategies. Built an ETL pipeline using Airflow to ingest data into Amazon SageMaker, supporting multiclass classification for precise categorization of companies across verticals and sub-verticals.
Data Engineer
AMDOCS•  March 2020 - July 2021
Spearheaded a project to classify telecom orders using ML, Python, and SQL, enabling automation of stuck-order processing and reducing application support workload by 300 hours per month. Engineered a Python-based data ingestion pipeline leveraging Apache Kafka on AWS, reducing data transfer costs by 40%. Implemented automated ticket categorization using LSTM and BERT NLP models, saving 100 hours per month in manual triage and freeing the team to focus on issue resolution.
Data Engineer
Tech Mahindra•  February 2017 - February 2020•  Pune, India
Led the development of a user analytics wireframe using click-through rate analysis, customer satisfaction analysis, and automated A/B testing, resulting in improved user engagement and personalized recommendations. Automated the SLA report generation pipeline using Python (Dash framework) and PL/SQL to generate visual reports with performance trends, SLOs, KPIs, and customer satisfaction rates, saving 100 manual hours per month.
Education
Syracuse University
MS in Applied Data Science•  August 2021 - May 2023•  GPA: 3.7
University of Pune
Bachelor of Engineering in Computer Science•  August 2012 - May 2016•  GPA: 3.3