Akash Kandarkar

United States

@akashkandarkar

Data Scientist

Badges

Problem Solving
Python
Days of Code
SQL

Certifications

Work Experience

  • Data Engineer

    ProducePay, Inc•  October 2023 - July 2024•  Remote

    Optimized data operations by implementing partitioning strategies in DynamoDB, leading to a 40% reduction in I/O operations and saving $10k yearly in transfer costs. Managed Tableau integration for 12 data sources, including SQL databases, cloud services, and APIs, while collaborating with 20 data scientists, PMs, and engineers to meet diverse data needs. Led automation to detect changes in the data source, catching errors early in the ETL process and saving 10 days of work per month. Developed a near real-time ETL process with Change Data Capture (CDC) using Streams & Tasks, and tuned query performance, reducing data refresh latency by 5 days every week. Implemented a comprehensive data quality framework in Snowflake, using dbt to automate transformations and rectify 75% of data anomalies during ingestion.

  • Data Engineer Intern

    SwitchPitch•  June 2022 - September 2022•  Washington, D.C.

    Transformed market research data into Tableau dashboards, helping the CEO gain insights and make informed decisions. Expanded market presence across targeted sectors by analyzing financial metrics from 12k startups using web scraping and NLP, identifying financial risks and guiding acquisition strategies. Built an ETL pipeline using Airflow to ingest data into Amazon SageMaker, supporting multiclass classification for precise categorization of companies across verticals and sub-verticals.

  • Data Engineer

    AMDOCS•  March 2020 - July 2021

    Spearheaded a project to classify telecom orders using ML, Python, and SQL, enabling automation of stuck order processing. Reduced application support workload by 300 hours monthly through telecom order classification. Engineered a Python-based data ingestion pipeline leveraging Apache Kafka on AWS, reducing data transfer costs by 40%. Implemented automated ticket categorization using LSTM and BERT NLP models, saving 100 hours monthly in manual triaging and freeing the team to focus on issue resolution.

  • Data Engineer

    Tech Mahindra•  February 2017 - February 2020•  Pune, India

    Led the development of a user analytics wireframe using click-through rate analysis, customer satisfaction rate analysis, and automated A/B testing, resulting in improved user engagement and personalized recommendations. Automated the SLA report generation pipeline using Python (Dash framework) and PL/SQL to generate visual reports with performance trends, SLOs, KPIs, and customer satisfaction rates, saving 100 manual hours per month.

Education

  • Syracuse University

    MS in Applied Data Science•  August 2021 - May 2023•  GPA: 3.7

  • University of Pune

    Bachelor of Engineering in Computer Science•  August 2012 - May 2016•  GPA: 3.3

Skills

Tableau
Amazon QuickSight
Redshift
PowerBI
Looker
dbt
Jira
Docker
Informatica
DataDog
Airflow
BitBucket
Postgres
Teradata
MS SQL Server
Oracle
DynamoDB
Snowflake
Python
SQL
PL/SQL
R
Shell scripting
Linux
Unix
Git
PySpark