


MS in CS @NYU | Experienced Data Engineer and Data Scientist

Personal Information


Problem Solving
Days of Statistics


Work Experience

  • Data Engineer

    In4mation Insights • December 2020 - Present

    • Deployed Tableau and RStudio servers using Docker, AWS EC2, and ECS.
    • Automated data preparation by building ETL pipelines with AWS Glue, Athena, and PySpark to fetch over 12 TB of mobility and geocoded data from NOAA, SafeGraph, and multiple REST APIs. Normalized Athena databases to reduce query and Glue job runtimes from 10 min to 2 min.
    • Integrated geocoded data into a Redshift data warehouse and estimated the impact of weather and COVID-19 on sales and marketing channels using marketing mix models.

  • Data Engineer

    New York University • November 2018 - September 2020

    • Built an end-to-end course recommendation system based on item-based collaborative filtering with Spark MLlib. Increased student retention rates by 18%.
    • Built an ETL pipeline using PySpark, Airflow, AWS S3, and AWS EMR that transformed raw data for more than 1,000 students from multiple sources and fed it to the course recommender on a schedule.
    • Performed data quality analysis of more than 1,900 flat files and semi-structured datasets available on NYC Open Data. Created an automated data cleaning flow with Tableau Prep and Spark that used null analysis, regular expressions, fuzzy lookup, and data mining to handle outliers, merge duplicates, and ensure key integrity.

  • Graduate Teaching Assistant

    New York University • January 2019 - May 2019

    • TA and grader for Artificial Intelligence (CS-GY 6613). Conducted lectures and office hours to help 150 students understand complex AI concepts such as MDPs, CSPs, and neural networks.
    • Created automated grading scripts in Python using pytest and OK.

  • Machine Learning Engineer

    University at Buffalo • April 2017 - April 2018

    • Built an image processing system to denoise dirty historical documents and office records. Reduced RMSE to 7 while maintaining an SSIM index of 0.9.
    • Applied median filtering, adaptive thresholding, and morphological operations such as dilation and erosion, along with Canny edge detection, using OpenCV to reduce noise.
    • Implemented different CNN architectures as sequences of convolution, pooling, and activation layers. Conducted a comparative study of classical CV algorithms vs. CNNs based on metrics such as PSNR, UQI, and RMSE.

  • Computer Science Tutor

    University at Buffalo • January 2016 - December 2016

    • Conducted group and individual tutoring sessions for freshman and sophomore students in the following courses: Intro to Computer Science, Calculus-1, Statistics and Probability, and Linear Algebra.


Education

  • New York University (NYU), New York

    Computer Science, MS • August 2018 - May 2020

  • State University of New York at Buffalo, Buffalo

    Computer Science, BS • August 2014 - May 2018

