Girish Sukhwani

United States


Data Engineer


Problem Solving
Days of Code
Days of Statistics


Work Experience

  • Data Engineer

    Morgan Stanley•  April 2018 - Present

    - Refactored production-grade shell scripts to optimize Hive ETL batch applications.
    - Developed Python plugins used by 350+ end users of a third-party UI analytics/ETL tool.
    - Tuned a Spark Streaming pipeline to improve end-to-end processing and scoring of transactions in real time.
    - Productionized Hive queries to generate features for reporting and transaction scoring.
    - Computed and monitored usage statistics using the Python API of the Dataiku analytics application.

  • Data Engineer Intern

    Pro-Tek Consulting•  October 2017 - March 2018

    - Built a Kafka-Spark-Cassandra pipeline to process large volumes of log data for application outage prediction.
    - Developed SQL scripts to generate daily database governance reports covering long-running queries, user-to-database access mappings, space utilization, and so on.
    - Modularized and enhanced Python/PySpark scripts to reduce code redundancy and increase the efficiency of the application.
    - Trained and validated a fuzzy clustering model to categorize incoming support tickets by severity and type of issue reported.
    - Created a POC showcasing fuzzy matching on documents to help categorize documents of similar types.

  • Research Assistant

    New Jersey Institute of Technology•  June 2017 - September 2017

    - Implemented and compared CUDA and OpenCL C++ implementations of dot product computation on GPUs.
    - Conducted tutoring sessions for students in Data Structures, Algorithms, and Database Management.
    - Developed AlexNet and LeNet neural network architectures in Keras and trained them on the MNIST, CIFAR-10, and CIFAR-100 datasets, concluding that the performance of these architectures is limited to their respective datasets.
    - Developed a generalized neural network model and trained it on 52 UCI datasets, achieving a mean accuracy of 88.65%.
    - Ported and optimized a Perl/C++ implementation of the Stacked Best Separating Planes (BSP) ML algorithm as part of a research project.

  • Software Engineer Intern

    Hindustan Aeronautics Limited•  March 2015 - July 2015

    - Developed a Mission Planning and Debriefing tool used to manage flight-related mission information such as events, faults, routes, and so on.
    - Extracted, detected, and converted binary data from the flight black box for display on a UI for analytics.
    - Used the Google Maps API and the GPS coordinates log to simulate the flight in the UI application, making it possible to track the path taken in case of an incident.


Education

  • New Jersey Institute of Technology

    Computer Science, MS•  January 2016 - May 2017

    Coursework: Data Structures and Algorithms, Introduction to Robotics, Machine Learning, Operating Systems Design, Image Processing

  • New Horizon College of Engineering

    Information Science and Engineering, BE•  August 2011 - July 2015

    Coursework: Database Management Systems, Graph Theory, Software Engineering, Data Mining, Operations Research

