Work Experience
Data Engineer
Morgan Stanley • April 2018 - Present
- Refactored production-grade shell scripts to optimize Hive ETL batch applications.
- Developed Python plugins used by 350+ end users of a third-party UI analytics/ETL tool.
- Optimized a Spark Streaming pipeline for end-to-end, real-time processing and scoring of transactions.
- Productionized Hive queries to generate features for reporting and transaction scoring.
- Computed and monitored usage statistics using the Python API of the Dataiku analytics application.
Data Engineer Intern
Pro-Tek Consulting • October 2017 - March 2018
- Built a Kafka-Spark-Cassandra pipeline to process large amounts of log data for application outage prediction.
- Developed SQL scripts to generate daily database governance reports covering long-running queries, user-to-database access mappings, space utilization, and so on.
- Modularized and enhanced Python/PySpark scripts to reduce code redundancy and increase the efficiency of the application.
- Trained and validated a fuzzy clustering model to categorize incoming support tickets by severity and type of issue reported.
- Created a POC showcasing fuzzy matching on documents to help categorize documents of similar types.
Research Assistant
New Jersey Institute of Technology • June 2017 - September 2017
- Implemented and compared CUDA and OpenCL C++ versions of dot-product computation on GPUs.
- Conducted tutoring sessions for students in Data Structures, Algorithms, and Database Management.
- Developed AlexNet and LeNet neural network architectures in Keras and trained them on the MNIST, CIFAR-10, and CIFAR-100 datasets, concluding that the performance of each architecture is limited to its respective dataset.
- Developed a generalized neural network model and trained it on 52 UCI datasets, achieving a mean accuracy of 88.65%.
- Ported and optimized a Perl/C++ implementation of the Stacked Best Separating Planes (BSP) ML algorithm as part of a research project.
Software Engineer Intern
Hindustan Aeronautics Limited • March 2015 - July 2015
- Developed a Mission Planning and Debriefing tool used to manage flight-related mission information such as events, faults, routes, and so on.
- Extracted, detected, and converted binary data from the flight black box for display on a UI for analytics.
- Used the Google Maps API and the GPS coordinates log to simulate the flight in the UI application, providing the ability to track the path taken in case of an incident.
Education
New Jersey Institute of Technology
Computer Science, MS • January 2016 - May 2017
Coursework: Data Structures and Algorithms, Introduction to Robotics, Machine Learning, Operating Systems Design, Image Processing
New Horizon College of Engineering
Information Science and Engineering, BE • August 2011 - July 2015
Coursework: Database Management Systems, Graph Theory, Software Engineering, Data Mining, Operations Research