Work Experience
DevOps Engineer
Infocepts • September 2023 - Present
Working with a major American multinational mass media and entertainment conglomerate, I collaborated with teams on creating and managing data pipelines. I monitored and managed data pipelines in a production environment, identifying, analyzing, and resolving issues as they arose. I managed AWS service configurations and monitored resource usage to optimize costs. I engineered an automated system to monitor and terminate long-running AWS EMR and Databricks clusters, achieving a 20% reduction in operational costs. I also developed a custom script to track cost savings and send notifications about them across multiple AWS accounts. Through growth assessments and cost analyses for AWS S3, I identified optimization opportunities that reduced monthly expenditure. I implemented logging with CloudWatch Logs and CloudWatch Logs Insights, improving visibility across applications such as Airflow, EMR, and Databricks. I created a system to detect both over- and under-consumption of resources in EMR Serverless, helping to minimize downtime and prevent resource bottlenecks. I led the End-to-End Job Execution Metrics Dashboard for EMR Serverless and Databricks, engaging with clients and providing timely updates that were critical to building a complex business case. I also investigated Databricks AI/BI and Databricks Genie for generating AI-driven insights, and explored OpenSearch to improve observability of job logs and metrics.
Cloud and Data Engineer
Infocepts • February 2023 - September 2023
I used Hadoop, Spark, SQL databases, AWS, data warehousing, Apache Airflow, and Terraform to design and implement end-to-end data pipelines. These pipelines collected data from third-party environments and ingested it into storage systems for analysis and transfer to the semantic layer. One project was an end-to-end pipeline that analyzed Zomato data and ranked restaurants by rating using Apache Hadoop, Apache Hive, and Apache Spark, with Linux shell scripting for automation. Another was an end-to-end pipeline that analyzed IPL data from 2007 to 2022, covering key performance indicators such as top centuries, most sixes, and most fours, along with match-specific metrics and season-based analyses. This project used AWS services including S3, Glue, and Lambda. I also built a pipeline to process transactional data with both batch and stream processing. It included a third-party system that generated events for AWS Lambda, enabling real-time transaction processing and storage. The technologies used for this project were AWS Lambda, Amazon SQS, Amazon DynamoDB, Amazon S3, and React.js.
Education
Shri Ramdeobaba Kamla Nehru Engineering College
Electronics Design Technology, B.Tech • July 2019 - July 2023