Ajay Sharma

India

@ajay_sharma_2031

Data Engineer

Badges

Python
SQL

Certifications

Work Experience

  • Data Engineer

    TCS•  December 2021 - Present•  Mumbai

    Client - Trafigura

      • Achieved a 20% improvement in data retrieval efficiency by performance-tuning SQL queries, giving 3 departments faster access to data for reporting and strategic initiatives.
      • Created an end-to-end data pipeline using Azure Databricks, Data Lake Storage, and ADF to ingest data from the source system, process it, and store it in ADLS Gen2.
      • Applied data cleaning techniques to resolve 90% of issues related to duplicates, null values, and data type discrepancies, ensuring high data integrity for downstream analysis teams.
      • Collaborated with cross-functional teams, including finance and sales, to understand data requirements and deliver solutions while working in an Agile/Scrum development environment.
      • Created data processing scripts in the Databricks environment using PySpark, Python, and SQL to extract data from XLS/XLSX/CSV/JSON files and convert it to Parquet/Delta files for easier handling and storage.
      • Currently involved in migrating data from on-premise HDFS to Azure Data Lake Storage using Azure Data Factory to reduce data storage cost and improve data retrieval efficiency.

    Client - Puma Energy

      • Crafted and executed Hive queries that consistently retrieved analytical insights from Hive tables with 100% accuracy, streamlining data workflows and enhancing the reporting team's analytical capabilities.
      • Experienced in solving Big Data problems using Spark with Python.
      • Contributed to data crunching, ingestion, and transformation activities, streamlining data workflows and enhancing analytical capabilities for the reporting team.
      • Experienced in working with structured and semi-structured data.
      • Played a key role in migrating substantial data volumes from traditional RDBMS to HDFS, improving data storage and accessibility, and optimized data processing using Spark with Python for better performance and scalability.
      • Gained a strong understanding of business drivers, underlying data, and processes.
      • Experienced in developing MapReduce jobs with RDDs and DataFrames.

Education

  • Saraswati College of Engineering

    B.E•  June 2018 - June 2021•  CGPA: 7.9

Skills

VS Code
MS SQL Studio
Docker
Git & GitHub
Azure Data Factory
Airflow
MySQL
Hive
HDFS
PySpark
Spark Structured Streaming
Databricks
Azure Synapse Analytics
Python
Shell Scripting
HiveQL
Spark SQL
SQL
Python (Intermediate)