Badges
Certifications
Work Experience
Senior Systems Engineer
Infosys Limited • June 2019 - Present
• Developed Spark programs using the PySpark API to compare the performance of Spark with Hive and SQL.
• Implemented Spark jobs using Spark SQL for faster data processing.
• Used Impala to query HDFS data for better performance.
• Imported data from Azure in different file formats (JSON, Parquet, ORC, CSV, text) into Spark DataFrames, and performed transformations and actions to cleanse the data.
• Used Spark SQL to load JSON and CSV data into Hive tables and to handle structured data.
• Used the ORC and Parquet data formats to store data in HDFS.
• Used Control-M to schedule jobs.
• Loaded data into Spark RDDs and performed in-memory computations.
• Raised CRQs and worked on production deployment activities.
• Performed load and integration tests and created runbooks.
• Developed database objects such as tables, views, and stored procedures for replicating data from HDFS to Azure SQL.
• Analyzed data mappings, cleansed the data, and created Spark jobs for data transformation and aggregation.
• Deployed through Telstra's internal tool (Merlin) up to UAT; once UAT was complete, the platform team deployed the same manifest to the production environment.
• Validated the components and the build-activities checklist after each deployment.
• Monitored data-loading jobs from different source systems and kept them running as BAU by tracking and fixing issues.
Environment: Cloudera Manager, Azure HDInsight, Hadoop, HDFS, Spark, Spark SQL, Impala, Hue, Control-M, BMC ITAM, Merlin, Git, Linux, Azure SQL, Teradata
Software Engineer
DXC.technology • November 2016 - May 2019
Worked on two projects as a T-SQL Developer and a Big Data Engineer.
Education
JNTU, Hyderabad (Jawaharlal Nehru Technological University)
Electronics & Communication, B.Tech • October 2012 - May 2016
Links
Skills