Work Experience
ETL Engineer
Fullthrottle.ai • July 2023 - Present
- Implemented a data integration pipeline on AWS Lambda to automate ETL for data sourced from AWS S3, standardizing the data schema, enabling discovery and observability through DataHub integration, and storing the standardized data in a Data Lake.
- Implemented Pandas jobs on EC2 to ETL data from multiple sources (FTP, APIs, S3, databases), derived business metrics from them, and loaded the results into the Data Lake (Postgres, Snowflake).
- Improved the efficiency of data processes within the existing budget by moving from sequential to multi-threaded (Big Data) processing, resulting in a 15% speed boost.
- Designed and implemented a CI/CD pipeline connecting GitHub repositories with AWS Lambda and EC2 instances, making deployment automated and reliable.
- Enriched logging and alerting for AWS Lambda by creating CloudWatch metrics and deployed Grafana to display the logs, reducing troubleshooting time by 50% for both EC2 jobs and AWS Lambda.
Data Engineer
ZaloPay • March 2021 - April 2023
- Implemented a data pipeline on an Apache NiFi cluster that consumes data from multiple sources (APIs, Redshift, Kafka) and feeds ETL/ELT, data modeling, and Grafana dashboards, handling approximately 20M records daily and saving them to the Data Lake in near real time.
- Implemented Spark jobs for data consistency that read, compare, and recover data to identify missing entries; the solution was applied across 16 teams managing over 70 log types.
- Implemented end-to-end encryption/decryption of PII data in Apache NiFi, integrated with AWS services (KMS, Secrets Manager) and with DataHub to identify PII fields, and saved the results to MySQL.
- Implemented data quality checks by developing automated tests and integrating with DataHub (metadata platform) to standardize the data schema, identifying and mitigating data quality issues early in processing and saving engineering time.
- Deployed DataHub to enable data discovery and document data meaning for all data in ZaloPay, and integrated it as the last stage of the ETL pipeline to standardize the data schema before storage in the Data Lake.
- Built a Grafana dashboard for monitoring NiFi cluster metrics, used to visualize and track failing processors in the ETL pipeline and to give insight into its health and performance.
Software Engineer
VNG • June 2020 - March 2021
- Implemented API web services for an e-wallet using Java Spring Boot and Docker, integrated with Apache Kafka as a message queue for consuming data, handling concurrent threads, and saving the data to a database.
- Implemented a load testing system: worked with 6 teams to finalize the testing flow and implemented tests covering all scenarios.
- Built a reporting dashboard with metrics for verifying that the service meets the required throughput and latency.
Education
Arizona State University, Tempe
Computer Science & Engineering, MS • August 2023 - Present
Ho Chi Minh City University of Science
Computer Science, BE • June 2017 - July 2022