Certifications
shenwenfei has not earned any certificates yet.
Work Experience
Data Engineer
United Health Centers • May 2018 - August 2023
Proven experience in designing, building, and optimizing data pipelines and architectures for efficient data processing, transformation, and storage. Expertise in integrating structured, semi-structured, and unstructured data using advanced programming and data engineering tools to support analytics, AI models, and business decision-making. Adept at delivering scalable, real-time solutions for complex business challenges.
● Developed and maintained scalable ETL pipelines using Python and PySpark to integrate structured (SQL) and semi-structured (JSON, log files) data, ensuring high-quality datasets for analytics and reporting.
● Designed and implemented real-time data processing pipelines built on Kafka, enabling actionable insights through low-latency analytics.
● Leveraged AWS services such as S3 to build a centralized data lake, improving the accessibility and scalability of large datasets.
● Optimized PostgreSQL databases by designing efficient schemas and automating data ingestion workflows, reducing query response time by 15-30%.
● Collaborated with data scientists to preprocess structured and unstructured data for training predictive models using PyTorch, TensorFlow, and Python.
● Developed custom Flask APIs with SQLAlchemy to support data management and real-time reporting for multiple business units, such as the emergency department.
● Automated workflows using Apache Airflow, significantly reducing the manual effort involved in recurring data transmission tasks.
● Processed and analyzed unstructured data (images, audio, video) using Python, delivering visualizations and insights that enhanced decision-making.
● Enhanced data warehouse performance by optimizing data models and applying best practices in SQL and Databricks.
● Worked with PyMongo to process and store semi-structured document data in MongoDB, ensuring efficient data access and scalability.
● Spearheaded the development of data pipelines in Databricks using PySpark.
Education
State University of New York at Binghamton
MS • September 2012 - August 2014
Links
shenwenfei has not updated links details yet.