Alexander Pan

United States

@avpan

Badges

Problem Solving
CPP
Python
Days of Statistics
SQL

Certifications

Work Experience

  • Junior Data Scientist

    All Inbox•  November 2018 - January 2020

    ● Completed ad hoc requests for charts and dashboards for various departments and individuals using SQL
    ● Wrote SQL queries to create datasets for analysis, training, and other purposes
    ● Created machine learning and probabilistic models to answer questions such as estimating customer lifetime value and optimizing offer sequences, using probabilistic beta-geometric, regression, and classification modeling
    ● Automated various company alerts using Airflow, Snowflake, Redshift, and Slack

  • Junior Data Scientist

    Statusquota•  October 2017 - October 2018

    ● Wrote machine learning notebooks and scripts in Python for exploration and deployment
    ● Wrote SQL and data models to perform analyses and other data manipulation tasks
    ● Project 1: Created a visual, interactive analytics dashboard that allowed clients to gain insights for business decisions.
    ● Project 2: Analyzed customer purchase data for a public corporation well known in the clothing/fashion industry. The goal was to leverage purchases to inform future marketing decisions.
    ● Project 3: Used machine learning regressors and classifiers to solve multiple problems, such as identifying features relevant to binary and multiclass classification and performing multivariable regression analysis.
    ● Project 4: Ad hoc classification problem for a large networking client

  • Freelancer

    Upwork (formerly Elance-oDesk)•  March 2017 - September 2017

    Worked with a client to develop a classification model using 10 years' worth of NFL game datasets. Project is in the beginning stages. Assisted in writing solutions and grading assignments and exams for a Multivariate Statistics course and a Data Science course. Topics included, but were not limited to, Principal Component Analysis, K-Means Clustering, and Hierarchical Clustering. Analyzed car rental data for a client to identify which features were most associated with rentals, using PCA, K-means clustering, and Bayesian statistics. Analyzed a client's Google Analytics data to understand user behavior, applying regression analysis, clustering methods, and basic data analysis to the dataset.

  • Fellow

    Data Incubator•  September 2016 - October 2016

    Identified (top 4% among 3000+ prospective data scientists) for skills in statistics, mathematics, data analysis, and computer science to participate in a rigorous 8-week data science fellowship program. Throughout the program, I developed a project that uses Riot Games' API to aggregate and analyze 5.4 GB of match data. I used the in-game features to predict a team's win percentage during the course of a match. The analysis and model can be viewed at https://hextechmodeling.herokuapp.com. Each week we completed miniprojects related to tools used in the industry:

    1. SQL: Aggregated and analyzed relational databases containing data on 530,000 NYC restaurant inspections from Yelp to extract statistics on inspection grades and violations across various locations and cuisines.
    2. Machine Learning: Predicted a venue's popularity from information available at the venue's opening. The 34 MB dataset comes from Yelp. The star rating was predicted from city, category, and latitude and longitude. The models were built using transformers and regressors.
    3. MapReduce and Hadoop: Analyzed character entropy of extracted words and n-gram statistics of Simple English (320 MB) and Thai (900 MB) Wikipedia, using mrjob, MapReduce, and Hadoop.
    4. Natural Language Processing: Developed a machine learning regression model to predict Yelp ratings based on metadata analysis of 38,000 venues and text analysis of 1,000,000 Yelp reviews.
    5. Time Series Analysis: Developed a time series model to predict temperature in major US cities based on Fourier analysis of historical weather data spanning 525,000 time points over 12 years.
    6. Spark: Used 10 GB of posts, users, and votes data from Stack Overflow to compute statistics and to predict question tags from body text using a logistic regression model.

Education

  • San Francisco State University

    Physics, MS•  August 2012 - August 2015

  • Purdue University, West Lafayette

    Physics, BS•  January 2007 - December 2010

Skills

avpan has not updated skills details yet.