Machine Learning
Machine Learning, a subdomain of artificial intelligence, allows computers to learn patterns from data and produce output without being explicitly programmed for each task.
This competency area includes feature selection and model selection; selecting, using, and optimizing machine learning models; procuring data and performing basic operations on it; applying the One-Hot encoding technique; creating new features through feature engineering; record sampling; running inference on a pre-trained machine learning model; training a new machine learning model from scratch; evaluating the performance of a machine learning model; and tuning a machine learning model to achieve better performance, among others.
Key Competencies:
- Feature engineering - Using feature selection and model selection.
- Machine Learning Models - Selecting, using, and optimizing machine learning models.
- Libraries - Familiarity with Python libraries commonly used in machine learning, such as SciPy, SymPy, NumPy, Pandas, Scikit-learn, and Matplotlib.
- Working with Jupyter Notebook - Setting up a machine learning development environment (and associated tools) by launching a Jupyter Notebook server from the command line and creating a new notebook.
- UCI Machine Learning Repository - Procuring data from the UCI Machine Learning Repository for use during the machine learning process. Data should be downloaded from within a Jupyter Notebook.
- Ingest data and display first 10 observations - Ingesting downloaded data and inspecting it using libraries such as Pandas, a popular Python data analysis library that provides dataframes.
- Visualize data - Visualizing downloaded data using libraries such as Matplotlib, a popular Python library that provides 2D plotting.
- Clean and prepare data - Cleaning and preparing data by removing null (or empty) and duplicate observations using libraries such as Pandas.
- Transform categorical features to numerical - Transforming categorical values to numerical values using libraries such as Pandas. For example, converting “True” or “False” values to “1” or “0”, respectively.
- Determine data ratios - Determining class ratios (for binary classification) using Pandas. For example, counting the “True” and “False” observations across a dataset.
- Transform features using the One-Hot encoding technique - Transforming categorical features to numerical features using the One-Hot encoding data transformation technique.
- Create new features using feature engineering - Transforming a single feature into multiple features so that a model can more easily find patterns in the data. For example, transforming a date timestamp to month, day of week, and time of day features.
- Perform record sampling - Removing observations based on business need and the business problem being solved.
- Split data for training and evaluation purposes - Performing an 80/20 split of data (80% for training, 20% for evaluation) using Scikit-learn, a machine learning library for Python.
- Load pre-trained machine learning model - Selecting and loading a pre-trained computer vision machine learning model from Model Zoo.
- Run inference on pre-trained machine learning model - Running inference on a pre-trained machine learning model loaded from Model Zoo.
- Train Machine Learning model from scratch - Training a brand new machine learning model from scratch using Scikit-learn, a machine learning library for Python.
- Evaluate the performance of a Machine Learning model - Evaluating the performance of a machine learning model using Scikit-learn. For example, assessing the model’s accuracy by generating and reviewing a confusion matrix and a classification report.
- Tune a Machine Learning model - Tuning a machine learning model to achieve better performance by adjusting hyperparameters and re-training the model.
- Deploy Machine Learning model to Model Zoo - Deploying a custom-trained machine learning model to Model Zoo for use in production environments.
- Monitor Machine Learning model - Monitoring a deployed machine learning model to ensure it is operating within expectations. For example, creating an alert when the model receives input data unlike any seen in the training data.
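
The data preparation competencies listed above (inspecting, cleaning, converting “True”/“False” values, determining data ratios, One-Hot encoding, and deriving features from a timestamp) can be sketched with Pandas roughly as follows. This is a minimal illustration, not a prescribed workflow: the toy dataset and its column names (timestamp, plan, churned) are invented here and do not come from the UCI Machine Learning Repository.

```python
import pandas as pd

# Hypothetical toy dataset; column names and values are invented for illustration.
raw = pd.DataFrame({
    "timestamp": ["2021-01-04 08:30", "2021-01-09 14:00", "2021-01-09 14:00",
                  "2021-02-15 19:45", None],
    "plan": ["basic", "premium", "premium", "basic", "basic"],
    "churned": ["True", "False", "False", "True", "False"],
})

# Inspect the first observations (head(10) returns at most 10 rows).
print(raw.head(10))

# Clean: drop rows containing null values, then drop exact duplicate rows.
clean = raw.dropna().drop_duplicates().reset_index(drop=True)

# Transform categorical to numerical: map "True"/"False" strings to 1/0.
clean["churned"] = clean["churned"].map({"True": 1, "False": 0})

# Determine data ratios for binary classification (share of each class).
ratios = clean["churned"].value_counts(normalize=True)
print(ratios)

# One-Hot encode the categorical "plan" column into one 0/1 column per value.
encoded = pd.get_dummies(clean, columns=["plan"], dtype=int)

# Feature engineering: turn one timestamp into month, day of week, and hour.
ts = pd.to_datetime(encoded["timestamp"])
encoded["month"] = ts.dt.month
encoded["day_of_week"] = ts.dt.dayofweek  # Monday=0 ... Sunday=6
encoded["hour"] = ts.dt.hour
features = encoded.drop(columns=["timestamp"])
print(features)
```

Each step returns a new dataframe or adds columns to a fresh copy, so the raw data stays available for comparison after cleaning.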
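
The splitting, training, evaluation, and tuning competencies can likewise be sketched with Scikit-learn. The example below makes several assumptions that are not part of the list above: it uses the breast cancer dataset bundled with Scikit-learn (a copy of a UCI dataset) rather than a fresh download, it picks logistic regression as the model, and it tunes only the regularization strength C over an arbitrary grid.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Bundled binary classification dataset (an assumption made for this sketch).
X, y = load_breast_cancer(return_X_y=True)

# 80/20 split; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Train a new model from scratch: scale the features, then fit the classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Evaluate on the held-out 20% with a confusion matrix and classification report.
y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred))

# Tune: try several values of the hyperparameter C with 5-fold cross-validation
# on the training data; GridSearchCV re-trains the best setting at the end.
param_grid = {"logisticregression__C": [0.01, 0.1, 1, 10]}
search = GridSearchCV(model, param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
```

Tuning uses only the training portion, so the held-out 20% remains an unbiased check of the final model.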