Supervised Learning - Yousef's Notes

Supervised Learning

Uses datasets of labeled examples to train models for tasks such as #ml/classification and #ml/regression.

#Preconditions

  1. Obtain a labeled dataset.
  2. Split the dataset into training and holdout (validation and test) datasets.
  3. Ensure that records in the validation and test datasets are statistically similar and independent.
  4. Perform data imputation and feature engineering.
  5. Convert all examples into numerical feature vectors.
  6. Select a performance metric that returns a single number.
  7. Establish a baseline to compare the model against.
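
The steps above can be sketched end to end in plain Python. This is only a minimal illustration under stated assumptions: the toy records, the feature names (`length`, `has_link`), the 60/20/20 split, mean imputation, accuracy as the metric, and a majority-class baseline are all hypothetical choices made for the example, not something these notes prescribe.

```python
import random
from collections import Counter
from statistics import mean

# Hypothetical toy records: (raw features, label). None marks a missing value.
RECORDS = [
    ({"length": 120, "has_link": "yes"}, 1),
    ({"length": 15, "has_link": "no"}, 0),
    ({"length": None, "has_link": "yes"}, 1),
    ({"length": 40, "has_link": "no"}, 0),
    ({"length": 200, "has_link": "yes"}, 1),
    ({"length": 33, "has_link": "no"}, 0),
    ({"length": 90, "has_link": "no"}, 0),
    ({"length": None, "has_link": "no"}, 0),
    ({"length": 150, "has_link": "yes"}, 1),
    ({"length": 25, "has_link": "no"}, 0),
]

def split_dataset(records, holdout_fraction=0.4, seed=0):
    """Step 2: shuffle once, then split into training, validation and test."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    holdout, training = shuffled[:n_holdout], shuffled[n_holdout:]
    half = n_holdout // 2
    return training, holdout[:half], holdout[half:]

def fit_imputer(training):
    """Step 4: learn the fill value from the training set only."""
    return mean(f["length"] for f, _ in training if f["length"] is not None)

def to_vector(features, length_fill):
    """Step 5: turn one raw example into a numerical feature vector."""
    length = features["length"] if features["length"] is not None else length_fill
    return [float(length), 1.0 if features["has_link"] == "yes" else 0.0]

def accuracy(y_true, y_pred):
    """Step 6: a performance metric that returns a single number."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline(training_labels):
    """Step 7: the simplest baseline, always predict the most common label."""
    return Counter(training_labels).most_common(1)[0][0]

training, validation, test = split_dataset(RECORDS)
length_fill = fit_imputer(training)
X_train = [to_vector(f, length_fill) for f, _ in training]
y_train = [label for _, label in training]
y_val = [label for _, label in validation]

baseline_label = majority_baseline(y_train)
baseline_accuracy = accuracy(y_val, [baseline_label] * len(y_val))
print(f"baseline accuracy on validation: {baseline_accuracy:.2f}")
```

Any model trained afterwards should beat `baseline_accuracy` on the validation set to be worth keeping. The imputation value is computed from the training set alone so that nothing from the holdout data leaks into training.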

#Main Algorithms

#Examples

  • Handwritten Digit Recognition
  • Spam Detection
  • Customer Segmentation
  • Personalized Treatment
  • Credit Scoring
  • Churn Prediction
  • Object Detection
  • Sentiment Analysis
  • Fraud Detection
  • Learn Detection

#Limitations

  • Needs a significant amount of labeled data, which is time-consuming and expensive to collect and annotate.
  • Training data must represent real-world scenarios and avoid biases.

#Classification

Predicts the category an example belongs to, e.g. spam detection, churn prediction, sentiment analysis, dog breed detection.
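
As a rough illustration of classification, the sketch below uses a k-nearest-neighbours vote for a spam/ham decision. The algorithm choice, the two features (number of links, fraction of capitalised words), and the data points are assumptions made up for the example.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: [number of links, fraction of capitalised words]
train_X = [[0, 0.05], [1, 0.10], [0, 0.02], [7, 0.60], [5, 0.45], [9, 0.80]]
train_y = ["ham", "ham", "ham", "spam", "spam", "spam"]

prediction_a = knn_predict(train_X, train_y, [6, 0.50])
prediction_b = knn_predict(train_X, train_y, [0, 0.03])
print(prediction_a, prediction_b)
```

The output is always one of the labels seen during training, which is what distinguishes classification from regression.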

#Regression

Predicts a numerical value based on previously observed data, e.g. house price prediction, stock price prediction, height-weight prediction.
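
For the height-weight case, a minimal sketch could fit a straight line by ordinary least squares. The data below is hypothetical and deliberately exactly linear, and a one-variable linear model is just one possible choice of regressor.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical observations: height in cm, weight in kg.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 58, 66, 74, 82]

slope, intercept = fit_line(heights_cm, weights_kg)
predicted_weight = slope * 175 + intercept  # prediction for an unseen height
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted_weight:.2f}")
```

Unlike a classifier, the model can output values that never appeared in the training data, such as a weight for a height of 175 cm.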

#Other Problems

Under specific conditions, supervised ML can solve problems beyond classification and regression.

  • Ranking problems
  • Metric learning
  • Time-series forecasting
  • Anomaly detection
  • Structured prediction
  • Imitation learning
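
One way to read the condition mentioned above is that the problem must first be recast as labeled input-output pairs. Taking time-series forecasting as an example, a sliding window turns a raw sequence into such pairs. The function name `make_windows`, the window size, and the series are assumptions for illustration only.

```python
def make_windows(series, window=3):
    """Turn a sequence into (previous `window` values, next value) pairs."""
    features, targets = [], []
    for start in range(len(series) - window):
        features.append(series[start:start + window])
        targets.append(series[start + window])
    return features, targets

series = [10, 12, 14, 16, 18, 20]
X, y = make_windows(series, window=3)

for inputs, target in zip(X, y):
    print(inputs, "->", target)
```

Once the data has this shape, any regression model can be trained on `X` and `y`. For time series the split into training and holdout sets should respect time order rather than shuffle, so that the model is never evaluated on values that precede its training data.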