Supervised Learning

Supervised learning entails learning a mapping between a set of input variables (typically a vector) and an output variable (also called the supervisory signal) and applying this mapping to predict the outputs for unseen data. Supervised methods attempt to discover the relationship between input variables and target variables. The relationship discovered is represented in a structure referred to as a model. Usually models describe and explain phenomena, which are hidden in the dataset and can be used for predicting the value of the target attribute knowing the values of the input attributes.

Supervised learning is the machine learning task of inferring a function from supervised

training data (set of training examples). The training data consists of a set of training

examples. In supervised learning, each example is a pair consisting of an input object and a

desired output value. A supervised learning algorithm analyzes the training data and

produces an inferred function.

From Statistics for Machine Learning (https://www.safaribooksonline.com/library/view/statistics-for-machine/9781788295758/94fde0ee-fc9e-4cbc-aac4-d8dc1d9d94f4.xhtml)\

Many supervised machine learning methods fall in to this category:

  • Classification problems

  • Logistic regression

  • Lasso and ridge regression

  • Decision trees (classification trees)

  • Bagging classifier

  • Random forest classifier

  • Boosting classifier (adaboost, gradient boost, and xgboost)

  • SVM classifier

  • Recommendation engine

  • Regression problems

  • Linear regression (lasso and ridge regression)

  • Decision trees (regression trees)

  • Bagging regressor

  • Random forest regressor

  • Boosting regressor - (adaboost, gradient boost, and xgboost)

  • SVM regressor

Some of the issues to consider in supervised learning are as follows:

  • Bias-variance trade-off

  • Function complexity and amount of training data

  • Dimensionality of the input space

  • Noise in the output values

  • Heterogeneity of the data

  • Redundancy in the data

  • Presence of interactions and non-linearity

Typically the problems will look like

These problems each contain a target of interest (Did the Titanic passenger survive? Did the customer churn? What’s the MPG?) and a set of training data with known values of the target. Indeed, most problems in machine learning are supervised in nature, and most ML techniques are designed to solve supervised problems.

Supervised Learning:

  • is like learning with a teacher

  • training dataset is like a teacher

  • the training dataset is used to train the machine

Example:

Classification:Machine is trained to classify something into some class.

  • classifying whether a patient has disease or not

  • classifying whether an email is spam or not

Regression:Machine is trained to predict some value like price, weight or height.

  • predicting house/property price

  • predicting stock market price

Last updated

Was this helpful?