Supervised and Unsupervised Learning in Machine Learning

Put simply, Machine Learning (ML) is a way for a computer to learn to predict an output from data. To achieve this, an algorithm (called an ML algorithm) is trained on a specific dataset.

There are two main types of Machine Learning:

  1. Supervised Learning

  2. Unsupervised Learning

Supervised Learning

Here, the model is trained on data that maps inputs to their desired outputs. You first give your algorithm the inputs together with the desired outputs, and then train your model. After the model has learned from your data, you give it a completely new input and it tries to produce the matching output.

Spam filtering and speech recognition are among the most widely used applications of supervised learning.

Supervised learning has two subcategories of its own:

  1. Regression

  2. Classification

Regression

In regression, the algorithm outputs a number from a continuous range of possible values. Let's take house price prediction as an example.

You will need a sample dataset of houses that records the number of rooms, washrooms, and floors of each house along with its price. This data is then used to train your ML model.

If you then need to predict the price of a house with 10 rooms, 3 washrooms, and 3 floors, you give these values to the trained ML model and it returns an estimated price.
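The house price example can be sketched in a few lines of Python. This is only a minimal illustration, not a production model: the dataset is invented, the function name predict_price is hypothetical, and k-nearest-neighbours regression (averaging the prices of the most similar houses) is just one simple technique chosen here because it needs nothing beyond the standard library.

```python
import math

# Hypothetical training data: (rooms, washrooms, floors) -> price.
# All numbers are invented purely for illustration.
HOUSES = [
    ((3, 1, 1), 100_000.0),
    ((4, 2, 1), 150_000.0),
    ((6, 2, 2), 220_000.0),
    ((8, 3, 2), 300_000.0),
    ((10, 3, 3), 380_000.0),
    ((12, 4, 3), 450_000.0),
]


def predict_price(features, training_data=HOUSES, k=3):
    """Estimate a price as the average price of the k most similar houses."""
    # Sort the known houses by straight-line distance to the new house.
    by_distance = sorted(
        training_data, key=lambda item: math.dist(item[0], features)
    )
    nearest = by_distance[:k]
    # The prediction is a number, which is what makes this regression.
    return sum(price for _, price in nearest) / len(nearest)
```

Calling predict_price((10, 3, 3)) on this toy data averages the three closest houses (priced 380,000, 300,000, and 450,000) and returns roughly 376,667.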

Classification

In classification, your algorithm returns a category (also called a class) as its output. One of the classic examples of classification is spam filtering.

The algorithm is trained on a large dataset of sample emails. The inputs are the sender, the subject line, and the content of each email, and the output (spam or not spam) is already labeled for every email in the dataset.

Then, when a new email is sent to a user, the algorithm tries to classify it as spam or not spam based on features such as the sender, the subject line, and the content of the email.
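A spam filter along these lines could be sketched as follows. The emails, the labels, and the function names train and classify are all made up for illustration, and for brevity the sketch looks only at the email text, not the sender or subject line. It uses a small naive Bayes classifier, which is one common choice for this task but not the only one.

```python
import math
from collections import Counter

# Hypothetical labelled emails: (text, label). Invented for illustration.
TRAINING_EMAILS = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
    ("project report attached", "not spam"),
]


def train(emails):
    """Count how often each word appears in each class."""
    word_counts = {}
    label_counts = Counter()
    for text, label in emails:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_emails = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from how common the class is in the training data.
        score = math.log(label_counts[label] / total_emails)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log(
                (counts[word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

After word_counts, label_counts = train(TRAINING_EMAILS), the call classify("free prize now", word_counts, label_counts) returns "spam" on this toy data, because those words appear far more often in the spam examples.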

Unsupervised Learning

In supervised learning, we used data made up of inputs paired with their correct outputs. In unsupervised learning, those outputs are not present in the data. Instead, the algorithm looks for patterns in the data and groups or organizes it according to rules it works out itself.

Clustering

Clustering is commonly used for tasks such as customer segmentation or grouping similar documents. In customer segmentation, the algorithm tries to group customers who behave alike, and the resulting groups might correspond to segments such as "paying users", "members", "blog readers", and so on. An algorithm that performs this kind of grouping is known as a clustering algorithm.
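To make the idea of grouping concrete, here is a minimal sketch of k-means, one widely used clustering algorithm. The customer data (visits per month, money spent per month) and the function name k_means are assumptions made for this example. Starting from the first k points is also a simplification to keep the run deterministic; real implementations usually pick starting points more carefully.

```python
import math

# Hypothetical customers: (visits per month, money spent per month).
CUSTOMERS = [
    (2, 5.0), (25, 120.0), (3, 0.0),
    (30, 150.0), (1, 10.0), (28, 130.0),
]


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, labels)."""
    # Simplifying assumption: the first k points are the starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels
```

Running k_means(CUSTOMERS, k=2) splits this toy data into occasional visitors and frequent, high-spending visitors. Note that the algorithm only returns group numbers (0 and 1 here); names like "paying users" are attached afterwards by a person looking at each group.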

Anomaly Detection

It aims to identify unusual data points that do not fit the pattern followed by the rest of the data. It is used in tasks such as detecting fraudulent transactions.
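One very simple way to flag unusual values is to measure how far each one lies from the average, in units of standard deviation (a z-score). The sketch below assumes this approach; the function name find_anomalies, the threshold of 2.0, and the transaction amounts are all illustrative choices, and real fraud detection systems rely on far richer signals.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` std devs from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        # Every value is identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical transaction amounts; the last one is deliberately unusual.
TRANSACTIONS = [20, 25, 22, 19, 24, 21, 23, 500]
```

Here find_anomalies(TRANSACTIONS) returns [500], since that amount sits well over two standard deviations from the mean while the others stay close to it.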

Dimensionality Reduction

It involves reducing the number of parameters (features) in the data while keeping as much of the information in the dataset as possible. It is used in tasks like data compression.
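As a small illustration, the sketch below reduces two parameters to one by projecting each point onto the direction in which the data varies the most. This is the idea behind principal component analysis (PCA), restricted here to the two-dimensional case so the direction can be found with a closed-form angle. The function name project_to_1d and the sample points are assumptions for this example.

```python
import math


def project_to_1d(points):
    """Reduce 2-D points to 1-D along their direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centre the data so the direction passes through the mean.
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    var_x = sum(x * x for x, _ in centered) / n
    var_y = sum(y * y for _, y in centered) / n
    cov_xy = sum(x * y for x, y in centered) / n
    # For a 2x2 covariance matrix, this angle gives the principal axis.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(angle), math.sin(angle))
    # Each point is now described by a single number instead of two.
    return [x * direction[0] + y * direction[1] for x, y in centered]


# Hypothetical points that all lie on the line y = 2x.
POINTS = [(1, 2), (2, 4), (3, 6), (4, 8)]
```

Because these sample points lie exactly on a line, project_to_1d(POINTS) keeps the spacing between neighbouring points intact: one number per point is enough to describe the data without losing information. With noisier data, some information is lost, and the goal is to lose as little as possible.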