Important Terminologies for Classification in Machine Learning

Classification is the process of categorizing a given set of data into classes. It can be performed on both structured and unstructured data. The process starts with predicting the class of given data points. The classes are often referred to as targets, labels, or categories.

In this blog, I will shed some light on the terminology you need to know before performing classification with a machine learning model.

The following concepts are covered:

  1. Log loss
  2. Confusion matrix
  3. Precision and recall
  4. One hot encoding
  5. Response encoding
  6. Laplace smoothing
  7. SGD Classifier
  8. Calibrated classifier

Log Loss

Log loss is one of the most important classification metrics based on probabilities. Raw log-loss values are hard to interpret, but log loss is still a good metric for comparing models. For any given problem, a lower log-loss value means better predictions.

  • Mathematical Interpretation:

image.png

image.png

Hence further solving the above table, the results will be:

image.png

Log loss is a useful metric for comparing two machine learning models. Unlike accuracy, it is computed from the predicted probabilities rather than from the predicted classes alone.
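As a rough illustration of how log loss is computed for a binary problem, here is a minimal Python sketch. It assumes labels in {0, 1} and predicted probabilities for class 1; the function name `log_loss` and the sample data are hypothetical and chosen only for this example.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary log loss for labels in {0, 1} and predicted P(y = 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip the probability so that log(0) can never occur.
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Hypothetical labels and predictions, for illustration only.
y_true = [1, 0, 1, 1]
confident = [0.9, 0.1, 0.8, 0.7]   # probabilities close to the true labels
hesitant = [0.6, 0.4, 0.6, 0.5]    # probabilities close to 0.5

print("confident model:", log_loss(y_true, confident))
print("hesitant model: ", log_loss(y_true, hesitant))
```

Both sets of predictions would give the same accuracy at a 0.5 threshold or be very close to it, yet the confident model gets the lower log loss because its probabilities sit nearer the true labels.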

Confusion Matrix

A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.

image.png
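To make the four cells of a binary confusion matrix concrete, the sketch below counts them by hand. The helper name `confusion_counts` and the sample labels are assumptions made for this example, not something taken from a particular library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1   # predicted positive, actually positive
        elif predicted == positive:
            fp += 1   # predicted positive, actually negative
        elif actual == positive:
            fn += 1   # predicted negative, actually positive
        else:
            tn += 1   # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Hypothetical test labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 3}
```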

Precision and Recall

Precision tells us how many of the cases predicted as positive actually turned out to be positive.

image.png

Recall tells us how many of the actual positive cases we were able to predict correctly with our model.

image.png
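In terms of confusion-matrix counts, precision is TP / (TP + FP) and recall is TP / (TP + FN). The following is a small sketch of both; the function name `precision_recall` and the data are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here for simplicity.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == positive and p == positive)
    predicted_positive = sum(1 for p in y_pred if p == positive)  # TP + FP
    actual_positive = sum(1 for a in y_true if a == positive)     # TP + FN
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / actual_positive if actual_positive else 0.0
    return precision, recall

# Hypothetical data: 4 actual positives, 3 predicted positives, 2 of them correct.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here two of the three positive predictions are correct, and two of the four actual positives are found, so the model is more precise than it is complete.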

One Hot Encoding

One-hot encoding converts categorical (discrete) features into a numeric form that a model can use. The input to the transformer is a matrix of categorical values, and the output is a (typically sparse) matrix in which each column corresponds to one possible value of one feature.

image.png

The above table, when encoded with 1s and 0s, becomes:

image.png
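The idea can be sketched in a few lines of plain Python. This is only an illustration for a single feature with a dense output; the function name `one_hot_encode` and the color values are made up for this example, and library encoders typically return a sparse matrix instead.

```python
def one_hot_encode(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1   # exactly one column is "hot" per row
        rows.append(row)
    return categories, rows

# Hypothetical categorical feature.
colors = ["red", "green", "blue", "green"]

categories, rows = one_hot_encode(colors)
print(categories)  # ['blue', 'green', 'red']
for color, row in zip(colors, rows):
    print(color, row)
```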

Response Encoding

When a categorical feature has a very large number of distinct values, one-hot encoding creates a huge number of columns and hence increases computation time.

In such cases, we use response encoding, where each category value is replaced by the probability of each class given that value, estimated from the number of times the value occurs with that class.

The more often a category value occurs with a particular class, the higher the probability for that class, and hence the more likely that class is to be predicted.

image.png
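A minimal sketch of response encoding for one categorical feature might look like the following. The function name `response_encode`, the city values, and the labels are all hypothetical. In practice the table would be built from the training split only, so that the labels of validation or test data do not leak into the features.

```python
from collections import Counter, defaultdict

def response_encode(categories, labels):
    """Map each category to [P(class | category) for each class]."""
    classes = sorted(set(labels))
    counts = defaultdict(Counter)
    for category, label in zip(categories, labels):
        counts[category][label] += 1
    table = {}
    for category, class_counts in counts.items():
        total = sum(class_counts.values())
        # A Counter returns 0 for classes never seen with this category.
        table[category] = [class_counts[c] / total for c in classes]
    return classes, table

# Hypothetical training data: one categorical feature and a binary label.
cities = ["A", "A", "A", "B", "B", "C"]
labels = [1, 1, 0, 0, 0, 1]

classes, table = response_encode(cities, labels)
print(classes)      # [0, 1]
print(table["A"])   # P(class 0 | A), P(class 1 | A)
```

However many distinct cities there are, the feature is replaced by just one column per class, which is what keeps the encoded data small.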

Laplace Smoothing

Laplace smoothing is a smoothing technique that helps tackle the problem of zero probability in the Naïve Bayes machine learning algorithm.

image.png
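In its common additive form, Laplace smoothing adds a constant alpha to every count, giving (count + alpha) / (total + alpha * K), where K is the number of distinct values the feature can take. The sketch below assumes that form; the function name and the numbers are hypothetical.

```python
def laplace_smoothed_probability(count, total, num_values, alpha=1.0):
    """Additive (Laplace) smoothing of the estimate count / total.

    count      -- times the feature value was seen with the class
    total      -- times the class was seen overall
    num_values -- number of distinct values the feature can take (K)
    alpha      -- smoothing strength; alpha = 1 is classic Laplace smoothing
    """
    return (count + alpha) / (total + alpha * num_values)

# Hypothetical case: a value never seen with this class in 10 examples,
# for a feature that can take 2 distinct values.
unsmoothed = 0 / 10
smoothed = laplace_smoothed_probability(count=0, total=10, num_values=2)

print(unsmoothed)  # 0.0, which would zero out the whole Naive Bayes product
print(smoothed)    # small but non-zero
```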

SGD Classifier

The SGD (stochastic gradient descent) classifier is a linear classifier that updates its parameters so that the loss decreases with each iteration. With log loss as the objective, it evaluates the loss on a training sample, adjusts the parameters in the direction that reduces it, and repeats until it reaches the best model it can.

image.png
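To show the update loop in isolation, here is a hand-written sketch of a linear classifier trained by stochastic gradient descent on log loss (that is, logistic regression). It is not the implementation used by any particular library; the function names, learning rate, epoch count, and toy data are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Predicted probability of class 1 for one sample x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_sgd_logistic(X, y, lr=0.1, epochs=200, seed=0):
    """Fit weights w and bias b by SGD on the log loss, one sample at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)                 # visit samples in random order
        for i in order:
            error = predict_proba(w, b, X[i]) - y[i]   # d(log loss)/d(score)
            w = [wj - lr * error * xj for wj, xj in zip(w, X[i])]
            b -= lr * error
    return w, b

# Hypothetical one-feature dataset: small values are class 0, large are class 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

w, b = train_sgd_logistic(X, y)
print(predict_proba(w, b, [0.0]), predict_proba(w, b, [3.0]))
```

Each update uses a single sample rather than the whole dataset, which is what makes the method "stochastic" and cheap per step.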

Calibrated Classifier

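A calibrated classifier is one whose predicted probabilities match how often each class actually occurs. A common way to calibrate a model is to pass its raw scores through a sigmoid function, a mathematical function with a characteristic "S"-shaped curve or sigmoid curve. A common example of a sigmoid function is the logistic function.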

image.png
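The sketch below shows the logistic function and how it could map raw classifier scores to values between 0 and 1. The `calibrate` helper and its `slope` and `intercept` values are hypothetical; in sigmoid-based calibration these two parameters would be fitted on held-out data rather than set by hand as they are here.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def calibrate(score, slope=1.0, intercept=0.0):
    """Map a raw score to a probability with a sigmoid (parameters assumed)."""
    return sigmoid(slope * score + intercept)

# Hypothetical raw scores from an uncalibrated classifier.
raw_scores = [-3.0, -0.5, 0.0, 0.5, 3.0]
probabilities = [calibrate(s, slope=2.0, intercept=-0.5) for s in raw_scores]

for score, p in zip(raw_scores, probabilities):
    print(score, round(p, 3))
```

Because the sigmoid is monotonic, the ranking of the scores is preserved; only their scale changes so that they can be read as probabilities.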

The above terms and techniques were explained as briefly as possible. If you want to explore them in depth, feel free to browse YouTube tutorials and other websites :)