Why should we normalize data for machine learning or deep learning?

Normalization reduces the complexity of the problem your network is trying to solve, which can improve your model's accuracy and speed up training. You bring the data onto the same scale and reduce variance, and none of the weights in the network are wasted on learning a normalization for you, meaning they can be used more efficiently to solve the actual task at hand.

Normalization is an essential step in data preparation for machine learning and deep learning, giving features a fairer, more balanced representation. While its impact on accuracy may vary depending on the specific context, it benefits training speed and reduces scale-induced bias, making it a valuable tool for improving models.

Why normalize?

Normalization, in essence, involves transforming features to a common scale, e.g. [0, 1]. This addresses the issue of varying scales among different features, which can otherwise bias the model toward whichever feature happens to have the largest values.
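One common form is min-max scaling, which maps every value of a feature into [0, 1]:

x_normalized = (x - x_min) / (x_max - x_min)

where x_min and x_max are the minimum and maximum values of that feature.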

Let's take a simple example where you're building a logistic regression model to predict whether a customer's loan will be approved or not.

  1. You have two features: Age and Income.
  2. Age typically falls within the range [0, 120].
  3. Income spans a much broader range, say [10000, 100000].

If you use these features in their original form, the Income variable will likely dominate the model's decision-making process and overshadow the influence of Age, because the larger magnitude of the Income values has a greater impact on the model's weights. To rectify this imbalance, normalization comes into play.

Let's put them into the model's equation:

Y = weight_1 * (Age) + weight_2 * (Income) + some_constant

Just for the sake of explanation, let Age be in the range [0, 120] and Income in the range [10000, 100000]. The scales of Age and Income are very different.

If you feed them in as is, the weights weight_1 and weight_2 may end up biased: because the raw Income numbers are so much larger, weight_2 effectively gives Income far more influence over Y than weight_1 gives Age.

To bring them to a common level, we can normalize them, for example by mapping all ages into the range [0, 1] and all incomes into the range [0, 1]. Now Age and Income are given equal importance as features, as the sketch below illustrates.
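As a minimal sketch of min-max scaling with plain NumPy (the sample values below are made up purely for illustration):

```python
import numpy as np

def min_max(x):
    """Scale a feature to [0, 1] using its own min and max."""
    return (x - x.min()) / (x.max() - x.min())

# Hypothetical sample values, invented for this example
age = np.array([25.0, 40.0, 67.0, 18.0, 90.0])                    # roughly [0, 120]
income = np.array([12000.0, 55000.0, 98000.0, 30000.0, 76000.0])  # roughly [10000, 100000]

print(min_max(age))     # approx. [0.097 0.306 0.681 0.    1.   ]
print(min_max(income))  # now on the same [0, 1] scale as age
```

After this transform, a unit change in either feature carries comparable weight, so the model no longer favours Income simply because its raw numbers are bigger.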

Does normalization always increase accuracy?

Apparently, no. Normalization does not always increase accuracy; it may or may not, and you rarely know until you try it on your data. It also depends on where in training you apply normalization, whether you normalize after every activation (as batch normalization does), and so on.

Because normalization narrows the feature values down to a small, common range, the optimization problem becomes better conditioned and gradient-based training usually converges in fewer steps. So, usually, the model trains a bit faster.
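As a rough illustration (the synthetic data and labels below are assumptions, and exact iteration counts will vary with data and solver), you can compare how many solver iterations scikit-learn's LogisticRegression needs with and without min-max scaling:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Synthetic loan-style data: Age ~ [0, 120], Income ~ [10000, 100000]
age = rng.uniform(0, 120, size=1000)
income = rng.uniform(10_000, 100_000, size=1000)
X = np.column_stack([age, income])
# Toy label that depends on both features equally once they are scaled
y = (age / 120 + income / 100_000 > 1.0).astype(int)

raw = LogisticRegression(max_iter=10_000).fit(X, y)
scaled = LogisticRegression(max_iter=10_000).fit(MinMaxScaler().fit_transform(X), y)

print("solver iterations, raw features:   ", raw.n_iter_[0])
print("solver iterations, scaled features:", scaled.n_iter_[0])
```

On badly scaled inputs the optimizer typically needs many more iterations to converge, which is exactly the training-speed benefit described above.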

Regarding the number of epochs: accuracy usually increases with the number of epochs, provided that your model doesn't start overfitting.

Keep in Mind

However, not every dataset and use case requires normalization; it's primarily necessary when features have very different ranges. You may want to use it when:

  • You want to improve your model's convergence and make optimization feasible
  • You want training to be less sensitive to the scale of individual features, so the coefficients are easier to estimate and compare
  • You want to compare or combine analyses across multiple models

Normalization is not recommended when:

  1. You are using decision tree models or ensembles based on them
  2. Your data is not normally distributed; you may have to use other data pre-processing techniques
  3. Your dataset comprises already-scaled variables

In short, normalization can improve performance in many cases, but it is not always necessary.
