
Deep Learning - ANN - Artificial Neural Network - Data Scaling & Feature Scaling Tutorial

Normalization

Normalizing inputs means bringing two inputs with different scales (e.g. age and salary) onto the same scale.

If we know the minimum and maximum value a feature can take, e.g. CGPA always lies between 0 and 10, then we can use normalization (min-max scaling).

For salary, we don't know the min and max values, and the data is not normally distributed; in that case we can use standardization.
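
In formula form (these are the standard definitions of the two techniques):

Normalization (min-max): x' = (x − xmin) / (xmax − xmin), which maps the feature into the range [0, 1].

Standardization (z-score): x' = (x − μ) / σ, where μ is the mean and σ the standard deviation, giving the feature zero mean and unit variance.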

Feature Scaling Practical
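
A minimal sketch of such a practical, assuming NumPy and scikit-learn (the CGPA and salary values are made up for illustration):

```python
# Feature scaling sketch: min-max normalization vs. standardization.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[6.5,  30000.0],
              [8.2,  55000.0],
              [9.1, 120000.0],
              [7.4,  42000.0]])   # columns: CGPA, salary (illustration data)

# CGPA: bounds are known (0 to 10), so apply min-max normalization directly.
cgpa_norm = (X[:, 0] - 0.0) / (10.0 - 0.0)            # values land in [0, 1]

# Salary: min/max unknown, so standardize to mean 0 and std 1 instead.
salary_std = StandardScaler().fit_transform(X[:, [1]]).ravel()

print(cgpa_norm)    # [0.65 0.82 0.91 0.74]
print(salary_std)   # zero mean, unit standard deviation
```

CGPA uses its known bounds directly, while salary falls back to StandardScaler because its bounds are unknown.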

 

Batch Normalization

Batch Normalization is an algorithmic method that makes the training of deep neural networks (DNNs) faster and more stable.

It consists of normalizing activation vectors from hidden layers using the mean and variance of the current batch. This normalization step is applied right before (or right after) the nonlinear function.

In batch normalization, the activations too are normalized, by making their mean zero and standard deviation one.

Why use Batch Normalization?

To bring the inputs of each layer onto the same scale by making the mean zero and the standard deviation one, so that training is faster and more stable.

Internal Covariate Shift

What is Covariate shift?

A covariate shift is a change in the distribution of the model's input features between the data seen during training and validation and the data seen in production.

A good example of a covariate shift: a model trained only on images of red roses is later shown roses of other colors. The distribution of its inputs has shifted away from the one it was trained on, so it wrongly outputs "Not Rose". This difference between the training distribution and the distribution seen at prediction time is the covariate shift.

Internal covariate shift is defined as the change in the distribution of network activations due to the change in network parameters during training.

Think of a whispered message passed along a line of children, where each child is a hidden layer: the input layer passes "peas" to hidden layer 1, hidden layer 1 passes "bees" to hidden layer 2, then "knees", then "cheese", and finally the output layer gives the output "fleas".

This shows how the information changes as it passes through the hidden layers. With unnormalized data, the distribution of the activations keeps changing in the same way, and the model will not train properly.

Batch normalization makes sure that the distribution of activations stays normalized (zero mean, unit variance) at the end of each layer, which helps in reducing internal covariate shift, so that the model trains quickly and properly.

 

How to apply batch normalization?

  • Apply batch normalization using mini-batch gradient descent.

  • Apply batch normalization layer by layer. For the first neuron of the first hidden layer (subscript = neuron, superscript [1] = layer):

Z1[1] → Z1[1]norm → Z1[1]BN = γ · Z1[1]norm + β (scale & shift) → A1[1] = g(Z1[1]BN)
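
Written out, the standard batch normalization computation (Ioffe & Szegedy, 2015) over a mini-batch z(1), …, z(m) is:

μB = (1/m) · Σ z(i)  (mini-batch mean)

σB² = (1/m) · Σ (z(i) − μB)²  (mini-batch variance)

znorm(i) = (z(i) − μB) / √(σB² + ε)  (normalize; ε avoids division by zero)

zBN(i) = γ · znorm(i) + β  (scale and shift; γ and β are learned per feature)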

Advantages of Batch Normalization

  • Makes training more stable, because hyperparameter values can be chosen from a wider range

  • Makes training faster, because a higher learning rate can be used

  • Helps with regularization

  • Reduces the impact of weight initialization
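
The parameter counts below correspond to a Keras-style model summary. A minimal sketch that would reproduce them, assuming TensorFlow/Keras (the 4-feature input and the hidden layers of 3 and 2 neurons are assumptions chosen to match the counts):

```python
# Applying batch normalization layer by layer in Keras.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                     # hypothetical 4 input features
    tf.keras.layers.Dense(3, activation="relu"),    # hidden layer 1: 3 neurons
    tf.keras.layers.BatchNormalization(),           # 4 * 3 = 12 params (6 trainable, 6 non-trainable)
    tf.keras.layers.Dense(2, activation="relu"),    # hidden layer 2: 2 neurons
    tf.keras.layers.BatchNormalization(),           # 4 * 2 = 8 params (4 trainable, 4 non-trainable)
    tf.keras.layers.Dense(1, activation="sigmoid"), # output layer
])
model.summary()   # reports the trainable / non-trainable split per layer
```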

 

Each BatchNormalization layer has 4 parameters per feature: gamma weights and beta weights (trainable), plus moving_mean and moving_variance (non-trainable).

The total parameter count of the 1st batch normalization layer is 12, i.e. 4 * the number of neurons in the preceding hidden layer (3) = 12:

6 trainable and 6 non-trainable parameters.

The total parameter count of the 2nd batch normalization layer is 8, i.e. 4 * the number of neurons in the preceding hidden layer (2) = 8:

4 trainable and 4 non-trainable parameters.
