
Deep Learning - CNN - Convolutional Neural Network - Backpropagation in CNN Tutorial

What is Backpropagation?

Backpropagation is an algorithm for training neural networks. It fine-tunes the weights of the network based on the error obtained in the previous epoch (i.e., iteration).

Backpropagation, short for "backward propagation of errors," is an algorithm for supervised learning of artificial neural networks using gradient descent. Given an artificial neural network and an error function, the method calculates the gradient of the error function with respect to the neural network's weights using the chain rule.

 

Total Trainable Parameters-

W1 = (3,3) and W2 = (1,4)

b1 = (1,1) and b2 = (1,1)

Total = 3×3 + 1×4 + 1 + 1 = 15 trainable parameters
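
As a quick sanity check of this count (a minimal Python sketch; the shapes are taken directly from the example above):

# Parameter count for the small example CNN above
w1_params = 3 * 3   # one 3x3 convolution filter
b1_params = 1       # one bias for the convolution layer
w2_params = 1 * 4   # dense layer mapping the 4 flattened values to 1 output
b2_params = 1       # one bias for the dense layer
print(w1_params + b1_params + w2_params + b2_params)   # 15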

Loss for Binary Classification is

\(L = -Y_i \log(Y'_i) - (1 - Y_i)\log(1 - Y'_i)\)

where \(Y_i\) is the true label and \(Y'_i\) is the predicted probability (the network output \(A_2\)).
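
A minimal NumPy version of this loss (the clipping epsilon is an implementation detail added here to avoid log(0); it is not part of the formula above):

import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # L = -Y*log(Y') - (1 - Y)*log(1 - Y'), averaged over the batch
    y_pred = np.clip(y_pred, eps, 1 - eps)   # avoid log(0)
    return np.mean(-y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred))

print(binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.2])))   # ~0.164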

 

Forward Propagation-

Z1 = Conv(X, W1) + b1

A1 = Relu(Z1)

P1 = MaxPool(A1)

F = Flatten(P1)

Z2 = W2F + b2

A2 = \(\sigma(Z_2)\)
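
A minimal NumPy sketch of this forward pass. The 6×6 input, the random initialization and the 2×2 pooling window with stride 2 are assumptions chosen so that the flattened vector F has the 4 elements implied by W2 = (1,4):

import numpy as np

def conv2d_valid(X, W):
    # "Valid" 2D cross-correlation with a single filter W
    kh, kw = W.shape
    out_h, out_w = X.shape[0] - kh + 1, X.shape[1] - kw + 1
    Z = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            Z[i, j] = np.sum(X[i:i + kh, j:j + kw] * W)
    return Z

def maxpool2x2(A):
    # 2x2 max pooling with stride 2
    h, w = A.shape[0] // 2, A.shape[1] // 2
    P = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            P[i, j] = np.max(A[2 * i:2 * i + 2, 2 * j:2 * j + 2])
    return P

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X  = rng.standard_normal((6, 6))       # assumed input size: 6x6
W1 = rng.standard_normal((3, 3)) * 0.1
b1 = 0.0
W2 = rng.standard_normal((1, 4)) * 0.1
b2 = 0.0

Z1 = conv2d_valid(X, W1) + b1          # (4, 4)
A1 = np.maximum(0, Z1)                 # ReLU
P1 = maxpool2x2(A1)                    # (2, 2)
F  = P1.reshape(-1, 1)                 # (4, 1) column vector
Z2 = W2 @ F + b2                       # (1, 1)
A2 = sigmoid(Z2)                       # predicted probability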

Backward Propagation-

We have to apply the gradient descent algorithm to all trainable parameters until the loss is minimized (a code sketch of this update step follows the four rules below):

\(W_{1} = W_{1} - \eta\frac{\delta L}{\delta W_{1}}\)

\(b_{1} = b_{1} - \eta\frac{\delta L}{\delta b_{1}}\)

\(W_{2} = W_{2} - \eta\frac{\delta L}{\delta W_{2}}\)

\(b_{2} = b_{2} - \eta\frac{\delta L}{\delta b_{2}}\)
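
In code, every update is the same one-line rule; a sketch, assuming the gradients dW1, db1, dW2, db2 have already been computed as derived in the next section and eta is the learning rate:

eta = 0.01                 # learning rate (assumed value)
W1 = W1 - eta * dW1
b1 = b1 - eta * db1
W2 = W2 - eta * dW2
b2 = b2 - eta * db2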

 

Finding the derivative-

\(\frac{\delta L}{\delta W_{2}} = \frac{\delta L}{\delta A_2} \times \frac{\delta A_2}{\delta Z_{2}}\times \frac{\delta Z_{2}}{\delta W_{2}}\) 

\(\frac{\delta L}{\delta b_{2}} = \frac{\delta L}{\delta A_2} \times \frac{\delta A_2}{\delta Z_{2}}\times \frac{\delta Z_{2}}{\delta b_{2}}\)

\(\frac{\delta L}{\delta W_{1}} = \frac{\delta L}{\delta A_2} \times \frac{\delta A_2}{\delta Z_{2}}\times \frac{\delta Z_{2}}{\delta F}\times \frac{\delta F}{\delta P_1}\times \frac{\delta P_1}{\delta A_1}\times \frac{\delta A_1}{\delta Z_1}\times \frac{\delta Z_1}{\delta W_1}\) 

\(\frac{\delta L}{\delta b_{1}} = \frac{\delta L}{\delta A_2} \times \frac{\delta A_2}{\delta Z_{2}}\times \frac{\delta Z_{2}}{\delta F}\times \frac{\delta F}{\delta P_1}\times \frac{\delta P_1}{\delta A_1}\times \frac{\delta A_1}{\delta Z_1}\times \frac{\delta Z_1}{\delta b_1}\) 

 

For a single image, denote the output matrix \(A_2\) (here a single value) by the scalar \(a_2\):

\(\frac{\delta L}{\delta a_{2}} = \frac{\delta}{\delta a_2}\left[-Y_i \log(a_2) - (1 - Y_i)\log(1 - a_2)\right]\)

       \(= -\frac{Y_i}{a_{2}} + \frac{(1 - Y_i)}{(1 - a_2)} = \frac{-Y_i(1 - a_2) + a_2(1 - Y_i)}{a_2(1 - a_2)} = \frac{-Y_i + Y_i a_2 + a_2 - a_2 Y_i}{a_2(1 - a_2)}\)

       \(= \frac{(a_2 - Y_i)}{a_2(1 - a_2)}\)

\(\frac{\delta A_2}{\delta Z_{2}} \) =\(\sigma(Z_2)[1 - \sigma(Z_2)]\) = a2[1 - a2]
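
This is the standard sigmoid-derivative identity; the short derivation is:

\(\sigma(Z_2) = \frac{1}{1 + e^{-Z_2}}, \qquad \frac{\delta \sigma}{\delta Z_2} = \frac{e^{-Z_2}}{(1 + e^{-Z_2})^2} = \frac{1}{1 + e^{-Z_2}}\cdot\frac{e^{-Z_2}}{1 + e^{-Z_2}} = \sigma(Z_2)\left[1 - \sigma(Z_2)\right]\)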

\(\frac{\delta Z_2}{\delta W_{2}} \) = F   

\(\frac{\delta Z_2}{\delta b_{2}} \) = 1

\(\frac{\delta L}{\delta W_{2}} \) = \(\frac{(a_2 - Y_i)}{a_2(1-a_2)}\times a_2(1 - a_2) \times F\) = \((a_2 -Y_i)F^T\) = \((A_2 -Y)F^T\)

 \(\frac{\delta L}{\delta b_{2}} \)= \(\frac{(a_2 - Y_i)}{a_2(1-a_2)}\times a_2(1 - a_2) \times 1\) = \((a_2 -Y_i)\) = \((A_2 -Y)\)

 \(\frac{\delta Z_2}{\delta F} \) = W2

\(\frac{\delta F}{\delta P_{1}}\) = reshape to P1.shape (flatten only rearranges elements, so the incoming gradient is simply reshaped back to the shape of \(P_1\))

\(\frac{\delta L}{\delta A_{1mn}}= \begin{cases} \frac{\delta L}{\delta P_{1xy}} & \quad \text{,if } A_{1mn} \text{ is the max element of its pooling window}\\ 0 & \quad \text{,otherwise} \end{cases} \)

 \(\frac{\delta A_1}{\delta Z_1}= \begin{cases} 1 & \quad \text{,if } Z_{1xy} \text{ > 0}\\ 0 & \quad \text{,if } Z_{1xy} \text{ < 0}\\ \end{cases} \)
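
Putting the derived gradients together for the same toy network as in the forward-pass sketch above (Y = 1.0 is an assumed example label; conv2d_valid, X, Z1, A1, P1, F, W2 and A2 come from that sketch):

Y = 1.0                              # assumed ground-truth label for this image

# Output layer: dL/dZ2 = A2 - Y (derived above)
dZ2 = A2 - Y                         # (1, 1)
dW2 = dZ2 @ F.T                      # (1, 4)  = (A2 - Y) F^T
db2 = dZ2.item()

# Dense layer and flatten (flatten is only a reshape)
dF  = W2.T @ dZ2                     # (4, 1)
dP1 = dF.reshape(P1.shape)           # (2, 2)

# Max pooling: the gradient goes only to the max element of each 2x2 window
dA1 = np.zeros_like(A1)
for i in range(P1.shape[0]):
    for j in range(P1.shape[1]):
        window = A1[2 * i:2 * i + 2, 2 * j:2 * j + 2]
        m, n = np.unravel_index(np.argmax(window), window.shape)
        dA1[2 * i + m, 2 * j + n] = dP1[i, j]

# ReLU: pass the gradient only where Z1 > 0
dZ1 = dA1 * (Z1 > 0)

# Convolution: dL/dW1 is the valid cross-correlation of X with dZ1
dW1 = conv2d_valid(X, dZ1)           # (3, 3)
db1 = np.sum(dZ1)

These dW1, db1, dW2, db2 are exactly the quantities plugged into the gradient descent update rules of the previous section.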

 
