Deep Learning - RNN - Recurrent Neural Network - Backpropagation in RNN Tutorial

Take the example of a many-to-one RNN, i.e. sentiment analysis.

Review Input      Sentiment
movie was good    1
movie was bad     0
movie not good    0

Number of unique words = 5, i.e. 'movie', 'was', 'good', 'bad', and 'not'.

movie   [1, 0, 0, 0, 0]
was     [0, 1, 0, 0, 0]
good    [0, 0, 1, 0, 0]
bad     [0, 0, 0, 1, 0]
not     [0, 0, 0, 0, 1]

 

Converting each review into a sequence of one-hot vectors:

Review Input                                        Sentiment
[1, 0, 0, 0, 0] [0, 1, 0, 0, 0] [0, 0, 1, 0, 0]     1
[1, 0, 0, 0, 0] [0, 1, 0, 0, 0] [0, 0, 0, 1, 0]     0
[1, 0, 0, 0, 0] [0, 0, 0, 0, 1] [0, 0, 1, 0, 0]     0
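As a quick sketch, this encoding can be reproduced in Python with NumPy (the variable names here are illustrative, not from the tutorial):

import numpy as np

vocab = ['movie', 'was', 'good', 'bad', 'not']
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # 5-dimensional vector with a single 1 at the word's index
    vec = np.zeros(len(vocab))
    vec[word_to_index[word]] = 1.0
    return vec

reviews = [('movie was good', 1), ('movie was bad', 0), ('movie not good', 0)]

# Each review becomes a (timesteps x vocab_size) array of one-hot vectors.
X = [np.array([one_hot(w) for w in text.split()]) for text, _ in reviews]
Y = np.array([label for _, label in reviews])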

For the first review, let \(O_0\) be the initial hidden state (a zero vector) and \(X_{1t}\) the one-hot vector of its t-th word. The forward pass is:

\(O_1 = f(X_{11}W_i + O_0W_h)\)

\(O_2 = f(X_{12}W_i + O_1W_h)\)

\(O_3 = f(X_{13}W_i + O_2W_h)\)

\(Y' = \sigma(O_3W_o)\)
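A minimal NumPy sketch of this forward pass. The tutorial leaves the hidden activation f unspecified, so tanh is assumed here; the hidden size and random initialization are likewise illustrative:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
vocab_size, hidden_size = 5, 4          # 5 unique words; hidden size is arbitrary

Wi = rng.normal(scale=0.1, size=(vocab_size, hidden_size))   # input-to-hidden
Wh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden
Wo = rng.normal(scale=0.1, size=(hidden_size, 1))            # hidden-to-output

def forward(x_seq):
    # O_t = f(X_t Wi + O_{t-1} Wh), with O_0 = 0 and f = tanh (assumed)
    O = np.zeros(hidden_size)
    for x_t in x_seq:
        O = np.tanh(x_t @ Wi + O @ Wh)
    # Y' = sigmoid(O_n Wo), the predicted sentiment probability
    return sigmoid(O @ Wo)

Feeding the three one-hot vectors of "movie was good" (X[0] from the encoding sketch above) through forward returns \(Y'\), the predicted probability that the review is positive.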

\(L = -Y_i \log Y'_i - (1 - Y_i)\log(1 - Y'_i)\)
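This is the binary cross-entropy loss; as a short sketch in the same NumPy setting:

import numpy as np

def bce_loss(y_true, y_hat):
    # L = -y log y' - (1 - y) log(1 - y')
    return -(y_true * np.log(y_hat) + (1 - y_true) * np.log(1 - y_hat))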

After calculating the loss, we minimize it using gradient descent.

For that, we need to find the values of \(W_i\), \(W_h\), and \(W_o\) at which \(L\) is minimized, by repeatedly applying the update rules:

 \(W_{i} = W_{i} - \eta\frac{\delta L}{\delta W_{i}}\)

\(W_{h} = W_{h} - \eta\frac{\delta L}{\delta W_{h}}\)

\(W_{o} = W_{o} - \eta\frac{\delta L}{\delta W_{o}}\)
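A sketch of one update step, assuming the gradients have already been computed by backpropagation through time (derived next); the function name and learning rate value are illustrative:

def sgd_step(params, grads, eta=0.1):
    # W <- W - eta * dL/dW, applied in place to each weight matrix
    for W, dW in zip(params, grads):
        W -= eta * dW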

 

Since \(W_o\) affects the loss only through \(Y'\), the chain rule gives:

\(\frac{\delta L}{\delta W_{o}} = \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta W_{o}}\)

 

Since \(W_i\) is applied at every timestep, its gradient collects one chain of derivatives per timestep:

\(\frac{\delta L}{\delta W_{i}} = \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}} \frac{\delta O_{3}}{\delta W_{i}} + \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}} \frac{\delta O_{3}}{\delta O_{2}}\frac{\delta O_{2}}{\delta W_{i}} + \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}} \frac{\delta O_{3}}{\delta O_{2}}\frac{\delta O_{2}}{\delta O_{1}}\frac{\delta O_{1}}{\delta W_{i}}\)

Summarizing the above as a sum over j = 1 to 3:

\(\frac{\delta L}{\delta W_{i}} = \displaystyle\sum_{j=1}^{3} \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{j}}\frac{\delta O_{j}}{\delta W_{i}}\) 

For j = 1, the term is \(\frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{1}}\frac{\delta O_{1}}{\delta W_{i}} = \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}}\frac{\delta O_{3}}{\delta O_{2}}\frac{\delta O_{2}}{\delta O_{1}}\frac{\delta O_{1}}{\delta W_{i}}\)

For j = 2, the term is \(\frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{2}}\frac{\delta O_{2}}{\delta W_{i}} = \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}}\frac{\delta O_{3}}{\delta O_{2}}\frac{\delta O_{2}}{\delta W_{i}}\)

For j = 3, the term is \(\frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}}\frac{\delta O_{3}}{\delta W_{i}}\), the chain ending at \(O_3\) directly.

In general, for a sequence of n timesteps:

\(\frac{\delta L}{\delta W_{i}} = \displaystyle\sum_{j=1}^{n} \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{j}}\frac{\delta O_{j}}{\delta W_{i}}\) 

 

The same per-timestep expansion applies to \(W_h\), which also feeds every timestep:

\(\frac{\delta L}{\delta W_{h}} = \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}} \frac{\delta O_{3}}{\delta W_{h}} + \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}} \frac{\delta O_{3}}{\delta O_{2}}\frac{\delta O_{2}}{\delta W_{h}} + \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{3}} \frac{\delta O_{3}}{\delta O_{2}}\frac{\delta O_{2}}{\delta O_{1}}\frac{\delta O_{1}}{\delta W_{h}}\)

Similarly, we get \(\frac{\delta L}{\delta W_{h}} = \displaystyle\sum_{j=1}^{n} \frac{\delta L}{\delta Y'} \frac{\delta Y'}{\delta O_{j}}\frac{\delta O_{j}}{\delta W_{h}}\)
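Putting the three gradients together, here is a minimal NumPy sketch of backpropagation through time for this network, again assuming f = tanh. The backward loop accumulates exactly the per-timestep terms of the sums above, carrying \(\frac{\delta L}{\delta O_{j}}\) one step back at a time:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bptt(x_seq, y_true, Wi, Wh, Wo):
    # Forward pass, caching every hidden state O_t (O_0 is the zero vector).
    H = Wh.shape[0]
    Os = [np.zeros(H)]
    for x_t in x_seq:
        Os.append(np.tanh(x_t @ Wi + Os[-1] @ Wh))
    y_hat = sigmoid(Os[-1] @ Wo)

    # For a sigmoid output with binary cross-entropy,
    # dL/d(O_n Wo) simplifies to (Y' - Y).
    dz_out = y_hat - y_true
    dWo = np.outer(Os[-1], dz_out)
    dO = dz_out @ Wo.T                      # dL/dO_n

    # Walk backwards through time, accumulating the per-timestep terms
    # of dL/dWi = sum_j (...) and dL/dWh = sum_j (...).
    dWi, dWh = np.zeros_like(Wi), np.zeros_like(Wh)
    for t in range(len(x_seq), 0, -1):
        dz = dO * (1.0 - Os[t] ** 2)        # tanh'(z) = 1 - tanh(z)^2
        dWi += np.outer(x_seq[t - 1], dz)
        dWh += np.outer(Os[t - 1], dz)
        dO = dz @ Wh.T                      # propagate dL/dO_{t-1}
    return dWi, dWh, dWo

Note that the repeated \(\frac{\delta O_{j}}{\delta O_{j-1}}\) factors in the sums appear in the code as the line dO = dz @ Wh.T, applied once per backward step; the returned gradients are then fed to the gradient-descent update above.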
