Neural Network Notations

Dimensions

Dimension                               Variable
# Samples                               m
# Layers (excluding input)              L
# Units in Input Layer                  n^[0]
# Units in Hidden Layer l               n^[l]
# Units in Output Layer / # Classes     n^[L]

Constants

Constant                 Variable
Learning Rate            α
Regularization Factor    λ

Matrices

Notation               Equation                                          Dimensions        Layers
Input  X               (given)                                           n^[0] × m         (global)
Output Y               (given)                                           n^[L] × m         (global)

Feedforward
Weight W               (given / calculated)                              n^[l] × n^[l-1]   1 … L
Bias   b               (given / calculated)                              n^[l] × 1         1 … L
Input  A^[0]           A^[0] = X                                         n^[0] × m         0
Weighted Input Z       Z = W A^[l-1] + b                                 n^[l] × m         1 … L
Activation A           A = g(Z)                                          n^[l] × m         1 … L
Predicted Output Ŷ     Ŷ = A^[L]                                         n^[L] × m         L

Backpropagation
Loss Function L        CE:  L = −(Y ⊙ log Ŷ + (1 − Y) ⊙ log (1 − Ŷ))     n^[L] × m         L
(CE or MSE)            MSE: L = (Ŷ − Y)^⊙2
Cost Function J        J = (1/m) Σ L                                     (scalar)          (global)

Optimization
Output Error dZ^[L]    dZ^[L] = Ŷ − Y   (CE + sigmoid/softmax)           n^[L] × m         L
Hidden Error dZ        dZ = (W^[l+1])^T dZ^[l+1] ⊙ g′(Z)                 n^[l] × m         1 … L−1
Weight Update          W := W − (α/m) (dZ (A^[l-1])^T + λ W)             n^[l] × n^[l-1]   1 … L
(Gradient Descent)
Bias Update            b := b − (α/m) Σ dZ                               n^[l] × 1         1 … L
(Gradient Descent)
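The dimensions in the table above can be checked with a small NumPy sketch of one feedforward pass; the sizes (m = 4 samples, layer widths n = [3, 5, 2]) and the tanh/sigmoid choices are arbitrary illustrations, not part of the notation itself:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4                                 # samples
n = [3, 5, 2]                         # units per layer: n^[0], n^[1], n^[2]  (L = 2)

X = rng.normal(size=(n[0], m))        # Input,  n^[0] x m
W1 = rng.normal(size=(n[1], n[0]))    # Weight, n^[1] x n^[0]
b1 = np.zeros((n[1], 1))              # Bias,   n^[1] x 1
Z1 = W1 @ X + b1                      # Weighted input, n^[1] x m (bias broadcasts over samples)
A1 = np.tanh(Z1)                      # Activation,     n^[1] x m

W2 = rng.normal(size=(n[2], n[1]))
b2 = np.zeros((n[2], 1))
Z2 = W2 @ A1 + b2
Yhat = 1 / (1 + np.exp(-Z2))          # Predicted output, n^[2] x m

assert Z1.shape == (n[1], m) and A1.shape == (n[1], m)
assert Yhat.shape == (n[2], m)
```

Note that each column of X is one sample, so the bias column vector broadcasts across all m samples in `W1 @ X + b1`.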

Details

Functions and Partial Derivatives

g(z) = 1 / (1 + e^(−z))               (sigmoid; other activations analogous)
g′(z) = g(z) (1 − g(z))

∂L/∂Ŷ = −Y/Ŷ + (1 − Y)/(1 − Ŷ)        (CE)
∂A/∂Z = g′(Z)
∂Z/∂W = A^[l-1]     ∂Z/∂b = 1     ∂Z/∂A^[l-1] = W
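As a sketch, the sigmoid activation and its derivative can be written out and checked against a central finite-difference estimate (function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z)), applied element-wise
    return 1 / (1 + np.exp(-z))

def sigmoid_prime(z):
    # g'(z) = g(z) * (1 - g(z))
    s = sigmoid(z)
    return s * (1 - s)

# Sanity check: analytic derivative vs. (g(z+h) - g(z-h)) / 2h
z = np.array([-2.0, 0.0, 1.5])
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
assert np.allclose(sigmoid_prime(z), numeric, atol=1e-9)
```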

Chain Rule

dZ = dA ⊙ g′(Z)                  (∂J/∂Z = ∂J/∂A ⊙ ∂A/∂Z)
dW = (1/m) dZ (A^[l-1])^T        (∂J/∂W = ∂J/∂Z · ∂Z/∂W)
db = (1/m) Σ dZ                  (∂J/∂b = ∂J/∂Z · ∂Z/∂b)
dA^[l-1] = W^T dZ                (∂J/∂A^[l-1] = ∂Z/∂A^[l-1] · ∂J/∂Z)
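The chain-rule steps for one layer can be sketched in NumPy; tanh is assumed as the activation here, and the function name and argument order are illustrative:

```python
import numpy as np

def backprop_layer(dA, Z, A_prev, W, m):
    # dZ = dA (element-wise *) g'(Z); with tanh, g'(Z) = 1 - tanh(Z)^2
    dZ = dA * (1 - np.tanh(Z) ** 2)
    # dW = (1/m) dZ (A^[l-1])^T
    dW = dZ @ A_prev.T / m
    # db = (1/m) * sum of dZ over the sample (column) axis
    db = dZ.sum(axis=1, keepdims=True) / m
    # dA^[l-1] = W^T dZ, the error propagated to the previous layer
    dA_prev = W.T @ dZ
    return dZ, dW, db, dA_prev
```

Each returned gradient has the same shape as the matrix it differentiates with respect to, which is what makes the update rules below well-defined.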

Weight / Bias Update (Gradient Descent)

W := W − α (dW + (λ/m) W)
b := b − α db
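A minimal sketch of the update step, assuming the gradients dW and db have already been computed; the function name is illustrative, and λ = 0 recovers the unregularized variation:

```python
import numpy as np

def gradient_descent_step(W, b, dW, db, alpha, lam, m):
    # W := W - alpha * (dW + (lam/m) * W)   (lam/m * W is the regularization term)
    W = W - alpha * (dW + (lam / m) * W)
    # b := b - alpha * db                   (bias is conventionally not regularized)
    b = b - alpha * db
    return W, b
```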

Examples

 
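A minimal worked example in NumPy, assuming a single sigmoid output unit (L = 1) trained with cross-entropy on a toy separable dataset; all sizes, seeds, and constants here are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 8
X = rng.normal(size=(2, m))                    # n^[0] = 2 features, m samples
Y = (X[0:1, :] + X[1:2, :] > 0).astype(float)  # 1 x m labels

W = np.zeros((1, 2))                           # n^[1] = 1 output unit
b = np.zeros((1, 1))
alpha = 0.2

def cost(Yhat):
    # J = (1/m) * sum of the cross-entropy loss matrix
    return -np.sum(Y * np.log(Yhat) + (1 - Y) * np.log(1 - Yhat)) / m

costs = []
for _ in range(300):
    Z = W @ X + b
    A = 1 / (1 + np.exp(-Z))                   # Yhat
    costs.append(cost(A))
    dZ = A - Y                                 # output error (CE + sigmoid)
    W -= alpha * dZ @ X.T / m                  # weight update
    b -= alpha * dZ.sum(axis=1, keepdims=True) / m  # bias update

assert costs[-1] < costs[0]                    # cost decreases with training
```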

Remarks

  • A^[l-1] is the matrix of the previous layer, A^[l+1] is that of the next layer; otherwise W, b, Z, A, dZ implicitly refer to the current layer l
  • g is the activation function (e.g. sigmoid, tanh, ReLU)
  • ⊙ is the element-wise product
  • ^⊙2 is the element-wise power
  • Σ is the matrix's sum of elements (summed over samples for the bias update)
  • d (as in dZ, dW, db) is the matrix derivative of the cost, e.g. dZ = ∂J/∂Z
  • Variations:
    1. All matrices transposed, matrix multiplications in reverse order (row vectors instead of column vectors)
    2. W and b combined into one parameter matrix θ
    3. No λ (regularization) term in the weight update
