Let $\textbf{W}_{ij}$ denote the weight between node $i$ at layer $k$ and node $j$ at layer $(k-1)$ of a given multilayer perceptron. The weight update rule under gradient descent is given by
- $\textbf{W}_{ij}(t+1) = \textbf{W}_{ij}(t)+ \alpha \dfrac{\partial \textbf{E}}{\partial \textbf{W}_{ij}}, 0 \leq \alpha \leq 1$
- $\textbf{W}_{ij}(t+1) = \textbf{W}_{ij}(t)- \alpha \dfrac{\partial \textbf{E}}{\partial \textbf{W}_{ij}}, 0 \leq \alpha \leq 1$
- $\textbf{W}_{ij}(t+1) = \alpha \dfrac{\partial \textbf{E}}{\partial \textbf{W}_{ij}}, 0 \leq \alpha \leq 1$
- $\textbf{W}_{ij}(t+1) = - \alpha \dfrac{\partial \textbf{E}}{\partial \textbf{W}_{ij}}, 0 \leq \alpha \leq 1$
where $\alpha$ and $\textbf{E}$ denote the learning rate and the error in the output, respectively.
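The update rule being asked about can be sketched numerically. The snippet below is a minimal illustration, not tied to any particular network: `gradient_descent_step` is a hypothetical helper that applies one gradient-descent update to a weight matrix, moving it against the gradient of the error so the error decreases. Minimizing $E(\textbf{W}) = \tfrac{1}{2}\lVert\textbf{W}\rVert^2$ (whose gradient is $\textbf{W}$ itself) is used here only as a toy example.

```python
import numpy as np

def gradient_descent_step(W, grad_E, alpha=0.1):
    """One gradient-descent update: step W against the error gradient.

    W      : current weight matrix W(t)
    grad_E : gradient of the error E with respect to W
    alpha  : learning rate, 0 <= alpha <= 1
    """
    return W - alpha * grad_E

# Toy example: E(W) = ||W||^2 / 2, so grad_E = W.
W = np.array([[1.0, -2.0], [0.5, 3.0]])
E_before = 0.5 * np.sum(W ** 2)
for _ in range(100):
    W = gradient_descent_step(W, grad_E=W, alpha=0.1)
E_after = 0.5 * np.sum(W ** 2)
```

Repeated steps shrink the error toward its minimum; with a sign flip (adding the gradient instead of subtracting it), the error would grow instead.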