4. Neural Network ... revisited
This chapter covers the advanced neural network concepts required to understand the Transformer model.
Chapter Contents
4.1. Dense Layer
4.2. Softmax Activation Function
4.3. Optimization
4.4. Exploding and Vanishing Gradient Problems