1. The backpropagation chain is broken: some variables may have been converted to numpy arrays. They can still take part in calculations, but they lose the ability to propagate gradients, so gradients cannot flow back to the variables that depend on them (see the first sketch after this list).
2. The learning rate is set unreasonably. If it is too large, the loss easily becomes NaN and the model fails to converge; if it is too small, the model learns very slowly (see the second sketch below).
3. The parameters of the network layers are poorly initialized; parameter initialization affects how quickly the model trains (see the third sketch below).
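A minimal sketch of point 1, assuming PyTorch (where the numpy round-trip problem is typical); the tensor names are hypothetical:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2

# BROKEN: round-tripping through numpy detaches the value from the graph.
z_np = y.detach().numpy()      # .numpy() requires detaching first
z = torch.from_numpy(z_np)     # z has no grad_fn; the gradient chain stops here
loss_broken = z.sum()
# loss_broken.backward() would raise, and x.grad would never be populated.

# OK: keep every step as a torch operation so autograd can track it.
loss = (y * 1.0).sum()
loss.backward()
print(x.grad)                  # tensor([2., 2., 2.])
```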
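For point 2, a hedged sketch of choosing and decaying the learning rate, again assuming PyTorch; the model and objective here are stand-ins:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)

# Too large a value (e.g. lr=10) often drives the loss to NaN; too small
# (e.g. lr=1e-7) makes progress imperceptibly slow. A moderate value with
# a decay schedule is a common starting point.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 10)).pow(2).mean()  # dummy objective
    loss.backward()
    optimizer.step()
    scheduler.step()  # shrink the learning rate every 30 epochs
```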
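For point 3, a minimal sketch of explicit parameter initialization, assuming PyTorch; the layer sizes are arbitrary. Schemes like Kaiming or Xavier initialization keep activation scales stable, which usually speeds up convergence compared with poorly scaled random weights:

```python
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))

def init_weights(m):
    # Kaiming init suits ReLU layers; zero biases are a common default.
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        nn.init.zeros_(m.bias)

net.apply(init_weights)  # applies the initializer to every submodule
```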