diff --git a/cn-Book/附录D.给训练循环添加高级技巧.md b/cn-Book/附录D.给训练循环添加高级技巧.md
index d5354fb..ccb8264 100644
--- a/cn-Book/附录D.给训练循环添加高级技巧.md
+++ b/cn-Book/附录D.给训练循环添加高级技巧.md
@@ -194,7 +194,11 @@ $$|v|_{2}=\sqrt{v_{1}^{2}+v_{2}^{2}+\ldots+v_{n}^{2}}$$
 
 This calculation also applies to matrices. For example, consider the following gradient matrix:
 
-$$|v|_{2}=\sqrt{v_{1}^{2}+v_{2}^{2}+\ldots+v_{n}^{2}}G=\left[\begin{array}{ll}
+$$G=\left[\begin{array}{ll}
 1 & 2 \\
 2 & 4
-\end{array}\right]$$
\ No newline at end of file
+\end{array}\right]$$
+
+To clip these gradients to a maximum norm of 1, we first compute their L2 norm:
+
+$$|G|_{2}=\sqrt{1^{2}+2^{2}+2^{2}+4^{2}}=\sqrt{25}=5$$
\ No newline at end of file
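
The arithmetic in the added lines can be checked with a minimal sketch in plain Python. The helper names `l2_norm` and `clip_by_norm` are hypothetical and do not come from the book. The rescaling step is an assumption: it follows the usual norm-clipping convention of multiplying every entry by `max_norm / norm` when the norm exceeds the limit, which the changed text has not yet stated.

```python
import math


def l2_norm(matrix):
    """L2 norm over all entries: square root of the sum of squares."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def clip_by_norm(matrix, max_norm):
    """Return a copy rescaled so that its L2 norm is at most max_norm.

    Assumption: standard norm clipping, i.e. scale by max_norm / norm
    only when the norm exceeds max_norm.
    """
    norm = l2_norm(matrix)
    if norm <= max_norm:
        return [row[:] for row in matrix]
    scale = max_norm / norm
    return [[x * scale for x in row] for row in matrix]


# Gradient matrix from the example
G = [[1.0, 2.0], [2.0, 4.0]]

norm_G = l2_norm(G)                      # sqrt(1 + 4 + 4 + 16) = sqrt(25)
clipped = clip_by_norm(G, max_norm=1.0)  # every entry scaled by 1/5
clipped_norm = l2_norm(clipped)

print(norm_G)
print(clipped)
```

In a PyTorch training loop, the same operation is normally performed on the model's parameters with `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)`; the sketch above only mirrors the hand calculation.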