• $W=\left[w_{1}, w_{2}, \dots, w_{m}\right]$
• $X=\left[x^{(1)}, x^{(2)}, \dots, x^{(m)}\right]$
• $B=\left[b, b, \dots, b\right]$
• $Z=\left[z^{(1)}, z^{(2)}, \dots, z^{(m)}\right]$
• $A=\left[a^{(1)}, a^{(2)}, \dots, a^{(m)}\right]$
• $Y=\left[y^{(1)}, y^{(2)}, \dots, y^{(m)}\right]$
• $dW=\left[dw_{1}, dw_{2}, \dots, dw_{m}\right]^T$
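
The stacked quantities above can be sketched in NumPy as a single forward pass over all $m$ examples. This is a minimal sketch under stated assumptions, not the notes' own implementation: the activation is assumed to be the sigmoid, `n` is a hypothetical name for the number of weights (rows of $X$), `W` is stored as a $1 \times n$ row vector, and $B$ is produced by broadcasting the scalar `b` rather than by building the repeated row explicitly.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function (assumed activation)."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(W, X, b):
    """Compute Z = W X + B and A = sigmoid(Z) for all m examples at once.

    W: (1, n) row vector of weights
    X: (n, m) matrix whose columns are the examples x^(i)
    b: scalar bias; broadcasting plays the role of B = [b, b, ..., b]
    """
    Z = W @ X + b          # shape (1, m): one z^(i) per column
    A = sigmoid(Z)         # shape (1, m): one a^(i) per column
    return Z, A


# Illustrative sizes and random data (hypothetical values).
n, m = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(1, n))
X = rng.normal(size=(n, m))
b = 0.5

Z, A = forward(W, X, b)
```

Each column of `Z` and `A` corresponds to one example, so no explicit loop over $i$ is needed.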

# Backpropagation

$\frac{d J}{d b}=\frac{1}{m} \sum_{i=1}^{m} \frac{d L^{(i)}}{d a^{(i)}} \frac{d a^{(i)}}{d z^{(i)}} \frac{d z^{(i)}}{d b} = \frac{1}{m} \sum_{i=1}^{m}\left(a^{(i)}-y^{(i)}\right)$
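
As a sanity check on the formula above, the following sketch compares the analytic bias gradient $\frac{1}{m}\sum_i (a^{(i)}-y^{(i)})$ with a central finite-difference estimate of $\frac{dJ}{db}$. It assumes a sigmoid activation and the binary cross-entropy loss (the combination for which $\frac{dL}{da}\frac{da}{dz} = a - y$ holds); the function names, sizes, and random data are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function (assumed activation)."""
    return 1.0 / (1.0 + np.exp(-z))


def cost(W, X, b, Y):
    """Mean binary cross-entropy J over the m examples (assumed loss)."""
    A = sigmoid(W @ X + b)
    return float(-np.mean(Y * np.log(A) + (1.0 - Y) * np.log(1.0 - A)))


def grad_b(W, X, b, Y):
    """Analytic gradient dJ/db = (1/m) * sum_i (a^(i) - y^(i))."""
    A = sigmoid(W @ X + b)
    return float(np.mean(A - Y))


# Illustrative data: n weights, m examples, binary labels.
n, m = 3, 5
rng = np.random.default_rng(0)
W = rng.normal(size=(1, n))
X = rng.normal(size=(n, m))
Y = rng.integers(0, 2, size=(1, m)).astype(float)
b = 0.1

# Central finite difference in b as an independent estimate of dJ/db.
eps = 1e-6
numeric = (cost(W, X, b + eps, Y) - cost(W, X, b - eps, Y)) / (2 * eps)
analytic = grad_b(W, X, b, Y)

print(f"analytic dJ/db = {analytic:.8f}, numeric dJ/db = {numeric:.8f}")
```

The two values should agree to several decimal places; a large gap would suggest a mismatch between the loss and the gradient formula.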

Reference:

Neural Networks and Deep Learning