Sigmoid function and softplus function
\[\sigma(x) = \frac{1}{1+e^{-x}}, \qquad \zeta(x) = \log(1+e^x)\]
\[\sigma'(x) = \sigma(x)(1-\sigma(x))\]
\[1-\sigma(x) = \sigma(-x)\]
\[\log\sigma(x) = -\zeta(-x)\]
\[\tanh(x) = 2\sigma(2x)-1\]
\[\zeta'(x) = \sigma(x)\]
\[\forall x\in(0,1), \sigma^{-1}(x) = \log\frac{x}{1-x}\]
\[\forall x>0, \zeta^{-1}(x) = \log(e^x - 1)\]
\[\zeta(x)-\zeta(-x) = x\]
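The identities above can be spot-checked numerically. The following is a minimal sketch, assuming the usual definitions \(\sigma(x) = 1/(1+e^{-x})\) and \(\zeta(x) = \log(1+e^x)\); the function names and the step size `h` are illustrative choices, and the two derivative identities are only approximated by central differences.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softplus(x: float) -> float:
    """Softplus log(1 + exp(x)), written to avoid overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def max_identity_error(xs, h: float = 1e-6) -> float:
    """Largest absolute residual of the listed identities over the points xs."""
    worst = 0.0
    for x in xs:
        s = sigmoid(x)
        residuals = [
            # sigma'(x) = sigma(x) (1 - sigma(x)), derivative by central difference
            (sigmoid(x + h) - sigmoid(x - h)) / (2 * h) - s * (1 - s),
            # 1 - sigma(x) = sigma(-x)
            (1 - s) - sigmoid(-x),
            # log sigma(x) = -zeta(-x)
            math.log(s) + softplus(-x),
            # tanh(x) = 2 sigma(2x) - 1
            math.tanh(x) - (2 * sigmoid(2 * x) - 1),
            # zeta'(x) = sigma(x), derivative by central difference
            (softplus(x + h) - softplus(x - h)) / (2 * h) - s,
            # sigma^{-1}(sigma(x)) = x, using the logit formula
            math.log(s / (1 - s)) - x,
            # zeta^{-1}(zeta(x)) = x, using log(e^y - 1)
            math.log(math.expm1(softplus(x))) - x,
            # zeta(x) - zeta(-x) = x
            softplus(x) - softplus(-x) - x,
        ]
        worst = max(worst, max(abs(r) for r in residuals))
    return worst


if __name__ == "__main__":
    print(max_identity_error([-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]))
```

The printed residual should be small, limited mainly by the finite-difference step used for the two derivative identities.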
Derivatives of traces and determinants
\[\nabla_A \mathrm{tr}(AB) = B^T\]
\[\nabla_{A^T} f(A) = (\nabla_A f(A))^T\]
\[\nabla_A \mathrm{tr}(ABA^TC) = CAB + C^TAB^T\]
\[\nabla_A|A| = |A|\left(A^{-1}\right)^T\]
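One way to sanity-check the matrix gradients above is to compare them with finite differences. The sketch below uses plain Python lists and 2×2 matrices; the matrices `A`, `B`, `C` and all helper names are hypothetical values chosen for illustration, and \(|A|\) is read as the determinant.

```python
def matmul(X, Y):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]


def inverse2(X):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    d = det2(X)
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]


def numerical_gradient(f, A, h=1e-6):
    """Central-difference estimate of the matrix of partials df/dA[i][j]."""
    grad = [[0.0] * len(A[0]) for _ in A]
    for i in range(len(A)):
        for j in range(len(A[0])):
            plus = [row[:] for row in A]
            minus = [row[:] for row in A]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def gradient_errors():
    """Return the max deviation between each formula and its numerical gradient."""
    # Illustrative test matrices, not taken from the source.
    A = [[1.0, 2.0], [3.0, 5.0]]
    B = [[0.5, -1.0], [2.0, 4.0]]
    C = [[1.0, 0.0], [-2.0, 3.0]]

    # grad_A tr(AB) = B^T
    err_tr_ab = max_abs_diff(
        numerical_gradient(lambda M: trace(matmul(M, B)), A), transpose(B))

    # grad_A tr(A B A^T C) = C A B + C^T A B^T
    quad = lambda M: trace(matmul(matmul(matmul(M, B), transpose(M)), C))
    expected = add(matmul(matmul(C, A), B),
                   matmul(matmul(transpose(C), A), transpose(B)))
    err_quad = max_abs_diff(numerical_gradient(quad, A), expected)

    # grad_A |A| = |A| (A^{-1})^T
    d = det2(A)
    expected_det = [[d * v for v in row] for row in transpose(inverse2(A))]
    err_det = max_abs_diff(numerical_gradient(det2, A), expected_det)

    return err_tr_ab, err_quad, err_det


if __name__ == "__main__":
    print(gradient_errors())
```

All three functions are polynomials of degree at most two in the entries of `A`, so the central difference has no truncation error here and the reported deviations come from floating-point rounding only.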
Chain rule of probability
\[p(\mathbf x) = \prod_{i=1}^n p(x_i \mid x_1, \cdots, x_{i-1})\]
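As a small illustration of the factorization above, the sketch below builds a toy joint distribution over three binary variables and checks that the product of conditionals recovers each joint probability. The weights and function names are hypothetical, chosen only for the example; each conditional is computed as a ratio of marginals.

```python
from itertools import product

# Hypothetical joint distribution over (x1, x2, x3), each in {0, 1}.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
total = sum(weights)
joint = {x: w / total for x, w in zip(product((0, 1), repeat=3), weights)}


def prefix_probability(prefix):
    """Marginal p(x_1, ..., x_k): sum the joint over the remaining variables."""
    k = len(prefix)
    return sum(p for x, p in joint.items() if x[:k] == prefix)


def chain_rule_product(x):
    """Product over i of p(x_i | x_1, ..., x_{i-1})."""
    result = 1.0
    for i in range(1, len(x) + 1):
        # p(x_i | x_1..x_{i-1}) = p(x_1..x_i) / p(x_1..x_{i-1})
        result *= prefix_probability(x[:i]) / prefix_probability(x[:i - 1])
    return result


if __name__ == "__main__":
    for x, p in joint.items():
        print(x, p, chain_rule_product(x))
```

For \(i = 1\) the conditioning set is empty, so the first factor is simply the marginal \(p(x_1)\).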
Others
\[\lim_{x\rightarrow0^+}x\log x = 0\]
\[\log_a n = \log_a b \cdot \log_b n\]
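A brief justification of the two identities above, as a sketch: the limit is taken from the right, where \(\log x\) is defined, and follows from L'Hôpital's rule; the change-of-base identity follows from writing \(n\) as a power of \(b\), assuming \(a, b > 0\) and \(a, b \neq 1\).
\[\lim_{x\rightarrow0^+}x\log x = \lim_{x\rightarrow0^+}\frac{\log x}{1/x} = \lim_{x\rightarrow0^+}\frac{1/x}{-1/x^2} = \lim_{x\rightarrow0^+}(-x) = 0\]
\[n = b^{\log_b n} \;\Rightarrow\; \log_a n = \log_a\!\left(b^{\log_b n}\right) = \log_b n \cdot \log_a b\]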