Softmax function and its derivative
This article is translated from "The Softmax function and its derivative", and covers the basic concepts.
The input of the Softmax function is an n-dimensional vector of arbitrary real values, and the output is another n-dimensional real vector whose entries all lie in the range (0,1) and sum to 1.0. That is, it is a mapping $S(\textbf{a}): \mathbb{R}^n \rightarrow \mathbb{R}^n$:
\begin{equation*}
S(\textbf{a}):
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
\rightarrow
\begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_n \end{bmatrix}
\end{equation*}
The formula for each of these elements is:
\begin{equation*}
S_j = \frac{e^{a_j}}{\sum_{k=1}^{n} e^{a_k}} \qquad \forall j \in 1 \ldots n
\end{equation*}
Obviously each $S_j$ is always positive (because of the exponential); and since all the $S_j$ sum to 1, each $S_j < 1$ as well.
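The formula above translates directly into code. Here is a minimal sketch, assuming NumPy; the subtraction of the maximum before exponentiating is a standard numerical-stability trick not present in the formula itself (it cancels in the ratio, so the result is unchanged, but it prevents overflow for large inputs):

```python
import numpy as np

def softmax(a):
    """Softmax of an n-dimensional real vector.

    Computes S_j = e^{a_j} / sum_k e^{a_k}. Subtracting max(a)
    first does not change the result but avoids overflow.
    """
    e = np.exp(a - np.max(a))
    return e / e.sum()

a = np.array([1.0, 2.0, 3.0])
s = softmax(a)
print(s)        # every entry lies in (0, 1)
print(s.sum())  # the entries sum to 1.0
```

Running this confirms the two properties just stated: every output entry is strictly between 0 and 1, and the entries sum to 1.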