
PRML Exercise 7.18

\[ \newcommand\comment[1]{\color{red}{{\Tiny\mbox{#1}}}} \newcommand\b[1]{\pmb{\mathrm{#1}}} \newcommand\T{\mathsf{T}} \newcommand\bx{\b{x}} \newcommand\by{\b{y}} \newcommand\bw{\b{w}} \newcommand\bt{\b{t}} \newcommand\balpha{\b{\alpha}} \newcommand\bphi{\b{\phi}} \newcommand\bA{\b{A}} \newcommand\bB{\b{B}} \newcommand\bPhi{\b{\Phi}} \newcommand\bPhiT{\bPhi^{\T}} \newcommand\bwT{\bw^{\T}} \newcommand\bxT{\bx^{\T}} \newcommand\bAT{\bA^{\T}} \newcommand\nablaw{\nabla_{\!\!\bw}} \newcommand\bphixn{\bphi(\bx_n)} \] We differentiate the log posterior \[ \leqalignno{ &\ln p(\bw|\bt,\balpha) = \sum_{n=1}^N\left\{ t_n \ln y_n +(1-t_n)\ln(1-y_n)\right\}-\frac{1}{2}\bwT\bA\bw + \mathrm{const} &(7.109) } \] with respect to \(\bw\). First, \[ \leqalignno{ \underset{\comment{※1}}{\nablaw y_n} = \nablaw \underbrace{\sigma(\bwT\bphixn)}_{(7.108)} = \underbrace{\sigma(\bwT\bphixn)\left\{1-\sigma(\bwT\bphixn)\right\}}_{(4.88)\ \sigma'=\sigma(1-\sigma)}\overbrace{\bphixn}^{\llap{\mbox{by (C.19), }}\rlap{\nablaw\,\bwT \bphixn=\bphixn }} = y_n(1-y_n)\bphixn } \]
\( \comment{※1}\ \nablaw y_n = \frac{\partial y_n}{\partial \bw}= \left( \matrix{\frac{\partial y_n}{\partial w_1} \\ \frac{\partial y_n}{\partial w_2} } \right) \) (denominator layout, written out for a two-component \(\bw\)).
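The identity \(\nabla_{\mathbf{w}} y_n = y_n(1-y_n)\boldsymbol{\phi}(\mathbf{x}_n)\) can be sanity-checked numerically. The following is a minimal sketch in plain Python. The function names and the sample values of `w` and `phi` are illustrative assumptions, not part of the exercise. It compares the closed form against a central-difference approximation.

```python
import math


def sigmoid(a):
    """Logistic sigmoid, sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def y(w, phi):
    """Model output y_n = sigma(w^T phi(x_n)), as in (7.108)."""
    return sigmoid(sum(w_i * p_i for w_i, p_i in zip(w, phi)))


def grad_y(w, phi):
    """Closed form derived above: y_n (1 - y_n) phi(x_n)."""
    y_n = y(w, phi)
    return [y_n * (1.0 - y_n) * p_i for p_i in phi]


def grad_y_numeric(w, phi, eps=1e-6):
    """Central-difference approximation of the gradient of y with respect to w."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad.append((y(w_plus, phi) - y(w_minus, phi)) / (2.0 * eps))
    return grad


# Arbitrary illustrative values (assumptions, not from the exercise).
w = [0.2, -0.4]
phi = [1.0, 0.5]

analytic = grad_y(w, phi)
numeric = grad_y_numeric(w, phi)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_err)
```

The two gradients should agree up to the finite-difference error, which is far smaller than the gradient entries themselves.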
Using this, \[ \leqalignno{ \nablaw \ln y_n = \frac{1}{y_n}\nablaw y_n = (1 - y_n) \bphixn } \] and \[ \leqalignno{ \nablaw \ln (1-y_n) = \frac{1}{1-y_n}(-1)\nablaw y_n = - y_n \bphixn } \] Furthermore, \[ \leqalignno{ \nablaw \bwT \bA \bw = \underset{\comment{※2}}{2\bA\bw} } \]
\( \leqalignno{ \comment{※2}\ \frac{\partial}{\partial \bx} \bxT \bA \bx \underset{\comment{※3}}{=} (\bA + \bAT)\bx \underset{\comment{※4}}{=} 2\bA\bx } \), which gives the result with \(\bx = \bw\).
\(\comment{※3}\) Verified by a componentwise calculation.
\(\comment{※4}\) When \(\bA\) is symmetric.
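The quadratic-form identity in the notes above, \(\frac{\partial}{\partial \mathbf{x}} \mathbf{x}^{\mathsf{T}}\mathbf{A}\mathbf{x} = (\mathbf{A}+\mathbf{A}^{\mathsf{T}})\mathbf{x}\), can also be checked numerically instead of componentwise by hand. This is a small sketch under assumed test values. The matrix `A` is deliberately non-symmetric so that the general form is exercised, and the helper names are hypothetical.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def quad_form_grad(A, x):
    """Closed form of the gradient: (A + A^T) x."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]


def numerical_grad(f, x, eps=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * eps))
    return grad


# Illustrative values (assumptions): a non-symmetric A and an arbitrary x.
A = [[2.0, -1.0], [0.5, 3.0]]
x = [0.7, -1.2]

analytic = quad_form_grad(A, x)
numeric = numerical_grad(lambda v: quad_form(A, v), x)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_err)

# For a symmetric matrix the gradient reduces to 2 A x.
S = [[2.0, 1.0], [1.0, 3.0]]
two_S_x = [2.0 * sum(S[i][j] * x[j] for j in range(2)) for i in range(2)]
sym_err = max(abs(a - b) for a, b in zip(quad_form_grad(S, x), two_S_x))
print("symmetric case difference:", sym_err)
```

In the exercise the matrix is symmetric, so only the reduced form \(2\mathbf{A}\mathbf{w}\) is needed there.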
Therefore, noting that the factor \(\frac{1}{2}\) in (7.109) cancels the 2 in \(2\bA\bw\), we obtain \[ \leqalignno{ \nablaw \ln p(\bw | \bt, \balpha) &= \sum_{n=1}^N \{ t_n(1-y_n)\bphixn + (1-t_n)(-y_n)\bphixn\}-\bA\bw \\ &= \sum_{n=1}^N (t_n-y_n)\bphixn - \bA\bw \\ &= \underset{\comment{※5}}{\bPhiT(\bt-\by)} - \bA\bw &(7.110) } \]
\( \comment{※5} \) Let \( \bPhi = \pmatrix{\bphi(\bx_1)^{\T} \\ \bphi(\bx_2)^{\T}} \) (illustrated for \(N=2\)). Then
\( \leqalignno{ \therefore \bPhiT(\bt-\by) &= \pmatrix{\bphi(\bx_1) & \!\!\!\!\bphi(\bx_2)} \pmatrix{t_1 - y_1 \\ t_2 - y_2} \\ &= \bphi(\bx_1)(t_1-y_1)+\bphi(\bx_2)(t_2-y_2) \\ &~~~~~~~(\because \mbox{product of block matrices}) \\ &= \sum_{n=1}^N \bphixn(t_n-y_n) } \)
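As a check on (7.110), the gradient \(\boldsymbol{\Phi}^{\mathsf{T}}(\mathbf{t}-\mathbf{y}) - \mathbf{A}\mathbf{w}\) can be compared with finite differences of (7.109). The sketch below assumes \(\mathbf{A} = \mathrm{diag}(\alpha_i)\), as in PRML Section 7.2.3. The toy design matrix, targets, and hyperparameters are made-up values for illustration, and the function names are hypothetical.

```python
import math


def sigmoid(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-a))


def log_posterior(w, Phi, t, alpha):
    """Eq. (7.109) without the constant, assuming A = diag(alpha)."""
    total = 0.0
    for phi_n, t_n in zip(Phi, t):
        y_n = sigmoid(sum(w_i * p for w_i, p in zip(w, phi_n)))
        total += t_n * math.log(y_n) + (1.0 - t_n) * math.log(1.0 - y_n)
    total -= 0.5 * sum(a_i * w_i * w_i for a_i, w_i in zip(alpha, w))
    return total


def gradient(w, Phi, t, alpha):
    """Eq. (7.110): Phi^T (t - y) - A w, assuming A = diag(alpha)."""
    y = [sigmoid(sum(w_i * p for w_i, p in zip(w, phi_n))) for phi_n in Phi]
    return [
        sum((t_n - y_n) * phi_n[i] for phi_n, t_n, y_n in zip(Phi, t, y))
        - alpha[i] * w[i]
        for i in range(len(w))
    ]


def numerical_gradient(w, Phi, t, alpha, eps=1e-6):
    """Central-difference gradient of log_posterior with respect to w."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        diff = log_posterior(w_plus, Phi, t, alpha) - log_posterior(w_minus, Phi, t, alpha)
        grad.append(diff / (2.0 * eps))
    return grad


# Toy data (assumptions): each row of Phi is phi(x_n)^T.
Phi = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]]
t = [1.0, 0.0, 1.0]
alpha = [0.3, 1.2]
w = [0.2, -0.4]

analytic = gradient(w, Phi, t, alpha)
numeric = numerical_gradient(w, Phi, t, alpha)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_err)
```

Agreement between the two vectors supports both the sign of the \(\mathbf{A}\mathbf{w}\) term and the rewriting of the sum as a matrix-vector product.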
For the second derivative we need two further results. First, \[ \newcommand\nyp[2]{\frac{\partial}{\partial w_{#1}}y_n\phi_{#2}(\bx_n)} \leqalignno{ \nablaw y_n \bphixn &= \pmatrix { \nyp{1}{1} & \nyp{1}{2} \\ \nyp{2}{1} & \nyp{2}{2} } ~~~\comment{※}{\Tiny\mbox{ derivative of a vector by a vector in denominator layout}} \\ &= (\nablaw y_n)\bphixn^{\T} = y_n(1-y_n)\bphixn \bphixn^{\T} } \] Second, \[ \newcommand\naw[2]{\frac{\partial}{\partial w_{#1}}(A_{#2 1}w_1+A_{#22}w_2)} \leqalignno{ \nablaw \bA\bw &= \nablaw \pmatrix{ A_{11} w_1 + A_{12} w_2 \\ A_{21} w_1 + A_{22} w_2} \\ &= \pmatrix{\naw{1}{1} & \naw{1}{2} \\ \naw{2}{1} & \naw{2}{2}} ~~~\comment{※}{\Tiny\mbox{ derivative of a vector by a vector in denominator layout}} \\ &= \pmatrix{A_{11} & A_{21} \\ A_{12} & A_{22}} = \bA^{\T} = \underset{\llap{\because\bA\mbox{ is }}\rlap{\mbox{symmetric}}}{\bA} } \] Applying these to (7.110), where \(t_n\) does not depend on \(\bw\), we obtain \[ \leqalignno{ \nablaw \nablaw \ln p(\bw | \bt, \balpha) &= \sum_{n=1}^N(-1)y_n(1-y_n)\bphixn\bphixn^{\T}-\bA \\ &= -\left\{\sum_{n=1}^N y_n(1-y_n)\bphixn\bphixn^{\T}+\bA \right\} \\ &= -( \underset{\comment{※6}}{ \bPhi^{\T}\bB\bPhi } + \bA) &(7.111) } \]
\( \leqalignno{ \comment{※6}~~~ \bPhiT\bB\bPhi &= \bPhiT\underset{\comment{※7}}{\pmatrix{y_1(1-y_1) & 0 \\ 0 & y_2(1-y_2)}}\bPhi \\ &= \bPhiT\left\{ \pmatrix{y_1(1-y_1) & 0 \\ 0 & 0} + \pmatrix{0 & 0 \\ 0 & y_2(1-y_2)} \right\}\bPhi \\ &= y_1(1-y_1)\bPhiT\pmatrix{1 & 0 \\ 0 & 0}\bPhi + y_2(1-y_2)\bPhiT\pmatrix{0 & 0 \\ 0 & 1}\bPhi \\ &= \sum_{n=1}^N y_n(1-y_n)\underset{\comment{※8}}{\bphixn\bphixn^{\T}} } \)
\( \leqalignno{ \comment{※7}~~~\bB=\pmatrix{y_1(1-y_1) & 0 \\ 0 & y_2(1-y_2)} } \)
\( \leqalignno{ \comment{※8}~~~ \bPhiT\pmatrix{1 & 0 \\ 0 & 0}\bPhi &= \pmatrix{\bphi(\bx_1) & \bphi(\bx_2)} \pmatrix{1 & 0 \\ 0 & 0} \pmatrix{\bphi(\bx_1)^{\T} \\ \bphi(\bx_2)^{\T}} \\ &= \pmatrix{\bphi(\bx_1) & \bphi(\bx_2)} \pmatrix{\bphi(\bx_1)^{\T} \\ \b{0}^{\T}} \\ &~~~~~(\because \mbox{product of block matrices}) \\ &= \bphi(\bx_1)\bphi(\bx_1)^{\T} + \bphi(\bx_2)\b{0}^{\T} \\ &~~~~~(\because \mbox{product of block matrices}) \\ &= \bphi(\bx_1)\bphi(\bx_1)^{\T} \\ \mbox{Similarly,}~~~~~& \\ \bPhiT\pmatrix{0 & 0 \\ 0 & 1}\bPhi &= \bphi(\bx_2)\bphi(\bx_2)^{\T} } \)
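The Hessian (7.111) can be checked in the same spirit by differencing the gradient (7.110). The sketch below again assumes \(\mathbf{A} = \mathrm{diag}(\alpha_i)\) and uses made-up toy data. It builds \(\boldsymbol{\Phi}^{\mathsf{T}}\mathbf{B}\boldsymbol{\Phi}\) as the sum of outer products \(y_n(1-y_n)\boldsymbol{\phi}(\mathbf{x}_n)\boldsymbol{\phi}(\mathbf{x}_n)^{\mathsf{T}}\), which is the form the notes above arrive at. All function names are hypothetical.

```python
import math


def sigmoid(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-a))


def predictions(w, Phi):
    """y_n = sigma(w^T phi(x_n)) for every row phi(x_n)^T of Phi."""
    return [sigmoid(sum(w_i * p for w_i, p in zip(w, phi_n))) for phi_n in Phi]


def gradient(w, Phi, t, alpha):
    """Eq. (7.110): Phi^T (t - y) - A w, assuming A = diag(alpha)."""
    y = predictions(w, Phi)
    return [
        sum((t_n - y_n) * phi_n[i] for phi_n, t_n, y_n in zip(Phi, t, y))
        - alpha[i] * w[i]
        for i in range(len(w))
    ]


def hessian(w, Phi, alpha):
    """Eq. (7.111): -(Phi^T B Phi + A), accumulated as a sum of outer products."""
    y = predictions(w, Phi)
    m = len(w)
    H = [[0.0] * m for _ in range(m)]
    for phi_n, y_n in zip(Phi, y):
        b_n = y_n * (1.0 - y_n)  # n-th diagonal entry of B
        for i in range(m):
            for j in range(m):
                H[i][j] -= b_n * phi_n[i] * phi_n[j]
    for i in range(m):
        H[i][i] -= alpha[i]  # subtract A = diag(alpha)
    return H


def numerical_hessian(w, Phi, t, alpha, eps=1e-6):
    """Central differences of the gradient; row i holds d(gradient)/d(w_i)."""
    m = len(w)
    H = [[0.0] * m for _ in range(m)]
    for i in range(m):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        g_plus = gradient(w_plus, Phi, t, alpha)
        g_minus = gradient(w_minus, Phi, t, alpha)
        for j in range(m):
            H[i][j] = (g_plus[j] - g_minus[j]) / (2.0 * eps)
    return H


# Toy data (assumptions): each row of Phi is phi(x_n)^T.
Phi = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]]
t = [1.0, 0.0, 1.0]
alpha = [0.3, 1.2]
w = [0.2, -0.4]

H_analytic = hessian(w, Phi, alpha)
H_numeric = numerical_hessian(w, Phi, t, alpha)
max_err = max(
    abs(H_analytic[i][j] - H_numeric[i][j]) for i in range(2) for j in range(2)
)
print("analytic:", H_analytic)
print("numeric: ", H_numeric)
print("max abs difference:", max_err)
```

With positive \(\alpha_i\) the matrix \(\boldsymbol{\Phi}^{\mathsf{T}}\mathbf{B}\boldsymbol{\Phi} + \mathbf{A}\) is positive definite, so the Hessian printed here should be symmetric and negative definite.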
prml演習7.18.txt · Last modified: 2018/01/09 14:03 by ma
