Normal Equation: Derivation Using Vector Derivatives

2022. 4. 28. 18:40·.../Math

In OLS (Ordinary Least Squares), the approximate solution $\hat{\textbf{x}}$ must satisfy the following:
$$\begin{aligned}
\hat{\textbf{x}} &= \text{arg } \underset{\textbf{x}}{\text{min}} \Vert \textbf{b}-A\textbf{x} \Vert\\
&=\text{arg } \underset{\textbf{x}}{\text{min} } \Vert \textbf{b}-A\textbf{x} \Vert ^2\\
&=\text{arg } \underset{\textbf{x}}{\text{min} } (\textbf{b}-A\textbf{x} )^T(\textbf{b}-A\textbf{x})\\
&=\text{arg } \underset{\textbf{x}}{\text{min} } f_\text{loss} (\textbf{x})\end{aligned}$$
where

  • $\textbf{x}$ : column vector, $\textbf{x} \in \mathbb{R}^n$, ($n \times 1$)
  • $A$ : $m \times n$ matrix
  • $\textbf{b}$ : column vector, $\textbf{b} \in \mathbb{R}^m$, ($m \times 1$)
  • $m \gg n$

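That is, the problem is to find the $\textbf{x}$ that minimizes the following function $f_\text{loss}(\textbf{x})$, obtained by expanding the product $(\textbf{b}-A\textbf{x})^T(\textbf{b}-A\textbf{x})$.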
$$\begin{aligned}
f_\text{loss}(\textbf{x})&= \textbf{b}^T\textbf{b} - \textbf{x}^T A^T \textbf{b} - \textbf{b}^T A \textbf{x} + \textbf{x}^T A^T A \textbf{x}
\end{aligned}$$
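
As a quick sanity check, the four-term expansion can be compared numerically with the squared residual norm. This is only a minimal sketch: the random `A`, `b`, `x`, the sizes `m` and `n`, and the helper names `f_loss` and `f_loss_expanded` are illustrative assumptions, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3                       # m >> n, as assumed in the text
A = rng.normal(size=(m, n))        # illustrative random data
b = rng.normal(size=m)
x = rng.normal(size=n)

def f_loss(x):
    """Squared residual norm ||b - A x||^2."""
    r = b - A @ x
    return r @ r

def f_loss_expanded(x):
    """The same quantity written as the four-term expansion."""
    return b @ b - x @ A.T @ b - b @ A @ x + x @ A.T @ A @ x

# The two forms should agree up to floating-point rounding.
print(np.isclose(f_loss(x), f_loss_expanded(x)))
```

The two middle terms are scalars that are transposes of each other, so they are equal. This is the fact the derivation relies on when it merges them.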

At the minimum of $f_\text{loss}(\textbf{x})$, the gradient with respect to $\textbf{x}$ is zero, so the normal equation can be derived by vector differentiation as follows.

$$\begin{aligned}
0&=\dfrac{\partial f_\text{loss}(\textbf{x})}{\partial \textbf{x}}\\
&=\dfrac{\partial}{\partial \textbf{x}}\left( \textbf{b}^T\textbf{b} - \textbf{x}^T A^T \textbf{b} - \textbf{b}^T A \textbf{x} + \textbf{x}^T A^T A \textbf{x}\right)\\
&=\dfrac{\partial}{\partial \textbf{x}}\left( \textbf{b}^T\textbf{b} - \textbf{x}^T A^T \textbf{b} - (\textbf{x}^T A^T \textbf{b})^T + \textbf{x}^T A^T A \textbf{x}\right)\\
&=\dfrac{\partial}{\partial \textbf{x}}\left( \textbf{b}^T\textbf{b} - \textbf{x}^T A^T \textbf{b} - (\textbf{x}^T A^T \textbf{b}) + \textbf{x}^T A^T A \textbf{x}\right)\\
&=0-A^T\textbf{b}-A^T\textbf{b}+2A^TA\hat{\textbf{x}}\\
&=-2A^T\textbf{b}+2A^TA\hat{\textbf{x}}\\
&\quad \\
2A^T\textbf{b}&=2A^TA\hat{\textbf{x}}\\
A^T\textbf{b}&=A^TA\hat{\textbf{x}}\\
A^TA\hat{\textbf{x}}&=A^T\textbf{b}\\
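\hat{\textbf{x}}&=(A^TA)^{-1}A^T\textbf{b}\end{aligned}$$

The last step requires $A^TA$ to be invertible, which holds when the columns of $A$ are linearly independent (full column rank).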
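
The result can be tried out numerically. The sketch below assumes synthetic data (a hypothetical `x_true` plus small noise) and solves $A^TA\hat{\textbf{x}}=A^T\textbf{b}$ with `np.linalg.solve` instead of forming the inverse explicitly; it then compares against `np.linalg.lstsq` and evaluates the gradient expression from the derivation at $\hat{\textbf{x}}$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 3                              # m >> n
A = rng.normal(size=(m, n))                # illustrative random design matrix
x_true = np.array([1.0, -2.0, 0.5])        # hypothetical ground truth
b = A @ x_true + 0.01 * rng.normal(size=m)

# Normal equation: A^T A x_hat = A^T b (solved without an explicit inverse).
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Reference least-squares solution from NumPy.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

# Gradient of f_loss at x_hat: -2 A^T b + 2 A^T A x_hat, expected to be ~0.
grad = -2 * A.T @ b + 2 * A.T @ A @ x_hat

print(np.allclose(x_hat, x_ref))
print(np.allclose(grad, 0.0, atol=1e-8))
```

Solving the linear system is generally preferred over computing $(A^TA)^{-1}$ directly. When $A$ is ill-conditioned, a QR- or SVD-based routine such as `np.linalg.lstsq` tends to be more accurate than the normal equation.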


Vector derivative (summary)

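$$\begin{array}{c|c}
f(\textbf{x}) & \dfrac{d f(\textbf{x})}{d \textbf{x}} \\ \hline
\textbf{x}^T A & A \\
\textbf{x}^T \textbf{b} & \textbf{b} \\
\textbf{x}^T \textbf{x} & 2\textbf{x} \\
\textbf{x}^T A \textbf{x} & (A+A^T)\textbf{x} \quad (=2A\textbf{x} \text{ when } A \text{ is symmetric})
\end{array}$$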
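
These identities can be checked against a finite-difference gradient. The sketch below is an assumption-laden illustration: `numerical_gradient` is a hypothetical helper, and `M` is a deliberately non-symmetric square matrix, used to show that the quadratic form differentiates to $(M+M^T)\textbf{x}$ in general and to $2M\textbf{x}$ only when $M$ is symmetric (as $A^TA$ is in the derivation).

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
n = 4
x = rng.normal(size=n)
b = rng.normal(size=n)
M = rng.normal(size=(n, n))        # non-symmetric on purpose
S = M.T @ M                        # symmetric, like A^T A

checks = {
    "x^T b   -> b":         np.allclose(numerical_gradient(lambda v: v @ b, x), b, atol=1e-5),
    "x^T x   -> 2x":        np.allclose(numerical_gradient(lambda v: v @ v, x), 2 * x, atol=1e-5),
    "x^T M x -> (M+M^T)x":  np.allclose(numerical_gradient(lambda v: v @ M @ v, x), (M + M.T) @ x, atol=1e-5),
    "x^T S x -> 2Sx":       np.allclose(numerical_gradient(lambda v: v @ S @ v, x), 2 * S @ x, atol=1e-5),
}
for name, ok in checks.items():
    print(name, ok)
```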

Transpose properties

  • $(A^T)^T =A$
  • $(A+B)^T =A^T+B^T$
  • $(A-B)^T =A^T-B^T$
  • $(kA)^T =kA^T, \text{ where }k\text{ is scalar.}$
  • $k^T = k, \text{ where }k\text{ is scalar.}$
  • $(AB)^T=B^TA^T$
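
The transpose rules above can be spot-checked with NumPy. This is a small sketch on arbitrary random matrices; the names `A`, `B`, `C`, `k`, `x`, and `b` and their shapes are chosen only for illustration. The last entry checks the scalar case used in the derivation, $\textbf{b}^TA\textbf{x} = (\textbf{x}^TA^T\textbf{b})^T = \textbf{x}^TA^T\textbf{b}$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(3, 4))        # same shape as A, for sums and differences
C = rng.normal(size=(4, 2))        # compatible with A for the product A @ C
k = 2.5                            # scalar
x = rng.normal(size=4)
b = rng.normal(size=3)

results = [
    np.array_equal(A.T.T, A),               # (A^T)^T = A
    np.allclose((A + B).T, A.T + B.T),      # (A+B)^T = A^T + B^T
    np.allclose((A - B).T, A.T - B.T),      # (A-B)^T = A^T - B^T
    np.allclose((k * A).T, k * A.T),        # (kA)^T = k A^T
    np.allclose((A @ C).T, C.T @ A.T),      # (AB)^T = B^T A^T
    np.isclose(b @ A @ x, x @ A.T @ b),     # a scalar equals its own transpose
]
print(all(results))
```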

Reference

matrixcookbook.pdf
