This post covers the coordinate systems used in computer vision.

Types and Definitions

World Coordinates:
- $\begin{bmatrix}x_w& y_w & z_w & 1\end{bmatrix}^\top$
- The position of an object outside the camera, expressed in a global coordinate system.
- For convenience, the world frame is sometimes chosen to share its axes and origin with the camera coordinates.
Camera Coordinates:
- $\begin{bmatrix}x_c & y_c & z_c & 1\end{bmatrix}^\top$
- A point given in world coordinates, re-expressed in the coordinate system whose origin is the camera center.
- The camera center usually refers to the optical center (or projection center).
Normalized Image Plane Coordinates:
- $\begin{bmatrix}x_n & y_n & 1\end{bmatrix}^\top$
- Camera coordinates projected onto the normalized image plane (the plane $z_c = 1$).
- In practice, usually obtained by multiplying sensor coordinates by the inverse of the intrinsic matrix $K$, where $K$ is the result of camera calibration.
Sensor Coordinates (also called Pixel Coordinates; closely related to Image Plane Coordinates):
- $\begin{bmatrix}u & v & 1\end{bmatrix}^\top$
- Normalized image plane coordinates converted to the actual digital sensor (pixel) coordinate system; the conversion is done by $K$.
- Corresponds to the virtual image plane in the figure below.
- Physically, the image forms behind the optical center and is inverted, but working with the virtual image plane is the usual convenience.
- Image Plane Coordinates
  - There is also an image coordinate $\begin{bmatrix}x_i & y_i & 1 \end{bmatrix}^\top$ that uses a physical unit such as mm on the image.
  - $x_i = f x_n = (u-u_o)/m_x, \quad y_i = f y_n = (v - v_o)/m_y$.
  - $m_x, m_y$: pixel density (pixels/mm)
  - $f$: focal length
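
As a short worked step, the image plane relation can be solved for the pixel coordinates. This is only a sketch, assuming the common shorthand $f_x = f m_x$ and $f_y = f m_y$ (focal length expressed in pixels):

$$
\begin{aligned}
u &= m_x x_i + u_o = f m_x x_n + u_o = f_x x_n + u_o \\
v &= m_y y_i + v_o = f m_y y_n + v_o = f_y y_n + v_o
\end{aligned}
$$

For example, with the purely illustrative values $f = 4$ mm, $m_x = 200$ pixels/mm, and $u_o = 320$, a point at $x_n = 0.1$ lies at $x_i = 0.4$ mm on the image plane and at $u = 200 \cdot 0.4 + 320 = 400$ in pixels.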

The following figure shows each coordinate system schematically (the real image plane is omitted).

original: https://darkpgmr.tistory.com/77

2024.06.22 - [Programming/DIP] - [CV] Geometric Camera Model


Relationships Between the Coordinate Systems

Transformation from the World Coordinate System to the Camera Coordinate System:
$$
\begin{bmatrix}
x_c \\
y_c \\
z_c \\
1
\end{bmatrix}
= \begin{bmatrix}
R & \textbf{t} \\
0 & 1
\end{bmatrix} \begin{bmatrix}
x_w \\
y_w \\
z_w \\
1
\end{bmatrix}
$$
Here $R$ is the $3 \times 3$ rotation matrix and $\textbf{t}$ is the $3 \times 1$ translation vector.
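
The transformation above can be sketched with NumPy as follows. The values of `R`, `t`, and the world point are hypothetical and chosen only so the arithmetic is easy to follow; the name `T_cw` is likewise just an illustrative label for the $4 \times 4$ matrix.

```python
import numpy as np

# Hypothetical extrinsics, chosen only for illustration:
# a 90-degree rotation about the z-axis and a shift of 5 units along z.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([0.0, 0.0, 5.0])

# Build the 4x4 homogeneous transform [[R, t], [0, 1]].
T_cw = np.eye(4)
T_cw[:3, :3] = R
T_cw[:3, 3] = t

# A world point in homogeneous coordinates [x_w, y_w, z_w, 1].
p_w = np.array([1.0, 2.0, 3.0, 1.0])

# Camera coordinates; equivalent to R @ p_w[:3] + t with a trailing 1.
p_c = T_cw @ p_w  # numerically (-2, 1, 8, 1)

# The camera center in world coordinates is the point that maps to the
# camera origin, i.e. the solution of R @ C_w + t = 0.
C_w = -(R.T @ t)  # numerically (0, 0, -5)
```

Note that $\textbf{t}$ is not the camera position in the world frame; the camera center is $-R^\top \textbf{t}$, as the last line shows.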
Transformation from the Camera Coordinate System to the Normalized Image Plane Coordinate System:
$$
\begin{bmatrix}
x_n \\
y_n \\
1
\end{bmatrix}
= \begin{bmatrix}
\frac{x_c}{z_c} \\
\frac{y_c}{z_c} \\
1
\end{bmatrix}
$$
This divides the camera coordinates by the depth $z_c$, which projects the point onto the normalized image plane at $z_c = 1$.
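
A minimal sketch of this perspective division is shown below. The helper name `to_normalized` and the check that the point lies in front of the camera ($z_c > 0$) are assumptions added for illustration, not part of the formula itself.

```python
import numpy as np

def to_normalized(p_c):
    """Map camera coordinates (x_c, y_c, z_c) to [x_n, y_n, 1]."""
    x_c, y_c, z_c = np.asarray(p_c, dtype=float)[:3]
    # Assumption: only points in front of the camera are projected.
    if z_c <= 0:
        raise ValueError("point must be in front of the camera (z_c > 0)")
    return np.array([x_c / z_c, y_c / z_c, 1.0])

# Two points on the same ray through the optical center.
near = to_normalized([-2.0, 1.0, 8.0])   # numerically (-0.25, 0.125, 1)
far = to_normalized([-4.0, 2.0, 16.0])   # same normalized coordinates
```

Both points land on the same normalized coordinates, which illustrates that depth is lost in this step.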
Transformation from Normalized Image Plane Coordinates to the Sensor Coordinate System:
$$
\begin{bmatrix}
u \\
v \\
1
\end{bmatrix}
= K \begin{bmatrix}
x_n \\
y_n \\
1
\end{bmatrix}
$$
Here $K$ is the camera intrinsic parameter matrix, which has the following form:
$$
K = \begin{bmatrix}
f_x & 0 & u_o \\
0 & f_y & v_o \\
0 & 0 & 1
\end{bmatrix}
$$
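$f_x, f_y$ are the focal length $f$ multiplied by the pixel densities $m_x, m_y$ (that is, $f_x = f m_x$ and $f_y = f m_y$, in pixels), and $u_o, v_o$ are the pixel coordinates of the image center (principal point).
- This form follows from perspective projection and
- the Image Plane to Sensor Plane Mapping.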
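
The mapping through $K$, and its inverse from pixels back to normalized coordinates, can be sketched as follows. The focal length, pixel density, and principal point below are hypothetical values picked for easy arithmetic, not values from a real calibration.

```python
import numpy as np

# Hypothetical intrinsics, for illustration only.
f = 4.0                   # focal length in mm
m_x = m_y = 200.0         # pixel density in pixels/mm
u_o, v_o = 320.0, 240.0   # principal point in pixels

# Intrinsic matrix with f_x = f * m_x and f_y = f * m_y.
K = np.array([[f * m_x, 0.0,     u_o],
              [0.0,     f * m_y, v_o],
              [0.0,     0.0,     1.0]])

# Normalized image plane coordinates [x_n, y_n, 1].
p_n = np.array([-0.25, 0.125, 1.0])

# Sensor (pixel) coordinates [u, v, 1].
uv1 = K @ p_n  # numerically (120, 340, 1)

# Going back: solve K @ p = uv1 instead of forming K^{-1} explicitly.
p_n_back = np.linalg.solve(K, uv1)
```

Using `np.linalg.solve` rather than an explicit inverse is a small design choice; for a single $3 \times 3$ matrix either approach works.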

2024.07.06 - [Programming/DIP] - [CV] Perspective Projection (원근 투영법): Camera to Image


2024.07.06 - [Programming/DIP] - [CV] Image Plane to Image Sensor Mapping


Further Reading

https://darkpgmr.tistory.com/77

[영상 Geometry #1] 좌표계 (Image Geometry #1: Coordinate Systems)

