Optimism-corrected Accuracy


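Optimism-corrected accuracy is an accuracy estimate in which the optimistic bias contained in the resubstitution accuracy is estimated via bootstrap sampling and then subtracted out.

Here, optimism means the following:

the amount by which a model's performance looks better than its true generalization performance
when the model is evaluated on the data it was trained on

In other words, performance re-evaluated on the training data usually comes out higher than the true performance.

Performance measured this way is called apparent performance or resubstitution performance.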

For the other metrics used with the bootstrap, see:

2026.04.14 - [Programming/ML] - Bootstrap Sampling 기반 Accuracy 추정 지표 (Bootstrap Sampling-based Accuracy Estimation Metrics)


1. Resubstitution Accuracy

Train a model on the original dataset $D$, then evaluate its accuracy on the same dataset $D$. The resulting value is denoted by:

$$
\hat{\text{Acc}}_{\text{resub}}
$$

Because this is measured on data the model has already seen, it is generally an optimistic estimate.

That is, $\hat{\text{Acc}}_{\text{resub}}$ is likely to be higher than the accuracy on truly unseen data.
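To make the optimism concrete, here is a minimal sketch using only the Python standard library. The toy data generator `make_data`, the 1-nearest-neighbor model `fit_1nn`, and the 20% label noise are all assumptions made for illustration; none of them come from the original post. A 1-NN model memorizes its training set, so its resubstitution accuracy is 1.0 by construction (barring duplicate inputs with conflicting labels), while its accuracy on fresh data is clearly lower.

```python
import random

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: label is 1 when x > 0, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def fit_1nn(train):
    """'Training' a 1-NN model just stores the data; returns a predict function."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict

def accuracy(predict, data):
    """Fraction of (x, y) pairs that the model labels correctly."""
    return sum(predict(x) == y for x, y in data) / len(data)

rng = random.Random(0)
D = make_data(100, rng)
model = fit_1nn(D)

acc_resub = accuracy(model, D)                      # evaluated on the training data itself
acc_fresh = accuracy(model, make_data(2000, rng))   # evaluated on newly drawn, unseen data
print("resubstitution:", acc_resub, "fresh data:", acc_fresh)
```

In practice fresh data like `make_data(2000, rng)` is usually not available, which is why the optimism has to be estimated from $D$ itself.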

2. Estimating Optimism with the Bootstrap

The optimism-corrected bootstrap runs $B$ bootstrap iterations.

In each iteration $b$, a bootstrap sample $D_b^*$ is drawn from the original dataset $D$ by sampling with replacement.

Then the following steps are performed:

- Train a model on $D_b^*$.
- Evaluate the trained model on $D_b^*$.
- Evaluate the same model on the entire original dataset $D$.
- Take the difference between the two accuracies as the optimism.

The optimism in each bootstrap iteration is:

$$o_b = \hat{\text{Acc}}_{b,\text{boot}} - \hat{\text{Acc}}_{b,\text{orig}}$$

$\hat{\text{Acc}}_{b,\text{boot}}$
- accuracy evaluated on the bootstrap sample $D_b^*$
$\hat{\text{Acc}}_{b,\text{orig}}$
- accuracy of the same bootstrap-trained model evaluated on the entire original dataset $D$

That is, $o_b$ measures the following:

how much better a model trained on a bootstrap sample looks
when it is evaluated on the very sample it was trained on
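The per-iteration procedure can be sketched as follows. This is an illustrative sketch, not a reference implementation: `bootstrap_optimisms`, the toy data, and the 1-NN model are hypothetical names and choices introduced here. Each bootstrap sample is drawn with replacement and has the same size as $D$.

```python
import random

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: label is 1 when x > 0, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0)
        data.append((x, 1 - y if rng.random() < noise else y))
    return data

def fit_1nn(train):
    """1-NN stand-in for 'any model'; returns a predict function."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

def bootstrap_optimisms(D, fit, B, rng):
    """Return [o_1, ..., o_B] with o_b = Acc_{b,boot} - Acc_{b,orig}."""
    optimisms = []
    for _ in range(B):
        D_b = rng.choices(D, k=len(D))      # sample with replacement, same size as D
        model_b = fit(D_b)                  # train on the bootstrap sample
        acc_boot = accuracy(model_b, D_b)   # evaluate on the bootstrap sample
        acc_orig = accuracy(model_b, D)     # evaluate on the full original dataset
        optimisms.append(acc_boot - acc_orig)
    return optimisms

rng = random.Random(0)
D = make_data(100, rng)
o = bootstrap_optimisms(D, fit_1nn, B=200, rng=rng)
print("min o_b:", min(o), "max o_b:", max(o))
```

For a general model, an individual $o_b$ can occasionally be negative. With this memorizing 1-NN toy, $\hat{\text{Acc}}_{b,\text{boot}}$ is always 1, so every $o_b$ is non-negative.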

3. Mean Optimism

The optimism values obtained in the individual bootstrap iterations are averaged:

$$\frac{1}{B}\sum_{b=1}^{B} o_b$$

This value estimates how far the resubstitution accuracy overshoots the true performance.

4. Optimism-corrected Accuracy

Finally, the optimism-corrected accuracy is computed as:

$$\hat{\text{Acc}}_{\text{opt-corr}} = \hat{\text{Acc}}_{\text{resub}} - \frac{1}{B}\sum_{b=1}^{B} o_b$$

In words:

$$\text{corrected accuracy} = \text{apparent accuracy} - \text{estimated optimism}$$

This lowers the overly favorable performance seen on the training data by the amount of optimism estimated with the bootstrap.
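Putting the pieces together, the whole estimator fits in one short function. The sketch below assumes a hypothetical `fit(data) -> predict` interface and reuses an illustrative 1-NN model on toy data; `optimism_corrected_accuracy` is a name introduced here, not an existing library function.

```python
import random

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: label is 1 when x > 0, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0)
        data.append((x, 1 - y if rng.random() < noise else y))
    return data

def fit_1nn(train):
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

def optimism_corrected_accuracy(D, fit, B, rng):
    """Return (Acc_resub, mean optimism, Acc_opt-corr)."""
    acc_resub = accuracy(fit(D), D)            # apparent accuracy on D
    total = 0.0
    for _ in range(B):
        D_b = rng.choices(D, k=len(D))         # bootstrap sample D_b^*
        model_b = fit(D_b)
        total += accuracy(model_b, D_b) - accuracy(model_b, D)   # o_b
    mean_optimism = total / B
    return acc_resub, mean_optimism, acc_resub - mean_optimism

rng = random.Random(0)
D = make_data(100, rng)
acc_resub, mean_optimism, acc_corr = optimism_corrected_accuracy(D, fit_1nn, B=200, rng=rng)
print("resub:", acc_resub, "optimism:", mean_optimism, "corrected:", acc_corr)
```

Passing `fit` as an argument keeps the estimator independent of the model, so the same function works for any training procedure that returns a prediction function.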

5. A Simple Example

Suppose the accuracy obtained by training on the original dataset and evaluating on that same dataset is:

$$\hat{\text{Acc}}_{\text{resub}} = 0.90$$

Suppose further that the mean optimism estimated with the bootstrap is:

$$\frac{1}{B}\sum_{b=1}^{B} o_b = 0.06$$

Then the optimism-corrected accuracy is:

$$\hat{\text{Acc}}_{\text{opt-corr}} = 0.90 - 0.06 = 0.84$$

The apparent accuracy is $0.90$,
but after correcting for the optimistic bias, the true performance is interpreted as being roughly $0.84$.
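The same arithmetic in code, as a tiny sketch. The five per-iteration optimism values are hypothetical, chosen only so that their mean equals the $0.06$ used above; in a real run there would be $B$ of them, typically a few hundred.

```python
# Hypothetical per-iteration optimism values (B = 5), picked so that their mean is 0.06.
optimisms = [0.05, 0.07, 0.06, 0.04, 0.08]
acc_resub = 0.90

mean_optimism = sum(optimisms) / len(optimisms)   # (1/B) * sum of o_b
acc_opt_corr = acc_resub - mean_optimism          # apparent accuracy minus estimated optimism

print(round(mean_optimism, 2), round(acc_opt_corr, 2))
```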

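6. Difference from OOB Evaluation

The optimism-corrected bootstrap is not the same as a plain OOB (out-of-bag) evaluation.

The OOB approach evaluates performance using only the samples that were not included in each bootstrap sample.

The optimism-corrected bootstrap, in contrast, computes the following difference in each bootstrap iteration: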

$$ \hat{\text{Acc}}_{b,\text{boot}} - \hat{\text{Acc}}_{b,\text{orig}}$$

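No OOB score is reported as the final performance here.
Instead, this difference is used to estimate how much the performance on the training data is overstated. Note that $\hat{\text{Acc}}_{b,\text{orig}}$ is measured on all of $D$, which also contains the samples already in $D_b^*$, so it is not an OOB evaluation.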
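The difference between the two evaluation sets is easy to see for a single bootstrap iteration. The sketch below uses hypothetical toy data and a 1-NN model, both assumptions for illustration. It draws one bootstrap sample by index, splits $D$ into in-bag and out-of-bag parts, and computes the optimism term alongside the OOB accuracy. On average about 63.2% of the distinct samples in $D$ end up in-bag.

```python
import random

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: label is 1 when x > 0, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0)
        data.append((x, 1 - y if rng.random() < noise else y))
    return data

def fit_1nn(train):
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

rng = random.Random(0)
D = make_data(100, rng)
n = len(D)

idx = [rng.randrange(n) for _ in range(n)]            # bootstrap indices, with replacement
in_bag = set(idx)                                     # distinct samples that were drawn
D_b = [D[i] for i in idx]                             # bootstrap sample D_b^*
D_oob = [D[i] for i in range(n) if i not in in_bag]   # samples never drawn

model_b = fit_1nn(D_b)
acc_boot = accuracy(model_b, D_b)    # used by the optimism-corrected bootstrap
acc_orig = accuracy(model_b, D)      # all of D: in-bag and OOB samples together
acc_oob = accuracy(model_b, D_oob)   # used by OOB-style estimators only

print("in-bag fraction:", len(in_bag) / n)
print("optimism o_b:", acc_boot - acc_orig)
print("OOB accuracy:", acc_oob)
```

Because `acc_orig` includes the in-bag samples the model has already seen, it is at least as high as `acc_oob` for this memorizing model.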

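7. Relationship Between the Optimism-corrected Bootstrap and the .632 and .632+ Bootstrap

The main bootstrap-based performance estimation methods are:

- optimism-corrected bootstrap
- .632 bootstrap
- .632+ bootstrap

For a higher-is-better metric such as accuracy, the estimates typically rank from most to least optimistic as follows: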

$$ \text{Resubstitution} > \text{Optimism-corrected} > .632 > .632+$$

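This can usually be read as follows.

Method	Description
Resubstitution accuracy	most optimistic
Optimism-corrected accuracy	less optimistic than resubstitution
.632 bootstrap	more conservative, because it reflects OOB performance more directly
.632+ bootstrap	most conservative, because the weight on the OOB side grows with the amount of overfitting

This is only a general tendency.
The actual order can vary with dataset size, model complexity, degree of overfitting, and the performance metric.
For a lower-is-better metric such as error rate, the inequality signs point the other way: the most optimistic estimate is the smallest value.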
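The ordering can be checked on a toy problem. The sketch below is a simplified illustration under stated assumptions: the data and the 1-NN model are hypothetical, the OOB accuracy is pooled over all iterations rather than averaged per sample, and the .632+ part follows the usual definition with a no-information error rate `gamma` and a relative overfitting rate `R` clipped to $[0, 1]$, without further refinements.

```python
import random

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: label is 1 when x > 0, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0)
        data.append((x, 1 - y if rng.random() < noise else y))
    return data

def fit_1nn(train):
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

def compare_estimates(D, fit, B, rng):
    n = len(D)
    model = fit(D)
    acc_resub = accuracy(model, D)
    optimism_sum, oob_correct, oob_total = 0.0, 0, 0
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        D_b = [D[i] for i in idx]
        model_b = fit(D_b)
        optimism_sum += accuracy(model_b, D_b) - accuracy(model_b, D)
        for i in range(n):                      # pooled OOB predictions
            if i not in in_bag:
                x, y = D[i]
                oob_correct += model_b(x) == y
                oob_total += 1
    acc_oob = oob_correct / oob_total
    acc_opt = acc_resub - optimism_sum / B
    acc_632 = 0.368 * acc_resub + 0.632 * acc_oob
    # .632+ works on error rates: err (resubstitution) and err_oob
    err, err_oob = 1 - acc_resub, 1 - acc_oob
    p1 = sum(y for _, y in D) / n               # observed rate of class 1
    q1 = sum(model(x) for x, _ in D) / n        # predicted rate of class 1
    gamma = p1 * (1 - q1) + (1 - p1) * q1       # no-information error rate
    R = (err_oob - err) / (gamma - err) if gamma > err and err_oob > err else 0.0
    R = min(R, 1.0)                             # relative overfitting rate in [0, 1]
    w = 0.632 / (1 - 0.368 * R)                 # weight on the OOB error
    acc_632p = 1 - ((1 - w) * err + w * err_oob)
    return acc_resub, acc_opt, acc_632, acc_632p

rng = random.Random(0)
D = make_data(100, rng)
acc_resub, acc_opt, acc_632, acc_632p = compare_estimates(D, fit_1nn, B=100, rng=rng)
print(acc_resub, acc_opt, acc_632, acc_632p)
```

A heavily overfit model such as 1-NN makes the gaps between the four estimates large. With a less flexible model the estimates can sit much closer together, and the order is not guaranteed.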

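8. Caveats When Using It

To use the optimism-corrected bootstrap correctly,
the entire modeling process must be repeated in every bootstrap iteration.

Simply evaluating an already-trained final model on bootstrap samples is not appropriate.

In each bootstrap iteration, all of the following steps must be run again: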

preprocessing
scaling
feature selection
hyperparameter tuning
threshold selection
model fitting
performance evaluation

Only then is the optimism of the whole modeling pipeline estimated properly.
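One way to enforce this is to wrap every data-dependent step in a single `fit` function and call that function inside the loop. The sketch below is a hypothetical toy pipeline (standardization followed by threshold selection) on made-up data. The names `fit_pipeline`, `optimism_corrected`, and the `calls` counter are introduced only for illustration.

```python
import random

calls = {"fit": 0}   # counts how often the whole pipeline is run

def fit_pipeline(train):
    """Hypothetical pipeline: scaling + threshold selection, using `train` only."""
    calls["fit"] += 1
    xs = [x for x, _ in train]
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5 or 1.0
    scaled = [((x - mean) / std, y) for x, y in train]   # scaling statistics from train
    # threshold selection: pick the cut point with the best training accuracy
    candidates = sorted(z for z, _ in scaled)
    best_t = max(candidates, key=lambda t: sum((z > t) == y for z, y in scaled))
    return lambda x: int((x - mean) / std > best_t)

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

def optimism_corrected(D, fit, B, rng):
    acc_resub = accuracy(fit(D), D)
    optimism = 0.0
    for _ in range(B):
        D_b = rng.choices(D, k=len(D))
        model_b = fit(D_b)   # the entire pipeline is re-run on D_b, nothing is reused from D
        optimism += accuracy(model_b, D_b) - accuracy(model_b, D)
    return acc_resub - optimism / B

rng = random.Random(0)
D = []
for _ in range(100):                 # toy data: noisy label of x > 5
    x = rng.gauss(5.0, 2.0)
    y = int(x > 5.0)
    D.append((x, 1 - y if rng.random() < 0.2 else y))

corrected = optimism_corrected(D, fit_pipeline, B=100, rng=rng)
print("corrected accuracy:", corrected, "pipeline runs:", calls["fit"])
```

The counter ends at $B + 1$: one run on $D$ for the apparent accuracy plus one run per bootstrap sample. Computing the scaling statistics or the threshold once on $D$ and reusing them inside the loop would hide part of the pipeline's optimism.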

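Summary

Optimism-corrected accuracy is a bootstrap-based performance estimate obtained as follows:

- Train a model on the original dataset and compute the resubstitution accuracy on that same dataset.
- Generate multiple bootstrap samples.
- Train a new model on each bootstrap sample.
- Evaluate that model on both the bootstrap sample and the original dataset.
- Take the difference between the two as the optimism.
- Subtract the mean optimism from the resubstitution accuracy.

The key formula is: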

$$\hat{\text{Acc}}_{\text{opt-corr}} = \hat{\text{Acc}}_{\text{resub}} - \frac{1}{B}\sum_{b=1}^{B} o_b$$

Optimism-corrected accuracy is a metric that takes
the accuracy that looks too good on the training data
and corrects it by the amount of optimism estimated with the bootstrap.

Related posts

2024.06.05 - [.../Math] - [ML] Bootstrap Sampling


2024.06.20 - [.../Math] - [ML] Out of Bag: 유도하기. (Out of Bag: Derivation)

