[교육봉사] 회귀 모델 기초 이론

250x250

Notice

Recent Posts

Recent Comments

Link

« 2025/04 »
일	월	화	수	목	금	토
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Tags more

Archives

Today

Total

관리 메뉴

운동하는 공대생

[교육봉사] 회귀 모델 기초 이론 본문

카테고리 없음

[교육봉사] 회귀 모델 기초 이론

운동하는 공대생 2023. 11. 9. 22:18

728x90

1. 회귀 모델

회귀 모델이란 이전에서 설명했던 분류 모델과는 다르게 범주형 데이터를 예측하는 게 아니라 연속값을 예측하는 방식입니다.

모델을 구성하는 파라미터를 최적화 하기 위한 방식은 무엇이 있을까?

2. 회귀 모델 학습

Objective function

MSE(Mean Squared Error)

RMSE(Mean Squared Error), MAE(Mean Squared Error) 등등...

Optimization

Gradient Descent(경사 하강법)

3. 실습

LinearRegression을 이용해 보스턴 주택 가격 회귀 구현

page 324 ~ 327

~~(오늘은 코드 안써준다 ^^ 책 보고 하자)~~

다항 회귀

page 330 ~ 332

과적합

from sklearn.preprocessing import PolynomialFeatures
import numpy as np
x = np.array([1,2,3,4]).reshape(2,2)

# 2차 다항식 계수
poly2 = PolynomialFeatures(degree=2)
poly2.fit(x)
poly_ftr_2 = poly2.transform(x)

# 3차 다항식 계수
poly3 = PolynomialFeatures(degree=3)
poly_ftr_3 = poly3.fit_transform(x)

print(f"2차 다항식 계수")
print(poly_ftr_2)
print(f"3차 다항식 계수")
print(poly_ftr_3)

from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
# y = 1 + 2X_1 + 3X_1^2 + 4X_2^3 데이터 생성
def polynomial_func(X):
    y = 1 + 2*X[:,0] + 3*X[:,0]**2 + 4*X[:,1]**3
    return y

X = np.arange(4).reshape(-2,2)
y = polynomial_func(X)

# Pipeline으로 Polynomial변환, Linear Regression 연결
model = Pipeline( [('poly', PolynomialFeatures(3)),
                   ('linear', LinearRegression())] )

model = model.fit(X, y)
print('Polynomial 절편')
print(np.round(model.named_steps.linear.intercept_, 2))
print('Polynomial 회귀 계수')
print(np.round(model.named_steps.linear.coef_, 2))

# X: 0 ~ 1값의 30개 랜덤샘플
np.random.seed(0)
n = 30
X = np.sort(np.random.rand(n))

# cosine 값 함수
def make_y(X):
    y = np.cos(1.5 * np.pi * X)
    return y

# y: cosine 값 + Noise
noise = np.random.randn(n)*0.1
y = make_y(X) + noise

import matplotlib.pyplot as plt
from sklearn.model_selection import cross_val_score

fig, axs = plt.subplots(1,3, figsize=(15,8))

degrees = [1, 4, 15]

# 다항 회귀 차수별 비교
for i, degree in enumerate(degrees):
    # plt.setp(axs[i], xticks=(), yticks=())
    
    # degree별 다항식 학습
    poly_features = PolynomialFeatures(degree = degree, include_bias=False)
    lr = LinearRegression()
    
    model = Pipeline([("polynomial_features", poly_features),
                         ("linear_regression", lr)])
    model.fit(X.reshape(-1, 1), y)
    
    # 교차 검증
    mse_scores = cross_val_score(model, X.reshape(-1,1), y, scoring="neg_mean_squared_error", cv=10)
    coefficients = model.named_steps['linear_regression'].coef_
    mse = -np.mean(mse_scores)
    
    print(f'\nDegree={degree} 회귀 계수는 {np.round(coefficients,2)} 입니다.')
    print(f'Degree={degree} MSE 는 {mse:.2f} 입니다.')
    
    # 적합된 회귀식으로 test 데이터에 대해 예측 곡선, 실제 곡선 비교
    X_test = np.linspace(0, 1, 100).reshape(-1,1)
    y_test = make_y(X_test)
    y_pred = model.predict(X_test)
    
    # 학습 예측값
    axs[i].plot(X, model.predict(X.reshape(-1,1)), "r", label = "Train Prediction")
    
    # test 예측값
    # axs[i].plot(X_test, y_pred, "r", label = "Test Prediction")
    
    # 실제값
    axs[i].plot(X_test, y_test, 'y--', label = "True function") # Noise: X
    axs[i].scatter(X, y, edgecolor='b', s=20, label = "Samples") # Noise: O
    
    # 기타 설정
    axs[i].set_xlabel("x"); axs[i].set_ylabel("y")
    axs[i].set_xlim((0, 1)); axs[i].set_ylim((-2, 2))
    axs[i].legend(loc="best")
    axs[i].set_title(f"Degree {degree}\nMSE = {mse:.2f} (S.E: {mse_scores.std():.2f})")

plt.show()

4. 최종 실습

자전거 대여 수요 예측

728x90

저작자표시

Comments

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

운동하는 공대생

운동하는 공대생

[교육봉사] 회귀 모델 기초 이론 본문

[교육봉사] 회귀 모델 기초 이론

1. 회귀 모델

2. 회귀 모델 학습

Objective function

Optimization

3. 실습

4. 최종 실습

티스토리툴바

개인정보

단축키

내 블로그

블로그 게시글

모든 영역