The relationship between the correlation coefficient and the coefficient of determination in simple linear regression

 

1) $ r(x, y) $

 $ r(x, y) = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $
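The definition above can be computed directly. The following is a minimal sketch in plain Python; the function name `pearson_r` and the five-point sample data are illustrative assumptions, not part of the original post.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed term by term from the definition."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square roots of the two sums of squared deviations.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))

# Hypothetical sample data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r_xy = pearson_r(x, y)
print(r_xy)
```

For this sample the numerator is 6 and the two sums of squares are 10 and 6, so the result equals $ 6/\sqrt{60} $.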

 

2) $ r(\hat{y}, y) $

 $ r(\hat{y}, y) = \frac{\sum_{i=1}^{n}(\hat{y}_i-\bar{y})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\sum_{i=1}^{n}(\hat{y}_i-\bar{y})(y_i-\hat{y}_i+\hat{y}_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\sum_{i=1}^{n}(\hat{y}_i y_i-\bar{y} \hat{y}_i-\bar{y} y_i+\bar{y}^2)}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\sum_{i=1}^{n}[(y_i - \hat{y}_i)(\hat{y}_i-\bar{y})+(\hat{y}_i - \bar{y})^2]}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $
 $ \qquad \quad \text{Meanwhile, the decomposition} $
 $ \qquad \quad SST = SSR+SSE $
 $ \qquad \quad \text{can be expanded as} $
 $ \qquad \quad \sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n}((y_i - \hat{y}_i) + (\hat{y}_i - \bar{y}))^2 $ 
 $ \qquad \quad \qquad \qquad \qquad = \sum_{i=1}^{n}[e_i + (\hat{y}_i-\bar{y})]^2, \quad \text{where } e_i = y_i-\hat{y}_i $  
 $ \qquad \quad \qquad \qquad \qquad = \sum_{i=1}^{n}e_{i}^{2}+2\sum_{i=1}^{n}e_i(\hat{y}_i-\bar{y})+\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2 $  
 $ \qquad \quad \qquad \qquad \qquad = SSE+2\sum_{i=1}^{n}e_i(\hat{y}_i-\bar{y})+SSR $  
 $ \qquad \quad \qquad \qquad \qquad \therefore \sum_{i=1}^{n}e_i(\hat{y}_i-\bar{y}) = 0, \text{ so the first term in the numerator vanishes:} $ 
 $ \qquad \quad = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \sqrt{\frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $
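The two facts used in this step can be checked numerically: the cross term $ \sum e_i(\hat{y}_i-\bar{y}) $ is zero, and $ r(\hat{y}, y) $ equals $ \sqrt{SSR/SST} $. The sketch below assumes an ordinary least-squares fit with an intercept; the helper `fit_line` and the sample data are hypothetical names and values introduced only for illustration.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical sample data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]
y_bar = sum(y) / len(y)

# Cross term: residuals times deviations of the fitted values.
cross = sum((yi - yh) * (yh - y_bar) for yi, yh in zip(y, y_hat))

# SSR is the regression sum of squares, SST the total sum of squares.
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sst = sum((yi - y_bar) ** 2 for yi in y)

# r(y_hat, y) computed from its definition.
num = sum((yh - y_bar) * (yi - y_bar) for yh, yi in zip(y_hat, y))
r_yhat_y = num / (math.sqrt(ssr) * math.sqrt(sst))

print(cross, r_yhat_y, math.sqrt(ssr / sst))
```

Up to floating-point rounding, `cross` should be zero and the last two printed values should agree.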

 

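3) $ r(x, y) = r(\hat{y}, y) $ (when $ \beta_1 > 0 $)

 $ \hat{y}_i = \beta_0 + \beta_1 x_i $, where $ \beta_0 $ and $ \beta_1 $ are the fitted least-squares coefficients. Assume $ \beta_1 > 0 $. For $ \beta_1 < 0 $, the factor $ \sqrt{1/\beta_1^2} = 1/|\beta_1| $ in the denominator below flips the sign and gives $ r(x, y) = -r(\hat{y}, y) $; the squared identity at the end holds in both cases.

 $ x_i = \frac{\hat{y}_i - \beta_0}{\beta_1} $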

 $ \bar x = \frac{1}{n}\sum_{i=1}^{n}\frac{\hat{y}_i - \beta_0}{\beta_1} $

 $ \quad = \frac{1}{n \beta_1}(\sum_{i=1}^{n}\hat{y}_i - n\beta_0) $

 $ \quad = \frac{1}{n \beta_1}\sum_{i=1}^{n}\hat{y}_i - \frac{\beta_0}{\beta_1} $

 $ r(x, y) = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\sum_{i=1}^{n}(\frac{\hat{y}_i - \beta_0}{\beta_1}-\frac{1}{n \beta_1}\sum_{i=1}^{n}\hat{y}_i + \frac{\beta_0}{\beta_1})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(\frac{\hat{y}_i - \beta_0}{\beta_1}-\frac{1}{n \beta_1}\sum_{i=1}^{n}\hat{y}_i + \frac{\beta_0}{\beta_1})^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\frac{1}{\beta_1}\sum_{i=1}^{n}(\hat{y}_i - \beta_0-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i + \beta_0)(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}\frac{1}{\beta_1^2}(\hat{y}_i - \beta_0-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i + \beta_0)^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\frac{1}{\beta_1}\sum_{i=1}^{n}(\hat{y}_i - \beta_0-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i + \beta_0)(y_i-\bar{y})}{\frac{1}{\beta_1}\sqrt{\sum_{i=1}^{n}(\hat{y}_i - \beta_0-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i + \beta_0)^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\sum_{i=1}^{n}(\hat{y}_i - \beta_0-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i + \beta_0)(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i - \beta_0-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i + \beta_0)^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = \frac{\sum_{i=1}^{n}(\hat{y}_i-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i)(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i-\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i)^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $
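$ \qquad \quad \text{Here, } y_i = \hat y_i + e_i $
$ \qquad \quad \sum_{i=1}^n y_i = \sum_{i=1}^n \hat y_i + \sum_{i=1}^n e_i $
$ \qquad \quad \sum_{i=1}^n y_i = \sum_{i=1}^n \hat y_i \; \because $ the residuals sum to zero when the model includes an intercept, so $ \frac{1}{n}\sum_{i=1}^n \hat y_i = \bar y $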

 $ \qquad \quad = \frac{\sum_{i=1}^{n}(\hat{y}_i-\bar y)(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(\hat{y}_i - \bar y)^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 $ \qquad \quad = r(\hat y, y) = \sqrt{\frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $

 

 $ \therefore r(x, y)^2 = r(\hat y, y)^2 = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} = \frac {SSR}{SST} = R^2 $ (the equality $ r(x, y)^2 = R^2 $ holds only in simple linear regression)
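As a quick numerical check of the final identity, the sketch below compares $ r(x, y)^2 $ with $ R^2 = SSR/SST $ for one data set with a positive slope and one with a negative slope. With a negative slope $ r(x, y) $ is negative, so only the squared values agree. The helper names `pearson_r` and `r_squared` and the data are assumptions made for illustration.

```python
import math

def pearson_r(a, b):
    """Pearson correlation from the definition."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / (math.sqrt(saa) * math.sqrt(sbb))

def r_squared(x, y):
    """R^2 = SSR / SST for a least-squares line with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return ssr / sst

# Hypothetical sample data: one upward trend, one downward trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 5.0, 4.0, 5.0]
y_down = [5.0, 4.0, 5.0, 4.0, 2.0]

r_pos, r2_pos = pearson_r(x, y_up), r_squared(x, y_up)
r_neg, r2_neg = pearson_r(x, y_down), r_squared(x, y_down)

print(r_pos ** 2, r2_pos)   # squared correlation vs. R^2, positive slope
print(r_neg ** 2, r2_neg)   # squared correlation vs. R^2, negative slope
```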
