OLS coefficients
Goldman Sachs
Derive the coefficient estimate formulas for OLS linear regression
Answer
Recall that linear regression models the target variable $Y$ as $Y = X\beta + \epsilon$, where $X$ is the matrix of explanatory variables, $\beta$ is the vector of coefficients, and $\epsilon$ is a zero-mean Gaussian noise vector. We find $\beta$ by minimising the residual sum of squares, i.e. the sum of squared differences between the observed and predicted target values, $RSS = \left\lVert Y - X\beta \right\rVert^2$.

Expanding this, we get $$RSS = (Y - X\beta)^T(Y - X\beta) = Y^T Y - 2\beta^T X^T Y + \beta^T X^T X \beta$$ where the two cross terms combine because $Y^T X \beta$ is a scalar and therefore equals its transpose $\beta^T X^T Y$.

Taking the derivative with respect to $\beta$ and setting it to zero, we get $$-2X^T Y + 2X^T X \beta = 0$$ This simplifies to the normal equations: $$X^T X \beta = X^T Y$$ Assuming $X^T X$ is invertible (i.e. $X$ has full column rank), solving for $\beta$ gives the OLS estimator: $$\hat{\beta} = (X^T X)^{-1} X^T Y$$
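As a quick numerical check of the closed-form estimator, here is a minimal sketch in Python with NumPy. The function name `ols_normal_equations`, the synthetic data, and the chosen `true_beta` are illustrative assumptions, not part of the original question. The sketch solves the system $X^T X \beta = X^T Y$ directly instead of forming the inverse explicitly, and compares the result with NumPy's least-squares routine.

```python
import numpy as np


def ols_normal_equations(X, y):
    """Return beta_hat solving X^T X beta = X^T y.

    Hypothetical helper for illustration; assumes X has full column rank
    so that X^T X is invertible.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    # Solving the linear system avoids computing (X^T X)^{-1} explicitly.
    return np.linalg.solve(XtX, Xty)


# Synthetic data (assumed for the example): intercept plus two features.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([1.0, 2.0, -3.0])
y = X @ true_beta + rng.normal(scale=0.1, size=n)

beta_hat = ols_normal_equations(X, y)

# Reference solution from NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the minimiser the gradient -2 X^T (y - X beta) vanishes,
# so X^T times the residuals should be numerically zero.
residual_gradient = X.T @ (y - X @ beta_hat)

print("normal equations:", beta_hat)
print("lstsq:           ", beta_lstsq)
```

In practice, solvers based on QR or SVD (as used by `np.linalg.lstsq`) tend to be preferred over the normal equations when $X^T X$ is ill-conditioned, but on well-conditioned data like this the two should agree closely.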