* Passes through the two sample points
* Ignores the rest of the target function, Y
* Two problems:
  * Need a higher-degree polynomial
  * Need more data if we add noise
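
A minimal sketch of the point above, assuming a sine-shaped target (the function `target` and the two sample locations are illustrative choices, not taken from these slides). A line has two parameters, so it fits two samples exactly while missing the target everywhere else:

```python
import numpy as np

def target(x):
    # Hypothetical target function, chosen only for illustration.
    return np.sin(np.pi * x)

# Two sample points drawn from the target (locations are arbitrary).
x_train = np.array([-0.5, 0.8])
y_train = target(x_train)

# A degree-1 polynomial has two parameters, so it interpolates two points.
beta_1, beta_0 = np.polyfit(x_train, y_train, deg=1)

# Compare the error on the two samples with the error over the whole interval.
x_grid = np.linspace(-1.0, 1.0, 201)
train_mse = np.mean((y_train - (beta_0 + beta_1 * x_train)) ** 2)
grid_mse = np.mean((target(x_grid) - (beta_0 + beta_1 * x_grid)) ** 2)

print(f"MSE on the two samples: {train_mse:.2e}")
print(f"MSE on the full grid:   {grid_mse:.2e}")
```

The error on the two samples is zero up to floating-point rounding, while the error over the grid is not, which is what "ignores the rest of the target function" means in practice.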
---
## Justification
* Let's justify adding more data and minimizing MSE
* $MSE = \frac{1}{n}\sum_{i=1}^{n}(Y_i - (\beta_0 + \beta_{1}X_i))^2$
* This is a sample-based estimate of the true (expected) error
* So as $n \rightarrow \infty$, our error estimate converges to the true error
* But it is important to remember that our error is always an estimate!
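
A small simulation can illustrate the convergence claim. This is a sketch under stated assumptions: the coefficients, the noise level, and the helper `sample_mse` are all hypothetical. The line is held fixed at the generating coefficients, so its true expected squared error equals the noise variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed data-generating process (illustrative values only).
BETA_0, BETA_1, NOISE_SD = 1.0, 2.0, 0.5

def sample_mse(n):
    """MSE of the fixed line BETA_0 + BETA_1 * X on n fresh noisy samples."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = BETA_0 + BETA_1 * x + rng.normal(0.0, NOISE_SD, size=n)
    return np.mean((y - (BETA_0 + BETA_1 * x)) ** 2)

# Expected squared error of this line is the noise variance.
true_error = NOISE_SD ** 2

estimates = {n: sample_mse(n) for n in (10, 100, 10_000, 1_000_000)}
for n, mse in estimates.items():
    print(f"n={n:>9,}  MSE={mse:.4f}  |MSE - true|={abs(mse - true_error):.4f}")
```

With small `n` the estimate can land noticeably above or below the true error; the gap tends to shrink as `n` grows, but it is still an estimate at every finite `n`.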
---
## Samples > Parameters
* Let's rewrite our equation with matrices