* This model is a function of two parameters, $\phi_0$ and $\phi_1$
* We can write the output compactly as $y = f[x, \phi]$, where $\phi$ collects the parameters $\phi_0$ and $\phi_1$
* Given vectors of labels, $Y$, and training inputs $X$, we **train** the model by finding "good" values of $\phi$
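
As a minimal sketch of the notation $y = f[x, \phi]$, the linear model can be written as a plain function of the input and a parameter vector. The name `f`, the use of NumPy, and the parameter and input values are illustrative assumptions, not part of the slides.

```python
import numpy as np

def f(x, phi):
    """1D linear model: phi[0] is the intercept, phi[1] is the slope."""
    return phi[0] + phi[1] * x

# Illustrative values only (assumed for this sketch).
phi = np.array([1.0, 2.0])
x = np.array([0.0, 1.0, 2.0])

y_pred = f(x, phi)  # predictions at each input: [1., 3., 5.]
```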
---
## Defining "Good"
* Rather than making a "goodness" metric, we generally do the opposite and define a **loss function**
* For regression, a common choice is the sum of the squared errors over all $I$ training points
* $L[\phi] = \sum_{i=1}^I(f[x_i,\phi] - y_i)^2$
* This is called the *least-squares loss*
* For our linear regression example, this is
* $L[\phi] = \sum_{i=1}^I(\phi_0 + \phi_1 x_i - y_i)^2$
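
The least-squares loss for the linear model can be sketched in a few lines of NumPy. The function name `least_squares_loss` and the toy data are assumptions made for illustration; the data are generated from $y = 1 + 2x$, so the loss is zero at $\phi = [1, 2]$ and positive elsewhere.

```python
import numpy as np

def least_squares_loss(phi, x, y):
    """Sum over training points of (phi_0 + phi_1 * x_i - y_i)^2."""
    residuals = phi[0] + phi[1] * x - y
    return np.sum(residuals ** 2)

# Toy training set (assumed for this sketch), generated from y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

loss_exact = least_squares_loss(np.array([1.0, 2.0]), x, y)  # 0.0: perfect fit
loss_off = least_squares_loss(np.array([0.0, 2.0]), x, y)    # 3.0: each residual is -1
```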
---
## Loss Examples