Linear Regression
- Regression
A regression function is the estimation of the conditional expectation of the target given the features, $r(x) = \mathbb{E}[Y \mid X = x]$.
- Proposition
The regression function minimizes the quadratic risk: $r \in \operatorname{arg\,min}_{f} \mathbb{E}\big[(Y - f(X))^2\big]$.
- Proof
For any $f$, conditioning on $X$ makes the cross term vanish, so $\mathbb{E}\big[(Y - f(X))^2\big] = \mathbb{E}\big[(Y - r(X))^2\big] + \mathbb{E}\big[(r(X) - f(X))^2\big]$. So the minimum is obtained for $f = r$.
- Linear Regression
Suppose that the target follows $Y = X^\top \beta + \varepsilon$ with:
- linear expectation in $x$: $\mathbb{E}[Y \mid X = x] = x^\top \beta$
- centered error: $\mathbb{E}[\varepsilon] = 0$
- variance of the error is constant: $\operatorname{Var}(\varepsilon_i) = \sigma^2$
- independence of the errors: $\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$
The hypothesis class is $\mathcal{F} = \{x \mapsto x^\top \beta : \beta \in \mathbb{R}^p\}$. We can also denote $f_\beta(x) = x^\top \beta$.
- Gaussian Linear Regression
Suppose that the target follows $Y = X^\top \beta + \varepsilon$ with:
- linear expectation in $x$: $\mathbb{E}[Y \mid X = x] = x^\top \beta$
- errors follow $\varepsilon_i \overset{iid}{\sim} \mathcal{N}(0, \sigma^2)$
Above, $X$ is a random vector. Below, we will use the experimental plan (design matrix) defined as $\mathbb{X} = (x_1, \dots, x_n)^\top \in \mathbb{R}^{n \times p}$, whose rows are the $n$ observed feature vectors.
The associated model is given by $Y = \mathbb{X}\beta + \varepsilon$, with $Y, \varepsilon \in \mathbb{R}^n$.
- Identifiable
The model is identifiable iff $\mathbb{X}$ is full rank, or $\mathbb{X}^\top \mathbb{X}$ is invertible, or $\mathbb{X}$ is injective (as a linear map), or the columns of $\mathbb{X}$ are linearly independent.
When should you do a linear regression? For each (feature, target) couple, draw a scatter plot, and if you see a linear correlation with the target for a lot of features, bingo! Be careful: if two features are "strongly" correlated, you can drop one of them.
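The check above can be sketched numerically with correlation coefficients instead of eyeballing scatter plots. A minimal sketch with numpy, on synthetic data invented for the illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative): two strongly correlated features and a target.
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=200)   # nearly a copy of x1
y = 3.0 * x1 + rng.normal(scale=0.5, size=200)

# Correlation of a feature with the target: high -> linear regression is a good candidate.
print(np.corrcoef(x1, y)[0, 1])
# Correlation between two features: close to 1 -> consider dropping one of them.
print(np.corrcoef(x1, x2)[0, 1])
```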
- Intercept
We can add a feature equal to $1$ for each data point because, for now, the fitted line goes through the origin. With the intercept we add a new parameter $\beta_0$ that represents the height of the line at the origin.
- Tips
Normalizing the features is not mandatory, but it can be interesting if you want to compare the entries of $\hat\beta$. It can also help with numerical stability.
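A minimal sketch of such a normalization (standardization: zero mean, unit variance per feature), on made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Two features on very different scales (illustrative values).
X = rng.normal(loc=[10.0, -2.0], scale=[5.0, 0.1], size=(100, 2))

# Standardize each column: subtract its mean, divide by its standard deviation.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# The entries of a beta-hat fitted on X_std are now on comparable scales.
print(X_std.mean(axis=0))  # ~0 per column
print(X_std.std(axis=0))   # ~1 per column
```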
Estimation of β
You have two ways to estimate $\beta$: from the residual sum of squares or from the likelihood. They both give the same estimator of $\beta$ and two estimators of $\sigma^2$ that differ only by a multiplicative factor. Keep in mind that the first one works for any linear regression while the second one needs the Gaussian assumption.
- Residual sum of squares
The residual sum of squares is the empirical risk defined as $\operatorname{RSS}(\beta) = \|Y - \mathbb{X}\beta\|^2 = \sum_{i=1}^n (y_i - x_i^\top \beta)^2$.
We want to find the learning rule that minimizes this risk: $\hat\beta \in \operatorname{arg\,min}_{\beta \in \mathbb{R}^p} \operatorname{RSS}(\beta)$.
- Residual sum of squares Estimator
$\hat\beta_{RSS} \in \operatorname{arg\,min}_{\beta \in \mathbb{R}^p} \|Y - \mathbb{X}\beta\|^2$
- Maximum Likelihood Estimator
$\hat\beta_{MLE} \in \operatorname{arg\,max}_{\beta \in \mathbb{R}^p} \log L(\beta, \sigma^2 ; Y)$
- Proposition
If the model is identifiable, $\hat\beta_{RSS} = \hat\beta_{MLE} = (\mathbb{X}^\top \mathbb{X})^{-1} \mathbb{X}^\top Y$.
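This closed form can be checked numerically. A minimal sketch with numpy on synthetic data (the design matrix includes an intercept column of ones, as discussed above); `np.linalg.lstsq` computes the same least-squares solution by a more stable route:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 2
# Design matrix: intercept column of ones, then p random features.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([1.0, 2.0, -3.0])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Closed form: beta_hat = (X^T X)^{-1} X^T y (valid since X has full column rank).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same estimator via least squares, numerically more stable than the normal equations.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In practice, prefer `lstsq` (or a QR decomposition) over explicitly forming $\mathbb{X}^\top \mathbb{X}$, which squares the condition number.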
- Proof RSS
The gradient is $\nabla_\beta \operatorname{RSS}(\beta) = -2\,\mathbb{X}^\top (Y - \mathbb{X}\beta)$, so setting it to zero gives the normal equations $\mathbb{X}^\top \mathbb{X}\,\beta = \mathbb{X}^\top Y$. The inverse exists thanks to the identifiability! so $\hat\beta = (\mathbb{X}^\top \mathbb{X})^{-1} \mathbb{X}^\top Y$.
Check if it is the minimum: the Hessian $2\,\mathbb{X}^\top \mathbb{X}$ is positive definite. OK!
- Proof MLE
We suppose $\varepsilon \sim \mathcal{N}(0, \sigma^2 I_n)$, so $Y \sim \mathcal{N}(\mathbb{X}\beta, \sigma^2 I_n)$ and the log-likelihood is $\ell(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{\|Y - \mathbb{X}\beta\|^2}{2\sigma^2}$. Maximizing in $\beta$ amounts to minimizing $\|Y - \mathbb{X}\beta\|^2$, so $\hat\beta_{MLE} = \hat\beta_{RSS}$.
Finally, we have a Hessian of $\ell$ that is negative definite, so it is indeed a maximum. OK!
- Proposition
$\mathbb{E}[\hat\beta] = \beta$ and $\operatorname{Var}(\hat\beta) = \sigma^2 (\mathbb{X}^\top \mathbb{X})^{-1}$, and this is the minimal variance possible for an unbiased linear estimator (Gauss–Markov theorem).
- Proof of the variance equality
$\operatorname{Var}(\hat\beta) = (\mathbb{X}^\top \mathbb{X})^{-1}\mathbb{X}^\top \operatorname{Var}(Y)\, \mathbb{X}(\mathbb{X}^\top \mathbb{X})^{-1}$ and $\operatorname{Var}(Y) = \sigma^2 I_n$. So, $\operatorname{Var}(\hat\beta) = \sigma^2 (\mathbb{X}^\top \mathbb{X})^{-1}$.
- Hat Matrix
The hat matrix is defined as the orthogonal projection on $\operatorname{Im}(\mathbb{X})$: $H = \mathbb{X}(\mathbb{X}^\top \mathbb{X})^{-1}\mathbb{X}^\top$, so that $\hat Y = \mathbb{X}\hat\beta = HY$.
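The defining properties of an orthogonal projection (symmetric, idempotent, trace equal to the rank $p$) can be verified numerically. A minimal sketch with numpy on a random full-rank design matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 3))  # n = 30 observations, p = 3 features, full rank a.s.

# Hat matrix H = X (X^T X)^{-1} X^T, the orthogonal projection onto Im(X).
H = X @ np.linalg.solve(X.T @ X, X.T)

print(np.allclose(H, H.T))         # symmetric
print(np.allclose(H @ H, H))       # idempotent: projecting twice changes nothing
print(np.isclose(np.trace(H), 3))  # trace = rank(X) = p
```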
- Proposition
If the model is identifiable, $\hat\sigma^2 = \frac{\|Y - \mathbb{X}\hat\beta\|^2}{n - p}$ is an unbiased estimator of $\sigma^2$.
If the model is identifiable and Gaussian, $\hat\sigma^2_{MLE} = \frac{\|Y - \mathbb{X}\hat\beta\|^2}{n}$ is a biased estimator of $\sigma^2$.
- Proof RSS
The residuals satisfy $Y - \mathbb{X}\hat\beta = (I_n - H)\varepsilon$, so $\mathbb{E}\big[\|Y - \mathbb{X}\hat\beta\|^2\big] = \sigma^2 \operatorname{tr}(I_n - H) = \sigma^2 (n - p)$. Hence dividing by $n - p$ gives an unbiased estimator.
- Proof MLE
We suppose $Y \sim \mathcal{N}(\mathbb{X}\beta, \sigma^2 I_n)$, so setting $\frac{\partial \ell}{\partial \sigma^2} = 0$ gives $\hat\sigma^2_{MLE} = \frac{\|Y - \mathbb{X}\hat\beta\|^2}{n}$.
Finally, we have $\hat\beta_{MLE} = \hat\beta_{RSS}$ and $\hat\sigma^2_{MLE} = \frac{n-p}{n}\,\hat\sigma^2$, and a Hessian that is negative definite. OK! Since $\mathbb{E}[\hat\sigma^2_{MLE}] = \frac{n-p}{n}\sigma^2 \neq \sigma^2$, this estimator is biased.
- Conclusion
For an identifiable linear model,
$\hat\beta = (\mathbb{X}^\top \mathbb{X})^{-1}\mathbb{X}^\top Y$
and
$\hat\sigma^2 = \frac{\|Y - \mathbb{X}\hat\beta\|^2}{n - p}$.
We don't have any information on their laws.
- Conclusion
For an identifiable Gaussian linear model, you have
$\hat\beta = (\mathbb{X}^\top \mathbb{X})^{-1}\mathbb{X}^\top Y \sim \mathcal{N}\big(\beta, \sigma^2 (\mathbb{X}^\top \mathbb{X})^{-1}\big)$
and you can choose between
$\hat\sigma^2 = \frac{\|Y - \mathbb{X}\hat\beta\|^2}{n - p}$ (unbiased)
or
$\hat\sigma^2_{MLE} = \frac{\|Y - \mathbb{X}\hat\beta\|^2}{n}$ (biased),
but in any case you have $\frac{\|Y - \mathbb{X}\hat\beta\|^2}{\sigma^2} \sim \chi^2_{n - p}$, independently of $\hat\beta$.
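The two estimators of $\sigma^2$ and their relation $\hat\sigma^2_{MLE} = \frac{n-p}{n}\,\hat\sigma^2$ can be sketched with numpy on synthetic Gaussian data (the true $\beta$ and $\sigma$ below are made up for the illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 40, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, -1.0, 0.5])
sigma = 2.0
y = X @ beta + rng.normal(scale=sigma, size=n)

# OLS fit and residual sum of squares.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
rss = np.sum((y - X @ beta_hat) ** 2)

sigma2_unbiased = rss / (n - p)  # unbiased under the linear model
sigma2_mle = rss / n             # MLE under the Gaussian model, biased downward

# The two differ only by the factor (n - p) / n.
print(np.isclose(sigma2_mle, (n - p) / n * sigma2_unbiased))  # True
```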