# Residual Analysis
As we have seen, the difference between the observed value of the dependent variable (y) and the predicted value (ŷ) is called the residual (e) or random error. Each data point has one residual or error.
Residual analysis is a procedure that helps us understand how our model's errors are distributed. It allows us to:
- check whether the model assumptions are met;
- generate ideas for improving the model, or for choosing a better one than the one we are using.
In other words, residual analysis assesses whether a linear regression model is appropriate by defining the residuals and examining residual plots.
This is done a posteriori, that is, after our linear model has been defined and fitted.
It consists of plotting the residuals, i.e., the differences between the values our model predicts and the actual values observed in our sample.
It can be done in several ways, and here I am going to show you what I consider to be the simplest in Python.
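As a minimal sketch of the idea, the snippet below fits a simple linear model with NumPy, computes the residuals as observed minus predicted values, and indicates (in comments) how the residual plot would typically be drawn with matplotlib. The data here is synthetic and purely illustrative; it is an assumption, not part of the original article.

```python
import numpy as np

# Synthetic sample data (illustrative assumption: a noisy linear relationship)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

# Fit a simple linear model y_hat = b1 * x + b0 by least squares
b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b1 * x + b0

# Residuals (e): observed value minus predicted value, one per data point
residuals = y - y_hat

# For the residual plot itself, matplotlib is a common choice, e.g.:
#   import matplotlib.pyplot as plt
#   plt.scatter(y_hat, residuals)                  # residuals vs. predictions
#   plt.axhline(0, color="red", linestyle="--")    # reference line at e = 0
#   plt.xlabel("Predicted values")
#   plt.ylabel("Residuals")
#   plt.show()

# With an intercept in the model, least-squares residuals average to ~0
print(abs(float(residuals.mean())) < 1e-9)
```

If the model assumptions hold, the points in the residual plot should scatter randomly around the zero line, with no visible pattern or funnel shape.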