Residuals: Differences Between Observed and Predicted Values

August 31, 2024 3 min read Mathematics Statistics Residuals Observed Values Predicted Values Regression Analysis Statistical Model

A comprehensive guide on residuals, explaining their significance in statistical models, the calculation methods, types, and applications in various fields such as economics and finance.

On this page

Residuals are the differences between observed values and predicted values in a statistical model. They play a crucial role in assessing the accuracy and reliability of the model.

Definition§

Residuals are defined mathematically as:

\text{Residual} (e_i) = y_i - \hat{y}_i

where:

$y_i$ represents the observed value
$\hat{y}_i$ represents the predicted value

Types of Residuals§

Raw Residuals§

Raw residuals are the simplest form, calculated as the difference between observed and predicted values.

Standardized Residuals§

Standardized residuals are raw residuals divided by an estimate of their standard deviation.

Studentized Residuals§

Studentized residuals further adjust standardized residuals by taking into account the leverage of individual observations.

Special Considerations§

Homoscedasticity§

Residuals should exhibit constant variance for the model to be considered reliable.

Independence§

Residuals should be independent of each other to ensure the integrity of the model.

Examples§

Consider a simple linear regression model where you predict the weight of individuals based on their height. The residual for each individual is the difference between the observed weight and the predicted weight based on the regression line.

Numerical Example§

Given:

Observed weight ( $y_i$ ): 150 lbs
Predicted weight ( $\hat{y}_i$ ): 145 lbs

Residual ( $e_i$ ) = 150 - 145 = 5 lbs

Historical Context§

The concept of residuals has been integral to regression analysis since its inception, tracing back to the work of Sir Francis Galton in the 19th century.

Applicability§

Residuals are widely used in:

Economics: For evaluating the fit of economic models
Finance: For assessing the accuracy of predictive models in financial forecasting
Quality Control: For monitoring process performance and stability

Comparisons§

Residuals vs. Errors§

While often used interchangeably, residuals specifically refer to the discrepancies between observed and predicted values in a sample, while errors refer to the actual discrepancies in the population.

Regression Analysis: A set of statistical processes for estimating relationships among variables, where residuals are key indicators of model fit.
Leverage: A measure of the influence of an individual data point on the fit of the regression model.

FAQs§

What is the purpose of analyzing residuals?

Analyzing residuals helps assess the fit of a statistical model and identify any patterns that suggest model inadequacies.

How do residuals help in diagnosing a model?

Residuals can highlight issues like non-linearity, heteroscedasticity, and autocorrelation in a model.

References§

Galton, F. (1886). “Regression towards mediocrity in hereditary stature”. The Journal of the Anthropological Institute of Great Britain and Ireland.
Draper, N., & Smith, H. (1998). Applied Regression Analysis. Wiley.

Summary§

Residuals are fundamental in evaluating and diagnosing statistical models. By understanding the differences between observed and predicted values, statisticians and analysts can improve model accuracy and reliability across various fields.