Best Linear Unbiased Estimator (BLUE): Optimal Estimation in Linear Regression

An in-depth look at the Best Linear Unbiased Estimator (BLUE), its historical context, importance, and mathematical formulation in linear regression.

Introduction

The Best Linear Unbiased Estimator (BLUE) is a statistical concept pivotal in linear regression analysis. It describes an estimator with the minimum variance among all linear and unbiased estimators, satisfying the conditions outlined in the Gauss-Markov theorem.

Historical Context

The concept of BLUE was first introduced by the German mathematician Carl Friedrich Gauss and later formalized by the Russian mathematician Andrey Markov. The Gauss-Markov theorem, named after these pioneering mathematicians, states that the ordinary least squares (OLS) estimator is BLUE under certain conditions.

Types/Categories of Estimators

  • Linear Estimators: Estimators that can be written as linear functions of the observed data $\mathbf{y}$.
  • Unbiased Estimators: Estimators whose expected value equals the true value of the parameter being estimated.
  • Best Estimators: Estimators with the smallest variance within the class being compared (here, the linear unbiased estimators).

Key Events and Theorems

  • Gauss-Markov Theorem: Under the assumptions of the classical linear regression model (linearity in the parameters, zero-mean errors, constant error variance, and uncorrelated errors), the OLS estimator is BLUE.
  • OLS Method: Estimating the coefficients by minimizing the sum of squared residuals in linear regression.

Detailed Explanations

The estimator $\hat{\beta}$ is linear if it can be written as a linear function of the observed data, $\mathbf{y}$:

$$ \hat{\beta} = \mathbf{Ay} $$
where $\mathbf{A}$ is a matrix of constants that does not depend on $\mathbf{y}$.

The unbiasedness condition implies:

$$ E(\hat{\beta}) = \beta $$

To be BLUE, the estimator must have the smallest variance among all linear unbiased estimators; for a vector parameter, "smallest" is understood in the matrix sense, meaning $\text{Var}(\tilde{\beta}) - \text{Var}(\hat{\beta})$ is positive semidefinite:

$$ \text{Var}(\hat{\beta}) \leq \text{Var}(\tilde{\beta}) \text{ for any other linear, unbiased } \tilde{\beta} $$
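
A minimal simulation, using NumPy on made-up data, can make these conditions concrete: the matrix $\mathbf{A} = (\mathbf{X'X})^{-1}\mathbf{X'}$ is fixed, so the estimate is linear in $\mathbf{y}$, and averaging the estimates over many error draws approximately recovers the true $\beta$, illustrating unbiasedness. The design matrix, true coefficients, and seed below are arbitrary choices for illustration.

    import numpy as np

    # Sketch: check linearity and unbiasedness of OLS by simulation
    # (all data here are synthetic).
    rng = np.random.default_rng(0)

    n, k = 100, 3
    X = rng.normal(size=(n, k))          # fixed design matrix
    beta = np.array([1.0, -2.0, 0.5])    # true parameters (assumed)

    # The OLS "A" matrix: beta_hat = A @ y, with A = (X'X)^{-1} X'
    A = np.linalg.solve(X.T @ X, X.T)

    estimates = []
    for _ in range(5000):
        u = rng.normal(size=n)           # zero-mean, constant-variance errors
        y = X @ beta + u
        estimates.append(A @ y)          # linear in y: one matrix-vector product

    print("true beta:    ", beta)
    print("mean estimate:", np.mean(estimates, axis=0))  # approx. equal to beta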

Mathematical Formulas/Models

Given a linear regression model:

$$ \mathbf{y} = \mathbf{X\beta} + \mathbf{u} $$
where:

  • $\mathbf{y}$ is an $n \times 1$ vector of observations,
  • $\mathbf{X}$ is an $n \times k$ matrix of regressors,
  • $\beta$ is a $k \times 1$ vector of parameters,
  • $\mathbf{u}$ is an $n \times 1$ vector of errors.

Provided $\mathbf{X}$ has full column rank (so that $\mathbf{X'X}$ is invertible), the OLS estimator $\hat{\beta}$ is given by:

$$ \hat{\beta} = (\mathbf{X'X})^{-1} \mathbf{X'y} $$
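
A short sketch of this closed-form computation on synthetic data follows; the sample size, coefficients, and noise level are assumptions for illustration. Solving the normal equations with np.linalg.solve avoids forming the inverse explicitly, and np.linalg.lstsq serves as a numerically stable cross-check.

    import numpy as np

    # Sketch: compute the OLS estimator from the closed-form expression.
    rng = np.random.default_rng(42)

    n = 50
    X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=n)])  # intercept + one regressor
    beta = np.array([3.0, 1.5])                                    # true parameters (assumed)
    y = X @ beta + rng.normal(scale=2.0, size=n)

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y via a linear solve
    beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

    print(beta_hat)     # close to [3.0, 1.5]
    print(beta_lstsq)   # agrees up to floating-point error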

Charts and Diagrams in Mermaid Format

    graph TD
      A[Observed Data y] -->|Input| B[Linear Regression Model]
      B -->|Estimation| C[OLS Estimator]
      C -->|Optimal| D[BLUE]
      B -.-> E[Gauss-Markov Conditions]
      E -.-> D

Importance and Applicability

The concept of BLUE is essential in econometrics and statistics because it guarantees that the estimator recovers the parameters without systematic bias and does so with the least variability among linear unbiased alternatives, leading to more reliable and interpretable results in practice.

Examples

Consider a simple linear regression estimating the relationship between weight and height:

$$ \text{Weight} = \beta_0 + \beta_1 \times \text{Height} + u $$
The OLS estimator for $\beta_0$ and $\beta_1$ provides the BLUE under the assumptions of the Gauss-Markov theorem.
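
A quick illustration of this fit on simulated (not real) height and weight data; the “true” intercept of -50 and slope of 0.7 are invented for the example.

    import numpy as np

    # Sketch: fit Weight = b0 + b1 * Height by OLS on synthetic data.
    rng = np.random.default_rng(1)

    height = rng.uniform(150, 190, size=200)               # cm (simulated)
    weight = -50 + 0.7 * height + rng.normal(0, 5, 200)    # kg, with noise

    X = np.column_stack([np.ones_like(height), height])
    b0_hat, b1_hat = np.linalg.solve(X.T @ X, X.T @ weight)
    print(f"intercept: {b0_hat:.2f}, slope: {b1_hat:.2f}")  # near -50 and 0.7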

Considerations

  • The assumptions of the Gauss-Markov theorem must hold for the OLS estimator to be BLUE.
  • In the presence of heteroscedasticity or autocorrelation, OLS is no longer BLUE; a small simulation of this appears after this list.
  • Heteroscedasticity: The condition where the variance of the error terms varies across observations.
  • Autocorrelation: The condition where the error terms are correlated across different observations.
  • Consistent Estimator: An estimator that converges in probability to the true parameter value as the sample size increases.
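
The heteroscedasticity point can be demonstrated with a rough simulation, assuming the error variances are known so that weighted least squares is feasible. OLS remains unbiased, but its slope estimates vary more than the weighted estimator's, so it is no longer “best”. The setup below (error standard deviation growing with the regressor) is an arbitrary assumption for illustration.

    import numpy as np

    # Sketch: under heteroscedasticity, weighted least squares beats OLS
    # on variance, so OLS is no longer BLUE (synthetic setup).
    rng = np.random.default_rng(7)

    n = 200
    x = rng.uniform(1, 10, size=n)
    X = np.column_stack([np.ones(n), x])
    beta = np.array([2.0, 0.5])
    sigma = 0.5 * x                      # error std dev grows with x

    W = np.diag(1.0 / sigma**2)          # inverse-variance weights (assumed known)
    ols, wls = [], []
    for _ in range(3000):
        y = X @ beta + rng.normal(scale=sigma)
        ols.append(np.linalg.solve(X.T @ X, X.T @ y))
        wls.append(np.linalg.solve(X.T @ W @ X, X.T @ W @ y))

    print("OLS slope variance:", np.var([b[1] for b in ols]))
    print("WLS slope variance:", np.var([b[1] for b in wls]))  # smaller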

Comparisons

  • BLUE vs. Maximum Likelihood Estimators (MLE): MLE may be biased in finite samples but is consistent and asymptotically efficient under standard regularity conditions.
  • BLUE vs. Biased Estimators: Biased estimators may achieve lower variance (and sometimes lower mean squared error) than BLUE, but their expected value differs from the true parameter; see the sketch after this list.
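
To illustrate the second comparison, the sketch below uses ridge regression, a standard example of a biased linear estimator; the nearly collinear design and the penalty value of 5.0 are arbitrary assumptions for illustration.

    import numpy as np

    # Sketch: ridge regression trades a little bias for much lower variance
    # when regressors are nearly collinear (synthetic setup).
    rng = np.random.default_rng(3)

    n, lam = 60, 5.0
    x1 = rng.normal(size=n)
    x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear regressors
    X = np.column_stack([x1, x2])
    beta = np.array([1.0, 1.0])

    ols, ridge = [], []
    I = np.eye(2)
    for _ in range(3000):
        y = X @ beta + rng.normal(size=n)
        ols.append(np.linalg.solve(X.T @ X, X.T @ y))
        ridge.append(np.linalg.solve(X.T @ X + lam * I, X.T @ y))

    for name, est in [("OLS", ols), ("ridge", ridge)]:
        est = np.asarray(est)
        print(name, "mean:", est.mean(axis=0), "var:", est.var(axis=0))
    # OLS mean stays near (1, 1) (unbiased) but with larger variance;
    # the ridge mean is shrunk toward zero (biased) with smaller variance.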

Interesting Facts

  • The term “BLUE” was coined by Samuelson in 1942.
  • Gauss initially formulated the least squares method to predict planetary orbits.

Inspirational Stories

Karl Pearson’s development of the correlation coefficient paved the way for regression analysis, culminating in the establishment of BLUE.

Famous Quotes

“An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem.” - John Tukey

Proverbs and Clichés

“Measure twice, cut once.”

Expressions, Jargon, and Slang

  • Fit the Model: Deriving the estimated parameters using the given data.
  • Minimize Residuals: Reducing the differences between observed and predicted values.

FAQs

Q: What are the assumptions required for an estimator to be BLUE? A: The assumptions are linearity in the parameters, zero mean of the errors, constant error variance (homoscedasticity), and uncorrelated errors.

Q: Why is the OLS estimator called “ordinary”? A: The “ordinary” distinguishes it from variants such as weighted and generalized least squares: OLS minimizes the plain, unweighted sum of squared residuals.

References

  • Gauss, C. F. (1821). “Theory of the Combination of Observations Least Subject to Errors.”
  • Markov, A. A. (1913). “An Example of Statistical Investigation of the Text Eugene Onegin Concerning the Connection of Samples in Chains.”

Final Summary

The Best Linear Unbiased Estimator (BLUE) holds a crucial position in the field of statistics and econometrics, ensuring the most reliable and efficient estimation of linear regression parameters under the Gauss-Markov assumptions. Understanding and applying the concept of BLUE leads to better decision-making and interpretation of statistical models in various disciplines.
