Likelihood: The Probability of Observed Evidence Given an Event

August 31, 2024

A detailed exploration of likelihood, its mathematical foundation, applications, historical context, and more.

Historical Context§

The concept of likelihood is central to statistics. Its roots lie in the work of early probabilists such as Pierre-Simon Laplace, and it was formalized by Ronald Fisher, who introduced the term “likelihood” in the 1920s to distinguish it from probability. The two are related but answer different questions: probability describes how plausible an outcome is under a fixed model, while likelihood is used in statistical inference to quantify how well a given parameter value accounts for data that have already been observed.

Definition and Mathematical Explanation§

Likelihood, in simple terms, is the probability of observing the given data under a specific model. For observed evidence $B$ and an event or hypothesis $A$, it is the conditional probability $P(B|A)$, read as a function of $A$ with $B$ held fixed.

Mathematically, if we have a set of observed data $X$ and a parameter $\theta$ of a statistical model, the likelihood function $L(\theta|X)$ is defined as:

$$L(\theta|X) = P(X|\theta)$$

In Bayesian inference, the likelihood plays a crucial role in updating prior beliefs to form the posterior distribution using Bayes’ theorem:

$$P(\theta|X) = \frac{P(X|\theta) P(\theta)}{P(X)}$$
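
As a rough illustration of this update, the sketch below approximates the posterior on a grid of parameter values. The data (6 heads in 10 coin tosses), the uniform prior, and the names `binomial_likelihood` and `grid_posterior` are all assumptions made for the example; it is one simple way to apply the formula, not a general-purpose method.

```python
import math

def binomial_likelihood(p, heads, tosses):
    """P(X | theta): probability of `heads` successes in `tosses` trials when P(heads) = p."""
    return math.comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)

def grid_posterior(heads, tosses, grid_size=101):
    """Approximate P(theta | X) on an evenly spaced grid under a uniform prior (assumed)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0 / grid_size] * grid_size                # P(theta)
    unnormalized = [binomial_likelihood(p, heads, tosses) * pr
                    for p, pr in zip(grid, prior)]       # P(X | theta) * P(theta)
    evidence = sum(unnormalized)                         # P(X), approximated by a sum
    posterior = [u / evidence for u in unnormalized]     # P(theta | X)
    return grid, posterior

# Hypothetical data: 6 heads in 10 tosses.
grid, posterior = grid_posterior(heads=6, tosses=10)
mode = grid[posterior.index(max(posterior))]
print(f"posterior mode on the grid: {mode:.2f}")
```

With a uniform prior the posterior is proportional to the likelihood, so the grid point with the highest posterior mass is also the grid point that maximizes the likelihood.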

Key Events and Historical Development§

Early Probability Theory (1650-1750): The initial development of probability theory laid the groundwork for modern statistical methods.
Ronald Fisher (1920s): Introduction of the likelihood function and maximum likelihood estimation (MLE), which revolutionized statistical inference.
Modern Bayesian Statistics (Late 20th Century): The resurgence of Bayesian methods emphasized the importance of likelihood in updating beliefs.

Types and Categories§

Likelihood Function: A function of the parameter(s) given the data.
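Profile Likelihood: Reduces the likelihood to a function of the parameters of interest by maximizing over the remaining (nuisance) parameters at each fixed value of the parameters of interest.
Conditional Likelihood: The likelihood obtained from the distribution of the data conditional on a statistic, typically one that is sufficient for the nuisance parameters, so that those parameters drop out.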
Marginal Likelihood: Integrates out nuisance parameters from the likelihood.
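
To make the profile likelihood concrete, here is a minimal sketch for a normal model in which the mean is the parameter of interest and the variance is treated as a nuisance parameter. The data values and the function names are hypothetical, and the closed-form maximizer used for the variance holds only for this particular model.

```python
import math

def normal_log_likelihood(data, mu, sigma2):
    """Log-likelihood of i.i.d. normal data with mean mu and variance sigma2."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)

def profile_log_likelihood(data, mu):
    """Maximize over the nuisance variance for a fixed mean mu."""
    # For the normal model, the maximizing variance at a fixed mu has a closed form.
    sigma2_hat = sum((x - mu) ** 2 for x in data) / len(data)
    return normal_log_likelihood(data, mu, sigma2_hat)

data = [4.8, 5.1, 5.3, 4.9, 5.4]                      # hypothetical measurements
candidates = [4.5 + 0.01 * i for i in range(101)]     # candidate means from 4.50 to 5.50
best_mu = max(candidates, key=lambda mu: profile_log_likelihood(data, mu))
print(f"profile likelihood is maximized near mu = {best_mu:.2f}")
```

The profile curve is a function of the mean alone, which makes it easier to plot or to build an interval for one parameter than the full two-parameter likelihood.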

Importance and Applicability§

Model Fitting: Likelihood is crucial for fitting statistical models to data.
Hypothesis Testing: Likelihood ratios compare how well competing models or hypotheses explain the same data.
Bayesian Inference: Combined with a prior distribution through Bayes’ theorem, the likelihood yields the posterior distribution.
Machine Learning: Used in algorithms for parameter estimation and model validation.

Examples§

Coin Tossing: If we observe 6 heads in 10 tosses, the likelihood of $p = 0.5$, where $p$ is the probability of heads, is $\binom{10}{6} (0.5)^6 (0.5)^4 = 210/1024 \approx 0.205$ .
Regression Analysis: In linear regression with normally distributed errors, maximizing the likelihood over the coefficients is equivalent to minimizing the sum of squared residuals, which is how the coefficients are estimated.
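
The coin-tossing calculation can be checked with a few lines of Python. This is a small sketch using only the standard library; the function name `coin_likelihood` and the 0.01-spaced grid of candidate values are choices made for the illustration.

```python
import math

def coin_likelihood(p, heads=6, tosses=10):
    """Binomial likelihood of p given `heads` heads in `tosses` tosses."""
    return math.comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)

value_at_half = coin_likelihood(0.5)          # equals 210 / 1024

# Evaluate the same function at many candidate values of p and keep the best one.
candidates = [i / 100 for i in range(101)]
p_hat = max(candidates, key=coin_likelihood)

print(f"L(0.5) = {value_at_half:.4f}")
print(f"grid maximizer = {p_hat:.2f}")
```

The value at $p = 0.5$ matches the hand calculation, and scanning the grid shows that the likelihood peaks at the observed proportion of heads, $6/10$, which is the maximum likelihood estimate for this model.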

Considerations§

Assumptions: Likelihood-based methods assume the model is correctly specified.
Sensitivity: Likelihood is sensitive to the choice of model and data.
Complexity: Calculating likelihood can be computationally intensive for complex models.

Related Terms§

Probability: A measure of how likely an event is to occur, expressed as a number between 0 and 1.
Bayesian Inference: A method that uses likelihood to update prior beliefs.
Maximum Likelihood Estimation (MLE): A method to estimate parameters by maximizing the likelihood function.

Comparisons§

Likelihood vs. Probability: Probability measures the chance of an event, while likelihood measures how well a parameter explains the observed data.
Frequentist vs. Bayesian Approaches: Frequentists rely on likelihood without prior distributions, while Bayesians use both likelihood and priors.

Interesting Facts§

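Likelihood is not a probability distribution over the parameter: viewed as a function of the parameter with the data held fixed, it need not sum or integrate to 1, and it is meaningful only up to a constant factor.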
The term “likelihood” was specifically chosen to avoid confusion with probability.
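
The distinction between likelihood and probability can be checked numerically. The sketch below assumes a binomial model with 10 tosses and uses a simple midpoint rule for the integral; the function name and the step count are illustrative choices.

```python
import math

def coin_likelihood(p, heads, tosses=10):
    """Binomial probability of `heads` heads in `tosses` tosses when P(heads) = p."""
    return math.comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)

# Fixed p, summed over every possible outcome: a genuine probability distribution.
total_over_data = sum(coin_likelihood(0.5, k) for k in range(11))

# Fixed data (6 heads), integrated over p with a midpoint rule: not equal to 1.
steps = 10_000
area_over_p = sum(coin_likelihood((i + 0.5) / steps, 6) for i in range(steps)) / steps

print(f"sum over outcomes at p = 0.5: {total_over_data:.4f}")
print(f"integral over p for 6 heads:  {area_over_p:.4f}")
```

Summing over outcomes gives 1, as any probability distribution must, while integrating the same expression over $p$ gives $1/11$ for this data set. That is why the likelihood curve is compared through ratios or maximized rather than read as a distribution over $p$.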

Famous Quotes§

“The theory of probabilities is at bottom nothing but common sense reduced to calculation.” - Pierre-Simon Laplace

Proverbs and Clichés§

“Seeing is believing” can be loosely tied to the concept of updating beliefs with observed data (Bayesian inference).

Expressions, Jargon, and Slang§

MLE (Maximum Likelihood Estimation): The process of estimating parameters that maximize the likelihood function.

FAQs§

Q: What is the difference between likelihood and probability?
A: Probability refers to the chance of an event occurring, while likelihood measures how well a parameter value explains the observed data.

Q: How is likelihood used in machine learning?
A: Likelihood is used in model fitting, parameter estimation, and validation of machine learning models.

Q: Can likelihood be negative?
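A: No. Likelihood is always non-negative because it is a probability (for discrete data) or a probability density (for continuous data). A density-based likelihood can exceed 1, and the log-likelihood, which is often used in practice, can be negative.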
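
A likelihood is never negative, but for continuous data it is built from densities, so it can exceed 1, and its logarithm can take either sign. The following sketch illustrates this with a normal model; the three observations and the two standard deviations are made-up values chosen only to show both cases.

```python
import math

def normal_density(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

observations = [0.02, -0.05, 0.01]    # hypothetical measurements close to zero

# Likelihood = product of densities over independent observations.
narrow = math.prod(normal_density(x, 0.0, 0.1) for x in observations)
wide = math.prod(normal_density(x, 0.0, 5.0) for x in observations)

print(f"sigma = 0.1: likelihood = {narrow:.3f}, log-likelihood = {math.log(narrow):.3f}")
print(f"sigma = 5.0: likelihood = {wide:.6f}, log-likelihood = {math.log(wide):.3f}")
```

With the tightly concentrated model the likelihood is well above 1 and the log-likelihood is positive; with the diffuse model the likelihood is a small positive number and the log-likelihood is negative. Neither likelihood is below zero.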

References§

Fisher, R.A. (1922). “On the Mathematical Foundations of Theoretical Statistics”. Philosophical Transactions of the Royal Society.
Bayes, T. (1763). “An Essay towards solving a Problem in the Doctrine of Chances”. Philosophical Transactions of the Royal Society.

Summary§

Likelihood is a foundational concept in statistics and probability, essential for understanding and performing data analysis. It quantifies how plausible specific parameter values are given the observed data, playing a critical role in model fitting, hypothesis testing, and Bayesian inference. Mastery of likelihood and its applications is vital for statisticians, data scientists, and researchers in various fields.