Median: Middle Value, Midpoint in a Range of Values

August 25, 2024 3 min read Mathematics Statistics Statistics Data Analysis Descriptive Statistics Central Tendency Median

The median is a statistical measure that represents the middle value in a range of values, offering a robust representation of a data set by reducing the impact of outliers.

On this page

The median is a statistical measure that identifies the middle value in a data set where half the numbers are above and half are below. Unlike the mean, the median does not get affected by extreme values or outliers, making it a reliable measure of central tendency.

Types of Data Sets§

Odd Number of Observations§

When the data set contains an odd number of observations, the median is simply the middle number:

\text{Median} = X\left(\frac{n+1}{2}\right)

where

n

is the number of observations.

Even Number of Observations§

For an even number of observations, the median is the average of the two middle numbers:

\text{Median} = \frac{X\left(\frac{n}{2}\right) + X\left(\frac{n}{2} + 1\right)}{2}

Special Considerations§

Robustness to Outliers§

The median is less sensitive to outliers compared to the mean. This makes it particularly useful in skewed distributions or when there are anomalous values.

Applicability§

The median is widely used in various fields, such as economics, finance, and medicine, to provide a measure of central tendency without the influence of long tails or outliers.

Examples§

Example with Odd Number of Observations§

Consider the data set: $[3, 1, 4, 1, 5, 9, 2]$

Ordered Data Set: $[1, 1, 2, 3, 4, 5, 9]$
Median: $3$

Example with Even Number of Observations§

Consider the data set: $[3, 1, 4, 1, 5, 9, 2, 6]$

Ordered Data Set: $[1, 1, 2, 3, 4, 5, 6, 9]$
Median: $\frac{3 + 4}{2} = 3.5$

Historical Context§

The concept of the median as a measure of central tendency dates back to the 13th century but was formalized in the mathematical sense in the 19th century by Francis Galton.

Mean§

The mean is the arithmetic average of a data set, which can be skewed by extreme values:

\text{Mean} = \frac{\sum X_i}{n}

Mode§

The mode is the most frequently occurring value in a data set, useful in categorical data analysis.

FAQs§

Why is the median preferred over the mean in certain situations?

The median is preferred when the data set contains outliers or is skewed because it better represents the central tendency without being influenced by extremes.

Can you find the median in categorical data?

The median is typically used for numerical data, though it can be applied to ordinal data where the values have a clear order.

References§

Galton, Francis. “Typical Laws of Heredity.” Nature (1875).
Hald, A. “Statistical Theory of Sampling.” International Statistical Review (1981).

Summary§

The median is a vital statistical tool for representing the central location of a data set. Its robustness against outliers makes it an invaluable measure in descriptive statistics. Understanding both its calculation and application is essential for accurate data analysis.