Skewness in statistics quantifies the degree of asymmetry of a distribution around its mean. The bell-shaped curve of a normal distribution is symmetric and has zero skewness; when a dataset's histogram or density curve departs from that symmetry, skewness indicates whether the longer tail lies to the right or to the left:
- Right Skewed (Positively Skewed): The right tail (larger values) is longer; the mass of the distribution is concentrated on the left.
- Left Skewed (Negatively Skewed): The left tail (smaller values) is longer; the mass is concentrated on the right.
Characteristics and Indicators
Right Skewed Distributions
Right skewed distributions have the following characteristics:
- The median is typically less than the mean, because the long right tail pulls the mean upward.
- A long tail stretches towards higher values.
- Common in income distributions, where a small number of high earners skew the data.
Left Skewed Distributions
Left skewed distributions have these features:
- The mean is typically less than the median, because the long left tail pulls the mean downward.
- A long tail extends towards lower values.
- Seen in situations like retirement age distributions, where early retirees skew the data.
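The mean-versus-median relationship described above can be checked on small samples. The following is a minimal sketch using Python's standard library; the two datasets are hypothetical values invented for illustration, not real income or retirement data.

```python
from statistics import mean, median

# Hypothetical toy samples, invented for illustration only.
right_skewed = [20, 22, 25, 27, 30, 32, 35, 40, 150]  # one very large value
left_skewed = [30, 58, 60, 62, 63, 64, 65, 66, 67]    # one very small value


def compare_center(values):
    """Return (mean, median) so the two measures can be compared directly."""
    return mean(values), median(values)


right_mean, right_median = compare_center(right_skewed)
left_mean, left_median = compare_center(left_skewed)

# In the right-skewed sample the mean sits above the median;
# in the left-skewed sample it sits below.
print("right skewed:", right_mean, right_median)
print("left skewed: ", left_mean, left_median)
```

In both samples a single extreme value moves the mean noticeably while the median barely changes, which is why the median is often preferred as a summary of skewed data.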
Causes and Implications
Causes
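- Right Skewed: A lower bound (such as zero) with no comparable upper limit, multiplicative or rapid-growth phenomena, or a small number of extreme high values.
- Left Skewed: An upper bound or ceiling (such as a maximum possible score) that values cluster against, with fewer constraints on how low values can fall.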
Implications
Understanding skewness is critical for:
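- Data Analysis: Skewness pulls the mean toward the longer tail, so the median is often a more representative measure of central tendency, and variability measures such as the standard deviation can be inflated by the tail.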
- Modeling: Skewed data may require transformations for better fit in statistical models.
- Decision Making: Helps in interpreting the data more accurately, e.g., income inequality, risk assessment.
Mathematical Representation
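Skewness is most commonly represented mathematically by the moment coefficient of skewness, the third central moment standardized by the cube of the standard deviation: $\gamma_1 = \mathrm{E}\left[\left(\frac{X - \mu}{\sigma}\right)^3\right] = \frac{\mu_3}{\sigma^3}$, where $\mu$ is the mean, $\sigma$ is the standard deviation, and $\mu_3$ is the third central moment. Positive values indicate right skew, negative values indicate left skew, and zero is consistent with a symmetric distribution.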
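As an illustration, the moment-based skewness coefficient (third central moment divided by the cubed standard deviation) can be computed directly from a sample. This is a minimal sketch, assuming the population (biased) form of the estimator with no small-sample correction; the function name `sample_skewness` and the example values are hypothetical.

```python
def sample_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5.

    Uses population (divide-by-n) central moments; statistical packages
    may apply a bias correction that this sketch omits.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


symmetric = sample_skewness([1, 2, 3, 4, 5])     # symmetric sample
right_tail = sample_skewness([1, 1, 1, 10])      # long tail to the right
left_tail = sample_skewness([-10, 1, 1, 1])      # long tail to the left
print(symmetric, right_tail, left_tail)
```

The sign of the result gives the direction of the skew, and its magnitude gives a rough sense of how pronounced the asymmetry is.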
Examples
Right Skewed Example
- Income Distribution: Most people’s income clusters at the lower end, with a few high earners extending the tail to the right.
Left Skewed Example
- Retirement Ages: Many people retire around a typical age, but some retire much earlier, dragging the tail to the left.
Historical Context
The concept of skewness dates back to Karl Pearson's work in the 1890s and early 1900s, in which he introduced skewness coefficients, including one based on the moments of the distribution.
Real-World Applicability
Data Analysis
In fields such as finance, healthcare, and social sciences, understanding skewness helps in interpreting data and making informed decisions.
Comparisons
- Symmetrical Distributions: Mean and median coincide, and skewness is zero.
- Asymmetrical Distributions: Mean and median usually differ, and the direction of the difference usually indicates the direction of the skew.
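The gap between mean and median can itself be turned into a rough skewness measure. The sketch below uses Pearson's second (median) skewness coefficient, 3 × (mean − median) / standard deviation. Using the population standard deviation is an assumption of this sketch, and the function name and sample values are hypothetical.

```python
from statistics import mean, median, pstdev


def pearson_median_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    sd = pstdev(values)  # population standard deviation (an assumption here)
    if sd == 0:
        raise ValueError("undefined when all values are equal")
    return 3 * (mean(values) - median(values)) / sd


sym = pearson_median_skewness([1, 2, 3, 4, 5])     # mean == median
right = pearson_median_skewness([1, 2, 3, 4, 20])  # mean above median
left = pearson_median_skewness([-20, 1, 2, 3, 4])  # mean below median
print(sym, right, left)
```

This coefficient is simple to compute and less sensitive to extreme values than the moment-based measure, but it is not interchangeable with it and the two can disagree on unusual distributions.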
Related Terms
- Kurtosis: Measures the “tailedness” of a distribution, indicating whether data are heavy-tailed (leptokurtic) or light-tailed (platykurtic) relative to a normal distribution.
- Outliers: Data points extremely different from others, often contributing to skewness.
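To make the kurtosis definition concrete, the following minimal sketch computes excess kurtosis (fourth standardized moment minus 3, so that a normal distribution scores zero) and shows how a single outlier changes it. It assumes population (divide-by-n) moments without bias correction; the function name and data are hypothetical.

```python
def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2**2 - 3, using population central moments."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mu) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("undefined when all values are equal")
    return m4 / m2 ** 2 - 3


flat = excess_kurtosis([1, 2, 3, 4, 5])                          # evenly spread values
with_outlier = excess_kurtosis([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])  # one extreme point
print(flat, with_outlier)
```

The evenly spread sample yields a negative value (light tails), while the sample with one extreme point yields a clearly positive value (heavy tail), the same kind of point that also drives skewness.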
FAQs
Q1: How can one identify skewness in a dataset?
- A1: By plotting the data on a histogram and observing the tail directions or calculating the skewness coefficient.
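The histogram approach can be sketched without any plotting library. The helper below, a hypothetical `text_histogram` function operating on made-up integer data, bins values into fixed-width intervals and draws one `#` per observation so the tail direction is visible in plain text.

```python
def text_histogram(values, bin_width):
    """Return text lines, one per bin, with one '#' per observation."""
    counts = {}
    for x in values:
        start = (x // bin_width) * bin_width  # left edge of the bin containing x
        counts[start] = counts.get(start, 0) + 1

    lines = []
    start = min(counts)
    while start <= max(counts):
        label = f"{start:>4}-{start + bin_width - 1:<4}"
        lines.append(label + " " + "#" * counts.get(start, 0))
        start += bin_width
    return lines


# Hypothetical integer data with a long right tail.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 19]
histogram_lines = text_histogram(data, bin_width=5)
for line in histogram_lines:
    print(line)
```

Most marks pile up in the lowest bin and thin out toward higher values, which is the visual signature of right skew.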
Q2: When should data be transformed for skewness?
- A2: When skewness materially degrades the performance of statistical models or violates their assumptions. For example, a logarithmic transformation is often applied to strictly positive, heavily right-skewed data.
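The effect of a logarithmic transformation can be sketched as follows. The income figures are hypothetical values (in thousands) invented for illustration, and the skewness helper assumes the population moment form; note that the natural log is only defined for strictly positive values.

```python
from math import log


def skewness(values):
    """Moment coefficient of skewness using population central moments."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical incomes in thousands; one very high earner stretches the right tail.
incomes = [18, 22, 25, 27, 30, 34, 40, 55, 90, 400]
log_incomes = [log(x) for x in incomes]  # requires every value to be > 0

before = skewness(incomes)
after = skewness(log_incomes)
print(f"skewness before: {before:.2f}, after log transform: {after:.2f}")
```

In this sample the transformation reduces the skewness but does not eliminate it, so it is worth re-checking the coefficient or a histogram after transforming rather than assuming symmetry.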
References
- Pearson, K. (1905). “Skew Variation in Homogeneous Material”. Biometrika.
- Groeneveld, R. A.; Meeden, G. (1984). “Measuring Skewness and Kurtosis”. The Statistician.
Summary
Understanding skewness, whether right or left, is fundamental in statistics for data representation, analysis, and interpretation. By identifying the type of skewness, one can make more precise and informed decisions based on the underlying data distribution.