
Karl Pearson’s Method of Correlation

Karl Pearson’s method is one of the most widely used statistical techniques for measuring the degree and direction of the linear relationship between two continuous variables. It provides a numerical value known as the correlation coefficient, denoted by the symbol r. This coefficient always lies between -1 and +1: the sign gives the direction of the relationship and the magnitude gives its strength.

The method is algebraic in nature and works best on continuous, quantitative data. The resulting coefficient is also called the Product-Moment Correlation Coefficient.

Unit 5: Business Statistics and Research Methods


The core idea behind Karl Pearson’s method is to compare the deviations of each variable from its mean and calculate how closely those deviations move together.

Formula:

r = ∑(X - X̄)(Y - Ȳ) / √[∑(X - X̄)² × ∑(Y - Ȳ)²]

Where:

  • X, Y: Actual values of the two variables
  • X̄, Ȳ: Means of X and Y respectively
  • ∑: Summation
  • r: Correlation coefficient
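
The formula translates almost line for line into code. The following is a minimal Python sketch rather than a reference implementation; the function name `pearson_r` and the sample data are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the actual means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each observation from its own mean.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_xx = sum(dx * dx for dx in dev_x)
    sum_yy = sum(dy * dy for dy in dev_y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Illustrative data: a perfectly increasing and a perfectly decreasing line.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The two calls should give +1 and -1 (up to floating-point rounding), the two extremes of the coefficient.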

Assumptions of Karl Pearson’s Method

  • The relationship between the two variables is linear.
  • Both variables are measured on interval or ratio scales.
  • The data is free from extreme outliers that can distort the correlation.
  • The variables are approximately normally distributed (needed mainly for significance tests on r).

Methods of Calculation

There are three commonly used methods for calculating Karl Pearson’s correlation coefficient:

a. Direct Method

This is the simplest approach: deviations are taken from the actual means of the two series.

Formula:

r = ∑xy / √(∑x² × ∑y²)

Here, x and y represent deviations from the respective means: x = X - X̄, y = Y - Ȳ

b. Assumed Mean Method

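This method simplifies calculations when the actual means are fractional: deviations are taken from assumed means A and B chosen close to the center of each series, which keeps the numbers small.

Formula:

r = [∑dx·dy - (∑dx)(∑dy)/n] / √{[∑dx² - (∑dx)²/n] × [∑dy² - (∑dy)²/n]}

Where dx = X - A, dy = Y - B, and n is the number of pairs. The correction terms are needed because deviations from an assumed mean do not, in general, sum to zero.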
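
A minimal Python sketch of the assumed-mean shortcut follows. It applies the correction terms (∑dx)(∑dy)/n, (∑dx)²/n and (∑dy)²/n, because deviations from an assumed mean generally do not sum to zero. The function name `pearson_r_assumed_mean` and the assumed means passed in are illustrative choices, not part of the method itself.

```python
import math


def pearson_r_assumed_mean(xs, ys, a, b):
    """Pearson's r from deviations about assumed means a (for X) and b (for Y)."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    # Correction terms compensate for the assumed means not being the true means.
    numerator = sum_dxdy - sum_dx * sum_dy / n
    denominator = math.sqrt(
        (sum_dx2 - sum_dx ** 2 / n) * (sum_dy2 - sum_dy ** 2 / n)
    )
    return numerator / denominator


xs = [10, 20, 30, 40, 50]
ys = [12, 24, 33, 45, 55]
r_a = pearson_r_assumed_mean(xs, ys, a=30, b=33)  # assumed means near the center
r_b = pearson_r_assumed_mean(xs, ys, a=25, b=40)  # a different, arbitrary choice
print(round(r_a, 3), round(r_b, 3))
```

Both choices of assumed mean should give the same coefficient, about 0.999 for this data, which is the point of the method: the assumed means affect only the size of the intermediate numbers, not the result.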

c. Step-Deviation Method

When the deviations share a common factor (for example, when the values are evenly spaced), this method reduces the arithmetic further by dividing the deviations by that factor.

Formula:

r = [∑dx·dy - (∑dx)(∑dy)/n] / √{[∑dx² - (∑dx)²/n] × [∑dy² - (∑dy)²/n]}

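Where dx = (X - A)/h and dy = (Y - B)/k are step deviations, A and B are assumed means, h and k are the common factors, and n is the number of pairs. Because r is unaffected by a change of origin or scale, the result is the same as with the direct method.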
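
The sketch below illustrates step deviations in Python and checks that dividing by a common factor leaves r unchanged. The helper names (`step_deviations`, `r_from_deviations`), the data, and the chosen origins and step sizes are all made up for illustration.

```python
import math


def step_deviations(values, origin, step):
    """Reduce values to (value - origin) / step."""
    return [(v - origin) / step for v in values]


def r_from_deviations(dx, dy):
    """Shortcut formula for r, valid for deviations from any origin and scale."""
    n = len(dx)
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    numerator = sum_dxdy - sum_dx * sum_dy / n
    denominator = math.sqrt(
        (sum_dx2 - sum_dx ** 2 / n) * (sum_dy2 - sum_dy ** 2 / n)
    )
    return numerator / denominator


# Hypothetical data whose deviations share common factors of 50 and 100.
xs = [100, 150, 200, 250, 300]
ys = [1000, 1200, 1100, 1500, 1400]

dx = step_deviations(xs, origin=200, step=50)    # -2, -1, 0, 1, 2
dy = step_deviations(ys, origin=1200, step=100)  # -2, 0, -1, 3, 2

r_step = r_from_deviations(dx, dy)
r_raw = r_from_deviations(xs, ys)  # same formula on raw values (origin 0, step 1)
print(round(r_step, 4), round(r_raw, 4))
```

With step deviations the sums involve only single-digit numbers (∑dx·dy = 11, ∑dx² = 10, ∑dy² = 18), yet the coefficient matches the one computed from the raw values.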


Example:

Question: Calculate Karl Pearson’s coefficient of correlation for the following data:

X:  10  20  30  40  50
Y:  12  24  33  45  55
      

Solution using Direct Method:

X̄ = 30, Ȳ = 33.8

X     Y     X - X̄   Y - Ȳ   (X - X̄)(Y - Ȳ)   (X - X̄)²   (Y - Ȳ)²
10    12    -20      -21.8    436               400         475.24
20    24    -10      -9.8     98                100         96.04
30    33    0        -0.8     0                 0           0.64
40    45    10       11.2     112               100         125.44
50    55    20       21.2     424               400         449.44
Totals                        1070              1000        1146.8

Applying the formula:

r = ∑(X - X̄)(Y - Ȳ) / √[∑(X - X̄)² × ∑(Y - Ȳ)²]
= 1070 / √(1000 × 1146.8)

= 1070 / √1146800

≈ 1070 / 1070.9

≈ 0.999

Interpretation: The correlation is almost perfect and positive.
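
The working for this example can be checked with a short Python script. This is only a verification sketch; the variable names are illustrative, and the column layout it prints is an arbitrary choice.

```python
import math

xs = [10, 20, 30, 40, 50]
ys = [12, 24, 33, 45, 55]
n = len(xs)
mean_x = sum(xs) / n  # 30.0
mean_y = sum(ys) / n  # 33.8

# One row per observation: X, Y, deviations, product, squared deviations.
rows = []
for x, y in zip(xs, ys):
    dx, dy = x - mean_x, y - mean_y
    rows.append((x, y, dx, dy, dx * dy, dx ** 2, dy ** 2))

sum_dxdy = sum(row[4] for row in rows)
sum_dx2 = sum(row[5] for row in rows)
sum_dy2 = sum(row[6] for row in rows)
r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)

for row in rows:
    print("{:>4} {:>4} {:>7.1f} {:>7.1f} {:>9.2f} {:>8.2f} {:>8.2f}".format(*row))
print("totals: {:.2f} {:.2f} {:.2f}".format(sum_dxdy, sum_dx2, sum_dy2))
print("r = {:.4f}".format(r))
```

The totals come out as 1070, 1000 and 1146.8, and r is about 0.9992, in line with the hand calculation.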


Merits and Demerits of Karl Pearson’s Method

Merits:

  • Gives a precise numerical value for both the degree and the direction of correlation.
  • Widely accepted and used across disciplines.
  • Useful for further statistical analysis like regression.
  • Simple formula when data is suitable.

Demerits:

  • Not suitable for non-linear relationships.
  • Sensitive to extreme values (outliers).
  • Cannot be used for ordinal or categorical data.
  • Does not imply causation—only measures association.
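
The first two demerits are easy to demonstrate numerically. The Python sketch below uses small made-up datasets: one with an exact but non-linear dependence, and one where a single extreme point is appended. The compact `pearson_r` helper is an illustrative implementation of the deviation formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Exact but non-linear dependence: y = x^2 on x values symmetric about zero.
xs_curve = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x * x for x in xs_curve]
r_curve = pearson_r(xs_curve, ys_curve)

# Weakly related data, then the same data with one extreme point added.
xs_base = [1, 2, 3, 4, 5]
ys_base = [3, 1, 4, 2, 3]
r_base = pearson_r(xs_base, ys_base)
r_outlier = pearson_r(xs_base + [20], ys_base + [20])

print(round(r_curve, 3), round(r_base, 3), round(r_outlier, 3))
```

Here r is zero for the curve even though Y is completely determined by X, and a single outlier lifts r from roughly 0.14 to roughly 0.97. This is why a scatter diagram should be inspected before relying on the coefficient.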

When and Why to Use Karl Pearson’s Method

This method is best used when:

  • Both variables are continuous and quantitative.
  • You want to find the exact strength and direction of a linear relationship.
  • Data is approximately normally distributed and free of significant outliers.
  • A graphical inspection, such as a scatter diagram, indicates a linear trend.

Conclusion

Karl Pearson’s method of correlation is a foundational tool in statistics for quantifying linear relationships between variables. It is algebraic, objective, and powerful, making it highly relevant for academic exams, business analysis, and scientific research. However, one must ensure that the data meets the method’s assumptions for meaningful results. Upcoming articles will explore other methods such as Spearman’s Rank Correlation and the Concurrent Deviation Method.
