Chi Squared Test

The chi-squared test is the most popular non-parametric test of statistical significance for bivariate tabular analysis with a contingency table. It can be applied to nominal (categorical) data, but it cannot be used with ranking techniques.

Conditions to apply \(\chi^2 \) test

  • The frequencies must be absolute, not relative
  • The sample observations should be independent
  • The total frequency should be reasonably large, that is, n > 50
  • The expected frequency of each cell should not be less than 5

Types of \(\chi^2 \) test

  • Test of independence
  • Test of Goodness of Fit
  • Test of Homogeneity

Formulae

\(\chi^2 = \sum \frac{(O - E)^2}{E} \)

Where,

O : Observed Frequency, E: Expected Frequency

\(E = \frac{RT \times CT}{GT} \)

Where, RT: Row Total of the row containing the cell, CT: Column Total of the column containing the cell, GT: Grand Total of all frequencies
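As a rough illustration of the two formulae above, the sketch below computes the expected frequency of every cell from its row and column totals and then sums \((O-E)^2/E\). The function names and the 2×2 counts are made up for illustration and do not come from any real data set.

```python
def expected_frequencies(observed):
    """Return E = RT * CT / GT for every cell of a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


def chi_squared(observed):
    """Return the sum of (O - E)^2 / E over all cells."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Hypothetical observed counts (assumption, for illustration only).
observed = [[20, 30],
            [30, 20]]

print(expected_frequencies(observed))
print(chi_squared(observed))
```

The same functions work for any r × c table of absolute frequencies, since the expected-frequency formula only depends on the row, column and grand totals.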

1. Test of Independence

It is used to test the relationship between two categorical variables, where df = (r-1)(c-1), r being the number of rows and c the number of columns. It uses a contingency table.

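a. By Direct Method

| | Category A (level 1) | Category A (level 2) | RT |
|---|---|---|---|
| Category B (level 1) | a | b | a+b |
| Category B (level 2) | c | d | c+d |
| CT | a+c | b+d | GT: a+b+c+d |

| \(O\) | \(E = \frac{RT \times CT}{GT} \) | \(O-E\) | \((O-E)^2 \) | \(\frac{(O-E)^2}{E} \) |
|---|---|---|---|---|
| a | \(E_a = \frac{(a+b) \times (a+c)}{GT} \) | \(a-E_a\) | \((a-E_a)^2\) | \(\frac{(a-E_a)^2}{E_a}\) |
| b | \(E_b = \frac{(a+b) \times (b+d)}{GT} \) | \(b-E_b\) | \((b-E_b)^2\) | \(\frac{(b-E_b)^2}{E_b}\) |
| c | \(E_c = \frac{(c+d) \times (a+c)}{GT} \) | \(c-E_c\) | \((c-E_c)^2\) | \(\frac{(c-E_c)^2}{E_c}\) |
| d | \(E_d = \frac{(c+d) \times (b+d)}{GT} \) | \(d-E_d\) | \((d-E_d)^2\) | \(\frac{(d-E_d)^2}{E_d}\) |
| \(\sum O\) | \(\sum E\) | | | \(\sum \frac{(O - E)^2}{E} \) |

Test Statistic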

\(\chi^2 = \sum \frac{(O - E)^2}{E} \)

Degrees of Freedom

\(df = (r-1)(c-1) = (2-1)(2-1) = 1 \)

Level of Significance

Generally, use the 5% level of significance, \(\alpha = 0.05\).

Critical Value

Look up the critical value in the chi-squared critical value table, using the level of significance (0.05) and the degrees of freedom (1).

Decision

If \(\chi^2_{cal} \leq \chi^2_{tab} \), accept \(H_0\) and reject \(H_1\). If \(\chi^2_{cal} > \chi^2_{tab} \), reject \(H_0\) and accept \(H_1\).
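The steps from test statistic to decision could be sketched as follows. The counts are hypothetical, the function name is an assumption, and the critical values are the standard chi-squared table entries for \(\alpha = 0.05\), hard-coded here for 1 to 4 degrees of freedom only.

```python
# Chi-squared critical values at alpha = 0.05, copied from a standard table.
CRITICAL_AT_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def independence_test(observed):
    """Return (statistic, df, critical value, decision) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # E = RT * CT / GT
            statistic += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)  # (r-1)(c-1)
    critical = CRITICAL_AT_5_PERCENT[df]
    decision = "accept H0" if statistic <= critical else "reject H0"
    return statistic, df, critical, decision


# Hypothetical 2x2 table (assumption, for illustration only).
observed = [[20, 30],
            [30, 20]]

statistic, df, critical, decision = independence_test(observed)
print(statistic, df, critical, decision)
```

For these made-up counts every expected frequency is 25, so the statistic is 4.0 with df = 1. Since 4.0 exceeds 3.841, the sketch rejects \(H_0\).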

b. By Contingency Table

| | | Total |
|---|---|---|
| a | b | a+b |
| c | d | c+d |
| a+c | b+d | N = a+b+c+d |

\(\chi^2 = \frac{N(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)} \)
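A small sketch, using made-up counts, suggests how this shortcut formula can be checked against the direct \(\sum (O-E)^2/E\) calculation for a 2×2 table. Both function names are illustrative assumptions.

```python
def chi_squared_2x2(a, b, c, d):
    """Shortcut formula: N(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_squared_direct(a, b, c, d):
    """Direct method: sum of (O - E)^2 / E with E = RT * CT / GT."""
    n = a + b + c + d
    cells = [
        (a, a + b, a + c),  # (observed, row total, column total)
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    ]
    total = 0.0
    for o, rt, ct in cells:
        e = rt * ct / n
        total += (o - e) ** 2 / e
    return total


# Hypothetical cell counts (assumption, for illustration only).
a, b, c, d = 20, 30, 30, 20
shortcut = chi_squared_2x2(a, b, c, d)
direct = chi_squared_direct(a, b, c, d)
print(shortcut, direct)
```

The two results agree up to floating-point rounding, which is why the shortcut is convenient when only the four cell counts are at hand.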

If the value of any cell (that is, one or more of a, b, c and d) is less than 5, use Yates' correction.

Yates Correction

\(\chi^2 = \frac{N[|ad-bc| - \frac{N}{2}]^2}{(a+b)(c+d)(a+c)(b+d)}\)
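To see the effect of the correction, the following sketch evaluates the corrected and uncorrected formulae on a hypothetical 2×2 table in which two cells are below 5. The counts and function names are assumptions chosen for illustration.

```python
def chi_squared_2x2(a, b, c, d):
    """Uncorrected statistic: N(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_squared_yates(a, b, c, d):
    """Yates-corrected statistic: N(|ad - bc| - N/2)^2 / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts with cells a = 3 and d = 2 below 5 (assumption).
a, b, c, d = 3, 7, 8, 2
uncorrected = chi_squared_2x2(a, b, c, d)
corrected = chi_squared_yates(a, b, c, d)
print(round(uncorrected, 3), round(corrected, 3))
```

For these made-up counts the uncorrected value is about 5.051 and the corrected value about 3.232. The correction always shrinks the statistic, and here it moves it from above to below the 5% critical value of 3.841 for df = 1.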

2. Test of Goodness of Fit

It is used to test whether the observed data match a specified distribution. It uses one categorical variable, where df = k-1 and k is the number of categories.

| x | x1 | x2 | x3 | x4 | x5 |
|---|---|---|---|---|---|
| f | f1 | f2 | f3 | f4 | f5 |

Observed Frequency (O) = f1, f2, f3, f4, f5

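Expected Frequency (E) = \(\frac{\sum O}{n}\), where n is the number of categories (here n = 5). This assumes that every category is expected to be equally likely.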

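| x | f(O) | f(E) | O-E | \((O-E)^2 \) | \(\frac{(O-E)^2}{E} \) |
|---|---|---|---|---|---|
| x1 | f1 | \(\frac{\sum O}{n}\) | | | |
| x2 | f2 | \(\frac{\sum O}{n}\) | | | |
| x3 | f3 | \(\frac{\sum O}{n}\) | | | |
| x4 | f4 | \(\frac{\sum O}{n}\) | | | |
| x5 | f5 | \(\frac{\sum O}{n}\) | | | |
| n=5 | \(\sum O\) | \(\sum E\) | | | \(\sum \frac{(O - E)^2}{E} \) |

Test Statistic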

\(\chi^2 = \sum \frac{(O - E)^2}{E} \)

Degrees of Freedom

\(df = k-1 = 5-1 = 4 \)
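The goodness-of-fit calculation above might be sketched like this, assuming every category is expected to be equally likely so that \(E = \sum O / n\). The five observed frequencies and the function name are hypothetical.

```python
def goodness_of_fit_uniform(observed):
    """Return (statistic, df) for equal expected frequencies E = sum(O) / n."""
    n = len(observed)                 # number of categories
    expected = sum(observed) / n      # same expected frequency in every category
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    df = n - 1                        # df = k - 1
    return statistic, df


# Hypothetical observed frequencies f1..f5 (assumption, for illustration only).
observed = [8, 12, 10, 14, 6]

statistic, df = goodness_of_fit_uniform(observed)
print(statistic, df)
```

With these made-up frequencies the expected frequency is 10 in each category, giving a statistic of about 4.0 with df = 4. The standard table value at the 5% level for df = 4 is 9.488, so this example would not reject \(H_0\).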
