This test is used to determine whether there is a nonrandom association between two categorical variables in a contingency table, usually a 2×2 table. It is used when sample sizes are small and the chi-squared test’s assumption (an expected frequency of at least 5 in every cell) is violated.
When to Use Fisher’s Exact Test?
- When sample sizes are small (typically when the expected frequency in any cell of the table is less than 5).
- When analyzing categorical data in a 2×2 contingency table.
- When the chi-square test’s assumptions (large sample size) do not hold.
Steps to Perform Fisher’s Exact Test
- Create a 2×2 contingency table with observed frequencies.
- Calculate the probability of obtaining the observed table using the Fisher’s Exact Test formula.
- Compute the p-value, which is the sum of the probabilities of all tables (with the same row and column totals) whose probability is equal to or lower than that of the observed table.
- Compare the p-value with the significance level (α, usually 0.05):
  - If p ≤ α, reject the null hypothesis (evidence of association).
  - If p > α, fail to reject the null hypothesis (no significant association).
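The steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function name `fisher_exact_two_sided`, the tolerance used for floating-point ties, and the sample counts are all assumptions made for the example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d, alpha=0.05):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability of the table whose top-left cell is x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    # The top-left cell can only take values that keep every cell non-negative.
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance (an assumption) guards against floating-point ties.
    p_value = sum(p for p in map(prob, range(lo, hi + 1))
                  if p <= p_observed * (1 + 1e-9))
    return p_value, p_value <= alpha

# Illustrative counts only (not taken from the text): a=3, b=1, c=1, d=3.
p_value, reject = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}, reject H0: {reject}")
```

For these illustrative counts the p-value is 34/70, so the null hypothesis is not rejected at α = 0.05.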
Hypothesis for Fisher’s Exact Test
- Null Hypothesis (H₀): There is no association between the two categorical variables (they are independent).
- Alternative Hypothesis (H₁): There is an association between the two categorical variables (they are dependent).
2×2 Contingency Table Format
A contingency table is used to summarize the frequency counts of two categorical variables:
| | Category B₁ | Category B₂ | Total |
|---|---|---|---|
| A₁ | a | b | a+b |
| A₂ | c | d | c+d |
| Total | a+c | b+d | N |
where:
- a, b, c, and d are observed frequencies in each cell.
- N is the total sample size.
Mathematical Formula for Fisher’s Exact Test
\(P = \frac{{}^{(a+b)}C_a \cdot {}^{(c+d)}C_c}{{}^{N}C_{(a+c)}}\), where \({}^{n}C_r\) is the number of combinations of \(r\) items chosen from \(n\).
Or,
\(P = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{N!\,a!\,b!\,c!\,d!}\)
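As a quick check that the two forms of the formula agree, the sketch below evaluates both with exact fractions. The function names and the sample counts are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb, factorial

def prob_combinations(a, b, c, d):
    # Combination form: C(a+b, a) * C(c+d, c) / C(N, a+c)
    n = a + b + c + d
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(n, a + c))

def prob_factorials(a, b, c, d):
    # Factorial form: product of the margin factorials over N! a! b! c! d!
    n = a + b + c + d
    f = factorial
    numerator = f(a + b) * f(c + d) * f(a + c) * f(b + d)
    denominator = f(n) * f(a) * f(b) * f(c) * f(d)
    return Fraction(numerator, denominator)

# Illustrative counts only (not taken from the text).
table = (2, 3, 4, 1)
assert prob_combinations(*table) == prob_factorials(*table)
print(prob_combinations(*table))  # 5/21
```

Using `Fraction` avoids rounding, so the equality check is exact rather than approximate.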
Interpretation of the Formula
- The test calculates the probability of observing the specific arrangement of the table under the assumption that row and column totals are fixed.
- The p-value is the sum of probabilities of all tables that have a probability equal to or smaller than the observed table.
Test statistics
\(P = P_k + P_{k-1} + P_{k-2} + P_{k-3} + \cdots + P_o\)
Where
- \(P\) : p-value of Fisher’s Exact Test
- \(P_o\) : Probability of the observed contingency table (i.e., the exact table obtained from the data)
- \(P_{k-1}\), \(P_{k-2}\), … : Probabilities of tables more extreme than the observed table
- \(P_k\) : Probability of the most extreme table possible
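The sum from \(P_o\) to \(P_k\) can be sketched by repeatedly moving one count further in the observed direction until a cell would become negative. The helper names and the sample counts are assumptions for illustration. Because the sketch only walks in one direction, the total it produces is a one-sided p-value.

```python
from math import comb

def table_probability(a, b, c, d):
    # Probability of one table given its row and column totals.
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

def one_sided_terms(a, b, c, d):
    """Return [P_o, ..., P_k]: the observed table, then ever more extreme ones."""
    # a*d > b*c means the main diagonal is over-represented, so more extreme
    # tables increase a and d; otherwise they decrease them. When a*d == b*c
    # the direction is ambiguous and this sketch arbitrarily increases a and d.
    step = 1 if a * d >= b * c else -1
    terms = []
    while min(a, b, c, d) >= 0:
        terms.append(table_probability(a, b, c, d))
        # Shifting all four cells this way keeps every margin unchanged.
        a, d = a + step, d + step
        b, c = b - step, c - step
    return terms

# Illustrative counts only (not taken from the text).
terms = one_sided_terms(3, 1, 1, 3)
print(len(terms), sum(terms))
```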
Example
| | Category B₁ | Category B₂ | Total |
|---|---|---|---|
| A₁ | 1 | 6 | 7 |
| A₂ | 4 | 1 | 5 |
| Total | 5 | 7 | 12 |
Hypotheses:
- Null: The proportion of B₁ is the same in A₁ and A₂
- Alternate: The proportion of B₁ is not the same in A₁ and A₂
Test Statistics Calculation
\(P_o = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{N!\,a!\,b!\,c!\,d!}\)
Or, \(P_o = \frac{7!\,5!\,5!\,7!}{12!\,1!\,6!\,4!\,1!} = \frac{35}{792} \approx 0.0442\)
Now shift the cell counts further in the observed direction, keeping the row and column totals constant, to obtain the most extreme table:
| | Category B₁ | Category B₂ | Total |
|---|---|---|---|
| A₁ | 0 | 7 | 7 |
| A₂ | 5 | 0 | 5 |
| Total | 5 | 7 | 12 |
\(P_k = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{N!\,a!\,b!\,c!\,d!}\)
Or, \(P_k = \frac{7!\,5!\,5!\,7!}{12!\,0!\,7!\,5!\,0!} = \frac{1}{792} \approx 0.00126\)
Final Test Statistics Calculation
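\(P = P_o + P_k\)
Or, \(P = 0.0442 + 0.00126 = 0.0455\)
This sum covers only tables that are extreme in the same direction as the observed one, so it is a one-sided p-value. Summing all tables with a probability equal to or lower than the observed table across both tails would also include the table with a = 5 (probability \(\frac{21}{792} \approx 0.0265\)), giving a two-sided \(P \approx 0.0720\).
Decision:
\(P\ (0.0455) < \alpha\ (0.05)\). Hence, reject the null hypothesis for the one-sided test. The two-sided p-value (0.0720) exceeds α, so a two-sided test fails to reject it.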
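As a check on the arithmetic, the example can be recomputed with integer counts over the common denominator \({}^{12}C_5 = 792\). The sketch below (variable names are assumptions) lists every table that shares the example's margins and reports two sums: the one over tables extreme in the observed direction, and the one over all tables no more probable than the observed table.

```python
from math import comb

row1, row2, col1 = 7, 5, 5        # margins of the example table
observed_a = 1                    # observed top-left cell
total = comb(row1 + row2, col1)   # 792 possible arrangements

# Number of arrangements for each feasible value of the top-left cell.
weights = {x: comb(row1, x) * comb(row2, col1 - x)
           for x in range(max(0, col1 - row2), min(row1, col1) + 1)}

for x, w in weights.items():
    print(f"a = {x}: {w}/{total} = {w / total:.5f}")

# One-sided: the observed table plus the more extreme table with a = 0.
one_sided = sum(w for x, w in weights.items() if x <= observed_a) / total
# Two-sided: every table no more probable than the observed one.
two_sided = sum(w for w in weights.values()
                if w <= weights[observed_a]) / total

print(f"one-sided p = {one_sided:.4f}")   # 0.0455
print(f"two-sided p = {two_sided:.4f}")   # 0.0720
```

Comparing integer weights instead of floating-point probabilities sidesteps rounding issues when deciding which tables are "equal or lower" in probability.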
