Perform chi-square goodness of fit and independence tests instantly. Enter observed and expected frequencies to compute the chi-square statistic, degrees of freedom, and individual contributions.
Formula
χ² = Σ((O−E)² ÷ E)
Degrees of Freedom
k − 1 (goodness of fit)
Common Significance
α = 0.05 (5% level)
Invented By
Karl Pearson, 1900
χ² = Σ((O − E)² ÷ E)
O = observed frequency, E = expected frequency
Use this table to determine whether your calculated chi-square statistic is statistically significant. If your value exceeds the critical value for your degrees of freedom and chosen significance level, reject the null hypothesis.
| Degrees of Freedom (df) | α = 0.05 | α = 0.01 |
|---|---|---|
| 1 | 3.841 | 6.635 |
| 2 | 5.991 | 9.210 |
| 3 | 7.815 | 11.345 |
| 4 | 9.488 | 13.277 |
| 5 | 11.070 | 15.086 |
| 6 | 12.592 | 16.812 |
| 7 | 14.067 | 18.475 |
| 8 | 15.507 | 20.090 |
| 9 | 16.919 | 21.666 |
| 10 | 18.307 | 23.209 |
| 11 | 19.675 | 24.725 |
| 12 | 21.026 | 26.217 |
| 13 | 22.362 | 27.688 |
| 14 | 23.685 | 29.141 |
| 15 | 24.996 | 30.578 |
| 16 | 26.296 | 32.000 |
| 17 | 27.587 | 33.409 |
| 18 | 28.869 | 34.805 |
| 19 | 30.144 | 36.191 |
| 20 | 31.410 | 37.566 |
Critical values are right-tail values from the chi-square distribution. If χ² > critical value, reject H₀ at the given significance level.
The chi-square test is a fundamental statistical hypothesis test developed by Karl Pearson in 1900. It measures how well observed categorical data fit an expected distribution by comparing observed frequencies to the frequencies you would expect under a given hypothesis. The test produces a single number, the chi-square statistic (χ²), that quantifies the overall discrepancy between what was observed and what was expected.
There are two main types of chi-square tests. The goodness of fit test evaluates whether a single categorical variable follows a hypothesized distribution. For example, you might test whether a die is fair by comparing the observed number of times each face appears to the expected count of one-sixth of total rolls. The test of independence examines whether two categorical variables are associated. For instance, you might test whether gender and product preference are related by analyzing a contingency table of survey results.
The chi-square distribution is a right-skewed probability distribution whose shape depends on the degrees of freedom. As degrees of freedom increase, the distribution becomes more symmetric and approaches a normal distribution. The test is non-parametric in the sense that it does not assume the data follow any particular population distribution, which makes it one of the most versatile and widely used statistical tests in research, quality control, genetics, marketing analysis, and social science.
χ² = Σ (Oᵢ − Eᵢ)² ÷ Eᵢ
Where Oᵢ is the observed frequency and Eᵢ is the expected frequency for category i.
For each category, compute the difference Oᵢ − Eᵢ. This tells you how far the observed count deviates from the expected count.
Square the difference (Oᵢ − Eᵢ)² to eliminate negative signs and give more weight to larger deviations.
Divide each squared difference by its expected frequency: (Oᵢ − Eᵢ)² ÷ Eᵢ. This scales each contribution relative to the expected frequency.
Add all individual contributions together to get the chi-square statistic. Compare this value to the critical value for your degrees of freedom.
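The steps above can be sketched in a few lines of Python. This is an illustrative snippet, not the calculator's own implementation; the three-category data in the usage example are made up:

```python
def chi_square(observed, expected):
    """Return the chi-square statistic and each category's contribution."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Step by step: difference, square, divide by expected, then sum.
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return sum(contributions), contributions

# Three categories expected to be equally likely (20 each out of 60):
stat, parts = chi_square([18, 22, 20], [20, 20, 20])
print(round(stat, 3))  # 0.4
```

Returning the per-category contributions alongside the total makes it easy to see which categories drive the statistic, mirroring the contribution column in the worked examples below.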
A die is rolled 60 times. You expect each face to appear 10 times. The observed results are: 8, 12, 10, 14, 7, 9.
| Face | O | E | (O−E)²÷E |
|---|---|---|---|
| 1 | 8 | 10 | 0.400 |
| 2 | 12 | 10 | 0.400 |
| 3 | 10 | 10 | 0.000 |
| 4 | 14 | 10 | 1.600 |
| 5 | 7 | 10 | 0.900 |
| 6 | 9 | 10 | 0.100 |
χ² = 0.400 + 0.400 + 0.000 + 1.600 + 0.900 + 0.100 = 3.400
With df = 6 − 1 = 5 and α = 0.05, the critical value is 11.070. Since 3.400 < 11.070, we fail to reject H₀: the data are consistent with a fair die.
A company surveys 200 customers about their preferred color. Under the hypothesis that all four colors are equally preferred, the expected count is 50 each. Observed: Red = 65, Blue = 45, Green = 40, Yellow = 50.
| Color | O | E | (O−E)²÷E |
|---|---|---|---|
| Red | 65 | 50 | 4.500 |
| Blue | 45 | 50 | 0.500 |
| Green | 40 | 50 | 2.000 |
| Yellow | 50 | 50 | 0.000 |
χ² = 4.500 + 0.500 + 2.000 + 0.000 = 7.000
With df = 4 − 1 = 3 and α = 0.05, the critical value is 7.815. Since 7.000 < 7.815, we fail to reject H₀. There is not enough evidence that color preferences differ.
A genetics experiment predicts a 9:3:3:1 phenotype ratio. From 160 offspring, expected counts are 90, 30, 30, and 10. Observed: 99, 26, 25, 10.
| Phenotype | O | E | (O−E)²÷E |
|---|---|---|---|
| A | 99 | 90 | 0.900 |
| B | 26 | 30 | 0.533 |
| C | 25 | 30 | 0.833 |
| D | 10 | 10 | 0.000 |
χ² = 0.900 + 0.533 + 0.833 + 0.000 = 2.267
With df = 4 − 1 = 3 and α = 0.05, the critical value is 7.815. Since 2.267 < 7.815, we fail to reject H₀. The observed ratio is consistent with the expected 9:3:3:1 genetic ratio.
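The genetics example can be reproduced end to end by first deriving the expected counts from the ratio. An illustrative Python sketch using the 9:3:3:1 ratio and observed counts from the example above:

```python
def expected_from_ratio(ratio, total):
    """Turn a phenotype ratio such as 9:3:3:1 into expected counts."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

observed = [99, 26, 25, 10]
expected = expected_from_ratio([9, 3, 3, 1], sum(observed))  # [90, 30, 30, 10]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 3))  # 2.267
```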
For a rough estimate, focus on the categories with the largest absolute differences between observed and expected. Categories where O and E are close contribute very little to the chi-square statistic. If one category has a huge deviation, it will dominate the total, making it easy to spot significance at a glance.
Testing whether a coin is fair after 100 flips. Expected: 50 heads, 50 tails.
| Scenario | Heads (O) | Tails (O) | χ² | Significant at 0.05? |
|---|---|---|---|---|
| Nearly fair | 52 | 48 | 0.160 | No |
| Slight bias | 58 | 42 | 2.560 | No |
| Moderate bias | 63 | 37 | 6.760 | Yes |
| Strong bias | 70 | 30 | 16.000 | Yes |
| Extreme bias | 80 | 20 | 36.000 | Yes |
Testing whether product preferences are equally distributed. Expected count per product: 100.
| Product | Observed | Expected | Contribution |
|---|---|---|---|
| Product A | 130 | 100 | 9.000 |
| Product B | 95 | 100 | 0.250 |
| Product C | 85 | 100 | 2.250 |
| Product D | 90 | 100 | 1.000 |
| Total | 400 | 400 | 12.500 |
With df = 3 and α = 0.05, the critical value is 7.815. Since χ² = 12.500 > 7.815, we reject the null hypothesis. Product preferences are not equally distributed.
Testing a Mendelian 3:1 ratio. Expected: 90 dominant, 30 recessive.
| Phenotype | Observed | Expected | Contribution |
|---|---|---|---|
| Dominant | 85 | 90 | 0.278 |
| Recessive | 35 | 30 | 0.833 |
| Total | 120 | 120 | 1.111 |
With df = 1 and α = 0.05, the critical value is 3.841. Since χ² = 1.111 < 3.841, we fail to reject H₀. The data are consistent with a 3:1 Mendelian ratio.
Chi-square tests are essential in genetics, biology, and medical research for testing whether observed data match theoretical predictions such as Mendelian ratios or drug response distributions.
Businesses use chi-square tests to determine whether customer preferences, survey responses, or purchasing patterns differ significantly from expected distributions.
Manufacturers apply chi-square tests to detect defect rate deviations, verify production line consistency, and ensure output matches quality standards across categories.
Researchers in psychology, sociology, and education use chi-square tests to analyze relationships between categorical variables such as gender, education level, and survey responses.
The chi-square approximation is unreliable when expected cell counts fall below 5. If any expected value is below this threshold, consider combining categories or using Fisher's exact test instead.
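For a 2×2 table with small expected counts, Fisher's exact test can be computed directly from the hypergeometric distribution. A minimal sketch of the common two-sided definition (sum the probabilities of all tables at least as extreme as the observed one); the table values in the test are made up:

```python
from math import comb

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with the row and column totals held fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Sum every table whose probability is <= that of the observed table.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)
```

For the table [[8, 2], [1, 5]] this gives p ≈ 0.035, so the exact test can reach conclusions the chi-square approximation cannot be trusted to give at these cell sizes.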
The chi-square test requires raw frequency counts, not percentages or proportions. Using percentages will produce incorrect chi-square values and misleading conclusions.
The sum of observed frequencies should equal the sum of expected frequencies. If they do not match, your expected values are calculated incorrectly or you have a data entry error.
A statistically significant chi-square result only means the difference is unlikely due to chance. With very large sample sizes, even trivial differences can be statistically significant. Always consider effect size and practical importance.
For a goodness of fit test, df = k − 1. For a test of independence in an r × c table, df = (r − 1) × (c − 1). Using the wrong df will lead to incorrect critical values and wrong conclusions.
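Both df rules, plus the expected-count formula for a test of independence (row total × column total ÷ n), can be sketched as follows. The 2×2 table in the example is made up:

```python
def independence_expected(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

table = [[30, 20], [20, 30]]
exp = independence_expected(table)  # [[25.0, 25.0], [25.0, 25.0]]
chi2 = sum((o - e) ** 2 / e
           for row_o, row_e in zip(table, exp)
           for o, e in zip(row_o, row_e))
df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1) = 1
print(round(chi2, 3), df)  # 4.0 1
```

With df = 1 the α = 0.05 critical value is 3.841, so χ² = 4.0 would (just) reject independence for this made-up table.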
The chi-square test assumes that each observation is independent of every other observation. Repeated measures, paired data, or clustered samples violate this assumption and require alternative tests such as McNemar's test.
The chi-square test is used to determine whether there is a statistically significant difference between observed frequencies and expected frequencies in categorical data. It is commonly applied in goodness of fit tests and tests of independence between two categorical variables.
The chi-square statistic is calculated using the formula: chi-square equals the sum of (observed minus expected) squared divided by expected, for each category. You subtract each expected value from its observed value, square the difference, divide by the expected value, then sum all contributions.
Degrees of freedom (df) depend on the type of test. For a goodness of fit test, df equals the number of categories minus one (k minus 1). For a test of independence, df equals (number of rows minus 1) multiplied by (number of columns minus 1).
A goodness of fit test compares observed frequencies in a single categorical variable against expected frequencies from a theoretical distribution. A test of independence examines whether two categorical variables are related by analyzing a contingency table of observed versus expected counts.
You should not use a chi-square test when expected frequencies in any cell are less than 5, when the data are continuous rather than categorical, when observations are not independent, or when the sample size is very small. In these cases, Fisher's exact test or other methods may be more appropriate.
A large chi-square value indicates a greater discrepancy between observed and expected frequencies. The larger the chi-square statistic, the less likely the observed distribution occurred by chance, and the more evidence there is to reject the null hypothesis.
To find the p-value, you compare the calculated chi-square statistic against a chi-square distribution table using the appropriate degrees of freedom. If the statistic exceeds the critical value at your significance level (typically 0.05), you reject the null hypothesis. Statistical software can compute exact p-values.
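The table-lookup decision step is easy to automate. The dictionary below copies the α = 0.05 column of the critical-value table earlier on this page (df = 1 to 10 only, as an illustration):

```python
# Right-tail chi-square critical values at alpha = 0.05,
# copied from the reference table above.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
               6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307}

def reject_h0(chi2, df):
    """True when the statistic exceeds the critical value: reject H0 at 0.05."""
    return chi2 > CRITICAL_05[df]

print(reject_h0(12.5, df=3))  # True  (product preference example)
print(reject_h0(3.4, df=5))   # False (die example)
```

For an exact p-value rather than a reject/fail-to-reject decision, statistical software evaluates the chi-square survival function (e.g. `scipy.stats.chi2.sf(chi2, df)` in SciPy).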
No, chi-square values can never be negative. Since each contribution is calculated as (observed minus expected) squared divided by expected, and squaring always produces a non-negative number while expected values are always positive, every contribution and the overall statistic are always zero or positive.
Yates correction is an adjustment applied to chi-square tests for 2×2 contingency tables. It subtracts 0.5 from the absolute difference between observed and expected values before squaring. This correction reduces the chi-square value and provides a more conservative test when sample sizes are small.
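A sketch of the corrected statistic, given observed and expected counts for a 2×2 table (the table in the example is made up; its expected counts follow from its row and column totals):

```python
def chi_square_yates(observed, expected):
    """Chi-square with Yates continuity correction for a 2x2 table:
    each contribution is (|O - E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e
               for row_o, row_e in zip(observed, expected)
               for o, e in zip(row_o, row_e))

observed = [[10, 20], [15, 5]]
expected = [[15, 15], [10, 10]]  # row total * column total / 50
print(round(chi_square_yates(observed, expected), 2))  # 6.75
```

The uncorrected statistic for the same table is about 8.33, so the correction shaves the value down, as the answer above describes.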
You need at least two categories to perform a chi-square test. For a goodness of fit test, you need at least two categories in a single variable. For a test of independence, you need at least a 2×2 contingency table with two categories in each of two variables.
This chi-square calculator is provided for educational and informational purposes only. Results should be verified with professional statistical software for critical research decisions. UnitTables is not responsible for errors resulting from the use of this tool.