Calculate p-values from z-scores and test statistics for hypothesis testing. Determine statistical significance for one-tailed and two-tailed tests instantly.
This reference table shows commonly used z-scores and their corresponding p-values for both one-tailed and two-tailed hypothesis tests. These values are derived from the standard normal distribution.
| Z-Score | One-Tailed P-Value | Two-Tailed P-Value | Significance |
|---|---|---|---|
| 0.00 | 0.5000 | 1.0000 | Not significant |
| 0.50 | 0.3085 | 0.6171 | Not significant |
| 0.84 | 0.2005 | 0.4009 | Not significant |
| 1.00 | 0.1587 | 0.3173 | Not significant |
| 1.28 | 0.1003 | 0.2005 | Not significant |
| 1.50 | 0.0668 | 0.1336 | Not significant |
| 1.645 | 0.0500 | 0.1000 | Significant (one-tailed at 0.05) |
| 1.96 | 0.0250 | 0.0500 | Significant (two-tailed at 0.05) |
| 2.00 | 0.0228 | 0.0455 | Significant at 0.05 |
| 2.326 | 0.0100 | 0.0200 | Significant at 0.01 (one-tailed) |
| 2.50 | 0.0062 | 0.0124 | Significant at 0.01 (one-tailed) |
| 2.576 | 0.0050 | 0.0100 | Significant at 0.01 (two-tailed) |
| 3.00 | 0.0013 | 0.0027 | Highly significant |
| 3.291 | 0.0005 | 0.0010 | Significant at 0.001 (two-tailed) |
| 3.50 | 0.0002 | 0.0005 | Highly significant |
| 3.891 | 0.00005 | 0.0001 | Extremely significant |
| 4.00 | 0.00003 | 0.00006 | Extremely significant |
| 5.00 | < 0.000001 | < 0.000001 | Extremely significant (≈ 5σ) |
A p-value (probability value) is one of the most widely used concepts in statistical hypothesis testing. It quantifies the strength of evidence against a null hypothesis by measuring the probability of observing data as extreme as, or more extreme than, the results actually obtained, assuming the null hypothesis is true. The null hypothesis typically represents a default position—for example, that a new drug has no effect, or that two groups have equal means.
When you perform a hypothesis test, you calculate a test statistic (such as a z-score or t-statistic) from your data. This test statistic measures how far your observed results deviate from what the null hypothesis predicts. The p-value then translates this test statistic into a probability using a known distribution, most commonly the standard normal distribution for z-tests. A z-score of 1.96, for instance, corresponds to a two-tailed p-value of approximately 0.05, meaning there is only a 5% chance of seeing results this extreme if the null hypothesis is true.
Researchers compare the p-value to a predetermined significance level, denoted as alpha. The most common alpha value is 0.05, though 0.01 and 0.001 are also widely used depending on the field and the consequences of making errors. If the p-value falls below alpha, the result is declared "statistically significant," and the null hypothesis is rejected. If the p-value exceeds alpha, the null hypothesis is not rejected—though this does not prove the null hypothesis is true.
It is critical to understand what a p-value is not. A p-value is not the probability that the null hypothesis is true. It is not the probability that the results occurred by chance. And a statistically significant p-value does not necessarily imply practical importance. A study with a very large sample size can produce a significant p-value for a trivially small effect. For this reason, modern statistical practice recommends reporting p-values alongside effect sizes and confidence intervals to give a complete picture of the findings.
Two-tailed: p = 2 × (1 − Φ(|z|))
Right-tailed: p = 1 − Φ(z)
Left-tailed: p = Φ(z)
Where Φ(z) is the standard normal cumulative distribution function (CDF), which gives the probability that a standard normal random variable is less than or equal to z.
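These three formulas can be sketched directly in Python using the standard library's `statistics.NormalDist`, whose `cdf` method is Φ. The helper function name `p_value` is my own, not from the source:

```python
from statistics import NormalDist

def p_value(z: float, tail: str = "two") -> float:
    """Convert a z-score to a p-value under the standard normal distribution.

    tail: "two" (two-tailed), "right" (right-tailed), or "left" (left-tailed).
    """
    cdf = NormalDist().cdf  # Φ(z), the standard normal CDF
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # p = 2 × (1 − Φ(|z|))
    if tail == "right":
        return 1 - cdf(z)             # p = 1 − Φ(z)
    if tail == "left":
        return cdf(z)                 # p = Φ(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")

# Spot-check against the reference table
print(round(p_value(1.96, "two"), 4))    # 0.05
print(round(p_value(1.645, "right"), 4)) # 0.05
```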
Problem: A clinical trial comparing a new drug to placebo yields a z-score of 2.50. Is the result significant at the 0.05 level using a two-tailed test?
Problem: A manufacturing process improvement is tested. The hypothesis is that the new process increases yield. The z-score is 1.80. Is this significant at 0.05 using a right-tailed test?
Problem: A company tests whether a new supplier reduces costs. The z-score is −2.10. Is this significant at 0.01 using a left-tailed test?
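The three problems above can be worked through with a short script; this is a sketch using the standard library's `statistics.NormalDist` for the normal CDF:

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal CDF, Φ(z)

# Clinical trial: z = 2.50, two-tailed test at α = 0.05
p1 = 2 * (1 - phi(2.50))
print(f"p = {p1:.4f}, significant at 0.05: {p1 < 0.05}")  # p = 0.0124, significant at 0.05: True

# Manufacturing yield: z = 1.80, right-tailed test at α = 0.05
p2 = 1 - phi(1.80)
print(f"p = {p2:.4f}, significant at 0.05: {p2 < 0.05}")  # p = 0.0359, significant at 0.05: True

# Supplier costs: z = -2.10, left-tailed test at α = 0.01
p3 = phi(-2.10)
print(f"p = {p3:.4f}, significant at 0.01: {p3 < 0.01}")  # p = 0.0179, significant at 0.01: False
```

Note the third result: z = −2.10 would be significant at the 0.05 level, but not at the stricter 0.01 level.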
For quick estimation: a z-score of 2 gives a two-tailed p-value near 0.05, a z-score of 2.6 gives roughly 0.01, and a z-score of 3.3 gives roughly 0.001. Each additional 0.6 to 0.7 in the z-score divides the two-tailed p-value by roughly 5 to 10.
| Significance Level (α) | One-Tailed Critical Z | Two-Tailed Critical Z | Confidence Level |
|---|---|---|---|
| 0.10 | 1.282 | 1.645 | 90% |
| 0.05 | 1.645 | 1.960 | 95% |
| 0.025 | 1.960 | 2.241 | 97.5% |
| 0.01 | 2.326 | 2.576 | 99% |
| 0.005 | 2.576 | 2.807 | 99.5% |
| 0.001 | 3.090 | 3.291 | 99.9% |
| 0.0001 | 3.719 | 3.891 | 99.99% |
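The critical values in this table can be reproduced with the inverse CDF (quantile function), available in the standard library as `NormalDist().inv_cdf`. A minimal sketch:

```python
from statistics import NormalDist

inv = NormalDist().inv_cdf  # Φ⁻¹, the standard normal quantile function

for alpha in (0.10, 0.05, 0.01, 0.001):
    one_tailed = inv(1 - alpha)      # all of α in one tail
    two_tailed = inv(1 - alpha / 2)  # α split between two tails
    print(f"α = {alpha}: one-tailed z = {one_tailed:.3f}, two-tailed z = {two_tailed:.3f}")
```

Running this recovers the familiar pairs, e.g. α = 0.05 gives 1.645 (one-tailed) and 1.960 (two-tailed).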
| Z-Score | Two-Tailed P | Z-Score | Two-Tailed P |
|---|---|---|---|
| 0.0 | 1.0000 | 2.2 | 0.0278 |
| 0.2 | 0.8415 | 2.4 | 0.0164 |
| 0.4 | 0.6892 | 2.6 | 0.0093 |
| 0.6 | 0.5485 | 2.8 | 0.0051 |
| 0.8 | 0.4237 | 3.0 | 0.0027 |
| 1.0 | 0.3173 | 3.2 | 0.0014 |
| 1.2 | 0.2301 | 3.4 | 0.0007 |
| 1.4 | 0.1615 | 3.6 | 0.0003 |
| 1.6 | 0.1096 | 3.8 | 0.0001 |
| 1.8 | 0.0719 | 4.0 | 0.00006 |
| 2.0 | 0.0455 | — | — |
| Z-Score | Left-Tailed P | Z-Score | Left-Tailed P |
|---|---|---|---|
| -0.5 | 0.3085 | -2.0 | 0.0228 |
| -1.0 | 0.1587 | -2.5 | 0.0062 |
| -1.28 | 0.1003 | -3.0 | 0.0013 |
| -1.645 | 0.0500 | -3.5 | 0.0002 |
| -1.96 | 0.0250 | -4.0 | 0.00003 |
P-values are the cornerstone of scientific hypothesis testing, helping researchers determine whether experimental results provide genuine evidence for new discoveries or are likely due to random chance.
Clinical trials rely on p-values to decide if treatments are effective. Regulatory agencies like the FDA require statistically significant results before approving new drugs, directly affecting patient care and public health.
Companies use p-values in A/B testing to determine if changes to websites, marketing campaigns, or products lead to real improvements in conversion rates, revenue, or user engagement.
Manufacturing and engineering use p-values for quality assurance, determining whether process changes actually improve output or if observed differences fall within normal variation.
Decide on your alpha level (0.05, 0.01, etc.) before running your analysis. Choosing the threshold after seeing the results is a form of p-hacking and invalidates the statistical test.
Instead of writing "p < 0.05," report the actual p-value (e.g., p = 0.032). This gives readers more information to assess the strength of evidence and allows for different interpretive frameworks.
A statistically significant result (small p-value) does not necessarily mean the effect is large or meaningful. With very large samples, even tiny, practically irrelevant effects can produce highly significant p-values. Always report effect sizes alongside p-values.
A p-value of 0.40 does not mean there is a 40% chance the null hypothesis is true. It means the data are consistent with the null hypothesis, but also consistent with many alternative hypotheses. Absence of evidence is not evidence of absence.
If you run 20 tests at alpha = 0.05, you expect one false positive by chance alone. Use Bonferroni correction (divide alpha by the number of tests) or false discovery rate (FDR) methods when performing multiple comparisons.
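A Bonferroni correction is a one-line adjustment; the p-values below are illustrative numbers, not real results:

```python
# Hypothetical p-values from 5 independent tests (illustrative only)
p_values = [0.003, 0.012, 0.021, 0.040, 0.310]
alpha = 0.05

# Bonferroni: compare each p-value to α / m instead of α
m = len(p_values)
bonferroni_alpha = alpha / m  # 0.05 / 5 = 0.01
significant = [p < bonferroni_alpha for p in p_values]
print(significant)  # [True, False, False, False, False]
```

Note that four of the five tests would have looked significant against the uncorrected α = 0.05, but only one survives the correction.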
Repeatedly checking p-values during data collection and stopping when significance is reached inflates the false positive rate dramatically. Determine your sample size in advance using a power analysis, and analyze the data only after collection is complete.
A p-value is the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true. It ranges from 0 to 1. A small p-value (typically less than 0.05) indicates strong evidence against the null hypothesis, leading researchers to reject it. A p-value does not measure the probability that the null hypothesis is true or false.
A one-tailed test checks for an effect in only one direction (greater than or less than), while a two-tailed test checks for an effect in either direction. For a symmetric distribution such as the standard normal, the two-tailed p-value is double the one-tailed p-value when the observed effect lies in the hypothesized direction. Use a one-tailed test only when you have a strong directional hypothesis specified before collecting data.
The 0.05 threshold was popularized by Ronald Fisher in the 1920s as a convenient convention, representing a 1-in-20 chance of a false positive. It is not a magical boundary. Different fields use different thresholds: particle physics requires p < 0.0000003 (5-sigma), while exploratory social science may accept p < 0.10. The appropriate threshold depends on the costs of false positives versus false negatives.
To convert a z-score to a p-value, you look up the z-score in a standard normal distribution table or use a cumulative distribution function (CDF). For a two-tailed test, the p-value equals 2 times (1 minus the CDF of the absolute z-score). For a one-tailed right test, p equals 1 minus the CDF. For a one-tailed left test, p equals the CDF directly.
A p-value of 0.01 means there is a 1% probability of observing results as extreme as or more extreme than the current results if the null hypothesis were true. It does not mean there is a 1% chance the null hypothesis is true. It indicates strong evidence against the null hypothesis and would be considered statistically significant at both the 0.05 and 0.01 significance levels.
In theory, a p-value can never be exactly zero because there is always some nonzero probability of observing any result under the null hypothesis. However, when p-values are extremely small (such as 10⁻¹⁵), software may round them to zero or display them in scientific notation. In practice, researchers report these as p < 0.001 rather than p = 0.
P-hacking refers to the practice of manipulating data analysis until a statistically significant p-value is found. This includes running multiple tests without correction, selectively reporting results, removing outliers post hoc, or stopping data collection once significance is reached. P-hacking inflates the false positive rate far beyond the nominal 5% and has contributed to the replication crisis in science.
Larger sample sizes produce smaller standard errors, which lead to larger test statistics for the same effect size, resulting in smaller p-values. With a very large sample, even trivially small effects can produce statistically significant p-values. Conversely, small samples may fail to detect meaningful effects. This is why researchers should report effect sizes alongside p-values for a complete picture.
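The sample-size effect is easy to see in a sketch of a one-sample z-test: hold a small effect fixed and watch the p-value shrink as n grows. The effect size (0.1) and standard deviation (1.0) here are illustrative assumptions:

```python
from math import sqrt
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal CDF

# Fixed, small true effect: mean difference 0.1 with population SD 1.0
effect, sd = 0.1, 1.0

for n in (25, 100, 400, 1600):
    z = effect / (sd / sqrt(n))   # test statistic grows with sqrt(n)
    p = 2 * (1 - phi(abs(z)))     # two-tailed p-value shrinks accordingly
    print(f"n = {n:5d}: z = {z:.2f}, p = {p:.4f}")
```

The same 0.1 difference is "not significant" at n = 25 (p ≈ 0.62) yet "extremely significant" at n = 1600 (p ≈ 0.00006), which is exactly why effect sizes must accompany p-values.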