Statistical significance
Statistical significance is a central concept in data analysis and hypothesis testing. A result is called statistically significant when it would be unlikely to arise from chance or random variation alone if there were no real effect or difference in the population (the null hypothesis). It is assessed by calculating a statistic (e.g., a mean difference or a correlation coefficient) and then determining the probability of obtaining a value at least that extreme, assuming that there is no real effect or difference in the population. If that probability is smaller than a threshold chosen in advance (usually 0.05), the statistic is said to be statistically significant. This probability is not the probability that the effect is real; it describes how surprising the data would be if the null hypothesis were true.
Example of Statistical significance
To illustrate the concept of statistical significance, consider a study that compares the average heights of men and women sampled from a population. The sample mean difference between the heights of men and women is calculated and found to be 2.3 cm. The probability of obtaining a mean difference at least this large, given that there is no real difference in the population, is then calculated and found to be 0.002. Because this probability is well below the usual threshold of 0.05, the mean difference between the heights of men and women is said to be statistically significant.
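One way to make this kind of calculation concrete is a permutation test, sketched below. The height values are invented for illustration and are not the data behind the 2.3 cm figure above; the function name `permutation_p_value` is likewise hypothetical. The idea is that if group labels do not matter, shuffling them should often produce a difference as large as the observed one.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random, as if the null hypothesis held.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical heights in cm, invented for illustration only.
men = [178.1, 175.4, 181.0, 169.8, 176.5, 183.2, 174.9, 179.6]
women = [165.2, 162.8, 170.1, 158.9, 166.4, 163.7, 168.0, 161.5]

p_value = permutation_p_value(men, women)
print(f"mean difference = {mean(men) - mean(women):.1f} cm, p = {p_value:.4f}")
```

A permutation test is only one option; a t-test would usually be applied to data like these, and the two approaches generally agree when the sample is not too small.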
Formula of Statistical significance
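For a two-sided test, the probability of a statistic given that there is no real effect or difference in the population can be written as:

$$p = P\left(|T| \ge |t_{\text{obs}}| \;\middle|\; H_0\right)$$

where $T$ is the test statistic, $t_{\text{obs}}$ is its observed value, and $H_0$ is the null hypothesis of no effect or difference. For example, if you measure the mean difference between two groups, you can calculate the probability of obtaining a mean difference at least that large given that there is no real difference in the population. This probability is known as the p-value.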
Calculating statistical significance involves determining the probability of obtaining a statistic at least as extreme as the one observed, given that there is no real effect or difference in the population. This probability is calculated by comparing the statistic to its probability distribution under the null hypothesis (e.g., a normal distribution or a t-distribution). The smaller this probability, the stronger the evidence against the null hypothesis.
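As a minimal sketch of comparing a statistic to a reference distribution, the snippet below converts a z-statistic into a two-sided p-value using the standard normal distribution from Python's standard library. It assumes the statistic is approximately normal under the null hypothesis; the function name `two_sided_p_from_z` is illustrative.

```python
from statistics import NormalDist


def two_sided_p_from_z(z):
    """Return P(|Z| >= |z|) for a standard normal variable Z."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Larger |z| means the statistic lies further in the tails, so p is smaller.
for z in (0.5, 1.96, 3.0):
    print(f"z = {z:.2f} -> p = {two_sided_p_from_z(z):.4f}")
```

With small samples, a t-distribution would normally replace the normal distribution here, which gives somewhat larger p-values for the same statistic.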
There are several methods for calculating statistical significance, depending on the type of data being analyzed.
- For categorical data, the chi-square test is typically used. This test compares the observed frequencies of different categories to the expected frequencies, and determines the probability of obtaining the observed frequencies given that there is no real effect or difference in the population.
- For continuous data (e.g., measurements or scores), the t-test is commonly used. This test compares the mean of a sample to a hypothesized population mean, or the means of two samples to each other, and determines the probability of obtaining a difference at least as large as the one observed given that there is no real effect or difference in the population.
- For correlation data, a significance test of the Pearson correlation coefficient is used. The coefficient measures the strength of a linear relationship between two variables, and the test determines the probability of obtaining a correlation at least as strong as the one observed given that there is no real linear relationship in the population.
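To illustrate the first of these methods, here is a hedged sketch of a chi-square test for a 2x2 table of counts, written with the standard library only. The counts are hypothetical, no continuity correction is applied, and the p-value shortcut via `math.erfc` is valid only for one degree of freedom, which is the 2x2 case.

```python
import math


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows are two groups, columns are two outcomes.
chi2, p_value = chi_square_2x2([[30, 20], [20, 30]])
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

For larger tables, or when expected counts are small, a statistics library or an exact test would be the safer choice.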
When to use Statistical significance
Statistical significance is used primarily in hypothesis testing and data analysis. It is used to judge whether the results of a study could plausibly have occurred by chance alone. It indicates how strong the evidence against the null hypothesis is, but not how large or important the effect or difference is.
Types of Statistical significance
Statistical significance can be determined in a variety of ways. The most commonly used methods are the following:
- P-value: A P-value is the probability of obtaining a statistic at least as extreme as the one observed, given that there is no real effect or difference in the population. If the P-value is below a threshold chosen in advance (usually 0.05), then the statistic is said to be statistically significant.
- Confidence Interval: A confidence interval is a range of values within which the true population value is likely to lie. A confidence interval is usually calculated from a sample statistic, and if the confidence interval does not include the value 0 (or the value of any other null hypothesis), then the statistic is said to be statistically significant.
- Power: Power is the probability of correctly rejecting the null hypothesis when it is false. Power is used to determine the sample size needed to detect an effect or difference of a given magnitude, given the level of statistical significance that is desired.
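The link between a confidence interval and significance can be sketched as follows. The estimate of 2.3 cm is taken from the height example above, while the standard error of 0.75 cm is an assumed value chosen only for illustration. The interval uses a normal approximation, and the helper name `normal_confidence_interval` is hypothetical.

```python
from statistics import NormalDist


def normal_confidence_interval(estimate, standard_error, confidence=0.95):
    """Return (low, high) for a normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * standard_error
    return estimate - margin, estimate + margin


# 2.3 cm comes from the example above; 0.75 cm is an assumed standard error.
low, high = normal_confidence_interval(2.3, 0.75)
excludes_zero = low > 0 or high < 0
print(f"95% CI: ({low:.2f}, {high:.2f}); excludes 0: {excludes_zero}")
```

A 95% interval that excludes 0 corresponds to a two-sided p-value below 0.05 for the same estimate and standard error, so the two criteria lead to the same decision.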
Steps of Statistical significance
There are several steps to determining statistical significance:
- State the hypothesis: the null hypothesis (no effect or difference) and the alternative hypothesis must be clearly stated and must reflect the question being asked.
- Calculate the statistic: the statistic (e.g., a mean difference or correlation coefficient) must be calculated from the data.
- Determine the probability of obtaining the statistic: the probability of obtaining a statistic at least as extreme as the one observed must be calculated, given that there is no real effect or difference in the population.
- Compare the probability to the significance level: the probability must be compared to the predetermined level of significance, usually 0.05.
- Draw conclusions: if the probability is smaller than the predetermined level of significance, then the result is said to be statistically significant. If the probability is larger than the predetermined level of significance, then the result is said to be not statistically significant.
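The steps above can be sketched with a small exact binomial test. The scenario (16 heads in 20 coin flips) and the function name are hypothetical; the two-sided p-value sums the probabilities of all outcomes that are no more likely than the observed one under the null hypothesis.

```python
from math import comb


def exact_binomial_p_value(successes, trials, p_null=0.5):
    """Two-sided exact binomial p-value under the null probability p_null."""
    def pmf(k):
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # The small relative tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))


# Step 1: state the hypotheses. H0: the coin is fair (p = 0.5); H1: it is not.
alpha = 0.05  # significance level, fixed before looking at the data

# Step 2: calculate the statistic (here, the number of heads).
heads, flips = 16, 20

# Step 3: probability of a result at least this extreme under H0.
p_value = exact_binomial_p_value(heads, flips)

# Steps 4 and 5: compare with the significance level and conclude.
significant = p_value < alpha
print(f"p = {p_value:.4f}, statistically significant: {significant}")
```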
Advantages of Statistical significance
- Statistical significance can help to identify relationships or effects in data that are unlikely to be due to chance. By calculating the probability of obtaining a result at least as extreme as the one observed under the null hypothesis, it is possible to judge whether the result reflects more than random variation. This can be helpful in interpreting data and making decisions.
- Statistical significance also indicates the strength of the evidence against the null hypothesis. The lower the probability of obtaining a result under the null hypothesis, the less compatible the data are with the absence of an effect. It does not by itself measure the strength or importance of the effect, because the probability also depends on the sample size.
- Statistical significance can also be used to plan the sample size needed to detect a given effect or difference. By calculating the probability of detecting an effect of a given size at a given sample size and significance level (the power), it is possible to determine the sample size needed to detect the effect or difference.
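A rough sketch of such a sample size calculation is shown below for comparing two group means. It relies on a normal approximation with a known, common standard deviation, which is an assumption; the effect size, standard deviation, and function name are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)


# Hypothetical target: detect a difference of half a standard deviation.
n = sample_size_per_group(effect=0.5, sd=1.0)
print(f"about {n} participants per group")
```

Calculations based on the t-distribution give slightly larger values, so this figure is best read as a lower bound.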
Limitations of Statistical significance
Despite its widespread use, statistical significance has several limitations.
- First, it does not tell us anything about the size or magnitude of an effect or difference. For example, two means may be statistically different, but the difference may be quite small and not meaningful in practice.
- Second, it does not tell us anything about the practical significance of the effect or difference. For example, a correlation coefficient may be statistically significant, but the practical implications may be small.
- Third, it does not take into account other factors that may be important in interpreting the results. For example, a statistically significant effect may be due to an uncontrolled variable.
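The first two limitations can be illustrated numerically. In the sketch below, the mean difference (0.1) and standard deviation (10) are hypothetical and held fixed, so the standardized effect size stays tiny, yet the p-value drops below 0.05 once the sample is large enough. A normal approximation with equal group sizes is assumed.

```python
import math
from statistics import NormalDist


def z_test_from_summary(mean_diff, sd, n_per_group):
    """Two-sided z-test for a difference in means from summary statistics."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


mean_diff, sd = 0.1, 10.0      # hypothetical values
cohens_d = mean_diff / sd      # standardized effect size, unchanged by n

for n_per_group in (100, 10_000, 1_000_000):
    z, p_value = z_test_from_summary(mean_diff, sd, n_per_group)
    print(f"n = {n_per_group:>9,}: d = {cohens_d:.2f}, z = {z:.2f}, p = {p_value:.3g}")
```

The effect size is the same in every row; only the sample size, and therefore the p-value, changes.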
In addition to statistical significance, there are other approaches that can be used to evaluate the meaningfulness of results. These include effect sizes, confidence intervals, and Bayesian statistics.
- Effect Sizes: Effect sizes measure the magnitude of a difference or an effect. They compare the size of the effect or difference to the variability in the sample or population. Common effect sizes are Cohen's d, r, and Phi.
- Confidence Intervals: Confidence intervals are used to quantify the uncertainty around an estimate. They provide an interval around a statistic that is likely to contain the true population value a certain percentage of the time. The wider the interval, the less certain we are about the true population value.
- Bayesian Statistics: Bayesian statistics update probabilities in light of new evidence. Applied to data analysis, they combine a prior distribution with the observed data to give a posterior probability that an effect or difference exceeds a given size.
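As a small illustration of the Bayesian approach, the sketch below updates a uniform Beta(1, 1) prior on a success probability with hypothetical data (60 successes in 100 trials) and estimates the posterior probability that the true rate exceeds 0.5 by simulation. The prior, the data, and the function name are all assumptions made for this example.

```python
import random


def posterior_prob_above(successes, trials, threshold=0.5,
                         prior_a=1.0, prior_b=1.0, n_draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate > threshold | data) under a Beta prior."""
    rng = random.Random(seed)
    # Conjugate update: Beta prior + binomial data gives a Beta posterior.
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    draws = (rng.betavariate(post_a, post_b) for _ in range(n_draws))
    return sum(draw > threshold for draw in draws) / n_draws


# Hypothetical data: 60 successes in 100 trials, uniform prior.
probability = posterior_prob_above(60, 100)
print(f"P(rate > 0.5 | data) is approximately {probability:.3f}")
```

Unlike a p-value, this quantity is a direct probability statement about the parameter, but it depends on the choice of prior.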