Statistical Confidence Intervals

We will answer the questions:

  • What is a confidence interval?

  • How are confidence intervals related to effect sizes and p-values?

  • How are confidence intervals computed and used within classical statistics?

  • Interval of what? Common errors when interpreting confidence intervals.

What is a confidence interval?

Confidence intervals are often reported within classical statistics as a way to convey how much uncertainty the noise in your dataset leaves in an estimate.

 

Let's take an example:

Suppose you've collected data on a new analgesic drug, and you want to know if it is more effective than the pain medication being sold by your company's main competitor. In a thoroughly unethical experiment, you apply paper cuts to 60 of your staff, giving 30 of these your new drug and 30 your competitor's drug. 

The average analgesic effect (pain reduction) after taking the two drugs is 0.10 and 0.11 on a 10-point scale of self-reported pain.
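To make the example concrete, here is a minimal sketch of how an interval for the difference between the two average effects might be computed. It rests on several assumptions that are not part of the example above: the individual scores are simulated rather than real, the standard deviation of 1.0 is invented, the new drug is arbitrarily assigned the 0.10 mean, and a normal critical value is used in place of a t value, which makes the interval slightly too narrow with 30 people per group. The function name mean_diff_ci is hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def mean_diff_ci(a, b, level=0.95):
    """Approximate confidence interval for mean(a) - mean(b).

    Uses a normal critical value (large-sample approximation).
    """
    diff = statistics.mean(a) - statistics.mean(b)
    # Standard error of the difference between two independent sample means.
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


rng = random.Random(0)
# Hypothetical data: 30 scores per drug, with an assumed SD of 1.0.
new_drug = [rng.gauss(0.10, 1.0) for _ in range(30)]
competitor = [rng.gauss(0.11, 1.0) for _ in range(30)]

low, high = mean_diff_ci(new_drug, competitor)
print(f"95% CI for the difference in means: [{low:.2f}, {high:.2f}]")
```

The interval is centered on the observed difference in sample means, and its width is set by the standard error, so noisier data or fewer participants give a wider interval.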

How are confidence intervals related to effect sizes and p-values?

Using a significance threshold (the α level that p-values are compared against) creates a situation where the effect sizes that reach significance are, on average, overestimated whenever power is less than 100%.

This is because noisy data will sometimes yield measured effect sizes that are higher than the true effect size and sometimes lower. When a measured effect size is low, it may fail to reach significance and be filtered out, but this does not happen when it is high. Thus, among the results that pass the threshold, the expected value of the measured effect size is higher than the true effect size.
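A small simulation can illustrate this filtering effect. The sketch below is not taken from the example above: the true effect of 0.3, the standard deviation of 1.0, the group size of 30, and the function name simulate_filter are all assumptions chosen for illustration. The effect size here is the raw difference in means, and significance is judged with a two-sided z-test approximation.

```python
import random
import statistics
from statistics import NormalDist


def simulate_filter(true_effect=0.3, sd=1.0, n=30, alpha=0.05, runs=2000, seed=1):
    """Repeat a two-group experiment and compare all measured effects
    with only those that reach significance."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    all_effects, significant = [], []
    for _ in range(runs):
        treated = [rng.gauss(true_effect, sd) for _ in range(n)]
        control = [rng.gauss(0.0, sd) for _ in range(n)]
        diff = statistics.mean(treated) - statistics.mean(control)
        se = (statistics.variance(treated) / n + statistics.variance(control) / n) ** 0.5
        all_effects.append(diff)
        # Keep only experiments whose test statistic passes the threshold.
        if abs(diff) / se > z_crit:
            significant.append(diff)
    power = len(significant) / runs
    return statistics.mean(all_effects), statistics.mean(significant), power


mean_all, mean_sig, power = simulate_filter()
print(f"mean effect, all experiments:      {mean_all:.2f}")
print(f"mean effect, significant only:     {mean_sig:.2f}")
print(f"fraction significant (est. power): {power:.2f}")
```

Under these assumed settings power is well below 100%, so the average over all experiments stays close to the true effect while the average over the significant ones is noticeably larger.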

How are confidence intervals computed and used within classical statistics?

Our goal in analyzing data is usually to compare hypotheses. In the neural and behavioral sciences, we want to know if experimental data contain evidence to support one hypothesis regarding the causes of behavior over one or more alternative hypotheses regarding the same behavior.

What added benefit do confidence intervals provide for your analysis?
