statistics bell curve
Image from Lynda.com

In my post praising the value of a statistics class, I promised to discuss the inner workings of hypothesis testing in more detail. This post is the first of four that will appear each Monday in May. The series walks through the main steps in hypothesis testing, which are:

  1. Statement of Claim 1, called the null hypothesis (H_0), and Claim 2, called the alternative hypothesis (H_a).
  2. Collection and summary of relevant data from a random sample, using a test statistic.
  3. Assessment of the likelihood of observing data like those in the sample if H_0 were true, using the p-value.
  4. Determination of whether or not the evidence is strong enough to reject H_0.

Hypothesis testing follows a process similar to the scientific method, wherein you (and I paraphrase “Ready, Jet, Go!”, a PBS show that my younger brother loves) ask a question based on observation, make a hypothesis, test your hypothesis, observe what happens, and draw conclusions.

Hypothesis testing helps statisticians evaluate whether a pattern in the data is statistically significant. For example, if one year the approval rating for a political nominee was 56% and the sponsors wanted to determine whether approval had since increased, they might conduct a hypothesis test to 1) measure the change, if any, in approval, and 2) calculate, using a test statistic and p-value, whether or not the change is significant (i.e., whether it is larger than random sampling variation alone would plausibly produce).
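To make the approval-rating example concrete, here is a minimal sketch of the four steps as a one-proportion z-test. The sample size and the number of approvals are made up for illustration, the 0.05 significance level is an assumed convention, and the normal approximation is only one reasonable way to run such a test.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Step 1: state the hypotheses.
# H_0: p = 0.56 (approval is unchanged); H_a: p > 0.56 (approval increased).
p0 = 0.56

# Step 2: summarize the sample with a test statistic.
# HYPOTHETICAL data: 590 of 1,000 randomly sampled respondents approve.
n = 1000
approvals = 590
p_hat = approvals / n
standard_error = math.sqrt(p0 * (1 - p0) / n)  # computed under H_0
z = (p_hat - p0) / standard_error

# Step 3: the p-value is the chance of a z at least this large if H_0 were true.
p_value = 1.0 - normal_cdf(z)

# Step 4: compare the p-value with a significance level chosen in advance.
alpha = 0.05  # assumed, conventional threshold
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H_0: {reject_h0}")
```

With these invented numbers, the sample approval of 59% sits about 1.9 standard errors above 56%, so the test would reject H_0 at the 0.05 level. A smaller sample with the same 59% could easily fail to do so.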

statistics and jelly beans
Image from Unbiased Research

For the purposes of this series, I will use an actual study to demonstrate hypothesis testing. Concrete examples make much better explanations than abstractions.

The study in question is the “Meta-Analysis of Intellectual and Neuropsychological Test Performance in Attention-Deficit/Hyperactivity Disorder”, a publication of the American Psychological Association in the journal Neuropsychology. The abstract states, “In this meta-analytic review, the authors sought to examine the magnitude of differences between ADHD and healthy participants on several commonly used intellectual and neuropsychological measures.”

The hypotheses posited may differ depending on the type of question. This study compares the means of a quantitative variable between two groups, namely the mean test scores on various assessments for the healthy control group and the ADHD (“experimental”) group, by combining data from multiple studies. (Note: There are two types of data: qualitative and quantitative. Generally, the former describes categories or qualities that you can’t measure numerically; the latter consists of numerical measurements.)

ADHD word-cloud
Image from Science-based Medicine

Any statistical study starts with hypotheses, whether or not the researchers explicitly express them. The null hypothesis, H_0, basically says that no special relationship exists. In the ADHD study, the null hypothesis is:

H_0: \mu = \mu_0 ,

where \mu is the mean measure of subjects with ADHD on a given cognitive assessment and \mu_0 is the mean measure of subjects without ADHD on the same assessment. This null hypothesis claims that there is no difference between the means of these two study groups.

Some studies have a fairly straightforward alternative hypothesis, in which H_a takes one of three forms:

H_a: \mu > \mu_0
H_a: \mu < \mu_0
H_a: \mu \neq \mu_0
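The form of H_a determines which tail of the distribution counts as evidence against H_0. The sketch below shows that mapping for a test statistic assumed to follow a standard normal distribution; the function name `p_value_from_z` and the labels "greater", "less", and "two-sided" are hypothetical names chosen for this illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value_from_z(z: float, alternative: str) -> float:
    """Return the p-value for a z statistic under the given form of H_a.

    Assumes the test statistic is (approximately) standard normal under H_0.
    """
    if alternative == "greater":      # H_a: mu > mu_0, upper tail
        return 1.0 - normal_cdf(z)
    if alternative == "less":         # H_a: mu < mu_0, lower tail
        return normal_cdf(z)
    if alternative == "two-sided":    # H_a: mu != mu_0, both tails
        return 2.0 * (1.0 - normal_cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


# The same test statistic gives different p-values under each form of H_a.
for form in ("greater", "less", "two-sided"):
    print(form, round(p_value_from_z(-1.5, form), 4))
```

A negative test statistic supports "less" but counts against "greater", which is why the direction of H_a should be chosen before looking at the data.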

In this study, the researchers propose four alternative hypotheses, which is not uncommon in such research. We will focus on part of the first one and the whole of the second one, which are:

  1. H_a: \mu_{FSIQ} < \mu_{0,FSIQ}
  2. H_a: \mu_{total} < \mu_{0,total}

The first H_a suggests that the overall cognitive ability of ADHD subjects, as measured by Full Scale IQ, is less than the overall cognitive ability of healthy subjects.

The second H_a suggests that all of the cognitive and neuropsychological skills measured are weaker for ADHD subjects than for healthy subjects.

In Pt. 2, we will synthesize and summarize the data using test statistics, which will help us gauge its significance.
