This post is the second of four that will appear each Monday in May. The series walks through the main steps in hypothesis testing, which are:

  1. Statement of Claim 1, called the null hypothesis (H_0), and Claim 2, called the alternative hypothesis (H_a).
  2. Collection and summary of relevant data from a random sample, using a test statistic.
  3. Assessment of how likely the observed data would be if H_0 were true, using the p-value.
  4. Determination of whether or not the evidence is strong enough to reject H_0.

In the first part, we stated the relevant claims of the case study from the American Psychological Association called “Meta-Analysis of Intellectual and Neuropsychological Test Performance in Attention-Deficit/Hyperactivity Disorder”, which sought to test the null hypothesis that ADHD had no effect on cognitive performance against four alternative hypotheses. We focus on the first part of the first alternative hypothesis and the entirety of the second alternative hypothesis:

  1. H_a: \mu_{FSIQ} < \mu_{0(FSIQ)}
  2. H_a: \mu_{total} < \mu_{0(total)}

where \mu is the mean assessment score for subjects with ADHD and \mu_0 is the mean assessment score for subjects without ADHD.

With Step 1 of hypothesis testing complete, we can move on to Step 2, in which we collect and summarize the relevant data. Under the “Method” section, the authors explain that they extracted their data from the PsycINFO and MEDLINE bibliographic databases. They collected studies that reported data for overall cognitive ability (i.e., FSIQ) and whose subjects had no significant non-ADHD conditions or pathology.

In addition to comparisons concerning mean scores on cognitive and neuropsychological skills, the researchers also examined effect size, but introductory statistics courses do not often mention effect size, so we shall not dawdle with that business. (If you are curious about what the effect size, expressed by Q in the report, measures, this paper is very helpful.)

After you collect data, you can apply a test statistic. Test statistics compare the observed data to the data expected under the null hypothesis. For our study, a test statistic would compare the observed difference between the mean cognitive and neuropsychological scores of ADHD and non-ADHD subjects to the difference that the null hypothesis predicts, i.e., no difference.

The “Meta-Analysis” research for the two alternative hypotheses with which we are concerned (pp. 549-550, first two sub-sections of “Results”) uses the z-statistic. A z-test is appropriate either for data that follow a normal (bell-shaped) distribution, regardless of sample size, or for data with a large enough sample size (n > 30), regardless of distribution. Furthermore, the z-test only works if the standard deviation of the population that the sample represents is known.
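To make those conditions concrete, here is a minimal sketch in Python that encodes them as a yes/no check. The function name `z_test_applicable` and its parameters are hypothetical, invented for this illustration, and the n > 30 cutoff is only the rule of thumb quoted above, not a strict law.

```python
def z_test_applicable(n, approx_normal, sigma_known, large_n_cutoff=30):
    """Rule-of-thumb check for whether a z-test is reasonable.

    n              -- sample size
    approx_normal  -- True if the data look roughly bell-shaped
    sigma_known    -- True if the population standard deviation is known
    large_n_cutoff -- assumed rule-of-thumb threshold (n must exceed it)
    """
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    # Either normality or a large sample satisfies the shape condition.
    shape_ok = approx_normal or n > large_n_cutoff
    # The population standard deviation must be known in every case.
    return shape_ok and sigma_known


# Hypothetical scenarios, for illustration only:
print(z_test_applicable(n=12, approx_normal=True, sigma_known=True))
print(z_test_applicable(n=12, approx_normal=False, sigma_known=True))
print(z_test_applicable(n=200, approx_normal=False, sigma_known=False))
```

Only the first scenario passes: the second fails the shape condition, and the third fails because the population standard deviation is unknown.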

(For an overview of what the standard deviation represents, I recommend this article.)

bell curve
Image from Illinois State

In most cases, statisticians must use t-tests instead of z-tests because the population standard deviation is unknown. T-tests are similar to z-tests in that they demand a normal distribution or a large sample size, but differ in that they can work with the sample standard deviation in the absence of a population standard deviation. As a result, the bell curve for a t-distribution is generally more spread out than that for a z-distribution, especially for small samples.
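For comparison, here is a rough sketch of how a one-sample t-statistic could be computed when only the sample standard deviation is available. The function name `t_statistic` and the scores are made up for illustration; nothing here comes from the meta-analysis.

```python
import math
import statistics


def t_statistic(sample, null_mean):
    """Return (t, degrees_of_freedom) for a one-sample t-test.

    The sample standard deviation (n - 1 denominator) stands in for
    the unknown population standard deviation.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # sample standard deviation
    if s == 0:
        raise ValueError("sample has no variability")
    standard_error = s / math.sqrt(n)
    t = (statistics.mean(sample) - null_mean) / standard_error
    return t, n - 1


# Hypothetical assessment scores, for illustration only:
scores = [94, 101, 88, 97, 105, 92, 99, 90]
t, df = t_statistic(scores, null_mean=100)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The arithmetic mirrors the z-statistic; the only change is that the spread is estimated from the sample itself, which is why the result is compared against a t-distribution with n - 1 degrees of freedom.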

For this study, z-tests are available because standardized tests and other formal assessments often have a known population standard deviation.

A z-statistic expresses how many standard errors the observed sample mean \overline{x} lies from the mean \mu_0 expected under the null hypothesis. The formula for the z-statistic is z = \frac{\overline{x} - \mu_0}{\frac{\sigma}{\sqrt{n}}}.

Here, \overline{x} is the mean score of the ADHD subjects, \mu_0 is the mean score of the non-ADHD subjects, \sigma is the population standard deviation, and n is the sample size, which here is the number of studies. This research uses a total of n = 137 studies.
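As a minimal sketch, the z-statistic formula translates into a few lines of Python. The function name `z_statistic` and the numbers plugged in are hypothetical, chosen only to make the arithmetic easy to follow; they are not values from the meta-analysis.

```python
import math


def z_statistic(sample_mean, null_mean, population_sd, n):
    """Return z = (x_bar - mu_0) / (sigma / sqrt(n))."""
    if n <= 0:
        raise ValueError("n must be positive")
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    # Standard error: the spread of the sample mean across repeated samples.
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Hypothetical numbers, for illustration only:
z = z_statistic(sample_mean=97.0, null_mean=100.0, population_sd=15.0, n=36)
print(round(z, 2))  # -1.2
```

Here the standard error is 15 / 6 = 2.5, so a sample mean 3 points below \mu_0 sits 1.2 standard errors below it. Under this sign convention, a sample mean below \mu_0 yields a negative z.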

The first alternative hypothesis predicts that the overall cognitive ability of ADHD participants is less than the overall cognitive ability of non-ADHD participants, as measured by the Full Scale IQ. According to the z-tests run, z = 27.72 for the FSIQ scores of the ADHD groups relative to the control groups. This means that the average FSIQ score observed for the ADHD groups differed from the average score of the control groups by 27.72 standard errors, that is, standard deviations of the sampling distribution of the mean.

The second alternative hypothesis predicts that all neuropsychological measures would be lower in ADHD participants than in non-ADHD participants. The researchers took information from studies that used different neuropsychological tests, so the values had to be compared separately, which produced several z-statistics. The smallest of these was 2.5.

Without further testing these statistics, we might guess that the difference in IQ scores between ADHD and non-ADHD subjects is significant. (After all, a z-statistic of 27.72 is no small matter.) However, confirmation of this supposition will have to wait until the next post.

