Checking normality in SPSS
statstutor community project www.statstutor.ac.uk
• For larger samples (i.e. more than one hundred), the normality tests are overly sensitive:
even small departures from normality can give a significant result, so the assumption of
normality might be rejected too easily (see robust exceptions below). Any assessment should
also include a visual evaluation of histograms or Q-Q plots, as these are more appropriate
for assessing normality in larger samples.
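As a rough illustration of what a normal Q-Q plot displays, the sketch below pairs each sorted observation with the normal quantile expected at the same rank. It uses only the Python standard library and is not the SPSS procedure: the function name qq_points, the plotting position (i - 0.5)/n and the sample values are all assumptions made for illustration. Points lying close to a straight line are consistent with normality.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (theoretical, observed) pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Reference normal distribution with the sample's own mean and SD.
    reference = NormalDist(mean(ordered), stdev(ordered))
    # Plotting position (i - 0.5) / n is one common convention (assumed here).
    return [(reference.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]

# Hypothetical data, for illustration only.
sample = [4.1, 4.8, 5.0, 5.3, 5.6, 5.9, 6.1, 6.4, 7.0]
for theoretical, observed in qq_points(sample):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

Plotting the observed values against the theoretical ones (in any charting tool) gives the Q-Q plot itself; the code only prepares the coordinates.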
Hypothesis test for normality
Null hypothesis: The data is normally distributed
For both of these examples, the sample size is 35 so the Shapiro-Wilk test should be used. For
the skewed data, p = 0.002 suggesting strong evidence of non-normality. For the approximately
normally distributed data, p = 0.582, so the null hypothesis is retained at the 0.05 level of
significance. Therefore, normality can be assumed for this data set and, provided any other test
assumptions are satisfied, an appropriate parametric test can be used.
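The decision rule used above can be written out as a short sketch. The function name interpret_normality_test is hypothetical; the two p-values are the ones reported in the text, and the 0.05 significance level is the one the text uses.

```python
def interpret_normality_test(p_value, alpha=0.05):
    """Interpret a normality-test p-value.

    The null hypothesis is that the data is normally distributed.
    """
    if p_value < alpha:
        return "reject H0: evidence of non-normality"
    return "retain H0: normality can be assumed"

# p-values reported in the text for the two examples.
examples = [("skewed data", 0.002), ("approximately normal data", 0.582)]
for label, p in examples:
    print(f"{label}: p = {p} -> {interpret_normality_test(p)}")
```

Retaining the null hypothesis does not prove normality; it only means the test found no strong evidence against it.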
What if the data is not normally distributed?
If the checks suggest that the data is not normally distributed, there are three options:
• Transform the dependent variable (repeating the normality checks on the transformed data):
Common transformations include taking the log or square root of the dependent variable.
• Use a non-parametric test: Non-parametric tests are often called distribution-free tests and
can be used instead of their parametric equivalent.
• Use a parametric test under robust exceptions: These are conditions when the parametric
test can still be used for data which is not normally distributed and are specific to individual
parametric tests – see the appropriate test worksheet.
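To illustrate the first option, the sketch below applies the log and square root transformations to a small right-skewed sample and compares a simple moment-based skewness measure before and after. The data and the skewness helper are assumptions for illustration only; in practice the normality checks described earlier would still need to be repeated on the transformed variable.

```python
import math

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed, strictly positive data.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
sqrt_transformed = [math.sqrt(v) for v in raw]  # requires values >= 0
log_transformed = [math.log(v) for v in raw]    # requires values > 0

for name, values in [("raw", raw),
                     ("sqrt", sqrt_transformed),
                     ("log", log_transformed)]:
    print(f"{name:5s} skewness = {skewness(values):.2f}")
```

For this sample the log transformation reduces the positive skew more than the square root does, which is typical for strongly right-skewed data.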
Key non-parametric tests
Parametric test                   | What to check for normality                   | Non-parametric test
Independent t-test                | Dependent variable by group                   | Mann-Whitney test
Paired t-test                     | Paired differences                            | Wilcoxon signed rank test
One-way ANOVA                     | Residuals/dependent variable by group         | Kruskal-Wallis test
Repeated measures ANOVA           | Residuals at each time point                  | Friedman test
Pearson’s correlation coefficient | Both variables should be normally distributed | Spearman’s correlation coefficient
Simple linear regression          | Residuals                                     | N/A
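For anyone scripting their analysis choices, the table can be held as a simple lookup. This is only a sketch: the names NON_PARAMETRIC_ALTERNATIVES and alternative_for are hypothetical, and the entries are copied from the table.

```python
# parametric test -> (what to check for normality, non-parametric alternative)
NON_PARAMETRIC_ALTERNATIVES = {
    "Independent t-test": ("Dependent variable by group", "Mann-Whitney test"),
    "Paired t-test": ("Paired differences", "Wilcoxon signed rank test"),
    "One-way ANOVA": ("Residuals/dependent variable by group",
                      "Kruskal-Wallis test"),
    "Repeated measures ANOVA": ("Residuals at each time point",
                                "Friedman test"),
    "Pearson's correlation coefficient": (
        "Both variables should be normally distributed",
        "Spearman's correlation coefficient"),
    "Simple linear regression": ("Residuals", None),  # N/A in the table
}

def alternative_for(parametric_test):
    """Return the non-parametric alternative, or None if the table lists N/A."""
    _what_to_check, alternative = NON_PARAMETRIC_ALTERNATIVES[parametric_test]
    return alternative

print(alternative_for("Paired t-test"))
```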
Note: The residuals are the differences between the observed and expected values. They are
usually available in the ‘Save’ options when carrying out a test. However, for ANOVA, residuals
can only be obtained through the Analyze > General Linear Model > Univariate menu.
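To make the definition of a residual concrete: in a one-way ANOVA the expected value for each observation is the mean of its group, so each residual is the observation minus its group mean. The sketch below computes these by hand; the group labels, the data and the function name anova_residuals are assumptions for illustration and do not reproduce SPSS output.

```python
from statistics import mean

def anova_residuals(groups):
    """Residual = observed value - group mean, for every observation."""
    residuals = {}
    for name, values in groups.items():
        group_mean = mean(values)
        residuals[name] = [v - group_mean for v in values]
    return residuals

# Hypothetical data for three groups.
groups = {
    "A": [12.0, 15.0, 11.0, 14.0],
    "B": [18.0, 17.0, 21.0, 20.0],
    "C": [9.0, 10.0, 8.0, 13.0],
}
for name, values in anova_residuals(groups).items():
    print(name, values)
```

The residuals from all groups would then be pooled and checked for normality as a single variable.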
Although non-parametric tests require fewer assumptions and can be used on a wider range of
data types, parametric tests are preferred because they are more sensitive at detecting differences
between samples or an effect of the independent variable on the dependent variable. This means