Statistical Analysis
The EA Protocol describes the statistical methods
used. Briefly, the data objectives of this EA were to
(1) estimate geometric mean concentrations of PFAS
in the sampling frame population (with a precision
target of at least 15% and a 5% level of significance
for PFOS), (2) compare community level data to
national levels, and (3) explore relationships
between questionnaire data and measured biological
and environmental data.
ATSDR processed the PFAS sampling results in two
ways before performing statistical analyses:
• First, ATSDR substituted all non-detect
observations with a value equal to the limit
of detection (LOD) divided by the square root
of 2. (A non-detect result means the sample
did not contain enough PFAS to be reliably
measured by this project’s highly sensitive
laboratory methods.) This substitution
method is consistent with that applied in
CDC’s NHANES. Note that Appendix B
provides the results of a sensitivity analysis
exploring alternate substitution approaches.
• Second, ATSDR calculated the total PFOA and
total PFOS concentrations measured in each
blood and urine sample. The laboratory
reports two different measurements for
PFOA and PFOS. For PFOA, the laboratory
reports the amount of branched PFOA (Sb-
PFOA) measured in the sample separate
from the amount of linear PFOA (n-PFOA) in
the same sample. ATSDR summed these
values and performed statistical analyses
using total PFOA results. Similarly, ATSDR
calculated total PFOS by summing the linear
PFOS (n-PFOS) and branched PFOS (Sm-
PFOS) concentrations. These same
summation methods are applied to NHANES
data.
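The two processing steps above can be sketched as follows. This is a minimal illustration, not ATSDR's actual code: the function names, the use of `None` to mark a non-detect, and the example concentrations and LOD are all hypothetical, and the sketch assumes the substituted values are what get summed into the total.

```python
import math

def substitute_non_detect(value, lod):
    """Return the measured value, or LOD / sqrt(2) for a non-detect.

    A non-detect is encoded as None here (an assumption of this sketch).
    """
    return lod / math.sqrt(2) if value is None else value

def total_pfas(linear, branched, lod_linear, lod_branched):
    """Sum the linear and branched results after non-detect substitution."""
    return (substitute_non_detect(linear, lod_linear)
            + substitute_non_detect(branched, lod_branched))

# Hypothetical sample: n-PFOA measured at 1.2 µg/L, Sb-PFOA not detected,
# with an assumed LOD of 0.1 µg/L for both measurements.
total_pfoa = total_pfas(1.2, None, 0.1, 0.1)
print(round(total_pfoa, 3))
```

The same helper would apply to total PFOS by passing the n-PFOS and Sm-PFOS results.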
For blood and urine, ATSDR first calculated summary
statistics for each PFAS (i.e., frequency of detection,
maximum detected concentration, geometric mean,
95% confidence intervals around the geometric
mean, and 25th, 50th [median], 75th, 90th, and 95th
percentiles). The protocol specified that geometric
Statistical Terms
Geometric mean: The geometric mean is a
type of average and provides an estimate of
the central point of a set of numbers. It is
often used for environmental data that
exhibit a skewed distribution (e.g., a dataset
with several values that are much higher
than the rest of the results). The geometric
mean is less influenced by high values than
an arithmetic mean.
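As a small illustration of this definition, the sketch below computes a geometric mean as the exponential of the average of the log-transformed values and compares it with the arithmetic mean. The data are hypothetical and chosen only to include one unusually high value.

```python
import math

def geometric_mean(values):
    """Geometric mean: exponentiate the mean of the natural logs."""
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))

# Hypothetical skewed dataset with one high value
values = [1.0, 2.0, 4.0, 8.0, 100.0]

gm = geometric_mean(values)
am = sum(values) / len(values)
print(f"geometric mean = {gm:.2f}, arithmetic mean = {am:.2f}")
```

In this example the single high value pulls the arithmetic mean well above most of the data, while the geometric mean stays closer to the bulk of the results.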
Percentiles (25th, 50th, 75th, 90th, 95th): A
percentile provides additional information
about the distribution of a dataset and
represents the value below which a certain
percentage of the data fall. For example, a
95th percentile of 25 micrograms per liter
(µg/L) indicates that 95% of results fall below
this concentration.
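A percentile can be computed in several slightly different ways. The sketch below uses linear interpolation between ordered values, which is one common convention and is assumed here only for illustration; the function name and the data are hypothetical, and the method used in this EA may differ.

```python
def percentile(values, p):
    """Percentile p (0-100) by linear interpolation between ordered values."""
    ordered = sorted(values)
    rank = p * (len(ordered) - 1) / 100  # fractional position in the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Hypothetical concentrations: 1, 2, ..., 21 µg/L
data = list(range(1, 22))

for p in (25, 50, 75, 90, 95):
    print(p, percentile(data, p))
```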
Confidence intervals: A confidence interval
defines a range of values that is likely to
include a specific value with a certain degree
of confidence (probability). It provides a
measure of how much uncertainty there is
with any particular statistic. In this EA, ATSDR
estimated geometric means for the PFAS
blood levels measured among study
participants. The 95% confidence interval
around the geometric mean represents the
range within which the true population
mean is expected to lie. More specifically, if
we hypothetically repeated the study 100
times, about 95 of the 100 resulting intervals
would be expected to contain the mean of the
sampling frame population.
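One common way to build a confidence interval around a geometric mean is to compute the interval on the log scale and then transform the limits back. The sketch below does this under simplifying assumptions that are not taken from the EA Protocol: it uses a normal critical value rather than a t value, it ignores any survey weighting or design effects, and the function name and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def geometric_mean_ci(values, confidence=0.95):
    """Return (geometric mean, lower limit, upper limit).

    The interval is built on the log scale and back-transformed.
    Uses a normal critical value (a simplification for this sketch).
    """
    logs = [math.log(v) for v in values]
    center = mean(logs)
    se = stdev(logs) / math.sqrt(len(logs))  # standard error of the log mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return (math.exp(center),
            math.exp(center - z * se),
            math.exp(center + z * se))

# Hypothetical blood concentrations in µg/L
values = [1.0, 2.0, 4.0, 8.0, 100.0]
gm, lower, upper = geometric_mean_ci(values)
print(f"GM = {gm:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, the back-transformed limits are not symmetric around the geometric mean.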
Precision: Precision provides information on
the reproducibility of a study and is
associated with sample size. The larger the
sample size, the higher the precision. In the
context of this EA, precision was estimated
based on the width of confidence intervals
around the geometric mean. A wide
confidence interval indicates low precision
while a narrow confidence interval suggests
high precision.
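The link between sample size and confidence interval width can be illustrated with a short sketch. It assumes a normal-approximation interval on the log scale and a hypothetical log-scale standard deviation of 1.0, and it expresses width as the ratio of the upper confidence limit to the geometric mean. This ratio is an illustrative measure only and is not necessarily the precision metric defined in the EA Protocol.

```python
import math
from statistics import NormalDist

def upper_limit_to_gm_ratio(log_sd, n, confidence=0.95):
    """Ratio of the upper confidence limit to the geometric mean.

    Assumes a normal-approximation interval on the log scale.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.exp(z * log_sd / math.sqrt(n))

# Hypothetical log-scale standard deviation of 1.0 at three sample sizes
for n in (25, 100, 400):
    print(n, round(upper_limit_to_gm_ratio(1.0, n), 3))
```

The ratio moves toward 1 as the sample size grows, meaning the interval narrows and precision increases.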