The Impact of Other Factors:
Confounding, Mediation, and
Effect Modification
Amy Yang
Senior Statistical Analyst
Biostatistics Collaboration Center
Oct. 14 2016
BCC: Biostatistics Collaboration Center
Who We Are
Leah J. Welty, PhD
Assoc. Professor
BCC Director
Joan S. Chmiel, PhD
Professor
Jody D. Ciolino, PhD
Asst. Professor
Kwang-Youn A. Kim, PhD
Asst. Professor
Mary J. Kwasny, ScD
Assoc. Professor
Julia Lee, PhD, MPH
Assoc. Professor
Alfred W. Rademaker, PhD
Professor
Hannah L. Palac, MS
Senior Stat. Analyst
Gerald W. Rouleau, MS
Stat. Analyst
Amy Yang, MS
Senior Stat. Analyst
Masha Kocherginsky, PhD
Assoc. Professor
Not Pictured:
1. David A. Aaby, MS
Senior Stat. Analyst
2. Tameka L. Brannon
Financial | Research
Administrator
Biostatistics Collaboration Center |680 N. Lake Shore Drive, Suite 1400 |Chicago, IL 60611
What We Do
Our mission is to support FSM investigators in the conduct of high-quality,
innovative health-related research by providing expertise in biostatistics,
statistical programming, and data management.
How We Do It
Are you writing a grant?
- YES: We provide Study Design, Analysis Plan, and Power/Sample Size.
  BCC faculty serve as Co-Investigators; analysts serve as Biostatisticians.
- NO: Short or long term collaboration?
  - Short: Recharge Model (hourly rate)
  - Long: Subscription Model (salary support)
Every investigator is provided a FREE initial consultation of up to 2 hours with BCC faculty or staff
The BCC recommends requesting grant support at least 6-8 weeks before the submission deadline
Statistical support for Cancer-related projects or Lurie Children’s should be triaged through their available resources.
How can you contact us?
Request an Appointment
- http://www.feinberg.northwestern.edu/sites/bcc/contact-us/request-form.html
General Inquiries
- bcc@northwestern.edu
- 312.503.2288
Visit Our Website
- http://www.feinberg.northwestern.edu/sites/bcc/index.html
Outline
Confounding
- Concept and definition
- Identifying confounding
- Quantifying confounding
- Controlling confounding
Mediation
Effect Modification
- Definition and examples
- Confounding vs Effect Modification
Confounding--Example
Cohort study -- Smoking and heart disease (HD)
Suppose that the incidence of HD for smokers is
twice that of non-smokers (Risk Ratio=2.0)
Confounding--Example
Before we can make a causal statement…
Rule out alternative explanations:
Chance, Bias, Confounding
Smoking
doubles your
risk of getting
heart disease
Confounding--Example
Suppose that the smokers are much older than the non-
smokers
We know that age is a risk factor for heart disease
- Implies the RR=2 is really reflecting the mixture of two
effects (Older age and smoking)
Age is a confounder in the study of association between
smoking and HD
Confounding--Example
Two pathways
- Direct effect of smoking
- Backdoor pathway through age non-comparability
Confounding = Existence of backdoor pathway
Smoking
(X)
Age
(Z)
Heart Disease
(Y)
Confounding
Three properties of confounder:
Should be related to the exposure
Should be an independent determinant of the outcome
Should not be part of causal pathway from exposure to
outcome
Often taken as a definition of a confounder
Identifying Confounding
Not Recommended
- Approaches that are based only on statistical
associations observed in study data
e.g. Automated procedures (stepwise regression)
Recommended
- Three properties + knowledge/assumptions
about causal relationships among variables
- Study data are used to quantify confounding
What is not a Confounder--Example
[Figure: Chemical X → Cognitive disability; Exposed vs. Non-Exposed groups]
- It turns out there are more blondes in the chemical X exposed group
- Question: Is hair color a confounder? (Are blondes really…dumber?)
- Hair color is not a confounder, because hair color is not a risk factor for cognitive disability
Quantifying and Controlling Confounding in the
Analysis
Comparing the “crude” measure of association with the
“adjusted” measures of association
Stratification
- Pooling (Weighted Averaging)
Modeling
Example:
Hypothetical case-control study examining the
association between formula vs. breastfeeding and
gastroenteritis among infants
Example:
Concern about socioeconomic status (SES) as a
confounder
Check the three properties:
1. SES affects whether people formula-feed or breastfeed
2. SES affects the outcome through the degree of crowding and hygiene issues
3. SES is not in the pathway between feeding method and gastroenteritis
[Diagram: SES points to both Formula/BF and Gastroenteritis]
Quantifying and Controlling Confounding in the
Analysis
1. Crude association -- OR = (261*296)/(645*54) = 2.22

                  Gastroenteritis
                  Yes    No
Formula           261    645
Breastfeeding      54    296

2. Stratify by confounder SES

Low SES           Yes    No
Formula           219    447
Breastfeeding      33    118

High SES          Yes    No
Formula            42    198
Breastfeeding      21    178

OR_LOW = 1.75    OR_HIGH = 1.80

Positive confounder because the crude OR (2.22) was larger than the
stratified ORs (1.75 and 1.80)
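The crude and stratum-specific odds ratios above can be checked with a few lines of Python. This is a minimal sketch; the helper name `odds_ratio` and the tuple layout are illustrative choices, not part of the original slides.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)

# (formula cases, formula controls, breastfed cases, breastfed controls)
low_ses = (219, 447, 33, 118)
high_ses = (42, 198, 21, 178)

# Collapsing over SES gives the crude table
crude = tuple(lo + hi for lo, hi in zip(low_ses, high_ses))

or_crude = odds_ratio(*crude)
or_low = odds_ratio(*low_ses)
or_high = odds_ratio(*high_ses)

print(f"crude OR    = {or_crude:.2f}")  # 2.22
print(f"low SES OR  = {or_low:.2f}")    # 1.75
print(f"high SES OR = {or_high:.2f}")   # 1.80
```

Both stratum-specific ORs sit below the crude OR, which is the pattern the slide describes as positive confounding.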
Quantifying and Controlling Confounding in the
Analysis
3. Pooling (weighted averaging) adjusted association
- If appropriate, pool information over all strata by
calculating (weighted) average of stratum specific
measures
- Assumption: constant effect across strata
OR_LOW = 1.75 and OR_HIGH = 1.80 are pooled into a single OR_adjusted
Mantel-Haenszel weights
- Reflect amount of “information” within each stratum
- Mantel N, Haenszel W. Statistical aspects of the analysis of data from
retrospective studies of disease JNCI 22: 719-748, 1959
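Mantel-Haenszel Estimation
Case control data, for stratum i with a_i exposed cases, b_i exposed controls,
c_i unexposed cases, d_i unexposed controls, and n_i = a_i + b_i + c_i + d_i:

OR_MH = Σ(a_i*d_i/n_i) / Σ(b_i*c_i/n_i)

Low SES           Yes    No
Formula           219    447
Breastfeeding      33    118

High SES          Yes    No
Formula            42    198
Breastfeeding      21    178

OR_LOW = 1.75    OR_HIGH = 1.80    OR_adjusted = 1.77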
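As a check on the adjusted value, the standard Mantel-Haenszel odds ratio estimator (the sum of a*d/n over strata divided by the sum of b*c/n over strata) can be sketched in Python. The function name and data layout below are assumptions made for illustration.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    Each stratum is (a, b, c, d): exposed cases, exposed controls,
    unexposed cases, unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d          # stratum size acts as the weight scale
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

strata = [
    (219, 447, 33, 118),  # low SES
    (42, 198, 21, 178),   # high SES
]

or_mh = mantel_haenszel_or(strata)
print(f"Mantel-Haenszel OR = {or_mh:.2f}")  # 1.77
```

The pooled estimate falls between the two stratum-specific ORs and is pulled toward the larger low-SES stratum.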
Modeling
Stratification and MH estimation are analogous to…
- Calculating an unadjusted measure of association from
a model
Gastroenteritis ~ b1*Formula/BF
- Examining the measure of association after including
the confounder in the model
Gastroenteritis ~ b1’*Formula/BF + b2*SES
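To illustrate the two models, here is a minimal sketch that fits both logistic regressions to the grouped gastroenteritis data using a hand-written Newton-Raphson routine. The functions `solve` and `fit_logistic` are hypothetical helpers written for this example; in practice a statistics package would be used. The exponentiated coefficient on Formula/BF is the odds ratio.

```python
import math

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_logistic(rows, n_iter=25):
    """Fit a logistic regression to grouped data by Newton-Raphson.

    rows: list of (x_vector, cases, total); returns the coefficients.
    """
    k = len(rows[0][0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, cases, total in rows:
            eta = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += x[i] * (cases - total * p)
                for j in range(k):
                    hess[i][j] += total * p * (1 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# (formula, low_ses, cases, total) for each exposure-by-SES group
groups = [
    (1, 1, 219, 219 + 447),
    (0, 1, 33, 33 + 118),
    (1, 0, 42, 42 + 198),
    (0, 0, 21, 21 + 178),
]

# Gastroenteritis ~ Formula/BF (intercept + exposure)
unadjusted = fit_logistic([((1, f), y, n) for f, s, y, n in groups])
# Gastroenteritis ~ Formula/BF + SES (intercept + exposure + confounder)
adjusted = fit_logistic([((1, f, s), y, n) for f, s, y, n in groups])

or_unadjusted = math.exp(unadjusted[1])
or_adjusted = math.exp(adjusted[1])
print(f"unadjusted OR = {or_unadjusted:.2f}")  # matches the crude OR of 2.22
print(f"adjusted OR   = {or_adjusted:.2f}")    # close to the stratified ORs
```

The unadjusted model reproduces the crude OR, and adding SES moves the estimate down toward the stratum-specific values, mirroring the stratified analysis.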
Preventing Confounding in Study Design
Confounding is a bias
We want to prevent it in the conduct of the study and
remove it once we determine that it is present
Study design strategies:
- Randomization
- Matching
- Restriction
Preventing Confounding in Study Design
Randomization
- Subjects are allocated to exposure groups by a random
method
- Gives each subject an equal chance of being in any exposure group
- Exposure groups will tend to have similar distributions of
age, gender, behavior
- This includes both measured and unmeasured confounders
- Depending on the trial, confounders may still need to be
considered in analysis (especially when n is small)
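A small simulation can make the balancing property of randomization concrete. This is a sketch on a hypothetical cohort (ages drawn uniformly between 30 and 80, a made-up distribution); the seed and sample size are arbitrary choices for reproducibility.

```python
import random

random.seed(2016)  # fixed seed so the sketch is reproducible

# Hypothetical cohort: 10,000 subjects with ages between 30 and 80
ages = [random.uniform(30, 80) for _ in range(10_000)]

# Randomize: each subject has an equal chance of landing in either arm
arms = [random.choice(("treatment", "control")) for _ in ages]

def mean(values):
    return sum(values) / len(values)

mean_treat = mean([a for a, g in zip(ages, arms) if g == "treatment"])
mean_ctrl = mean([a for a, g in zip(ages, arms) if g == "control"])

print(f"mean age, treatment arm: {mean_treat:.1f}")
print(f"mean age, control arm:   {mean_ctrl:.1f}")
```

With a large sample the two arms end up with nearly the same mean age. Rerunning with a much smaller cohort (say 20 subjects) tends to show larger chance imbalances, which is why confounders may still need attention in the analysis of small trials.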
Preventing Confounding in Study Design
Matching
- On important potential confounders
  (e.g. age bands: 30-40 years old, 40-50 years old)
- Smoking and Non-Smoking groups are similar with
respect to Age
- Analyses must account for matching
Preventing Confounding in Study Design
Restriction
- Restrict admission into the study to subjects who have
the same level of the confounding factor
- E.g., confounding by age could be minimized by enrolling
subjects who are in the same age range
- Be careful! Restriction limits generalizability
[Diagram: Smoking (X), Age (Z), Heart Disease (Y); age groups 5-10, 10-20, 20-30, 30-40, 40-60, >60, with the study restricted to 30-40 years old]
Summary -- Confounding
Three properties
Control for confounding in the analysis
- Stratification
- MH estimation
- Modeling
Design strategies to prevent confounding
- Randomization
- Matching
- Restriction
Mediation
Confounder should not be in the pathway between the
exposure and outcome
If the other variable is in the pathway between the two, it
is called a mediator
X → Z → Y
Mediation
Poverty → Limited access to healthy food → Diabetes
Mediation
Multiple sexual partners → Increased risk of HPV infection → Cervical cancer
Mediation
It is difficult to distinguish confounder and mediator
statistically
They should be separated from each other based on an
understanding of disease process
A variable can act partially as a confounder and partially
as a mediator
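[Diagram: Obesity as a confounder: Obesity → Physical inactivity and Obesity → Cardiovascular disease]
[Diagram: Obesity as a mediator: Physical inactivity → Obesity → Cardiovascular disease]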
Mediation
Question : Should we adjust for mediators, as we do for
confounders?
We can, but the meaning of this adjustment is different
- Before adjustment, we have the total effect of the
potential risk factor on the outcome
- After adjustment, we have the remaining effect of the
risk factor after the partial effect of that mediator is
considered
- Remaining effect will typically be smaller than the total effect
Mediation
If we do not adjust for the mediator
- Crude OR = 2.4; Total effect of poverty on diabetes
If we adjust for the mediator (limited access to healthy food)
- OR_adjusted = 1.6; Remaining effect of poverty on diabetes
Poverty → Limited access to healthy food → Diabetes
Effect Modification (Interaction)
Effect modification is present when the measure of
association between X and Y varies across a third
variable (Z)
Gender modifies the effect of marital status on
health outcomes
Effect Modification
Conceptualization of effect modification
- Approach one
The “effect” of variable X on Y is not the same across
levels of variable Z
- Approach two
The “effect” of variables X and Z on Y combined is larger
or smaller than you would expect given the “effect” of
each on Y individually
Y=X+Z+X*Z
Mathematically these two approaches are the same
Example: Divorced → Suicide
- Men: RR = 2.38
- Women: RR = 1 (no association)
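The equivalence of the two conceptualizations can be seen in a small numeric sketch of the model Y = X + Z + X*Z. The coefficients below are hypothetical values chosen only for illustration.

```python
def predict(x, z, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Linear model with an interaction term: b0 + b1*x + b2*z + b3*x*z."""
    return b0 + b1 * x + b2 * z + b3 * x * z

# Approach one: the effect of X differs across levels of Z
effect_x_at_z0 = predict(1, 0) - predict(0, 0)   # b1
effect_x_at_z1 = predict(1, 1) - predict(0, 1)   # b1 + b3

# Approach two: the joint effect differs from the sum of individual effects
effect_z_at_x0 = predict(0, 1) - predict(0, 0)   # b2
joint_effect = predict(1, 1) - predict(0, 0)     # b1 + b2 + b3
excess = joint_effect - (effect_x_at_z0 + effect_z_at_x0)

print(effect_x_at_z0, effect_x_at_z1)  # 2.0 3.5
print(excess)                          # 1.5
```

Both views recover the same quantity: the difference between the two stratum-specific effects of X (3.5 - 2.0) equals the excess of the joint effect over the sum of the individual effects, which is the interaction coefficient b3.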
Confounding vs Effect Modification
Stratification is a step in the process of adjusting for
confounding
- Bias we want to remove
Stratification is a step in the process of describing effect
modification
- We want to describe effect modification
Confounding vs Effect Modification
Confounding
- Association is similar in different strata of Z
- Compare the adjusted association with the crude association
Effect modification
- Association is different in different strata of Z
- Compare associations across strata
[Diagram: for confounding, the crude association is compared with the adjusted association; for effect modification, the stratum-specific associations are compared with each other]
Confounding vs Effect Modification
A factor could be a confounder and/or an effect modifier
Example: Study of relation between social support and
depression
Road Map
1. Calculate the crude measure of association
2. Stratify the data by the potential confounder/effect modifier
3. Calculate the stratified measures of association
4. Compare the measures from step 3 using the Test for Homogeneity (Breslow-Day Test)
5. Are the associations homogeneous?
- Yes (i.e. did not reject H0)
  6. Calculate the adjusted measure of association (Mantel-Haenszel estimation)
  7. Compare the results of steps 6 and 1 to describe the direction and magnitude of the confounding
- No (i.e. rejected H0)
  6. Present measures of association stratified by effect modifier
Road Map Step 1
1. Calculate the crude measure of association between
the exposure and outcome (e.g. RR, OR)
                      Incident depression
                      Yes     No      Total
Low social support    191     7909    8100
High social support   50      7550    7600
Total                 241     15459   15700

Risk ratio = (191/8100)/(50/7600) = 3.6
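The crude risk ratio can be reproduced directly from the table. A minimal sketch, with `risk_ratio` as an illustrative helper name:

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# Low social support is treated as the exposed group
rr_crude = risk_ratio(191, 8100, 50, 7600)
print(f"crude RR = {rr_crude:.1f}")  # 3.6
```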
Road Map Step 2 & 3
2. Stratify the data by the potential confounder/effect modifier

                      Incident depression
Men                   Yes     No      Total
Low social support    26      2574    2600
High social support   18      3582    3600
Total                 44      6156    6200

                      Incident depression
Women                 Yes     No      Total
Low social support    165     5335    5500
High social support   32      3968    4000
Total                 197     9303    9500

3. Calculate the stratified measure of association
RR_Men = (26/2600)/(18/3600) = 2
RR_Women = (165/5500)/(32/4000) = 3.75
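The stratified risk ratios can be computed the same way, with a quick consistency check that the two strata add back up to the crude table. The dictionary layout is an assumption made for this sketch.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# (low-support cases, low-support total, high-support cases, high-support total)
strata = {
    "men": (26, 2600, 18, 3600),
    "women": (165, 5500, 32, 4000),
}

rr_by_stratum = {name: risk_ratio(*counts) for name, counts in strata.items()}
for name, rr in rr_by_stratum.items():
    print(f"RR {name} = {rr:.2f}")  # men 2.00, women 3.75

# The strata should sum to the crude table (191/8100 vs. 50/7600)
combined = tuple(sum(col) for col in zip(*strata.values()))
print(combined)  # (191, 8100, 50, 7600)
```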
Road Map Step 4
4. Compare the RRs using the Test for Homogeneity (Breslow-Day Test)
- Equivalent to the test statistic for the interaction term in a regression model
- Null hypothesis: the measure of association is homogeneous across
strata
If the test of homogeneity is “significant”
- Reject homogeneity
- Evidence for heterogeneity (i.e. effect modification)
The choice of significance level (e.g. p<0.05) is open to
interpretation
- One “conservative” approach is to use a significance level larger
than 0.05 (maybe 0.10 or 0.20)
Road Map Step 5 & 6
In our example χ² = 3.08, DF = 1, P = 0.08
5. Question: Does it appear we have homogeneous
association (H0: Association the same across strata)?
Assume we used conservative 10% level of significance…
No (p=0.08<0.10)
Reject H0; we have evidence of effect modification
6. Present measures of association stratified by gender
RR_Men = 2    RR_Women = 3.75
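For intuition about the homogeneity test, here is a sketch of a simpler inverse-variance (Woolf-type) test on the log risk ratios. This is not the Breslow-Day statistic reported on the slide, so the result is not expected to match χ² = 3.08 exactly, but it lands close to it and leads to the same conclusion at the 10% level. The large-sample variance formula for the log RR is a standard approximation; the names below are illustrative.

```python
import math

def log_rr_and_variance(a, n1, c, n0):
    """Log risk ratio and its large-sample variance.

    a = exposed cases, n1 = exposed total,
    c = unexposed cases, n0 = unexposed total.
    """
    log_rr = math.log((a / n1) / (c / n0))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n0
    return log_rr, variance

strata = {
    "men": (26, 2600, 18, 3600),
    "women": (165, 5500, 32, 4000),
}
estimates = [log_rr_and_variance(*counts) for counts in strata.values()]

# Inverse-variance weighted average of the log RRs
weights = [1 / var for _, var in estimates]
pooled = sum(w * lr for w, (lr, _) in zip(weights, estimates)) / sum(weights)

# Heterogeneity statistic: weighted squared deviations from the pooled value
chi2 = sum(w * (lr - pooled) ** 2 for w, (lr, _) in zip(weights, estimates))
df = len(estimates) - 1

# For df = 1 the chi-square upper tail equals erfc(sqrt(chi2 / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.2f}")
```

Under these assumptions the statistic is about 3.0 with p about 0.08, so homogeneity is rejected at the 10% level but not at the 5% level.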
Exercise
X-Y association stratified by potential confounder/EM Z

Z=0    Z=1    Crude   Adjusted   Confounding?   EM?
4      0.25   1       1
1      1      8.4     1
4      0.25   1       2

Note: when there is effect modification, the adjusted estimate is not
relevant; present the stratified associations
Properties of Stratification
Pro:
- Simple and intuitive
Con:
- Not practical when there are multiple factors
- Continuous variables (e.g. age) have to be split into categories
- In these situations, regression models have many strengths
Summary
Other variables in a study can be
- Confounders
Bias
Prevent in study design
Adjust for in analysis
- Effect modifiers
Personalized medicine; effects in a subgroup
Stratify and report
- Mediators
X → Z → Y
Statistically Speaking …
What's next?
All lectures will be held from noon to 1 pm in Hughes Auditorium, Robert H. Lurie
Medical Research Center, 303 E. Superior St.
Tuesday, October 18
Statistical Power and Sample Size: What You Need and How Much
Mary Kwasny, ScD, Associate Professor, Division of Biostatistics,
Department of Preventive Medicine
Friday, October 21
Clinical Trials: Highlights from Design to Conduct
Masha Kocherginsky, PhD, Associate Professor, Division of Biostatistics,
Department of Preventive Medicine
Tuesday, October 25
Finding Signals in Big Data
Kwang-Youn A. Kim, PhD, Assistant Professor, Division of Biostatistics,
Department of Preventive Medicine
Friday, October 28
Enhancing Rigor and Transparency in Research: Adopting Tools that Support
Reproducible Research
Leah J. Welty, PhD, BCC Director, Associate Professor, Division of Biostatistics,
Department of Preventive Medicine