Colliders in Epidemiology: an educational interactive web application





Correlation is not causation

During the last 30 years, classical epidemiology has focussed on the control of confounding [1]. Only recently have epidemiologists started to pay attention to the bias produced by colliders in addition to confounders [2, 3]. In the epidemiological literature, different explanations have been proposed for the paradoxical protective effect of established risk factors, for example the apparent protective effect of maternal smoking on infant mortality and on the incidence of pre-eclampsia, known as the birth weight paradox and the smoking pre-eclampsia paradox, respectively [4, 5].


What is a collider?

A collider for a given pair of variables (exposure and outcome) is a third variable that is influenced by both. Controlling for a collider, or conditioning the analysis on it (e.g., by stratification or regression adjustment), can introduce a spurious association between its causes (exposure and outcome), which potentially explains why the medical literature is full of paradoxical findings [6]. In DAG terminology, a collider is the variable in the middle of an inverted fork (i.e., variable W in A -> W <- Y) [7]. We hope that this web application will contribute to the increasing awareness and general understanding of colliders among applied epidemiologists and medical researchers.
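As a minimal sketch of this definition, the following R code simulates the inverted fork A -> W <- Y. The unit coefficients, the standard normal noise, the sample size, and the seed are assumptions chosen only for illustration. A and Y are generated independently, so any association that appears after adjusting for W is spurious.

```r
set.seed(2024)           # arbitrary seed, for reproducibility only
n <- 10000
A <- rnorm(n)            # exposure
Y <- rnorm(n)            # outcome, generated independently of A
W <- A + Y + rnorm(n)    # collider: caused by both A and Y

# Coefficient of A without and with conditioning on the collider W
coef_unconditional <- unname(coef(lm(Y ~ A))["A"])
coef_conditional   <- unname(coef(lm(Y ~ A + W))["A"])

print(round(c(unconditional = coef_unconditional,
              conditional   = coef_conditional), 2))
```

Under these assumptions the unconditional coefficient should be close to zero, whereas the coefficient conditional on W should be clearly negative (about -0.5 in large samples), even though A has no effect on Y.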


Objective

The objective of this (educational) web application is to illustrate the effect of conditioning on a collider, based on a realistic non-communicable disease epidemiology example (hypertension and dietary sodium intake). We estimate the effect of 24-hour dietary sodium intake in grams (exposure) on systolic blood pressure (outcome) accounting for the effect of age (confounder). The objective of the illustration is to show the paradoxical effect of 24-hour dietary sodium intake on systolic blood pressure after conditioning on 24-hour excretion of urinary protein (collider).


References

[1] Sander Greenland and Hal Morgenstern. Confounding in health research. Annual Review of Public Health, 22(1):189-212, May 2001.

[2] Stephen R Cole, Robert W Platt, Enrique F Schisterman, Haitao Chu, Daniel Westreich, David Richardson, and Charles Poole. Illustrating bias due to conditioning on a collider. International Journal of Epidemiology, 39(2):417-420, Nov 2009.

[3] Tyler J. VanderWeele and Stijn Vansteelandt. Conceptual issues concerning mediation, interventions and composition. Statistics and Its Interface, 2(4):457-468, 2009.

[4] Miguel Angel Luque-Fernandez, Helga Zoega, Unnur Valdimarsdottir, and Michelle A. Williams. Deconstructing the smoking-preeclampsia paradox through a counterfactual framework. European Journal of Epidemiology, 31(6):613-623, Jun 2016.

[5] S. Hernandez-Diaz, E. F. Schisterman, and M. A. Hernan. The birth weight "paradox" uncovered? American Journal of Epidemiology, 164(11):1115-1120, Sep 2006.

[6] Julia M Rohrer. Thinking clearly about correlations and causation: Graphical causal models for observational data. 2017.

[7] Judea Pearl. Causal diagrams for empirical research. Biometrika, 82(4):669-688, 1995.

Data generation


Data generation process

Based on a motivating example in non-communicable disease epidemiology, we generated a dataset with 1,000 observations to contextualize the effect of conditioning on a collider. Nearly 1 in 3 Americans has high blood pressure, and more than half of them do not have it under control [1]. Increased levels of systolic blood pressure (SBP, in mmHg) over time are associated with increased cardiovascular morbidity and mortality [2]. Summative evidence shows that exceeding the recommendations for 24-hour dietary sodium intake (in grams, g) is associated with increased levels of SBP [3].

Furthermore, with advancing age, the kidney undergoes several anatomical and physiological changes that limit the adaptive mechanisms responsible for maintaining the composition and volume of the extracellular fluid. These include a decline in glomerular filtration rate and an impaired ability to maintain water and sodium homeostasis in response to dietary and environmental changes [4]. Likewise, age is associated with structural changes in the arteries and thus with SBP [2]. Age is therefore a common cause of both high SBP and impaired sodium homeostasis, and it acts as a confounder of the association between sodium intake and SBP (i.e., age lies on the back-door path SOD <- AGE -> SBP).

In contrast, high levels of 24-hour excretion of urinary protein (proteinuria) are caused by both sustained high SBP and increased 24-hour dietary sodium intake. Therefore, proteinuria (PRO in the DAG) acts as a collider on the path SOD -> PRO <- SBP.

The data generation for the simulation is based on the structural relationships between the variables depicted in the directed acyclic graph (DAG). We simulated 24-hour excretion of urinary protein as a function of SBP and sodium intake, both of which in turn depend on age. We ensured that the range of values of the simulated data was biologically plausible and as close to reality as possible [5, 6].

References

[1] Emelia J Benjamin, Michael J Blaha, Stephanie E Chiuve, Mary Cushman, Sandeep R Das, Rajat Deo, J Floyd, M Fornage, C Gillespie, CR Isasi, et al. Heart disease and stroke statistics-2017 update: a report from the American Heart Association. Circulation, 135(10):e146-e603, 2017.

[2] Qiuping Gu, Vicki L Burt, Ryne Paulose-Ram, Sarah Yoon, and Richard F Gillum. High blood pressure and cardiovascular disease mortality risk among US adults: the Third National Health and Nutrition Examination Survey mortality follow-up study. Annals of Epidemiology, 18(4):302-309, 2008.

[3] Frank M Sacks, Laura P Svetkey, William M Vollmer, Lawrence J Appel, George A Bray, David Harsha, Eva Obarzanek, Paul R Conlin, Edgar R Miller, Denise G Simons-Morton, et al. Effects on blood pressure of reduced dietary sodium and the Dietary Approaches to Stop Hypertension (DASH) diet. New England Journal of Medicine, 344(1):3-10, 2001.

[4] N. Tareen, D. Martins, G. Nagami, B. Levine, and K. C. Norris. Sodium disorders in the elderly. Journal of the National Medical Association, 97:217-224, 2005.

[5] Linda Van Horn, Jo Ann S Carson, Lawrence J Appel, Lora E Burke, Christina Economos, Wahida Karmally, et al. Recommended dietary pattern to achieve adherence to the American Heart Association/American College of Cardiology (AHA/ACC) guidelines: a scientific statement from the American Heart Association. Circulation, 134(22):e505-e529, Nov 2016.

[6] Michael F Carroll. Proteinuria in adults: a diagnostic approach. American Family Physician, 62(6), 2000.

Data generation code

beta1 is the true causal effect of SOD on SBP. alpha1 (effect of SOD on PRO) and alpha2 (effect of SBP on PRO) are parameters you can modify in 'Collider Visualization'.

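generateData <- function(n, seed, beta1, alpha1, alpha2) {
  set.seed(seed)  # make the simulation reproducible
  # Confounder: age in years
  Age_years <- rnorm(n, 65, 5)
  # Exposure: 24-hour sodium intake (g), which increases with age
  Sodium_gr <- Age_years / 18 + rnorm(n)
  # Outcome: SBP (mmHg); beta1 is the true causal effect of sodium
  sbp_in_mmHg <- beta1 * Sodium_gr + 2.00 * Age_years + rnorm(n)
  # Collider: proteinuria (mg), caused by both sodium intake and SBP
  Proteinuria_in_mg <- alpha1 * Sodium_gr + alpha2 * sbp_in_mmHg + rnorm(n)
  data.frame(sbp_in_mmHg, Sodium_gr, Age_years, Proteinuria_in_mg)
}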
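The following sketch shows one way the generated data could be analysed to reproduce the collider effect outside the application. The parameter values (beta1 = 1.05, alpha1 = 2.8, alpha2 = 2), the seed, and the object names fit_crude, fit_adjusted, fit_collider, and sodium_effect are illustrative assumptions, not settings prescribed by the application. The function is repeated so that the example runs on its own.

```r
# Same data-generating function as above, repeated for self-containment.
generateData <- function(n, seed, beta1, alpha1, alpha2) {
  set.seed(seed)
  Age_years <- rnorm(n, 65, 5)
  Sodium_gr <- Age_years / 18 + rnorm(n)
  sbp_in_mmHg <- beta1 * Sodium_gr + 2.00 * Age_years + rnorm(n)
  Proteinuria_in_mg <- alpha1 * Sodium_gr + alpha2 * sbp_in_mmHg + rnorm(n)
  data.frame(sbp_in_mmHg, Sodium_gr, Age_years, Proteinuria_in_mg)
}

# Illustrative parameter values (assumptions chosen for this sketch).
d <- generateData(n = 1000, seed = 777, beta1 = 1.05, alpha1 = 2.8, alpha2 = 2)

# Three model specifications for the effect of sodium on SBP:
fit_crude    <- lm(sbp_in_mmHg ~ Sodium_gr, data = d)                # no adjustment
fit_adjusted <- lm(sbp_in_mmHg ~ Sodium_gr + Age_years, data = d)    # adjusts for the confounder
fit_collider <- lm(sbp_in_mmHg ~ Sodium_gr + Age_years + Proteinuria_in_mg,
                   data = d)                                         # also conditions on the collider

sodium_effect <- c(
  crude    = unname(coef(fit_crude)["Sodium_gr"]),
  adjusted = unname(coef(fit_adjusted)["Sodium_gr"]),
  collider = unname(coef(fit_collider)["Sodium_gr"])
)
print(round(sodium_effect, 2))
```

With these assumed values, the crude estimate should be inflated by confounding by age, the age-adjusted estimate should be close to beta1, and the estimate that also conditions on proteinuria should be negative, which is the paradoxical protective effect the application illustrates.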

Data display and download

Download 1,000 simulations (.csv)
Legend:
AGE = Age (years)
SOD = 24-hour dietary sodium intake (g)
PRO = 24-hour excretion of urinary protein (proteinuria) (mg)
SBP = Systolic blood pressure (mmHg)


Effect of dietary sodium intake on systolic blood pressure under different model specifications.


Move the slider to change the magnitude of the true causal effect of sodium on SBP
Collider model: $$\text{PRO}=\alpha_{0}+\alpha_{1}\text{SOD}+\alpha_{2}\text{SBP}$$
Move the sliders to change the magnitude of the effects of sodium and SBP on proteinuria
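As a sketch of why the slider values matter, assume the linear data-generating equations used in this application, with independent standard normal error terms. The symbols u, e, and gamma are introduced here for illustration only, and the intercept alpha_0 does not affect the result:

$$\text{SBP}=\beta_{1}\text{SOD}+2\,\text{AGE}+u,\qquad \text{PRO}=\alpha_{1}\text{SOD}+\alpha_{2}\text{SBP}+e$$

Under these assumptions, regressing SBP on SOD, AGE, and PRO gives, in large samples, the following coefficient for SOD:

$$\gamma_{\text{SOD}}=\frac{\beta_{1}-\alpha_{1}\alpha_{2}}{\alpha_{2}^{2}+1}$$

The estimated effect of sodium therefore changes sign whenever $$\alpha_{1}\alpha_{2}>\beta_{1}$$. For example, with the illustrative values beta_1 = 1.05, alpha_1 = 2.8, and alpha_2 = 2, the expected coefficient is (1.05 - 5.6) / 5 = -0.91, although the true effect is positive.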

Select the model(s) to visualize the effect of SOD on SBP:





Assumed DAG under the respective model

Authorship


Miguel Angel Luque-Fernandez (PI)

Scientific Researcher of Epidemiology and Biostatistics
Biomedical Research Institute of Granada
Non‐Communicable and Cancer Epidemiology Group (ibs.Granada)
University of Granada
Biomedical Network Research Centers of Epidemiology and Public Health (CIBERESP), ISCIII, Madrid, Spain
Assistant Professor of Epidemiology (Honorary)
London School of Hygiene & Tropical Medicine, London, UK
Scientific Collaborator, Department of Epidemiology
Harvard T.H. Chan School of Public Health, Boston, MA, USA

miguel.luque.easp at juntadeandalucia.es

Daniel Redondo Sánchez

Research Assistant
Biomedical Research Institute of Granada
Non‐Communicable and Cancer Epidemiology Group (ibs.Granada)
University of Granada
Biomedical Network Research Centers of Epidemiology and Public Health (CIBERESP), ISCIII, Madrid, Spain
Andalusian School of Public Health

daniel.redondo.easp at juntadeandalucia.es

Michael Schomaker

Senior Researcher and Statistician
School of Public Health and Family Medicine
Center for Infectious Disease Epidemiology and Research
University of Cape Town, Cape Town, South Africa

michael.schomaker at uct.ac.za

Maria Jose Sánchez Perez

Subdirector Biomedical Research Institute of Granada
Director Non‐Communicable and Cancer Epidemiology Group (ibs.Granada)
University of Granada
Director of the Granada Cancer Registry
Biomedical Network Research Centers of Epidemiology and Public Health (CIBERESP), ISCIII, Madrid, Spain
Andalusian School of Public Health

mariajose.sanchez.easp at juntadeandalucia.es

Anand Vaidya

Assistant Professor of Medicine
Harvard Medical School, Harvard University
Director of the Center for Adrenal Disorders (Diabetes, Hypertension)
Brigham and Women's Hospital (Endocrinology), Boston, MA, USA

anandvaidya at bwh.harvard.edu

Mireille E. Schnitzer

Assistant Professor of Biostatistics
Faculty of Pharmacy
University of Montreal, Montreal, Canada
Adjunct Professor of Biostatistics
Department of Epidemiology, Biostatistics and Occupational Health
McGill University, Montreal, Canada

mireille.schnitzer at umontreal.ca

Acknowledgment

Funding information
Carlos III Institute of Health, Grant/Award Number: CP17/00206 and the Andalusian Department of Health, Grant Number: PI-0152/2017.