
The discovery BPD (D-BPD) program: study protocol of a prospective translational multicenter collaborative study to investigate determinants of chronic lung disease in very low birth weight infants



Premature birth is a growing and serious public health problem affecting more than one of every ten infants worldwide. Bronchopulmonary dysplasia (BPD) is the most common neonatal morbidity associated with prematurity, and infants with BPD suffer from an increased incidence of respiratory infections, asthma, other forms of chronic lung illness, and death (Day and Ryan, Pediatr Res 81:210–213, 2017; Isayama et al., JAMA Pediatr 171:271–279, 2017). BPD is now understood as a longitudinal disease process influenced by the intrauterine environment during gestation and modulated by gene-environment interactions throughout the neonatal and early childhood periods. Despite this concept, there remains a paucity of multidisciplinary team-based approaches dedicated to the comprehensive study of this complex disease.


The Discovery BPD (D-BPD) Program involves a cohort of infants < 1,250 g at birth who are prospectively followed until 6 years of age. The program integrates machine learning analysis of detailed clinical data with genetic susceptibility and molecular translational studies.


The current gap in understanding BPD as a complex multi-trait spectrum of different disease endotypes will be addressed by a bedside-to-bench and bench-to-bedside approach in the D-BPD program. The D-BPD will provide enhanced understanding of mechanisms, evolution and consequences of lung diseases in preterm infants. The D-BPD program represents a unique opportunity to combine the expertise of biologists, neonatologists, pulmonologists, geneticists and biostatisticians to examine the disease process from multiple perspectives with a singular goal of improving outcomes of premature infants.

Trial registration

Not applicable to this study.



Premature birth is a serious public health problem affecting more than one of every 10 infants worldwide [1]. Bronchopulmonary dysplasia (BPD) is defined by a requirement for oxygen supplementation at 36 weeks post-conceptional age (PCA) due to respiratory insufficiency. BPD is the most common neonatal morbidity and is associated with an increased incidence of infections, asthma, other forms of chronic lung illness, and death [2, 3]. Very low birth weight (VLBW) infants (BW < 1,250 g) are at greatest risk of developing BPD and disproportionately experience long-term consequences of prematurity [4, 5].

While VLBW infants often require treatment for pulmonary complications after birth, their course upon graduation from the neonatal intensive care unit (NICU) is highly variable. Results from the NHLBI Prematurity and Respiratory Outcomes Program (PROP) revealed that although some infants remain asymptomatic and appear to live a healthy first year of life despite a diagnosis of BPD, others experience frequent hospitalizations for respiratory indications, need home respiratory support, and suffer from additional respiratory morbidities [6,7,8,9]. In the long term, a significant proportion of former VLBW infants, with or without BPD, exhibit respiratory limitations at school age and into adulthood [10,11,12]. Predicting the long-term pulmonary outcomes of VLBW infants early in life is difficult, even though ~ 30% of infants receive a diagnosis of BPD during their initial hospitalization. This challenge is due, in part, to the definition of BPD itself. While a diagnosis of BPD simply identifies babies requiring oxygen therapy relatively early after birth, limited information is available during the first months of life to predict the evolution of lung growth and development and the impact on gas exchange. BPD likely represents a diagnostic umbrella encompassing a broad range of pulmonary diseases of diverse etiologies and prognoses (endotypes). This hypothesis is supported by the absence of genetic studies that identify single genes that strongly correlate with BPD and conclusively predict long-term respiratory compromise in prematurely born infants [13, 14].

Environmental exposures of the developing lung are recognized as key factors that influence long-term outcomes [15], and modulation of these exposures may offer a window of opportunity to improve the undesirable consequences of lung immaturity. In addition, understanding patterns of lung disease within the BPD umbrella, particularly when using an unbiased approach like machine learning [16], may enable redefinition of lung diseases in VLBW infants with greater linkage between phenotype, genetic, and/or environmental determinants of disease. Given the gaps in our understanding of lung disease endotypes in prematurely born infants, the molecular bases underlying these endotypes, the genetic predisposition toward individual endotypes, and the contribution(s) of environmental factors to disease inception and severity, we established the Discovery BPD program (D-BPD). D-BPD is a multidisciplinary, seven-center program (Table 1) that fosters collaboration between neonatologists, pulmonologists, immunologists, environmental biologists, basic scientists and bioinformaticians. The D-BPD collaborative will enable identification of new endotypes within the BPD umbrella and define genetic, molecular and environmental factors associated with pathogenesis.

Table 1 Participating centers and specific projects

D-BPD integrates three distinct yet interactive areas of research (Fig. 1). The clinical data core uses machine learning strategies to leverage the detailed longitudinal clinical data. The gene susceptibility program uses genome-wide association mapping and positional cloning in inbred strains of mice to identify candidate susceptibility genes. Finally, the basic science molecular program explores the mechanistic correlates of clinical and genetic findings associated with oxidative stress. A list of all investigators and research staff from each center is provided in Additional file 1.

Fig. 1

Integration of the D-BPD research areas. The clinical data core uses machine learning strategies to leverage the detailed longitudinal clinical data; the gene susceptibility program uses genome-wide association mapping and positional cloning in murine strains to identify candidate susceptibility genes; and the basic science molecular program explores mechanistic correlates of clinical and genetic findings associated with BPD endotypes. Image credits: Wikimedia Commons

As of this writing, the D-BPD cohort includes 325 infant/mother/father triads. Infants < 1,250 g at birth will be followed until 6 years of age. In this manuscript, we present the D-BPD program protocol, illustrate the breadth of data and biospecimens available for study, and outline ongoing and future investigations that will enable the identification of preventive strategies against lung diseases of prematurity.


The D-BPD structure is depicted in Fig. 2. Five clinical centers are coordinated by Fundación INFANT through the Preterm INFANT Network. Fundación INFANT is responsible for supervising the conduct of the clinical study, including data collection, regulatory affairs, and sample collection, early processing and storage. Fundación INFANT and the National Institute of Environmental Health Sciences (NIEHS) monitor the quality of data collection through clinical report forms. Oversight of the program rests with an NIEHS-appointed Steering Committee Chair, NIH officials, and an Observational and Safety Monitoring Board (OSMB). Teams from the NIEHS, Fundación INFANT, the University of Alabama at Birmingham (UAB) and the Pontificia Universidade Católica do Rio Grande do Sul hold videoconference calls every other week to discuss all aspects of the program, including recruitment, data collection, new data, recent results, long-term objectives, and regulatory matters.

Fig. 2

Discovery BPD (D-BPD) structure

Multicenter protocol development

Outcomes of interest

The primary aim of D-BPD is to identify new endotypes within the BPD umbrella in order to define genetic, molecular and environmental factors associated with disease pathogenesis. These data will enable the prediction of respiratory morbidity through early childhood. Long-term lung disease in D-BPD will be assessed by combining clinical evaluation of respiratory signs and symptoms until the age of 6 years with physiologic evaluations of lung function at defined time points during childhood. The D-BPD program will also define genetic, molecular and environmental factors associated with the traditional definition of BPD, its severity, and the inception and evolution of other prematurity morbidities and death.


The inclusion and exclusion criteria are listed in Table 2. The protocol is outlined in Fig. 3. The protocol integrates data from the molecular to the population level. We expect to enroll 750 infants. Based on prior population studies, we estimate that 40% of this cohort will meet the diagnostic criteria for BPD. With these parameters, the study has more than 80% power to detect an area under the curve (AUC) of 0.6 in a receiver operating characteristic (ROC) analysis against a null hypothesis of an AUC with no diagnostic value (AUC = 0.5). This is a conservative estimate, as the power is greater for AUC values larger than 0.6.
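The power statement above can be checked approximately with a short calculation. The sketch below is illustrative only: it assumes the Hanley–McNeil approximation for the standard error of an AUC, a two-sided significance level of 0.05, and a normal approximation to the test statistic. None of these choices is stated in the protocol, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hanley_mcneil_se(auc, n_cases, n_controls):
    """Approximate standard error of an AUC (Hanley-McNeil formula)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)


def auc_power(auc_alt, n_cases, n_controls, auc_null=0.5, z_alpha=1.959964):
    """Approximate power to reject AUC = auc_null when the true AUC is auc_alt.

    z_alpha = 1.959964 corresponds to an assumed two-sided alpha of 0.05.
    """
    se_null = hanley_mcneil_se(auc_null, n_cases, n_controls)
    se_alt = hanley_mcneil_se(auc_alt, n_cases, n_controls)
    return normal_cdf((auc_alt - auc_null - z_alpha * se_null) / se_alt)


# Figures taken from the protocol: 750 infants, 40% expected to have BPD.
n_total = 750
n_cases = round(0.40 * n_total)
n_controls = n_total - n_cases
power = auc_power(0.6, n_cases, n_controls)
print(f"Approximate power for AUC = 0.6: {power:.3f}")
```

Under these assumptions the computed power exceeds the 80% threshold quoted above, and calling `auc_power` with a larger alternative AUC returns a larger value, consistent with the statement that 0.6 is the conservative case.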

Table 2 Inclusion/Exclusion criteria
Fig. 3

D-BPD program protocol timeline, spanning from birth to 6 years of corrected age, during which health data and biospecimens are collected. The babies will be monitored daily during their NICU stay by the participating neonatologist (without direct clinical responsibilities) using structured data collection log sheets. Information on the clinical course will be collected daily during the first 28 days and every 2 days thereafter. Afterwards, phone calls will be made every 6 months until 6 years of age are completed.

Environmental and clinical data collection

Parents who consent to participate in the study are personally interviewed by participating neonatologists using questionnaires specifically designed by the NIEHS epidemiologists and biostatisticians for this study. This questionnaire collects epidemiological and clinical information associated with pregnancy. Data from VLBW infants are obtained prospectively every day during the NICU stay using specially designed forms. After discharge, families are contacted via telephone and interviewed using modified ISAAC questionnaires to monitor the respiratory status of their baby. These questionnaires have been modified to assess respiratory health at 6 months and yearly thereafter up to 6 years PCA.

Biospecimen archive (bedside to bench)

The characterization of long-term respiratory outcomes in VLBW infants is hindered by the absence of biological materials to study phenotype-specific disease determinants, from molecular alterations in mitochondrial function to genetic mutations or gene-by-environment interactions. NIEHS and Fundación INFANT, in conjunction with the Preterm Network, established standardized procedures for sample collection and central processing, and protocols for accessing the resulting biorepositories. Saliva specimens from parents are collected at study entry for DNA extraction. Infant saliva samples are obtained in the first 4 weeks of life. Early (birth) specimens allow for exploration of injuries and exposures during gestation, and of developmental and genetic biosynthetic capacities. Samples collected at later time points (after 1 week) likely reflect responses to oxidative stress, infection, inflammation, nutritional state, and tissue repair. The program is now collecting samples of placental tissue and umbilical cord blood at the time of birth.

Assessments of respiratory function (physiologic biomarkers)

The evaluation of lung function in the early years of life has been hampered by the need for sedation. In addition, the absence of appropriate biomarkers for the inception of asthma has contributed to the scarcity of tools to predict long-term lung health in infancy. The forced oscillatory test (FOT) uses the patient’s spontaneous respiration, without sedation, to define the physiology of the small and large airways. FOT applies an oscillating pressure wave generated by a loudspeaker to the respiratory system and analyzes the pressure-flow relationship in terms of impedance [Zrs, which encompasses both resistance (Rrs) and reactance (Xrs)]. Rrs, calculated from the pressure and flow signals, is a measure of central and peripheral airway caliber, while Xrs, derived from the component of pressure in phase with volume, relates to compliance (Crs) and inertance (Irs). FOT has been used to detect lung function abnormalities in asthmatics with normal spirometry [17], to identify the deleterious effects of oxidative stress (e.g., cigarette smoke exposure) on pulmonary function, and to study bronchodilator responsiveness in infants [18]. Therefore, we will use FOT to evaluate lung function in study participants at 3–4 years of age. Participants will be evaluated again at 5–7 years of age.
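To make the relationship between the pressure and flow signals and Zrs concrete, the following sketch estimates impedance at a single oscillation frequency from simulated signals. It is an illustration under stated assumptions, not the processing used by any FOT device: it assumes a single-compartment resistance-inertance-compliance model, in which Xrs = ωIrs − 1/(ωCrs), and all numeric values, units, and function names are hypothetical.

```python
import cmath
import math


def fourier_component(signal, freq_hz, sample_rate_hz):
    """Complex amplitude of `signal` at `freq_hz` (single-bin Fourier projection)."""
    n = len(signal)
    total = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    return 2.0 * total / n


def respiratory_impedance(pressure, flow, freq_hz, sample_rate_hz):
    """Zrs = P(f) / V'(f); the real part is Rrs and the imaginary part is Xrs."""
    return (fourier_component(pressure, freq_hz, sample_rate_hz)
            / fourier_component(flow, freq_hz, sample_rate_hz))


def simulate_signals(resistance, inertance, compliance,
                     freq_hz, sample_rate_hz, n_samples, amplitude=0.1):
    """Simulate flow and pressure for an assumed single-compartment model."""
    omega = 2.0 * math.pi * freq_hz
    pressure, flow = [], []
    for k in range(n_samples):
        t = k / sample_rate_hz
        q = amplitude * math.sin(omega * t)                # flow
        dq = amplitude * omega * math.cos(omega * t)       # flow derivative
        volume = -amplitude / omega * math.cos(omega * t)  # integral of flow
        flow.append(q)
        pressure.append(resistance * q + inertance * dq + volume / compliance)
    return pressure, flow


# Hypothetical values chosen only for illustration (arbitrary consistent units).
R, I, C = 8.0, 0.01, 0.02
FREQ_HZ, SAMPLE_RATE_HZ, N_SAMPLES = 8.0, 200.0, 200  # one second, 8 full cycles

pressure, flow = simulate_signals(R, I, C, FREQ_HZ, SAMPLE_RATE_HZ, N_SAMPLES)
zrs = respiratory_impedance(pressure, flow, FREQ_HZ, SAMPLE_RATE_HZ)
print(f"Rrs = {zrs.real:.3f}, Xrs = {zrs.imag:.3f}")
```

The record length is chosen to contain a whole number of oscillation cycles so that the single-frequency projection recovers the simulated resistance and reactance without spectral leakage. Real recordings also contain the breathing signal and noise, which this sketch ignores.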

Data collection, management and storage systems

All source documents and laboratory reports are reviewed by the clinical team and the staff in charge of data entry to ensure that they are accurate and complete. Data collection is performed by clinical trial staff at the sites under supervision of the PI. During the study, investigators maintain complete and accurate documentation. Research sites that participate in this study maintain maximum confidentiality about the clinical and research information obtained from study participants. All information about study participants is kept in password-protected computer files or in locked cabinets accessible only to authorized personnel. Biological samples, tables, and files are identified by unique numbers. Questionnaire data are entered twice in the database designed by NIEHS for such purpose. This database is reviewed and maintained by the data manager.
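Double entry of questionnaire data is useful only if the two passes are compared and disagreements are resolved. The sketch below shows one simple way such a comparison could work. The record layout, field names, and function name are hypothetical and are not taken from the NIEHS database.

```python
def find_discrepancies(first_pass, second_pass):
    """Compare two data-entry passes keyed by participant identifier.

    Each pass maps a record ID to a dict of field values. Returns a list of
    (record_id, field, first_value, second_value) tuples for every mismatch.
    A record present in only one pass is reported with the pseudo-field
    "<missing record>" and booleans saying which pass contains it.
    """
    issues = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(record_id)
        b = second_pass.get(record_id)
        if a is None or b is None:
            issues.append((record_id, "<missing record>",
                           a is not None, b is not None))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                issues.append((record_id, field, a.get(field), b.get(field)))
    return issues


# Hypothetical example: a transposed digit in the second entry pass.
entry_1 = {
    "001": {"birth_weight_g": 980, "sex": "F"},
    "002": {"birth_weight_g": 1100, "sex": "M"},
}
entry_2 = {
    "001": {"birth_weight_g": 890, "sex": "F"},
    "002": {"birth_weight_g": 1100, "sex": "M"},
}
for issue in find_discrepancies(entry_1, entry_2):
    print(issue)
```

Flagged fields would then be checked against the source document, which is the step that gives double entry its value.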

Genetic susceptibility

To explore the phenotypic variation attributable to gene-environment interaction, the NIEHS has designed a process to translate findings in model organisms to human disease susceptibility and thereby draw mechanistic insight that may help identify individuals who are sensitive to environmental exposures [19, 20]. BPD is a complex disorder, and because the contribution of each gene to a complex trait may be relatively minor, identifying every gene that ultimately contributes to the trait is a major challenge [21]. Furthermore, susceptibility genes interact with multiple environmental exposures or stimuli related to the etiology of a disease. To better define the genetic contribution to BPD susceptibility, we have chosen a priori candidate genes with biological plausibility to contribute to the pathogenesis of BPD. These candidates can be tested in vivo and in vitro in model systems and in the Buenos Aires D-BPD population. We have also performed a genome-wide association study (GWAS) of hyperoxia-induced acute lung injury in neonatal inbred mice, which recapitulates some characteristics of BPD. This gene discovery approach identified a number of novel genes that have been tested and confirmed to have a role in susceptibility to acute lung injury in neonatal mice [22]. The combination of gene discovery and biologically plausible genes provides a panel of candidates that may be used to screen VLBW infants and, potentially, to develop more precise intervention/prevention strategies in the treatment of BPD. Lastly, evaluating ancestry informative markers is an excellent way to discover novel genes underlying complex diseases [23] such as premature lung disease, and the availability of infant-parent triads will allow us to pursue those investigations.

Analytic approach by machine learning

A central problem regarding the phenotypic characterization of BPD relates to the current definition of the disease: oxygen requirement [24]. This operational definition fails to convey the diverse underlying pulmonary pathologies, the varying degrees of pathology between individual preterm infants due to differences in pulmonary development, the presence of lung fibrosis (and resulting changes in lung compliance), the severity of lung vascular remodeling (and resulting pulmonary hypertension) and the degree of tracheomalacia and/or bronchomalacia. These factors may vary widely between individual infants, and perhaps even in the same infant over time, given that BPD is a multifactorial disorder superimposed upon the developing lung. These realities suggest that BPD is most likely a superficial umbrella term that encompasses related but different conditions caused by distinct underlying pathophysiological mechanisms. The large amounts of data that will be amassed during the present study and the urgent need for more stringent dissection of the causes and outcomes under the BPD diagnosis support the use of machine learning [25] for assessing these possible subvariants (endotypes). These endotypes will be generated employing latent class analysis (LCA) [26, 27], a data-driven, hypothesis-generating approach. Clusters (endotypes) will be constructed employing longitudinal data without any a priori classification such as the canonical labels “severe” or “mild” BPD. To this end, patient-specific data will be used for the construction of trajectories. Each trajectory will be based upon the time course of the assessed variables, including the degree of respiratory support, growth, infection, early childhood respiratory function and symptoms. The dimensionality of these variables will be reduced using principal component analysis [28]. Because LCA does not rely on predefined labels, it is expected to yield endotypes that avoid a simple clinical phenotypic characterization based upon a single dimension of the disease. Thus, the resulting endotypes are intended to encompass all relevant descriptors of disease progression. Once the endotypes, or clusters, are generated, the next step will be to examine how transversal (non-time-dependent) variables segregate among the different clusters, including, but not limited to, genetic markers, environmental conditions, sex, chorioamnionitis and other pathophysiological outcomes. These transversal variables should allow a better understanding of the molecular basis underlying individual endotypes. These data could lead to better diagnostics and the eventual possibility of developing personalized treatments for each endotype. Thus, machine learning is one of the novel fundamental approaches of the D-BPD program that will enable the team to propose new definitions for use in clinical study design, drug development and assessments of novel therapies as part of a personalized medicine approach for each individual patient.
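As a rough illustration of how latent class analysis assigns infants to data-driven clusters, the sketch below fits a two-class model to binary indicators with the expectation-maximization algorithm. It is a minimal sketch under stated assumptions, not the program's analysis pipeline: the data are simulated, each indicator is a hypothetical marker such as "receiving respiratory support at a given time point", the number of classes is fixed at two, and the principal component step described above is omitted.

```python
import random


def fit_lca(data, n_classes=2, n_iter=100, seed=0):
    """Fit a latent class model to binary indicators with the EM algorithm.

    Returns class weights, per-class item probabilities, and the posterior
    class membership probabilities (responsibilities) for each row.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random start so that the classes differ at initialization.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each infant.
        resp = []
        for row in data:
            joint = []
            for c in range(n_classes):
                like = weights[c]
                for x, p in zip(row, probs[c]):
                    like *= p if x else 1.0 - p
                joint.append(like)
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: update class sizes and item probabilities.
        for c in range(n_classes):
            mass = sum(r[c] for r in resp)
            weights[c] = mass / len(data)
            for j in range(n_items):
                hits = sum(r[c] * row[j] for r, row in zip(resp, data))
                probs[c][j] = min(max(hits / mass, 1e-6), 1.0 - 1e-6)
    return weights, probs, resp


def simulate(n, seed=1):
    """Simulate binary indicators from two hypothetical hidden classes."""
    rng = random.Random(seed)
    profiles = [[0.9, 0.8, 0.85, 0.9], [0.1, 0.2, 0.15, 0.1]]
    labels, data = [], []
    for _ in range(n):
        k = 0 if rng.random() < 0.4 else 1
        labels.append(k)
        data.append([1 if rng.random() < p else 0 for p in profiles[k]])
    return data, labels


data, labels = simulate(300)
weights, probs, resp = fit_lca(data)
print("Estimated class weights:", [round(w, 2) for w in weights])
```

Class labels are arbitrary, so the fitted classes may come out in either order. In practice the number of classes would be chosen by comparing models, for example with an information criterion, which this sketch does not attempt.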

Molecular basis of disease onset and severity

One of the main objectives of our machine learning approach is to characterize the underlying endotypes in infants with a diagnosis of BPD. Bridging the gap between endotypes and causal mechanisms is a major challenge [29]. We will tackle this issue by utilizing identified candidate genes for disease. The connection between endotypes and candidate genes will be assessed, enabling the achievement of the ultimate goal of the D-BPD program: to define the molecular bases that contribute to endotypes of BPD. This knowledge will facilitate the pursuit of specific treatments, ranging from improved palliative care to the development of long-term projects for target-specific drug design. To this end, the identified variants/mutants will be classified using bioinformatics [30]. The first step consists of assessing the effects of genetic mutations on gene expression at the level of transcription, splicing or mRNA half-life, and on protein structure/function [30,31,32]. Candidate proteins will be studied by employing a combined in silico/in vitro approach. The effect of the mutations will be evaluated on the basis of previous reports regarding functional data, interaction analysis with other proteins or RNA/DNA, and available data from systems biology or structural data when NMR and/or crystal structures of the candidate proteins are available. Bioinformatics, homology modelling and molecular dynamics simulations will be applied in parallel with in vitro approaches that consist of recombinant expression and purification of candidate proteins and/or individual subdomains. The wild-type and relevant mutants will be assessed at the structure-dynamics-function level using a complete battery of spectroscopic and biophysical characterization methods, including far-UV circular dichroism spectroscopy, vibrational spectroscopy, and fluorescence spectroscopy, in order to determine structure and stability. 
For each specific protein, depending on their known functions, individual protocols for assessment of function of the mutant proteins will be designed including, but not limited to, interaction assays for complex formation, redox properties and enzymatic functions.

Study approval and oversight

The multi-center D-BPD protocol and consent, additional information to be completed by the participants, such as survey instruments or questionnaires, proposals, and any other advertising/contracting material have been submitted to the NIEHS IRB and all participating local IRBs for approval in writing. The protocols, consent, and survey instruments are reviewed annually for progress and compliance. We will submit to, and obtain approval from, the NIEHS IRB and all participating local IRBs for all subsequent modifications to the protocol, informed consent documents, and any documentation pertaining to the study. We are responsible for obtaining approval from the NIEHS IRB and all participating local IRBs at each continuing review throughout the entire duration of the study. We will notify the NIEHS IRB and all participating local IRBs of serious adverse events and protocol violations per their requirements.

Training and quality control

Since the inception of the study, Fundación INFANT has held bi-weekly training webinars with the research team from each site to ensure uniform approaches to data and specimen collection.

Summary and progress through enrollment

Enrollment began in June 2013 and is ongoing (Fig. 4). Consent rates have ranged from 45 to 90% by center (67% for the overall consortium), for a total enrollment of 325 participants. The biospecimen archive of DNA and cord blood, the physiologic testing results, and the breadth of the investigative teams have prompted the initiation of several ancillary studies that have added dimensions to the original D-BPD design (Table 3).

Fig. 4

D-BPD cohort diagram

Table 3 Ancillary projects arising from D-BPD


In summary, the current gap in understanding BPD as a complex multi-trait spectrum of different disease endotypes will be addressed by a bedside-to-bench and bench-to-bedside approach in the D-BPD program. Other observational programs have been successful in identifying perinatal and clinical risk factors and have very elegantly described respiratory physiology in infants [6, 33]. A few important assets distinguish our program from others: 1) The recruitment of case/parent triads (each comprising an affected child plus two parents) makes it possible to perform transmission/disequilibrium tests to identify preferential transmission of alleles from parent to affected child. The transmission/disequilibrium test (TDT) considers parents who are heterozygous for an allele associated with disease and evaluates the frequency with which that allele or its alternate is transmitted to affected offspring [34]. Compared with conventional tests for linkage, the TDT has the advantage that it does not require data either on multiple affected family members or on unaffected sibs. Moreover, the use of parental data instead of unrelated controls avoids ethnic confounding, even if the parents represent a mixture of ethnic backgrounds. 2) We plan to study lung function utilizing standard spirometry testing and novel lung function evaluations with forced oscillatory testing beyond the first year of life. Therefore, our studies can extend the characterization of lung development into childhood and, consequently, identify manifestations of premature lung disease that may not be apparent until later in life. 3) Finally, the ultimate goal of the endotype discovery paradigm is to build upon the foundations of earlier studies, including PROP, to identify novel pathways that contribute to pulmonary outcomes in prematurely born infants. 
Specifically, we will develop machine learning algorithms to identify endotypes from our cohort to enable the use of an unbiased, hypothesis-generating approach. Similar approaches have recently been used to uncover disease endotypes “hidden” under the same umbrella term (e.g., fever or asthma) [16]. Our hypothesis is that this will disaggregate premature lung disease into several subgroups with different etiologies and prognoses that have, to date, been hidden under the BPD definition. A limitation of our program is the lack of standardized physiologic testing during the NICU course, including a room air challenge at 36 weeks PCA. The room air challenge enables identification of infants with immature control of breathing and/or a weak chest wall/airway. Given the longitudinal nature of our study and the development of trajectories for clustering, we are confident that analysis of the longitudinal data will enable the unbiased identification of the above-referenced infants.
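The transmission/disequilibrium test mentioned among the program's assets reduces to a simple count: among heterozygous parents, how often the candidate allele is transmitted to the affected child (b) versus not transmitted (c), compared with the statistic (b − c)²/(b + c) on one degree of freedom. The sketch below illustrates that calculation for a single biallelic marker. The genotype coding, the function names, and the example counts are hypothetical, and no multiple-testing correction is shown.

```python
import math


def tdt_counts(trios):
    """Count transmissions of the candidate allele from heterozygous parents.

    Each trio is (father, mother, child), coded as the number of copies of the
    candidate allele (0, 1 or 2). Returns (transmitted, untransmitted).
    """
    transmitted = untransmitted = 0
    for father, mother, child in trios:
        n_het = sum(1 for g in (father, mother) if g == 1)
        if n_het == 0:
            continue  # homozygous parents are uninformative for the TDT
        # Copies that homozygous parents necessarily passed on.
        from_homozygous = sum(g // 2 for g in (father, mother) if g != 1)
        from_het = child - from_homozygous
        if not 0 <= from_het <= n_het:
            raise ValueError("Genotypes are not consistent with Mendelian inheritance")
        transmitted += from_het
        untransmitted += n_het - from_het
    return transmitted, untransmitted


def tdt_statistic(b, c):
    """McNemar-type TDT statistic and its chi-square (1 df) p-value."""
    if b + c == 0:
        raise ValueError("No heterozygous parents: the TDT is undefined")
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # upper tail of chi-square, 1 df
    return stat, p_value


# Hypothetical example: 40 trios with one heterozygous parent each.
example_trios = [(1, 0, 1)] * 30 + [(1, 0, 0)] * 10
b, c = tdt_counts(example_trios)
stat, p_value = tdt_statistic(b, c)
print(f"b = {b}, c = {c}, chi-square = {stat:.2f}, p = {p_value:.4f}")
```

Because transmitted alleles are compared with untransmitted alleles from the same parents, the comparison is made within families, which is why population stratification does not confound the result.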

Overall the D-BPD Program will provide enhanced understanding of mechanisms, evolution and consequences of lung diseases in preterm infants. The D-BPD program represents a unique opportunity to combine the expertise of biologists, neonatologists, pulmonologists, geneticists and biostatisticians to examine the disease process from multiple perspectives with a singular goal of improving outcomes of premature infants.

Availability of data and materials

The datasets used and/or analyzed during the current study will be available from the corresponding author on reasonable request.



Abbreviations

BPD: Bronchopulmonary dysplasia
D-BPD: Discovery bronchopulmonary dysplasia
FOT: Forced oscillatory test
NICU: Neonatal intensive care unit
NIEHS: National Institute of Environmental Health Sciences
PCA: Post-conceptional age
ROP: Retinopathy of prematurity
VLBW: Very low birth weight


  1. Rubens CE, Gravett MG, Victora CG, Nunes TM, GAPPS Review Group. Global report on preterm birth and stillbirth (7 of 7): mobilizing resources to accelerate innovative solutions (global action agenda). BMC Pregnancy Childbirth. 2010;10(Suppl 1):S7.

  2. Day CL, Ryan RM. Bronchopulmonary dysplasia: new becomes old again. Pediatr Res. 2017;81(1–2):210–3.

  3. Isayama T, Lee SK, Yang J, Lee D, Daspal S, Dunn M, Shah PS, Canadian Neonatal Network and Canadian Neonatal Follow-Up Network Investigators. Revisiting the definition of bronchopulmonary dysplasia: effect of changing panoply of respiratory support for preterm neonates. JAMA Pediatr. 2017;171(3):271–9.

  4. Short EJ, Klein NK, Lewis BA, Fulton S, Eisengart S, Kercsmar C, Baley J, Singer LT. Cognitive and academic consequences of bronchopulmonary dysplasia and very low birth weight: 8-year-old outcomes. Pediatrics. 2003;112(5):e359.

  5. Jobe AH. Lung maturation: the survival miracle of very low birth weight infants. Pediatr Neonatol. 2010;51(1):7–13.

  6. Keller RL, Feng R, DeMauro SB, Ferkol T, Hardie W, Rogers EE, Stevens TP, Voynow JA, Bellamy SL, Shaw PA, et al. Bronchopulmonary dysplasia and perinatal characteristics predict 1-year respiratory outcomes in newborns born at extremely low gestational age: a prospective cohort study. J Pediatr. 2017;187:89–97 e83.

  7. Maitre NL, Ballard RA, Ellenberg JH, Davis SD, Greenberg JM, Hamvas A, Pryhuber GS, Prematurity and Respiratory Outcomes Program. Respiratory consequences of prematurity: evolution of a diagnosis and development of a comprehensive approach. J Perinatol. 2015;35(5):313–21.

  8. Pryhuber GS, Maitre NL, Ballard RA, Cifelli D, Davis SD, Ellenberg JH, Greenberg JM, Kemp J, Mariani TJ, Panitch H, et al. Prematurity and respiratory outcomes program (PROP): study protocol of a prospective multicenter study of respiratory outcomes of preterm infants in the United States. BMC Pediatr. 2015;15:37.

  9. Poindexter BB, Feng R, Schmidt B, Aschner JL, Ballard RA, Hamvas A, Reynolds AM, Shaw PA, Jobe AH, Prematurity, et al. Comparisons and limitations of current definitions of bronchopulmonary dysplasia for the Prematurity and respiratory outcomes program. Ann Am Thorac Soc. 2015;12(12):1822–30.

  10. Ambalavanan N, Carlo WA, Tyson JE, Langer JC, Walsh MC, Parikh NA, Das A, Van Meurs KP, Shankaran S, Stoll BJ, et al. Outcome trajectories in extremely preterm infants. Pediatrics. 2012;130(1):e115–25.

  11. Bhandari A, Panitch HB. Pulmonary outcomes in bronchopulmonary dysplasia. Semin Perinatol. 2006;30(4):219–26.

  12. Vom Hove M, Prenzel F, Uhlig HH, Robel-Tillig E. Pulmonary outcome in former preterm, very low birth weight children with bronchopulmonary dysplasia: a case-control follow-up at school age. J Pediatr. 2014;164(1):40–45 e44.

  13. Lal CV, Ambalavanan N. Genetic predisposition to bronchopulmonary dysplasia. Semin Perinatol. 2015;39(8):584–91.

  14. Sampath V, Garland JS, Helbling D, Dimmock D, Mulrooney NP, Simpson PM, Murray JC, Dagle JM. Antioxidant response genes sequence variants and BPD susceptibility in VLBW infants. Pediatr Res. 2015;77(3):477–83.

  15. Sly PD, Carpenter DO, Van den Berg M, Stein RT, Landrigan PJ, Brune-Drisse MN, Suk W. Health consequences of environmental exposures: causal thinking in global environmental epidemiology. Ann Glob Health. 2016;82(1):3–9.

  16. Howard R, Rattray M, Prosperi M, Custovic A. Distinguishing asthma phenotypes using machine learning approaches. Curr Allergy Asthma Rep. 2015;15(7):38.

  17. Czovek D, Shackleton C, Hantos Z, Taylor K, Kumar A, Chacko A, Ware RS, Makan G, Radics B, Gingl Z, et al. Tidal changes in respiratory resistance are sensitive indicators of airway obstruction in children. Thorax. 2016;71(10):907–15.

  18. Gray D, Willemse L, Visagie A, Czovek D, Nduru P, Vanker A, Stein DJ, Koen N, Sly PD, Hantos Z, et al. Determinants of early-life lung function in African infants. Thorax. 2017;72(5):445–50.

  19. Huang L, Luo Y, Wen X, He YH, Ding P, Xie C, Liu T, Yuan SX, Jia DQ, Chen WQ. Gene-gene-environment interactions of prenatal exposed to environmental tobacco smoke, CYP1A1 and GSTs polymorphisms on full-term low birth weight: relationship of maternal passive smoking, gene polymorphisms, and FT-LBW. J Matern Fetal Neonatal Med. 2018;32(13):2200–8.

  20. Keating ST, El-Osta A. Epigenetics and metabolism. Circ Res. 2015;116(4):715–36.

  21. Rusyn I, Kleeberger SR, McAllister KA, French JE, Svenson KL. Introduction to mammalian genome special issue: the combined role of genetics and environment relevant to human disease outcomes. Mamm Genome. 2018;29(1–2):1–4.

  22. Nichols JL, Gladwell W, Verhein KC, Cho HY, Wess J, Suzuki O, Wiltshire T, Kleeberger SR. Genome-wide association mapping of acute lung injury in neonatal inbred mice. FASEB J. 2014;28(6):2538–50.

  23. Qu HQ, Li Q, Xu S, McCormick JB, Fisher-Hoch SP, Xiong M, Qian J, Jin L. Ancestry informative marker set for Han Chinese population. G3 (Bethesda). 2012;2(3):339–41.

  24. Jobe AH, Bancalari E. Bronchopulmonary dysplasia. Am J Respir Crit Care Med. 2001;163(7):1723–9.

  25. Ching T, Himmelstein DS, Beaulieu-Jones BK, et al. Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface. 2018;15(141).

  26. Rabe-Hesketh S, Skrondal A. Classical latent variable models for medical research. Stat Methods Med Res. 2008;17(1):5–32.

  27. Spycher BD, Minder CE, Kuehni CE. Multivariate modelling of responses to conditional items: new possibilities for latent class analysis. Stat Med. 2009;28(14):1927–39.

  28. Zhang Z, Castello A. Principal components analysis in clinical studies. Ann Transl Med. 2017;5(17):351.

  29. Belgrave D, Henderson J, Simpson A, Buchan I, Bishop C, Custovic A. Disaggregating asthma: Big investigation versus big data. J Allergy Clin Immunol. 2017;139(2):400–7.

  30. Mooney SD, Krishnan VG, Evani US. Bioinformatic tools for identifying disease gene and SNP candidates. Methods Mol Biol. 2010;628:307–19.

  31. Chen JY, Youn E, Mooney SD. Connecting protein interaction data, mutations, and disease using bioinformatics. Methods Mol Biol. 2009;541:449–61.

  32. Reva B, Antipin Y, Sander C. Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 2011;39(17):e118.

  33. Ren CL, Feng R, Davis SD, Eichenwald E, Jobe A, Moore PE, Panitch HB, Sharp JK, Kisling J, Clem C, et al. Tidal breathing measurements at discharge and clinical outcomes in extremely low gestational age neonates. Ann Am Thorac Soc. 2018;15(11):1311–9.

  34. Spielman RS, McGinnis RE, Ewens WJ. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet. 1993;52(3):506–16.



Acknowledgements

We acknowledge Dr. Janet Hall, Clinical Director of the Clinical Research Branch, NIEHS, and Dr. Clarice Wainberg from the Biostatistics and Computational Biology Branch, NIEHS, for their support of the D-BPD program, and Social and Scientific Systems (SSS) for their help with oversight of compliance and IRB approvals.


Funding

Supported by NIH intramural grant subcontract No. PHR-SSS-S-15-004643. Dr. Kleeberger from the NIH, as the principal investigator, was involved in the design of the program and contributed to the writing of the manuscript.

Author information

Authors and Affiliations



Contributions

GO, MTC and DAP contributed equally to coordinating and leading the study. FN is involved in data management and analysis. JM and HC are involved in sample processing, analysis and interpretation. MS, GC, AB, LMP, NV, GM, JD, ELT, CO, FG, MQ, AB, SLG and SG are all involved in data and sample collection and in study coordination at the different clinical centers. DB and SRK are involved in the molecular translation of the sample findings. MHJ is involved in the coordination, interpretation and analysis of respiratory and physiologic testing. TET and FP, as senior investigators, are involved in study design, coordination and interpretation. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Gaston Ofman.

Ethics declarations

Ethics approval and consent to participate

The protocol, informed consent forms, additional materials to be completed by the participants (such as survey instruments or questionnaires), proposals, and any other advertising or contracting material have been approved by the NIEHS IRB and all participating local IRBs.

The investigators are responsible for obtaining continuing review approval from the NIEHS IRB and all participating local IRBs throughout the entire duration of the study. The investigators must notify the NIEHS IRB and all participating local IRBs of serious adverse events and protocol violations per their requirements.

This protocol is considered to be of minimal risk. Parents of newborns who fulfill the inclusion criteria are approached about the study by a trained neonatologist who is not the primary care physician. Parents who are interested are provided with written copies of the consent form by the neonatologist and are given ample opportunity to study the consent form, process the information in the document, and ask questions about the study in order to make an informed decision about study participation. Individual questions are answered at the end of the consent session. The Associate Investigator, or an authorized designee, discusses the consent with the participant and answers his/her questions. The participant is informed that study participation is voluntary and that s/he can withdraw from the study at any time and for any reason. All participants, as well as the person obtaining consent, must read, sign, and date two original copies of the consent form before study participation. One copy is maintained in the participant’s study file at the hospital, and the second copy is given to the participant for their records. The consent process is conducted in a private location in order to maintain patient confidentiality. The acquisition of informed consent is documented in the participant’s medical records. All informed consent discussions and written consent forms are delivered in Argentinian Spanish.

Consent for publication

No individual person’s data are involved in this investigation.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1:

Investigators and Research Staff. (DOCX 12 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Ofman, G., Caballero, M.T., Alvarez Paggi, D. et al. The discovery BPD (D-BPD) program: study protocol of a prospective translational multicenter collaborative study to investigate determinants of chronic lung disease in very low birth weight infants. BMC Pediatr 19, 227 (2019).