Parental and infant characteristics and childhood leukemia in Minnesota

Background Leukemia is the most common childhood cancer. With the exception of Down syndrome, prenatal radiation exposure, and higher birth weight, particularly for acute lymphoid leukemia (ALL), few risk factors have been firmly established. Translocations present in neonatal blood spots and the young age peak of diagnosis suggest that early-life factors are involved in childhood leukemia etiology. Methods We investigated the association between birth characteristics and childhood leukemia through linkage of the Minnesota birth and cancer registries using a case-cohort study design. Cases included 560 children with ALL and 87 with acute myeloid leukemia (AML) diagnosed from 28 days to 14 years of age. The comparison group comprised 8,750 individuals selected through random sampling of the birth cohort from 1976–2004. Cox proportional hazards regression specific for case-cohort studies was used to compute hazard ratios (HRs) and 95% confidence intervals (CIs). Results Male sex (HR = 1.41, 95% CI 1.16–1.70), white race (HR = 2.32, 95% CI 1.13–4.76), and maternal birth interval ≥ 3 years (HR = 1.31, 95% CI 1.01–1.70) increased ALL risk, while maternal age increased AML risk (HR = 1.21 per 5-year age increase, 95% CI 1.0–1.47). Higher birth weights (>3798 grams) (ALL: HR = 1.46, 95% CI 1.08–1.98; AML: HR = 1.97, 95% CI 1.07–3.65) and one-minute Apgar scores ≤ 7 (ALL: HR = 1.30, 95% CI 1.05–1.61; AML: HR = 1.62, 95% CI 1.01–2.60) increased risk for both types of leukemia. Sex was not a significant modifier of the association between ALL and other covariates, with the exception of maternal education. Conclusion We confirmed known risk factors for ALL: male sex, high birth weight, and white race. We have also provided data that support an increased risk for AML following higher birth weights, and demonstrated an association with low Apgar scores.


Background
Leukemia is the most common cancer occurring in children under age 15 in the United States with an annual incidence of 45.5 cases per million [1]. Most cases occur in children less than 5 years with a peak incidence at 2-3 years [2]. Translocations commonly found in childhood leukemia have been detected in neonatal blood spots, in some cases more than 10 years before the onset of disease [3]. The early age at which leukemia diagnoses occur together with the presence of leukemic translocations detected at birth implicates a role for prenatal exposures in childhood leukemia etiology.
Birth characteristics and risk of childhood leukemia have been the focus of many studies and have been addressed in several reviews [2,4,5]. Variables previously examined include parental sociodemographic characteristics, maternal reproductive history, pregnancy conditions, labor and delivery factors, and infant characteristics. Despite decades of research, few definitive risk factors have been identified. The weight of the evidence to date indicates that prenatal exposure to diagnostic radiation, certain genetic syndromes (particularly Down syndrome), male sex, and high birth weight occur more frequently in children subsequently diagnosed with leukemia [2,5].
The objective of this study was to examine the relation between characteristics recorded in birth records and childhood leukemia using a case-cohort study design. This design is relatively resistant to the recall and selection biases that are of concern in case-control studies of childhood leukemia, which frequently rely on survey-based data collection methods.

Study population
All childhood cancer cases diagnosed from 1988-2004 between the ages of 28 days and 14 years were ascertained from the Minnesota Cancer Surveillance System (MCSS) and linked to their birth records through probabilistic record linkage [6]. It is estimated that MCSS ascertains 99.9% of cancer cases [7]. A case-cohort study design was implemented, for which a comparison group, referred to as the sub-cohort, was selected without regard to case status [8]. In particular, four children per cancer case were randomly selected from births recorded among Minnesota residents during 1976-2004, matching on birth year. To maintain consistent exclusion criteria between cases and the sub-cohort, subjects born in 1980 (the earliest year neonatal deaths were recorded) or later who died during the neonatal period were excluded. For the present analysis, childhood cancer cases other than leukemia were excluded, resulting in more than four sub-cohort children matched to each case. One subject with a diagnosis of AML following lymphoma was excluded due to the possibility that the AML was therapy related. Individuals with reported Down syndrome (n = 14) were also excluded from all analyses except where noted. The University of Minnesota Institutional Review Board (IRB), the Minnesota Department of Health (MDH) IRB, and the MCSS approved all protocols for data use.
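The sub-cohort selection described above (four children per case, drawn at random within birth year and without regard to case status) can be illustrated with a minimal sketch. The function name, the record layout, and the seed are hypothetical and are not taken from the study; the actual sampling procedure used by the registries is not described in further detail here.

```python
import random
from collections import Counter, defaultdict

def sample_subcohort(births, case_birth_years, ratio=4, seed=0):
    """Draw `ratio` children per case from each birth-year stratum.

    births: iterable of (child_id, birth_year) for the whole birth cohort,
            cases included (sampling ignores case status).
    case_birth_years: one birth year per cancer case.
    All names and the data layout are illustrative assumptions.
    """
    rng = random.Random(seed)
    by_year = defaultdict(list)
    for child_id, year in births:
        by_year[year].append(child_id)

    cases_per_year = Counter(case_birth_years)
    subcohort = []
    for year in sorted(cases_per_year):
        pool = by_year[year]
        n_needed = ratio * cases_per_year[year]
        # Sample without replacement; cap at the stratum size.
        chosen = rng.sample(pool, min(n_needed, len(pool)))
        subcohort.extend((child_id, year) for child_id in chosen)
    return subcohort
```

Because a case can be drawn into the sub-cohort by chance, an analysis built on such a sample needs the case-cohort variance adjustment described under Statistical analysis.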

Variables
Birth certificate variables that were examined included: parental sociodemographic characteristics: parental age, parental race, maternal birthplace, education, marital status; maternal reproductive characteristics: plurality, previous live births, prior fetal loss, last fetal death and live birth intervals; index pregnancy conditions: anemia, diabetes (none, pre-existing or gestational), hypertension (none, chronic, pregnancy-associated or eclampsia), adequacy of prenatal care index [10]; labor and delivery characteristics: labor induction, delivery method; and infant characteristics: sex, birth weight categorized by percentile and as a linear term, size for gestational age, gestational age in weeks, one and five minute Apgar score, assisted ventilation, Down syndrome, and congenital abnormalities. Two gestational age variables were available in birth records: 1) imputed from last menstrual period and 2) the physician's estimate. The physician's estimate of gestational age was used where the imputed gestational age was missing (n = 644 (7.3%)). Size for gestational age was calculated using the derived gestational age variable according to the method of Brenner et al [11].
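The derivation of the gestational age variable described above amounts to a simple fallback rule. A minimal sketch follows; the function and argument names are hypothetical, and missing values are assumed to be represented as `None`.

```python
def derived_gestational_age(lmp_weeks, physician_weeks):
    """Return (gestational_age_weeks, source) for one birth record.

    Prefers the value imputed from last menstrual period (LMP) and falls
    back to the physician's estimate when the LMP-based value is missing.
    Names and the None-for-missing convention are illustrative assumptions.
    """
    if lmp_weeks is not None:
        return lmp_weeks, "lmp"
    if physician_weeks is not None:
        return physician_weeks, "physician"
    return None, "missing"
```

Recording the source alongside the value makes it straightforward to count how many records relied on the physician's estimate.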
Statistical analysis
SAS version 9.1 (SAS Institute, Cary, NC) was used for all statistical analyses. A modification of the stratified Cox proportional hazards model was used to compute risk estimates using a macro written by Langholz and Jiao [8] that computes standard errors using an asymptotic variance estimator appropriate for the analysis of case-cohort designs. All models were stratified by birth year. Person-years of follow-up were calculated as the time from birth to the earliest of the cancer diagnosis date, the child's 15th birthday, or 12/31/2004, with modification of follow-up time for cases who were also selected in the sub-cohort (n = 3) [8].
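The follow-up rule above can be sketched in a few lines. This is an illustration only: the function names, the 365.25-day year, and the handling of 29 February births are assumptions, and the modification of follow-up time for cases who were also sampled into the sub-cohort [8] is not reproduced.

```python
from datetime import date

STUDY_END = date(2004, 12, 31)  # administrative end of follow-up given in the text

def fifteenth_birthday(birth):
    """Date the child turns 15; 29 February births fall back to 1 March (assumption)."""
    try:
        return birth.replace(year=birth.year + 15)
    except ValueError:
        return date(birth.year + 15, 3, 1)

def follow_up(birth, diagnosis=None):
    """Return (exit_date, person_years) for one subject.

    Follow-up runs from birth to the earliest of the diagnosis date,
    the 15th birthday, or the end of the study period.
    """
    candidates = [fifteenth_birthday(birth), STUDY_END]
    if diagnosis is not None:
        candidates.append(diagnosis)
    exit_date = min(candidates)
    return exit_date, (exit_date - birth).days / 365.25
```

For example, `follow_up(date(2000, 6, 15))` is censored at the end of the study period, whereas a child born in 1980 with no diagnosis is censored at the 15th birthday.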
Variables were selected for multivariate models using a likelihood-ratio (LR) chi-square p-value cut-off of 0.1 for ALL, comparing the intercept-only model to that containing the covariate. For AML, only results from birth year stratified models are reported due to the small number of AML cases. Heterogeneity in ALL risk factors by sex or age group (<5 vs. 5-14 years) was examined by including an interaction term between sex or age group and the variable of interest. Tests of statistical significance for heterogeneity of risk were conducted using the TEST statement in PROC PHREG with a statistical criterion of p < 0.05.
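The screening step above can be illustrated for the simplest case of a covariate that adds a single parameter, so that the LR statistic is referred to a chi-square distribution with 1 degree of freedom. The sketch below is not the SAS procedure used in the study; the function names are hypothetical, the log-likelihood values in the usage note are made up, and covariates with several categories would need more degrees of freedom than this sketch handles.

```python
import math

def lr_test_1df(loglik_null, loglik_with_covariate):
    """Likelihood-ratio statistic and p-value for one added parameter.

    For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_with_covariate - loglik_null)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

def keep_for_multivariate(p_value, cutoff=0.1):
    """Apply the p < 0.1 screening criterion described in the text."""
    return p_value < cutoff
```

With hypothetical log-likelihoods of -100.0 (null) and -98.0 (with covariate), the statistic is 4.0 and the covariate would pass the 0.1 cut-off.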

Results
A total of 831 cases of leukemia occurring in children diagnosed from 28 days through 14 years were ascertained by MCSS from 1988-2004. Of these, 695 (84%) were matched to birth records including 12 leukemia cases with Down syndrome (Table 1). Linkage success decreased by age at diagnosis with 88.9%, 84.1%, and 68.4% of cases matching in the 0-4, 5-9, and 10-14 year old age groups respectively. ALL diagnoses were most frequent (81%), followed by AML (14%), chronic myeloproliferative disease (1.7%), myelodysplastic syndrome and other myeloproliferative diseases (1.0%), and unspecified and reticuloendothelial leukemias (2.6%). The majority of cases occurred in children under age 5 years for both ALL and AML (Table 1). Due to the small number of cases with other leukemia types, further analyses were limited to ALL and AML cases.
Table 2 reports birth year adjusted results of regression modeling for the association between parental characteristics and childhood leukemia. Maternal education greater than high school increased the risk for ALL (HR = 1.21, 95% CI, 1.0-1.47), while maternal report of black compared to white race in birth records decreased the risk by approximately 60%. Factors associated with an increased risk for AML were maternal age (HR = 1.21 per 5 year increase, 95% CI, 1.0-1.47), and maternal foreign birthplace (HR = 2.18, 95% CI, 1.14-4.18). No particular national origin was overrepresented; three cases each had mothers who were reported to have been born in Laos and Canada, while one case mother each was born in Germany, El Salvador, Korea, Nigeria, and Peru. Similar associations were found for ALL for paternal age and race and for AML for paternal age. No other significant associations were found for paternal characteristics: race (AML), or education (ALL and AML) (data not shown).
Regression results for maternal reproductive, pregnancy and labor and delivery characteristics are reported in Table 3. Children whose mothers had a last fetal death interval of <3 years compared to those whose mothers had no fetal deaths were at an increased risk for ALL (HR = 1.30, 95% CI, 1.00-1.69). Children who were 3 or more years younger than their previous live born sibling compared to those who had no live siblings were also at an increased risk for ALL (HR = 1.40, 95% CI, 1.12-1.75). No other notable differences were found for maternal reproductive variables. We did not find evidence to suggest differences in pregnancy history or labor and delivery characteristics between cases and the sub-cohort with the exception of a marginally significant decreased risk for AML associated with less than adequate prenatal care (HR = 0.57, 95% CI 0.33-1.0) and an increased AML risk associated with maternal gestational weight gain of 30 or more pounds (HR = 2.59, 95% CI 1.27-5.31) based on few cases. Of note is that the association between maternal gestational weight gain and AML remained after adjustment for birth weight (data not shown).
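For readers who wish to gauge how close to the null a reported interval lies, an approximate standard error and Wald p-value can be recovered from a hazard ratio and its 95% CI. The sketch below is a generic approximation and not part of the study's analysis; it assumes the interval is symmetric on the log scale, and because the published limits are rounded, the recovered values are only rough.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate (se_log_hr, z, two_sided_p) from a HR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return se, z, p

# Values reported above for AML
prenatal_care = wald_from_ci(0.57, 0.33, 1.0)   # less than adequate prenatal care
weight_gain = wald_from_ci(2.59, 1.27, 5.31)    # gestational weight gain >= 30 pounds
```

The rounding caveat matters most for intervals whose limit sits at 1.0, such as the prenatal care estimate, where the approximate p-value can fall on either side of 0.05.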
Several notable associations were found between ALL and AML and infant characteristics (Table 4). Male sex increased the risk for ALL only (HR = 1.38, 95% CI 1.16-1.65). Birth weight was positively associated with both ALL and AML, with HRs of 1.12 (95% CI, 1.03-1.23) and 1.33 (95% CI 1.06-1.66), respectively, for each 581 gram (1 SD) increase. When risk was examined by birth weight percentile, individuals above the 25th percentile (>3117 grams) relative to those in the 5th-25th percentile (2496-3117 grams) were at increased risk for ALL and AML. A pattern consistent with a linear relation between birth weight and ALL was observed for all percentile categories [12].
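Because birth weight entered the model as a linear term, the per-SD hazard ratios above can be re-expressed for other increments by scaling the log hazard ratio. The sketch below illustrates this arithmetic; the function name is hypothetical, and the conversion holds only under the log-linear assumption of the Cox model.

```python
import math

SD_GRAMS = 581.0  # one standard deviation of birth weight, as reported in the text

def rescale_hr(hr_per_sd, grams, sd_grams=SD_GRAMS):
    """Convert a hazard ratio per 1 SD of birth weight to a HR per `grams`.

    HR(grams) = exp(log(HR_per_sd) * grams / sd_grams)
    """
    return math.exp(math.log(hr_per_sd) * grams / sd_grams)

hr_all_per_kg = rescale_hr(1.12, 1000.0)  # ALL, per 1000 g
hr_aml_per_kg = rescale_hr(1.33, 1000.0)  # AML, per 1000 g
```

The same transformation applies to the confidence limits, since they are also defined on the log scale.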
No statistically significant differences in risk between sexes were noted for any parental or infant characteristics for ALL with the exception of maternal education (p = 0.04) in birth year adjusted models, which was associated with an approximately thirty percent increased risk for males whose mothers had education beyond high school but not females (data not shown).
We conducted multivariate analyses for cases diagnosed at 0-14 years of age (Table 5) and by age strata (0-4 years, 5-14 years) (data not shown). Multivariate models included: maternal age, maternal ethnicity, last live birth interval, sex, one minute Apgar score, maternal education, birth weight percentile category and gestational age category. Risk estimates were generally similar to birth year adjusted results for cancers diagnosed 0-14 (Table 5). No statistically significant variation in risk between age groups (0-4 years vs. 5-14 years) or sex was found for variables included in multivariate models.

Discussion
The results of this study confirm previously established risk factors for ALL: male sex, white race, and higher birth weights. Further, children whose siblings were born three or more years prior were at an increased risk for ALL, while those who had low one minute Apgar scores were at an increased risk for both leukemia subtypes. Risks for ALL by age group and sex were generally similar with the exception of that for maternal education in males.
Mothers of AML cases reported being foreign born more often than those of sub-cohort members. However, there was no distinct pattern in maternal nationality, suggesting that this association may be a chance finding. AML cases were also less likely to have received less than adequate prenatal care. However, a biological rationale for this association is lacking, leading us to conclude that random variation is a more plausible explanation.
Mothers of AML cases were also more likely to have gained ≥30 pounds during their pregnancy, in contrast to another study that showed no increased risk for AML associated with maternal gestational weight gain in excess of thirty pounds [13]. It is of interest to examine maternal weight gain and risk of leukemia because of its connection with macrosomia [14], which was associated with an increased risk of AML in this study and others discussed below. Due to small numbers we could not fully address whether the association between higher birth weights and AML was mediated in part through maternal weight gain. However, within the limitations of our data, birth weight and maternal weight gain were independent risk factors in models that included both variables (data not shown).
AML risk increased linearly with maternal age in birth year adjusted analyses. Of note is that maternal age also had a significant linear effect on ALL risk in 0-4 year olds in birth year adjusted models (data not shown). The relation between advanced maternal age (defined as ≥35 in most studies) and leukemia has been examined in many studies with inconsistent findings. Studies reviewed by Little published prior to 1998 and those since for leukemia overall or ALL also support a slight positive association with a mean risk estimate of around 1.2 for mothers who were ≥35 years at the time of birth compared to younger mothers [4,13,15-25]. Most recent studies also generally support an increased risk for AML with a mean risk estimate of approximately 1.5 for mothers who were at least 35 at the time of birth vs. younger mothers [13,15,18,19,22-24]. Studies that have reported risks for ALL by age group have generally supported stronger risks for younger children associated with older maternal ages [13,15,23,25]. Confirmation of increased risk due to advanced maternal age may point to particular mechanisms, such as age associated genetic changes, in need of further study that may help to inform leukemia etiology research and reproductive decision making.
Individuals who were ≥ 3 years younger than their previous live-born sibling were at increased risk for ALL relative to children whose mothers had no other live births. Longer birth intervals have been associated with adverse maternal outcomes, particularly labor complications and pre-eclampsia [26]. In our study, the risk for ALL associated with maternal hypertension was elevated by 30% but was consistent with chance. Pre-eclampsia and hypertension during pregnancy have been examined in at least 6 other studies with inconsistent results [4].
Substantial evidence indicates that high birth weight (≥ 4000 grams) is a risk factor for childhood leukemia, particularly ALL [5]. Our results add to this evidence and are consistent with a linear model for birth weights above 2,496 grams. The relation between high birth weight and AML is less clear but the weight of the evidence favors an association [5]. Our results also support a biological link between higher birth weights and AML and favor a linear model, within the confines of our sample size. Larger infants may be at an increased leukemia risk due to increased fetal insulin-like growth factor 1 (IGF-1) exposure [27], which has been shown to be positively correlated with birth weight [28-31] and is a known mitogen for hematopoietic cells [32]. Increased levels of circulating IGF-1 may offer a proliferative advantage to initiated cells, allowing increased opportunity for malignant transformation.
Low one-minute Apgar scores increased the risk for both AML and ALL. Apgar scores are used to rapidly evaluate the general health of the infant shortly after delivery, with values above 7 considered excellent while values below 7 can indicate an infant who has undergone oxygen deprivation and is at increased risk of neonatal death [33]. Because infants with low Apgar scores often receive oxygen [34], a low Apgar score could indicate an infant who has undergone oxidative stress during delivery and/or who has been exposed to downstream potentially carcinogenic treatments [35]. Although oxygen exposure, as indicated by assisted ventilation, was not associated with leukemia in our study, we had inadequate power to fully address this hypothesis. At least three studies have reported positive associations between low Apgar scores and childhood leukemia or childhood cancer in general [36-38]. Two studies have reported increased risks for ALL [36] or childhood cancer overall associated with neonatal oxygen exposure [38]. Oxygen exposure after birth represents an understudied area with respect to childhood leukemia. Additional data are required to fully understand the impact of newborn medical treatments on risk.
One of the strongest and yet unexplained risk factors for childhood ALL is male sex. We did not find any statistically significant differences in risk factors by sex with the exception of maternal education, which seems unlikely to be due to anything more than chance. Few studies have reported sex-specific risk estimates. Three studies that examined the birth weight association in sex-specific analyses have reported divergent results, with one study showing that the increased risk for higher birth weights was limited to males [39] and two others showing the opposite [22,40]. In our study, risk estimates were in the same direction and did not differ significantly between sexes for birth weight modeled as a continuous or categorical variable. A recent review hypothesized that sex dimorphism in the risk for complex diseases may be related to differential epigenetic gene regulation by sex hormones [41]. Sex-specific epigenetic differences have not been explored, to our knowledge, in the regulation of lymphoid cell lineage gene expression.
A major strength of our study is the case-cohort study design and the collection of exposure data prior to disease onset, which obviates the recall bias that is of particular concern in case-control studies, where data are frequently collected through parental interview after disease status is known. Retrospective case-control studies are also susceptible to selection bias, where estimates could be biased if the exposure distribution of the comparison population is not representative of the source cohort. Because we randomly sampled members of the sub-cohort from the entire birth cohort distribution of cases, selection bias was avoided.
Our study also has limitations. We did not have follow-up information on non-case members of the sub-cohort or have knowledge of cases diagnosed prior to 1988 (the year of inception of the cancer registry). It is also possible that some sub-cohort members who were later diagnosed with leukemia were not captured by MCSS due to out-migration. However, given the rarity of childhood leukemia, it is unlikely that out-migration or missed sub-cohort diagnoses prior to 1988 could have caused a large bias in our results. Also of note is that the success of matching cases captured by MCSS to birth records decreased with diagnosis age. The most likely reason for this is immigration of individuals born elsewhere, which would not have biased our results because our study was of Minnesota-born children. A number of individuals had missing data for variables recorded in birth records. Biased estimates from missing data occur when the data are not missing at random [42]. We do not believe that this could have caused a large bias in our results because the percentage of cases compared to sub-cohort members with missing data was similar for any given variable. Other limitations include the validity of birth record variables; variables with high validity include birth weight, Apgar scores, and delivery method. Variables that are less well measured include prenatal care and complications of pregnancy, labor and delivery [43]. Sociodemographic variables have been generally found to be reliable in at least two studies [44,45]. Most of the findings reported here are for variables that are considered accurate and therefore most likely provide a valid estimate of risk. Finally, some of our findings may have resulted from chance variation in the distribution of exposures between cases and sub-cohort members due to small numbers and multiple comparisons.

Conclusion
We have reported findings from a population-based case-cohort study of childhood leukemia in Minnesota. Our study adds to the current knowledge of leukemia risk factors, particularly birth weight, with evidence supporting a positive linear relation between birth weight and both ALL and AML. Our data also suggest that early-life factors recorded in birth records do not explain the male excess found in ALL. Further, we have shown an association between low Apgar scores and an increased risk of leukemia, the reasons for which should be the subject of further investigation.