We used data from the Early Childhood Longitudinal Study – Birth Cohort (ECLS-B), a prospective, longitudinal study of a nationally representative sample of approximately 10,700 children born in the US in 2001 and followed through kindergarten. The ECLS-B was conducted by the US Department of Education Institute of Education Sciences National Center for Education Statistics (NCES) in collaboration with several other federal agencies. Multiple births, low-birth-weight infants, and children from selected ethnic minority groups, including American Indians, were oversampled. Children born to mothers younger than 15 years and infants who died or were adopted before 9 months were excluded. Our analyses included data from birth certificates and from the 9-month, 4-year (preschool sample), 5-year (2006–2007 kindergarten sample), and 6-year (2007–2008 kindergarten sample) waves of data collection. Data included direct child assessments during home visits, parent/caregiver computer-assisted personal interviews (CAPI), self-administered questionnaires at 9 months, and audio computer-assisted parent (or other caregiver) interviews at 4, 5, and 6 years for sensitive items. The weighted CAPI response rates ranged from 54% to 74% [20], and weighted child assessment response rates for children with parental data ranged from 96% to 99% across time points [21–23]. The majority of children (~72%) were followed to 5 years, when they entered kindergarten. Children who were not age-eligible to enter kindergarten in 2006 were also included in the 2007 kindergarten sample (n = 1,300), along with a small percentage of children (~5%) who repeated kindergarten. We used all available measurements of child height and weight. Our sample included approximately 6,550 children whose mothers reported data about depressive symptoms at 9 months. Children included in the height trajectory analyses had at least two valid height measurements, and those included in the BMI analysis had at least two valid BMI values.
Multiple births (n = 1,350) were excluded because their growth trajectories may differ from those of singletons. We examined weight trajectories over time and changes in weight between time points for implausible values and outliers (more than 3 standard deviations (SD) above average weight gain for two time points). We also examined the effect of height and BMI outliers on our estimates; exclusion of outliers (>3 SD or < −3 SD for height and BMI) did not change the parameter estimates, so our final sample retained these observations. In the final sample, approximately 6,000 children had valid measures at 4 years, 4,600 at 5 years, and 1,300 at 6 years. For a flow diagram of participants included in and excluded from the study, see Figure 1.
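The ±3 SD screen for height and BMI can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name is hypothetical, and it assumes that the mean and SD are computed from the values passed in (for example, all heights at one wave) using the sample standard deviation.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_outliers(values: Sequence[float], threshold: float = 3.0) -> List[bool]:
    """Flag values lying more than `threshold` SDs above or below the mean.

    Assumption: mean and sample SD are taken over `values` themselves,
    e.g. all height measurements from a single wave.
    """
    m = mean(values)
    sd = stdev(values)  # requires at least two values
    if sd == 0:
        # No spread at all, so nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - m) / sd > threshold for v in values]


# Illustrative (made-up) heights in cm: twenty typical values and one extreme.
heights = [100.0] * 20 + [190.0]
flags = flag_outliers(heights)
```

A sensitivity analysis of the kind described above would refit the models with and without the flagged observations and compare the estimates.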
Maternal depressive symptoms were assessed at 9 months using a twelve-item version of the Center for Epidemiologic Studies Depression Scale (CES-D) [24]. The CES-D assesses depressive symptoms during the past week using a four-point Likert scale: 0 = rarely or never, 1 = some or a little, 2 = occasionally or moderately, and 3 = most or all [25]. The twelve-item scale yields a total score from 0 to 36, which we categorized into three groups: scores <5 (no symptoms), 5–9 (mild symptoms), and ≥10 (moderate to severe symptoms). The moderate to severe cutoff of ≥10 corresponds to the score of ≥16 used as the standard cutoff for a high level of depressive symptoms on the original 20-item scale. The twelve-item version of the CES-D has been validated and applied in other large national studies [26]. Internal consistency for the CES-D short form in our sample was high (Cronbach α = 0.88).
We omitted mothers who were missing 9 or more of the 12 CES-D items (n = 1,200); respondents with at least four completed items were included in the analyses (~9,500 mothers, 88.9% of the original sample). The majority of mothers completed all items (n = 6,150) or were missing only one (n = 250); fewer than 150 mothers were missing two or more items. Scores for missing items were imputed using the average item score from the completed items. Sensitivity analyses using a more stringent cutoff for completed CES-D items (only including mothers missing 0 or 1 items) did not yield significant differences in the model estimates (data not shown).
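The scoring rules described above can be sketched in Python as follows. This is an illustrative sketch rather than the study's code: the function names are hypothetical, missing items are represented as None, and imputed totals that are not whole numbers are compared against the cut points without rounding, which is an assumption.

```python
from typing import Optional, Sequence

N_ITEMS = 12        # twelve-item CES-D short form
MIN_COMPLETED = 4   # mothers with fewer completed items are omitted


def cesd_total(items: Sequence[Optional[int]]) -> Optional[float]:
    """Return the 0-36 total score, or None if too few items were answered.

    Each missing item (None) is replaced by the mean of the completed items.
    """
    if len(items) != N_ITEMS:
        raise ValueError("expected exactly 12 item responses")
    answered = [x for x in items if x is not None]
    if len(answered) < MIN_COMPLETED:
        return None
    if any(x not in (0, 1, 2, 3) for x in answered):
        raise ValueError("item responses must be 0, 1, 2, or 3")
    item_mean = sum(answered) / len(answered)
    n_missing = N_ITEMS - len(answered)
    return sum(answered) + item_mean * n_missing


def cesd_category(total: float) -> str:
    """Map a total score to the three symptom groups used in the analysis."""
    if total < 5:
        return "no symptoms"
    if total < 10:  # assumption: unrounded totals between 9 and 10 count as mild
        return "mild symptoms"
    return "moderate to severe symptoms"
```

For example, a mother who answered only four items, each scored 1, would receive an imputed total of 12 and fall in the moderate to severe group.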
Child length/height and weight were directly measured by trained interviewers at home visits at each wave of data collection. At 9 months, a Measure Mat was used to assess children’s recumbent length. At ages 4–6 years, children’s standing height was assessed using a Portable Stadiometer. Children’s weight at 9 months was assessed by first weighing the mother and child together on a digital scale; the mother’s weight was then subtracted to obtain the child’s weight [27]. At later ages, children were weighed independently [25]. At 9 months, two measurements each of weight and length were taken and averaged. If the difference between the two measurements exceeded 5%, the weight or length measurement closest to the weighted average for sample children of the same age and birth weight was used [27]. Likewise, at 4 years, two measurements were taken and averaged unless they differed by more than 5%, in which case the field interviewer checked the measurements to confirm that no error had occurred and re-measured the child if necessary [20]. At ages 5 and 6, three measurements were taken for each assessment and the closest two measurements were averaged [20].
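The two averaging rules can be illustrated with a short Python sketch. The function names and the `reference_mean` parameter are hypothetical, and the 5% difference is taken relative to the mean of the two measurements, which is an assumption because the source does not state the denominator.

```python
from itertools import combinations


def mean_of_closest_two(m1: float, m2: float, m3: float) -> float:
    """Ages 5 and 6: average the two of three measurements that agree best.

    If two pairs are equally close, the first pair in input order is used
    (an arbitrary tie-break chosen for this sketch).
    """
    a, b = min(combinations((m1, m2, m3), 2), key=lambda p: abs(p[0] - p[1]))
    return (a + b) / 2


def nine_month_value(m1: float, m2: float, reference_mean: float,
                     tolerance: float = 0.05) -> float:
    """9 months: average two measurements unless they differ by more than 5%.

    `reference_mean` stands in for the weighted average among sample children
    of the same age and birth weight; when the pair disagrees, the measurement
    closer to that reference is returned.
    """
    pair_mean = (m1 + m2) / 2
    # Assumption: the difference is expressed relative to the pair's mean.
    if abs(m1 - m2) <= tolerance * pair_mean:
        return pair_mean
    return min((m1, m2), key=lambda m: abs(m - reference_mean))
```

With made-up lengths of 70.0 and 80.0 cm and a reference mean of 72.0 cm, the second function would return 70.0, because the two measurements differ by more than 5% and 70.0 lies closer to the reference.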
BMI was calculated as weight (kg) divided by height (m) squared for children 4–6 years of age. BMI was not calculated for 9-month-old children because it is not a preferred nutritional indicator for children under 2 years. We therefore also used weight-for-length z-scores at 9 months along with BMI z-scores at 4, 5, and 6 years, based on the US 2000 Growth Charts (using the zanthro command in Stata). We also examined weight-for-length (at 9 months) and weight-for-height (4–6 years) z-scores at each time point.
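As a brief illustration, the BMI calculation and the general form of an LMS-based z-score can be sketched in Python. This does not reproduce the zanthro command: the function names are hypothetical, and the L, M, and S values must be looked up in the growth chart reference tables for the child's age and sex, which are not included here.

```python
import math


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def lms_zscore(x: float, L: float, M: float, S: float) -> float:
    """z-score of measurement x under the LMS method.

    L (Box-Cox power), M (median), and S (coefficient of variation) are
    age- and sex-specific reference values supplied by the caller.
    """
    if L == 0:
        return math.log(x / M) / S
    return ((x / M) ** L - 1) / (L * S)


# Illustrative (made-up) child: 18 kg and 1.05 m tall.
example_bmi = bmi(18.0, 1.05)
```

A measurement equal to the reference median M yields a z-score of 0 regardless of L and S.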
Analyses adjusted for socio-demographic, household, maternal and child characteristics based on maternal report at 9 months unless otherwise noted. Household income was categorized as: <$25,000, $25,000-$49,999, $50,000-$99,999, and ≥ $100,000. Household food security was measured by 18 items from the US Department of Agriculture Household Food Security Scale, assessing food availability and hunger over the past twelve months [28]. It was categorized as secure or insecure; food-insecure households included those with and without hunger. Home ownership was measured as a dichotomous variable (owned or not owned home) as was family structure (single or two-parent family).
Maternal characteristics included age, race/ethnicity (Non-Hispanic (NH) White, NH Black, NH Asian, Hispanic, and Other), education (some High School (HS) or less, HS graduate, some college, college and beyond), prepregnancy weight, weight gain during pregnancy (without subtracting birth weight), parity, and smoking status. Child characteristics included child sex, birth weight and gestational age (from birth certificate), age at interview, overall child health status, and whether the child was ever breastfed.
We used multiple imputation with 10 imputed datasets to fill in missing covariate values; all covariates were missing less than 3% of responses. Sensitivity analyses were also conducted using complete-case analysis and nearest-neighbor hot-deck imputation, which produced similar results.
Statistical analysis
We first evaluated the association between maternal depressive symptoms and the covariates using chi-square statistics and unadjusted logistic regression. Exploratory data analyses of child height and BMI over time included graphing cross-sectional box plots, average trajectories, and individual spaghetti plots to examine the shape of trajectories and identify potential outlying values. Unadjusted longitudinal analyses of the associations between height and BMI and all covariates were conducted using random effects models, including random intercepts and slopes. Covariates measured at baseline were selected based on significant unadjusted associations (P < 0.05) with height or BMI or because they were conceptually related to child growth. Including time-varying covariates in the height and BMI trajectory models did not produce meaningful differences in the model estimates, so baseline covariates were used in the final models.
Non-linearity in growth was accounted for by including quadratic terms for child age in the final models for both height and BMI. Both partially adjusted models (controlling for child age and sex) and fully adjusted models (controlling for all covariates) were fit. Main effect models examined a linear shift associated with maternal depressive symptoms. Interaction models included an interaction term for maternal depressive symptoms and child age to determine whether growth trajectories varied by level of maternal depressive symptoms over time. The analysis resulted in four models each for child height and BMI trajectories: Models 1a and 1c, partially adjusted main effect models; Models 1b and 1d, fully adjusted main effect models; Models 2a and 2c, partially adjusted interaction models; and Models 2b and 2d, fully adjusted interaction models. We conducted two additional analyses using 1) combined weight-for-length z-scores (9 months) and BMI z-scores (4–6 years) and 2) weight-for-length/height z-scores at all ages. The models using BMI values (kg/m²) showed similar results to the BMI trajectory models with z-scores, so we present the results for the BMI trajectory models using BMI values because of their clinical relevance and ease of interpretation. Model fit was assessed by comparing the Akaike information criterion (AIC) values of potential models and conducting likelihood ratio tests of nested models (main effects vs. growth trajectory models for height and BMI).
Random effects models were used to examine the relation of maternal depressive symptoms with child height from age 9 months to 6 years and BMI from age 4 to 6 years, adjusting for socio-demographic and child health covariates. Because coefficient and standard error estimates did not change substantially between models using independent, exchangeable, and unstructured covariance between the random slopes and intercepts, independent covariance was specified in the final models for simplicity. Analyses were conducted using Stata 11.0 (StataCorp, College Station, Texas). P-values were based on two-sided tests.
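One way to write the fully adjusted interaction model described above is given below. The notation is introduced here for illustration and is not taken from the original analysis: Y_ij is the height or BMI of child i at measurement j, D_1i and D_2i are indicators for mild and moderate to severe maternal depressive symptoms (no symptoms as the reference group), and X_i is the vector of baseline covariates. Normally distributed random effects and residuals are assumed, and placing the interaction on the linear age term only is an interpretation of the description above.

```latex
\[
\begin{aligned}
Y_{ij} ={}& \beta_0 + \beta_1\,\mathrm{age}_{ij} + \beta_2\,\mathrm{age}_{ij}^{2}
  + \sum_{k=1}^{2} \gamma_k D_{ki}
  + \sum_{k=1}^{2} \delta_k \left( D_{ki} \times \mathrm{age}_{ij} \right)
  + \boldsymbol{\theta}^{\top} \mathbf{X}_i \\
 & + b_{0i} + b_{1i}\,\mathrm{age}_{ij} + \varepsilon_{ij}, \\
 & b_{0i} \sim N(0, \sigma_0^{2}), \quad
   b_{1i} \sim N(0, \sigma_1^{2}), \quad
   \operatorname{Cov}(b_{0i}, b_{1i}) = 0, \quad
   \varepsilon_{ij} \sim N(0, \sigma^{2})
\end{aligned}
\]
```

In this notation, the main effect models correspond to setting both δ_k to zero, and the partially adjusted models restrict X_i to child sex. The zero covariance between the random intercept b_0i and random slope b_1i reflects the independent covariance structure specified in the final models.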
Weighted analyses were conducted for descriptive statistics and point estimates due to oversampling of particular groups. Unweighted analyses were performed for regression models because the analyses focused on the relations between variables rather than prevalence or point estimates [29].
The ECLS-B is a restricted-use secondary dataset. The authors received a restricted-use license and access to the data from the US Department of Education National Center for Education Statistics (NCES). This study was approved by the NCES and Johns Hopkins Bloomberg School of Public Health Institutional Review Board for human subject research. The authors abided by the confidentiality regulations and restrictions for using the data and rounded all figures to the nearest 50 based on the NCES data reporting requirements. This manuscript was submitted to NCES for a disclosure review and was approved for publication.