Statistical analyses
Extension of weight-for-age
The WHO reference curves are based on ‘core data’ from the US NCHS, collected between 1963 and 1975 on 11,507 girls and 11,410 boys aged 1–24y and kindly provided by Dr. M. de Onis of the WHO [5]. A waiver for use of this anonymized, publicly available dataset was obtained from the Institutional Review Board of the Montreal Children’s Hospital (McGill University). For the development of the WHO reference curves for school-aged children and adolescents, these data were merged with ~8,000 cross-sectional observations from the MGRS (ages 18–71 months) to smooth the transition at age 5y [5]. These MGRS data are not yet in the public domain.
To extend the weight-for-age reference curve beyond age 10y, WHO exclusion criteria were first applied to create a reduced dataset (NCHS-R) with 11,193 girls and 11,106 boys. As in the WHO reports, observations were excluded for both ‘outlying’ heights-for-age (14 girls, 8 boys) and ‘unhealthy’ weights-for-height (300 girls, 296 boys), the latter defined by the WHO as weights-for-height below the 0.135th or above the 97.7th centile (−3 and +2 SD, respectively). After exclusions, each annual age interval from 5–19y contained 673 ± 204 boys and 646 ± 185 girls (mean ± SD). Detailed descriptive statistics for each cohort are summarized in Additional file 1: Table S1.
WHO global deviance and information criteria [3–5] were then applied using the GAMLSS statistical package of Stasinopoulos and Rigby to develop optimal Box-Cox power exponential (BCPE) models that explicitly fit the time-evolution of 4 parameters: μ (median), σ (coefficient of variation), ν (skew) and τ (kurtosis) [14, 15]. Each parameter was smoothed across age by cubic splines, with degrees of freedom (df) chosen to balance accuracy and smoothness. Before the BCPE model was applied, the time (age) axis was power-transformed with exponent λ to better capture periods of rapid change [4, 5, 14, 16]. A detailed review of the exclusion process, modeling procedure, and diagnostic validation is found in the Statistical Methods and Models manual at the CPEG website [17]. The optimal models were:
● For girls, λ = 1.22, df(μ) = 14, df(σ) = 6, df(ν) = 5, and τ = 2
● For boys, λ = 1.30, df(μ) = 13, df(σ) = 8, df(ν) = 5, and τ = 2
When τ = 2, kurtosis may be ignored, and the BCPE model reduces to the simpler 3-parameter Box-Cox normal, or LMS, model (L = ν, M = μ, and S = σ) [4, 5, 14, 16]. Model fit was confirmed through standard goodness-of-fit tests and diagnostic plots [4, 8, 17].
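For orientation, the relations between a measurement, its z-score, and a centile under the LMS model can be written as below. This is the standard textbook form of the LMS method, not a transcription of the fitted models; the symbols y (measurement), z (z-score), and z_α (standard normal deviate for centile α) are notation introduced for this sketch.

```latex
z =
\begin{cases}
  \dfrac{(y/M)^{L} - 1}{L\,S}, & L \neq 0 \\[1.5ex]
  \dfrac{\ln(y/M)}{S},         & L = 0
\end{cases}
\qquad
C_{\alpha} =
\begin{cases}
  M\,\bigl(1 + L\,S\,z_{\alpha}\bigr)^{1/L}, & L \neq 0 \\[1ex]
  M\,\exp\bigl(S\,z_{\alpha}\bigr),          & L = 0
\end{cases}
```

Here L, M, and S are evaluated at the age of interest. For the median, z_α = 0, so the 50th centile equals M.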
For specific ages and genders, published LMS data (WHO reference and standard) were used to generate smoothed centiles 3, 10, 25, 50, 75, 90, and 97 (approximately −2 to +2 SD) for height-for-age, BMI-for-age (2–19y), length-for-age, weight-for-length, and head circumference-for-age (0–2y). The same is true for the weight-for-age curves from 2–10y. Beyond age 10y, weight-for-age centiles are based on the NCHS-R dataset fitted here.
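As a minimal sketch of how centile values follow from LMS parameters at a single age, the Python below applies the standard LMS relation C = M(1 + L·S·z)^(1/L). The function names and the L, M, S values are hypothetical and chosen only for illustration; they are not taken from any WHO table or from the models fitted here.

```python
import math
from statistics import NormalDist

# Centiles plotted on the charts, as percentages.
CENTILES = (3, 10, 25, 50, 75, 90, 97)


def lms_value(L, M, S, z):
    """Measurement corresponding to z-score z under the LMS model."""
    if abs(L) < 1e-8:  # limiting log-normal case as L approaches 0
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


def lms_zscore(L, M, S, y):
    """z-score of measurement y under the LMS model (inverse of lms_value)."""
    if abs(L) < 1e-8:
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


def centile_table(L, M, S, centiles=CENTILES):
    """Map each centile (in percent) to its measurement value."""
    normal = NormalDist()
    return {p: lms_value(L, M, S, normal.inv_cdf(p / 100.0)) for p in centiles}


# Illustrative parameters only (hypothetical, not WHO values).
table = centile_table(L=-1.0, M=30.0, S=0.15)
for p, value in table.items():
    print(f"centile {p:>2}: {value:.2f}")
```

Note that the 3rd and 97th centiles correspond to z-scores of about ±1.88 rather than exactly ±2.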
Comparison with ‘2010 WHO growth charts for Canada’
At monthly intervals from 5–10 years of age, calculated weight-for-age centiles were compared to the corresponding WHO centiles using the absolute deviation, reported in kilograms (kg) as mean ± SD.
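The comparison can be sketched as follows, assuming two centile curves evaluated at the same monthly ages. The function name and the sample values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not specify which form was used.

```python
from statistics import mean, stdev


def abs_deviation_summary(calculated_kg, reference_kg):
    """Mean and sample SD of |calculated - reference| over matched ages."""
    if len(calculated_kg) != len(reference_kg):
        raise ValueError("curves must be evaluated at the same ages")
    deviations = [abs(c - r) for c, r in zip(calculated_kg, reference_kg)]
    return mean(deviations), stdev(deviations)


# Made-up values for one centile at three consecutive months.
calculated = [20.0, 21.0, 22.5]
reference = [20.1, 20.8, 22.5]
mean_dev, sd_dev = abs_deviation_summary(calculated, reference)
print(f"absolute deviation: {mean_dev:.2f} ± {sd_dev:.2f} kg")
```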
Comparing smoothed to empiric centiles (NCHS-R)
For ages 5–19y, the smoothed centile lines calculated from their LMS parameters were compared graphically to empiric centiles, which were calculated separately for each gender after the raw NCHS-R data were grouped (binned) by year of age. In addition, the smoothed centiles were used to determine the proportion of the reference population falling below each centile line.
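A minimal sketch of both checks is given below. It assumes records are (age in years, weight in kg) pairs, that binning is by completed year of age, and that empirical centiles use linear interpolation between order statistics. None of these details is specified in the text, and all names and data are hypothetical.

```python
import math
from collections import defaultdict


def empirical_centile(values, p):
    """p-th centile (0-100) by linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no observations in bin")
    pos = (len(xs) - 1) * p / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def bin_by_year(records):
    """Group weights by completed year of age (assumed binning rule)."""
    bins = defaultdict(list)
    for age_years, weight_kg in records:
        bins[int(age_years)].append(weight_kg)
    return dict(bins)


def proportion_below(records, centile_curve):
    """Fraction of records lying below centile_curve(age)."""
    below = sum(1 for age, weight in records if weight < centile_curve(age))
    return below / len(records)


# Made-up records for illustration only.
records = [(5.2, 18.0), (5.8, 20.0), (5.5, 22.0), (6.1, 21.0), (6.9, 25.0)]
bins = bin_by_year(records)
median_age5 = empirical_centile(bins[5], 50)
# A flat 21 kg line stands in for a smoothed centile curve.
share_below = proportion_below(records, lambda age: 21.0)
```

If the smoothed curves fit well, the proportion falling below the p-th centile line should be close to p percent.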
Bootstrap resampling
To estimate sample bias, standard errors, and 95% confidence intervals, smoothed LMS centile curves for each gender and anthropometric measure were fitted to 1,000 nonparametric bootstrap replicates drawn from the NCHS data. The 95% confidence intervals are the standard intervals of Efron and Tibshirani [18, 19]. The same procedure was used to examine the effects of sample size, with bootstrap sample sizes varied from 50 to 300 per month (total sample 10,250–61,500).
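The bootstrap summary can be illustrated with the sketch below. It assumes that the "standard interval" is the estimate ± 1.96 bootstrap standard errors, and it uses the sample median as a simple stand-in statistic; the analysis described above refits LMS centile curves to each replicate, which is not reproduced here. The function name, seed, and data are hypothetical.

```python
import random
import statistics


def bootstrap_summary(data, statistic, n_boot=1000, seed=0):
    """Bias, standard error, and standard 95% interval from a
    nonparametric bootstrap (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(data)
    estimate = statistic(data)
    # Each replicate resamples n observations with replacement.
    replicates = [statistic(rng.choices(data, k=n)) for _ in range(n_boot)]
    se = statistics.stdev(replicates)
    bias = statistics.mean(replicates) - estimate
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    return {
        "estimate": estimate,
        "bias": bias,
        "se": se,
        "ci95": (estimate - z * se, estimate + z * se),
    }


# Made-up data; the median stands in for a fitted centile value.
data = [float(x) for x in range(1, 101)]
result = bootstrap_summary(data, statistics.median, n_boot=200, seed=1)
```

The standard interval assumes the bootstrap distribution of the statistic is roughly normal; percentile-based intervals are a common alternative when it is not.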
All statistical analyses were performed in R [20]. Unless otherwise noted, values are reported as mean ± standard deviation (SD), and statistical significance was defined as p < 0.05.