Is utility-based quality of life associated with overweight in children? Evidence from the UK WAVES randomised controlled study

Background Quality-Adjusted Life Years (QALYs) are often used to make judgements about the relative cost-effectiveness of competing interventions and require an understanding of the relationship between health and health-related quality of life (HRQOL) when measured in utility terms. There is a dearth of information in the literature concerning how childhood overweight is associated with quality of life when this is measured using utilities. This study explores how weight is associated with utility-based HRQOL in 5–6 year olds and examines the psychometric properties of a newly developed pediatric utility measure – the CHU9D instrument. Methods Weight and HRQOL were examined using data collected from 1334 children recruited within a UK randomised controlled trial (WAVES) (ISRCTN97000586). Utility-based HRQOL was measured using the CHU9D, and general HRQOL was measured using the PedsQL instrument. The association between weight and HRQOL was examined through a series of descriptive and multivariate analyses. The construct validity of the CHU9D was further assessed in relation to weight status, in direct comparison to the PedsQL instrument. Results The HRQOL of children who were either overweight or obese was not statistically different from that of children who were healthy weight or underweight. This result was the same whether HRQOL was measured in utility terms using the CHU9D instrument or in general terms using the PedsQL instrument. Furthermore, the results support the construct validity of the newly developed CHU9D, as the PedsQL total HRQOL scores corresponded well with the individual CHU9D dimensions. Conclusion At age 5–6 years, the inverse association between overweight and HRQOL is not being captured by either the utility-based CHU9D instrument or the PedsQL instrument. This result has implications for how the cost-effectiveness of childhood obesity interventions is measured in children aged 5–6 years. Trial registration ISRCTN Registry: ISRCTN97000586 19th May 2010.


Background
Childhood obesity is a growing problem worldwide [1][2][3]. The direct annual costs of obesity and its associated health consequences across the EU are about 7 % of national health budgets [4]; within the UK National Health Service (NHS) they are approximately £4.2 billion, with an estimated cost of £16 billion to the wider economy [5].
A range of interventions has been developed to prevent and manage childhood obesity [6]. However, there is an absence of evidence on the cost-effectiveness of such interventions. Whilst there is much evidence to suggest that weight status has an effect on adult health-related quality of life (HRQOL) [7][8][9][10][11], and many studies have reported similar associations in adolescents [12][13][14], these studies report HRQOL in general terms rather than in the more specific utility terms required for an economic analysis. In the UK, decision-making bodies such as the National Institute for Health and Care Excellence (NICE) recommend that HRQOL is measured in utility terms to facilitate the construction of Quality-Adjusted Life Years (QALYs). QALYs are then used as the unit of assessment for comparing the cost-effectiveness of alternative interventions [15] and are now used to inform resource allocation decisions worldwide [16]. Conventional practice within economic evaluations is to measure HRQOL on a cardinal 0-1 utility scale with death (0) and full health (1) denoting either end of the scale [17]. Very few studies have looked at the impact of childhood overweight/obesity on HRQOL when it is measured in utility terms [18], yet this information is vital for the construction of QALYs. This study directly addresses this evidence gap.
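As an illustration of why utility values are needed to construct QALYs, the following minimal sketch weights time spent in a health state by the utility of that state and sums the result. The function name and the utility profiles are hypothetical and chosen only for illustration; they are not data from the WAVES trial, and discounting is deliberately omitted.

```python
def qalys(periods):
    """Sum utility x duration over a list of (utility, years) periods.

    Utilities are assumed to lie on the 0-1 scale described above,
    where 0 denotes death and 1 denotes full health.
    """
    total = 0.0
    for utility, years in periods:
        if years < 0:
            raise ValueError("duration must be non-negative")
        total += utility * years
    return total


# Hypothetical utility profiles (illustrative values only).
with_intervention = [(0.90, 2.0), (0.85, 3.0)]
without_intervention = [(0.80, 2.0), (0.75, 3.0)]

# Incremental QALYs gained over the five-year horizon.
qaly_gain = qalys(with_intervention) - qalys(without_intervention)
print(round(qaly_gain, 2))
```

In a full economic evaluation the incremental QALYs would be set against incremental costs; that step is outside the scope of this sketch.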
Assessment of health status in children differs from adults and requires a different conceptual approach due to rapid rates of development, dependency on parents/caregivers and differences in disease epidemiology [19]. Utility-based HRQOL in children therefore needs to be measured using an instrument specifically designed for children. The CHU9D is a recently developed generic HRQOL measure designed to produce utility information. Originally tested for 7-11 year olds [20,21], it has more recently demonstrated good construct validity in adolescents aged 11-17 years [22]. Although there is emerging evidence regarding the psychometric properties of the CHU9D instrument [22,23], more evidence is required with respect to its validity for use in different age groups and country settings. Different terms are used in the literature to describe validity. In this context, discriminant validity refers to the degree to which the instrument discriminates between groups with known differences, and convergent validity refers to the degree to which two theoretically related measures of a construct are actually related. Both are subtypes of construct validity [24]. This paper explored the relationship between weight status and utility-based HRQOL (measured on a 0-1 scale reflecting death and full health, respectively) in children aged 5-6 years. It also examined the construct validity of the CHU9D instrument by reporting specifically on its discriminant and convergent validity. To facilitate this assessment, the CHU9D was directly compared to the PedsQL instrument [25], a widely used, validated generic HRQOL measure in children.

Methods
The WAVES study is a UK-based cluster-randomised controlled trial assessing clinical and cost-effectiveness of an obesity prevention intervention targeting children, funded by the UK National Institute for Health Research (ISRCTN97000586; Date of registration: 19/5/2010) from 2010 to 2015. Fifty-four schools (recruited from a random sample of 200) participated in the study. The study had full ethics approval and was conducted in accordance with the World Medical Association's Declaration of Helsinki (National Research Ethics Service Committee, West Midlands, The Black Country No. 10/H1202/69). The random sample was weighted to achieve sufficient representation (to enable sub group analysis) from the two most prevalent ethnic minority groups in the West Midlands, UK: South Asian (Bangladeshi, Indian and Pakistani) and Black (African and Caribbean). All children in school year 1 (aged 5-6) from participating schools were invited to take part. Written parental consent was obtained for each study participant through a signed consent form and verbal assent from the children at the point of measurement. Parental consent was obtained for 1470 children (60 % of those eligible), and 1401 children (95 % of those consented/57 % of those eligible) were available for baseline measurements. For practical reasons the schools were split into two groups, half the schools had baseline measurements taken in 2011 and the other half in 2012. Data on participants' date of birth, sex and postcode were obtained from school records. Ethnicity data were collected through a parent completed questionnaire, or school records when this was not available. Small area deprivation was used as a proxy for socioeconomic status. Deprivation was assessed using the index of multiple deprivation (IMD) [26]. The IMD score for the residential area of each child was identified based on their postcode using an online facility [27]. 
These scores were then allocated to the appropriate IMD quintile: children in the first quintile were living in an area classified by the IMD as one of the 20 % most deprived in England, and those in the fifth quintile in an area classified as one of the 20 % least deprived.

Measurement of weight status
For all participants, height and weight measures were taken at school by trained researchers using standardised instruments and procedures. Height was measured to the nearest 0.1 cm using a Leicester height measure. Weight was measured in light clothing without shoes to the nearest 0.1 kg using a Tanita SC-331 S body composition analyser. BMI was calculated by dividing weight (in kilograms) by height (in metres) squared (kg/m²) and used to categorise the children into underweight, healthy weight, overweight and obese groups. The 2nd, 85th and 95th centiles of the UK 1990 Growth reference charts for BMI [28] were used to define the four weight categories, in line with standard UK definitions [29].
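The calculation and categorisation described above can be sketched as follows. This is a minimal illustration, not the trial's own code: the function names are hypothetical, the BMI centile is taken as an input because the age- and sex-specific UK 1990 reference tables are not reproduced here, and the handling of values falling exactly on a cut-off is an assumption.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def weight_category(bmi_centile):
    """Map a BMI centile (0-100) to one of the four weight categories.

    Cut-offs are the 2nd, 85th and 95th centiles named in the text.
    Treating values exactly on a cut-off as belonging to the more
    extreme category is an assumption made for this sketch.
    """
    if not 0 <= bmi_centile <= 100:
        raise ValueError("centile must be between 0 and 100")
    if bmi_centile <= 2:
        return "underweight"
    if bmi_centile < 85:
        return "healthy weight"
    if bmi_centile < 95:
        return "overweight"
    return "obese"


# Hypothetical child: 20.0 kg and 115.0 cm, assumed to sit on the 50th centile.
print(round(bmi(20.0, 115.0), 2), weight_category(50))
```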

HRQOL measures
As the focus of this study was to explore the association between weight status and HRQOL when measured in utility terms, two instruments were selected for the measurement of HRQOL. Both are generic instruments and thus are designed to measure a wider notion of HRQOL and are not specific to any one disease or condition. The CHU9D is a preference-based utility instrument designed exclusively for use in children, and previous research has shown this instrument is the most appropriate choice in this age group [30]. As a utility-based instrument, it is designed to produce a HRQOL score that is preference-based and set between the values of 0 (death) and 1 (full health); however, unlike many preference-based utility instruments, it does not produce scores that are deemed to be 'worse than death' (values of less than 0). The PedsQL was chosen as a 'gold standard' comparator as it is a widely used HRQOL instrument validated for use in this age group and was the instrument of choice for the WAVES trial from which the data were generated. Although this instrument is not utility-based, it would be expected to generate HRQOL values which move in the same direction as the CHU9D utility values.

CHU9D
The CHU9D instrument contains 9 dimensions: school work/homework; tired; sleep; worried; sad; annoyed; daily routine; ability to join in activities; and pain. Every dimension contains 5 levels indicating the severity of the dimension. Each of the possible 1,953,125 unique health states (5 levels across 9 dimensions, i.e. 5^9) is assigned a health utility value ranging from 0.33 to 1 based on an algorithm that reflects the preference weight attached to each dimension [31].

PedsQL
The PedsQL is a 23-item instrument including four domains: physical (8 items), emotional (5 items), social (5 items), and school (5 items) functioning [25,32]. For this study we used the child self-report PedsQL version designed for use in 5-7 year olds. Emerging from the instrument is a score (transformed on to a 0-100 scale) for each type of functioning, with higher scores indicating better quality of life. Each item has three response options: not at all; sometimes; a lot; which in the scoring process are assigned values of 100; 50; 0, respectively. Provided data are available for at least half of the relevant items, the mean score for each of the four domains is then calculated by summing the values for the relevant items and dividing by the number of items answered. This is repeated including all items for the total score. The PedsQL instrument has good reliability and validity in both sick and healthy populations [32][33][34][35].
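The scoring rule described above can be sketched as follows. This is an illustrative reading of the text rather than the developers' official scoring code; the names `RESPONSE_VALUES` and `scale_score` are hypothetical, and missing items are represented by `None`.

```python
# Item values for the three response options, as stated in the text.
RESPONSE_VALUES = {"not at all": 100, "sometimes": 50, "a lot": 0}


def scale_score(responses):
    """Mean 0-100 score for a list of item responses (None = missing).

    Returns None when fewer than half of the items were answered,
    following the 'at least half of the relevant items' rule.
    """
    answered = [RESPONSE_VALUES[r] for r in responses if r is not None]
    if not answered or len(answered) * 2 < len(responses):
        return None
    return sum(answered) / len(answered)


# Hypothetical emotional-functioning responses (5 items, one missing).
emotional = ["not at all", "sometimes", None, "a lot", "not at all"]
print(scale_score(emotional))

# With only two of five items answered, no score is produced.
print(scale_score(["sometimes", None, None, "a lot", None]))
```

The total score would be obtained in the same way by passing the responses to all 23 items in a single list.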
Both the CHU9D and the PedsQL were administered at the same time point by researchers on a one-to-one basis. The items and possible responses were read out to the children. For the PedsQL, to help the children understand how to answer, a visual prompt (a face ranging from smiley to sad associated with each response option) was provided, as recommended by the developers of the instrument for administration to young children.

Statistical analysis
In the absence of a gold standard for the measurement of utility-based HRQOL in young children, and with no prior knowledge of how weight status affects utility-based HRQOL in children, to measure the construct validity of the CHU9D, we looked at the relationship between CHU9D and PedsQL in relation to weight status. This method allowed us to explore two subtypes of construct validity: discriminant and convergent validity. We explored discriminant validity by determining if the CHU9D instrument was able to discriminate between children within different weight groups, and the convergent validity by assessing how the CHU9D correlated with the PedsQL measure.
To explore the relationship between HRQOL and sample characteristics we report mean (and SD) CHU9D and PedsQL scores by weight status category, gender, ethnic group and deprivation quintile. Differences in HRQOL scores between groups were assessed using either the Kruskal-Wallis test, or the non-parametric test for trend. To examine the construct validity of the CHU9D, we split the sample according to the median PedsQL total score and examined separately the mean CHU9D utility value for children who scored on or above this median score, and those who scored below it. This difference was then compared using the one-way ANOVA test. Next, we looked at the distribution of response to each of the CHU9D dimensions by weight status category to assess if there were any significant differences in response. We hypothesised that children in the overweight and obese category would report more problems in each dimension compared to children in the healthy and underweight category. We assessed the significance of differences in response using the chi-squared test. To determine how well the PedsQL scores correspond with the CHU9D dimensions we estimated the mean PedsQL total score for each level of CHU9D response with the expectation that with increasing severity on each CHU9D dimension, the mean PedsQL total score would be lower. A scatter plot (along with fitted regression line and 95 % CIs) for the CHU9D utility values and the total PedsQL scores was used to visualise the correlation between the instruments, and the correlation coefficient was calculated using the Spearman's rho statistic. To explore the correlation further we looked at the relationship between theoretically similar dimensions within both instruments. 
Our prior expectation was that specific pairs of theoretically similar dimensions would be correlated. Finally, to compare the CHU9D utility values between the weight groups we used a linear mixed regression model (with random effect for school), adjusted for potential confounders (age, gender, ethnicity and deprivation quintile). All analyses were undertaken in 2014, using Stata version 13.
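Two of the steps described in this section, the median split on the PedsQL total score and Spearman's rho between the two instruments, can be sketched in plain Python as follows. The analyses in this study were run in Stata; this sketch is only an illustration, the function names are hypothetical, the data are made up, and the significance tests (ANOVA, Kruskal-Wallis, chi-squared) and the mixed model are not reproduced.

```python
from statistics import mean, median


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def median_split_means(pedsql, chu9d):
    """Mean CHU9D utility for children on/above vs below the PedsQL median."""
    cut = median(pedsql)
    upper = [u for p, u in zip(pedsql, chu9d) if p >= cut]
    lower = [u for p, u in zip(pedsql, chu9d) if p < cut]
    return mean(upper), mean(lower)


# Made-up scores for six children (illustrative only).
pedsql_total = [50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
chu9d_utility = [0.60, 0.75, 0.70, 0.85, 0.90, 1.00]

rho = spearman_rho(pedsql_total, chu9d_utility)
upper_mean, lower_mean = median_split_means(pedsql_total, chu9d_utility)
print(round(rho, 3), round(upper_mean, 3), round(lower_mean, 3))
```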

Results
Full data (including PedsQL total score, CHU9D utility value, and weight status group) were available for 1344 children and are presented in Table 1. The proportion of children in the study sample who were either obese or overweight (21.7 %) is similar to the most comparable national data available [36] in which 22.6 % of children measured in their Reception Year during the 2011/12 school year were classified as overweight or obese.

Discriminant validity
Using the known-groups method, the CHU9D (but not the PedsQL) differentiated HRQOL in children of different ethnic origin (p = 0.028), with White British children having the highest mean utility score (Table 2). There was a statistically significant trend of decreasing HRQOL with increasing level of deprivation, which was identified by both instruments (p < 0.05). When children were categorised into two groups according to their weight status, neither instrument differentiated between the two groups.
To explore the discriminant validity of the CHU9D instrument, the mean and standard deviations for the CHU9D utility values were estimated for children who had a score either above, or below, the median PedsQL total score (71.73) for the sample. The mean utility scores were 0.87 (SD 0.109) and 0.76 (SD 0.143) respectively (p < 0.001). Table 3 shows the distribution of the CHU9D dimensions by weight status category. Overall, the majority of children had no or few problems for all dimensions, irrespective of weight status. There were no underlying differences in the distribution of response to any of the CHU9D dimensions between children in the different weight categories. Table 4 shows how the mean PedsQL scores corresponded with the options for each of the CHU9D dimensions. The mean PedsQL total scores decrease linearly with increasing severity on each of the CHU9D dimensions. Figure 1 shows the relationship between the CHU9D utility values and the PedsQL total scores. Although there is a moderate association between the instruments with higher CHU9D utility values corresponding with higher PedsQL total scores, there are some anomalies. For example, one child reported a CHU9D utility of 0.32, yet had a PedsQL total score of 76.09, and another child reported a CHU9D utility score of 0.9, yet had a PedsQL total score of 13.04.

Convergent validity
Overall, there was a statistically significant, moderate, positive correlation between the CHU9D utility values and PedsQL total scores (rs = 0.4696, p < 0.001). The content and coverage of the two instruments were further assessed by examining the correlation between the CHU9D dimensions and the predetermined PedsQL domain functioning scores (Table 5).
Using conventional cut-off values for Spearman's ρ, we found that each CHU9D dimension was either weakly, or very weakly correlated with each of the predetermined PedsQL domain functioning scores. As the CHU9D dimensions are coded with 1 as highest level and 5 as lowest level, the signs on the coefficients were consistently negative. Table 6 shows the results of the linear mixed regression model (with random effect for school) which compared the CHU9D utility score between the two weight status groups, adjusted for potential confounders (age, gender, ethnicity and deprivation quintile). Children who are overweight or obese have a lower CHU9D utility value (i.e. poorer HRQOL) but this association is not statistically significant. Children from a non-White British background have lower mean CHU9D utility values and this association approaches significance (p = 0.07) for the South Asian population. Also, children from the least deprived areas have significantly higher CHU9D utility values relative to children from the most deprived areas.
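The verbal labels applied to the Spearman coefficients above depend on the cut-off values chosen, which are not stated in this paper. The sketch below assumes one commonly cited convention (below 0.20 very weak, 0.20-0.39 weak, 0.40-0.59 moderate, 0.60-0.79 strong, 0.80 and above very strong); both the thresholds and the function name are assumptions for illustration, not the authors' stated criteria.

```python
def correlation_strength(rho):
    """Verbal label for a correlation coefficient.

    The thresholds are an assumed convention, not taken from the paper.
    The sign is ignored, so negative coefficients are labelled by size.
    """
    size = abs(rho)
    if size > 1:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"


# The overall coefficient reported in the text falls in the moderate band.
print(correlation_strength(0.4696))
```

Under this assumed convention, the overall coefficient of 0.4696 is labelled moderate, in line with the description given in the text.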

Discussion
Weight management interventions increasingly target preadolescent children and this has implications for the methods of outcome measurement within economic evaluation as few instruments exist that are designed to elicit utilities in this age group. This paper contributes evidence on the use of the newly developed utility-based CHU9D instrument, within an ethnically and socioeconomically diverse UK population of young children.

Relationship between CHU9D and weight status
The results indicate that there is no statistically significant relationship between the CHU9D utility values and weight status in children aged 5-6 years. Adjusted for potential confounding factors, compared to the healthy/underweight group, children who were overweight/obese reported lower CHU9D utility values, but this effect was not statistically significant. A similar result was found using the PedsQL. When focusing on the CHU9D dimensions, there were no statistically significant differences in scores by child weight status group for any of the dimensions.
Four previous studies that have measured utility-based HRQOL in children [18][37][38][39] have shown similar findings. In a US-based study, Belfort et al. (2011) used the Health Utilities Index-2 (HUI-2) instrument to measure utility-based HRQOL in children and adolescents aged 5-18 years, and found that utility scores were, on average, 0.04 lower in overweight/obese participants compared with healthy weight participants [37]. Boyle et al. (2010) used the EQ-5D-Y to investigate the effect of weight on HRQOL in a UK-based population aged 11-15 years and found that children who were overweight or obese had a significantly lower HRQOL than children of healthy weight [38]. A recently published paper explored the relationship between BMI and HRQOL using the CHU9D in two cohorts of Australian children, aged 9-12 years and 14-16 years. The authors found mean CHU9D utility values to be lower in children who were overweight or obese (compared to 'healthy' weight children), but this effect was only significant in the younger age group [39]. Despite these reports of a negative relationship between HRQOL and being overweight in children, the evidence is mixed in terms of whether this effect reaches statistical significance. Within a UK-based pilot study that was linked to this study, the same direction of effect was found, but there was no statistically significant difference in utility values between weight status groups in children aged 5-6 years [18]. Three reasons were offered to help explain this result. The first related to the small pilot sample (n = 160), which may not have been large enough to assess subgroup differences. The sample size within this study population is substantially larger, and a similar result was found. The second reason suggested that the CHU9D is not sensitive enough to detect a difference in utility-based HRQOL between overweight and non-overweight children. 
In this study, the PedsQL total scores are available for comparison, and although the PedsQL shows a negative relationship between weight and HRQOL, again this does not reach statistical significance. Thirdly it was suggested that within this age group, the co-morbidities attached to obesity do not substantially affect HRQOL when measured on a 0-1 utility scale, and it is only once these children approach adolescence that the effects of being overweight have a negative impact on utility values. This might help explain the results within this study.

Psychometric properties of CHU9D
This study has also contributed evidence on the construct validity of the CHU9D, and the results support the convergent and the discriminant validity of the instrument. The most significant, consistent finding within the study population was that HRQOL, when measured using both the CHU9D and the PedsQL, was lower among children from the most deprived areas compared to children from the least deprived areas. This demonstrates that both instruments are discriminating between these groups of children with known differences. Also with respect to discriminant validity, the results showed that the mean CHU9D values were significantly higher for children with a PedsQL total score greater than or equal to the sample median total PedsQL score, compared to children with a PedsQL total score less than the sample median. Furthermore, PedsQL total scores corresponded well with the individual CHU9D dimensions, with a lower mean PedsQL total score with increasing severity on each CHU9D dimension. Regarding the convergent validity, overall, there was a moderate, statistically significant positive correlation between the PedsQL total scores and the CHU9D utility values. However, despite this correlation between the overall scores of both instruments, we found only a weak or very weak correlation between the dimensions of each instrument that were pre-determined as being theoretically similar. One possible explanation is that although the PedsQL total scores and the CHU9D utility values tap into a similar underlying construct (HRQOL), the individual dimensions of each instrument, while appearing quite similar, might actually be describing something that is quite specific and different. So at the dimension level the correlations are weak, but when combined, the overall instrument scores become moderately correlated.

Strengths and weaknesses of the study
The data within this study were collected from the WAVES trial, which was designed to include a socioeconomically diverse and multi-ethnic population. Parental consent for participation in the WAVES trial was obtained for 60 % of eligible pupils, which could lead to sample selection bias. However, when the proportion consented out of those eligible was considered by several socio-demographic characteristics, although there was some variation, the differences were generally modest: sex (boys = 65 %, girls = 67 %), ethnicity (white = 75 %, South Asian = 61 %, Black African Caribbean = 64 %) and deprivation (most deprived quintile = 65 %, least deprived quintile = 72 %).
It is rare to have utility information available for children as young as 5 years. Because only a small proportion of children in our sample were measuring 'underweight' (3 %), a decision was made to pool the 'underweight' and 'healthy' weight children into one weight category. There is no a priori reason to assume that the HRQOL of underweight and healthy weight children is similar, but we could not explore this in a statistically robust fashion, and the focus of this paper was on the effects of being overweight on HRQOL, not underweight. A comprehensive analysis of the effects of being underweight would have required a purposive sampling approach to ensure adequate numbers of children in this category.

Conclusion
This paper contributes utility data from a large UK-based pediatric population alongside information on the psychometric properties of the instrument used to generate these data. Studies suggest that overweight is negatively associated with HRQOL in children, but the extent of the association, how it varies across age groups, and how it translates to the 0-1 utility scale are as yet under-researched. This paper offers support for the convergent and discriminant validity of the CHU9D as a measure of utility-based HRQOL in children aged 5-6 years. It offers evidence that overweight is negatively associated with HRQOL in children in this young age group, but that this association is weak and not statistically significant. Utility values are frequently used within health economic studies conducted globally to derive QALYs to inform resource allocation decisions. Future studies need to determine how weight status is associated with HRQOL in utility terms, in different age cohorts, and across different country settings, to help inform the methods of economic evaluations alongside clinical trials of childhood obesity prevention and management.