Development of a wheelchair mobility skills test for children and adolescents: combining evidence with clinical expertise

Background: Wheelchair mobility skills (WMS) training is regarded by children using a manual wheelchair and their parents as an important factor to improve participation and daily physical activity. Currently, there is no outcome measure available for the evaluation of WMS in children. Several wheelchair mobility outcome measures have been developed for adults, but none of these have been validated in children. Therefore the objective of this study is to develop a WMS outcome measure for children using the current knowledge from literature in combination with the clinical expertise of health care professionals, children and their parents.
Methods: Mixed methods approach. Phase 1: Identification of WMS items through a systematic review using the ‘COnsensus-based Standards for the selection of health Measurement Instruments’ (COSMIN) recommendations. Phase 2: Item selection and validation of relevant WMS items for children, using a focus group and interviews with children using a manual wheelchair, their parents and health care professionals. Phase 3: Feasibility of the newly developed Utrecht Pediatric Wheelchair Mobility Skills Test (UP-WMST) through pilot testing.
Results: Phase 1: Data analysis and synthesis of nine WMS related outcome measures showed there is no widely used outcome measure with levels of evidence across all measurement properties. However, four outcome measures showed some levels of evidence on reliability and validity for adults. Twenty-two WMS items with the best clinimetric properties were selected for further analysis in phase 2. Phase 2: Fifteen items were deemed relevant for children, one item needed adaptation and six items were considered not relevant for assessing WMS in children. Phase 3: Two health care professionals administered the UP-WMST in eight children. The instructions of the UP-WMST were clear, but the scoring method of the height difference items needed adaptation. The outdoor items for rolling over soft surface and the side slope item were excluded from the final version of the UP-WMST for logistical reasons.
Conclusions: The newly developed 15 item UP-WMST is a validated outcome measure which is easy to administer in children using a manual wheelchair. More research regarding reliability, construct validity and responsiveness is warranted before the UP-WMST can be used in practice.
Electronic supplementary material: The online version of this article (doi:10.1186/s12887-017-0809-9) contains supplementary material, which is available to authorized users.


Background
Two of the most common motor disorders in childhood in the Netherlands are cerebral palsy, with a prevalence of 2.5 per 1000 births [1], and neural tube defects, with a prevalence of 6.52 per 10,000 births [2]. A large proportion of these children use a manual wheelchair for their daily mobility [3]. In adults, several studies have reported on the importance of wheelchair mobility skills (WMS) to overcome mobility problems and improve participation [4,5]. Moreover, it has been shown that WMS training in adults can decrease their mobility problems by improving their WMS [6][7][8][9]. In children, evidence is limited, with only one pilot study by Sawatzky et al. looking at the effects of WMS training in six children using a manual wheelchair [10]. At the same time, the importance of WMS training in children was recently confirmed in a qualitative study exploring factors associated with levels of physical activity [11]. One of the facilitating factors identified by children and their parents was WMS training. This can be illustrated by a quote from one of the parents: "Wheelchair training, that is very important I think, …she can do much more now….a lot of places are not adjusted for wheelchairs ….and you can just go….your life becomes a lot more fun" [11].
In the last decade a large variety of WMS related outcome measures has been developed for adults using a manual wheelchair [12]. In order to evaluate WMS training for children, there is a need for such an outcome measure in this population as well. The pilot study by Sawatzky et al. was the only intervention study reporting on the use of a WMS outcome measure in children and used an adapted version of the WST 3.2 [10]. However, this WMS outcome measure was developed for adult manual wheelchair users and has not been validated for use in children. It is recommended to validate an outcome measure again if it is applied in a new population [13]. This is important because certain items could be irrelevant, other items might need adaptation, or new items might need to be included for different populations. In this case wheelchair outcome measures have been developed for adults with spinal cord injury, stroke or amputation, whereas children more often use a manual wheelchair due to congenital conditions such as cerebral palsy or neural tube defects.
To the best of our knowledge, no WMS outcome measure has been specifically developed for or validated in children using a manual wheelchair.
The best available WMS outcome measures for adults could potentially be used for validation in children. Unfortunately, there is currently no consensus among clinicians and researchers on the best outcome measure in adults to evaluate WMS [12,14,15]. One of the reasons for this lack of consensus could be the difference in definitions used for the selection of items, including wheelchair user function, manual wheelchair use, wheelchair driving or wheelchair mobility [12,16]. In this paper we use the term WMS to denote skills that address aspects of wheelchair mobility. In the International Classification of Functioning (ICF) (http://apps.who.int/classifications/icfbrowser/) wheelchair mobility is classified in chapter 4 (Mobility) as moving around using equipment (d465) and defined as "moving the whole body from place to place, on any surface or space, by using specific devices designed to facilitate moving or create other ways of moving around, such as a wheelchair". This definition excludes other activities in a wheelchair such as transferring oneself or handling objects.
There is currently no outcome measure available for the evaluation of WMS training in children. Therefore, the objective of this study was to develop (based on available literature and expert opinion) a WMS outcome measure for children using a manual wheelchair.

Methods
In this study, the recommendations for the development of outcome measures of the 'COnsensus-based Standards for the selection of health Measurement Instruments' (COSMIN) checklist [17] were followed. The COSMIN checklist was developed in a Delphi study by an international team of leading experts in epidemiology, psychometrics, and health care [17]. One of these recommendations involves combining evidence from literature with clinical expertise, i.e. the opinion of the target population and health care professionals [13]. This process is illustrated in Fig. 1 and included the following phases: (1) identification of potentially relevant WMS items with good measurement properties through a systematic review and best evidence synthesis regarding validity, reliability and responsiveness of existing WMS outcome measures, (2) selection of WMS items relevant for children using the opinion of children, their parents and health care professionals, and (3) pilot testing the feasibility of WMS items in children using a manual wheelchair.

Phase 1: item identification of WMS
Data sources and searches
We updated the most recent systematic review on WMS from 2010 by Fliess-Douer et al. [12]. The same search string as Fliess-Douer et al. [12] was applied to the following databases: Pubmed, Cochrane and Web of Science up to July 2015. The full search strategy for Pubmed is described in Additional file 1.

Study selection
The selection of articles was independently performed by two reviewers (MS and JdG). While the search string was similar to that of Fliess-Douer et al. [12], the criteria used for selection were adapted to include WMS outcome measures for people with all types of disability, instead of only those for people with a spinal cord injury (SCI). This resulted in the following inclusion criteria: (1) aim of the study was to assess wheelchair skill performance in a wheelchair, (2) outcome measure is constructed for people using a manual wheelchair, (3) available statistical data regarding reproducibility or validity, (4) full report written in English and publication date January 2010-July 2015. Studies were excluded when the outcome measure was: (1) constructed for people using power wheelchairs, (2) developed for assessment in a virtual environment, (3) focused on 'body function and structures' (measuring specific physiological and/or biomechanical variables which do not comply with the terms of the 'activity' or 'participation' domains as defined in the ICF (http://apps.who.int/classifications/icfbrowser/)).

Assessment of methodological quality
Studies reporting a total and item score were divided into sub studies to be able to differentiate between statistical methods being used. Two reviewers (MS and JdG) independently evaluated the methodological quality of the included studies using the COSMIN checklist [18]. The COSMIN checklist contains twelve boxes, which assess the methodological quality of the studies regarding reliability, measurement error, content validity, hypothesis testing, cross-cultural validity, structural validity, criterion validity, and responsiveness. The items in each box are rated with a 4-point scoring system; excellent, good, fair, and poor. A quality score per measurement property was obtained by taking the lowest rating of any item in a box ("worst score counts"). One item in each box concerns the sample size requirements, with a minimal requirement of n > 30 for an adequate sample size. As the COSMIN checklist was originally developed for health related questionnaires, sample size requirements might differ for performance based measures and can alternatively be based on power calculations as earlier discussed by Bartels et al. [19]. Therefore, the sample size requirement for assessment of methodological quality of reliability was adjusted to N ≥ 20, based on a sample size determination for a WMS outcome measure with power calculation from Kirby et al. [20].
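The "worst score counts" rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not part of the COSMIN materials: the function name `box_quality` and the example ratings are hypothetical, and the sketch does not model how the sample size item itself is rated.

```python
from typing import Iterable

# COSMIN 4-point rating scale, ordered from worst to best.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def box_quality(item_ratings: Iterable[str]) -> str:
    """Return the lowest rating among the items in one COSMIN box."""
    ratings = list(item_ratings)
    if not ratings:
        raise ValueError("a box needs at least one rated item")
    # The position in RATING_ORDER defines which rating is "worst".
    return min(ratings, key=RATING_ORDER.index)


# Hypothetical ratings for the items of one reliability box.
example_box = ["excellent", "good", "fair", "good"]
print(box_quality(example_box))  # fair
```

A single "fair" item is enough to cap the quality score of the whole box at "fair", which is why one statistical flaw or a small sample can dominate the rating of an otherwise well-conducted study.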

Data extraction and best evidence synthesis
Two reviewers (MS, OV) independently performed the data extraction and assessed the results of the studies based on the quality criteria described by Terwee et al. [21]. The possible ratings per measurement property were "positive," "indeterminate" and "negative". Studies looking at different measurement properties of the same outcome measure were pooled for best evidence synthesis. This synthesis combines the methodological quality of the studies with the consistency of their results [22]. The level of evidence for each outcome measure was subsequently rated as "strong", "moderate", "limited", "conflicting", or "unknown" per measurement property. This method is similar to the method used for the systematic review of clinical trials as suggested by the Cochrane Collaboration Back Review Group [22].

Selection of WMS outcome measure
The WMS outcome measures with some level of evidence across reliability and validity were grouped together for item selection in phase 2.

Phase 2: item selection of WMS for children
The resulting list of WMS items identified in phase 1 was assessed on their relevance for children using a manual wheelchair. Relevance checking was performed through a focus group or individual interviews with children using a manual wheelchair, their parents and health care professionals. The children and their parents were recruited from a voluntary WMS training program, which was set to start a few weeks later. Physiotherapy students were trained by an experienced qualitative researcher to conduct interviews with parents and children following a topic list. Individual interviews were conducted with the children and their parents separately or, in case this was preferred by the child, together. The parents and children were asked open ended questions about their current limitations regarding wheelchair mobility, their expectations of the WMS training and training goals. Open ended interview questions were preferred over relevance checking per item as this method assured an open mind regarding WMS which are relevant for children, without being influenced by WMS for adults. All interviews were recorded by video and transcribed verbatim. After transcription, a qualitative Framework Method Analyses [23] was performed for all interviews by two independent researchers to determine relevant items. The coding framework was based on the compiled list of items from the results of phase 1. Concurrently with the individual interviews, a focus group interview was conducted with health care professionals with clinical expertise in pediatric rehabilitation. All health care professionals were currently working at a special needs school and employed by De Hoogstraat rehabilitation centre, the Netherlands. Every potential WMS item from phase 1 was assessed in the focus group with health care professionals on the appropriateness for children and rated as 'relevant', 'relevant with adaptations' or 'not relevant'. 
Professionals were asked to keep in mind a total test duration of an hour, to make sure all items were critically assessed on relevance. One researcher (LdG) documented the answers given by the professionals. The results of the qualitative framework analysis of the target population were combined with the opinion of the health care professionals to develop a new assessment tool with the working name Utrecht Pediatric Wheelchair Mobility Skills Test (UP-WMST).

Phase 3: Pilot testing of WMS items
One occupational therapist and one physiotherapist were asked to provide written comments and answer questions regarding: 1) the feasibility of assessing WMS within one hour; 2) the ease of handling material; and 3) the clarity of instructions when administering the UP-WMST to children using a manual wheelchair. This was followed by individual interviews with the therapists. Both health care professionals received a manual of the UP-WMST with instructions about test set-up and instructions per item. Children who use a manual wheelchair were recruited from a special needs school in Utrecht, the Netherlands.

Search results
The search strategy combined with the previous results from Fliess-Douer et al. [12] resulted in a total of 699 unique articles, of which 31 were selected for full text assessment (Fig. 2). Nine studies were excluded after full text assessment, leaving 22 studies eligible for this review. The main reasons for exclusion were: the absence of psychometric properties of the outcome measure being used [10,24,25,26,27]; outcome measures focused on the level of 'body function and structures' [25,28]; and one outcome measure was a questionnaire [29].

Measurement properties
The methodological quality and level of evidence of the studies are presented in Tables 3 and 4 for each measurement property, arranged per outcome measure. No studies assessed all measurement properties. Reliability and hypothesis testing were the most frequently reported properties. Different methods were used to assess inter-rater reliability; some studies used two raters to separately assess the same video recording, whereas other studies used two raters to separately administer the test. Only three studies [44,45,47] demonstrated levels of evidence on content validity. Criterion validity was not assessed as there is no gold standard available. Some studies reported on the Smallest Detectable Change or Limits of Agreement, but no studies calculated the Minimal Important Change needed to determine the level of evidence for the measurement error of an instrument. Therefore no levels of evidence were found for any of the outcome measures on criterion validity and interpretability.

Wheelchair Skills Test (WST)
The WST 1.0 was originally developed by Kirby et al. [34] and consists of 33 items measuring wheelchair user functional skills in daily life for adults using a manual wheelchair. Fourteen of these items assess WMS; the other items assess other activities in a wheelchair, such as transfers or handling objects. The level of evidence for content validity of this outcome measure is unknown. A number of items and the outcome parameter were adapted in the WST 2.4 by Kirby et al. [20]. The WST 2.4 demonstrated good methodological quality for the reliability of the total score. The scoring of individual items reached a poor methodological quality, due to statistical flaws. Overall the WST shows moderate levels of positive evidence on reliability of the total score, moderate positive levels of evidence for hypothesis testing and unknown or no information on the other measurement properties.

Wheelchair Propulsion Test (WPT)
Askari et al. [47] reported on the WPT, which is a quick test consisting of one WMS item measuring several parameters of wheelchair propulsion. This study demonstrated limited to moderate levels of positive evidence on reliability and moderate levels of positive evidence on content validity and hypothesis testing. Even though the structural validity showed good methodological quality, the level of evidence is unknown as the explained variance was not mentioned in the results.

Wheelchair Circuit (WC)
The WC was developed to measure wheelchair mobility of adult manual wheelchair users with a SCI [30,31]. Most of the items assess WMS, with an additional assessment of wheelchair transfer and wheelchair endurance. Several items were later adapted in the Adapted Manual Wheelchair Circuit (AM-WC) [32] to facilitate widespread utilization.
Vereecken et al. [33] focused on the driving skills, and adapted the WC into the Wheelchair Assessment Instrument for people with Multiple Sclerosis (WAIMS). No studies reported on content validity regarding the SCI or MS population. The methodological quality of reliability was rated fair to good with a moderate to strong level of positive evidence. There is conflicting evidence regarding hypothesis testing, and only limited positive evidence on responsiveness.

Test of Wheeled Mobility (TOWM) and Wheelie Test
Fliess-Douer et al. [36][37][38] demonstrated poor content validity. All 38 items, except the wheelchair transfer, assess WMS. Although a large sample size was used to create a list of essential WMS, there was no assessment of whether all items together comprehensively reflect the construct to be measured. The statistical method regarding the reliability of item quality scores was inadequate; however, the method used for all other scores was appropriate. Therefore the level of positive evidence is moderate for test-retest reliability of all scores, except for the item quality scores. There is unknown or no level of evidence for all other measurement properties.

Tufts Assessment of Motor Performance (TOMP)
This assessment tool for functional motor skills in all disabilities was developed by Gans et al. [49]. The tool consists of 32 items in total with two items assessing WMS. This study demonstrated a limited level of positive evidence for inter-rater reliability. No other measurement properties were assessed.
Table 3 Methodological quality of measurement properties on reliability and best evidence synthesis
Table 4 Methodological quality of measurement properties on validity, responsiveness and best evidence synthesis

Conclusion phase 1: item identification of WMS
There is no widely used WMS outcome measure with levels of evidence across all measurement properties e.g. validity, reliability and responsiveness. However, the WST [20,34], WPT [47], WC [30][31][32][33] and 5AML [42] already showed some level of evidence on aspects of reliability and validity. The individual WMS items of these four outcome measures seem to be the best WMS items available from literature for validation in children. The WST, WPT, WC and 5AML were combined into an overall list of 22 unique WMS items, excluding items not related to mobility as defined by the ICF d465 (http://apps.who.int/classifications/icfbrowser/). The first column in Table 5 shows the compiled list of WMS items and the original outcome measures they were selected from.

Results phase 2: item selection of WMS for children
Individual interviews took 30-60 min and were conducted with three girls, eight boys and their parents. The children's age ranged from 6 to 13 years old. The group consisted of two children with cerebral palsy, seven children with spina bifida, one child with congenital sodium diarrhea and one child with congenital myasthenic syndrome. Parents and children gave descriptions of different community activities in daily life in which the WMS of the child were inadequate or in which they would like to improve. Framework data analysis resulted in WMS which were literally part of the compiled list of potentially relevant items, as can be seen in the fourth column of Table 5. For example, children would like to improve their ability to go over a steep ramp or to go up and down a high or low curb. In addition, new codes were developed for the coding framework to categorize recurring themes which could not be attributed to a single WMS item. The subsequent four columns in Table 5 show these categories ('crossing the road', 'maneuver in crowded places' or 'small rooms', and 'propel over uneven surfaces') and their match to existing WMS items or, if none was available, a new WMS item. The focus group conducted with health care professionals consisted of five occupational therapists and five physiotherapists, with an average age of 34.4 (SD = 7.8) years and 8.0 (SD = 4.8) years of experience in working with children in a wheelchair. All 22 potentially relevant items were assessed in the focus group. Most items were considered relevant for children; however, six items were deemed not appropriate for a WMS outcome measure for children: 'ascending or descending stairs', 'propelling in a wheelie', 'turn 180° in wheelie position left and right' and 'get over a pothole'. Total time of administration was considered important due to the extra instruction time and shorter attention span of children when administering an outcome measure. 
When considering these time constraints, health care professionals suggested that while 'holding a wheelie' is a useful skill, it is already part of 'ascending a platform' and therefore does not need to be tested separately. The item 'avoids moving obstacles' was suggested to be adapted into an item measuring the ability to perform a 'sudden stop', as this was seen to be more relevant for children.

Conclusion phase 2: item selection of WMS for children
The WMS items which were deemed relevant by both the children or their parents and by the health care professionals were selected for further pilot testing in phase 3. The item 'avoiding moving obstacles' was adapted into 'sudden stop'. Even though holding a wheelie was seen as not relevant by health care professionals, it was retained as a separate item as this WMS was regarded as highly relevant by children and their parents. This resulted in a 16 item WMS outcome measure, from here on called the UP-WMST.
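The item counts reported for phase 2 can be cross-checked with a short tally. The sketch below uses only the numbers reported in this study (15 relevant items, one item needing adaptation, six items not relevant); the variable and function names are illustrative and not part of the study protocol.

```python
# Focus group outcome as reported in this study (number of items per rating).
rating_counts = {
    "relevant": 15,
    "relevant with adaptations": 1,
    "not relevant": 6,
}


def total_items(counts: dict) -> int:
    """Number of items assessed in the focus group."""
    return sum(counts.values())


def retained_items(counts: dict) -> int:
    """Items kept for pilot testing: everything not rated 'not relevant'."""
    return total_items(counts) - counts["not relevant"]


print(total_items(rating_counts))     # 22
print(retained_items(rating_counts))  # 16
```

The tally reproduces the 22 items carried over from phase 1 and the 16 items taken forward to pilot testing.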

Results phase 3: Pilot testing of WMS items
One physiotherapist (30 years old, 4.5 years of experience) and one occupational therapist (28 years old, 5 years of experience) jointly administered the UP-WMST in eight children. All items were scored with an ability score (pass/fail) and a performance time score. The children's age ranged from 5 to 11 years old, with five children diagnosed with cerebral palsy and three with other disabilities. The two health care professionals commented on the ease of administering the UP-WMST in an hour. For most items both health care professionals confirmed that the items had clear instructions and were easy to administer. However, the following items were less easy to administer. The dependence on weather conditions and the extra time burden of testing both in- and outdoors made the outdoor items for rolling over soft surface, 'propel over grass' and 'propel over gravel', too difficult to administer. The indoor item for rolling over soft surface, 'propel over a mat', was retained. The material for the item 'side slope' was seen as too big and difficult to handle when setting up the test. Table 6 shows the remaining UP-WMST items after excluding the outdoor and side slope items. Therapists also suggested future changes in the scoring method of the items with a height difference. When a child passes the ability score, the quality of execution could be a more important indicator of the performance than the time it takes to complete the item.
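To make the two outcome parameters concrete, the sketch below records an ability score (pass/fail) and a performance time per item and produces a simple summary. It is a hypothetical illustration: the `ItemResult` structure, the `summarize` function, the way results are aggregated and the example values are all assumptions, since the study does not define a total score for the UP-WMST.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ItemResult:
    item: str
    passed: bool             # ability score (pass/fail)
    time_s: Optional[float]  # performance time in seconds; None if not completed


def summarize(results: List[ItemResult]) -> dict:
    """Count passed items and sum the performance time of passed items."""
    passed = [r for r in results if r.passed]
    times = [r.time_s for r in passed if r.time_s is not None]
    return {
        "items_administered": len(results),
        "items_passed": len(passed),
        "total_time_passed_s": sum(times),
    }


# Hypothetical results for one child; the values are made up for illustration.
results = [
    ItemResult("propel over a mat", True, 6.5),
    ItemResult("sudden stop", True, 3.0),
    ItemResult("holding a wheelie", False, None),
]
print(summarize(results))
```

Keeping ability and time as separate fields leaves room for an additional quality-of-execution field on the height difference items, in line with the therapists' suggestion.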

Discussion
The objective of this article was to develop a WMS outcome measure for children. The results of the literature review in phase 1 are in accordance with previous systematic reviews [12,14,15] and show the wide range of available outcome measures used for assessing WMS. Only the TMT [43] was developed for children, but due to the small sample size (n = 11) this instrument was excluded from data synthesis and analysis. The TMT contains two WMS items: 'propelling down the hall' and 'propelling up a ramp'. These two items are part of WMS outcome measures for adults and were therefore assessed on relevance in phase 2. No other WMS outcome measure has been developed or validated for children using a manual wheelchair. Furthermore, none of the identified outcome measures showed good levels of evidence across all measurement properties. For example, most outcome measures showed a low level of evidence on content validity. Content validity is defined by COSMIN as 'the degree to which the content of a measurement instrument is an adequate reflection of the construct to be measured' [21]. Without good content validity, it is impossible to select the best outcome measure for a specific goal [51]. The construct the UP-WMST aims to assess is skills of 'wheelchair mobility' as defined by the ICF d465 (http://apps.who.int/classifications/icfbrowser/). Phase 1 of this study shows that most existing WMS outcome measures do not assess 'wheelchair mobility' as defined by the ICF d465, but rather related concepts such as wheelchair user function or manual wheelchair use. While also important, these are different constructs, and therefore only those sections of these outcome measures that were relevant for assessing WMS were included.
In addition to assessing content validity and reliability, it is important to assess whether an outcome measure is responsive to detect change over time. The results of phase 1 showed limited levels of evidence on responsiveness available for only one outcome measure [31]. However, while there is no evidence regarding the responsiveness of the WST, the WST has been used in randomized controlled trials and seems responsive to measure change [6][7][8][9]52]. Based on all the available psychometric data assessed, there was not a single WMS outcome measure suitable for validation in children. Therefore the second best option was to select outcome measures with some level of evidence on reliability and validity. The WST [20,34], WPT [47], WC [30][31][32][33] and 5AML [42] already showed some level of evidence on an aspect of reliability and validity and both the WC and WST proved to be responsive to measure change in randomized controlled trials. The WMS items of these four outcome measures are the best available WMS items in current literature and were used or adapted for validation in children.
While there was little evidence available on the content validity of the identified WMS items, the results of phase 2 in this study show that most of the items on WMS were deemed relevant by parents, children and health care professionals. These results are corroborated by a recent Delphi survey [16], which reported similar relevant items for a new WMS test. In addition to selecting relevant WMS items for children, it is also important to evaluate the applicability of the outcome measure in clinical practice. As suggested by Kirby [53], there are further criteria which are useful to assess when selecting an outcome measure, such as time burden, availability of materials and ease of administering the test. Therefore, we examined the feasibility of administering the UP-WMST in phase 3 of this study. While the outdoor items were previously seen as relevant in phase 2, they were excluded from the UP-WMST after the results of phase 3 due to the time burden of testing both in- and outdoors. When developing a more advanced test for wheelchair mobility in children, these outdoor items should be reconsidered for inclusion. Results of phase 3 also showed the need for an additional outcome parameter for the height difference items. All items are currently assessed on performance time and ability. The combination of these two outcome parameters seems to be in line with recent findings by Sawatzky et al. [54] that propulsion speed and ability are related. However, according to the results of phase 3, a more extensive scoring method should be included for the height difference items. Such a method could include a five point scoring method as used in the TOWM and Wheelie test [38], a performance score as used in the WC [31], or a safety score as described in the WST [35]. 
We are currently continuing with the validation of the UP-WMST and development of a qualitative scoring method which is able to distinguish between beginner or more advanced execution methods on an item.
This study was limited to WMS items with good measurement properties available in current literature. Surprisingly there was not one WMS outcome measure available with good levels of evidence across all measurement properties. The second best option was to select the best available WMS outcome measures for adults with some level of evidence across reliability and validity. The levels of evidence of these selected WMS items for responsiveness, minimal detectable change and minimal important change remain unknown. The feasibility of the UP-WMST was assessed by two health care professionals from the same rehabilitation center. It would be interesting to assess if the administration of the UP-WMST in a different setting or with different health care professionals would lead to the same results. Before the UP-WMST can be used in clinical practice, additional research towards responsiveness, interpretability, reliability and construct validity of the newly developed UP-WMST is warranted. Furthermore, the necessity of including basic WMS items could have been enhanced by the sampling of children and their parents used in this study for relevance checking. Children were recruited from a voluntary wheelchair mobility skills training program and interviewed a few weeks before the start of the program. Therefore, their level of wheelchair mobility could have been lower and this could have resulted in some bias towards more basic WMS. At the same time this is the group of children who attend a wheelchair skill training program and therefore the group of children the UP-WMST is developed for. Nevertheless, interviews only took place before the start of the wheelchair skill training program and children and parents might have underestimated the possible WMS a child is able to learn. Therefore, future research should evaluate possible ceiling effects of the UP-WMST.

Conclusion
No single WMS outcome measure with good levels of evidence across all measurement properties was available for validation in children. However, four outcome measures did show levels of evidence on reliability and validity. The individual WMS items of these four outcome measures represent the best knowledge available from literature and were used for relevance checking and validation in children. Parents, children using a manual wheelchair and health care professionals agreed on the necessity of including more basic WMS in an outcome measure for children compared to adults. The resulting 15 item UP-WMST outcome measure is easy to administer and demonstrates content validity for assessing WMS in children using a manual wheelchair. While this is the first step towards developing a WMS outcome measure for children, further assessment of reliability, construct validity and responsiveness is needed.