The supine position was the most reliable of the three positions used in testing hip extensor muscle strength in young people with cerebral palsy because it demonstrated the smallest values of absolute reliability across the three testing positions. Although reliability indices (relative reliability) across the three testing positions appeared similar, the amount of change required to be 95% confident that real change beyond measurement error had occurred (absolute reliability) was less in the supine position than in the prone and standing positions. For example, strength increases of more than 8% across groups could be interpreted as true change when measured in the supine position; in contrast, group increases of 31% and 34% would be required to be interpreted as true change when measured in the prone and standing positions, respectively. The supine position was stable for the participant and required participants to generate a force across gravity; these two factors may have enabled participants to generate a more consistent force isolated to the hip extensors, making the test more repeatable and, therefore, more reliable.
The prone position was also stable for the participant but required them to generate a force against gravity. The need for participants to exert more effort in lifting the weight of their leg before exerting force on the dynamometer may have contributed to reduced reliability compared with the supine position. The standing position has not previously been evaluated for the measurement of hip extensor strength in young people with cerebral palsy, although high levels of retest reliability (ICC = .92) have been reported using a modified standing position to assess hip extensor muscle strength in adults without impairment. This position was thought to be advantageous because it is a more functional position in which to assess the strength of the hip extensors. However, in standing the participant must perform the dual task of maintaining the challenging testing position while performing the strength test. Dual tasking has been shown to make primary motor tasks, such as walking, more difficult in other neurological conditions. The dual task may have made performance of the test less consistent, and therefore reduced reliability.
The results suggest that hip extensor strength in a group of young people with cerebral palsy can be measured with sufficient reliability in the supine position to monitor changes in strength. Measuring hip extensor strength in the supine position means that group changes of more than 8% could be confidently attributed to real change. Therefore, using hand-held dynamometry to quantify hip extensor strength is likely to be useful to clinicians and researchers who want to evaluate the effect of group interventions and programs to improve hip extensor strength, with the aim of improving hip function during important everyday functional activities such as walking.
Measuring changes in individuals is not as reliable as measuring changes across groups. For the supine position, percentage increases of 55% to 60% would be required to be 95% confident that real change had occurred. There are examples where strength training interventions in young people with cerebral palsy have led to improvements of this magnitude. However, strength increases from interventions are typically of a lesser magnitude, in the range of 25% to 30%. Therefore, the results of the current study suggest that hip extensor strength cannot be measured with sufficient reliability for clinicians to monitor typical changes in an individual prescribed a strength training program.
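The comparison between typical training gains and the change thresholds quoted above can be made explicit with a short Python sketch. The threshold values are those reported in the text; the function name and the choice of the lower end of the 55% to 60% range for individuals are assumptions made only for illustration.

```python
# Minimum percentage change needed to be 95% confident of real change,
# as quoted in the text. The individual value uses the lower end of the
# reported 55% to 60% range, which is an illustrative assumption.
THRESHOLDS_PCT = {
    ("group", "supine"): 8.0,
    ("group", "prone"): 31.0,
    ("group", "standing"): 34.0,
    ("individual", "supine"): 55.0,
}


def exceeds_measurement_error(change_pct, level, position):
    """Return True if an observed percentage change is larger than the threshold."""
    return change_pct > THRESHOLDS_PCT[(level, position)]


if __name__ == "__main__":
    # Typical training gains of 25% to 30%, as described in the text.
    for (level, position) in THRESHOLDS_PCT:
        for change in (25.0, 30.0):
            verdict = exceeds_measurement_error(change, level, position)
            print(f"{level:10s} {position:8s} change={change:.0f}% real change: {verdict}")
```

Under these assumptions, a typical gain of 25% to 30% exceeds only the group threshold for the supine position, which is consistent with the interpretation given above.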
The results of the current study suggest that using the mean of all three trials, the mean of the second and third trials, or the maximum of the three trials has little impact on the calculation of reliability. However, when the first trial only, or the mean of the first and second trials, was used, reliability was lower. For standing and supine, ICC values using the first trial only or the mean of the first two trials were .64 or lower (.55 to .64), indicating poor reliability. The results of this study suggest that using the first trial only or the mean of the first and second trials is not as reliable as basing the estimate of strength on a combination of three trials. This is clinically relevant because clinicians want to test in the most reliable, but also the most efficient, manner.
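The five ways of summarising a set of three trials that are compared above can be sketched in Python as follows. The function name, the dictionary keys, and the example force values are hypothetical and are not study data.

```python
from statistics import mean


def summarise_trials(trials):
    """Return the candidate strength scores for one session of three trials."""
    if len(trials) != 3:
        raise ValueError("expected exactly three trials")
    first, second, third = trials
    return {
        "first_only": first,
        "mean_first_two": mean([first, second]),
        "mean_last_two": mean([second, third]),
        "mean_all_three": mean(trials),
        "maximum": max(trials),
    }


if __name__ == "__main__":
    # Hypothetical dynamometer readings for a single participant and session.
    example = [10.0, 12.0, 14.0]
    for name, value in summarise_trials(example).items():
        print(f"{name}: {value:.1f}")
```

Each summary score would then be carried forward separately into the retest reliability analysis, so that the scores can be compared on equal terms.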
The results of our study also suggest that it might be misleading to rely only on coefficients, such as the ICC, to evaluate reliability. Our results indicated little difference in the reliability coefficients between the three testing positions, all ranging from .73 to .80. However, clinicians and researchers are interested in whether observed change represents true change or measurement variability. This information is gained by expressing reliability in the units used for measurement. In terms of the units of measurement, our results indicated that the supine position for testing hip extensor muscle strength was more reliable than the prone or standing positions, since less change would be required to be interpreted as true change. Correlation coefficients do not indicate the size of differences between repeated tests, but rather the retest variability relative to the differences between subjects. For these reasons, it has been recommended that reliability be expressed in the units of measure and not only in terms of correlation coefficients.
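The distinction between relative and absolute reliability can be illustrated with a commonly used formulation, in which the standard error of measurement is SEM = SD × sqrt(1 − ICC) and the minimal detectable change at 95% confidence is 1.96 × sqrt(2) × SEM. The following Python sketch assumes that formulation, which may differ from the computation used in this study, and all input values are hypothetical rather than study data.

```python
import math


def sem(sd, icc):
    """Standard error of measurement under the assumed formulation."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at 95% confidence for a single individual."""
    return 1.96 * math.sqrt(2.0) * sem_value


def mdc95_percent(sd, icc, mean_strength):
    """MDC95 expressed as a percentage of the mean strength score."""
    return 100.0 * mdc95(sem(sd, icc)) / mean_strength


if __name__ == "__main__":
    # Hypothetical samples: identical ICC and mean, different between-subject SD.
    for label, sd in (("less variable sample", 4.0), ("more variable sample", 8.0)):
        pct = mdc95_percent(sd, icc=0.75, mean_strength=20.0)
        print(f"{label}: ICC = 0.75, MDC95 = {pct:.1f}% of the mean")
```

Two samples with the same ICC can therefore yield quite different thresholds for true change once reliability is expressed in the units of measurement.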
This study has contributed to the literature by providing guidance about the most reliable method for measuring hip extensor strength in young people with cerebral palsy. The current study builds on previous research [17–19] by comparing three starting positions for testing, including the standing position, which had not been previously evaluated. The reliability coefficients in our study for testing in prone (.75 to .80) are comparable to those reported by van der Linden et al. (.75 to .83) but somewhat larger than the values reported by Crompton et al. (.26 to .40). The reliability coefficients for testing in supine in our study (.74 to .78) are comparable to those reported by Crompton et al. (.79 to .82). Similar to Crompton et al., we concluded that the supine position was more reliable than the prone testing position. The current study also adds to previous research by determining whether fewer than three trials can be used for testing, as three trials have been used in previous studies [17–19].
However, there are some limitations. Only a subset of young people with spastic diplegic cerebral palsy and mild to moderate disability were evaluated. The criteria excluded young people with more severe and different types of cerebral palsy who may also benefit from monitoring of muscle strength. The sample size of the current study was relatively small, although the number of participants was equal to the sample size estimated for a study of this nature. A larger sample size may serve to narrow the confidence intervals about the reliability coefficient. Also, it needs to be considered how a static measurement of hip extensor muscle strength, as measured with a hand-held dynamometer, relates to dynamic hip extensor muscle action during functional tasks such as walking; this could be the subject of further research. It could also be considered whether a retest interval of 12 weeks between measures was a limitation, since many retest reliability studies use much shorter retest intervals. However, because the choice of retest interval should be related to the intended purpose of a measurement, and monitoring muscle strength in young people with cerebral palsy would involve the reassessment of muscle strength over 6 to 12 weeks, we think that the choice of a 12-week retest interval was appropriate. Finally, the results of the current study do not provide information about other forms of reliability, such as inter-tester reliability. We evaluated retest reliability because we felt it was the most clinically relevant form for hip extensor muscle strength in young people with cerebral palsy, where a clinician or researcher is interested in monitoring change over time.
Despite this, research on inter-tester reliability, which evaluates the repeatability between two raters at one time, has also demonstrated moderate to good levels of reliability (ICC ranged from .67 to .82) using the make test to measure hip extensor muscle strength in the supine position in young people with cerebral palsy.