Introduction
The assessment of young people’s health-related quality of life (HRQOL) is considered to be of increasing importance in public health research and the evaluation of medical and psychosocial treatment [1, 2]. A large number of measures of HRQOL have been developed specifically for children and adolescents (here defined as persons aged 8–11 and 12–18, respectively) taking the special requirements in these age-groups into account [1, 3–6]. However, one disadvantage of those instruments is their lack of correspondence to adult HRQOL instruments. This shortcoming makes it difficult to track changes in HRQOL across the life course in, for example, cohort studies investigating severe or progressive chronic childhood conditions that last into adulthood. It is therefore desirable to have a modified version of an adult instrument at hand that is also suitable for younger age-groups and can be used in the transition from childhood and adolescence into adulthood.
The generic EQ-5D is a brief and easy to administer instrument that provides scores for different health dimensions as well as an index value which can be used to assess health status and is useful in health economic analyses. Since the EQ-5D has been utilized internationally in many different settings, such as clinical trials and population surveys [7], the instrument was considered a suitable candidate for development of a modified version that could be used in children and adolescents. Within the framework of an international task force on behalf of the EuroQol Group including 13 experts in quality of life research from seven countries (Germany, Italy, South Africa, Spain, Sweden, the Netherlands, United Kingdom), a version for use in respondents from 8 years onwards, the EQ-5D-Y, was developed based on the standard adult EQ-5D. All experts additionally had specific expertise in child psychology, paediatrics, health economics, statistics, sport sciences, or rehabilitation sciences. The methodology of the questionnaire development process of the EQ-5D-Y as well as background information regarding the modifications and their consequences are described elsewhere [8]. In summary, the development process included the revision of the content and wording of EQ-5D to ensure relevance and clarity for young respondents. After translation of the resulting modified version, cognitive interviews were conducted in Germany, Italy, Spain and Sweden to test the instrument’s comprehensibility in children and adolescents. Results indicated the adapted EQ-5D-Y was satisfactorily understood by young respondents in different countries and that it might be a useful tool to measure HRQOL in children and adolescents in an age-appropriate manner.
In order to investigate the feasibility, reliability, and validity of the EQ-5D-Y in a multinational, multilinguistic context, a series of national validation studies were undertaken which were coordinated and methodologically harmonized to ensure the comparability of the findings. The results from the validation studies performed in five countries (Germany, Italy, South Africa, Spain, Sweden) are presented in the current paper.
Since the EQ-5D is widely used for economic evaluation purposes, many questions arise with respect to the EQ-5D-Y regarding the possible development of preference weights in the future. Even though this paper concentrates on the new EQ-5D-Y as a stand-alone outcome measure as it is used in many settings (such as population health surveys, routine health system use, and use in clinical settings), we will address some of these important questions in an outlook at the end.
Discussion
This study aimed to examine the feasibility, reliability and validity of the newly developed EQ-5D-Y in four European countries and South Africa.
The results clearly show that the EQ-5D-Y is easy to fill in, has few missing values and is highly feasible for children as a HRQOL measure. The very low proportions of missing values in Italy and Sweden may be due to the fact that an investigator was at hand to help if necessary (Italy), or because some children might have received assistance at home (Sweden). On the whole, however, the overall small proportion of missing or inappropriate responses confirmed the feasibility of the EQ-5D-Y. Furthermore, the fact that there are only small differences regarding non-responses between the countries suggests the instrument might be viable in a cross-cultural setting. The most frequent problems were observed in filling out the VAS, suggesting that there is potential for further refinement of its presentation and instruction.
In general, only a low prevalence of severe problems was reported in the different dimensions of the EQ-5D-Y, which is typical for general population samples. The highest proportion of problems was reported on the ‘having pain or discomfort’ and ‘feeling worried, sad or unhappy’ dimensions. For the other EQ-5D-Y dimensions of ‘mobility’, ‘looking after myself’ and ‘doing usual activities’, only relatively small proportions of respondents reported problems. The very high ceiling effects of up to 99% (especially in the ‘looking after myself’ dimension) are connected to several methodological limitations of the new instrument. The findings indicate that the ability of the EQ-5D-Y to detect moderate impairments of HRQOL is limited and that consequently the instrument might not discriminate well between respondents in the general population. Furthermore, the large ceiling effects in the test scores cause problems in determining the instrument’s psychometric properties such as convergent validity and reliability. In this regard, more differentiated response options may help to improve the EQ-5D-Y in the future. A five-level response version of the EQ-5D is currently in development (EuroQol Group, personal communication). On the basis of the data presented here, the development of such a modified measure can be highly recommended.
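As a minimal sketch of how a ceiling effect of this kind can be quantified, the following computes the percentage of valid responses at the best possible level of a dimension. The function name, the coding of levels (1 = no problems up to 3 = a lot of problems) and the response lists are illustrative assumptions, not the study's data or analysis code.

```python
def ceiling_effect(responses, best_level=1):
    """Percentage of valid responses at the best level of a dimension.

    `responses` holds one coded level per respondent; None marks a
    missing answer and is excluded from the denominator.
    Assumed coding (hypothetical): 1 = no problems, 3 = a lot of problems.
    """
    valid = [r for r in responses if r is not None]
    if not valid:
        raise ValueError("no valid responses")
    at_ceiling = sum(1 for r in valid if r == best_level)
    return 100.0 * at_ceiling / len(valid)


# Hypothetical response vectors, invented for illustration only.
looking_after_myself = [1] * 99 + [2]
pain_or_discomfort = [1] * 70 + [2] * 27 + [3] * 3

print(ceiling_effect(looking_after_myself))  # 99.0
print(ceiling_effect(pain_or_discomfort))    # 70.0
```

When almost all respondents sit at the best level, as in the first hypothetical vector, there is very little variance left for correlations or group comparisons to work with.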
The EQ-5D-Y shows fair to moderate levels of test–retest reliability, with a high percentage of youths reporting the same levels of problems in the profile domains and a satisfactory ICC with respect to the VAS. However, as noted above, the examination of reliability was limited by the partly high ceiling effects. Reliability should therefore be further tested in a different context (e.g. clinical samples) to reduce these ceiling effects.
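The share of respondents reporting the same level at both administrations can be sketched as a simple percentage agreement. This is an illustrative helper under assumed data and coding, not the procedure used in the study, and it does not correct for chance agreement.

```python
def percent_agreement(test, retest):
    """Percentage of respondents giving the same level at test and retest.

    Pairs with a missing answer (None) at either time point are dropped.
    """
    pairs = [(a, b) for a, b in zip(test, retest)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no complete test-retest pairs")
    same = sum(1 for a, b in pairs if a == b)
    return 100.0 * same / len(pairs)


# Hypothetical levels for four respondents on one dimension.
first = [1, 1, 2, 1]
second = [1, 2, 2, 1]
print(percent_agreement(first, second))  # 75.0
```

With strong ceiling effects, high raw agreement is expected almost by construction, which is one reason the reliability findings need confirmation in samples with more varied responses.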
Regarding convergent validity, we interpreted correlation coefficients according to the guidelines provided by Cohen et al. [21]. In interpreting validity correlations, it has to be considered that, due to measurement error, an observed correlation cannot reach the maximum of 1; its upper bound is the square root of the product of the reliabilities of the instruments involved. Against this background, it can be said that the EQ-5D-Y demonstrated convergent validity and displayed distinct patterns of association with child-specific measures of HRQOL and other comparable scales. As expected, the VAS, as an overall measure of global health, showed the highest correlation with the KIDSCREEN-10 Index of general HRQOL, the General Health Item, and with the Life Satisfaction Ladder. The VAS was also associated with both physical well-being and psychological well-being, suggesting that VAS scores are driven by aspects of both physical and psychological health.
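The upper bound on validity correlations mentioned above can be written out as follows. The symbols and the numerical values are illustrative assumptions and are not taken from the study.

```latex
% r_{xy}: observed correlation between instruments x and y
% r_{xx}, r_{yy}: reliabilities of the two instruments
|r_{xy}| \le \sqrt{r_{xx}\, r_{yy}}

% Hypothetical example with r_{xx} = 0.70 and r_{yy} = 0.80:
\sqrt{0.70 \times 0.80} = \sqrt{0.56} \approx 0.75
```

Under these assumed reliabilities, an observed correlation of 0.60 would already amount to about 80% of the attainable maximum.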
The EQ-5D-Y dimension ‘feeling worried, sad or unhappy’ displayed convergent validity in terms of a strong association with the KIDSCREEN-27 and PedsQL Psychological Well-being dimension, and discriminant validity [22] in terms of low correlation with other health information. The EQ-5D-Y dimension ‘mobility’ failed to display convergent validity, at least with the KIDSCREEN-27 Physical Well-being dimension. However, it can be argued that, judging by its content, the Physical Well-being dimension of the KIDSCREEN is more focussed on physical well-being and energy level and less on physical functioning than is the case with the EQ-5D mobility dimension. Additionally, the reduced variation in EQ-5D-Y test scores generally limits the possibilities for correlations with other measures. This might be improved by extending the range of response options, as mentioned above.
In general, due to the lack of objective data on the health of participants, the results on known-groups validity have to be interpreted carefully. Overall, the response categories were used in a more differentiated manner by respondents who reported health problems. Even though a number of meaningful differences between the ‘known groups’ could be detected by the EQ-5D-Y, no health attribute showed significant differences across all five countries (irrespective of the indicator, such as presence of chronic conditions, impaired self-reported or mental health). In general, the observed ability of the EQ-5D-Y to discriminate between the compared groups supports the validity of all its dimensions except ‘looking after myself’. However, due to the large ceiling effects, only respondents with severe health problems seem to be identified reliably with the instrument. The fact that differences were not seen may also be partly attributable to the types of conditions present. For example, some of the children who report a chronic condition might do so due to an allergy with minor symptoms and thus cannot be expected to differ that much in HRQOL from their ‘healthy’ peers. Furthermore, all children (except for the Swedish household sample) were obviously healthy enough to attend school and thus cannot suffer from a very serious condition.
Even though we observed substantial ceiling effects on most EQ-5D-Y dimensions, these results are consistent with those observed when using the EQ-5D in population health surveys [23]. Although ceiling effects with the adult version have been shown to be higher than those of other measures such as the SF-12 and HUI3, the EQ-5D was nevertheless shown to perform as well as or better than those other instruments in terms of discriminant validity [24]. It should also be noted that the EQ-5D-Y actually reduced the ceiling effect on some dimensions in comparison with the EQ-5D [8]. Finally, it should be remembered that these were general population samples, where higher ceiling effects would be expected, and that further testing of the EQ-5D-Y is required in clinical samples, where the ceiling effect would likely be significantly reduced.
This study has several strengths, but also some limitations. All samples included comprised children and adolescents from the general population. Thus, no information on the performance of the EQ-5D-Y in specific populations is available. Another limitation is that due to ethical constraints, it was not possible to obtain additional clinical data on respondents’ physical and psychological health status. Instead, several screening instruments were used. However, these additional screeners represent self-report questionnaires as well. Thus, to a certain extent, the association between these additional measures and the EQ-5D-Y might be attributable to the ‘same source of information bias.’ The statistical and psychometric analyses reported in this paper represent a first examination of the EQ-5D-Y’s psychometric properties. It was beyond the scope of this paper to examine other issues, such as sensitivity to change, which should be examined in future studies. Similarly, the content validity, i.e. whether the instrument encompasses all aspects of HRQOL that are important in children and adolescents, was not examined, though as stated earlier, the intention was to adapt an adult tool for use in children primarily to allow for follow-up and comparisons over a wide range of ages.
Another important topic that could not be appropriately addressed within the scope of this paper is the further possible use of the new instrument in economic evaluation. Since there are differences between the EQ-5D-Y and the standard EQ-5D, the existing social value sets may not be applicable. Furthermore, valuing EQ-5D-Y health states raises some potentially interesting issues. The normative argument for using social preference weights in economic evaluation is that it is the preferences of the general public that are relevant, not those of patients themselves, in making resource allocation decisions. This would suggest that preference weights for the EQ-5D-Y should be established by eliciting values from the general public, in much the same manner as for the EQ-5D [25]. Time Trade Off and other improved methods for eliciting preferences [26] are equally applicable to the valuation of the EQ-5D-Y. However, it is unclear whether, in asking the general public to value EQ-5D-Y states, they should be informed that the states they are being asked to consider will be potentially experienced by children. Whether participants are informed or not could conceivably make a difference to the values. Similarly, there may be a systematic difference between the values the general public assign to such states and the values young people themselves place on the states, if they were asked to consider EQ-5D-Y states hypothetical to them. This relates to a wider debate about whose values are relevant in economic evaluations and to what extent subgroup preferences (such as those of young people) are useful in economic evaluation [27, 28].
The issue of valuation of EQ-5D-Y states and appropriate means by which social preferences for those states should be elicited is currently under consideration and discussion, and the considerable experience of the entire EuroQol Group is guiding this process. The present paper can only provide a basis for this further discussion, since clearly no weights can be developed until the underlying descriptor domains are found to be reliable and valid.
Acknowledgments
This work was financially supported by the EuroQol Foundation. (The EuroQol Foundation is a non-profit organization.) We thank Professor Magnus Svartengren and PhD Candidate Sun Sun at Karolinska Institutet, Sweden for participation in data collection and analyses. We also thank the Italian team members: Carlo Tomasetto, Maria C Matteucci and Patrizia Selleri, from the Department of Educational Sciences and Serena Broccoli, from the Department of Statistics, University of Bologna; Barbara Pacelli, from the Epidemiology Unit, Local Health Authority, Bologna, Italy; Francesca Borghetti, Centre of Pharmacoeconomics, University of Milan, Milan, Italy.