The validity and reliability of the Arabic version of the EQ-5D: a study from Jordan

BACKGROUND AND OBJECTIVES: The EQ-5D is a generic measure that permits comparisons of quality of life across disease states and may provide useful data for health policy and resource allocation decision-making. There are no published reports on the acceptability and psychometric properties of the EQ-5D in the Arabic language. We therefore investigated the validity and reliability of the Arabic translation of the EQ-5D in Jordan. METHODS: The study was conducted on a convenience sample of consecutive adult Arabic-speaking outpatients or visitors attending a university teaching hospital. Subjects were interviewed twice using a standardized questionnaire containing the EQ-5D and the Short Form 36 Health Survey (SF-36). To assess the validity of the Arabic version of the EQ-5D, ten hypotheses relating responses to EQ-5D dimensions or the visual analogue scale (EQ-VAS) to SF-36 scores or other variables were examined, and test-retest reliability was assessed. RESULTS: The study included 186 subjects with a mean age of 45.3 years, of whom 87 (47%) were female. The most frequently reported problem was anxiety/depression, reported by 102 (55%) of the subjects. All ten a priori hypotheses relating EQ-5D responses to external variables were fulfilled. Cohen's κ for test-retest reliability (n=52) ranged from 0.48 to 1.0. CONCLUSION: The Arabic translation of the EQ-5D appears to be valid and reliable in measuring quality of life in Jordanian people.

Measurement of health-related quality of life (HRQL) has become an imperative in clinical trials and disease management programs for a variety of diseases. HRQL can be considered a major outcome in clinical trials in the absence of objective measures. In Jordan, where Arabic is the first language, conducting clinical trials is difficult because validated Arabic translations or versions of quality of life instruments are lacking. The EQ-5D is a standardised generic instrument for use as a measure of health and quality of life outcomes. Applicable to a wide range of health conditions and treatments, it provides a simple descriptive profile and a single index value for health status. The EQ-5D is now available in most major languages with cultural adaptations. [1][2][3] There are no published reports on the acceptability and psychometric properties of the EQ-5D in the Arabic language, which is the first language for more than 300 million people. Using a convenience sample of lay people, we therefore investigated the validity and reliability of the Arabic version of the EQ-5D in Jordan as a prelude to a future population-based valuation of health states. Jordan is a small country in the Middle East with a population of over 5 million people. This study is part of a larger project aiming at developing, translating and adapting important instruments and questionnaires into Arabic.

METHODS
The study group was a convenience sample that consisted of consecutive adult Arabic-speaking outpatients or visitors attending the University of Jordan hospital between June and August 2007. The inclusion criteria were that the subject should be an adult Jordanian and should have no obvious cognitive deficit. The hospital is located at the center of Amman, the capital of Jordan, and is considered one of the oldest and largest hospitals in Jordan, serving more than 0.5 million people every year. The study site was ideal for testing the instrument as it serves a diverse group of patients from all over the country.
Translation and cross-cultural adaptation were done according to the EuroQol Group's guidelines. [4] Two independent translators performed forward translation, followed by backward translation by another two translators. When the consensus version was determined, cognitive debriefing was done by ten laypersons. They underwent a structured interview to assess understandability and ease of completion of the Arabic EQ-5D.
The translation process was smooth and straightforward. The translators only disagreed on the translation of "discomfort" during the forward translation process, as it can be translated into several closely related words. A discussion session was conducted until the translators agreed on the most appropriate translation. During the interview with the ten laypersons, they reported no concerns with the phrasing of the Arabic EQ-5D. In general, the instrument was very easy to complete and clear to all readers.
The study was approved by the University of Jordan Academic and Research Committee. All participants were interviewed by trained research assistants using a questionnaire containing the Arabic version of the EQ-5D and the short form health survey (RAND SF-36-Arabic version). [5] Demographic data were also collected using a standard questionnaire. A random sample was then chosen, given a copy of the EQ-5D and re-interviewed over the phone using the EQ-5D after 2 to 4 weeks to assess test-retest reliability.
The EQ-5D [6] consists of a self-classifier with five single-item health dimensions, each with three response levels, and a visual analogue scale (EQ-VAS). Both the health state descriptors and the visual analogue scale of the perceived health state of the Arabic version of the EQ-5D were used in the current study. The SF-36 is a validated, [5,7,8] 36-item instrument measuring perceived health in eight dimensions, with higher scores (range 0 to 100) reflecting better perceived health.
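As an illustrative aside, the self-classifier described above can be represented as a small data structure. The Python sketch below is not part of the study's methods: the names `DIMENSIONS`, `profile_code` and `has_problem` are hypothetical, and the level labels (1 = no problems, 2 = moderate problems, 3 = extreme problems) and the five-digit state code are assumed from the usual EQ-5D convention, not taken from this paper.

```python
from itertools import product

# Five single-item dimensions, in the order conventionally used for the state code.
DIMENSIONS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)
LEVELS = (1, 2, 3)  # assumed labels: 1 = no, 2 = moderate, 3 = extreme problems


def profile_code(responses):
    """Return the five-digit health-state code for one respondent, e.g. '11213'."""
    for dim in DIMENSIONS:
        if responses.get(dim) not in LEVELS:
            raise ValueError(f"missing or invalid level for {dim!r}")
    return "".join(str(responses[dim]) for dim in DIMENSIONS)


def has_problem(responses, dimension):
    """Dichotomize a dimension: True for moderate or extreme problems."""
    return responses[dimension] >= 2


# Every health state the descriptive system can express.
ALL_STATES = ["".join(map(str, levels)) for levels in product(LEVELS, repeat=5)]

example = {
    "mobility": 1,
    "self_care": 1,
    "usual_activities": 2,
    "pain_discomfort": 1,
    "anxiety_depression": 3,
}
```

Five dimensions with three levels each give 3^5 = 243 possible states. The dichotomy in `has_problem` mirrors the "moderate or extreme problems" grouping used when the hypotheses are tested.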
Known-groups construct validity [9] of the EQ-5D self-classifier and EQ-VAS was examined by testing ten a priori hypotheses based on the literature or clinical experience. Hypotheses relating EQ-5D dimensions to other variables were:
1. Subjects reporting problems for any EQ-5D dimension would have lower scores for all SF-36 scales; [10]
2. Subjects reporting problems for EQ-5D mobility, self-care, usual activities or pain/discomfort dimensions would have larger score reductions for SF-36 physical functioning (PF), role limitation due to physical problem (RP) and bodily pain (BP) scales than for role limitation due to emotional problem (RE) and mental health (MH) scales;
3. Similarly, subjects reporting problems for the EQ-5D anxiety/depression dimension would have larger score reductions for SF-36 RE and MH scales;
4. Subjects with mobility, self-care, or usual activities problems would have their lowest score in role limitation due to a physical problem;
5. Elderly subjects (age ≥60 years) would report more problems than other subjects;
6. Subjects with chronic diseases would report more problems than other subjects.
Hypotheses for the EQ-VAS were:
7. EQ-VAS scores would be higher in subjects reporting better global health measured using a 5-point scale (i.e., a lower score on the first question of the SF-36); [11,12]
8. EQ-VAS scores would correlate negatively with increasing age; [12,13]
9. Females would report lower (worse) EQ-VAS scores;
10. Subjects with chronic disease would report lower (worse) EQ-VAS scores than those without.
Hypothesized trends were tested as appropriate, depending on the type of data and distribution, using the chi-square test, the Fisher exact test, the t test, the Mann-Whitney test, and Pearson or Spearman correlation coefficients. To minimize false-positive tests of significance, a significance level of P<.01 [14] was used as the criterion for hypothesis fulfillment. Test-retest reliability of EQ-5D dimensions was investigated using Cohen's κ.
According to Landis and Koch, [15] κ coefficients of less than 0.0 are poor, 0.0 to 0.20 are slight, 0.21 to 0.40 are fair, 0.41 to 0.60 are moderate, 0.61 to 0.80 are substantial, and 0.81 to 1.00 are almost perfect. Data were analyzed with SPSS for Windows (version 9, SPSS Inc, USA).
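For readers unfamiliar with the statistic, Cohen's κ compares the observed agreement p_o between two administrations with the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). The study itself used SPSS. The pure-Python sketch below is only an illustration, with hypothetical function names and invented example responses, and its labels follow the Landis and Koch bands.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical responses."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)
    # Observed agreement: share of subjects giving the same answer both times.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of marginal proportions, summed over categories.
    counts_1, counts_2 = Counter(first), Counter(second)
    categories = counts_1.keys() | counts_2.keys()
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both sets use one category only")
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch agreement band."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Invented test and retest levels for one dimension (1 = no problem, 2 = problem).
test_levels = [1, 1, 2, 2, 1, 2, 1, 1, 2, 1]
retest_levels = [1, 1, 2, 1, 1, 2, 1, 2, 2, 1]
kappa = cohen_kappa(test_levels, retest_levels)
label = landis_koch_label(kappa)
```

Because κ corrects for chance agreement, a dimension on which nearly everyone reports "no problems" can show high raw agreement yet only a modest κ.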

RESULTS
During the study period, 200 subjects were asked to participate and only 14 refused. The main reason for refusal was lack of time. The demographic and clinical characteristics of those who refused to participate were similar to those of the study subjects. One hundred eighty-six subjects completed the baseline questionnaires. Table 1 shows the general characteristics of the subjects. Table 2 shows the distribution of responses to EQ-5D dimensions. There was only one missing item from the self-care and the usual activities dimensions, indicating the practicality and simplicity of the translated version. Among the five dimensions of the EQ-5D, the proportion of subjects reporting any problem was highest for anxiety/depression, with 102 (55%) subjects reporting moderate or extreme problems. The mean EQ-VAS score was 72.2 (SD 15.5).
Cronbach's α was 0.75, indicating that the EQ-5D has an acceptable internal consistency.
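Cronbach's α for k items is α = k/(k − 1) × (1 − Σ s_i² / s_T²), where s_i² is the variance of item i and s_T² is the variance of the summed score. The sketch below is a minimal illustration and not the study's SPSS procedure: the function name is hypothetical, the toy data are invented, and the paper does not state which items entered the calculation, so treating the five dimension levels as the items is an assumption.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one sequence of k item scores per respondent."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every respondent needs the same k >= 2 item scores")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented two-item toy data, small enough to check by hand.
toy_rows = [[1, 2], [2, 1], [3, 3]]
toy_alpha = cronbach_alpha(toy_rows)
```

In the toy data each item has variance 1 and the total score has variance 3, so α = 2 × (1 − 2/3) = 2/3.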
Fifty-two subjects (28.0%) participated in the follow-up telephone interview, with a 3-week median interval (interquartile range: 2 to 4 weeks). Cohen's κ values for EQ-5D mobility, self-care, usual activities, pain/discomfort and anxiety/depression items were 0.66, 1.0, 0.48, 0.66, and 0.48 respectively (P≤.001 for all dimensions). The intraclass correlation coefficient for the EQ-VAS between the two periods was 0.78.
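The paper does not state which form of the intraclass correlation coefficient was computed. As one possibility, the sketch below uses the one-way random-effects form, ICC = (MSB − MSW) / (MSB + (k − 1) MSW), where MSB and MSW are the between-subject and within-subject mean squares and k is the number of measurements per subject (two here, test and retest). The choice of form, the function name and the example scores are all assumptions made for illustration.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC; `ratings` holds k repeated scores per subject."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(ratings[0])
    if k < 2 or any(len(scores) != k for scores in ratings):
        raise ValueError("every subject needs the same k >= 2 scores")
    subject_means = [sum(scores) / k for scores in ratings]
    grand_mean = sum(subject_means) / n
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (disagreement between test and retest).
    ms_within = sum(
        (x - m) ** 2 for scores, m in zip(ratings, subject_means) for x in scores
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


# Invented (test, retest) EQ-VAS scores for four subjects.
vas_pairs = [(60, 62), (70, 68), (80, 82), (90, 88)]
vas_icc = icc_oneway(vas_pairs)
```

Other ICC forms, such as two-way models for consistency or absolute agreement, can give different values on the same data, so a reported coefficient is best read together with the model that produced it.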
All four hypotheses relating EQ-5D dimensions to SF-36 scales were fulfilled (Tables 3 and 4). Subjects reporting moderate or extreme problems for EQ-5D dimensions had lower SF-36 scores than those without such problems. Similarly, subjects reporting problems for EQ-5D mobility, self-care, usual activities or pain/discomfort dimensions had larger score reductions for SF-36 PF, RP and BP scales than for RE and MH scales. When subjects were grouped by their responses to the EQ-5D anxiety/depression dimension, the difference in scores for the SF-36 MH and RE scales was larger than that for all other scales. In addition, subjects with mobility, self-care, or usual activities problems had their lowest score in role limitation due to physical problem. Elderly participants (n=45) and those with at least one chronic medical problem (n=95) reported significantly more problems in all of the EQ-5D dimensions apart from anxiety/depression (Table 4).
All four hypotheses for the EQ-VAS were also fulfilled (Table 5). The EQ-VAS was positively correlated with global health (a lower score on the first SF-36 question, SF-1, indicates better health) and negatively correlated with increasing age. Participants with at least one chronic problem had significantly lower scores than those without. Females scored lower on the EQ-VAS, but this was significant only at the 0.05 level.

DISCUSSION
This is the first report on the reliability and validity of the EQ-5D in Arabic. All ten a priori hypotheses were fulfilled, suggesting that the translation has properties similar to those of other validated EQ-5D versions. Internal consistency of the instrument was also found to be acceptable. We also found evidence to support test-retest reliability of the EQ-5D self-classifier, with Cohen's κ being moderate to perfect (0.48-1.0). The κ values in our study were in general better than those reported in previous studies using the EQ-5D in subjects after stroke (Cohen's κ: 0.63-0.80, 3-week, n=234) [16] and in those with rheumatic diseases (Cohen's κ: 0.29-0.61, 1-week, n=52). [3] The significant and reasonably high intraclass correlation coefficient obtained in the EQ-VAS reliability study (0.78) demonstrates that the EQ-VAS is a feasible measure of self-reported health.
The results show that the most frequently reported problem, affecting 55% of participants, was anxiety/depression. Anxiety and depression are commonly associated with the etiology of many diseases, including asthma, diabetes and hypertension. This result is important in view of the high prevalence of diabetes and hypertension in Jordan, which exceeds 25%. [17] We have utilized the Arabic version of the EQ-5D in several clinical settings, including diabetes, rheumatoid arthritis and allergic rhinitis; the instrument was found to be of high clinical value and was strongly correlated with the clinical indicators. This indicates that the Arabic version of the EQ-5D is externally valid. It should be noted that the participants completed the EQ-5D with the help of a research assistant. The EQ-5D is a very simple instrument; we therefore think there would not be any major difference in the results if the EQ-5D were self-administered.
One study limitation was that we used a convenience sample, which may limit the generalizability of the results. This limitation was evident, as the sample was characterized by a high level of education. However, this study is a pilot investigation, and we hope to conduct a population-based investigation in the near future. It would have been more appropriate to validate the EQ-5D against a similar utility measure rather than the SF-36. However, in our case this was not possible, as the SF-36 is currently the only available well-validated and well-translated generic QOL instrument in the Arabic language.