The EUROPEP questionnaire for patient’s evaluation of general practice care: Bulgarian experience

Aim To validate the Bulgarian EUROPEP-questionnaire and implement it to measure patient evaluation of general practice care in the Bulgarian population. Methods A multicenter cross-sectional study was conducted at twenty-five primary care practices from the South-Central Region of Bulgaria. A total of 1000 adult patients aged over 18 years who had been visiting the practice for more than a year were approached consecutively to take part in the study. The internal consistency and test-retest reliability of the EUROPEP questionnaire were evaluated. To confirm the construct validity of the questionnaire, exploratory factor analysis was performed. Results Cronbach's alpha was 0.95 for “clinical behaviour” and 0.81 for “organisation of care”. Factor analysis identified two factors, which accounted for 77.0% of the total variation in these items. On average, 58.7% of respondents rated the level of care received as excellent. The waiting time in the waiting room was the item most poorly rated (33.8%). The item “keeping patients' records and data confidential” was the most highly rated (88.8%). Patients were less satisfied with “providing quick services for urgent health problems” (78.5% excellent or good) and “getting an appointment that suits them” (76.2% excellent or good). Conclusion Two scales with satisfactory psychometric properties were established in the Bulgarian version of the EUROPEP-questionnaire. The study identified areas requiring improvement in general practice, such as reducing waiting times and making it easier for patients to obtain a convenient appointment.

Health care systems based on person-centered care are designed to respect patient expectations, needs, and priorities (1). Patient perceptions of the quality of health care services are increasingly recognized as relevant to the evaluation of health care outcomes (2). Patient satisfaction, which is generally a multidimensional construct, has thus become a valuable indicator of medical care quality (2,3).
Review of the available literature shows that patient satisfaction is related to general practitioners' good communication skills and establishment of a good patient-physician relationship (4,5). In general, improved patient satisfaction with health care contributes to patient compliance with treatment and improves health outcomes (4,5).
There is no universal gold standard for measuring patient satisfaction. Several surveys of patient satisfaction with general practice care in Europe have used the European Task Force on Patient Evaluations of General Practice Care (EUROPEP) questionnaire. It is an internationally standardized and validated instrument in which patients evaluate their regular general practitioner (GP) based on their experience over the preceding year (6). Initially, the EUROPEP-instrument was developed to allow the comparison of outcomes in general practice care across Europe and to provide educational feedback to both general practitioners and patients (6,7).
In Bulgaria, after the health care reform in 2000, various surveys of patient satisfaction have been performed. Regardless of the accumulated data on patient satisfaction with general practice care, there has been no instrument allowing for the comparison of the results with those from other studies. As Bulgaria was not included in the initial comparative study of patient satisfaction in European countries, a special survey using the Bulgarian EUROPEP-instrument was conducted to collect data on patient satisfaction with general practice. To our knowledge, this is the first study of patient satisfaction in which the Bulgarian EUROPEP-questionnaire was used.
The aim of the study was to validate the Bulgarian EUROPEP-questionnaire and implement it to measure patient evaluation of general practice care in the Bulgarian population.

PARTICIPANTS AND METHOD
We performed the translation and validation of Bulgarian EUROPEP-instrument and carried out the first cross-sectional study of patient evaluation of general practice care using the Bulgarian EUROPEP-instrument as a part of the Medical University of Plovdiv project to develop a standardized methodology for large-scale measurement of patient experiences with general practitioners in Bulgaria. The study was conducted in a randomly selected region of Bulgaria among 1000 adult general practice patients from April 2015 to July 2015.

EUROPEP instrument: translation and validation of the Bulgarian version
The EUROPEP-instrument is a questionnaire that includes 23 items categorized into five qualitative domains, each measuring a different aspect of care: doctor-patient relationship; medical care; information and support; continuity and co-operation; and accessibility. All items are aggregated into two dimensions: clinical behavior (items 1-16) and organization of care (items 17-23) (Table 1).
Responses to each item are rated on a five-point Likert scale (from 1 = poor to 5 = excellent). The EUROPEP-instrument was linguistically validated according to a standard procedure (8) and cross-culturally adapted (9) into Bulgarian in several stages (Figure 1).
Stage І. The English source version of the EUROPEP questionnaire was translated into Bulgarian (forward translation) by three independent translators, who provided written rationales for decision making, linguistic difficulties, and problems regarding the content. The three Bulgarian versions were compared and synthesized at a consensus
Stage ІІ. The psychometric quality of the Bulgarian EUROPEP-instrument was tested on a convenience sample of 160 patients. We selected 8 general practices in the city of Plovdiv and asked each of the general practitioners to give 20 of their patients a copy of the questionnaire and a cover letter. Patients were eligible for participation if they were aged 18 years or more, had valid health insurance, had been registered with the same GP practice continuously for at least one year before the date of sample selection, and had visited their GP at least once in that period. After providing informed consent, the patients completed the questionnaire at home and sent it by post to the Department of Health Management and Health Economics, Medical University of Plovdiv. After the filled-out questionnaire was received, a second copy was sent to the same patients four weeks later, so that each patient completed the questionnaire twice, four weeks apart, to test the reliability of their answers.
Stage ІІІ. Cognitive interviews were performed using concurrent think-aloud and probing techniques, as described elsewhere (10), to elicit information about potential problems in the Bulgarian translation of the EUROPEP-instrument. At the cognitive debriefing, the translated Bulgarian EUROPEP-instrument was administered to 7 patients, who met the specified age and other representative criteria for the instrument target population and had no previous information about the questionnaire. The local Medical University of Plovdiv project manager, who is a sociologist, several members of the project team, as well as project partners (psychologist, general practitioner, and social worker) were also present. Each patient, after having completed the questionnaire, was interviewed by the local project manager. Interviews addressed each item of the Bulgarian EUROPEP-instrument for which patients had indicated difficulty in understanding the question or said they would phrase it differently. Patients were allowed to propose alternative translations of the items that they felt were difficult to understand. The suggestions and interpretations were evaluated for conceptual equivalence and equivalence in construct operationalization, discrepancies with the original questionnaire were discussed, and on this basis the final Bulgarian EUROPEP-instrument was created.
Stage ІV. A cross-sectional study of patient satisfaction with general practice care was conducted using the Bulgarian EUROPEP questionnaire. The preliminary results of the Bulgarian EUROPEP-instrument validation were presented at the EGPRN Meeting in Edirne, Turkey, 2015 (11).

Participants
The study participants were selected using a three-stage random sampling. Initially, the country's region was selected, followed by general practices and patient selection. Twenty-five primary care practices (3.5%) from all five districts of the South-Central Region in Bulgaria were selected. The National Health Insurance Fund contract partners list was used, with random number assignment and selection, using a step-wise approach. The patient sample size was calculated at a maximum variance of 50% with 95% confidence interval and bounded to a maximum error of 5%. The sample size was set at 385 patients. Based on the literature review, the response rate of patients when mailing method is used ranges from 30% to 60% (12,13). The final sample was calculated at about 1000 participants. Eligible participants were aged 18 years or more and had valid health insurance. All of them had visited the primary care practice at least once in the preceding 12 months.
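The sample-size reasoning above can be sketched in a few lines. This is a minimal illustration, assuming the standard formula for estimating a proportion, n = z² p(1 − p) / e², with z = 1.96 for 95% confidence; the function names are hypothetical, and the 40% response rate used in the demo is an assumption chosen from within the cited 30%-60% range, not a value stated in the study.

```python
import math

def sample_size_for_proportion(p=0.5, margin=0.05, z=1.96):
    """Minimum n to estimate a proportion p within +/- margin.

    p = 0.5 gives the maximum variance p * (1 - p); z = 1.96 is the
    normal quantile for a 95% confidence level.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def invitations_needed(n_required, response_rate):
    """Questionnaires to hand out so that about n_required are returned."""
    return math.ceil(n_required / response_rate)

n = sample_size_for_proportion()
print("required completed questionnaires:", n)

# Assumed response rates spanning the 30%-60% range cited for mailed surveys.
for rate in (0.30, 0.40, 0.60):
    print(f"response rate {rate:.0%}: invite {invitations_needed(n, rate)}")
```

Under these assumptions the required number of completed questionnaires is 385, and a response rate near 40% leads to an invited sample of roughly 1000.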

Method
Initially, a telephone contact was established with the selected GPs. They were familiarized with the study objectives and consented to participate. The envelopes containing the EUROPEP questionnaire, instructions, informed consent forms, and an addressed and stamped envelope were delivered to the GP practices personally by the investigators or by using courier services. GPs handed out the envelopes to all eligible patients, in consecutive order, at the end of patient visit. In the instructions, the patients were asked to send the completed questionnaires to the investigator (RD) directly. A maximum of 40 adult patients per practice were consecutively included from those who had visited the practice for a consultation. A total of 1000 adult patients were invited to participate in the study. Written informed consent was obtained from all participants after the explanation of the study protocol.
To minimize the influence of physicians and bias when completing the questionnaire, the patients were instructed to complete the questionnaire at home and mail it to the research center, using the prepaid envelope. No personal identification was used and data sets contained only anonymous data. Therefore, the use of reminders or assessment of non-response bias was not possible. Unique questionnaire numbers ensured the correct identification of each general practice and allowed for the comparison of general practice characteristics with patients' evaluation. Data were collected and analyzed at the research center.

Statistical analysis
Descriptive statistic parameters (mean ± standard deviation [SD]) and percentages were calculated. Internal consistency was assessed using Cronbach's alpha and average inter-item correlation. We defined an alpha of 0.80 as the lowest acceptable value (6,14). For the evaluation of intrarater reliability, the split-half method was used and the Spearman-Brown coefficient (rsb) was calculated. An average inter-item correlation of at least 0.50 was regarded as good (14,15). Intra-class correlation coefficient (ICC), using the test-retest method, was also used to estimate the inter-rater reliability to assess consistency and reproducibility. The Wilcoxon signed-rank test was applied to compare item scale scores obtained during the test and re-test.
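For readers who want to see how the two internal-consistency statistics are computed, the following is a minimal sketch using only the Python standard library. It is not the SPSS procedure used in the study: the function names, the odd-even split of items, and the small `demo` data set are all hypothetical and serve illustration only.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_spearman_brown(rows):
    """Split-half reliability with the Spearman-Brown correction, 2r / (1 + r).

    Assumption: items are split into odd- and even-positioned halves.
    """
    odd = [sum(r[0::2]) for r in rows]
    even = [sum(r[1::2]) for r in rows]
    r = pearson(odd, even)
    return 2 * r / (1 + r)

# Hypothetical answers of five respondents to four Likert items (1-5).
demo = [[4, 5, 4, 5], [3, 3, 4, 3], [5, 5, 5, 4], [2, 3, 2, 3], [4, 4, 5, 5]]
print(round(cronbach_alpha(demo), 3))
print(round(split_half_spearman_brown(demo), 3))
```

Other ways of splitting the items (for example, first half versus second half) give somewhat different split-half values, which is one reason alpha is usually reported alongside.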
The item-scale correlation coefficients were calculated to determine whether each item on a scale was substantially related to the total score computed from the other items on that scale.
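The item-scale correlation described above, in which each item is correlated with the total of the other items on its scale, can be sketched as follows. This is an illustrative standard-library version with hypothetical function names and data, not the study's SPSS output.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_scale_correlations(rows):
    """For each item, correlate it with the sum of the remaining items."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [r[i] for r in rows]
        rest = [sum(r) - r[i] for r in rows]  # scale total without item i
        result.append(pearson(item, rest))
    return result

# Hypothetical answers of five respondents to four Likert items (1-5).
demo = [[4, 5, 4, 5], [3, 3, 4, 3], [5, 5, 5, 4], [2, 3, 2, 3], [4, 4, 5, 5]]
for i, r in enumerate(corrected_item_scale_correlations(demo), start=1):
    print(f"item {i}: r = {r:.2f}")
```

Removing the item from the total before correlating avoids inflating the coefficient by correlating the item partly with itself.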
Construct validity was assessed by correlations of scale scores and by evaluating the relationship between patient satisfaction and nine additional questions included in the questionnaire, similarly to the Norwegian and Portuguese studies (3,16). The results obtained from each patient during test-retest were entered, cleaned, cross-checked, and analyzed with the corresponding patient's previous results. Data were processed with IBM SPSS Statistics 22 software. The level of statistical significance was set at P < 0.05.
According to the revised EUROPEP-2006 instrument and the user's manual, we accepted a benchmark of 75% or above for scale scores, ie, the percentage of positive patient evaluations of general practice care (4 or 5 on the Likert scale, corresponding to "good" and "excellent" ratings) for each item (8).
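The benchmark check can be illustrated with a short sketch. It assumes that missing or "not relevant" answers are represented as `None` and left out of the denominator; the function names and the example answers are hypothetical.

```python
def percent_positive(scores):
    """Percentage of valid answers rated 4 ("good") or 5 ("excellent").

    Answers given as None (missing / not relevant) are excluded.
    """
    valid = [s for s in scores if s is not None]
    return 100.0 * sum(1 for s in valid if s >= 4) / len(valid)

def meets_benchmark(scores, threshold=75.0):
    """True if the share of positive evaluations reaches the 75% benchmark."""
    return percent_positive(scores) >= threshold

# Hypothetical answers to one item on the 1-5 Likert scale.
item_answers = [5, 4, 3, None, 5]
print(percent_positive(item_answers), meets_benchmark(item_answers))
```

In this example three of the four valid answers are positive, so the item reaches exactly 75% and meets the benchmark.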
Exploratory factor analysis with the principal axis factoring extraction method and orthogonal (Varimax) rotation was used to assess the underlying structure of the items in the final Bulgarian version (3,16). Initially, sampling adequacy was assessed using the Kaiser-Meyer-Olkin (KMO) test and Bartlett's test of sphericity. Because the assumption of multivariate normality of the data was severely violated, confirmatory factor analysis was not applied.

Psychometric quality of Bulgarian EUROPEP-instrument
Internal consistency and test-retest reliability. The item-scale correlation coefficient for all items is satisfactory (r >0.70). The overall Cronbach's alpha for the Bulgarian EUROPEP-instrument is 0.958 (0.95 for "clinical behaviour" and 0.81 for "organisation of care"). Additionally, the internal consistency in each of the 5 subscales is considered satisfactory (Table 1). The high reliability of the instrument was confirmed by the split-half method (0.96) and the ICC (0.97). Positive scores on the "clinical behavior" dimension were significantly related only to positive scores on perceived health status (r sb = 0.23, P = 0.004); however, this correlation was almost negligible. The sex and age of the respondents did not influence their evaluations of GPs.
It was found that Cronbach's alpha coefficient between question 12 and question 17 was very high (0.985), which allowed us to combine the two questions (Table 1). Also, during the cognitive interviews, patients argued that there was a conceptual and construct equivalence between Q12 and Q17. Taking into account this opinion, the construct and the meaning of Q12 were integrated with those of Q17. The resulting Q16 reads: "Explaining the purpose of the medical check and preparing for what to expect from other specialists, hospital care, tests and treatments" (Table 2).
Several amendments were introduced to the final version of the questionnaire. All statements were converted to questions and a true zero was added, including answers (eg, "I do not have an opinion"). Also, verbal statements were included alongside the numeric values of the scales (verbal numeric rating scale).

Construct validity.
To confirm the construct validity of the Bulgarian EUROPEP-questionnaire, exploratory factor analysis was performed using the principal axis factoring extraction method with pairwise deletion of missing values. It revealed evidence for a 2-factor structure related to perceived patient evaluations of general practice care (Table 2). All items were organized into two subscales, "clinical behavior" (16 items) and "organization of care" (6 items). The KMO test and Bartlett's test of sphericity showed that the data were adequate for factor analysis (KMO = 0.971 and Bartlett's test P < 0.001). The two factors with eigenvalues >1 accounted for 77.0% of the total variation in these items: factor 1 explained 72.9% of the total variation and factor 2 explained 4.1%. The high values of factor loading (>0.700) of each of the items in factor 1 required the use of rotation. After the rotation analysis, factor 1 explained 48.5% of the variance and factor 2 explained 28.5%. The level of factor loadings for all items was >0.6. Factor analysis with eigenvalues <1 broke the first factor "clinical behavior" into two almost equal sub-factors (items 9-16 and items 1-8). However, it did not reach the five different domains of the EUROPEP questionnaire.
The construct validity testing was assessed through correlations of scales and comparisons of responses to some additional questions included in the questionnaire (Table 3). Both "clinical behavior" and "organization of care" scales correlated significantly with general health status and number of GP consultations over the previous year. However, the low coefficient values indicated no significant relationship. Our study confirmed that a better health status is associated with more positive evaluation of care (χ2 = 56.08, P = 0.005).

Patients' and GPs' characteristics
Of a total of 511 completed and returned questionnaires, 15 (2.9%) were discarded due to incomplete information (more than 50% of data missing) and 496 were finally processed (overall response rate of 49.6%). The mean±SD age of respondents was 53.4 ± 15.2 years (Table 4).
With respect to GP practices, 24 (96%) were solo GP practices, and 18 (72%) GPs were women. Comparison between the sample structure and the general population regarding the type of GP practice revealed no statistically significant differences (χ2 = 0.443, P = 0.505).

Ceiling effect and item response rate to Bulgarian EUROPEP-instrument
The item response rate was high with a small number of missing answers ( were most appreciated by patients. On the other hand, "Waiting times in the waiting room" and "Booking a convenient appointment" were the items rated most poorly. The benchmarking of the positive answers (the gold standard) was achieved in 21 items.

DISCUSSION
The validation process in the current study revealed a satisfactory level for Cronbach's alpha and Spearman-Brown coefficients and identified some areas requiring improvement in general practice. Compared to the results of other studies, the calculated internal consistency coefficient (Cronbach's α) of the aggregated scores for all five sub-scales was very high (6,12,13,16,18-20). The item-scale correlation also exceeded 0.70 for all items in the sub-scales, whereas in other studies the respective values were over 0.40 (Italy) and over 0.50 (Norway) (16,20).
Recent studies that also mailed the questionnaires reported similar response rates (12,13,21). A survey in England with a very large sample (nearly two million respondents) reported suboptimal response rates (40.6%) (22). In a study in Slovenia, the response rate was about 84% (12). The response rates in 16 European countries varied from 47.1% to 89.0% (6).
The current study revealed that most evaluations of general practice care were 4 or 5 on the Likert scale for all of the EUROPEP questions. The results from Slovenia are similar: positive evaluations (good/excellent) were 80.0% or higher for all items, except for the waiting time (18,23).
In our opinion, mean scores and overall high patient evaluations are satisfactory. In comparison, in Turkey the mean percentage rate of satisfaction was calculated at 88.3% (24). The mean scores and ceiling effect are consistent with previous EUROPEP studies (12,16-18,20). Similarly to the Slovenian study, our results showed that the items "keeping your records and data confidential" and "listening to you" were the most highly rated (over 88.8% and over 85.9% 'excellent' or 'good' ratings, respectively) (12). Most Bulgarians also rate highly the option to receive the GP's medical advice on the phone and the availability of prophylactic and preventive services. These findings are broadly consistent with other literature data on patient evaluations of their care (6,18). Unlike our study, Akturk et al (13) established the highest positive ratings for the item "providing quick services for urgent health problems".
*Percentages are calculated from the total number of patients included in the study (n = 496), excluding the "missing" and "not relevant" answers.
†One of the possible answers was "I can not answer".
Similar to other studies, our results indicate that patients were less satisfied with "organization of care" than with "doctor-patient relationship" and "clinical behavior" (12,18,22,25). Our results show that waiting time in the waiting room was the item rated most poorly, similar to the ratings provided in Germany (26). It seems that in Bulgaria and in Turkey, in terms of organization of care, waiting time in the waiting room is the least satisfactory item (18,27,28). Interestingly, Pakistani patients give the highest ratings for "listening to you", whereas "waiting time" was rated most poorly (25). Unlike the participants in our study, those in the above-mentioned study consider "respecting patient confidentiality" unimportant (25). Interesting results were established in 16 other countries participating in an international study: the percentage of good and excellent scores for "keeping confidentiality of medical records" was very high (6,23). Our results are comparable to the results from the multicenter study carried out in several European countries (Table 6) (23).
All items of the current study had skewed frequency distributions, suggesting a ceiling effect. This fact indicates that the large majority of patients had positive experiences. It might be speculated that our results reflect cultural traits and/or specifics in the organization of care; as in many other countries, people are reluctant to express negative ratings even if they reflect their actual feelings (23,29). In our study, patients' satisfaction was not affected by the type of practice. The satisfaction ratings tended to increase with a lower number of GPs in the practice, as shown in a study from Switzerland (30). It is debatable whether the cultural bias in the EUROPEP-instrument is able to delineate the differences noted between the countries (31). We believe that further research on cultural validation will contribute to better knowledge of the specificity of patient evaluations in Bulgaria. We also take into consideration the options for improving the instrument itself.
Patients in Denmark were more satisfied with male GPs, whereas Bulgarian patients felt more satisfied and emotionally supported by their female GPs (6). Furthermore, the results from another study did not differentiate between the type of practice (solo or group) and patient's preferences (18). Unfortunately, we were not able to explore the practice size as a predictor of satisfaction. Patients in the UK were more satisfied if they were able to "get through on telephone" if the practices were located in rural areas. Differently, our study did not find such correlation. On the contrary, patients in Germany were more satisfied if practices were located in urban areas (29).
As established in other studies, healthier patients give better evaluations of health services (20,31). German patients with a self-reported chronic condition generally report higher satisfaction, but this is not the case with all aspects of care (32).
We also found that patients who visit the general practice more often were generally more satisfied with their GP, similar to patients in Slovenia (6,33).
Our cross-sectional study had several limitations. The overall response rate was unsatisfactory. After the linguistic validation process and the cross-cultural adaptation, the Bulgarian EUROPEP-instrument included 22 items. In fact, minimal amendments were made to the original EUROPEP-instrument: the Bulgarian version is shorter by one question.
Representativeness could not be claimed despite the random sampling method used, as patients in the current study were mainly recruited in the cities. Overall, the sample is not representative of the general population. Unfortunately, comparative analysis between responders and non-responders could not be accomplished due to missing data.
The response rate to the survey was low; therefore, the results might be affected by response bias. The mailing of the questionnaires and the non-use of reminders are likely explanations for it. According to the study protocol, GPs were asked to recruit patients in consecutive order, starting at a random point in time; however, no feedback was received to check the actual recruitment process.
Based on this, it is assumed that results for overall patient evaluation and comparability are skewed. Therefore, more additions and revisions should be made to the Bulgarian EUROPEP-questionnaire based on the conceptual equivalence and cultural relevance of the content of the questionnaire for the Bulgarian population. This could be achieved through expert discussions and patient's cognitive debriefing. The larger-scale testing will be subject of another detailed publication.
In conclusion, the present study identified two scales in the Bulgarian EUROPEP-instrument with satisfactory psychometric properties. However, Bulgarian cultural, economic, and health system characteristics and the established high ceiling effects indicate the need for further instrument development and future research. Additional research is needed to further clarify patients' evaluations of care in general, and in terms of specific aspects, in order to answer how they reflect optimal care and outcomes. The results of the representative study will provide information necessary for better management and political decision making. The result will be improved quality of general medical care in our country, as well as baseline data for international comparisons.