Development and the initial validation of a new self-administered questionnaire for an early detection of health status changes in smokers at risk for chronic obstructive pulmonary disease (MARKO questionnaire)

Aim To develop and perform an initial validation of a new simple tool (self-administered questionnaire) that would be sensitive and specific enough to detect early changes in smokers leading to future development of chronic obstructive pulmonary disease (COPD).

Methods A total of 224 consecutive participants (50.9% women), with a mean ± standard deviation age of 52.3 ± 6.7 years, a smoking history of 37.5 ± 16.7 pack-years (85.8% active smokers), and no prior diagnosis of COPD, were recruited. The MARKO questionnaire was self-administered twice: at the general practitioner's office and, 2-4 weeks later, at the tertiary care hospital. After filling in quality of life (QoL) questionnaires, participants were assessed for COPD by a pulmonologist on the basis of history-taking, physical examination, lung function test, 6-minute walk test, and laboratory tests. They were divided into four subgroups: "healthy" smokers, symptomatic smokers, and smokers with mild and moderately severe COPD.

Results Psychometric analyses indicated that the 18-item questionnaire had very good internal consistency (Cronbach's alpha = 0.91) and test-retest reliability over a four-week period (ρc = 0.89, 95% confidence interval [CI] 0.85-0.92, Lin's concordance). Significant correlations of MARKO scores were found with two QoL questionnaires: r = 0.69 (P < 0.001) and r = 0.81 (P < 0.001). Receiver operating characteristic curve analysis showed an area under the curve of 0.753 (95% CI 0.691-0.808, P < 0.001), with a sensitivity of 71.83% and a specificity of 64.24% for discriminating "healthy" smokers from the other subgroups.

Conclusion Based on psychometric analyses and a high convergent validity correlation with already validated QoL questionnaires, the newly developed MARKO questionnaire was shown to be a reliable self-administered short health status assessment tool.

Trial registration ClinicalTrials.gov NCT01550679

Chronic obstructive pulmonary disease (COPD) is one of the major causes of chronic morbidity and mortality throughout the world (1). Millions of people suffer from this disease for years and die prematurely from it or its complications, which places a significant burden on the health care system and the economy. Since COPD is a preventable and treatable disease, early detection is very important (1,2). Chronic airflow limitation, a major characteristic of COPD, results from a mixture of small airway lesions and parenchymal destruction caused by chronic inflammation. According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD), to establish the diagnosis, the patient has to have a significant exposure, characteristic symptoms, and a significant degree of airflow limitation (Tiffeneau index <0.7) (1). The American Thoracic Society/European Respiratory Society (ATS/ERS) use even more stringent criteria for a significant airflow limitation based on the lower limit of normal (LLN), arguing that using a single-point criterion for all age groups significantly under- or overestimates the incidence and prevalence of COPD in different age groups (3). On the other hand, pathophysiological changes, symptoms, and diminished health-related quality of life (HRQoL) often precede clinically significant airflow limitation (1). Even though cigarette smoking is the major cause of COPD, only a fraction (<1/3) of smokers develop the disease. Based on recently published data from a subsample (n = 50 008) of the UK Biobank, there are significant shared genetic mechanisms underlying airway limitation, COPD, and smoking addiction (4).
Despite recommendations for an early diagnosis (1,2), up until now there have been no predictive parameters to evaluate the risk of developing COPD in a particular person exposed to tobacco smoke. The Interdisciplinary Association for Research in Lung Disease (Associazione Scientifica Interdisciplinare per lo Studio delle Malattie Respiratorie, AIMAR) guidelines recommend a stepwise approach that starts with a screening questionnaire as the first step in identifying a high-risk population. Already validated HRQoL questionnaires, like the St George's Respiratory Questionnaire (SGRQ) or the COPD Assessment Test (CAT), were not developed or validated for such a purpose (5,6). The aim of our study was to construct, develop, and conduct an initial validation of a new simple tool (self-administered questionnaire) that would be sensitive and specific enough to detect early changes in smokers leading to future development of COPD.

METHODS
This study was part of a broader research project, "Early Detection of COPD Patients in GOLD 0 (Smokers)

Validation study
Participants. The study, conducted between 2010 and 2013, included 224 consecutive participants (50.9% women) with a mean ± standard deviation age of 52.3 ± 6.7 years and a smoking history of 37.5 ± 16.7 pack-years (85.8% active smokers). They were recruited at the offices of 15 general practitioners (GPs) in two major cities (Zagreb and surroundings, 7 GPs; Split and surroundings, 8 GPs). The participants were approached by their GPs during a random visit to the office (not related to respiratory problems) if they were smokers or ex-smokers in the predefined age group.
The pre-screening for inclusion/exclusion criteria was conducted through a structured interview. To be included, participants had to have signed the written consent; be smokers/ex-smokers of either sex aged 40-65 years with a smoking history of at least 20 pack-years (calculated as the number of cigarettes smoked per day multiplied by the number of years of smoking and divided by 20); and have no previous diagnosis of COPD. The exclusion criteria were any clinically relevant chronic disease (cardiovascular, cerebrovascular, diabetes, hepatitis, nephropathy, chronic dialysis, systemic disorder, cancer) significantly affecting QoL at the time of the first visit; immunosuppressive therapy; acute respiratory disease in the four weeks before the visit; hospitalization for any reason during the past three months; myocardial infarction, cerebrovascular infarction, or transient ischemic attack during the past six months; a diagnosis of asthma; and an inability to perform the diagnostic protocol. After the diagnostic workup, participants were divided into four subgroups defined as "healthy" smokers (no respiratory symptoms and FEV1/FVC ≥ 0.7, n = 72), symptomatic smokers (chronic respiratory symptoms such as dyspnea, cough, and/or sputum production and FEV1/FVC ≥ 0.7, n = 110), COPD GOLD 1 (chronic respiratory symptoms, FEV1/FVC < 0.7, and FEV1 ≥ 80% predicted, n = 23), and COPD GOLD 2 (chronic respiratory symptoms, FEV1/FVC < 0.7, and FEV1 < 80% and ≥ 50% predicted, n = 19) (1).
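The pack-year formula and the four subgroup definitions above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the study's own software; the function names, the argument names, and the fallback label for cases the text does not define (for example, airflow limitation without symptoms, or FEV1 below 50% predicted) are assumptions made for this example.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Cigarettes per day x years of smoking / 20 (one pack = 20 cigarettes)."""
    return cigarettes_per_day * years_smoked / 20.0


def classify_subgroup(symptomatic: bool, fev1_fvc: float,
                      fev1_pct_predicted: float) -> str:
    """Assign one of the four study subgroups from the definitions in the text.

    symptomatic        -- chronic respiratory symptoms present (hypothetical flag)
    fev1_fvc           -- FEV1/FVC ratio as a fraction (e.g. 0.68)
    fev1_pct_predicted -- FEV1 as a percentage of the predicted value
    """
    if fev1_fvc >= 0.7:
        return "symptomatic smoker" if symptomatic else "healthy smoker"
    if symptomatic and fev1_pct_predicted >= 80:
        return "COPD GOLD 1"
    if symptomatic and 50 <= fev1_pct_predicted < 80:
        return "COPD GOLD 2"
    # Not covered by the four definitions given in the text (assumed label).
    return "outside the four study subgroups"


if __name__ == "__main__":
    print(pack_years(30, 25))                    # 37.5
    print(classify_subgroup(True, 0.65, 72.0))   # COPD GOLD 2
```

A participant smoking 20 cigarettes a day for 20 years would have exactly 20 pack-years, the minimum required for inclusion.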

Measuring instruments
The MARKO questionnaire was constructed and developed in the Croatian language for the purpose of this study by a group of experts: three medical doctors (pulmonologists ŽV, DP, and PMAC) and two psychologists (BRV, AKĐ). It was constructed in Croatian because the whole MARKO study was planned and performed in Croatia and did not involve participants from other countries. The questionnaire comprised 18 questions covering the manifestation and frequency of the symptoms present at the early stages of COPD that could affect the patients' HRQoL. The participants were asked to rate the frequency of their symptoms over a designated period of time (eg, over the past three months for coughing, shortness of breath, and expectoration, and over the past 12 months for pulmonary infections). They also rated their breathing quality and general health status. Furthermore, they reported on shortness of breath during daily life activities requiring different levels of physical strain, and compared their physical abilities and fatigue with those of their reference age group. The total scores ranged from 0 to 57 points, with higher scores indicating poorer HRQoL.
CAT is a validated, short (8-item), and simple self-administered questionnaire, with good discriminant properties, developed for use in routine clinical practice to measure the health status of patients with COPD (6). The test was developed using Rasch analysis as a single-dimensional construct. It showed excellent internal consistency (Cronbach's α = 0.88) and good test-retest reliability (intraclass correlation coefficient = 0.8). Every item is rated on a six-point scale from 0 to 5. Total scores range from 0 (indicating no impairment) to 40 (indicating maximum impairment). It is openly accessible and available in more than 60 languages. It was validated in 6 different countries using 4 different languages and translated into Croatian using an internationally recommended procedure (7).
SGRQ was designed to measure the overall health status and well-being of patients with obstructive airways disease (5). It is a standardized, self-administered, airways disease-specific questionnaire divided into three domains: symptoms (8 items), activity (16 items), and impacts (26 items). Internal consistency (Cronbach's α) for these domains in COPD was 0.61, 0.90, and 0.88, respectively. For each domain and for the overall questionnaire, scores range from zero (no impairment) to 100 (maximum impairment). The questionnaire is available in more than 70 languages and is openly accessible. SGRQ was not previously validated for the Croatian language but was translated using an internationally recommended procedure and has been used widely in many COPD clinical trials in Croatia (7). The SGRQ scores in our study were calculated using the score calculation algorithms and missing data imputation (if the total number of missing items was ≤10) of the Excel® SGRQ calculator.

Procedure
The purpose of this initial validation was to understand the basic psychometric characteristics of the newly constructed questionnaire and to determine how it compares with the existing and validated HRQoL questionnaires used for COPD, like CAT and SGRQ. It was also important to understand whether the newly developed questionnaire discriminates between all 4 subgroups of participants. We are aware that the questionnaire covers different domains and contains some redundant questions that differ only in the level of symptom severity. They were included in the construct on purpose because the final evaluation will be based on the results of the participants' follow-up. The main reason for developing this questionnaire was to try to pick up early changes in HRQoL that are predictive of the future development of COPD in smokers at risk, or of the progression of early COPD. As there are currently no instruments with which it can be compared for this purpose, the second validation will be done using follow-up data, which will allow us to discard redundant questions and fully analyze the construct validity of the MARKO questionnaire. In the validation study, the MARKO questionnaire was self-administered twice: first at the GP's office and then, 2-4 weeks later, at the tertiary care hospital during the pulmonologist's assessment. During the pulmonologist's assessment, the staff and the participant were blinded to the results of the MARKO questionnaire obtained at the GP's office. At the tertiary care hospital, participants were referred to a designated team consisting of a pulmonologist, a study nurse, and a lung function laboratory technician.
The participants filled in the self-administered MARKO questionnaire followed by CAT and SGRQ, after which they went through a structured and predefined diagnostic workup (history-taking, physical examination, lung function with bronchodilator test, 6-minute walk test, laboratory tests) to determine the diagnosis and staging of their COPD according to GOLD. They were then divided into the four subgroups described previously, which were used for further comparisons (1).

Data analyses
Data analyses were conducted using STATISTICA

Validation study
Men and women were of comparable age (52.0 vs 52.6 years, P = 0.537) at the time of inclusion, but men had smoked significantly more (43.0 vs 32.2 pack-years, P < 0.001) and were more likely to have quit (P = 0.012), although most participants were current smokers (85.8%) (Table 1). More than half of the participants of both sexes (men, 56.4% vs women, 56.1%, P = 0.840) had chronic disorders other than respiratory ones, and almost half of all participants were receiving some chronic disease treatment (42.7% vs 43.9%, P = 0.562). Men had a significantly higher body mass index (27.5 vs 25.4 kg/m², P < 0.001) and significantly higher systolic and diastolic blood pressure (P = 0.014 and P = 0.003, respectively), with a comparable heart rate (P = 0.751). Chronic or recurring respiratory symptoms were present in more than 60% of participants, with cough/sputum present in approximately half of them and wheezing in more than 20%, with no significant difference between the sexes (P > 0.300 for all comparisons). No significant difference was found for FEV1 (P = 0.620) or the FEV1/FVC ratio (P = 0.066), but men had a significantly lower FVC (P = 0.001) (Table 1).
The item-to-total correlations identified four questions whose coefficients were lower than 0.50, and the scores of this revised 14-item version of the questionnaire were also tested and compared with the scores of the 18-item questionnaire.
Internal consistency of the 14-item version was slightly better (Cronbach's alpha = 0.94), with comparable test-retest reliability over a four-week period (ρc = 0.88, 95% CI 0.84-0.91, Lin's concordance; r = 0.88, 95% CI 0.81-0.95, P < 0.001, Pearson correlation). The median (IQR) scores of the 18- and 14-item versions of the MARKO questionnaire, CAT scores, and SGRQ scores, together with subgroup comparisons, are presented in Table 2. Neither version showed a significant difference in scores between the sexes (P > 0.200 for both). The correlations of both scores with age were not significant (r < 0.02, P > 0.800 for both).
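For readers less familiar with the two reliability statistics reported here, the following is a minimal sketch of how Cronbach's alpha and Lin's concordance correlation coefficient (ρc) are conventionally computed. It uses the textbook formulas, not the study's own software, and the function names and toy numbers are assumptions for illustration only.

```python
from statistics import mean, pvariance, variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a table with one row per respondent
    and one column per questionnaire item."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient between two administrations
    (for example, test and retest total scores of the same participants)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return 2 * sxy / (pvariance(x) + pvariance(y) + (mx - my) ** 2)


if __name__ == "__main__":
    # Hypothetical toy data: 4 respondents answering 3 items.
    answers = [[0, 1, 1], [1, 2, 1], [2, 2, 3], [3, 3, 3]]
    print(round(cronbach_alpha(answers), 3))
    # Hypothetical test and retest totals for 5 participants.
    print(round(lin_ccc([7, 10, 13, 18, 9], [8, 10, 12, 17, 11]), 3))
```

Unlike the Pearson correlation, ρc is penalized by any systematic shift between the two administrations: retest scores that are all exactly one point higher give r = 1 but ρc < 1.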
Although analysis of variance was significant for between-group comparisons ("healthy" smokers, symptomatic smokers, COPD GOLD 1, and COPD GOLD 2) for all HRQoL questionnaires (MARKO, CAT, and SGRQ; P < 0.001 for all), only the 18-item MARKO questionnaire showed a significantly lower median score in "healthy" smokers than in each of the other three groups (M = 7 vs 13 vs 10 vs 18; P < 0.001, P = 0.045, and P < 0.001, respectively). The 14-item MARKO questionnaire, CAT, and SGRQ did not show significantly different scores between the "healthy" smokers and COPD GOLD 1 subgroups (P > 0.05 for all comparisons; Table 2). The 18- and 14-item MARKO questionnaires were also the only ones that significantly discriminated COPD from non-COPD participants (M = 14 vs 11, P = 0.008; 10 vs 8, P = 0.015; Table 2). ROC curve analysis for the 18-item MARKO questionnaire showed an AUC of 0.634 (95% CI 0.567 to 0.698, P = 0.004), with a sensitivity of 62.50%, specificity of 49.45%, PPV of 21.37%, and NPV of 85.71% for the score criterion of >10 for COPD. With each additional point on the scale of the 18-item MARKO questionnaire, the odds of a COPD diagnosis increased significantly, by 5% (OR 1.05, 95% CI 1.01 to 1.08, P = 0.009). The AUC for the 14-item MARKO questionnaire was 0.623 (95% CI 0.555 to 0.687, P = 0.010). Although the scores for the other questionnaires were lower in non-COPD participants, these differences were not significant (P > 0.09 for all).
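As an illustration of how the accuracy figures for a score criterion such as ">10" are derived, the sketch below counts true and false positives and negatives at a cutoff and then shows how a per-point odds ratio compounds over several points. The function names and the toy data are hypothetical and are not the study data; the only value taken from the text is the OR of 1.05 per point.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(scores, has_condition, cutoff):
    """Sensitivity, specificity, PPV, and NPV for the rule 'score > cutoff'."""
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_condition):
        test_positive = score > cutoff
        if test_positive and diseased:
            tp += 1
        elif test_positive and not diseased:
            fp += 1
        elif not test_positive and diseased:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def odds_multiplier(or_per_point, points):
    """Multiplicative change in odds for a given score difference, assuming
    the same odds ratio applies to every additional point."""
    return or_per_point ** points


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    scores = [5, 12, 9, 15, 11, 7, 20, 10]
    copd = [False, True, False, True, False, False, True, True]
    print(diagnostic_metrics(scores, copd, cutoff=10))
    print(round(odds_multiplier(1.05, 10), 2))  # 1.63
```

Under the stated assumption of a constant per-point effect, an OR of 1.05 per point means that a participant scoring 10 points higher has roughly 1.6 times the odds of a COPD diagnosis.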
Active smokers differed significantly from ex-smokers only in the SGRQ symptoms domain (M = 16.6 vs 6.3, P = 0.001; Table 2). Having a comorbidity did not produce a significantly different score on any of the questionnaires used (P > 0.110), but receiving chronic treatment for a non-respiratory disorder produced significantly different scores for the 14-item MARKO questionnaire (M = 8 vs 7, P = 0.040), SGRQ total score (M = 14.7 vs 11, P = 0.026), SGRQ activity domain (M = 23.5 vs 17.1, P = 0.022), and SGRQ impact domain (M = 6.1 vs 2, P = 0.026; Table 2).
All four questionnaires significantly discriminated (Table 2) between the subgroups with or without chronic respiratory symptoms (P < 0.001 for all comparisons), with or without wheezing (P < 0.001 for all comparisons), with or without chronic cough and sputum (P < 0.001 for all comparisons), with or without night awakening (P < 0.01 for all comparisons), with or without chest pain (P < 0.001 for all comparisons), with or without fatigue (P < 0.001 for all comparisons), and with or without rhonchi during auscultation of the lungs (P < 0.05 for all comparisons). ROC curve analysis for the 18-item MARKO questionnaire showed an AUC of 0.873 (95% CI 0.821 to 0.914, P < 0.001), with a sensitivity of 100%, specificity of 47.70%, PPV of 44.14%, and NPV of 100% for the score criterion of >8 for fatigue. None of the questionnaires (Table 2) significantly discriminated participants with a soft noise from those with a normal noise during auscultation (P > 0.740), but the 18- and 14-item MARKO questionnaires showed significantly different scores for prolonged expiration (M = 16 vs 10, P = 0.007; 10 vs 7, P = 0.008; respectively). ROC curve analysis for the 18-item MARKO questionnaire showed an AUC of 0.667 (95% CI 0.596 to 0.731, P = 0.004), with a sensitivity of 56.00%, specificity of 70.62%, PPV of 21.21%, and NPV of 91.91% for the score criterion of >14 for prolonged expiration.
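The AUC values reported above can be read as the probability that a randomly chosen participant with a given finding has a higher questionnaire score than a randomly chosen participant without it. The sketch below computes the AUC directly from that definition (ties counted as one half); it illustrates the concept only and is an assumption about method, since the text does not describe how the statistical software derived the curves. The function name and toy scores are hypothetical.

```python
def roc_auc(scores, has_finding):
    """Area under the ROC curve via pairwise comparison of scores
    (equivalent to the Mann-Whitney U statistic scaled to 0-1)."""
    positives = [s for s, f in zip(scores, has_finding) if f]
    negatives = [s for s, f in zip(scores, has_finding) if not f]
    if not positives or not negatives:
        raise ValueError("both groups must contain at least one participant")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # participant with the finding scores higher
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical toy scores, not taken from the study.
    scores = [16, 9, 12, 7, 20, 10, 14, 8]
    finding = [True, False, True, False, True, False, False, False]
    print(round(roc_auc(scores, finding), 3))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.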

DISCUSSION
The main result of our initial validation study was that the MARKO questionnaire showed the expected properties in the setting and population of its intended use (8). It was validated for comprehension and had very good internal consistency and test-retest reliability, with a high convergent validity correlation with the already validated COPD HRQoL questionnaires (SGRQ and CAT). A very important finding was that the MARKO questionnaire detected early symptoms in smokers better than the other two questionnaires, significantly discriminating symptomatic smokers/ex-smokers and COPD patients from "healthy" smokers/ex-smokers. Almost no differences were seen between the 14- and 18-item versions of the MARKO questionnaire; the 18-item version performed significantly better only in discriminating the other subgroups from "healthy" smokers/ex-smokers. These results represent the first step and a prerequisite for further validation of the MARKO questionnaire regarding its predictive power as an early marker of future development of COPD (as a single marker or in combination) that can be used for screening in a primary care setting.
Population screening for COPD is not a recommended strategy, but early diagnosis in a population at risk is highly recommended because of the high proportion of undiagnosed or late-diagnosed COPD associated with high morbidity (1,2,9). Several approaches for use in primary care have been tested, but only to make an early diagnosis of already present COPD (10). Regarding the diagnostic potential for COPD in a primary care setting, the MARKO questionnaire showed results comparable to those of a meta-analysis of the COPD Diagnostic Questionnaire (CDQ) by Haroon et al (10). However, rather than constructing a diagnostic questionnaire for COPD, our aim was to construct a questionnaire that could identify early changes in HRQoL in smokers leading to subsequent development of COPD.
Having such an instrument could help in starting secondary prevention or an early intervention sooner. With regard to this aim, the MARKO questionnaire showed a higher sensitivity for early symptoms of possible future COPD than SGRQ or CAT, with a high convergent validity correlation with these already validated COPD health status questionnaires. This high convergent validity correlation is also important because it shows specificity for respiratory disorders and suggests that the MARKO questionnaire may share the already known features of CAT and SGRQ, which are associated with many facets of COPD, like underlying inflammation, airway limitation, breathlessness, progression of disease, morbidity, and mortality (11)(12)(13)(14)(15). On the other hand, at least for the 18-item version, the results were not influenced by common comorbidities and concomitant treatment. In the systematic review by Haroon et al, the major risk of bias when evaluating questionnaires and handheld flow meters for screening purposes was inadequate blinding between index tests and spirometry, which was not the case in our study (10).
Further validation is expected after a follow-up of the cohort of smokers recruited into the MARKO study, when the potential of this tool to predict future development of COPD in smokers/ex-smokers at risk for COPD will be evaluated (as a single tool or combined with other markers).
Based on basic psychometric analyses and a high convergent validity correlation with already validated HRQoL questionnaires, the newly developed MARKO questionnaire was shown to be a reliable self-administered short health status assessment tool. It had better discriminating power for early changes associated with smoking susceptibility than the other two questionnaires (CAT and SGRQ), and is thus in accordance with the newest recommendations as a first step in making an early diagnosis. These properties will be tested prospectively in an ongoing cohort study to evaluate the predictive power of the MARKO questionnaire to identify individuals who will develop COPD among individuals at risk.
Declaration of authorship ŽV participated in conceiving the study, its design and coordination, and drafting and revising of the manuscript for important intellectual content. DP participated in conceiving the study, its design and coordination, data acquisition, statistical analysis, and drafting and revising the manuscript for important intellectual content. BRV participated in conceiving the study, its design and coordination, and drafting and revising the manuscript for important intellectual content. AKĐ participated in conceiving the study, its design and coordination, and drafting and revising the manuscript for important intellectual content. ML participated in study coordination, data acquisition, and drafting and revising the manuscript for important intellectual content. IG participated in study coordination, data acquisition, and drafting and revising the manuscript for important intellectual content. SL participated in study coordination, data acquisition, and drafting and revising the manuscript for important intellectual content. IJ participated in study coordination, data acquisition, and drafting and revising the manuscript for important intellectual content. PMAC participated in conceiving the study, its design and coordination, and drafting and revising the manuscript for important intellectual content. All authors contributed to the original ideas and gave the final approval for the final version of the manuscript.