The lung function score and its components as predictors of overall survival and chronic graft-vs-host disease after allogeneic stem cell transplantation

Aim To retrospectively assess if the modified lung function score (LFS) and/or its components, forced expiratory volume within the first second (FEV1) and diffusion capacity for carbon monoxide corrected for hemoglobin level (cDLCO), predict overall survival (OS) and chronic graft-vs-host-disease (cGvHD). Methods We evaluated 241 patients receiving allogeneic hematopoietic stem cell transplantation (allo-HSCT) at the University of Regensburg Transplant Center between June 1998 and July 2005 in relation to their LFS, FEV1 and cDLCO, before and after HSCT. Results Decreased OS after allo-HSCT was related to decreased pre-transplantation values of FEV1<60% (P = 0.040), cDLCO<50% of predicted value (P = 0.025), and LFS≥III (P = 0.037). It was also related to decreased FEV1 at 3 and 12 months after HSCT (P < 0.001 and P = 0.001, respectively) and increased LFS at 3 and 12 months after HSCT (P = 0.028 and P = 0.002, respectively), but not to changes of cDLCO. A higher incidence of cGvHD was related to decreased FEV1 at 6, 12, and 18 months (P = 0.069, P = 0.054, and P = 0.009, respectively) and increased LFS at 12 months (P = 0.002), but not to changes in cDLCO. Conclusions OS was related to both LFS and FEV1, but cGvHD had a stronger relation to FEV1 than to cDLCO or LFS. FEV1 alone offered more information on the outcome after allo-HSCT than LFS or cDLCO, suggesting limited value of LFS for the patients’ assessment after allo-HSCT.

Pulmonary complications significantly contribute to late-onset morbidity and mortality after allogeneic hematopoietic stem cell transplantation (allo-HSCT). Patients with pulmonary dysfunction surviving longer than 2 years had a 15.1-fold higher risk of late mortality than the general population (1). Late-onset non-infectious pulmonary complications can present in different forms, such as restrictive changes on pulmonary function testing (PFT) only, late interstitial pneumonitis (IP), cryptogenic organizing pneumonia (COP), airflow obstruction detected by PFT only, or bronchiolitis obliterans (BO)/bronchiolitis obliterans syndrome (BOS). Restrictive and obstructive changes can occur in isolation or in combination (2-5). Although the only currently accepted form of chronic graft-vs-host disease (cGvHD) of the lung is BO/BOS, it seems that all forms can occur in association with cGvHD and, although not fully understood pathophysiologically, may reflect overlapping forms or different phenotypes of pulmonary cGvHD (2-11). BO/BOS is presumably the most detrimental form, characterized by frequent non-responsiveness to treatment, a progressive clinical course, and irreversibility, all of which contribute to its high morbidity and mortality (12,13).
cGvHD is a major complication in long-term survivors after allo-HSCT (1,14,15), with a 6-year incidence of up to 61% in patients receiving peripheral blood stem cells (PBSC) (15) and with relevant impact on the quality of life in many patients (16-18).
The lung function score (LFS) combines the forced expiratory volume in the first second (FEV1) and the diffusion capacity of the lung for carbon monoxide corrected for hemoglobin level (cDLCO), weighting both equally. The LFS was first proposed by Parimon et al (19) as an approach to correlate PFT results prior to allo-HSCT with the clinical outcome. Later it was modified into more precise subcategories by the National Institutes of Health (NIH) Consensus Development Project on the criteria for clinical trials in cGvHD (Table 1) and suggested as a score to quantify pulmonary cGvHD and evaluate the effect of cGvHD treatment (20). In our clinical practice, we have seen cDLCO decrease as early as after induction of treatment and remain low for several months after allo-HSCT without obvious impact on the outcome. Therefore, the aim of this study was to evaluate the association of pre- and post-HSCT LFS, defined according to the NIH Consensus Development Project definition (20), and of its constituent parameters cDLCO and FEV1 individually, with overall survival (OS) and development of cGvHD after allo-HSCT.

Patient characteristics
This retrospective single-center study included 241 out of 247 adult patients of Caucasian origin who received allo-HSCT at the University of Regensburg Medical Center, Regensburg, Germany between June 1998 and July 2005; 6 patients were excluded due to missing data on pulmonary function before allo-HSCT. The median follow-up was 711 days (range, 22-3091 days) and the last day of recording data was March 31, 2007. Mean age was 44.5 years, 39% of patients were female and 69% male, 48% had a related and 52% an unrelated donor, 46% of patients had Eastern Cooperative Oncology Group (ECOG) index 0, 46% had ECOG index 1, and only 2% had ECOG index 2.
Prior to transplant, patients gave informed consent on the use of patient- and treatment-related information for retrospective analyses and publication. Standard myeloablative conditioning regimens consisted mainly of 8-12 Gy fractionated total body irradiation followed by high-dose cyclophosphamide +/− fludarabine or classic busulfan/cyclophosphamide, whereas reduced intensity conditioning (RIC) consisted mainly of the FBM (fludarabine/BCNU/melphalan) regimen (21). T-cell depletion for unrelated donor HSCT was performed by serotherapy with antithymocyte globulin (ATG) in 147 patients, with alemtuzumab in 4 patients, and with ex vivo selection of donor CD34+ cells in 24 patients. The severity of acute GvHD was graded from 0 to 4 using the Glucksberg scale (22). cGvHD was classified into no, limited, and extensive disease according to Shulman et al (23) and grouped by the presence or absence of cGvHD (Table 2).
PFT was scheduled before allo-HSCT and 3, 6, 9, and 12 months after transplant. Thereafter, patients were supposed to return to the center at 6-month intervals for follow-up or at shorter intervals if clinical complications were present. PFT was performed in our center according to the guidelines of the European Respiratory Society using the MasterScreen Body (Viasys Health Care, Würzburg, Germany), including spirometry, body plethysmography, and diffusion capacity measurements using the single-breath method. The data were digitally stored. The following variables were considered longitudinally: vital capacity (VC), total lung capacity (TLC), FEV1, FEV1/VC ratio, and the diffusion capacity using the single-breath method (DLCO). This study focused only on FEV1 and cDLCO. Because LFS is composed of percentages of predicted values of cDLCO and FEV1, we also used percentages of predicted values for better comparability. Predicted values were based on published references (24,25), and DLCO was adjusted to the hemoglobin level (cDLCO).

Statistical analysis
All statistical analyses were performed using SPSS 23.0 (IBM Corporation, Armonk, NY, USA). The χ² test was used to compare two categorical variables and analysis of variance was used to compare multiple categorical variables. The Brown-Forsythe test was used if homoscedasticity could not be assumed. Post-hoc analysis was done with the Scheffé procedure or, in case of unequal variances, with the Dunnett T3 test. For description of the time course of pulmonary function parameters, matched-pair analysis was used. For OS, actuarial curves were obtained by Kaplan-Meier analysis and compared using the log-rank test. To assess the relation between LFS, cDLCO, and FEV1 and the development of cGvHD, Cox-regression analysis was used. Stem cell source, GvHD prophylaxis, acute GvHD, related or unrelated donor, female donor into male recipient, reduced intensity or myeloablative conditioning, busulfan in the conditioning regimen, ECOG before HSCT, CMV-reactivation risk, thoracic radiation, total body irradiation, history of smoking, age over 40 years, and T-cell depletion were tested as covariates in a forward and backward analysis. As acute GvHD (none or grade 1 vs ...

cDLCO and FEV1 showed a weak but significant positive correlation before allo-HSCT (r = 0.4421; Figure 2A) and at 3 (r = 0.3773; Figure 2B), 6 (r = 0.4016; Figure 2C), and 12 months after allo-HSCT (r = 0.3135; Figure 2D), and changes in cDLCO contributed more than those in FEV1 to an increase in LFS.

We next determined the influence of pre-transplantation PFT parameters on clinical outcome. Pre-HSCT cDLCO showed no linear relation with OS (Figure 3A). Yet, patients with cDLCO<50% of predicted value had significantly lower OS than patients with cDLCO≥50% (20.0% vs 41.1%, P = 0.025, Figure 3C). After we classified pre-HSCT FEV1 values by 10% increments, decreased FEV1 showed a trend toward shorter OS that did not reach significance (P = 0.052, Figure 3B). However, patients with pre-HSCT FEV1<60% of predicted value had significantly shorter OS than patients with pre-HSCT FEV1≥60% (0% vs 38.4%, P = 0.040, Figure 3D).
After allo-HSCT, no relation between OS and cDLCO was seen at 3 (P = 0.187; Figure 3E) and 12 months (P = 0.090; Figure 3G). In contrast, decreased FEV1 demonstrated a significant relation with OS at both time points (both P < 0.001, Figure 3F and 3H).
OS decreased with increasing pre-HSCT LFS grade, but the trend was not significant (5-year OS, LFS I: 41.2%; LFS II: 36.8%; LFS III: 26.7%; P = 0.191, Figure 4A); the drop at grade III suggests that LFS≥III may be considered a threshold predictive of shorter survival. Patients with a pre-HSCT LFS III/IV tended to have shorter OS than patients with a pre-HSCT LFS I/II (307 vs 918 days, respectively; P = 0.069, Figure 4B). OS was significantly shorter in patients with a baseline LFS III than in patients with LFS I (median OS 307 vs 2208 days, P = 0.037, not shown).
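The survival comparisons above rest on Kaplan-Meier estimates, as named in the statistical analysis. The following is only a minimal sketch of the product-limit calculation in plain Python, not the SPSS procedure used in the study; the function name `kaplan_meier` and the five follow-up records are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up in days for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []

    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up data (days) and event indicators, not study data.
example_times = [100, 200, 200, 300, 400]
example_events = [1, 0, 1, 1, 0]
example_curve = kaplan_meier(example_times, example_events)
```

With these made-up records the estimate steps down to roughly 0.8, 0.6, and 0.3 at days 100, 200, and 300. Comparing two such curves, for instance by LFS grade, would additionally require a log-rank test, which is not sketched here.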

Relationship between cGvHD and LFS
LFS has been proposed as a parameter in the assessment of chronic pulmonary GvHD (20). Therefore, we tested whether LFS values predicted the occurrence of cGvHD in our patient cohort. Of the 241 patients, 109 (45%) developed cGvHD: 14.7% by day +120, 25% by day +142, 50% by day +180, 75% by day +229, and 87% within one year (median time of onset: 180 days, range 94-1912 days). As mentioned above, acute GvHD (none or grade 1 ... reduced intensity vs myeloablative), there was still a relation between decreased FEV1 and development of cGvHD, but it was no longer significant (P = 0.069, Table 3), which might be due to the very small number of patients with decreased FEV1.
One year after allo-HSCT, cDLCO was evaluated in 111 and FEV1 in 116 patients; of these, 71 and 72, respectively, developed cGvHD. Both the adjusted and the unadjusted Cox-regression model showed a significant influence of LFS on the development of cGvHD (Table 3, P = 0.002 for both). In the unadjusted model, decreased FEV1 showed a trend toward a relation with the occurrence of cGvHD (P = 0.107, not shown), which approached significance in the adjusted model (Table 3; P = 0.054). There was no relation between cDLCO and cGvHD development.
Eighteen months after allo-HSCT, cDLCO was evaluated in 104 and FEV1 in 106 patients. Of these, 67 patients developed cGvHD at the time of PFT or subsequently. Only FEV1 showed a significant influence on cGvHD in the unadjusted and adjusted models (not shown, P = 0.036 and 0.009, respectively).
Decrease in FEV1 after day +90 and incidence of cGvHD
We further determined the difference in FEV1 at day +180 and day +365 compared to day +90, as well as at day +365 compared to day +180, and considered a decrease of more than 10% as relevant. A decrease of more than 10% from day +90 to day +180 was seen in 19 out of 83 patients in whom PFT was done; from day +90 to day +365 in 23 of 87 patients; and from day +180 to day +365 in 16 of 89 patients.
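The classification of a relevant FEV1 decline between two visits can be sketched as follows. This is a hedged illustration only: the text does not state whether the 10% refers to a relative change or to percentage points of the predicted value, so the sketch assumes a relative change against the earlier measurement. The function name `relevant_fev1_decline` is hypothetical; the patient counts are those reported above.

```python
def relevant_fev1_decline(earlier, later, threshold_percent=10.0):
    """Return True if FEV1 fell by more than threshold_percent.

    Assumption: the decline is relative to the earlier value; the source
    does not specify relative change vs percentage points.
    """
    if earlier <= 0:
        raise ValueError("earlier FEV1 value must be positive")
    relative_drop = (earlier - later) * 100.0 / earlier
    return relative_drop > threshold_percent


# Share of tested patients with a relevant decline, from the reported counts.
decline_counts = {
    "day +90 to day +180": (19, 83),
    "day +90 to day +365": (23, 87),
    "day +180 to day +365": (16, 89),
}
decline_share_percent = {
    interval: round(100.0 * declined / tested, 1)
    for interval, (declined, tested) in decline_counts.items()
}
```

Under this assumption, a patient going from 80% to 70% of predicted FEV1 (a relative drop of 12.5%) would be flagged, whereas a change from 80% to 75% would not.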
The incidence of treatment-related mortality did not differ significantly between patients with a relevant FEV1 decline and patients with stable or increased FEV1 between day +90 and day +180 or between day +90 and day +365, but it increased from 5.5% to 25% in patients with an FEV1 decline between day +180 and day +365 (Table 4). In addition, 15 patients with a decrease in FEV1 between day +180 and day +365 had a higher incidence of ... and day +180 or day +90 and day +365. The incidence of pulmonary cGvHD did not differ between patients with and without FEV1 decline between day +90 and day +180, whereas the incidence of lung disease in patients with a decline of FEV1 >10% between day +90 and day +365 was 26.1%, compared with only 9.4% in patients with stable FEV1. The incidence of cGvHD irrespective of specific organ manifestations did not differ significantly between patients with an FEV1 decline and those with stable FEV1 (47.4% vs 59.5% from day +90 until day +180; 65.2% vs 56.3% from day +90 until day +365; Table 4).

Discussion
A promising approach to improve the understanding and treatment of cGvHD was the NIH Consensus Development Project on criteria for clinical trials in cGvHD. One goal of this project was to improve the clinical assessment of pulmonary cGvHD by proposing LFS as a grading score for pulmonary cGvHD (20). In the new diagnostic and response criteria of the NIH Consensus Development Project, the lung function score (26,27) is no longer recommended, and FEV1 is suggested as the single parameter to assess GvHD of the lung (27), which is consistent with our finding that cDLCO has no relation to the development of cGvHD.
In our study, OS was related to FEV1 and LFS. Pre-HSCT FEV1 showed a stronger influence on OS than LFS and cDLCO. FEV1<60% and cDLCO<50% were associated with inferior survival, consistent with prior reports (28). Combining pre-HSCT cDLCO with FEV1 may translate into a better ability to identify groups at increased risk of treatment-related mortality, but this is not supported by our data.
Parimon et al (19) demonstrated a stronger relation between OS and a differently defined pre-HSCT LFS in a very large patient cohort. This discrepancy might be explained by the smaller number of patients in our study and by differences in LFS categorization (in the study by Parimon et al, FEV1 and cDLCO were each scored with 1 for >80%, 2 for 70%-80%, 3 for 60%-70%, and 4 for <60%, and the sum was combined into an LFS grade of I for 2, II for 3-4, III for 5-6, and IV for 7-8 points).
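The categorization attributed to Parimon et al above can be written as a short sketch. It follows only the cut-offs quoted in this paragraph and is not the NIH-modified score used in this study. The function names are hypothetical, and the handling of values lying exactly on a boundary (80%, 70%, 60%) is an assumption, since the quoted ranges do not settle it.

```python
def component_score(percent_predicted):
    """Score one component (FEV1 or cDLCO, in % of predicted) from 1 to 4.

    Boundary handling is an assumption: exactly 80 scores 2,
    exactly 70 scores 2, exactly 60 scores 3.
    """
    if percent_predicted > 80:
        return 1
    if percent_predicted >= 70:
        return 2
    if percent_predicted >= 60:
        return 3
    return 4


def lfs_grade(fev1_percent, cdlco_percent):
    """Combine both component scores into an LFS grade I-IV."""
    total = component_score(fev1_percent) + component_score(cdlco_percent)
    if total == 2:
        return "I"
    if total <= 4:
        return "II"
    if total <= 6:
        return "III"
    return "IV"


# Hypothetical example values, not patient data.
example_grade = lfs_grade(fev1_percent=65, cdlco_percent=75)  # 3 + 2 = 5 points
```

Because both components enter the sum with equal weight, an isolated fall in cDLCO raises the grade just as much as an equal fall in FEV1 would, which is the property discussed in the following paragraphs.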
After allo-HSCT, both decreased FEV1 and increased LFS were associated with shorter OS, suggesting that both FEV1 and LFS are useful parameters in assessing the impact of pulmonary function loss after allo-HSCT on clinical outcome. Again, although it seems reasonable to hypothesize that combining its components FEV1 and cDLCO gives the LFS a higher clinical value than FEV1 alone, this was not shown in our study. According to the current guidelines of the ATS/ERS task force (29), FEV1 can be used to measure the severity of obstructive and restrictive changes in pulmonary function, as either corresponds to a decrease in FEV1. Pulmonary damage due to different patterns of pulmonary disease is merged within the LFS: airflow obstruction is a common complication after allo-HSCT (30,31) and in some cases evolves from/to BO (32-34); restrictive changes, accompanied by a reduced FEV1, have been frequently reported (10,31,32,35-37); and a reduced cDLCO has been observed in many patients already prior to allo-HSCT, often followed by a temporary decline and a partial recovery after transplantation (28,35,38). In addition, decreased cDLCO is found in numerous pulmonary complications following allo-HSCT, including not only late-onset non-infectious lung injury but also early complications such as clinical or subclinical alveolitis and interstitial pneumonitis, pulmonary hemorrhage, engraftment syndrome, or pulmonary vascular disease, and presents as reversible pulmonary toxicity secondary to conditioning regimens (4,7,29,36,39-41). Consistent with the study by Walter et al (42), we found a significant association of FEV1 with cGvHD at 6 and 18 months and a strong trend at 12 months after allo-HSCT.
Furthermore, the incidence of cGvHD was associated with a decrease in FEV1 of more than 10% at day +365, especially between day +180 and day +365, which resulted in elevated treatment-related mortality and reduced survival. One year after allo-HSCT, we also showed a significant relation of LFS with cGvHD. We also showed that cDLCO<50% potentially contributed to the relation of LFS with cGvHD, but cDLCO alone was not related to cGvHD.
In contrast to our study, which showed no significant association between impaired FEV1 ... (38,39,43). The relatively early drop in pulmonary function, mainly reflected by a decrease in cDLCO, might be attributed to infectious complications or cytokine-mediated effects after allo-HSCT (4,7,44,45). Walter et al further restricted their data to patients developing cGvHD within one year after HSCT, whereas in our study no such time limit was set. Patients developing cGvHD at later time points can have a normal LFS at day +90 and therefore show no relation between day +90 LFS and cGvHD, as observed in our cohort. Furthermore, our study population is smaller than the one evaluated by Walter et al (42). Our study is therefore potentially underpowered to detect a (minor) predictive role of LFS at 3 months for survival and to fit Cox-regression models with up to 5 different categories, as suggested by the inconsistent hazard ratios for FEV1 at 12 months as well as for FEV1, cDLCO, and LFS at 3 months.
Another limitation of our study is that, because only patients transplanted until 2005 were included in the analysis, severity grading of cGvHD was not performed according to the NIH consensus (27,46). Conditioning regimens as well as GvHD prophylaxis and treatment approaches may differ between centers, which may limit the generalizability of our results. However, calcineurin inhibitor plus methotrexate has so far remained the gold standard, response rates for second-line treatment in steroid-refractory GvHD are similar across different approaches, and no definite recommendation as to which is superior can be given.
Also, we compared lung function with overall cGvHD rather than with lung GvHD. In our cohort of 241 patients, only 24 had symptomatic lung GvHD; the statistical analysis therefore has to be interpreted with caution because of the small number of patients. During follow-up, FEV1 decreased slightly, which might be due to long-term toxicity, but also to mild cGvHD not clinically affecting the lungs or to cGvHD resulting in subtle changes within the lung.
This study showed that FEV1 as a single parameter had a strong association with both OS and cGvHD at most time points before and after allo-HSCT. cDLCO, however, did not show such an association, which gives only limited support for the application of the LFS as defined by the NIH Consensus Project on cGvHD (20) with respect to its predictive value for transplantation outcome and its relation with cGvHD. Therefore, prospective trials investigating the value of LFS combining FEV1 and cDLCO as a predictor of treatment response are needed. The presented results further allow us to formulate clinically relevant implications, such as a) regular screening of FEV1 after allo-HSCT identifies patients with lung manifestations of cGvHD, while cDLCO appears to be of clinical relevance only if <50% of the predicted normal value, b) the assessment of FEV1 at day +90 is recommended as a baseline to assess the toxicity of the conditioning regimen, but is unlikely to detect changes already related to pulmonary cGvHD, c) the majority of patients developing pulmonary cGvHD show a decline of FEV1 between day +180 and day +365 after allo-HSCT, and d) a reduction of FEV1 >10% compared to baseline is associated with increased morbidity and mortality. Additionally, novel parameters such as acinar airways ventilation heterogeneity and lung clearance index (47) might evolve as markers for early diagnosis of pulmonary involvement in cGvHD, and their evaluation alone or in combination with LFS or FEV1 is warranted.
Funding DW received support from the German José Carreras Foundation.
Ethical approval received from the Ethics Committee of the University of Regensburg, Germany.
Declaration of authorship DD contributed to the data evaluation, provided ideas for the study, and contributed to the analysis and writing. She assembled the data set and takes responsibility for data integrity and the accuracy of the data analysis. RR contributed to the data evaluation and provided ideas for the analysis. He takes responsibility for data integrity. CS contributed to data collection, provided ideas for the study, and took part in the writing. DW contributed to data collection and evaluation, provided ideas for the study, and took part in writing. BH contributed to data collection. EH contributed to data collection, provided ideas