Diagnostic Accuracy of Ultrasonographic Respiratory Variation in the Inferior Vena Cava, Subclavian Vein, Internal Jugular Vein, and Femoral Vein Diameter to Predict Fluid Responsiveness: A Systematic Review and Meta-Analysis

This systematic review and meta-analysis aimed to investigate the ultrasonographic variation of the diameter of the inferior vena cava (IVC), internal jugular vein (IJV), subclavian vein (SCV), and femoral vein (FV) to predict fluid responsiveness in critically ill patients. Relevant articles were obtained by searching PubMed, EMBASE, and Cochrane databases (articles up to 21 October 2021). The number of true positives, false positives, false negatives, and true negatives for the index test to predict fluid responsiveness was collected. We used a hierarchical summary receiver operating characteristics model and bivariate model for meta-analysis. Finally, 30 studies comprising 1719 patients were included in this review. The ultrasonographic variation of the IVC showed a pooled sensitivity and specificity of 0.75 and 0.83, respectively. The area under the receiver operating characteristics curve was 0.86. In the subgroup analysis, there was no difference between patients on mechanical ventilation and those breathing spontaneously. In terms of the IJV, SCV, and FV, meta-analysis was not conducted due to the limited number of studies. The ultrasonographic measurement of the variation in diameter of the IVC has a favorable diagnostic accuracy for predicting fluid responsiveness in critically ill patients. However, there was insufficient evidence in terms of the IJV, SCV, and FV.


Introduction
Achieving a satisfactory response to fluid replacement in critically ill patients has remained a challenging issue [1][2][3][4]. An insufficient fluid volume can lead to low cardiac output (CO), which may result in reduced tissue perfusion [1,2]. However, excessive fluid infusion might also be detrimental. As depicted by the Frank-Starling curve, an increase in preload does not correspond to an equal increase in stroke volume (SV) when it reaches the maximum slope and plateau [5]. Excessive fluid volume is a significant risk factor for acute lung injury, bowel edema, and compartment syndrome [6,7]. Therefore, it is crucial to determine whether the patient needs additional fluid or not. However, it is not easy to precisely predict fluid responsiveness before fluid administration because the etiology of shock associated with diverse aspects of fluid balance is difficult to ascertain. This is despite the fact that many clinical manifestations of shock such as low blood pressure, tachycardia, altered mental state, cool clammy skin, or low urine output have been described [8]. To evaluate fluid responsiveness, information about the increase in SV after fluid challenge is useful. Invasive methods used in the past include the Swan-Ganz catheter, which directly measures capillary wedge pressure, and has been the gold standard for CO or SV measurement [9]. However, it is a substantially invasive and difficult procedure, especially in patients with cardiovascular instability [4,9]. Therefore, to overcome the shortcomings of the Swan-Ganz measurement method, a pulse wave analysis method that attempts to measure CO or SV has been proposed [10]. Other minimally invasive and non-invasive methods such as arterial pulse wave analysis also have several limitations in terms of artifact validation, arterial compliance, alteration in vasomotor tone, or non-pulsatile blood flow [10]. 
Moreover, the above-mentioned techniques are unable to predict fluid responsiveness before the fluid challenge.
Recently, several researchers have applied point-of-care ultrasound in critically ill patients [11]. Ultrasonography is non-invasive and relatively inexpensive, and it can measure SV effectively [11]. In addition, ultrasonography can detect the variation of IVC diameter (∆IVC), which reflects the cardiac preload [11,12]. By measuring the cardiac preload, it is possible to predict volume status and fluid responsiveness. The IVC diameter is easily measured via a subxiphoid view, even by operators without extensive training, whereas measuring the SV via an echocardiogram requires an experienced intensivist or cardiologist. In addition, the internal jugular vein (IJV), subclavian vein (SCV), and femoral vein (FV) are easier to visualize because they are more superficial than the IVC. The diameter of the IVC varies with inspiration and expiration [12]. In patients who breathe spontaneously, intrathoracic pressure decreases during inspiration, which accelerates venous return. During expiration, intrathoracic pressure increases, and venous return decreases [12]. Consequently, the IVC diameter decreases during inspiration and increases during expiration. When mechanical ventilation is employed, this phenomenon reverses. However, the IVC cannot always be visualized in patients with obesity, intraabdominal fluid collection, or bowel gas. Thus, other large veins might be used as alternatives in such patients.
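As an illustration of how respiratory variation in vein diameter can be quantified, the sketch below computes two commonly used indices from the maximum and minimum diameters over one respiratory cycle. The formulas, function names, and example diameters are assumptions for illustration only; the studies included in this review used varying definitions and thresholds.

```python
def collapsibility_index(d_max: float, d_min: float) -> float:
    """(Dmax - Dmin) / Dmax, a definition often used in spontaneous breathing."""
    if d_max <= 0 or d_min < 0 or d_min > d_max:
        raise ValueError("expected 0 <= d_min <= d_max and d_max > 0")
    return (d_max - d_min) / d_max


def distensibility_index(d_max: float, d_min: float) -> float:
    """(Dmax - Dmin) / Dmin, a definition often used under mechanical ventilation."""
    if d_min <= 0 or d_min > d_max:
        raise ValueError("expected 0 < d_min <= d_max")
    return (d_max - d_min) / d_min


# Hypothetical diameters in millimetres (not taken from any included study).
ci = collapsibility_index(d_max=20.0, d_min=12.0)
di = distensibility_index(d_max=20.0, d_min=16.0)
print(f"collapsibility index: {ci:.0%}")  # 40%
print(f"distensibility index: {di:.0%}")  # 25%
```

The two indices differ only in the denominator, so a threshold reported for one cannot be applied directly to the other.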
To date, several meta-analyses have demonstrated that ∆IVC showed favorable outcomes [12,13]. However, these evaluated data up to 2017, and many more studies have been published since then. Moreover, there has been no systematic review and meta-analysis regarding other large veins such as the IJV, SCV, or FV. To update the evidence on ∆IVC and explore its alternatives, we conducted a systematic review and meta-analysis for respiratory variation in the diameters of the IVC, IJV, SCV, and FV.

Published Study Search and Selection Criteria
This study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis of Diagnostic Test Accuracy (PRISMA-DTA) statement [14]. The preset protocol of this study was registered on PROSPERO (CRD42020206037, https://www.crd.york.ac.uk/prospero/, last accessed date: 21 October 2021). Relevant articles were obtained by searching PubMed, EMBASE, and Cochrane databases through 21 October 2021. These databases were searched using the following keywords: "((subclavian vein) OR (inferior vena cava) OR (internal jugular vein) OR (femoral vein)) AND ((fluid responsiveness) OR volume) AND (diameter OR collapsibility OR measurement) AND (ultrasonography OR ultrasound OR sonography OR sonographic OR (point of care))". We also manually searched the reference lists of relevant articles. The titles and abstracts of all retrieved articles were screened for exclusion. Review articles and previous meta-analyses were also screened to obtain additional eligible studies. The search results were then reviewed, and articles were included if the study investigated the diagnostic accuracy of the IVC, SCV, IJV, or FV to predict fluid responsiveness.
The inclusion criteria for diagnostic test accuracy (DTA) reviews were as follows: (1) the study population included patients who received fluid replacement due to sepsis, hypovolemia, or circulatory failure; (2) an ultrasonographic measurement of the respiratory variability of the IVC, SCV, IJV, or FV diameter was performed as the index test; (3) a test that enabled measurement of fluid responsiveness was performed as the reference standard; (4) the primary outcome of the study was the diagnostic accuracy of ultrasonographic respiratory variability of the diameter of the IVC, SCV, IJV, or FV to predict fluid responsiveness; (5) adequate information was provided to build a 2-by-2 contingency table consisting of true positive (TP), false positive (FP), false negative (FN), and true negative (TN) outcomes. Articles that involved other diseases, those that did not provide 2-by-2 contingency table information, non-original articles, non-human studies, pediatric studies, and those published in a language other than English were excluded.

Data Extraction
Data from all eligible studies were extracted by two investigators. Extracted data from each of the eligible studies included: the first author's name, year of publication, study location, study design and period, number of patients analyzed, measured vein, index test, threshold of index test, reference standard, device used for the reference standard, threshold of reference standard, and fluid responsiveness. The numbers of TP, FP, FN, and TN for the index test in predicting fluid responsiveness were collected. If an eligible study reported multiple thresholds and accuracies of the index test, we extracted the subset with the optimal threshold or highest performance.
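The extracted 2-by-2 counts can be represented as in the minimal sketch below, which also shows one way to back-calculate counts when a study reports only group sizes with sensitivity and specificity. The class name, the helper `from_summary`, and the example numbers are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # index test positive, fluid responder
    fp: int  # index test positive, non-responder
    fn: int  # index test negative, fluid responder
    tn: int  # index test negative, non-responder

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def from_summary(n_responders: int, n_non_responders: int,
                 sensitivity: float, specificity: float) -> TwoByTwo:
    """Back-calculate counts from group sizes and reported accuracy (rounded)."""
    tp = round(n_responders * sensitivity)
    tn = round(n_non_responders * specificity)
    return TwoByTwo(tp=tp, fp=n_non_responders - tn, fn=n_responders - tp, tn=tn)


# Hypothetical study: 20 responders, 30 non-responders.
table = from_summary(n_responders=20, n_non_responders=30,
                     sensitivity=0.75, specificity=0.80)
print(table)
print(table.sensitivity, table.specificity)
```

Because the counts are rounded to integers, the recomputed sensitivity and specificity may differ slightly from the reported values in real studies.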

Quality Assessment
All studies were independently reviewed by two investigators. Disagreements concerning the study selection and data extraction were resolved by consensus. As recommended by the Cochrane Collaboration, the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 tool was used to evaluate the risk of bias in DTA [15]. Disagreements in this regard were resolved by discussion with the third independent author. The QUADAS-2 assesses four domains for bias and applicability as follows: (1) patient selection; (2) index test; (3) reference standard; (4) flow and timing.

Statistical Analysis
For the meta-analysis, we used the "metandi" and "midas" modules of Stata version 17.0 (Stata Corporation, College Station, TX, USA) and the "mada" package of the R programming language, version 4.0.3 (R Foundation, Vienna, Austria). QUADAS-2 assessment was performed using Review Manager Software 5.4 (The Cochrane Collaboration, Oxford, Copenhagen, Denmark). We constructed a 2-by-2 contingency table (TP, FP, FN, TN) by calculating or extracting from each primary study. For rigorous statistical analysis and to account for heterogeneity across the studies, we used both the hierarchical summary receiver operating characteristics (HSROC) model [16] and the bivariate model [17]. A bivariate mixed-effects regression model for the synthesis of diagnostic test data and the derived logit estimates of sensitivity, specificity, and respective variances was used to construct a hierarchical summary ROC curve [17]. The HSROC model assumes that there is an underlying ROC curve in each study with parameters that characterize the accuracy and asymmetry of the curve [16]. An area under the ROC curve (AUROC) close to 1 indicated a strong test, and an AUROC close to 0.5 indicated a poor test. Results with p-values < 0.05 were considered statistically significant. To investigate the heterogeneity, I² was calculated as I² = 100% × (Q − df)/Q, where Q is Cochran's heterogeneity statistic and df is the degrees of freedom [18]. I² lies between 0% and 100%. A value of 0% indicates no observed heterogeneity, and values greater than 50% are considered to indicate substantial heterogeneity. To detect the threshold effect, Spearman's correlation coefficient between sensitivity and specificity was calculated after logit transformation. The HSROC shape (asymmetry) parameter was β (beta), where β = 0 corresponds to a symmetric ROC curve in which the diagnostic odds ratio does not vary along the curve [16].
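The heterogeneity formula above can be checked with a short calculation. The sketch below assumes the usual convention of truncating negative values at 0%, which is consistent with the stated range of 0% to 100%; the Q values are hypothetical.

```python
def i_squared(q: float, df: int) -> float:
    """I^2 = 100% * (Q - df) / Q, truncated at 0 when Q <= df."""
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


# Hypothetical Cochran's Q values for 30 studies (df = 29).
print(i_squared(q=58.0, df=29))  # 50.0
print(i_squared(q=20.0, df=29))  # 0.0
```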
Due to the trade-off between sensitivity and specificity, we used bivariate random-effects modeling of sensitivity and specificity, as we expected this pair of performance measures to be interdependent. We used the bivariate box plot, which describes the degree of interdependence, including the central location and identification of any outliers [19]. The inner oval represents the median distribution, while the outer oval represents the 95% confidence bound. The skewness provides indirect evidence of some threshold variability [19]. A multiple univariable bivariate meta-regression was conducted to investigate the possible sources of heterogeneity. Covariates were entered as mean-centered continuous or dichotomous (yes = 1, no = 0) fixed effects. Publication bias was first assessed visually using a scatter plot. We used the diagnostic log odds ratio (lnDOR), which should have a symmetrical funnel shape when publication bias is absent [20]. Formal testing for publication bias was conducted by the regression of lnDOR against the inverse of the square root of the effective sample size, with p < 0.05 for the slope coefficient indicating significant asymmetry [20].
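The asymmetry regression can be sketched as follows. This is a minimal illustration that assumes the usual formulation of the test, in which lnDOR is regressed on the inverse square root of the effective sample size (ESS) with ESS as the weight. It returns only the slope and intercept and does not compute the p-value; it is not the implementation used by the Stata modules, and the 2-by-2 tables are hypothetical.

```python
import math


def ln_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio; adds a continuity correction if any cell is zero."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + correction for c in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))


def effective_sample_size(tp, fp, fn, tn):
    """ESS = 4 * n1 * n2 / (n1 + n2); n1 = responders, n2 = non-responders."""
    n1, n2 = tp + fn, fp + tn
    return 4 * n1 * n2 / (n1 + n2)


def deeks_slope(tables):
    """Weighted least-squares fit of lnDOR on 1/sqrt(ESS), with weights = ESS."""
    w = [effective_sample_size(*t) for t in tables]
    x = [1 / math.sqrt(wi) for wi in w]
    y = [ln_dor(*t) for t in tables]
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Hypothetical (TP, FP, FN, TN) tables, not taken from the included studies.
tables = [(15, 6, 5, 24), (30, 10, 10, 50), (8, 2, 4, 16), (40, 12, 14, 60)]
slope, intercept = deeks_slope(tables)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope near zero means that lnDOR does not depend on study size, which is the pattern expected when publication bias is absent.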

Selection and Characteristics
A total of 1587 studies were identified through searching databases. After removing duplicates, 1136 studies were retrieved. We excluded 1044 studies through a title and abstract review because they were non-original (n = 236), studied other diseases (n = 575), were non-human studies (n = 11), or were written in a non-English language (n = 71). We reviewed 92 full-text articles. After the full-text review, 62 articles were excluded due to insufficient data (n = 36), lack of 2-by-2 data (n = 25), and not being original (n = 1). Finally, 30 studies comprising 1719 patients were included in this review (Figure 1); detailed information about the eligible studies is shown in Table 1. For the IJV [29,36,42], FV (no study was found), and SCV [41], we were not able to conduct the meta-analysis due to an insufficient number of studies. Two studies [36,42] reported on both the IVC and IJV. He et al. [45] reported on three subsets according to tidal volume (TV) (6 mL/kg, 9 mL/kg, 12 mL/kg), and the subset of 9 mL/kg TV showed the highest AUROC. Thus, we extracted the subset of 9 mL/kg TV. Three studies [39,40,48] reported on subsets of patients with standardized breathing and spontaneous breathing. We extracted the subsets of spontaneous breathing because other studies included only patients with spontaneous breathing. Corl et al. [44] reported results obtained by both experts and novices. We extracted the results of experts because other studies were conducted by experts. One study by Blavius [50] was a comparative study between artificial intelligence and humans. We extracted the result of the training dataset by humans because the test dataset was much smaller than the training dataset (20 vs. 175). Caplan et al. [48] reported different results according to the measuring site (1, 2, 3, and 5 cm from the aortocaval junction). We extracted the subset of 3 cm from the aortocaval junction because it was similar to other eligible studies.

Meta-Regression, Subgroup Analysis, and Evaluation of Heterogeneity
The univariable meta-regression and subgroup analysis using possible confounders are summarized in Table 3. We conducted the subgroup analysis according to possible confounders as follows: ΔIVC, IVC collapsibility index, reference test, ICU admission, sepsis, fluid infusion, mechanical ventilation, and the heterogeneity on a bivariate boxplot. In

Publication Bias
In Deeks' funnel plot using the diagnostic odds ratio, there was no asymmetry on visual inspection (Figure 5). There was also no statistically significant asymmetry (p = 0.66).

Quality Assessment
The details of the quality assessment are depicted in Figure 6. In terms of patient selection, the risk of bias was unclear in nine studies (30.0%) [21,22,33,35,40,41,45,48,50]. These studies either did not use consecutive patient selection or did not describe it. In the other domains of the QUADAS-2 assessment, all studies showed a low risk of bias.

Discussion
Our results suggest that the diagnostic accuracy of ultrasonographic ∆IVC for predicting fluid responsiveness is acceptable. The pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, and AUROC of ∆IVC were 0.75, 0.83, 4.37, 0.30, 14.3, and 0.86, respectively. In the subgroup analysis, there was no difference between patients on mechanical ventilation (MV) and those breathing spontaneously. Despite the systematic search, we found only three studies on the IJV and one on the SCV. We found no study on the FV. There was insufficient evidence to support the diametric measurement of these large veins as an alternative to that of the IVC. More prospective studies are warranted, which should consider the threshold of the index test and the heterogeneity of the reference standard.
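As a rough plausibility check on the pooled estimates, likelihood ratios and the diagnostic odds ratio can be derived directly from a sensitivity and specificity pair. The sketch below uses the rounded pooled values of 0.75 and 0.83; the function name is hypothetical. The results are close to, but not identical to, the reported 4.37, 0.30, and 14.3, presumably because the reported figures come from the fitted model with unrounded estimates rather than from this simple ratio.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-, DOR) from a single sensitivity/specificity pair."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


lr_pos, lr_neg, dor = likelihood_ratios(sensitivity=0.75, specificity=0.83)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
# LR+ = 4.41, LR- = 0.30, DOR = 14.6
```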
Several previous systematic reviews and meta-analyses have investigated the diagnostic accuracy of ∆IVC. Orso et al. [12], in a meta-analysis including 20 studies with ∆IVC, reported that the pooled sensitivity, specificity, and AUROC were 0.71, 0.75, and 0.71, respectively. They included several studies of pediatric patients, whereas we excluded these studies. Si et al. [13], in a meta-analysis including 12 studies comprising only patients on MV, reported a sensitivity, specificity, and AUROC of 0.73, 0.82, and 0.85, respectively. In our subgroup analysis, studies comprising patients on MV showed a sensitivity, specificity, and AUROC of 0.74, 0.85, and 0.87, respectively, and studies comprising patients with spontaneous breathing showed similar results, with a sensitivity, specificity, and AUROC of 0.75, 0.81, and 0.85, respectively. Si et al. [13] concluded through subgroup analysis (k = 6) that ∆IVC was a poor predictor in patients with TV < 8 mL/kg or positive end-expiratory pressure (PEEP) > 5 cm H2O (sensitivity, specificity, and AUROC of 0.66, 0.68, and 0.70, respectively). However, in our subgroup analysis (k = 5), ∆IVC in this setting showed better results, with a sensitivity, specificity, and AUROC of 0.73, 0.77, and 0.82, respectively. In our analysis, similar to that of Si et al. [13], the performance of ∆IVC was higher in patients with TV ≥ 8 mL/kg or PEEP ≤ 5 cm H2O (sensitivity, specificity, and AUROC of 0.74, 0.88, and 0.90, respectively), but the meta-regression test did not show a significant difference (p = 0.31). Overall, compared with previous meta-analyses [12,13], we updated the evidence with data from 11 studies published since 2018. However, two studies [51,52] in the previous meta-analysis by Orso et al. [12] and one study [53] in the meta-analysis by Si et al. [13] did not provide 2-by-2 contingency data in our recalculation. Thus, we excluded these three studies.
Only one previous meta-analysis investigated the IVC diameter, without a delineation of respiratory variation [54]. They analyzed two case-control and three before-and-after studies. They found a significantly lower diameter of the IVC in hypovolemic status and the mean difference was 6.3 mm (95% CI, 6.0-6.5). However, this effect size is apparently too small to use in clinical practice. Indeed, the inherent size of the IVC may vary in each patient. Similar static index tests, such as central venous pressure, showed no clinical significance in the previous study [55]. Since this study was published, there has been no meta-analysis investigating the IVC diameter alone. In common with ∆IVC, a more dynamic index would be appropriate for evaluating volume status.
Due to the limited number of studies that met our inclusion criteria, we did not conduct the meta-analysis for the IJV, SCV, and FV. We found only three studies that evaluated the IJV [29,36,42]. The specificity, sensitivity, and AUROC of these studies were sufficiently high for predicting fluid responsiveness. The AUROC of the IJV ranged from 0.825 to 0.915. The AUROC of the SCV was also sufficient, with a value of 0.970. Both the IJV and the SCV are located in proximity to the right atrium. Thus, these would be alternative vessels to investigate. However, the FV would be limited due to its distance from the right atrium. One eligible study in our meta-analysis reported a strong correlation between the IVC collapsibility index (IVC-CI) and IJV-CI (r = 0.976, n = 44) [36]. One study, which was excluded because there was no fluid challenge, reported a moderately strong correlation between IVC-CI and SCV-CI (r = 0.781, n = 34) [56]. In the case of the FV, one study was excluded because it reported only a modest correlation between IVC-CI and FV-CI (r = 0.642, 57) [57]. In future reviews, the IJV and SCV need to be further investigated.
In the eligible studies of our meta-analysis, several conventional reference standards were used after fluid loading to determine fluid responsiveness. An increase in CO or SV was considered a response to fluid replacement. Therefore, the accurate measurement of CO or SV is crucial. The most reliable method to measure CO or SV is the insertion of a Swan-Ganz catheter [4]. This involves an injection of ice-cold water into the right atrium through a pulmonary artery catheter and measurement of CO or SV using the temperature change [58]. It also measures SvO2 to reflect accurate, real-time changes in hemodynamics [59]. However, it is a difficult technique to perform in practice, especially if indicated often, and has limitations because it is invasive and even more difficult to perform in the presence of arrhythmias, pulmonary infarction, or catheter injury with vascular complications [60]. In our analysis, no study used a Swan-Ganz catheter as a reference standard. Another way to measure CO or SV is to analyze the arterial waveform. Since the SV is estimated from the area under the arterial pressure curve between the start of the pressure rise and the dicrotic notch, the SV can be calculated for every heartbeat [59,61]. Vigileo™ (Edwards Lifesciences, Irvine, CA, USA), MostCare™ (Vytech, Padova, Italy) using PRAM (Pressure Recording Analytical Method), and PiCCO® (Pulsion Medical Systems, Munich, Germany), which use blood pressure waveforms, were proposed as less invasive methods [60,61]. These involve the insertion of a central venous catheter and a relatively small-sized device, approximately 4-5 Fr, into the artery, and allow the monitoring of continuous values even when the patient is unstable. The arterial waveform analysis method is less invasive than the Swan-Ganz method. However, re-calibration is required every 6-12 h in the case of vascular elasticity, aortic insufficiency, or inaccurate arterial pressure waveforms [60,61].
The method of measuring CO or SV using echocardiography involves measuring the velocity-time integral using the diameter of the left ventricular outflow tract and Doppler ultrasound [62]. Echocardiography is useful because it can also help differentiate cardiac dysfunction from hypovolemia by measuring chamber size and cardiac function. However, unlike the Swan-Ganz catheter, it cannot detect continuous changes, and it should generally be performed by a highly experienced operator [63]. The bioimpedance method can measure the CO or SV only by direct contact [64]. The fluctuation of the volume of the body with pulsatile changes results in changes in electrical impedance, and the variation during the systolic period is measured, allowing the value of the CO or SV to be monitored [65]. However, reliability is limited in some critically ill patients, and appropriate improvements are likely to be necessary in future studies. Evidence for the superiority of any one of the above techniques over another is limited [10]. We assumed that these reference standards have similar diagnostic accuracy.
Our analysis has several limitations. First, all eligible studies were observational. Second, several eligible studies have an unclear risk of bias in terms of patient selection. Third, the threshold of the index test varied and there was considerable heterogeneity. To address this issue, we investigated the correlation between sensitivity and specificity to detect the threshold effect. Fourth, the reference standard was heterogeneous. We also conducted a meta-regression test to evaluate the heterogeneity. Fifth, both patients on MV and those breathing spontaneously were included, although the physiology of the two is opposite. We conducted a meta-regression, which showed no significant difference. Sixth, we did not find sufficient eligible studies involving the IJV and SCV. We found only three studies on the IJV. We did not conduct the meta-analysis due to statistical instability. We found no study that measured the respiratory variation of the FV diameter. Future studies are needed to investigate and correct the above deficiencies. Seventh, a "grey zone" may exist in discriminating the response to fluid resuscitation, even though the ∆IVC is an easy-to-determine quantitative variable. Thus, integrating an additional qualitative sonographic evaluation may be helpful in future studies [66]. Finally, we included only published original articles written in English. This would be expected to introduce publication bias; however, this was not noted in our analysis.

Conclusions
Our systematic review and meta-analysis suggest that the ultrasonographic measurement of the respiratory variation in the diameter of the IVC has a favorable diagnostic accuracy for predicting fluid responsiveness in critically ill patients. However, there is insufficient evidence to support the clinical application of the IJV, SCV, and FV diameters.