A Large-Scale Proteomics Resource of Circulating Extracellular Vesicles for Biomarker Discovery in Pancreatic Cancer

Pancreatic cancer has the worst prognosis of all common tumors. Earlier cancer diagnosis could increase survival rates and better assessment of metastatic disease could improve patient care. As such, there is an urgent need to develop biomarkers to diagnose this deadly malignancy earlier. Analyzing circulating extracellular vesicles (cEVs) using ‘liquid biopsies’ offers an attractive approach to diagnose and monitor disease status. However, it is important to differentiate EV-associated proteins enriched in patients with pancreatic ductal adenocarcinoma (PDAC) from those enriched in patients with benign pancreatic diseases such as chronic pancreatitis and intraductal papillary mucinous neoplasm (IPMN). To meet this need, we combined the novel EVtrap method for highly efficient isolation of EVs from plasma with proteomics analysis of samples from 124 individuals, including patients with PDAC, benign pancreatic diseases and controls. On average, 912 EV proteins were identified per 100 μL of plasma. EVs containing high levels of PDCD6IP, SERPINA12 and RUVBL2 were associated with PDAC compared to the benign diseases in both discovery and validation cohorts. EVs with PSMB4, RUVBL2 and ANKAR were associated with metastasis, and those with CRP, RALB and CD55 correlated with poor clinical prognosis. Finally, we validated a 7-EV protein PDAC signature against a background of benign pancreatic diseases that yielded an 89% prediction accuracy for the diagnosis of PDAC. To our knowledge, our study represents the largest proteomics profiling of circulating EVs ever conducted in pancreatic cancer and provides a valuable open-source atlas to the scientific community with a comprehensive catalogue of novel cEVs that may assist in the development of biomarkers and improve the outcomes of patients with PDAC.


INTRODUCTION
Pancreatic ductal adenocarcinoma (PDAC) has the worst prognosis of all common tumors, with a 5-year survival of 12% [1]. With rising incidence, it is expected that PDAC will become the second leading cause of cancer-related deaths by 2030 [2]. A critical factor for this dismal development is the late diagnosis, with less than 20% of patients presenting with a potentially resectable and curable tumor [3-5]. Earlier cancer diagnosis could increase the survival rates by an estimated 5-fold, and more reliable and real-time assessment of treatment effects in patients with cancer could improve quality of life and reduce healthcare costs [6,7]. Unfortunately, there are no credentialed serologic biomarkers with high enough performance to assist in the early detection of PDAC. The best-established biomarker for PDAC, carbohydrate antigen 19-9 (CA19-9), is fraught with poor sensitivity and specificity and is only used for monitoring disease on treatment or after surgical resection [8,9].

Extracellular vesicles (EVs), including exosomes and microvesicles, are nanosized particles released by most cell types and can be detected in the circulation [10]. EVs play important roles in the transmission of oncogenic and inflammatory signals [11] and in communication between cells and their microenvironment [12]. In addition, exoDNA, exoRNA and protein profiles highly reflect parental cells, therefore offering an attractive strategy for diagnosing cancers non-invasively by analyzing EVs in the circulation [11,13]. Previous studies employed EVs to discover biomarkers for PDAC [13-16]; however, those discovery proteomics experiments were carried out using cell lines or tumor tissue, which are not representative of the heterogeneity of human PDAC and are unable to recapitulate the systemic responses to cancer [14-16]. In addition, the EV biomarkers discovered in those studies have been compared only against healthy controls [14-16]. It is unclear how they would perform in subjects with underlying benign diseases of the pancreas, an assessment that is highly desirable from the clinical standpoint because many patients with PDAC have underlying chronic pancreatitis and cysts.

medRxiv preprint doi: https://doi.org/10.1101/2023.03.13.23287216; this version posted March 20, 2023. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

To meet this need, we conducted a large EV proteomics study from peripheral blood across a range of patients with pancreatic cancer, benign pancreatic diseases such as chronic pancreatitis and intraductal papillary mucinous neoplasm (IPMN), and healthy controls. Circulating EV (cEV) proteins detected included those involved in metabolism and immune regulation, in addition to proteins involved in protein binding, exocytosis, endocytosis and regulation of cellular protein localization that have been identified in previous studies [17,18]. Our study represents the largest EV proteomics profiling dataset of pancreatic cancer, therefore providing a resource to the research community to further the understanding of EV roles in pancreatic cancer biology and biomarker discovery. To demonstrate the utility of this dataset, we identified multiple candidate biomarkers for cancer diagnosis and validated several in an independent cohort of patients with pancreatic cancer. In addition, we identified a set of cEV proteins associated with metastasis and prognosis which could provide a valuable resource for future biomarker studies as well as understanding of systemic changes during pancreatic tumor progression.


Proteomics Characterization of Circulating EVs
In this study, we sought to identify proteins in extracellular vesicles in the blood that may be used as biomarkers for the diagnosis and prognosis of pancreatic cancer. With the approval of our institutional review board, we enrolled a total of 124 patients to the discovery cohort of this biomarker study (Methods and Supplementary [...] isolation, which are not scalable for large clinical studies. As described in recent reports, EVtrap is a magnetic bead-based isolation method that enables highly efficient capture of EVs from biofluids, confirmed by multiple common EV markers [19-23]. Over 95% recovery yield can be achieved by EVtrap with less contamination from soluble proteins, a significant improvement over current commercially available methods as well as ultracentrifugation [19,21,24].

Following EV isolation, samples were digested in-solution and analyzed by liquid chromatography-tandem mass spectrometry (nanoLC-MS/MS) on a high-resolution mass spectrometer (Q-Exactive HF-X). The workflow for cEV isolation and enrichment and subsequent cEV mass spectrometry analysis is illustrated in Figure 1.

First, to confirm that EVtrap can efficiently isolate extracellular vesicles from plasma, a test plasma sample was processed to remove platelets and other large particles and enriched for [...] Table 2). These results provided the confidence to proceed with the analysis of our discovery set of plasma from 124 subjects. In this cohort, we identified 1,708 unique proteins (Supplementary Table 3). The number of unique EV proteins detected per 100 µL of plasma sample varied from 817 to 1,128, with an average of 912 unique proteins per sample (Supplementary Figure 2B). We did not observe differences between non-tumor and tumor samples regarding the overall number of EV proteins identified. Within the PDAC group, we did not observe significant differences in the average number of EV proteins detected for different disease stages. Collectively, these data demonstrate high reproducibility of EV isolation and robust label-free MS quantification of cEVs.

Diseases of the Pancreas Express Distinct Circulating EV Proteome Compared to Controls
Next, we aimed to identify specific cEV proteins associated with clinical parameters with the potential to serve as diagnostic biomarkers. We first compared the proteomics profile of individuals with underlying pancreatic diseases (PDAC, chronic pancreatitis and IPMN) against healthy controls. We selected EV proteins expressed in at least 50% of subjects in the disease group with an at least 2-fold change in expression in either direction compared to controls and a p-value ≤0.01 after adjusting for multiple testing. A total of 207 proteins were identified that met the criteria, with the largest number of differentially expressed markers in PDAC (176), followed by chronic pancreatitis (55) and IPMN (3) (Supplementary Table 4). Principal component analysis (PCA) of these markers showed control samples as a tight cluster segregated away from PDAC samples but closer to IPMN and chronic pancreatitis patients (Figure 2A).
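The selection criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not name the multiple-testing correction, so the Benjamini-Hochberg procedure is assumed here, and the function names, field names and toy values are hypothetical rather than taken from the study.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_markers(rows, min_fraction=0.5, min_fold=2.0, max_adj_p=0.01):
    """Keep proteins detected often enough, changed enough, and significant.

    rows: dicts with keys name, detected_fraction, fold_change, p_value.
    The correction method (BH) is an assumption, not stated in the text.
    """
    adj = benjamini_hochberg([r["p_value"] for r in rows])
    selected = []
    for row, q in zip(rows, adj):
        log2fc = math.log2(row["fold_change"])
        if (row["detected_fraction"] >= min_fraction
                and abs(log2fc) >= math.log2(min_fold)  # up or down by >= 2-fold
                and q <= max_adj_p):
            selected.append(row["name"])
    return selected

# Hypothetical values for illustration only; not taken from the study.
toy = [
    {"name": "PROT_A", "detected_fraction": 0.9, "fold_change": 4.0, "p_value": 0.0001},
    {"name": "PROT_B", "detected_fraction": 0.8, "fold_change": 0.25, "p_value": 0.002},
    {"name": "PROT_C", "detected_fraction": 0.3, "fold_change": 8.0, "p_value": 0.0001},
    {"name": "PROT_D", "detected_fraction": 0.9, "fold_change": 1.2, "p_value": 0.5},
]
print(select_markers(toy))  # ['PROT_A', 'PROT_B']
```

Working on the log2 scale treats a 4-fold increase and a 4-fold decrease symmetrically, which is why PROT_B passes the filter while PROT_C fails only on detection frequency.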

Circulating EV Proteome Discriminates Pancreatic Cancer from Benign Pancreatic Diseases
To further assess the potential of cEV proteins for cancer detection, we compared proteomic profiles of cEVs between patients with PDAC and those with underlying benign diseases of the pancreas (chronic pancreatitis and IPMN). We identified 182 differentially expressed proteins in malignant cases (92 over-expressed and 90 with reduced expression) (Supplementary Table 5). Several of those markers had remarkable overexpression in PDAC (greater than 10-fold), including PDCD6IP, SERPINA12 and RUVBL2, among others, as shown in the volcano plot (Figure 2B). Unsupervised clustering showed a clear separation between PDAC and benign pancreatic diseases. Individuals with IPMN were more closely related to controls, whereas chronic pancreatitis cases were more related to PDAC (Figure 2C). In addition, the PDAC cohort was separated into two subgroups: the first, enriched for early-stage tumors and more closely related to the other pancreatic diseases (chronic pancreatitis and IPMN); the second, enriched for advanced and metastatic cases with expression profiles further apart from early-stage cancer and pancreatic diseases (Figure 2C). We further noticed that some proteins such as PDCD6IP, SERPINA12 and KRT20 showed statistically significant population-wise enrichment in pancreatic cancer compared to benign pancreatic diseases (Figure 2D). Together, these data indicate the existence of a set of EV markers that can separate controls, benign and malignant pancreatic diseases, as well as proteins that separate early versus late-stage PDAC, suggesting their potential to serve as diagnostic biomarkers.

Functional and Systems Biology of cEV Proteome
To gain molecular insight into the functions of the previously identified 182 proteins differentially expressed in pancreatic cancer as compared to benign pancreatic diseases, we conducted pathway analysis using the Gene Ontology (GO) and REACTOME databases [...] Table 5). We identified protein modules in protein localization, biomolecule binding/docking and peptidase activities among the changes enriched in PDAC compared to benign diseases (Supplementary Figure 3). Interestingly, KRT20 (keratin 20), a gastrointestinal epithelia-associated keratin, was increased in PDAC patient EVs, while keratins associated basal [...]

Circulating EV Proteomics Reveal Markers Associated with Metastasis and Worse Prognosis
We then investigated whether cEV proteins can assist in the distinction of early versus late-stage pancreatic cancer. We compared the cEV proteome profiles of individuals with metastatic cancer to those without metastasis and identified 85 proteins differentially expressed between the two groups (Supplementary Table 6). Supervised clustering between metastatic and non-metastatic diseases showed a clear separation with two distinct expression patterns (Figure 3A). The majority of non-metastatic cases represented stage 1 and 2 disease, with a minority representing locally advanced stage 3 disease. In particular, EV protein levels of PSMB4, RUVBL2 and ANKAR were increased (Figure 3B), whereas abundance levels of RAP2B, SERPINA12 and IGLV4-69 were decreased in the cEVs of patients with metastatic disease (Figure 3C). Together, these findings suggest the presence of a core set of cEV proteins with the potential to distinguish early versus metastatic pancreatic cancer.

We further analyzed whether the expression of certain cEV proteins had prognostic relevance in our cohort. We first classified individuals with PDAC as having low or high expression of any given marker based on each marker's first and third quartile. Survival was estimated by the Kaplan-Meier method. We identified that the cEV expression of RALB, CRP and CD55 had a significant correlation with worse survival, while PDCD6IP expression had a trend for improved outcomes (Figure 3D).

Signature for Pancreatic Cancer Diagnosis
Because pancreatic cancer is extremely heterogeneous, the chance of identifying a single biomarker with sufficient diagnostic performance is likely low. Instead, the identification of a panel of candidate markers may have enhanced diagnostic performance.

To identify a signature that shows the most discriminatory power between 'benign diseases' and 'PDAC,' we employed a binary classification approach using Support Vector Machines (SVM). Classification models built on a large number of proteins contain irrelevant markers that can reduce the predictive accuracy. Hence, we implemented a consensus feature selection method based on two algorithms: the first using the recursive feature elimination (RFE) algorithm (SVM-RFE) [26], and the second combining RFE with a non-parametric Wilcoxon rank test (sigFeature) [27]. The top 16 markers whose classification performance can be tested in the independent validation cohort were selected (Supplementary Table 7). A summary of the selection process is shown in Supplementary Figure 4. The classification performance of these 16 markers, individually and in all combinations, was tested using 80% training data and evaluated in the remaining 20% test data. The quality of training was assessed using five repetitions of 10-fold cross-validation. The optimal kernel parameters were estimated by tuning over a wide range of values. Receiver operating characteristic (ROC) analysis was used as the metric to assess the performance of the classifier model. We found a 7-EV protein signature comprising RUVBL2, PDCD6IP, ATP5F1, DLD, KRT20, CCT4 and SERPINA12 that gave 100% accuracy when tested in the discovery cohort (Supplementary Table 7). Recurrence of these putative markers in our dataset varied from 55% to 97%.
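To give a sense of the size of this search, the sketch below enumerates every non-empty combination of 16 markers and generates the index splits for an 80/20 hold-out followed by five repetitions of 10-fold cross-validation. It is a minimal standard-library illustration, not the authors' R implementation: the sample count of 100 is a hypothetical placeholder, the splits are not stratified by diagnosis, and the SVM fitting and kernel tuning steps are omitted.

```python
import random
from itertools import combinations

def all_marker_subsets(markers):
    """Yield every non-empty subset of the marker list."""
    for size in range(1, len(markers) + 1):
        yield from combinations(markers, size)

def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def repeated_kfold(indices, k=10, repeats=5, seed=0):
    """Yield (train, validation) index lists for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i, held_out in enumerate(folds):
            train = [s for j, fold in enumerate(folds) if j != i for s in fold]
            yield train, held_out

# 16 candidate markers give 2**16 - 1 non-empty combinations.
n_subsets = sum(1 for _ in all_marker_subsets(range(16)))
# 100 samples is a hypothetical cohort size used only for illustration.
train_idx, test_idx = train_test_split(100)
n_splits = sum(1 for _ in repeated_kfold(train_idx))
print(n_subsets, len(train_idx), len(test_idx), n_splits)  # 65535 80 20 50
```

With 65,535 candidate panels and 50 cross-validation fits per panel, an exhaustive evaluation is feasible for 16 markers but grows exponentially with each additional marker, which is one reason an upstream feature-selection step is useful.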

The model was further validated on an independent validation cohort whose proteome was obtained using an alternate technology, parallel reaction monitoring (PRM) mass spectrometry. The markers chosen for validation included the 16 markers selected for the SVM classification model and an additional 9 markers, giving the top 25 markers that are significantly differentially expressed in the discovery cohort with a fold change increase in PDAC ≥ 5.5 and p-value ≤0.01 (Methods, Supplementary Table 8). The independent validation cohort consisted of 36 new subjects (24 with PDAC, 6 with chronic pancreatitis, and 6 with IPMN) (Supplementary Table 9). A total of 10 proteins, including all 7 signature proteins, showed a significant difference (p < 0.05) in patients with PDAC as compared to benign pancreatic diseases (Figure 4A). The performance of individual validated markers according to the specific underlying disease in the validation cohort is presented in Supplementary Figure 5. The performance of the 7-EV protein signature was further tested using the SVM model in our independent validation cohort, yielding an 89% prediction accuracy (Figure 4B) [...]

Furthermore, we discovered several cEV proteins associated with metastatic disease and poor [...]

In this dataset, we identified several EV proteins as significantly associated with metastasis or survival. For instance, PSMB4 and RUVBL2 levels were increased in cEVs of patients with metastatic PDAC. Notably, PSMB4 (proteasome subunit beta type-4), a protein of the ubiquitin-proteasome degradation pathway, has been identified as the first proteasomal subunit with oncogenic properties and has been associated with poor prognosis in several tumors including melanoma, breast and ovarian cancers [30-33].

We also discovered RALB, CRP and CD55 expression on EVs to have a significant correlation with poor survival, while PDCD6IP expression was associated with improved outcomes. Interestingly, PDCD6IP (programmed cell death 6-interacting protein) was also identified as a PDAC-enriched protein in the tissue-based proteomics studies from Le Large et al. [34] and Hoshino et al. [18]. In line with our findings, tissue expression of PDCD6IP in liver metastasis of pancreatic cancer has been found to also correlate with improved prognosis in patients with PDAC in the study of Law et al. [35]. Collectively, these data suggest that some tissue-specific proteins can be detected in circulating EVs and their quantifiable levels in the blood may have the potential to serve as diagnostic or prognostic biomarkers in pancreatic cancer.

In our validation studies, all seven putative markers identified from the model were significantly enriched in the plasma of PDAC patients. Based on the top seven markers, we derived a 7-EV protein panel that yielded an 89% prediction accuracy for diagnosing pancreatic cancer. A recent modeling study showed that a new diagnostic assay for PDAC would have to perform with a minimum sensitivity of 88% and a specificity of 85% to reduce healthcare expenditure and prolong survival [6]. Serum CA19-9, the best-established blood test for PDAC, has a pooled sensitivity of 75.4% and a specificity of 77.6% [36]. It commonly rises late in the disease and may be elevated in nonmalignant conditions such as biliary obstruction and pancreatitis, making it unsuitable as a diagnostic biomarker for PDAC [37]. As such, our 7-EV protein signature with 89% prediction accuracy serves as a proof-of-concept and has the potential to facilitate the further development of biomarker tests for pancreatic cancer. We anticipate that for clinical application, an even higher diagnostic performance is needed. Future studies are warranted to [...]

[...] Table 9).

Plasma Sample Collection and Processing
All blood samples were collected and processed following the same standard operating procedure optimized for EV analysis, which included the following steps: (i) whole blood was collected into one 10 mL yellow-top tube containing acid citrate dextrose; (ii) blood was mixed by gently inverting the tube five times; (iii) vacutainer tubes were stored upright at room temperature (RT); (iv) samples were centrifuged at 1,300 × g for 15 min at RT; (v) plasma was removed from the top, carefully avoiding the cell pellet; (vi) repeat centrifugation of plasma at [...]

Preparation of EV samples
The isolated and dried EV samples were lysed to extract proteins using the phase-transfer surfactant (PTS) aided procedure. The proteins were reduced and alkylated by incubation in 10 mM TCEP and 40 mM CAA for 10 min at 95°C. The samples were diluted fivefold with 50 mM triethylammonium bicarbonate and digested with Lys-C (Wako) at a 1:100 (wt/wt) enzyme-to-protein ratio for 3 h at 37°C. Trypsin was added to a final 1:50 (wt/wt) enzyme-to-protein ratio for overnight digestion at 37°C. To remove the PTS surfactants from the samples, the samples were acidified with trifluoroacetic acid (TFA) to a final concentration of 1% TFA, and ethyl acetate solution was added at a 1:1 ratio. The mixture was vortexed for 2 min and then centrifuged at 16,000 × g for 2 min to obtain aqueous and organic phases. The organic phase (top layer) was removed, and the aqueous phase was collected. This step was repeated once more. The samples were dried in a vacuum centrifuge and desalted using Top-Tip C18 tips (Glygen) according to the manufacturer's instructions. The samples were dried completely in a vacuum centrifuge and stored at -80°C.

LC−MS Analysis of Plasma EV Proteome
Approximately 1 μg of each dried peptide sample was dissolved in 10.5 μL of 0.05% trifluoroacetic acid with 3% (vol/vol) acetonitrile containing a spiked-in indexed Retention Time (iRT) standard of 11 artificial synthetic peptides (Biognosys). The spiked-in 11-peptide standard mixture was used to account for any variation in retention times and to normalize abundance levels among samples. 10 μL of each sample was injected into an Ultimate 3000 nano UHPLC system (Thermo Fisher Scientific). Peptides were captured on a 2-cm Acclaim PepMap trap column and separated on a heated 50-cm Acclaim PepMap column (Thermo Fisher Scientific) containing C18 resin. The mobile phase buffer consisted of 0.1% formic acid in ultrapure water (buffer A) with an eluting buffer of 0.1% formic acid in 80% (vol/vol) acetonitrile (buffer B) run with a linear 60-min gradient of 6-30% buffer B at a flow rate of 300 nL/min. The UHPLC was coupled online with a Q-Exactive HF-X mass spectrometer (Thermo Fisher Scientific). The mass spectrometer was operated in the data-dependent mode, in which a full-scan MS (from m/z 375 to 1,500 with the resolution of 60,000) was followed by MS/MS of [...]

All data were quantified using the label-free quantitation node of Precursor Ions Quantifier through Proteome Discoverer v2.3 (Thermo Fisher Scientific). For the quantification of proteomic data, the intensities of peptides were extracted with the initial precursor mass tolerance set at 10 ppm, a minimum number of isotope peaks of 2, a maximum ΔRT of isotope pattern multiplets of 0.2 min, a PSM confidence FDR of 0.01, ANOVA as the hypothesis test, a maximum RT shift of 5 min, pairwise ratio-based ratio calculation, and 100 as the maximum allowed fold change.
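As a worked illustration of the 10 ppm precursor mass tolerance mentioned above, the following sketch converts a ppm tolerance into an absolute m/z window and checks whether an observed value falls inside it. The function names and the example m/z values are hypothetical and are not part of the Proteome Discoverer workflow.

```python
def ppm_window(mz, tolerance_ppm=10.0):
    """Return the (low, high) m/z bounds for a given ppm tolerance."""
    delta = mz * tolerance_ppm / 1e6
    return mz - delta, mz + delta

def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm=10.0):
    """True if the observed m/z lies within the ppm window of the theoretical m/z."""
    error_ppm = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return abs(error_ppm) <= tolerance_ppm

# Hypothetical precursor at m/z 1000: 10 ppm corresponds to +/- 0.01 m/z.
low, high = ppm_window(1000.0)
print(round(low, 4), round(high, 4))      # 999.99 1000.01
print(within_tolerance(1000.005, 1000.0))  # True  (about 5 ppm off)
print(within_tolerance(1000.02, 1000.0))   # False (about 20 ppm off)
```

Because the tolerance is relative, the absolute window scales with m/z: the same 10 ppm setting allows roughly ±0.004 at m/z 375 and ±0.015 at m/z 1,500, the two ends of the full-scan range.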
The abundance levels of all peptides and proteins were normalized to the spiked-in internal iRT standard. For calculations of fold-change between the groups of proteins, total protein abundance values were added together, and the ratios of these sums were used to compare proteins within different samples. The abundances of EV proteins were normalized using indexed retention time (iRT) [...]
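The normalization and fold-change calculation described in this section can be sketched as follows. This is a simplified illustration under assumptions: the text does not state how the 11 iRT peptide intensities are combined into one scaling factor, so a single summary intensity per sample is assumed, and all names and numbers are hypothetical.

```python
def normalize_to_standard(abundances, standard_intensity, reference_intensity):
    """Scale one sample's abundances so its spiked-in standard matches a reference level."""
    if standard_intensity <= 0:
        raise ValueError("standard intensity must be positive")
    factor = reference_intensity / standard_intensity
    return {protein: value * factor for protein, value in abundances.items()}

def fold_change_of_sums(group_a, group_b, protein):
    """Ratio of the summed abundance of one protein between two groups of samples."""
    total_a = sum(sample.get(protein, 0.0) for sample in group_a)
    total_b = sum(sample.get(protein, 0.0) for sample in group_b)
    if total_b == 0:
        raise ValueError("protein not detected in the comparison group")
    return total_a / total_b

# Hypothetical intensities; the first PDAC sample has twice the standard signal,
# so its protein abundance is halved before groups are compared.
pdac = [
    normalize_to_standard({"PROT_A": 80.0}, standard_intensity=2.0, reference_intensity=1.0),
    normalize_to_standard({"PROT_A": 60.0}, standard_intensity=1.0, reference_intensity=1.0),
]
benign = [
    normalize_to_standard({"PROT_A": 10.0}, standard_intensity=1.0, reference_intensity=1.0),
    normalize_to_standard({"PROT_A": 15.0}, standard_intensity=1.0, reference_intensity=1.0),
]
fold = fold_change_of_sums(pdac, benign, "PROT_A")
print(fold)  # 4.0
```

A ratio of sums weights each sample by its abundance, so it can differ from the mean of per-sample ratios when group sizes or intensities are unbalanced.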

Pathways Enrichment and Protein Network Analysis
Pathway enrichment analysis was performed on statistically significant genes using g:Profiler [40], a web-based tool that searches for pathways whose genes are significantly enriched in our dataset compared to a collection of genes representing Gene Ontology (GO) terms and Reactome pathways. We further used EnrichmentMap [41], a Cytoscape v3.8.2 [42] application, to create a visual network of connected pathways that helps to identify relevant pathways and themes [43]. A protein-protein interaction network was generated using stringApp, a Cytoscape app. This application allows STRING networks to be imported into Cytoscape and enables users to perform [...]

[...] 200 ms maximum injection time) was run as triggered by a scheduled inclusion list. Higher- [...]

Identification of EV signature for pancreatic cancer diagnosis
All algorithms for identifying the EV signature predictive of pancreatic cancer diagnosis were implemented in R. We used the Support Vector Machine (SVM) implementation in the CRAN package e1071 [46]. Ranking of genes was achieved using the packages 'sigFeature' and 'SVM-RFE'. The R package 'pROC' [47] was used to build a receiver operating characteristic (ROC) curve and to calculate the area under the curve (AUC).
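For readers unfamiliar with the AUC, the sketch below computes it directly from its pairwise definition: the fraction of (positive, negative) sample pairs in which the positive sample receives the higher classifier score, with ties counted as one half. It is a plain-Python illustration of the quantity that pROC reports, not a reimplementation of that package, and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores: 1 = PDAC, 0 = benign disease.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889, i.e. 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation; unlike accuracy, it does not depend on choosing a single decision threshold.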

Survival Analysis
The prognostic value of every protein was estimated by dividing patients into two groups: group 1, patients with expression below the 25th percentile, and group 2, patients with expression values greater than the 75th percentile. The Kaplan-Meier estimator was used to estimate the survival function associating survival with EV protein expression, and the log-rank test was used to compare the survival curves of the two groups. The 'survival' R package was used for the analysis.
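The two steps described above, splitting patients at the first and third quartiles and estimating survival with the Kaplan-Meier product-limit formula, can be sketched as below. This is an illustrative standard-library version rather than the 'survival' R package: the quartile convention of Python's `statistics.quantiles` may differ from the one used in the study, the log-rank test is omitted, and the patient identifiers, expression values and follow-up times are hypothetical.

```python
import statistics

def quartile_groups(expression):
    """Split patient ids into low (< Q1) and high (> Q3) expression groups."""
    q1, _, q3 = statistics.quantiles(list(expression.values()), n=4)
    low = sorted(pid for pid, value in expression.items() if value < q1)
    high = sorted(pid for pid, value in expression.items() if value > q3)
    return low, high

def kaplan_meier(times, events):
    """Return [(time, survival)] at each event time; events: 1 = death, 0 = censored."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        removed = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical expression values for eight patients.
expression = {f"p{i}": float(i) for i in range(1, 9)}
low, high = quartile_groups(expression)
print(low, high)  # ['p1', 'p2'] ['p7', 'p8']

# Hypothetical follow-up in months; 0 marks a censored observation.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])  # [(5, 0.8), (8, 0.6), (12, 0.3)]
```

Comparing only the outer quartiles discards the middle half of the cohort, which sharpens the contrast between groups at the cost of statistical power.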

Statistical Analysis
All statistical analyses were performed using the statistical software R. Statistical significance was calculated by two-tailed Student's t-test or Wilcoxon rank-sum test unless specified otherwise in the figure legend. Data are expressed as mean ± SEM. A p-value < 0.05 in biological experiments or FDR < 0.05 after multiple comparison corrections in proteomics data analysis was considered statistically significant.

We thank the patients and their families for their participation in this study. B.B. was supported in part through UM1 (CA186709-06). S.D.F. was supported in part through the Barbara Janson and [...]

[...] and Fibrogen outside the submitted work; patent for Method for ACT in PDAC licensed to Peaches SL; and Director of BMS. S.K.M. owns stocks and is a member of the Scientific Advisory Board of KAHR Medical. The remaining authors declare no related competing interests.


Diagnosis.

Metastasis (M)