ethnicity: White
host gender: Female
host age (yr): 49
pathogen: Escherichia coli
anatomic site of infection: Urinary tract
experimental batch: 2034
Extracted molecule
total RNA
Extraction protocol
Total RNA was extracted from human blood using the PAXgene Blood RNA Kit (Qiagen, Valencia, CA) following the manufacturer’s recommended protocol, including DNase treatment. Following isolation, RNA quantity was determined using the Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA). A set of four peptide nucleic acid (PNA) oligomers (Applied Biosystems, Foster City, CA) with sequences complementary to globin mRNA was added to 2.5 µg of total RNA to reduce reverse transcription of globin mRNA. The RNA was then converted into cDNA using Reverse Transcriptase (Invitrogen) and a modified oligo(dT)24 primer containing T7 promoter sequences (GenSet). After first-strand synthesis, residual RNA was degraded by the addition of RNaseH, and double-stranded cDNA was generated using DNA Polymerase I and DNA Ligase. The cDNA was then purified and concentrated by phenol:chloroform extraction followed by ethanol precipitation.
Label
Biotin
Label protocol
The cDNA products were incubated with T7 RNA Polymerase and biotinylated ribonucleotides using an In Vitro Transcription kit (Affymetrix). The resultant cRNA product was purified using an RNeasy column (Qiagen) and quantified with a spectrophotometer. The cRNA target (20 µg) was incubated at 94°C for 35 minutes in fragmentation buffer (Tris, MgOAc, KOAc). The fragmented cRNA was diluted in hybridization buffer (MES, NaCl, EDTA, Tween 20, Herring Sperm DNA, Acetylated BSA) containing biotin-labeled OligoB2 and Eukaryotic Hybridization Controls (Affymetrix).
Hybridization protocol
Target was prepared and hybridized according to the "Affymetrix Technical Manual". The hybridization cocktail was denatured at 99°C for 5 minutes, incubated at 45°C for 5 minutes, and then injected into a GeneChip cartridge. The GeneChip array was incubated at 42°C for at least 16 hours in a rotating oven at 60 rpm. GeneChips were washed with a series of nonstringent (25°C) and stringent (50°C) solutions containing variable amounts of MES, Tween 20, and SSPE. The microarrays were then stained with Streptavidin Phycoerythrin, and the fluorescent signal was amplified using a biotinylated antibody solution.
Scan protocol
Fluorescent images were acquired with a GeneChip® Scanner 3000, and expression data were extracted using the GeneChip Operating System v 1.1 (Affymetrix). All GeneChips were scaled to a median intensity setting of 500.
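The scaling step above can be illustrated with a minimal sketch. It assumes simple linear rescaling of one array so that its median intensity equals the stated target of 500; this is an illustration of the idea only, not the algorithm used by the GeneChip Operating System. The function name `scale_to_median` and the raw intensities are hypothetical.

```python
from statistics import median

TARGET_MEDIAN = 500.0  # target intensity stated in the scan protocol


def scale_to_median(intensities, target=TARGET_MEDIAN):
    """Linearly rescale one array so that its median intensity equals `target`.

    Returns the scaled values and the scaling factor that was applied.
    """
    current = median(intensities)
    if current <= 0:
        raise ValueError("median intensity must be positive to scale")
    factor = target / current
    return [value * factor for value in intensities], factor


# Hypothetical raw intensities for a single array (made-up numbers).
raw = [120.0, 250.0, 400.0, 800.0, 1600.0]
scaled, factor = scale_to_median(raw)
```

Applying the same target to every array puts the arrays on a common intensity scale before any downstream normalization.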
Description
Enrollment protocol: Subjects were enrolled at Duke University Medical Center (DUMC; Durham, NC) as part of a prospective, NIH-sponsored study to develop novel diagnostic tests for severe sepsis and community-acquired pneumonia (ClinicalTrials.gov NCT00258869). Enrolled patients had a known or suspected infection and exhibited two or more Systemic Inflammatory Response Syndrome criteria. Patients were excluded if they had an imminently terminal co-morbid condition, had advanced AIDS (CD4 count < 50), were receiving antibiotics prior to enrollment, or were enrolled in another clinical trial. Subjects in the current report had culture-confirmed monomicrobial bloodstream infection (BSI) due to S. aureus (n=26; median age 55 years; range 40-91) or E. coli (n=14; median age 51.5 years; range 25-91). Uninfected controls (n=44; median age 26 years; range 20-59) were enrolled at DUMC as part of a study investigating the effect of aspirin on platelet function among healthy individuals.
Data processing
Data were processed using the Robust Multichip Average (RMA) method as implemented in Affymetrix Expression Console software. Microarray data were analyzed in two steps. First, a Bayesian sparse factor model was fit to the expression data without regard to phenotype. Second, the resulting factors were used as independent variables to build a penalized binary regression model with variable selection, trained to identify S. aureus infection. This approach also allows for model averaging, which properly accounts for uncertainty in the choice of predictors and typically outperforms the single best model in predictive accuracy. Genes were selected for analysis by non-specific filtering for high mean expression and high variance across samples. Samples with a high number of outlying genes were removed during the factor analysis. Mice were batched into discrete experiments, with each experiment containing the relevant controls to avoid confounding. Using the same murine experimental data, another classifier was derived to distinguish methicillin-resistant from methicillin-sensitive S. aureus infection. The methodology was otherwise the same as that described above. A factor model was fit to the human data independently of the mouse data. It was fit to 8,892 genes after non-specific filtering to remove unexpressed and uniformly expressed genes. The factor model was trained on the 95 samples from three batches of expression data, which resulted in 77 factors. These 77 factors were then projected onto the full data set with the goal of distinguishing S. aureus BSI from healthy controls or E. coli BSI. Leave-one-out cross-validation was used to control for overfitting of the penalized binary regression model. To minimize issues with overfitting, batch was not included in the regression models.
Matlab scripts to perform these operations are available.
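For readers without access to those scripts, the following is a minimal sketch of two of the steps described above: non-specific gene filtering and leave-one-out cross-validation of a penalized binary regression. It makes several simplifying assumptions. The Bayesian sparse factor model, variable selection, and model averaging are omitted, and a plain L2-penalized (ridge) logistic regression fit by gradient descent is swapped in as the classifier. All function names, cutoffs, and the toy expression values are hypothetical.

```python
import math
from statistics import mean, pvariance


def nonspecific_filter(expr, min_mean, min_var):
    """Return indices of genes with high mean and high variance across samples.

    `expr` is a list of samples, each a list of per-gene expression values.
    Phenotype labels are never consulted, which makes the filter non-specific.
    """
    keep = []
    for g, column in enumerate(zip(*expr)):
        if mean(column) >= min_mean and pvariance(column) >= min_var:
            keep.append(g)
    return keep


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_ridge_logistic(X, y, penalty=0.1, lr=0.1, steps=500):
    """Fit L2-penalized logistic regression by batch gradient descent.

    Features are centered on the training-fold means; the intercept is not
    penalized. Returns (weights, intercept, training means).
    """
    n, p = len(X), len(X[0])
    means = [mean(col) for col in zip(*X)]
    Xc = [[x - m for x, m in zip(row, means)] for row in X]
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(Xc, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gj / n + penalty * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b, means


def predict_prob(model, x):
    w, b, means = model
    return sigmoid(b + sum(wj * (xj - mj) for wj, xj, mj in zip(w, x, means)))


def leave_one_out_accuracy(X, y, **fit_kwargs):
    """Hold out each sample once, refit on the rest, and score the held-out call."""
    correct = 0
    for i in range(len(X)):
        model = fit_ridge_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:], **fit_kwargs)
        correct += int((predict_prob(model, X[i]) >= 0.5) == bool(y[i]))
    return correct / len(X)


# Toy data (made up): gene 0 is informative, gene 1 is barely expressed,
# gene 2 is uniformly expressed across samples.
expr = [
    [2.0, 0.1, 5.0], [2.5, 0.2, 5.0], [3.0, 0.1, 5.0],
    [7.0, 0.2, 5.0], [7.5, 0.1, 5.0], [8.0, 0.2, 5.0],
]
labels = [0, 0, 0, 1, 1, 1]  # 1 = S. aureus, 0 = comparison group

keep = nonspecific_filter(expr, min_mean=1.0, min_var=0.5)
X = [[sample[g] for g in keep] for sample in expr]
accuracy = leave_one_out_accuracy(X, labels)
```

Because the filter never looks at the labels, it can be run once on all samples without biasing the cross-validated estimate. The classifier, by contrast, is refit from scratch inside each leave-one-out fold.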