NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Research Council (US) Committee on Scientific Milestones for the Development of a Gene Sequence-Based Classification System for the Oversight of Select Agents. Sequence-Based Classification of Select Agents: A Brighter Line. Washington (DC): National Academies Press (US); 2010.


Sequence-Based Classification of Select Agents: A Brighter Line.


Appendix I. Botulinum Neurotoxin, B. anthracis, and Variola Virus

This appendix describes the complexity of the pathogenic mechanisms of the select agents botulinum neurotoxin, Bacillus anthracis, variola virus, filoviruses, and coronaviruses. It discusses the general mechanisms for acquisition of pathogenicity and the important role of the host in the calibration of the pathogenic potential of a microorganism.

Botulinum neurotoxin (BoNT), B. anthracis, and variola virus have a majority of the attributes of pathogenicity that are considered important for inclusion on the Select Agents list. The toxin and both pathogens were featured in at least one nation-state bioweapons program in the last century, are associated with high case-fatality rates, and are relatively easy to produce or grow. However, there are important differences among them. BoNT is easy to isolate from Clostridium botulinum, is the most poisonous substance known, but is difficult to disseminate. B. anthracis is found in nature in the United States, is extremely stable in the environment, and is also poorly transmissible from human to human. Variola virus no longer circulates in the human population, is less stable in the environment, and is transmitted efficiently from human to human in large-droplet aerosols. Both B. anthracis and variola virus encode a large number of virulence genes that are responsible for their capacity to infect and sometimes kill humans.


The presence of a toxin gene may contribute to the pathogenicity of an organism. In some cases, the acquisition of a toxin genetic element may transform a relatively benign bacterium into a potent threat, as in the case of Clostridium botulinum. Although small-molecule toxins are found in shellfish or fungi and peptide toxins are found in venom from animal species, we will focus on the larger-protein toxins produced by bacteria and some plants. For the purpose of this appendix, we define a toxin as any protein that has a deleterious effect on health and fitness when ingested.

Seven immunologically distinct and extremely potent protein neurotoxins are produced by C. botulinum: A, B, C, D, E, F, and G. Types A, B, and E are most frequently associated with human disease, and types F and G are less often reported. Types C and D are associated with disease in fowl. The BoNTs are all expressed as single polypeptides that are posttranslationally proteolyzed to give a heavy chain and a light chain that are linked by a disulfide bond. The heavy chain is responsible for toxin adherence to the cell surface and translocation of the light chain into the cytosol. The light chain contains the zinc protease active site that is responsible for cleavage of the soluble N-ethylmaleimide-sensitive-factor attachment protein receptor (SNARE). BoNTs bind ganglioside receptors on the neuronal cell surface in the presynaptic terminals and, by their action as metalloendoproteases, selectively cleave proteins involved in the neuroexocytosis apparatus; this results in inhibition of acetylcholine release at the myoneural junctions and later flaccid paralysis that can lead to respiratory arrest.

The enzymatic active site of these toxins has been well characterized. A crystal structure of BoNT/A demonstrated that zinc is coordinated by the HisGluXXHis motif in the active site alpha helix through His-222 and His-226 and by Glu-261 (Lacy, Tepp et al. 1998). Site-directed mutagenesis experiments demonstrated the importance of these residues for toxin activity (Fujii, Kimura et al. 1992). Because the zinc binding site of this class of toxins is so well conserved (Fujii, Kimura et al. 1992), it would seem possible to predict from the gene sequence whether a bacterium is able to produce active toxin. However, data that argue against the use of BoNT sequence to predict activity come from Agarwal et al. (Agarwal, Binz et al. 2005), who determined the structure of BoNT/E light chain with single amino acid substitutions outside of the canonical active site in regions adjacent to the catalytic residues. They found that the relatively minor change of glutamate to glutamine at position 335 rendered the enzyme unable to bind zinc. Other mutations in these regions also had large effects on the activity of the toxin. These findings serve to emphasize that biological toxin activity depends not only on the primary sequence but also on the three-dimensional structure of the folded protein.
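The limitation described above can be made concrete with a minimal sketch of a motif scan. It assumes the HisGluXXHis motif is written as the simple pattern H-E-x-x-H over one-letter amino acid codes. The function name `find_hexxh` and the toy sequences are hypothetical and are not drawn from any real toxin. A hit reports only that the motif is present; as the BoNT/E result shows, a substitution outside the motif can still abolish zinc binding, so a scan of this kind cannot establish activity.

```python
import re

# H-E-x-x-H: His, Glu, any two residues, His (one-letter amino acid codes).
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(protein: str) -> list[int]:
    """Return the 1-based start position of every HExxH occurrence."""
    seq = protein.upper()
    return [m.start() + 1 for m in HEXXH.finditer(seq)]


# Made-up sequences for illustration only (assumption: not real proteins).
with_motif = "MKTAHELIHVGGSAQ"
without_motif = "MKTAHQLIHVGG"

print(find_hexxh(with_motif))
print(find_hexxh(without_motif))
```

A positive result here is at most a necessary condition for a zinc metalloprotease active site; it says nothing about folding or about residues adjacent to the catalytic site.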



Bacillus anthracis is a gram-positive, spore-forming rod that is an etiological agent of pulmonary, cutaneous, and gastrointestinal anthrax. In cases of pulmonary anthrax, spores are inhaled into the airway and taken up by macrophages. The generally accepted model of pathogenesis is as follows: Germination of the spores begins to occur in alveolar macrophages and continues as the macrophages migrate to the mediastinal lymph nodes, where vegetative growth of the bacilli occurs. In the final stages of pulmonary anthrax, the bacilli disseminate throughout the body and cause death of the host. In cutaneous anthrax, spores penetrate the skin through a wound. The spores germinate into vegetative bacilli and proliferate locally to cause formation of an eschar. In some patients, the bacilli can migrate into and multiply in the bloodstream, spread to various organs, and, if left untreated, cause death of the host. Antibiotic treatment initiated before dissemination of the bacteria into the bloodstream during either type of infection dramatically reduces the onset of multiorgan failure and death. However, because the symptoms of early pulmonary anthrax mimic those of other infections, the disease is often not diagnosed until the patient is unable to recover, even with broad-spectrum intravenous antibiotic therapy. The bimodal life cycle of B. anthracis contributes extensively to the pathogenesis of disease. The organism survives outside the mammalian host as a spore that is resistant to heat, chemicals, desiccation, and so on. B. anthracis spores are coated with at least 34 chromosomally encoded proteins, many of which are immunogenic and protect the bacterium. The genes encoding the components of the spore are regulated by a complex temporal network of sporulation-specific sigma factors. Germination into vegetative bacilli occurs rapidly in the host in response to as yet unidentified signals.

Virulence Genes

Vegetative bacilli produce the two toxins edema factor (EF) and lethal factor (LF). The genes encoding the components of EF (pagA and cya) and LF (pagA and lef) are found on the 182-kb pXO1 plasmid. In addition, the bacilli are encapsulated by a poly-D-γ-glutamic acid capsule that is produced by enzymes encoded by genes (capBCAD) on the pXO2 virulence plasmid. Both plasmids are required for full virulence of B. anthracis strains. The genes encoding both the toxins and the capsule are activated by the master regulator of virulence AtxA, the gene for which is also on pXO1. The toxin genes are directly activated by AtxA, which also activates acpA and acpB, pXO2-encoded genes whose products activate the capsule operon. AtxA also exerts an effect on chromosomal genes that encode surface components, including the surface-layer proteins Sap and EA (sap and eag, respectively), by activation of the pXO1-encoded pagR; PagR represses sap and activates eag. Thus, at a minimum, a fully virulent strain of B. anthracis requires the genes necessary for spore, toxin, and capsule formation and a large array of regulatory factors and other genes that are chromosomally encoded. However, given the conservation of the B. anthracis genome over decades, if an isolate was identified as B. anthracis by standard microbiologic means and if orthologous toxin and poly-D-glutamic acid capsular gene sequences were present, a presumption of virulence could be made. Proof of that assumption would require assessment of the relative lethality of spores prepared from the isolate in an animal model—mice, rabbits (better), or non-human primates (best).



There are three distinct clades of variola virus that coincide roughly with low, intermediate, and high case-fatality rate. Clade A was associated with an intermediate (8–12 percent) case-fatality rate. Clade B (variola minor) was associated with a low (below 1 percent) case-fatality rate and caused a disease referred to as amaas in Africa or alastrim in the Americas. Clade C (variola major) was associated with a high (16–30 percent) case-fatality rate and caused the disease classic or ordinary smallpox. Clinically, smallpox in an unvaccinated person has an incubation period of 7–19 days from the time of infection of the upper respiratory tract until the first symptoms of fever, malaise, headache, and backache occur. The characteristic rash then follows. The rash starts with papules, which sequentially transform into vesicles and then pustules; most of these lesions are on the head and limbs (often confluent) rather than on the trunk (centrifugal pattern). Lesions are 0.5–1 cm in diameter and can spread over the entire body. Once pustules have dried, scabs form and eventually desquamate during the next 2–3 weeks. The resulting feature of the cutaneous lesions is the formation of the classic pock scar that is apparent on the skin of surviving patients.

Virulence Genes

The protein-coding regions of the variola virus genome are 96 percent identical at the nucleotide level to those of other orthopoxviruses. The majority of the sequence diversity occurs in the flanking regions of the genomes, which contain the virulence genes that target host apoptosis and the innate/immune response functions. Each poxvirus has 90 core genes and a unique complement of virulence genes, the latter of which determine in large part the unique biology of each orthopoxvirus species. These virus-specific features include the reservoir and incidental hosts and cell/tissue tropism. Variola virus encodes 71 virulence genes; for 41 of them something is known about function or location in the virion or infected cell, and there is no experimental information on the remaining 30 genes. The putative roles of virulence genes in the natural life cycle of variola virus have been determined by the study of orthologous genes in other orthopoxvirus-animal models (such as vaccinia virus-mouse, ectromelia virus-mouse, and myxoma virus-rabbit). A number of the virulence genes encode cytokine-binding proteins, which can have varied specificity against ligands from different animal species. For example, the variola virus IL-18-binding protein has a higher affinity for mouse than for human IL-18; this suggests that adaptation to the human host does not always require or result in optimal specificity for the human ligand. Thus, the experimentally determined specificity of a gene product for host ligands may or may not support an inference of host biology. Some virulence genes target the same pathways. For example, NFκB is a key factor for transcription of host genes that mediate innate and immune responses and is targeted by a number of pathogens, including poxviruses (see Box I-1). Variola virus has at least five virulence genes thought to target that pathway: One gene product acts extracellularly by binding to IL-18, and the other four act intracellularly against signaling pathways or the NFκB complex.
The importance of a gene for virulence is defined not only in the context of the host in which the virus replicates but also by the route of infection. For example, in the case of vaccinia virus infections of the mouse, 50 percent of 16 individual gene deletion or insertion mutants showed a phenotype distinct from that of controls in intranasal or intradermal infections but not both (Tscharke, Reading et al. 2002).

BOX I-1 Variola Virus Virulence Genes

Even with the same complement of virulence genes, the case-fatality rate of variola virus isolates appears to vary, presumably because of subtle functional differences in individual genes or groups of genes, which are not understandable from sequence analysis. For example, amino acid differences in the coding-region sequences revealed that a consensus of 67 open reading frames (ORFs) distinguished clade A strains (middle-range case-fatality rates) from clade B strains (low case-fatality rates), and 15 ORFs distinguished the middle-range from the low case-fatality rate groups of clade C virus strains from Africa (Esposito, Sammons et al. 2006).

Variola virus is one of the few Select Agents that transmit efficiently from person to person during disease. That process does not occur at the primary site of infection; rather, it depends on a large number of virus-replication cycles in different cell types and tissues in the face of a rapidly activated, innate or adaptive immune response. High transmissibility depends on effective systemic virus spread and later establishment of sufficient foci of infection in the oropharyngeal mucosa to produce virus concentrations in the respiratory gases that are high enough to infect contacts. Although monkeypox virus differs from variola virus in only 12 virulence genes, it is unable to sustain transmission in human populations after introduction from an animal source. The genetic basis of the difference in transmissibility between monkeypox and variola virus is unknown, and there are no physiologically relevant animal models that could be used to answer the question.

In summary, many virulence genes are required for the full virulence of B. anthracis and variola virus, and the contribution of a number of these genes depends, directly or indirectly, on the particular animal species, route of infection, and cell type or biochemical pathway. For poxviruses, the presence or absence of a virulence gene is not informative as to the infectivity of the virus in any animal species, including humans. Furthermore, in some situations, it is likely that subtle changes in the activity of one or more variola virus genes alone can affect pathogenicity. Those hypothesized subtle differences in activity cannot now be predicted from sequence analysis.

In the following section, a number of general mechanisms for the evolution of pathogenicity and sustainability in a host are described with an emphasis on the important observation that closely related pathogen species or strains can evolve to be pathogenic in a host in unique and sometimes unpredictable ways.



The filovirus family, Filoviridae, consists of two genera: Marburgvirus, which comprises various strains of the 1967 Lake Victoria marburgvirus (MARV), and the antigenically distinct Ebolavirus. Ebolaviruses were first discovered in 1976. The genus contains five species: Sudan ebolavirus (SEBOV), Zaire ebolavirus (ZEBOV), Ivory Coast ebolavirus (CIEBOV), Bundibugyo ebolavirus (BEBOV), and Reston ebolavirus (REBOV). Filoviruses are among the deadliest of all human pathogens, causing hemorrhagic fever with mortality that can approach 90 percent. Several reports indicate high seroprevalence in many areas of Africa. Assuming that the serologic tests were specific, either EBOV is endemic or there is a set of uncharacterized, cocirculating, non-pathogenic, antigenically cross-reactive viruses. In support of that idea, REBOV is not pathogenic in humans.

Filoviruses are filamentous, nonsegmented negative-strand RNA viruses that have an RNA genome of about 19 kb, have a highly conserved gene order, and are surrounded by a helical nucleocapsid structure and a lipid bilayer that contains several virus glycoprotein spikes. These viruses probably target bats as reservoir species, although REBOV is maintained in swine in the Philippines. In uninterrupted human-to-human transmission, nucleotide sequence changes are rare, except in an Angola outbreak characterized by targeted evolution in VP40 and VP24. It is possible that these viruses require minimal evolution for human replication and pathogenesis.

Early in infection, filoviruses target cells of mononuclear lineage, notably macrophages, monocytes, and dendritic cells (DCs), but not lymphocytes. Infected monocytes release inflammatory cytokines, such as TNF-α, whereas DCs are anergic (characterized by limited cytokine production, DC maturation, and diminished antigen presentation for T-cell activation). Neutrophils rapidly become activated, most likely by viral glycoprotein interaction with the TREM-1 ligand, and this results in increased cytokine production. Massive bystander apoptosis of natural killer cells and lymphocytes occurs intravascularly and in lymphoid organs. Innate immunity and adaptive immunity are delayed during filovirus infections, and this allows increased virus replication and disease exacerbation as many cells (such as monocytes and neutrophils) are continuously triggered to release cytokines. As infection increases, filoviruses infect a wide variety of cells and organs, including the liver and endothelial cells. Dysfunctions in hemostasis occur as a consequence of hepatic damage and the release of TNF-α and other proinflammatory cytokines. In fatal cases, death occurs 6–16 days after the onset of symptoms, usually because of multiorgan failure and coagulopathy that results in disseminated intravascular coagulation and shock. Hemorrhagic disease occurs in about 25–45 percent of the patients and is most likely triggered by immune-mediated mechanisms.

Virulence Determinants

Filovirus virulence determinants include viral proteins that antagonize adaptive and innate immune responses, suggesting a role for inflammation in resistance and disease. Extensive replication of filoviruses in primates is regulated by two key viral proteins that antagonize host interferon responses. VP35 inhibits the activation of transcription factor IRF3 by binding to dsRNA and inhibiting retinoic acid-inducible gene I (RIG-I) signaling. VP35 also interferes with the activation of the dsRNA-binding kinase, PKR. Mechanistically, VP35 expression likely augments the conjugation of a small ubiquitin-like modifier (SUMO) protein to IRF3/IRF7 through TLR and RIG-I signaling, leading to increased inhibition of IFN transcription by IRF3/7. Thus, VP35 activates a normal negative feedback loop that regulates IFN signaling to weaken host innate immunity. In contrast, VP24 inhibits the cellular response to exogenous IFN by interacting with karyopherin α1, preventing the nuclear accumulation of tyrosine-phosphorylated STAT1 and STAT2.

The GP glycoprotein, including its soluble sGP form, also functions as a major virulence determinant, playing an important role in virus attachment and entry, cell rounding, cytotoxicity, and down-regulation of host proteins. GP toxicity is thought to be mediated by a dynamin-dependent protein trafficking pathway and an ERK mitogen-activated protein kinase pathway. Importantly, exposure of primate PBMCs to select ZEBOV or MARV GP peptides or to inactivated ZEBOV resulted in decreased expression of activation markers on CD4 and CD8 cells; CD4 and CD8 cell apoptosis; blocked CD4 and CD8 cell cycle progression; decreased interleukin (IL)-2, IL-12p40, and IFN-γ expression; and increased IL-10 expression. Thus, GP likely encodes an immunosuppressive motif that antagonizes adaptive immunity during infection.

FIGURE I.1 Coronavirus phylogeny



Coronaviruses have a ~30-kb single-stranded, positive-polarity RNA genome that is wrapped in a helical nucleocapsid composed of multiple copies of a nucleocapsid protein and surrounded by a lipid envelope bearing three or more glycoprotein spikes. The virus family is divided into group 1 (alpha), group 2 (beta), and group 3 (gamma) coronaviruses on the basis of sequence homology (see Figure I.1). Coronavirus phylogeny and biology are characterized by frequent host-shifting events, including animal-to-human (zoonosis), human-to-animal (reverse zoonosis), and animal-to-animal transmission. Over the past 30 years, several coronavirus cross-species transmission events as well as changes in virus tropism have given rise to new significant animal and human diseases. Most notably, severe acute respiratory syndrome (SARS), a human lower respiratory disease that was first reported in late 2002 in Guangdong Province, China, quickly spread worldwide over a period of four months. The virus infected over 8,000 individuals, killing nearly 800 before it was successfully contained by aggressive public health intervention strategies. The etiological agent of SARS (SARS-CoV) was determined to have crossed into human hosts from zoonotic reservoirs including bats as well as Himalayan palm civets (Paguma larvata) and raccoon dogs (Nyctereutes procyonoides) that were sold in exotic animal markets in China. Of note, another human coronavirus (HCoV-229E) likely emerged from African bat coronavirus lineages some 200 years earlier, while HCoV-OC43 likely emerged from closely related bovine coronaviruses about 100 years ago. SARS-CoV was recently proposed as a new select agent based primarily on its high virulence and transmission potential in human populations and the lack of effective vaccines and therapeutics. This recommendation illustrates many of the difficulties in using sequence-based criteria for determination of virulence potential.
SARS-CoV is a group 2b coronavirus; the group includes closely related civet and raccoon dog strains (>99 percent sequence identity) as well as more variant bat coronaviruses, HKU3 and RB3. Protein sequence identity is greater than 95 percent across most of the genome of human epidemic SARS-CoV and bat group 2b coronaviruses, although the S glycoproteins are only 80–90 percent identical. It has been proposed that bat coronaviruses were the progenitor strains for all group 1 and group 2 coronaviruses, and genome homologies range from 43 to >90 percent amino acid identity. Obviously, some viral genes are more highly conserved than others.
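For readers unfamiliar with how an identity figure such as ">99 percent" is obtained, the following is a minimal sketch of one common convention: the percentage of matching positions over two sequences that have already been aligned to the same length. The function name `percent_identity`, the gap handling, and the toy sequences are illustrative assumptions and do not reproduce the method used in the studies cited here. Published values depend on the alignment method and on how gaps are counted.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Assumptions (illustrative only): both inputs have equal length, '-'
    marks a gap, columns where both sequences have a gap are ignored, and
    a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned sequences (made up; not taken from any virus).
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(percent_identity("AC-DE", "ACQDE"))
```

A single global number of this kind weights every position equally, so it cannot distinguish a difference at a functionally critical residue from one at a neutral position.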

Age was a major virulence determinant: mortality rates were less than 1 percent for individuals younger than 21 years of age but greater than 50 percent for individuals older than 65 years of age. The predominant clinical and pathological features of SARS-CoV infection in the human lung included diffuse alveolar damage (DAD), hyaline membranes, atypical pneumonia with dry cough, persistent fever, progressive dyspnea, and sometimes abrupt deterioration of lung function. Virus infection primarily targeted ciliated epithelial cells and type II pneumocytes in the lung as well as epithelial cells in the intestine. Major pathologic lesions include inflammatory exudation in the alveoli and interstitial tissue with hyperplasia of fibrous tissue and fibrosis. Two phases of disease were identified during SARS-CoV infection in humans. Acute respiratory distress syndrome (ARDS) develops within the first 10 days with DAD, edema, and hyaline membrane formation. After the acute phase, an organizing phase of DAD with increased fibrosis is observed. Increasing age, male sex, presence of comorbid conditions, high early viral RNA burdens, and high lactate dehydrogenase levels are associated with greater risk of death. In serum, dynamic changes in cytokine levels have been reported following SARS-CoV infection, including increases in IFN-γ, IL-18, TGF-β, IL-6, IP-10, MCP-1, MIG, and IL-8, but not TNF-α, IL-2, IL-4, IL-10, IL-13, or TNFRI. The data suggested that an IFN-γ-related cytokine storm might be involved in the immunopathological damage noted in SARS patients.

FIGURE I.2 Conserved sequence alterations associated with expanding phases of the SARS-CoV epidemic

Hypothetically, the pandemic strains of SARS evolved from the virus isolated from civet cats (SZ16, HC/SZ61/03). GD03 is a very early human isolate (Dec 22, 03) very similar to HC/SZ61/03. The GZ02 sequence is representative of early strains (e.g., HGZ8L1-A, etc.) detected in a patient in Guangzhou (the capital of Guangdong). CUHK-W1 and related viruses (Guangdong) are reasonable precursors to viral strains that evolved from early Guangdong isolates and are representative of the middle phase of the epidemic. Urbani is representative of late-phase isolates that occurred after the Metropol Hotel superspreader event. Residues in green are indicative of the civet cat alleles; residues in pink are mutations that occurred early during civet cat-human transmission, those in dark blue during the middle phase, and those in gray during the late phase. We will build viruses encoding these mutational profiles.

Virulence Determinants

SARS-CoV Cross-Species Transmission. The SARS-CoV outbreak is unique in that a chronological set of sequence changes was obtained, providing precise sequence signatures associated with expanding waves of the global epidemic. By comparing the earliest full-length human isolate (GZ02; early December 2002) to civet cat isolates SZ16 and HZ/SZ/63 and key strains epidemiologically linked to an expanding epidemic, 12 amino acid changes in ORF1a, 2 in ORF1b, 17 in the S glycoprotein, 4 in ORF3a, 1 in the M glycoprotein, and the ORF8 29-nt deletion were identified that may have allowed for increased replication, transmission, and pathogenesis in human hosts (see Figure I.2). In general, civet/raccoon dog-related strains and strains identified in sporadic human cases prior to the onset of the epidemic were thought to be significantly less transmissible and pathogenic than those identified during the early, middle, and late stages of the epidemic. Except for a couple of mutations in S and M, which promoted entry into human cells or efficient egress, respectively, the role of most mutations in the expanding epidemic remains unclear. Importantly, civet and raccoon dog strains do not replicate in human cells, despite having greater than 99 percent sequence identity. Only two or three mutations are needed to promote efficient replication in human airway cells, yet animal strains whose S RBD recognizes hACE2 receptors are only weakly pathogenic and require extensive adaptation in S and elsewhere in the genome. Current proposals to include “SARS-CoV” as a potential select agent are unclear regarding the disposition in this classification scheme of the closely related but human host range-restricted civet and raccoon dog SARS-like strains, of sporadic human strains identified in 2004 that are mostly zoonotic in origin, and of bat SARS-CoV-like strains.
More importantly, given the close homology but receptor mediated restriction and/or limitation in hACE2 recognition by these animal-origin strains, defining SA status by global genome sequence homology seems arbitrary and not grounded in any rational scientific method.

At ~180 kDa in mass, the S glycoprotein is a trimer in the virion and is organized into two subunit domains: an amino-terminal S1, which contains the ~200-aa receptor binding domain (RBD), and a carboxy-terminal S2, which contains the putative fusion peptide, two heptad repeat (HR) domains, and a transmembrane domain (TM). This domain organization groups the CoV S spike glycoprotein with other class I viral fusion proteins, such as Influenza HA, HIV-1 env, SV5 F, and Ebola Gp2. The RBD of Spike is generally acknowledged as the principal determinant of coronavirus host range, and two key mutations in S, K479N and S487T, were shown to be responsible for promoting efficient interaction with either the civet or human ACE2 receptor, respectively. Given these findings, it seems that reasonable sequence-based criteria could be developed to categorize “human and animal” strains for oversight purposes. However, multiple pathways of S-driven host range expansion exist, complicating sequence-based predictions of host range expansion. For example, re-adaptation of the civet SZ16 strain to human airway epithelial cells identified a different set of key “humanizing” mutations at K479N, Y442F, and L472F. In addition, K479T and K479I also allow for efficient S RBD interaction with hACE2, demonstrating that multiple genetic mutation pathways exist that would allow for differential recognition of civet and/or human ACE2 receptors. Importantly, the SARS-CoV RBD is highly plastic and can be recombined into closely related group 2b bat and animal strains, promoting efficient growth in human airway epithelial cells. These strains are significantly less efficient at replicating in mouse models of human disease, suggesting that additional adaptation and mutation would be required to promote efficient disease potential in humans. In fact, the S ectodomains are readily interchangeable between group 1 and 2 coronaviruses, and these recombinant viruses grow efficiently.
Given the high recombination frequency noted in mixed coronavirus infection, recombination-driven “host shifts” are likely common in coronavirus phylogenies. Finally, by completely novel mechanisms, as few as 2–4 mutations in some coronavirus fusion peptide and HR1 domains have promoted host range expansion into multiple species in cell culture. Although the mechanism is unclear, it represents an entirely different pathway to host range expansion. Clearly, the extensive plasticity of S and the existence of multiple interaction networks by which S can augment receptor interactions and entry make sequence-based predictions of host range specificity difficult if not usually impossible.

Mutations that affect virus replication efficiency also must be considered important virulence determinants. It is likely that a number of mutations that occurred during the SARS-CoV outbreak selected for more efficient virus-human host interactions that promoted efficient replication, gene expression, and release. Consonant with this hypothesis, mutations in virtually any essential gene attenuate virus replication and pathogenesis in vivo, demonstrating the importance of efficient virus growth in disease progression. In general, this seems to be a truism for most RNA virus genomes. Nevertheless, SARS-CoV also encodes a number of proteins that strongly antagonize the innate immune sensing/signaling pathways and that likely function as important virulence determinants that regulate pathogenesis. Among the 16 SARS-CoV ORF1 replicase proteins, nsp1, nsp3 (PLP, papain-like protease), nsp7, and nsp15 all encode strong interferon antagonist activities that block type 1 interferon sensing/signaling pathways, NFκB sensing/signaling pathways, or both. These four non-structural proteins are encoded in all coronavirus genomes but are highly variable in primary amino acid sequence, making it difficult to predict whether similar innate immune antagonism activities are universally encoded in all Coronaviridae genomes. In addition, nsp1 is reported to block host translation and promote host, but not viral, mRNA degradation, and nsp1 and PLP mutations have been described that attenuate innate immune antagonism activities and/or attenuate virus pathogenesis. However, even in closely related coronaviruses, these homologous proteins may or may not encode similar innate immune antagonism functions, making it difficult to predict activity based on sequence homology and biologic precedent. For example, the HCoV NL63 and SARS-CoV nsp3-encoded PLP domains are highly variable, differing by over 80 percent in amino acid identity, yet both likely form similar structures and antagonize IFN signaling.
It is not clear whether common sequence motifs regulate this activity. In addition to the non-structural proteins encoded in ORF1, antagonist activities are also encoded in ORF3b, ORF6, and the N protein, which block type 1 interferon sensing and/or signaling by targeting novel cellular components essential for signal transduction. Except for ORF6 from the closely related Bt-SARS-CoV HKU3, it is not clear whether closely related group 2b homologs also encode similar IFN antagonist activities. Several SARS-CoV proteins (e.g., S, M, ORF3a, ORF6, ORF7, N, etc.) are pro-apoptotic and likely contribute to virus cell killing and pathogenesis as well.

The nsp1 proteins of coronaviruses are highly variable in both size and amino acid sequence. The SARS-CoV nsp1 is a 20 kDa protein that is localized to the cytoplasm of infected cells. In some reports, nsp1 was able to block IFN-β mRNA induction but did not antagonize the IRF3 signaling pathway. Expression of nsp1 degraded not only IFN-β mRNA but also several endogenous cellular mRNAs, and it inhibited translation of cellular mRNAs. As SARS-CoV infected cells also degraded cellular mRNA, the authors of these reports proposed that nsp1 degradation of host mRNA is an important mechanism of blocking host antiviral defenses. However, other work suggests that SARS-CoV nsp1 inhibits the signal transduction pathways involving IRF3, STAT1 and NFκB. Interestingly, the authors of that work did not observe the mRNA degradation phenotype seen in the previous studies.

ORF6 is a 63-amino acid ER/Golgi membrane protein that has its C-terminal tail facing the cytoplasm and its N terminus either in the ER lumen or associated with the ER membrane. ORF6 blocks the nuclear import of the STAT1/STAT2/IRF9 and STAT1/STAT1 complexes in the presence of IFNβ or IFNγ treatment, respectively, resulting in reduced STAT1-dependent gene induction. The C-terminal 10 amino acids were critical for this import block, which is mediated by recruitment of nuclear import factors to the ER/Golgi membrane. Karyopherin alpha 2 (KPNA2) specifically binds to the C-terminal tail of ORF6, which retains KPNA2 at the ER/Golgi membrane. This interaction subsequently recruits Karyopherin beta 1 (KPNB1) to the ER/Golgi membrane as well. The recruitment of KPNB1 onto membrane complexes limited the bioavailability of KPNB1, an essential component for the nuclear import of STAT1 complexes as well as other cargo. Consequently, ORF6 blocked nuclear import of the STAT1 complexes. ORF6 may affect other signaling pathways, since KPNB1 is a common component of the classical nuclear import pathways. In recombinant viruses, ORF6 has also been shown to increase pathogenesis of the normally non-lethal MHV-A59 virus. In vitro studies with the MHV/ORF6 virus show that ORF6 expression increased virus production from cells compared with WT virus. Recently, Hussain et al. showed that ORF6 expression blocked nuclear import of proteins containing classical import signals but not of proteins that use non-importin nuclear import mechanisms (Hussain, Perlman et al. 2008). MHV’s accessory proteins have not been implicated in modulating the innate immune response; however, they may play a role in MHV’s capacity to evade innate immune sensing, as deletion of some, but not all, of them affects in vivo pathogenesis. The clear association of SARS-CoV ORF6 and Ebola VP24 with the host nuclear import pathway identifies nuclear import as a key site by which highly pathogenic viruses regulate the intracellular environment. By modulating the kinetics of nuclear import during infection, the virus controls innate immune, adaptive immune, apoptotic and cell stress signaling networks. While the mechanism may vary, it would be surprising if other pathogenic viruses did not modulate the same pathways as SARS-CoV during their infection process.

The SARS-CoV and MHV nucleocapsid proteins have also been shown to affect different aspects of the innate immune response and appear to modulate several signaling pathways in the cell. Many of these studies were performed using overexpression constructs in isolation and have not been confirmed in the context of virus infection. He et al. showed that N protein is able to induce AP-1 signaling in vitro. SARS-CoV N was able to block the induction of reporter gene expression from an IFN-β promoter and also block NFκB signaling. These data indicate that the N protein is also able to inhibit an ISRE promoter in response to Sendai virus infection but not IFN-β treatment. The mechanism of this inhibition is unknown and under investigation. The MHV N protein has been shown to inhibit the activation of PKR, a strongly antiviral protein, in the cytoplasm when expressed from poxvirus vectors and in MHV-infected cells. PKR activation normally leads to a block in protein synthesis by phosphorylating the alpha subunit of the translation factor eIF2. While N does not itself prevent PKR activation, it alters PKR’s function such that it no longer signals properly. The N proteins of these two distinct group 2 coronaviruses are quite conserved, so it will be interesting to determine whether they encode overlapping functions during infection or mediate distinct inhibitory mechanisms.

Copyright © 2010, National Academy of Sciences.
Bookshelf ID: NBK50857

