Search results

1.
PeerJ. 2019 Jan 4;7:e6160. doi: 10.7717/peerj.6160. eCollection 2019.

Identifying accurate metagenome and amplicon software via a meta-analysis of sequence to taxonomy benchmarking studies.

Author information

1
Biomolecular Interactions Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
2
Department of Biochemistry, University of Otago, Dunedin, New Zealand.
3
Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand.
4
Institute of Environmental Science and Research, Porirua, New Zealand.
5
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK.

Abstract

Metagenomic and meta-barcode DNA sequencing has rapidly become a widely-used technique for investigating a range of questions, particularly related to health and environmental monitoring. There has also been a proliferation of bioinformatic tools for analysing metagenomic and amplicon datasets, which makes selecting adequate tools a significant challenge. A number of benchmark studies have been undertaken; however, these can present conflicting results. In order to address this issue we have applied a robust Z-score ranking procedure and a network meta-analysis method to identify software tools that are consistently accurate for mapping DNA sequences to taxonomic hierarchies. Based upon these results we have identified some tools and computational strategies that produce robust predictions.
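The abstract names a robust Z-score ranking but does not define it. As a minimal sketch, assuming the common median/MAD form of a robust Z-score (the paper's exact procedure may differ), tools within a single benchmark could be ranked as follows. The tool names and accuracy values are hypothetical.

```python
from statistics import median

def robust_z_scores(values):
    """Robust Z-scores: (x - median) / (1.4826 * MAD).

    Assumption: this is the usual median/MAD definition, not
    necessarily the one used in the paper.
    """
    values = list(values)
    med = median(values)
    # Median absolute deviation from the median
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust Z-scores are undefined")
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [(v - med) / scale for v in values]

# Hypothetical accuracy scores for four tools in one benchmark
scores = {"toolA": 0.90, "toolB": 0.85, "toolC": 0.80, "toolD": 0.40}
z = dict(zip(scores, robust_z_scores(scores.values())))

# Rank tools from highest to lowest robust Z-score
ranking = sorted(z, key=z.get, reverse=True)
print(ranking)
```

Because the median and MAD are insensitive to outliers, one very poor tool (toolD here) does not compress the scores of the others, which is the usual reason for preferring this form when pooling heterogeneous benchmarks.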

KEYWORDS:

Benchmark; Bioinformatics; Metabarcoding; Metabenchmark; Metagenomics; eDNA

Conflict of interest statement

The authors declare there are no competing interests.

2.
Mol Ecol Resour. 2018 May 26. doi: 10.1111/1755-0998.12907. [Epub ahead of print]

Towards robust and repeatable sampling methods in eDNA-based studies.

Author information

1
Bio-Protection Research Centre, Lincoln University, Lincoln, New Zealand.
2
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
3
Institut de Recherche sur la Biologie de l'Insecte - UMR 7261 CNRS, Université de Tours, Tours, France.
4
Applied Molecular Solutions Research Group, Environmental and Animal Sciences, Unitec Institute of Technology, Auckland, New Zealand.
5
School of Science, Auckland University of Technology, Auckland, New Zealand.
6
Institute for Applied Ecology, University of Canberra, Bruce, ACT, Australia.
7
School of Science, University of Waikato, Hamilton, New Zealand.
8
Polar Knowledge Canada, CHARS Campus, Cambridge Bay, NU, Canada.
9
Landcare Research, Lincoln, New Zealand.
10
School of Biological Sciences, The University of Auckland, Auckland, New Zealand.
11
Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand.
12
Hawkesbury Institute for the Environment, Western Sydney University, Penrith, NSW, Australia.
13
Institute of Environmental Science and Research Ltd., Christchurch, New Zealand.

Abstract

DNA-based techniques are increasingly used for measuring the biodiversity (species presence, identity, abundance and community composition) of terrestrial and aquatic ecosystems. While there are numerous reviews of molecular methods and bioinformatic steps, there has been little consideration of the methods used to collect samples upon which these later steps are based. This represents a critical knowledge gap, as methodologically sound field sampling is the foundation for subsequent analyses. We reviewed field sampling methods used for metabarcoding studies of both terrestrial and freshwater ecosystem biodiversity over a nearly three-year period (n = 75). We found that 95% (n = 71) of these studies used subjective sampling methods and inappropriate field methods and/or failed to provide critical methodological information. It would be possible for researchers to replicate only 5% of the metabarcoding studies in our sample, a poorer level of reproducibility than for ecological studies in general. Our findings suggest that greater attention to field sampling methods and reporting is necessary in eDNA-based studies of biodiversity to ensure robust outcomes and future reproducibility. Methods must be fully and accurately reported, and protocols developed that minimize subjectivity. Standardization of sampling protocols would be one way to help improve reproducibility and would have additional benefits in allowing compilation and comparison of data from across studies.

KEYWORDS:

contamination; environmental DNA; experimental design; metabarcoding; metadata; sampling

3.
PLoS Genet. 2018 May 8;14(5):e1007333. doi: 10.1371/journal.pgen.1007333. eCollection 2018 May.

Machine learning identifies signatures of host adaptation in the bacterial pathogen Salmonella enterica.

Author information

1
Wellcome Sanger Institute, Hinxton, United Kingdom.
2
Biomolecular Interaction Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
3
Department of Biochemistry, University of Otago, Dunedin, New Zealand.
4
Institute for Molecular Infection Biology, University of Wuerzburg, Wuerzburg, Germany.
5
Helmholtz Institute for RNA-based Infection Research, Wuerzburg, Germany.

Abstract

Emerging pathogens are a major threat to public health; however, understanding how pathogens adapt to new niches remains a challenge. New methods are urgently required to provide functional insights into pathogens from the massive genomic data sets now being generated from routine pathogen surveillance for epidemiological purposes. Here, we measure the burden of atypical mutations in protein coding genes across independently evolved Salmonella enterica lineages, and use these as input to train a random forest classifier to identify strains associated with extraintestinal disease. Members of the species fall along a continuum, from pathovars which cause gastrointestinal infection and low mortality, associated with a broad host-range, to those that cause invasive infection and high mortality, associated with a narrowed host range. Our random forest classifier learned to perfectly discriminate long-established gastrointestinal and invasive serovars of Salmonella. Additionally, it was able to discriminate recently emerged Salmonella Enteritidis and Typhimurium lineages associated with invasive disease in immunocompromised populations in sub-Saharan Africa, and within-host adaptation to invasive infection. We dissect the architecture of the model to identify the genes that were most informative of phenotype, revealing a common theme of degradation of metabolic pathways in extraintestinal lineages. This approach accurately identifies patterns of gene degradation and diversifying selection specific to invasive serovars that have been captured by more labour-intensive investigations, but can be readily scaled to larger analyses.

PMID:
29738521
PMCID:
PMC5940178
DOI:
10.1371/journal.pgen.1007333
[Indexed for MEDLINE]
Free PMC Article
4.
Mol Biol Evol. 2018 Jun 1;35(6):1451-1462. doi: 10.1093/molbev/msy046.

An Evaluation of Function of Multicopy Noncoding RNAs in Mammals Using ENCODE/FANTOM Data and Comparative Genomics.

Author information

1
Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany.
2
Institute of Natural and Mathematical Sciences, Massey University, Auckland, New Zealand.
3
Biomolecular Interaction Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
4
Bioinformatics Institute, School of Biological Sciences, University of Auckland, Auckland, New Zealand.

Abstract

Mammalian diversification has coincided with a rapid proliferation of various types of noncoding RNAs, including members of both snRNAs and snoRNAs. The significance of this expansion, however, remains obscure. While some ncRNA copy-number expansions have been linked to functionally tractable effects, such events may equally be neutral, perhaps as a result of random retrotransposition. Hindering progress in our understanding of such observations is the difficulty in establishing function for the diverse features that have been identified in our own genome. Projects such as ENCODE and FANTOM have revealed a hidden world of genomic expression patterns, as well as a host of other potential indicators of biological function. However, such projects have been criticized, particularly by practitioners in the field of molecular evolution, where many suspect these data provide limited insight into biological function. The molecular evolution community has largely taken a skeptical view; thus it is important to establish tests of function. We use a range of data, including data drawn from ENCODE and FANTOM, to examine the case for function for the recent copy number expansion in mammals of six evolutionarily ancient RNA families involved in splicing and rRNA maturation. We use several criteria to assess evidence for function: conservation of sequence and structure, genomic synteny, evidence for transposition, and evidence for species-specific expression. Applying these criteria, we find that only a minority of loci show strong evidence for function and that, for the majority, we cannot reject the null hypothesis of no function.

5.
mSystems. 2017 Nov 14;2(6). pii: e00127-17. doi: 10.1128/mSystems.00127-17. eCollection 2017 Nov-Dec.

Genomic, Transcriptomic, and Phenotypic Analyses of Neisseria meningitidis Isolates from Disease Patients and Their Household Contacts.

Author information

1
Invasive Pathogens Laboratory, Institute of Environmental Science and Research, Porirua, New Zealand.
2
Malaghan Institute of Medical Research, Wellington, New Zealand.
3
School of Biological Sciences, Victoria University of Wellington, Wellington, New Zealand.
4
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
5
Wellcome Trust Sanger Institute, Hinxton, United Kingdom.
6
H. Lundbeck A/S, Valby, Denmark.
7
Centre for Biodiscovery, Victoria University of Wellington, Wellington, New Zealand.

Abstract

Neisseria meningitidis (meningococcus) can cause meningococcal disease, a rapidly progressing and often fatal disease that can occur in previously healthy children. Meningococci are found in healthy carriers, where they reside in the nasopharynx as commensals. While carriage is relatively common, invasive disease, associated with hypervirulent strains, is a comparatively rare event. The basis of increased virulence in some strains is not well understood. New Zealand suffered a protracted meningococcal disease epidemic, from 1991 to 2008. During this time, a household carriage study was carried out in Auckland: household contacts of index meningococcal disease patients were swabbed for isolation of carriage strains. In many households, healthy carriers harbored strains identical, as determined by laboratory typing, to the ones infecting the associated patient. We carried out more-detailed analyses of carriage and disease isolates from a select number of households. We found that isolates, although indistinguishable by laboratory typing methods and likely closely related, had many differences. We identified multiple genome variants and transcriptional differences between isolates. These studies enabled the identification of two new phase-variable genes. We also found that several carriage strains had lost their type IV pili and that this loss correlated with reduced tumor necrosis factor alpha (TNF-α) expression when cultured with epithelial cells. While nonpiliated meningococcal isolates have been previously found in carriage strains, this is the first evidence of an association between type IV pili from meningococci and a proinflammatory epithelial response. We also identified potentially important metabolic differences between carriage and disease isolates, including the sulfate assimilation pathway. 
IMPORTANCE: Neisseria meningitidis causes meningococcal disease but is frequently carried in the throats of healthy individuals; the factors that determine whether invasive disease develops are not completely understood. We carried out detailed studies of isolates, collected from patients and their household contacts, to identify differences between commensal throat isolates and those that caused invasive disease. Though isolates were identical by laboratory typing methods, we uncovered many differences in their genomes, in gene expression, and in their interactions with host cells. In particular, we found that several carriage isolates had lost their type IV pili, a surprising finding since pili are often described as essential for colonization. However, loss of type IV pili correlated with reduced secretion of a proinflammatory cytokine, TNF-α, when meningococci were cocultured with human bronchial epithelial cells; hence, the loss of pili could provide an advantage to meningococci, by resulting in a dampened localized host immune response.

KEYWORDS:

Neisseria meningitidis; carriage; household contact; type IV pili

6.
BMC Genomics. 2017 Oct 17;18(1):795. doi: 10.1186/s12864-017-4197-1.

Analysis of the genome of the New Zealand giant collembolan (Holacanthella duospinosa) sheds light on hexapod evolution.

Author information

1
Landcare Research, Private Bag, Auckland, 92170, New Zealand.
2
School of Biological Sciences, The University of Auckland, Auckland, New Zealand.
3
The New Zealand Institute for Plant & Food Research Ltd, Auckland, New Zealand.
4
Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand.
5
Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 53113, Bonn, Germany.
6
Evolutionary Biology & Ecology, Institute for Biology, University of Freiburg, Freiburg, Germany.
7
Genetics Otago, Department of Biochemistry, University of Otago, Dunedin, New Zealand.
8
School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK.
9
Department of Animal Behaviour, Bielefeld University, Bielefeld, Germany.
10
Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilian University of Munich, Planegg-Martinsried, Germany.
11
Biomolecular Interactions Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
12
South Australian Museum, North Terrace, GPO Box 234, Adelaide, SA, 5001, Australia.
13
School of Pharmacy and Medical Sciences, University of South Australia, Adelaide, SA, Australia.
14
Landcare Research, Private Bag, Auckland, 92170, New Zealand. buckleyt@landcareresearch.co.nz.
15
School of Biological Sciences, The University of Auckland, Auckland, New Zealand. buckleyt@landcareresearch.co.nz.

Abstract

BACKGROUND:

The New Zealand collembolan genus Holacanthella contains the largest species of springtails (Collembola) in the world. Using Illumina technology we have sequenced and assembled a draft genome and transcriptome from Holacanthella duospinosa (Salmon). We have used this annotated assembly to investigate the genetic basis of a range of traits critical to the evolution of the Hexapoda, the phylogenetic position of H. duospinosa and potential horizontal gene transfer events.

RESULTS:

Our genome assembly was ~375 Mbp in size with a scaffold N50 of ~230 Kbp and sequencing coverage of ~180×. DNA elements, LTRs, simple repeats and LINEs formed the largest components, and SINEs were very rare. Phylogenomics (370,877 amino acids) placed H. duospinosa within the Neanuridae. We recovered orthologs of the conserved genes thought to play a role in sex determination. Analysis of CpG content suggested the absence of DNA methylation, and consistent with this we were unable to detect orthologs of the DNA methyltransferase enzymes. The small subunit rRNA gene contained a possible retrotransposon. The Hox gene complex was broken over two scaffolds. For chemosensory ability, at least 15 and 18 ionotropic glutamate and gustatory receptors were identified, respectively. However, we were unable to identify any odorant receptors or their obligate co-receptor Orco. Twenty-three chitinase-like genes were identified from the assembly. Members of this multigene family may play roles in the digestion of fungal cell walls, a common food source for these saproxylic organisms. We also detected 59 and 96 genes that blasted to bacteria and fungi, respectively, but were located on scaffolds that otherwise contained arthropod genes.

CONCLUSIONS:

The genome of H. duospinosa contains some unusual features including a Hox complex broken over two scaffolds, in a different manner to other arthropod species, a lack of odorant receptor genes and an apparent lack of environmentally responsive DNA methylation, unlike many other arthropods. Our detection of candidate horizontal gene transfer events confirms that this phenomenon is occurring across Collembola. These findings allow us to narrow down the regions of the arthropod phylogeny where key innovations have occurred that have facilitated the evolutionary success of Hexapoda.

KEYWORDS:

Chemoreceptors; Developmental biology; Epigenetics; Genome assembly; Hexapoda; Horizontal gene transfer; Methylation; Neanuridae; Phylogenomics; RNA; Sex determination

PMID:
29041914
PMCID:
PMC5644144
DOI:
10.1186/s12864-017-4197-1
[Indexed for MEDLINE]
Free PMC Article
7.
Genome Announc. 2017 Jun 1;5(22). pii: e00436-17. doi: 10.1128/genomeA.00436-17.

Complete Genome Sequences of Two Geographically Distinct Legionella micdadei Clinical Isolates.

Author information

1
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand amy.osborne@otago.ac.nz sandy.slow@otago.ac.nz.
2
Department of Pathology, University of Otago Christchurch, Christchurch, New Zealand.
3
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
4
Department of Pathology, University of Otago Christchurch, Christchurch, New Zealand amy.osborne@otago.ac.nz sandy.slow@otago.ac.nz.

Abstract

Legionella is a highly diverse genus of intracellular bacterial pathogens that cause Legionnaires' disease (LD), an often severe form of pneumonia. Two L. micdadei clinical isolates, obtained from patients hospitalized with LD from geographically distinct areas, were sequenced using PacBio SMRT cell technology, identifying incomplete phage regions, which may impact virulence.

8.
PLoS One. 2017 Mar 1;12(3):e0172790. doi: 10.1371/journal.pone.0172790. eCollection 2017.

Transposon insertion libraries for the characterization of mutants from the kiwifruit pathogen Pseudomonas syringae pv. actinidiae.

Author information

1
Bioprotection Portfolio, The New Zealand Institute for Plant & Food Research Limited, Auckland, New Zealand.
2
Laboratory of Molecular Plant Pathology, Institute of Agriculture and Environment, Massey University, Palmerston North, New Zealand.
3
Bio-Protection Research Centre, New Zealand.
4
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
5
Department of Biochemistry, University of Otago, Dunedin, New Zealand.
6
Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand.
7
School of Biological Sciences, University of Auckland, Auckland, New Zealand.

Abstract

Pseudomonas syringae pv. actinidiae (Psa), the causal agent of kiwifruit canker, is one of the most devastating plant diseases of recent times. We have generated two mini-Tn5-based random insertion libraries of Psa ICMP 18884. The first, a 'phenotype of interest' (POI) library, consists of 10,368 independent mutants gridded into 96-well plates. By replica plating onto selective media, the POI library was successfully screened for auxotrophic and motility mutants. Lipopolysaccharide (LPS) biosynthesis mutants with 'Fuzzy-Spreader'-like morphologies were also identified through a visual screen. The second, a 'mutant of interest' (MOI) library, comprises around 96,000 independent mutants, also stored in 96-well plates, with approximately 200 individuals per well. The MOI library was sequenced on the Illumina MiSeq platform using Transposon-Directed Insertion site Sequencing (TraDIS) to map insertion sites onto the Psa genome. A grid-based PCR method was developed to recover individual mutants, and using this strategy, the MOI library was successfully screened for a putative LPS mutant not identified in the visual screen. The Psa chromosome and plasmid had 24,031 and 1,236 independent insertion events respectively, giving insertion frequencies of 3.65 and 16.6 per kb respectively. These data suggest that the MOI library is near saturation, with the theoretical probability of finding an insert in any one chromosomal gene estimated to be 97.5%. However, only 47% of chromosomal genes had insertions. This surprisingly low rate cannot be solely explained by the lack of insertions in essential genes, which would be expected to be around 5%. Strikingly, many accessory genes, including most of those encoding type III effectors, lacked insertions. In contrast, 94% of genes on the Psa plasmid had insertions, including for example, the type III effector HopAU1. 
These results suggest that some chromosomal sites are rendered inaccessible to transposon insertion, either by DNA-binding proteins or by the architecture of the nucleoid.
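The abstract does not say how the 97.5% theoretical probability was derived. As a rough sketch, assuming insertions land uniformly at random (a Poisson model) and assuming a hypothetical gene length of 1 kb, the reported chromosomal density of 3.65 insertions per kb gives a probability of about 0.974 of at least one insertion per gene, which is in the same range. The replicon sizes below are only those implied by the reported counts and densities.

```python
import math

def prob_at_least_one_insertion(insertions_per_kb, gene_length_kb):
    """P(>=1 insertion) under a Poisson model with uniform insertion density.

    Assumption: the Poisson model and the gene length are illustrative
    choices, not taken from the paper.
    """
    expected_insertions = insertions_per_kb * gene_length_kb
    return 1.0 - math.exp(-expected_insertions)

chromosome_density = 3.65   # insertions per kb (from the abstract)
plasmid_density = 16.6      # insertions per kb (from the abstract)
assumed_gene_kb = 1.0       # hypothetical gene length

p_chr = prob_at_least_one_insertion(chromosome_density, assumed_gene_kb)

# Replicon sizes implied by the reported insertion counts and densities
implied_chr_kb = 24031 / chromosome_density
implied_plasmid_kb = 1236 / plasmid_density

print(f"P(insert in a {assumed_gene_kb} kb chromosomal gene) = {p_chr:.3f}")
print(f"Implied chromosome size = {implied_chr_kb:.0f} kb")
print(f"Implied plasmid size = {implied_plasmid_kb:.1f} kb")
```

Under this model almost every non-essential chromosomal gene would be expected to carry an insertion, which is why the observed 47% stands out and motivates the inaccessibility explanation the authors propose.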

PMID:
28249011
PMCID:
PMC5332098
DOI:
10.1371/journal.pone.0172790
[Indexed for MEDLINE]
Free PMC Article
9.
RNA Biol. 2017 Mar 4;14(3):275-280. doi: 10.1080/15476286.2016.1272747. Epub 2017 Jan 9.

Why so narrow: Distribution of anti-sense regulated, type I toxin-antitoxin systems compared with type II and type III systems.

Author information

1
School of Biological Sciences, University of Canterbury, Canterbury, Christchurch, New Zealand.
2
Centre for Integrative Ecology, University of Canterbury, Canterbury, Christchurch, New Zealand.
3
Biomolecular Interaction Centre, University of Canterbury, Canterbury, Christchurch, New Zealand.

Abstract

Toxin-antitoxin (TA) systems are gene modules that appear to be horizontally mobile across a wide range of prokaryotes. It has been proposed that type I TA systems, with an antisense RNA-antitoxin, are less mobile than other TAs that rely on direct toxin-antitoxin binding, but no direct comparisons have been made. We searched for type I, II and III toxin families using iterative searches with profile hidden Markov models across phyla and replicons. The distribution of type I toxin families was comparatively narrow, but these patterns weakened with recently discovered families. We discuss how the function and phenotypes of TA systems as well as biases in our search methods may account for differences in their distribution.

KEYWORDS:

Antisense RNA; horizontal gene transfer; post-segregational killing; toxin-antitoxin systems

PMID:
28067598
PMCID:
PMC5367252
DOI:
10.1080/15476286.2016.1272747
[Indexed for MEDLINE]
Free PMC Article
10.
Bioinformatics. 2017 Apr 1;33(7):988-996. doi: 10.1093/bioinformatics/btw728.

A comprehensive benchmark of RNA-RNA interaction prediction tools for all domains of life.

Umu SU1,2, Gardner PP1,2,3.

Author information

1
School of Biological Sciences.
2
Biomolecular Interaction Centre.
3
Bio-Protection Research Centre, University of Canterbury, Christchurch, New Zealand.

Abstract

Motivation:

The aim of this study is to assess the performance of RNA-RNA interaction prediction tools for all domains of life.

Results:

Minimum free energy (MFE) and alignment methods constitute most of the current RNA interaction prediction algorithms. The MFE tools that incorporate accessibility into the final predicted binding energy (i.e. RNAup, IntaRNA and RNAplex) have better true positive rates (TPRs), with high positive predictive values (PPVs), than other methods in all datasets. They can also differentiate almost half of the native interactions from background. The algorithms that include the effects of internal binding energies in their model, and the alignment methods, seem to have high TPRs but relatively low associated PPVs compared to accessibility-based methods.
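For readers unfamiliar with the two metrics, the standard definitions are TPR = TP / (TP + FN) and PPV = TP / (TP + FP). The sketch below computes both from confusion counts; the counts are hypothetical and are not taken from the benchmark.

```python
def true_positive_rate(tp, fn):
    """TPR (sensitivity): fraction of real interactions that were predicted."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """PPV (precision): fraction of predictions that are real interactions."""
    return tp / (tp + fp)

# Hypothetical counts for one tool on one dataset
tp, fn, fp = 45, 55, 5

tpr = true_positive_rate(tp, fn)
ppv = positive_predictive_value(tp, fp)
print(f"TPR = {tpr:.2f}, PPV = {ppv:.2f}")
```

A tool can score well on one metric and poorly on the other, which is why the abstract reports them together: a high TPR with a low PPV means many true interactions are found, but among a large number of false predictions.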

Availability and Implementation:

We shared our wrapper scripts and datasets on GitHub (github.com/UCanCompBio/RNA_Interactions_Benchmark). All parameters are documented for personal use.

Contact:

sinan.umu@pg.canterbury.ac.nz.

Supplementary information:

Supplementary data are available at Bioinformatics online.

PMID:
27993777
PMCID:
PMC5408919
DOI:
10.1093/bioinformatics/btw728
[Indexed for MEDLINE]
Free PMC Article
11.
Elife. 2016 Sep 20;5. pii: e13479. doi: 10.7554/eLife.13479.

Avoidance of stochastic RNA interactions can be harnessed to control protein expression levels in bacteria and archaea.

Umu SU1,2, Poole AM1,2, Dobson RC1,2,3, Gardner PP1,2,4.

Author information

1
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
2
Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand.
3
Department of Biochemistry and Molecular Biology, University of Melbourne, Parkville, Australia.
4
BioProtection Research Centre, University of Canterbury, Christchurch, New Zealand.

Abstract

A critical assumption of gene expression analysis is that mRNA abundances broadly correlate with protein abundance, but these two are often imperfectly correlated. Some of the discrepancy can be accounted for by two important mRNA features: codon usage and mRNA secondary structure. We present a new global factor, called mRNA:ncRNA avoidance, and provide evidence that avoidance increases translational efficiency. We also demonstrate a strong selection for the avoidance of stochastic mRNA:ncRNA interactions across prokaryotes, and that these have a greater impact on protein abundance than mRNA structure or codon usage. By generating synonymously variant green fluorescent protein (GFP) mRNAs with different potential for mRNA:ncRNA interactions, we demonstrate that GFP levels correlate well with interaction avoidance. Therefore, taking stochastic mRNA:ncRNA interactions into account enables precise modulation of protein abundance.

KEYWORDS:

Archaea; E. coli; bacteria; bioinformatics; computational biology; evolutionary biology; gene expression; genomics; ncRNA; systems biology

PMID:
27642845
PMCID:
PMC5028192
DOI:
10.7554/eLife.13479
[Indexed for MEDLINE]
Free PMC Article
12.
Bioinformatics. 2016 Dec 1;32(23):3566-3574. doi: 10.1093/bioinformatics/btw518. Epub 2016 Aug 8.

A profile-based method for identifying functional divergence of orthologous genes in bacterial genomes.

Author information

1
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
2
Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand.
3
Institute for Molecular Infection Biology, University of Wuerzburg, Wuerzburg, Germany.
4
Institute of Food Research, Norwich Research Park, Norwich, UK.
5
Wellcome Trust Sanger Institute, Hinxton, UK.
6
Bio-protection Research Centre, University of Canterbury, Christchurch, New Zealand.

Abstract

MOTIVATION:

Next generation sequencing technologies have provided us with a wealth of information on genetic variation, but predicting the functional significance of this variation is a difficult task. While many comparative genomics studies have focused on gene flux and large scale changes, relatively little attention has been paid to quantifying the effects of single nucleotide polymorphisms and indels on protein function, particularly in bacterial genomics.

RESULTS:

We present a hidden Markov model based approach we call delta-bitscore (DBS) for identifying orthologous proteins that have diverged at the amino acid sequence level in a way that is likely to impact biological function. We benchmark this approach with several widely used datasets and apply it to a proof-of-concept study of orthologous proteomes in an investigation of host adaptation in Salmonella enterica. We highlight the value of the method in identifying functional divergence of genes, and suggest that this tool may be a better approach than the commonly used dN/dS metric for identifying functionally significant genetic changes occurring in recently diverged organisms.

AVAILABILITY AND IMPLEMENTATION:

A program implementing DBS for pairwise genome comparisons is freely available at: https://github.com/UCanCompBio/deltaBS

CONTACT:

nicole.wheeler@pg.canterbury.ac.nz or lars.barquist@uni-wuerzburg.de

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

PMID:
27503221
PMCID:
PMC5181535
DOI:
10.1093/bioinformatics/btw518
[Indexed for MEDLINE]
Free PMC Article
13.
Curr Protoc Bioinformatics. 2016 Jun 20;54:12.13.1-12.13.25. doi: 10.1002/cpbi.4.

Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

Author information

1
Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany.
2
Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.
3
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
4
Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand.

Abstract

Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process.

KEYWORDS:

RNA; Rfam; alignment; conservation; covariance model; homology; ncRNA

PMID:
27322404
PMCID:
PMC5010141
DOI:
10.1002/cpbi.4
[Indexed for MEDLINE]
Free PMC Article
14.
Sci Rep. 2016 Jan 18;6:19233. doi: 10.1038/srep19233.

An evaluation of the accuracy and speed of metagenome analysis tools.

Author information

1
Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand.
2
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
3
Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark.

Abstract

Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming, and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition and functional capacity. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html.

PMID:
26778510
PMCID:
PMC4726098
DOI:
10.1038/srep19233
[Indexed for MEDLINE]
Free PMC Article
15.
Cytogenet Genome Res. 2015;145(2):78-179. doi: 10.1159/000430927. Epub 2015 Jul 14.

Third Report on Chicken Genes and Chromosomes 2015.

Author information

1
Department of Human Genetics, University of Würzburg, Würzburg, Germany.
PMID:
26282327
PMCID:
PMC5120589
DOI:
10.1159/000430927
[Indexed for MEDLINE]
Free PMC Article
16.
PLoS One. 2015 Mar 30;10(3):e0121797. doi: 10.1371/journal.pone.0121797. eCollection 2015.

Conservation and losses of non-coding RNAs in avian genomes.

Author information

1
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand.
2
Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany; ecSeq Bioinformatics, Brandvorwerkstr.43, D-04275 Leipzig, Germany.
3
European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK.
4
Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom.
5
Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany.
6
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
7
Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany; Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, D-04103 Leipzig, Germany; Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria; Center for RNA in Technology and Health, Univ. Copenhagen, Grønnegårdsvej 3, Frederiksberg C, Denmark; Santa Fe Institute, 1399 Hyde Park Road, Santa Fe NM 87501, USA; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Germany.

Abstract

Here we present the results of a large-scale bioinformatics annotation of non-coding RNA loci in 48 avian genomes. Our approach uses probabilistic models of hand-curated families from the Rfam database to infer conserved RNA families within each avian genome. We supplement these annotations with predictions from the tRNA annotation tool, tRNAscan-SE and microRNAs from miRBase. We identify 34 lncRNA-associated loci that are conserved between birds and mammals and validate 12 of these in chicken. We report several intriguing cases where a reported mammalian lncRNA, but not its function, is conserved. We also demonstrate extensive conservation of classical ncRNAs (e.g., tRNAs) and more recently discovered ncRNAs (e.g., snoRNAs and miRNAs) in birds. Furthermore, we describe numerous "losses" of several RNA families, and attribute these to either genuine loss, divergence or missing data. In particular, we show that many of these losses are due to the challenges associated with assembling avian microchromosomes. These combined results illustrate the utility of applying homology-based methods for annotating novel vertebrate genomes.

PMID:
25822729
PMCID:
PMC4378963
DOI:
10.1371/journal.pone.0121797
[Indexed for MEDLINE]
Free PMC Article
17.
Gigascience. 2015 Feb 12;4:4. doi: 10.1186/s13742-014-0038-1. eCollection 2015.

Phylogenomic analyses data of the avian phylogenomics project.

Author information

1
Department of Neurobiology, Howard Hughes Medical Institute and Duke University Medical Center, Durham, NC 27710 USA.
2
Department of Computer Science, The University of Texas at Austin, Austin, TX 78712 USA.
3
Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.
4
China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China ; College of Medicine and Forensics, Xi'an Jiaotong University, Xi'an, 710061 China ; Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
5
Department of Biology, New Mexico State University, Las Cruces, NM 88003 USA.
6
China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China ; Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
7
School of Biological Sciences, University of Sydney, Sydney, NSW 2006 Australia.
8
Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA 90095 USA ; Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803 USA.
9
CNRS UMR 5554, Institut des Sciences de l'Evolution de Montpellier, Université Montpellier II, Montpellier, France.
10
Department of Evolutionary Biology, Uppsala University, SE-752 36 Uppsala, Sweden.
11
Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
12
Department of Biology, New Mexico State University, Las Cruces, NM 88003 USA ; Biodiversity and Biocomplexity Unit, Okinawa Institute of Science and Technology Onna-son, Okinawa, 904-0495 Japan.
13
Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, 30602 USA.
14
Department of Genomics and Genetics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG UK.
15
Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, MA USA.
16
Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany ; Institute of Theoretical Informatics, Department of Informatics, Karlsruhe Institute of Technology, D- 76131 Karlsruhe, Germany.
17
Department of Biochemistry & Biophysics, University of California, San Francisco, CA 94158 USA.
18
Department of Ornithology, American Museum of Natural History, New York, NY 10024 USA.
19
Department of Biology and Genetics Institute, University of Florida, Gainesville, FL 32611 USA.
20
China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China ; Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark ; Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, 21589 Saudi Arabia ; Macau University of Science and Technology, Avenida Wai long, Taipa, Macau, 999078 China ; Department of Medicine, University of Hong Kong, Hong Kong, Hong Kong.
21
Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark ; Trace and Environmental DNA Laboratory Department of Environment and Agriculture, Curtin University, Perth, WA 6102 Australia.
22
China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China ; Centre for Social Evolution, Department of Biology, Universitetsparken 15, University of Copenhagen, DK-2100 Copenhagen, Denmark.

Abstract

BACKGROUND:

Determining the evolutionary relationships among the major lineages of extant birds has been one of the biggest challenges in systematic biology. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders. We used these genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomic analyses.

FINDINGS:

Here we present the datasets associated with the phylogenomic analyses, which include sequence alignment files consisting of nucleotides, amino acids, indels, and transposable elements, as well as tree files containing gene trees and species trees. Inferring an accurate phylogeny required generating: 1) A well annotated data set across species based on genome synteny; 2) Alignments with unaligned or incorrectly overaligned sequences filtered out; and 3) Diverse data sets, including genes and their inferred trees, indels, and transposable elements. Our total evidence nucleotide tree (TENT) data set (consisting of exons, introns, and UCEs) gave what we consider our most reliable species tree when using the concatenation-based ExaML algorithm or when using statistical binning with the coalescence-based MP-EST algorithm (which we refer to as MP-EST*). Other data sets, such as the coding sequence of some exons, revealed other properties of genome evolution, namely convergence.

CONCLUSIONS:

The Avian Phylogenomics Project is the largest vertebrate phylogenomics project to date that we are aware of. The sequence, alignment, and tree data are expected to accelerate analyses in phylogenomics and other related areas.

KEYWORDS:

Avian genomes; Gene trees; Indels; Phylogenomics; Sequence alignments; Species tree; Transposable elements

PMID:
25741440
PMCID:
PMC4349222
DOI:
10.1186/s13742-014-0038-1
[Indexed for MEDLINE]
Free PMC Article
18.
Pac Symp Biocomput. 2015:330-41.

Crowdsourcing RNA structural alignments with an online computer game.

Author information

1
School of Computer Science, McGill University, Montreal, QC H3A 0E9, Canada. jeromew@cs.mcgill.ca.

Abstract

The annotation and classification of ncRNAs is essential to decipher molecular mechanisms of gene regulation in normal and disease states. A database such as Rfam maintains alignments, consensus secondary structures, and corresponding annotations for RNA families. Its primary purpose is the automated, accurate annotation of non-coding RNAs in genomic sequences. However, the alignment of RNAs is computationally challenging, and the data stored in this database are often subject to improvements. Here, we design and evaluate Ribo, a human-computing game that aims to improve the accuracy of RNA alignments already stored in Rfam. We demonstrate the potential of our techniques and discuss the feasibility of large scale collaborative annotation and classification of RNA families.

PMID:
25592593
[Indexed for MEDLINE]
Free full text
19.
Nucleic Acids Res. 2015 Jan;43(2):691-8. doi: 10.1093/nar/gku1327. Epub 2014 Dec 17.

Annotating RNA motifs in sequences and alignments.

Author information

1
School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand. paul.gardner@canterbury.ac.nz.
2
School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand.

Abstract

RNA performs a diverse array of important functions across all cellular life. These functions include important roles in translation, building translational machinery and maturing messenger RNA. More recent discoveries include the miRNAs and bacterial sRNAs that regulate gene expression, the thermosensors, riboswitches and other cis-regulatory elements that help prokaryotes sense their environment and eukaryotic piRNAs that suppress transposition. However, there can be a long period between the initial discovery of an RNA and determining its function. We present a bioinformatic approach to characterize RNA motifs, which are critical components of many RNA structure-function relationships. These motifs can, in some instances, provide researchers with functional hypotheses for uncharacterized RNAs. Moreover, we introduce a new profile-based database of RNA motifs, RMfam, and illustrate some applications for investigating the evolution and functional characterization of RNA. All the data and scripts associated with this work are available from: https://github.com/ppgardne/RMfam.

PMID:
25520192
PMCID:
PMC4333381
DOI:
10.1093/nar/gku1327
[Indexed for MEDLINE]
Free PMC Article
20.
Science. 2014 Dec 12;346(6215):1311-20. doi: 10.1126/science.1251385. Epub 2014 Dec 11.

Comparative genomics reveals insights into avian genome evolution and adaptation.

Author information

1
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China. Centre for Social Evolution, Department of Biology, Universitetsparken 15, University of Copenhagen, DK-2100 Copenhagen, Denmark. zhanggj@genomics.cn jarvis@neuro.duke.edu mtpgilbert@gmail.com wangj@genomics.cn.
2
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China. Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
3
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China.
4
Royal Veterinary College, University of London, London, UK.
5
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Republic of Korea. Cho and Kim Genomics, Seoul National University Research Park, Seoul 151-919, Republic of Korea.
6
School of Biological Sciences, University of Nebraska, Lincoln, NE 68588, USA.
7
Centro de Investigación en Ciencias del Mar y Limnología (CIMAR)/Centro Interdisciplinar de Investigação Marinha e Ambiental (CIIMAR), Universidade do Porto, Rua dos Bragas, 177, 4050-123 Porto, Portugal. Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal.
8
Department of Biological Sciences, University of South Carolina, Columbia, SC, USA.
9
Department of Biology and Molecular Biology, Montclair State University, Montclair, NJ 07043, USA.
10
Department of Animal Ecology, Uppsala University, Norbyvägen 18D, S-752 36 Uppsala, Sweden.
11
Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Biological Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW 2006, Australia. Program in Emerging Infectious Diseases, Duke-NUS Graduate Medical School, Singapore 169857, Singapore.
12
Department of Integrative Biology University of California, Berkeley, CA 94720, USA.
13
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China. College of Life Sciences, Wuhan University, Wuhan 430072, China.
14
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China. School of Bioscience and Bioengineering, South China University of Technology, Guangzhou 510006, China.
15
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China. BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, 518083, China.
16
Key Laboratory of Animal Models and Human Disease Mechanisms of Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Kunming, Yunnan 650223, China.
17
Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
18
Department of Neurobiology, Howard Hughes Medical Institute, Duke University Medical Center, Durham, NC 27710, USA.
19
Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK.
20
School of Biosciences, University of Kent, Canterbury CT2 7NJ, UK.
21
Centro de Investigación en Ciencias del Mar y Limnología (CIMAR)/Centro Interdisciplinar de Investigação Marinha e Ambiental (CIIMAR), Universidade do Porto, Rua dos Bragas, 177, 4050-123 Porto, Portugal. Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Portugal.
22
Department of Biology, University of California Riverside, Riverside, CA 92521, USA.
23
Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA.
24
Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile.
25
Department of Anatomy, Physiology and Biochemistry, Swedish University of Agricultural Sciences, Post Office Box 7011, S-750 07, Uppsala, Sweden.
26
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Republic of Korea. Cho and Kim Genomics, Seoul National University Research Park, Seoul 151-919, Republic of Korea. Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Republic of Korea.
27
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Republic of Korea.
28
Cho and Kim Genomics, Seoul National University Research Park, Seoul 151-919, Republic of Korea.
29
State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing 100094, China.
30
State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing 100094, China. College of Animal Science and Technology, China Agricultural University, Beijing 100094, China.
31
Organisms and Environment Division, Cardiff School of Biosciences, Cardiff University, Cardiff CF10 3AX, Wales, UK.
32
Organisms and Environment Division, Cardiff School of Biosciences, Cardiff University, Cardiff CF10 3AX, Wales, UK. Key Lab of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101 China.
33
International Wildlife Consultants, Carmarthen SA33 5YL, Wales, UK.
34
Centre for Zoo and Wild Animal Health, Copenhagen Zoo, Roskildevej 38, DK-2000 Frederiksberg, Denmark.
35
Department of Ecology and Evolutionary Biology, Tulane University, New Orleans, LA, USA. Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA.
36
The Genome Institute at Washington University, St. Louis, MO 63108, USA.
37
College of Medicine and Forensics, Xi'an Jiaotong University, Xi'an, 710061, China.
38
Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA.
39
Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA.
40
Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg, Russia. Nova Southeastern University Oceanographic Center 8000 N Ocean Drive, Dania, FL 33004, USA.
41
Smithsonian Conservation Biology Institute, National Zoological Park, 1500 Remount Road, Front Royal, VA 22630, USA.
42
Genetics Division, San Diego Zoo Institute for Conservation Research, 15600 San Pasqual Valley Road, Escondido, CA 92027, USA.
43
Department of Vertebrate Zoology, MRC-116, National Museum of Natural History, Smithsonian Institution, Post Office Box 37012, Washington, DC 20013-7012, USA. Center for Macroecology, Evolution and Climate, the Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen O, Denmark.
44
Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing 100101, China. Swedish Species Information Centre, Swedish University of Agricultural Sciences, Box 7007, SE-750 07 Uppsala, Sweden.
45
Center for Macroecology, Evolution and Climate, the Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen O, Denmark.
46
Department of Biochemistry & Biophysics, University of California, San Francisco, CA 94158, USA.
47
Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA.
48
Department of Biology and Genetics Institute, University of Florida, Gainesville, FL 32611, USA.
49
Center for Macroecology, Evolution and Climate, the Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen O, Denmark. Imperial College London, Grand Challenges in Ecosystems and the Environment Initiative, Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK.
50
Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, The Roslin Institute Building, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK.
51
Department of Biology, New Mexico State University, Box 30001 MSC 3AF, Las Cruces, NM 88003, USA.
52
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China. Macau University of Science and Technology, Avenida Wai long, Taipa, Macau 999078, China.
53
Department of Neurobiology, Howard Hughes Medical Institute, Duke University Medical Center, Durham, NC 27710, USA. zhanggj@genomics.cn jarvis@neuro.duke.edu mtpgilbert@gmail.com wangj@genomics.cn.
54
Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark. Trace and Environmental DNA Laboratory, Department of Environment and Agriculture, Curtin University, Perth, Western Australia, 6102, Australia. zhanggj@genomics.cn jarvis@neuro.duke.edu mtpgilbert@gmail.com wangj@genomics.cn.
55
China National GeneBank, Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, 518083, China. Macau University of Science and Technology, Avenida Wai long, Taipa, Macau 999078, China. Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark. Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah 21589, Saudi Arabia. Department of Medicine, University of Hong Kong, Hong Kong. zhanggj@genomics.cn jarvis@neuro.duke.edu mtpgilbert@gmail.com wangj@genomics.cn.

Abstract

Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits.

PMID:
25504712
PMCID:
PMC4390078
DOI:
10.1126/science.1251385
[Indexed for MEDLINE]
Free PMC Article
