Bioinformatics. Jan 2013; 29(1): 122–123.
Published online Oct 8, 2012. doi:  10.1093/bioinformatics/bts567
PMCID: PMC3530914

MGAviewer: a desktop visualization tool for analysis of metagenomics alignment data

Abstract

Summary: Numerous metagenomics projects have produced tremendous amounts of sequencing data. Aligning these sequences to reference genomes is an essential analysis in metagenomics studies. Large-scale alignment data call for intuitive and efficient visualization tools. However, current tools such as the various genome browsers are highly specialized for intraspecies mapping results. They are not suitable for alignment data in metagenomics, which are often interspecies alignments. We have developed a web browser-based desktop application for interactively visualizing alignment data of metagenomic sequences. This viewer is easy to use on all computer systems with modern web browsers and requires no software installation.

Availability: http://weizhongli-lab.org/mgaviewer

Contact: liwz@sdsc.edu

1 INTRODUCTION

Advances in Next Generation Sequencing technologies (Mardis, 2011) have prompted waves of metagenomic projects that study microbiomes in different environments, such as the ocean (Rusch et al., 2007) and the human body (Qin et al., 2010). An essential step in metagenomic data analysis is to align the sequencing reads against the available microbial genomes.

Visualization is an intuitive way to analyze large-scale alignment data in genomic studies. There are many visualization tools available. Some are web browser-based, such as the UCSC genome browser (Dreszer et al., 2012), LookSeq (Manske and Kwiatkowski, 2009) and JBrowse (Skinner et al., 2009). Others are standalone programs, such as Tablet (Milne et al., 2010), GenomeView (Abeel et al., 2012), MapView (Bao et al., 2009), IGB (Nicol et al., 2009), IGV (Robinson et al., 2011) and SamScope (Popendorf and Sakakibara, 2012).

However, these sophisticated visualization tools are specialized for intraspecies alignment results (i.e. query and reference are from the same species). They are not suitable for interspecies alignments from metagenomic datasets, where query and reference can be from different species. There are fundamental differences between intraspecies and interspecies alignments. The former involve only one reference genome and represent features such as single nucleotide polymorphisms and alternative splicing. The latter involve multiple (often 10³) reference microbial genomes. To visualize interspecies alignments, a tool needs to show the wide range of alignment similarities (100% to as low as 50% for DNA and 30% for proteins) and to handle thousands of reference genomes.

The Global Ocean Sampling study (Rusch et al., 2007) first introduced fragment recruitment plots to illustrate the metagenomic alignment data. However, its underlying software is not available to the public.

Here, we present MetaGenomic Alignment Viewer (MGAviewer), a platform-independent web browser-based tool for visualizing alignment data. It does not rely on a web server or a relational database for image generation and data retrieval. It can be used simply as a standalone desktop program to analyze local data. It can also be included in a web server like other web-based genome browsers.

2 METHODS

The key component of this tool is a graphical interface with a 2D map that displays large numbers of alignments between metagenomic sequences from one or more samples and a reference genome (Fig. 1). Users can explore alignment data by interactively operating the 2D map, much as they would in Google Maps.

Fig. 1.
Screenshots of the MGAviewer visualization interface

MGAviewer is an HTML5 web application. It works in all major modern browsers, including Chrome, Firefox, Safari and Internet Explorer 9, without installing any extra software or plugins. It uses jQuery (http://jquery.com/) as the base JavaScript library and, on top of that, a customized version of the jQuery plotting plugin Flot (http://code.google.com/p/flot/). We extended Flot to support drawing of fragments and annotation features. On top of these, a site-specific JavaScript file (‘site.js’) is responsible for setting up plot parameters, placing and responding to additional controls and fetching data.

MGAviewer fetches alignment data from a user’s local computer or from a web server on demand via AJAX. It then draws the plot in an HTML5 Canvas element. Every time a user interaction event is triggered, e.g. zooming in/out, panning or resizing of the plot, the plot image is simply redrawn using data already loaded, unless additional data are required. This is in contrast to many other web-based genome browsers, where plot images are generated on the server side and then retrieved by the browser on demand; in MGAviewer a plot is drawn locally in the browser. Most user operations therefore generate no network traffic, which dramatically improves the responsiveness of user interactions, especially on slow networks.
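The redraw-from-memory behaviour described above can be illustrated with a minimal sketch. It is written in Python for brevity (MGAviewer itself is JavaScript), and the class name AlignmentCache, the fetch callback and the rule for deciding when additional data are required are all assumptions made for illustration; the paper does not describe how MGAviewer makes that decision.

```python
class AlignmentCache:
    """Hold alignments for one loaded coordinate window (hypothetical helper).

    Data are re-fetched only when the requested view extends beyond the
    window already in memory; otherwise the view is served locally.
    """

    def __init__(self, fetch):
        self.fetch = fetch        # callable(start, end) -> list of alignment dicts
        self.loaded = None        # (start, end) of the window held in memory
        self.alignments = []
        self.fetch_count = 0      # number of simulated network requests

    def visible(self, view_start, view_end):
        """Return the alignments overlapping the current view."""
        outside = (
            self.loaded is None
            or view_start < self.loaded[0]
            or view_end > self.loaded[1]
        )
        if outside:
            # Assumed rule: fetch exactly the requested window.
            self.alignments = self.fetch(view_start, view_end)
            self.loaded = (view_start, view_end)
            self.fetch_count += 1
        return [
            a for a in self.alignments
            if a["end"] >= view_start and a["start"] <= view_end
        ]


# Stand-in for an AJAX request: serves alignments from an in-memory list.
ALL_ALIGNMENTS = [
    {"name": "read1", "start": 1000, "end": 1200, "identity": 98.5},
    {"name": "read2", "start": 2500, "end": 2650, "identity": 72.0},
    {"name": "read3", "start": 11000, "end": 11100, "identity": 85.0},
]


def fake_fetch(start, end):
    return [a for a in ALL_ALIGNMENTS if a["end"] >= start and a["start"] <= end]


cache = AlignmentCache(fake_fetch)
initial = cache.visible(0, 10000)     # first view: triggers one fetch
zoomed = cache.visible(2000, 3000)    # zoom in: served from memory
panned = cache.visible(9000, 12000)   # pan past the loaded window: fetches again
print(cache.fetch_count)
```

In this sketch, zooming inside the loaded window costs no request, while panning beyond it does; a real implementation might instead fetch a padded window to make small pans free as well.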

Alignment data are stored in files formatted as JSON (a lightweight data-interchange format used by JavaScript), which contain alignment details including coordinates, sequence identity, name, e-value, etc. We provide scripts to generate JSON files from raw alignment results produced by BLAST (Altschul et al., 1997) and FR-HIT (Niu et al., 2011), and also from alignments in SAM format. These scripts require the BioPython package. Converters for other programs such as BLAT (Kent, 2002) can be easily implemented.
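As an illustration of what such a converter might involve, the sketch below turns BLAST tabular output (-outfmt 6, default columns) into JSON records. It uses only the Python standard library rather than BioPython, and the output keys (name, reference, start, end, strand, identity, evalue) are hypothetical; they are not MGAviewer's actual JSON schema, which is documented in the user's guide.

```python
import json

# Column order of BLAST tabular output (-outfmt 6) with default fields.
BLAST_TAB_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def blast_tab_to_records(lines):
    """Convert BLAST tabular lines into a list of alignment dicts.

    The output keys are illustrative only, not MGAviewer's real schema.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(BLAST_TAB_FIELDS, line.split("\t")))
        s_start, s_end = int(row["sstart"]), int(row["send"])
        records.append({
            "name": row["qseqid"],
            "reference": row["sseqid"],
            # BLAST reports sstart > send for minus-strand hits; normalise
            # so that start <= end on the reference coordinate axis.
            "start": min(s_start, s_end),
            "end": max(s_start, s_end),
            "strand": "+" if s_start <= s_end else "-",
            "identity": float(row["pident"]),
            "evalue": float(row["evalue"]),
        })
    return records


# Two made-up hits; the second is on the minus strand.
example = [
    "read1\tgenomeA\t98.50\t200\t3\t0\t1\t200\t1000\t1199\t1e-100\t350",
    "read2\tgenomeA\t72.00\t150\t42\t0\t1\t150\t5150\t5001\t2e-20\t90.5",
]
records = blast_tab_to_records(example)
print(json.dumps(records, indent=2))
```

Normalising the reference coordinates at conversion time means the plotting code never has to handle reversed intervals, which is one reasonable design choice among several.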

MGAviewer can be used as standalone software by simply opening the directory that contains these JSON files, MGAviewer scripts and a master HTML file (see user’s guide for details). It can also be hosted on a web server. The plot itself can be embedded in any webpage.

3 RESULTS

MGAviewer has an interface for users to select one or more metagenomic samples and a reference from a list of reference genomes to generate the plot. Screenshots of MGAviewer are shown in Figure 1. The plot shows alignments from eight metagenomic samples to a reference genome. The x-axis is the genome coordinate, and the y-axis is alignment identity (%). Alignments are coloured by sample and are represented as points or lines depending on zoom level. The bottom of the plot shows genes of the reference genome, and the top shows the genome coverage for each sample. Icons at the bottom left and bottom right corners are for zoom, resize and reset. Users can also zoom or pan the map with the mouse. The inset circular images are zoomed views of the plot.
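Two of the plot elements above lend themselves to a short sketch: choosing between a point and a line for an alignment at a given zoom level, and summarising genome coverage for a sample. The functions below are assumptions made for illustration; the paper does not state the rule MGAviewer uses to switch glyphs or how it computes coverage, and the 2-pixel threshold is arbitrary.

```python
def glyph_for(alignment, view_start, view_end, plot_width_px, min_line_px=2.0):
    """Pick a glyph for one alignment at the current zoom level.

    An alignment is drawn as a line only if it would span at least
    min_line_px pixels on screen (hypothetical rule and threshold).
    """
    bases_per_px = (view_end - view_start) / plot_width_px
    length_px = (alignment["end"] - alignment["start"] + 1) / bases_per_px
    return "line" if length_px >= min_line_px else "point"


def covered_fraction(alignments, genome_length):
    """Fraction of the reference covered by at least one alignment.

    Overlapping or adjacent intervals (1-based, inclusive) are merged
    before counting, so each base is counted once.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((a["start"], a["end"]) for a in alignments):
        if cur_end is None or start > cur_end + 1:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / genome_length


sample = [
    {"start": 1, "end": 100},
    {"start": 51, "end": 150},
    {"start": 301, "end": 400},
]
read = {"start": 1, "end": 100}

zoomed_out = glyph_for(read, 0, 1_000_000, plot_width_px=1000)  # 1000 bases per pixel
zoomed_in = glyph_for(read, 0, 10_000, plot_width_px=1000)      # 10 bases per pixel
fraction = covered_fraction(sample, genome_length=1000)
print(zoomed_out, zoomed_in, fraction)
```

A coverage track like the one in Figure 1 would more likely report a value per genome window than a single fraction; the same interval merging could be applied per window.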

We tested MGAviewer on 1.5 million alignment datasets between >600 metagenomic samples from CAMERA (Sun et al., 2011) and >2500 genomes from NCBI. MGAviewer provides real-time visualization for almost all of these datasets; the exceptions are a few hundred very large datasets, which need several extra seconds for data loading and plotting. MGAviewer has already been adopted by the CAMERA project in its alignment resources, which will be described in a separate publication. MGAviewer can be used to analyze alignment data not only for prokaryotic species but also for viruses and small eukaryotic organisms.

Funding: This study was supported by Award R01HG005978 from the National Human Genome Research Institute and the Gordon and Betty Moore Foundation.

Conflict of Interest: none declared.

References

  • Abeel T, et al. GenomeView: a next-generation genome browser. Nucleic Acids Res. 2012;40:e12.
  • Altschul SF, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
  • Bao H, et al. MapView: visualization of short reads alignment on a desktop computer. Bioinformatics. 2009;25:1554–1555.
  • Dreszer TR, et al. The UCSC genome browser database: extensions and updates 2011. Nucleic Acids Res. 2012;40:D918–D923.
  • Kent WJ. BLAT–the BLAST-like alignment tool. Genome Res. 2002;12:656–664.
  • Manske HM, Kwiatkowski DP. LookSeq: a browser-based viewer for deep sequencing data. Genome Res. 2009;19:2125–2132.
  • Mardis ER. A decade’s perspective on DNA sequencing technology. Nature. 2011;470:198–203.
  • Milne I, et al. Tablet–next generation sequence assembly visualization. Bioinformatics. 2010;26:401–402.
  • Nicol JW, et al. The integrated genome browser: free software for distribution and exploration of genome-scale datasets. Bioinformatics. 2009;25:2730–2731.
  • Niu B, et al. FR-HIT, a very fast program to recruit metagenomic reads to homologous reference genomes. Bioinformatics. 2011;27:1704–1705.
  • Popendorf K, Sakakibara Y. SAMSCOPE: an openGL-based real-time interactive scale-free SAM viewer. Bioinformatics. 2012;28:1276–1277.
  • Qin J, et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010;464:59–65.
  • Robinson JT, et al. Integrative genomics viewer. Nat. Biotechnol. 2011;29:24–26.
  • Rusch DB, et al. The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biol. 2007;5:e77.
  • Skinner ME, et al. JBrowse: a next-generation genome browser. Genome Res. 2009;19:1630–1638.
  • Sun S, et al. Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource. Nucleic Acids Res. 2011;39:D546–D551.

Articles from Bioinformatics are provided here courtesy of Oxford University Press