Copyright © 2004 Oxford University Press

PlasMapper: a web server for drawing and auto-annotating plasmid maps

Departments of Biological Sciences and Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada

*To whom correspondence should be addressed. Tel: +1 780 492 0383; Fax: +1 780 492 5305; Email: david.wishart/at/ualberta.ca

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated.

© 2004, the authors

Received February 15, 2004; Revised March 30, 2004; Accepted March 30, 2004.

Abstract

PlasMapper is a comprehensive web server that automatically generates and annotates high-quality circular plasmid maps. Taking only the plasmid/vector DNA sequence as input, PlasMapper uses sequence pattern matching and BLAST alignment to automatically identify and label common promoters, terminators, cloning sites, restriction sites, reporter genes, affinity tags, selectable marker genes, replication origins and open reading frames. PlasMapper then presents the identified features in textual form and as high-resolution, multicolored graphical output. The appearance and contents of the output can be customized in numerous ways using several supplied options. Further, PlasMapper images can be rendered in both rasterized (PNG and JPG) and vector graphics (SVG) formats to accommodate a variety of user needs or preferences.
The images and textual output are of sufficient quality that they may be used directly in publications or presentations. The PlasMapper web server is freely accessible at http://wishart.biology.ualberta.ca/PlasMapper.

INTRODUCTION

Plasmid map generation is one of the oldest and most frequently performed operations in bioinformatics. Indeed, almost every practicing molecular biologist has probably worked with or generated a plasmid map to guide a cloning or plasmid manipulation process. Because of the size and complexity of plasmid molecules, computer-generated maps are essential for identifying, locating and analyzing key regions in a vector sequence. As early as the 1980s, standalone computer programs were being described that supported the presentation and manipulation of plasmid maps on specific platforms and computer operating systems (1–7). Many of these early freeware packages have since been replaced by more sophisticated and far more user-friendly commercial packages such as SimVector (Premier BioSoft), GeneTool (BioTools), VectorNTI (Informax, Invitrogen), MacVector (Accelrys), DNA Strider and LaserGene (DNAStar). Currently there are remarkably few freeware plasmid mapping programs still available, although pDRAW32 (AcaClone) is one example of an installable standalone package that supports plasmid mapping.

With the growing trend toward using freeware and freely available web tools in bioinformatics, the continuing dependence on expensive commercial packages to perform a single operation (plasmid mapping) seems questionable. Furthermore, with the increasing diversity of operating systems seen in many laboratories and the expanding level of inter-laboratory and international collaboration, we believe that a platform-independent solution to plasmid mapping is needed. One obvious solution is a plasmid mapping web server.
Here we describe a web server, called PlasMapper, which accepts FASTA-formatted DNA sequences and generates a fully labeled and annotated plasmid map (both graphical and textual output) with essentially no further user input. A central innovation in PlasMapper is its capacity to automatically identify and label the plasmid control sequences found in both eukaryotic and prokaryotic vectors using a large database of common plasmid sequences and common plasmid subsequences (replication origins, promoters, terminators, marker genes, etc.). PlasMapper supports a wide range of textual and visual display options that allow users to easily customize the image or textual output. It generates plasmid maps of sufficient quality and resolution that they may be readily used in publications or presentations. PlasMapper is specifically designed to make plasmid annotation simple and to facilitate the sharing and dissemination of plasmid images and plasmid data across all computer platforms.

PROGRAM DESCRIPTION

PlasMapper is composed of three parts: a front-end web interface (generated using Java), a back-end for rendering and sequence matching (written in Java and C) and a Feature Site Database (FSD) consisting of 336 DNA sequence motifs (promoters, terminators, selectable markers, etc.; Table 1) and 457 restriction enzymes from the Restriction Enzyme Database (8). The FSD was compiled from an extensive survey of commercially and publicly available plasmids. PlasMapper accepts FASTA DNA sequences up to 20 000 bases in length as input to its sequence window and checks both the length and validity of the DNA sequence before conducting any further analysis. To facilitate the generation of maps for commonly used plasmids, the PlasMapper website also maintains a growing repository of 288 vector sequences available from various vendors and suppliers.
These sequences may be selected and automatically uploaded into the sequence window using the ‘Plasmid Library’ button. To facilitate tracking, editing and ‘virtual cloning’, the sequence text box has a ‘(Re)Format’ button that allows the raw FASTA sequence file to be block formatted and numbered. Insertions (cloned genes), deletions, mutations, edits and corrections can all be made readily in this specially formatted view. As soon as any edits are completed, the user can press the ‘(Re)Format’ button to reformat and renumber the modified sequence. After a sequence has been pasted, edited, selected or reformatted, the ‘Submit’ button can be used to begin the map generation process.

[Figure 1]
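The input checks and the ‘(Re)Format’ view described above can be illustrated with a short Python sketch. It is not PlasMapper's code (the server is written in Java and C); the function names, the restriction to the letters A, C, G and T, and the layout of 10-base blocks with 60 bases per line are assumptions made for illustration. Only the 20 000-base limit comes from the text.

```python
import re

MAX_BASES = 20_000          # length limit stated in the text
VALID_BASES = set("ACGT")   # assumption: only unambiguous bases are accepted


def read_fasta(text: str) -> str:
    """Return the sequence from a single-record FASTA string, upper-cased."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    body = [ln for ln in lines if ln and not ln.startswith(">")]
    # Drop digits and whitespace so a block-formatted, numbered view can be re-read.
    return re.sub(r"[\s\d]", "", "".join(body)).upper()


def validate(seq: str) -> None:
    """Raise ValueError if the sequence is empty, too long or has invalid letters."""
    if not seq:
        raise ValueError("empty sequence")
    if len(seq) > MAX_BASES:
        raise ValueError(f"sequence has {len(seq)} bases; limit is {MAX_BASES}")
    bad = sorted(set(seq) - VALID_BASES)
    if bad:
        raise ValueError(f"invalid characters: {''.join(bad)}")


def block_format(seq: str, block: int = 10, per_line: int = 60) -> str:
    """Prefix each line with the position of its first base and split it into blocks."""
    out = []
    for start in range(0, len(seq), per_line):
        chunk = seq[start:start + per_line]
        blocks = " ".join(chunk[i:i + block] for i in range(0, len(chunk), block))
        out.append(f"{start + 1:>6} {blocks}")
    return "\n".join(out)
```

Because `read_fasta` discards digits and whitespace, an edited block-formatted view can be passed back through `read_fasta` and `block_format` to renumber it, which mirrors the edit-then-reformat cycle described in the text.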
The feature identification and image rendering portion of PlasMapper consists of four separate programs: BLAST, FIND-SITE, FORMAT and CGView. BLAST (9) is used to identify portions of the supplied sequence that match promoters, terminators, selectable markers, reporter genes and replication origins stored in the FSD. The BLAST program parameters have been optimized for PlasMapper by testing more than 50 randomly selected commercial plasmids to ensure that the resulting annotations completely matched those reported by the vendors. The FIND-SITE program, which uses several components of BioJava (http://www.biojava.org/), is used to identify open reading frames (ORFs) and type II restriction enzyme cutting sites. The third program (FORMAT), also written in Java, generates formatted text output which displays the plasmid sequence (60 bases per line, numbered, courier font) with the requested annotations displayed in stacked, non-overlapping positions above each sequence line.

The final program (CGView) uses the Java2D API to convert the results obtained from BLAST and FIND-SITE into a graphical map. Specifically, CGView accepts sequence feature information (feature name, feature type, position and strand) and generates a collection of two-dimensional objects. Each object's shape (an arrow or an arc), color, opacity and position are adjusted according to the attributes of the feature represented. After the objects are drawn, the feature names are placed on the map using an iterative collision detection and shifting process that results in a visually pleasing label arrangement and no label overlap. Java classes included as part of the Java API are used to convert the map into JPG and PNG images. SVG output is generated using the Batik SVG Toolkit (http://xml.apache.org/batik/). Because SVG is a vector format, SVG images can be scaled without any noticeable degradation.
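As an illustration of the restriction-site search attributed to FIND-SITE above, the following Python sketch scans a circular sequence for a few well-known palindromic type II recognition sequences. It is a simplified stand-in and not the BioJava-based implementation: the function name `find_sites` and the three-enzyme table are assumptions for this example, whereas the real FSD holds 457 enzymes.

```python
# Recognition sequences of three well-known type II enzymes. All three are
# palindromic, so scanning one strand is enough to find every site.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(seq: str, sites: dict = SITES) -> dict:
    """Return 1-based start positions of each recognition site on a circular sequence."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        if len(motif) <= len(seq):
            # Append the first len(motif) - 1 bases so that sites spanning the
            # origin of the circular plasmid are also found.
            extended = seq + seq[:len(motif) - 1]
            i = extended.find(motif)
            while i != -1:
                positions.append(i + 1)
                i = extended.find(motif, i + 1)
        hits[name] = positions
    return hits
```

For example, in the 18-base circular sequence `TTCAAAGGATCCAAAGAA` the EcoRI site starts at base 16 and wraps around the origin, which a purely linear scan would miss.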
Most web browsers can display SVG images using the freely available Adobe SVG Viewer plug-in (http://www.adobe.com/).

PROGRAM OUTPUT

PlasMapper generates both text and graphics output (JPG, PNG or SVG format). The default view is the graphic image, with a button to create the text view in another window.

[Figure 2]
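The text view places annotations in stacked, non-overlapping positions above each sequence line. The source does not say how FORMAT chooses the stacking, so the Python sketch below uses a common stand-in technique, greedy interval partitioning. The function name `assign_rows` and the `(name, start, end)` tuple layout are assumptions made for illustration.

```python
def assign_rows(features):
    """Greedily place (name, start, end) features into rows with no overlap.

    Coordinates are 1-based and inclusive. Returns a list of rows; each row is
    a list of features in order of start position.
    """
    rows = []       # rows[i] holds the features placed in row i
    row_ends = []   # row_ends[i] is the end of the last feature in row i
    for feat in sorted(features, key=lambda f: (f[1], f[2])):
        _, start, end = feat
        for i, last_end in enumerate(row_ends):
            if start > last_end:      # fits after the last feature in this row
                rows[i].append(feat)
                row_ends[i] = end
                break
        else:                         # overlaps every existing row: open a new one
            rows.append([feat])
            row_ends.append(end)
    return rows
```

Each returned row could then be printed on its own line above the sequence. Features that span more than one 60-base line would first need to be split at the line boundaries, a step this sketch leaves out.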
DISCUSSION

The PlasMapper server provides a convenient and easily accessible solution to plasmid annotation and drawing for users who normally depend on freeware or free web servers such as EMBOSS (10) or The Sequence Manipulation Suite (11). The simplicity and accessibility of PlasMapper should also make it a useful tool for teaching or training high school and university students in introductory molecular biology or genetics courses. Furthermore, we believe that the use of web-server technology should enable or encourage the sharing of plasmid data and images among geographically distant labs or between labs that normally use incompatible software and/or computer platforms.

In trying to make the PlasMapper interface as simple as possible, some sacrifices in flexibility had to be made. Some users may not want a specific label displayed or may dislike the default color scheme. Others may find certain rare restriction enzymes or gene features missing from the FSD. Likewise, the limited choice of plasmid circle or gene/marker widths may seem too restrictive. In many cases, these problems can be addressed by simply pasting the PlasMapper image into an image manipulation package to change the offending feature. The SVG format is helpful in this regard, since individual features and labels can be repositioned or modified using vector-graphics-capable software. For more specialized maps and annotations, commercial sequence analysis packages may be a more appropriate choice.

In summary, PlasMapper is a web server that permits the automated annotation and rendering of circular plasmids for both eukaryotic and prokaryotic vectors. It combines database-searching and pattern-matching techniques with a unique collection of plasmid-feature sequences to automatically generate publication-quality text and images. PlasMapper supports a wide range of display and formatting options and should make plasmid analysis and manipulation much simpler and far more accessible.
The PlasMapper web server is freely accessible at http://wishart.biology.ualberta.ca/PlasMapper.

ACKNOWLEDGEMENTS

Funding for this project was provided by the Protein Engineering Network of Centres of Excellence (PENCE Inc.) and Genome Prairie (a division of Genome Canada).

REFERENCES

1. Abremski,K. and Ward,D.F. (1986) Plasmid map: a microcomputer program for display and storage of plasmid data. Gene, 46, 127–130.
2. Filippone,E. and Lurquin,P.F. (1988) PROPLASM: an Apple Macintosh computer program for proportional plasmid map drawing. Biotechniques, 6, 574–575.
3. Liu,J.D. and Parkinson,J.S. (1989) A Macintosh program for drawing circular plasmid maps. Comput. Appl. Biosci., 5, 237–238.
4. Peterson,E.A. and Ward,D.F. (1990) CLONE 3: plasmid drawing and clone management software program for microcomputers. Biotechniques, 8, 690–693.
5. Dolz,R. (1994) GCG: drawing circular restriction maps. Methods Mol. Biol., 24, 35–46.
6. Reda,D. and Reda,A.C. (2000) Redasoft Plasmid 1.1: software for easy, efficient cloning and map drawing. Curr. Issues Mol. Biol., 2, 37–39.
7. Tsudzuki,T. (2000) A graphic tool for circular genome maps. Nucleic Acids Symp. Ser., 44, 189–190.
8. Roberts,R.J., Vincze,T., Posfai,J. and Macelis,D. (2003) REBASE: restriction enzymes and methyltransferases. Nucleic Acids Res., 31, 418–420.
9. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
10. Rice,P., Longden,I. and Bleasby,A. (2000) EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet., 16, 276–277.
11. Stothard,P. (2000) The Sequence Manipulation Suite: JavaScript programs for analyzing and formatting protein and DNA sequences. Biotechniques, 28, 1102–1104.