BMC Genomics. 2018 Jan 19;19(Suppl 1):36. doi: 10.1186/s12864-017-4337-7.

PGAP-X: extension on pan-genome analysis pipeline.

Zhao Y1,2, Sun C1,2, Zhao D1,2, Zhang Y1,2, You Y3, Jia X1,2, Yang J1,2, Wang L1,2, Wang J1,2, Fu H3, Kang Y1, Chen F1, Yu J1, Wu J4,5, Xiao J6,7,8,9.

Author information

1. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
2. University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China.
3. Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, People's Republic of China.
4. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China. wujy@big.ac.cn.
5. Beijing Institute of Genomics, Chinese Academy of Sciences, NO. 1 Beichen West Road, Chaoyang District, Beijing, 100101, People's Republic of China. wujy@big.ac.cn.
6. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China. xiaojingfa@big.ac.cn.
7. Big Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China. xiaojingfa@big.ac.cn.
8. University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China. xiaojingfa@big.ac.cn.
9. Beijing Institute of Genomics, Chinese Academy of Sciences, NO. 1 Beichen West Road, Chaoyang District, Beijing, 100101, People's Republic of China. xiaojingfa@big.ac.cn.

Abstract

BACKGROUND:

Since PGAP (pan-genome analysis pipeline) was published in 2012, it has been widely employed in bacterial genomics research. Although PGAP integrates several modules for pan-genome analysis, interpreting and visualizing its results properly and effectively remains a challenge.

RESULT:

To better present bacterial genomic characteristics, a novel cross-platform software package named PGAP-X was developed. Four data analysis modules were developed and integrated: whole-genome sequence alignment, orthologous gene clustering, pan-genome profile analysis, and genetic variant analysis. The results of these analyses can be visualized directly in PGAP-X. The visualization modules in PGAP-X cover comparison of genome structure, gene distribution by conservation, the pan-genome profile curve, and variation in genic and genomic regions. In addition, result data produced by other programs with similar functions can be imported into PGAP-X for further analysis and visualization. To test the performance of PGAP-X, we comprehensively analyzed 14 Streptococcus pneumoniae strains and 14 Chlamydia trachomatis strains. The results show that S. pneumoniae strains have higher diversity in genome structure and gene content than C. trachomatis strains. In addition, S. pneumoniae strains might have undergone many evolutionary events, such as genomic rearrangements, frequent horizontal gene transfer, homologous recombination, and other evolutionary processes.
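As an illustration of what a pan-genome profile curve summarizes, the following is a minimal sketch and not PGAP-X's own implementation. It assumes orthologous gene clustering has already produced, for each strain, the set of cluster IDs present in that strain. The function name pan_genome_profile, its parameters, and the toy strain and cluster names are all hypothetical. For each number of genomes k, it averages the pan-genome size (union of clusters) and core-genome size (intersection of clusters) over randomly sampled strain orderings.

```python
import random
from statistics import mean


def pan_genome_profile(gene_sets, n_orders=50, seed=0):
    """Average pan- and core-genome sizes as strains are added one by one.

    gene_sets: dict mapping strain name -> set of ortholog cluster IDs
               (hypothetical input format, assumed for this sketch).
    Returns (pan_sizes, core_sizes); index k-1 holds the mean size
    after k strains, averaged over n_orders random strain orderings.
    """
    rng = random.Random(seed)
    strains = sorted(gene_sets)
    n = len(strains)
    pan_runs = [[] for _ in range(n)]
    core_runs = [[] for _ in range(n)]

    for _ in range(n_orders):
        order = strains[:]
        rng.shuffle(order)
        pan = set()      # clusters seen in at least one strain so far
        core = None      # clusters seen in every strain so far
        for k, strain in enumerate(order):
            genes = gene_sets[strain]
            pan |= genes
            core = set(genes) if core is None else core & genes
            pan_runs[k].append(len(pan))
            core_runs[k].append(len(core))

    return [mean(r) for r in pan_runs], [mean(r) for r in core_runs]


if __name__ == "__main__":
    # Toy data: three made-up strains sharing a single cluster, g1.
    toy = {
        "strainA": {"g1", "g2", "g3"},
        "strainB": {"g1", "g2", "g4"},
        "strainC": {"g1", "g5"},
    }
    pan, core = pan_genome_profile(toy)
    print("pan-genome sizes: ", pan)
    print("core-genome sizes:", core)
```

In such a curve, a pan-genome size that keeps growing as strains are added, together with a shrinking core, is the pattern usually read as high gene-content diversity, which is the kind of contrast the abstract reports between the two species.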

CONCLUSION:

In brief, PGAP-X directly presents the characteristics of bacterial genomic diversity through several visualization methods, which can help users intuitively understand the dynamics and evolution of bacterial genomes. The source code and pre-compiled executable programs are freely available from http://pgapx.ybzhao.com.

KEYWORDS:

Genetic variation; Genome visualization; Pan-genomics

PMID: 29363431
PMCID: PMC5780747
DOI: 10.1186/s12864-017-4337-7
[Indexed for MEDLINE]
Free PMC Article
