PLoS One. 2014 Apr 11;9(4):e94077. doi: 10.1371/journal.pone.0094077. eCollection 2014.

Dissecting a hidden gene duplication: the Arabidopsis thaliana SEC10 locus.

Author information

1. Institute of Experimental Botany, Academy of Sciences of the Czech Republic, Prague, Czech Republic; Department of Experimental Plant Biology, Faculty of Science, Charles University in Prague, Prague, Czech Republic.
2. Department of Experimental Plant Biology, Faculty of Science, Charles University in Prague, Prague, Czech Republic.
3. Department of Botany, Faculty of Science, Charles University in Prague, Prague, Czech Republic; Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic.
4. Department of Botany and Plant Pathology and Center for Genome Research and Biocomputing, Oregon State University, Corvallis, Oregon, United States of America.
5. Institute of Experimental Botany, Academy of Sciences of the Czech Republic, Prague, Czech Republic.

Abstract

Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.

PMID: 24728280
PMCID: PMC3984084
DOI: 10.1371/journal.pone.0094077
[Indexed for MEDLINE]
Free PMC Article
