Mol Syst Biol. 2009; 5: 335.
Published online 2009 Dec 22. doi: 10.1038/msb.2009.92
PMCID: PMC2824493

Update on the Keio collection of Escherichia coli single-gene deletion mutants

The Keio collection (Baba et al, 2006) has been established as a set of single-gene deletion mutants of Escherichia coli K-12. Each mutant carries a precisely designed deletion that extends from the second codon through the seventh-to-last codon of the predicted ORF. Further information is available at http://sal.cs.purdue.edu:8097/GB7/index.jsp or http://ecoli.naist.jp/. Distribution is now handled by the National Institute of Genetics of Japan (http://www.shigen.nig.ac.jp/ecoli/pec/index.jsp). To date, more than 4 million samples have been distributed worldwide. As we described earlier (Baba et al, 2006), gene amplification during construction is likely to have led to a small number of mutants with genetic duplications.
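The deletion design can be illustrated with a minimal sketch. It assumes that the ORF is supplied as a list of sense codons with the stop codon excluded, so that the start codon and the last six codons are retained; the function name `keio_deleted_codons` and the example sequence are hypothetical and are not taken from the construction protocol.

```python
def keio_deleted_codons(codons):
    """Split sense codons (stop codon excluded) into retained and deleted parts.

    Assumption: the deletion runs from the second codon through the
    seventh-to-last codon, so the start codon and the last six codons remain.
    """
    if len(codons) < 8:
        raise ValueError("ORF too short for this deletion design")
    retained_head = codons[:1]   # start codon
    deleted = codons[1:-6]       # codon 2 through the seventh-to-last codon
    retained_tail = codons[-6:]  # last six codons
    return retained_head, deleted, retained_tail


# Hypothetical 10-codon ORF (stop codon not shown)
orf = ["ATG", "AAA", "GCT", "GGT", "CTG", "GAA", "ACC", "TTT", "CGT", "GAT"]
head, deleted, tail = keio_deleted_codons(orf)
print("retained:", head, tail)
print("deleted: ", deleted)
```

In this toy example the seventh-to-last codon is the fourth one, so codons 2 to 4 fall inside the deleted span.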

The design of the Keio deletions was based on annotations that are now outdated. Of 4288 ORFs targeted, mutants were obtained for 3985 (Baba et al, 2006). Re-annotation based on highly accurate sequencing of E. coli K-12 (Hayashi et al, 2006) changed many coding regions and brought the total number of ORFs to 4296, including pseudogenes (Riley et al, 2006) (Supplementary Table I). The recent E. coli K-12 MG1655 GenBank record (U00096, released in December 2008) has an additional 97 ORFs (exclusive of the ORFs in IS elements, Supplementary Table II) that were not targeted. Of these 4214 annotated ORFs, 4186 were targeted for deletion and 28 were not (Supplementary Table III), which resulted in the isolation of two independent mutants for 3864 targeted ORFs. No deletion was found for 299 ORFs, which are candidates for essential genes. Deletions were also isolated for 23 other ORFs; however, re-annotation led to re-classification of these ORFs as ‘split ORFs', because their coding regions are interrupted by an IS element or some other mutation (Supplementary Table IV).
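The ORF counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers quoted above; the variable names are ours and purely illustrative.

```python
# Counts quoted in the text
annotated_orfs = 4214
targeted = 4186
not_targeted = 28
two_independent_mutants = 3864
no_deletion_found = 299  # candidates for essential genes
split_orfs = 23          # re-classified as 'split ORFs'

# Targeted and non-targeted ORFs together make up the annotated set
assert targeted + not_targeted == annotated_orfs

# Every targeted ORF falls into exactly one of the three outcome classes
assert two_independent_mutants + no_deletion_found + split_orfs == targeted

print("ORF accounting is consistent")
```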

To identify mutants with partial duplications, we performed two sets of PCR reactions on both representatives of all 3864 mutants. In the first set, we tested for the presence of the targeted gene by using a pair of internal gene-specific primers (Figure 1A and B). With the parental strain E. coli K-12 BW25113, we were able to amplify 3803 ORFs, as indicated by the presence of PCR products of the expected sizes. For the other 61 ORFs, we used a pair of external primers that flanked the targeted gene, either because the initial PCR product was too short or because the internal primer pair failed to amplify fragments of the predicted sizes for the parental control strain. Results from testing 7728 strains (3864 ORFs) showed that the vast majority (96.1%, 7428/7728) are correct; results in Supplementary Table V show that one or both isolates are correct for 98.5% (3806/3864) of the Keio mutants (Figure 1C). As one isolate is correct for 177 ORFs for which the other isolate is ambiguous, no further tests were done with the other isolate of these mutants.
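The tallies for the first PCR screen can be reproduced as follows. This is a small arithmetic sketch using only the counts given above; the variable names are illustrative.

```python
# First PCR screen: counts quoted in the text
orfs_internal_primers = 3803  # amplified with internal gene-specific primers
orfs_external_primers = 61    # tested with external flanking primers instead
orfs_tested = orfs_internal_primers + orfs_external_primers

isolates_per_orf = 2          # two independent isolates per mutant
strains_tested = orfs_tested * isolates_per_orf

correct_strains = 7428
pct_correct = 100 * correct_strains / strains_tested

print(orfs_tested, strains_tested, f"{pct_correct:.1f}%")
```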

Figure 1
Identification of Keio collection mutants with partial duplications. (A) Primer design. The upper branch shows the expected structure of a single-gene mutant. The targeted ORF is replaced with the kanamycin resistance gene (kan). The lower branch shows ...

Mutants of the remaining 58 ORFs (33 with mixtures and 25 with duplications; Figure 1C) were tested in a second set of PCR reactions, which was carried out using external primers flanking the targeted gene (Figure 1A and B). A positive result in the first PCR test can arise not only from mutants with a partial duplication but also from ones that have been cross-contaminated from a nearby microplate well. Therefore, the second set of PCR tests was performed on three colonies after colony purification. In the second PCR test, colonies with the correct deletion or from a cross-contaminant mutant were expected to yield a single PCR product of length corresponding to the expected structure of the respective single-gene mutant or the structure of the targeted gene, respectively. In contrast, mutants with both the respective single-gene deletion and a genetic duplication were expected to yield both PCR products. In cases where the predicted PCR products for the deletion and wild-type structures were indistinguishable in size, the PCR products were digested with XbaI, which cuts within the kan (kanamycin resistance gene) replacement, before size separation by electrophoresis.

For 33 of the 58 ORFs, one or more colonies yielded a single PCR product of the size corresponding to the single-gene deletion, indicating that the wells for these mutants were cross-contaminated (Supplementary Table V). For the 25 other mutants, purified colonies consistently produced PCR fragments corresponding to the structures of both the single-gene deletion and the targeted gene, indicating that these mutants have partial duplications (Figure 1C and Table I). As mentioned above, our PCR tests also revealed 177 mutants for which we showed that only one isolate is correct. Further testing of these ambiguous mutants by our second PCR test revealed that most of them do not carry a partial duplication.
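The interpretation of the second PCR test can be summarized as a simple decision rule. The sketch below is an illustration of that logic only, not the procedure used to score the gels; the function names, labels, and the boolean band encoding are assumptions introduced here.

```python
DELETION = "single-gene deletion"
WILD_TYPE = "targeted gene only (cross-contaminant)"
DUPLICATION = "deletion plus targeted gene"
NO_PRODUCT = "no product"

CROSS_CONTAMINATED = "cross-contaminated well"
PARTIAL_DUPLICATION = "partial duplication"
UNRESOLVED = "unresolved"


def call_colony(has_deletion_band, has_wild_type_band):
    """Classify one purified colony from its external-primer PCR products."""
    if has_deletion_band and has_wild_type_band:
        return DUPLICATION
    if has_deletion_band:
        return DELETION
    if has_wild_type_band:
        return WILD_TYPE
    return NO_PRODUCT


def call_mutant(colonies):
    """Classify a mutant from its purified colonies (three in the text).

    colonies: list of (has_deletion_band, has_wild_type_band) tuples.
    """
    calls = [call_colony(d, w) for d, w in colonies]
    # One or more colonies with only the deletion product: the well was mixed
    if DELETION in calls:
        return CROSS_CONTAMINATED
    # Every colony shows both products: deletion plus a duplicated copy
    if calls and all(c == DUPLICATION for c in calls):
        return PARTIAL_DUPLICATION
    return UNRESOLVED


print(call_mutant([(True, False), (False, True), (True, True)]))
print(call_mutant([(True, True), (True, True), (True, True)]))
```

The first call mimics a mixed well from which a clean deletion colony could be recovered; the second mimics a mutant in which every purified colony retains both structures.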

Table I
Keio mutants with partial duplications for both isolates

The 25 ORFs for which both isolates have duplications are candidates for essential genes (Table I). Fourteen of these have been reported to be essential in the PEC (Profiling of E. coli Chromosome) database (http://www.shigen.nig.ac.jp/ecoli/pec/index.jsp; Table IA). Thus, it is likely that these 14 genes are essential. The other 11 with partial duplications have been designated as nonessential genes in the PEC database (Table IB). Further tests are required to validate their essentiality. We also carefully evaluated all single-gene deletion mutants in the Keio collection that were classified as essential in the PEC database. None provided evidence of a partial duplication. Thus, some ORFs reported as essential in the PEC database are not essential, at least in the genetic background of our host E. coli K-12 BW25113 during aerobic growth at 37°C on LB agar. It should be noted that no evidence exists that the Red system that we used to generate the Keio collection is responsible for causing duplications. Moreover, other authors have shown that genetic duplications can occur during DNA replication (Anderson and Roth, 1981). As a cautionary note, partial duplications can occur not only during the construction of single-gene deletions but also upon transfer of a deletion into a new host, e.g., by PCR or transduction, as reported previously (Zhou et al, 2003).

Supplementary Material

Supplementary Table I

Construction and evaluation of Keio collection.

Supplementary Table II

Genes to which ECK numbers had not been assigned before Keio construction

Supplementary Table III

Genes that were not targets of Keio construction

Supplementary Table IV

Mutants for split ORFs of ancestral genes

Supplementary Table V

Information on Keio collection deletion mutants


This work was supported by a Grant-in-Aid for Scientific Research (A) and KAKENHI (Grant-in-Aid for Scientific Research) on Priority Areas ‘System Genomics' from the Ministry of Education, Culture, Sports, Science and Technology of Japan to NAIST and by funds from the Yamagata Prefectural Government and Tsuruoka City to Keio University. BLW was supported by NIH GM62662.


The authors declare that they have no conflict of interest.


  • Anderson P, Roth J (1981) Spontaneous tandem genetic duplications in Salmonella typhimurium arise by unequal recombination between rRNA (rrn) cistrons. Proc Natl Acad Sci USA 78: 3113–3117
  • Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H (2006) Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006.0008. doi: 10.1038/msb4100050
  • Hayashi K, Morooka N, Yamamoto Y, Fujita K, Isono K, Choi S, Ohtsubo E, Baba T, Wanner BL, Mori H, Horiuchi T (2006) Highly accurate genome sequences of Escherichia coli K-12 strains MG1655 and W3110. Mol Syst Biol 2: 2006.0007. doi: 10.1038/msb4100049
  • Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, Mori H, Perna NT, Plunkett G III, Rudd KE, Serres MH, Thomas GH, Thomson NR, Wishart D, Wanner BL (2006) Escherichia coli K-12: a cooperatively developed annotation snapshot—2005. Nucleic Acids Res 34: 1–9
  • Zhou L, Lei XH, Bochner BR, Wanner BL (2003) Phenotype microarray analysis of Escherichia coli K-12 mutants with deletions of all two-component systems. J Bacteriol 185: 4956–4972

Articles from Molecular Systems Biology are provided here courtesy of The European Molecular Biology Organization