Genome Res. May 2001; 11(5): 731–753.
PMCID: PMC311110

The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403


Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group.

[The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.]

Lactic acid bacteria (LAB) are a heterogeneous group of microorganisms that convert carbohydrates into lactic acid. They comprise both pathogens (such as Streptococcus pneumoniae or Streptococcus pyogenes) and useful bacteria (such as Streptococcus thermophilus and Lactococcus lactis, which were used for millennia in milk fermentation). Determination and analysis of the genome sequence of a representative LAB is therefore of great interest, as it would provide information allowing us to combat the former and use the latter more efficiently. Until now, no complete and annotated genome sequence of either LAB class has been reported.

In nature, L. lactis occupies a niche related to plant or animal surfaces and the animal gastrointestinal tract. It is believed to be dormant on plant surfaces and to multiply in the gastrointestinal tract after being swallowed by a ruminant. In contrast, “domesticated” species of L. lactis, used by the dairy industry as starters in cheese fermentation, live in a different niche, which is defined by technological considerations, such as fast growth and rapid production of lactic acid in milk. The importance of L. lactis for humankind can be appreciated from the estimate that close to 10^7 tons of cheese are made annually (Fox 1989), leading to human consumption of close to 10^18 lactococci.

There are two subspecies of L. lactis, designated initially as Streptococcus lactis and Streptococcus cremoris and reclassified more recently as L. lactis ssp. lactis and L. lactis ssp. cremoris, respectively (Schleifer et al. 1985). The former is preferred for making soft cheeses and the latter for hard ones. The two subspecies have been intensely studied, mainly because of their industrial interest, and have become excellent models for research on metabolism, physiology, genetics, and molecular biology of LAB.

The questions addressed in research on useful bacteria are often antithetical to those involving pathogens, because one of the basic objectives is to improve rather than to limit bacterial growth. Efficient use of lactococci by the dairy industry requires understanding of many aspects of bacterial physiology, such as the use of sugars and proteins from milk for growth, conversion of sugars to lactate, and synthesis of substances involved in cheese flavor, and thus of the relationship between different types of fermentation. The potential for new applications of LAB, such as oral vaccines (Steidler et al. 2000) or production of foreign proteins and metabolites, leads to questions concerning the protein secretion system, biosynthesis of cofactors, and regulation of central metabolism. In addition to questions related to the industrial use of lactococci, fundamental biological questions, such as retrohoming of introns (Cousineau et al. 1998), are also being addressed in L. lactis.

A genetic map of a “laboratory workhorse” L. lactis ssp. lactis strain IL1403, based on low-fidelity diagnostic genome sequencing, has been reported (Bolotin et al. 1999). Here we present the analysis of the accurate sequence of the IL1403 genome, which is the first such report for any lactic acid bacterium. We focus mainly on features related to the importance of L. lactis for humankind, which is its use in dairy fermentation. Several unexpected findings are also reported, such as a putative chimerical structure of the genome, the possibility that L. lactis can respire, the existence of genes required for DNA transformation, and the discovery of a transfer of genetic information from lactococci to gram-negative enteric bacteria.


Two-Step Sequencing Strategy

The first step of our strategy, designated diagnostic genome sequencing, was described before (Bolotin et al. 1999). Briefly, it involves cloning relatively short (1–20 kb) genome fragments in Escherichia coli plasmid and phage vectors and sequencing a limited number of randomly chosen clones, to a redundancy of about one. A novel procedure, designated multiplex long accurate PCR (MLA PCR), developed and tested in the course of the Bacillus subtilis genome sequencing project (Sorokin et al. 1996; Kunst et al. 1997), is then applied to connect the resulting contigs and synthesize the missing genome regions, which are subsequently sequenced by standard methods. This approach allowed us to establish the entire L. lactis genome sequence and assemble it in a unique contig, with a sequencing redundancy of less than two (Bolotin et al. 1999). Three- to fourfold fewer sequencing reactions were required to reach this goal than if the fully random approach had been used. For comparison, only 10,235 reactions were needed to assemble the L. lactis genome sequence, whereas 40,020 were required for the genome of Neisseria meningitidis (Tettelin et al. 2000), which is of a similar size. The diagnostic sequence allowed us to identify all L. lactis genes that encode proteins sufficiently similar to those present in the databases. However, the elevated error rate, estimated to be ~1%, did not allow us to predict the genes unique to L. lactis or the borders of coding regions. To obtain a more complete and reliable description of the L. lactis genome, we carried out a second step of our strategy. It involved random sequencing of additional clones until an overall redundancy of ~6.4 was reached and then primer walking on PCR-generated templates to ensure that each base was sequenced at least four times and at least once on each strand. We designated this step “shotgun polishing” and concluded that the strategy presented here can be a good alternative to the fully random strategy used in most cases (Fraser and Fleischmann 1997). Its advantages should increase even more when a greater number of completely sequenced and thoroughly annotated bacterial genomes becomes available. Carrying out the diagnostic step and only a small amount of polishing will then be sufficient to determine a reliable genome sequence of bacteria relatively close to the ones that were already sequenced and annotated.
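The reaction counts quoted above can be checked with a few lines of arithmetic. The sketch below also estimates how much of a genome random reads are expected to touch at a given redundancy, using the Lander-Waterman approximation (1 - e^-c). That model is an assumption brought in for illustration; the text does not say the authors used it, and the function name is hypothetical.

```python
import math

# Reaction counts quoted in the text.
reactions_two_step = 10_235   # L. lactis, diagnostic strategy
reactions_shotgun = 40_020    # N. meningitidis, fully random strategy

# Ratio of the two counts; the text describes it as three- to fourfold.
fold_saving = reactions_shotgun / reactions_two_step


def expected_fraction_covered(redundancy: float) -> float:
    """Lander-Waterman estimate of the genome fraction covered by random reads.

    Illustrative assumption, not a method taken from the paper.
    """
    return 1.0 - math.exp(-redundancy)


if __name__ == "__main__":
    print(f"fold saving in reactions: {fold_saving:.1f}")
    for c in (1.0, 2.0, 6.4):
        print(f"redundancy {c}: ~{expected_fraction_covered(c):.1%} covered")
```

Under this model a redundancy of about one leaves roughly a third of the genome untouched by random reads, which is consistent with the need for a directed step such as MLA PCR to close the gaps.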

Gene Content

The circular chromosome of L. lactis IL1403 has 2,365,589 bp and an average G+C content of 35.4%. We detected 2310 open reading frames (ORFs) in the sequence, with an average length of 879 bp. Protein-coding genes represent 86% of the genome, stable RNA 1.4%, and noncoding regions 12.6%. These values are similar to those observed for genomes of other bacteria. We have assigned a biochemical or biological role to 64.2% (1482 ORFs) of the genes and classified them into functional categories (Table 1). A further 20.1% of the genes (465 ORFs) match hypothetical coding sequences of unknown function, and the remaining 15.7% (363 ORFs) represent genes with no similarity to known proteins, which can be considered specific for lactococci.
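The percentages in this paragraph follow directly from the quoted counts. The short check below recomputes them; the category labels are shorthand chosen here, not the paper's terms.

```python
# Figures quoted in the text.
genome_bp = 2_365_589
orf_count = 2310
mean_orf_bp = 879

# Fraction of the chromosome occupied by protein-coding sequence.
coding_fraction = orf_count * mean_orf_bp / genome_bp

# ORF counts per annotation class (labels are shorthand for this sketch).
categories = {
    "assigned function": 1482,
    "conserved hypothetical": 465,
    "no similarity": 363,
}
shares = {name: count / orf_count for name, count in categories.items()}

if __name__ == "__main__":
    print(f"coding fraction: {coding_fraction:.1%}")
    for name, share in shares.items():
        print(f"{name}: {share:.1%}")
```

The three classes sum to the 2310 ORFs, and the product of ORF count and mean length reproduces the stated 86% coding density.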

Table 1
Functional Classification of the Lactococcus lactis Protein-Coding Genes

Origin and Terminus of Replication

The approximate positions of the replication origin and terminus of the L. lactis chromosome were determined previously, using the GC and AT skews (Fig. 1; Bolotin et al. 1999). It should be noted that the precision of the origin mapping is greater than that of the terminus, as there are conserved elements (dnaA and dnaN genes, DnaA boxes) in the vicinity of the former but not of the latter (the rtp gene was not found). We chose as coordinate 1 of the genome the middle of a HindIII site localized near the replication origin (Fig. 1).
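For readers unfamiliar with skew analysis, the following is a minimal sketch of one common way to locate candidate origin and terminus positions from GC skew. It assumes the usual convention that the cumulative skew reaches its minimum near the origin and its maximum near the terminus. The window size and function names are illustrative, and the authors' exact procedure may differ.

```python
from itertools import accumulate


def gc_skew(seq: str, window: int = 1000) -> list[float]:
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def skew_extrema(seq: str, window: int = 1000) -> tuple[int, int]:
    """Return (position of cumulative-skew minimum, position of maximum).

    Positions are the end coordinates of the windows where the extrema occur.
    Treating the minimum as the origin is a convention assumed here.
    """
    skews = gc_skew(seq, window)
    if not skews:
        raise ValueError("sequence is shorter than one window")
    cumulative = list(accumulate(skews))
    i_min = min(range(len(cumulative)), key=cumulative.__getitem__)
    i_max = max(range(len(cumulative)), key=cumulative.__getitem__)
    return (i_min + 1) * window, (i_max + 1) * window


if __name__ == "__main__":
    # Toy sequence: a C-rich half followed by a G-rich half.
    toy = "ACCT" * 50 + "AGGT" * 50
    print(skew_extrema(toy, window=20))
```

On a real chromosome the sequence would be read from a FASTA file, and the extrema only narrow down the region; conserved markers such as dnaA, as noted above, are what give the origin its higher precision.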

Figure 1
Distribution of IS elements and prophages in the IL1403 chromosome. Outer circle shows the scale in basepairs. IS981, IS983, and IS1077 are shown by yellow, red, and blue squares, respectively (enlarged for clarity). Red and blue arcs show the areas of ...

RNA, IS Elements, and Prophages

The locations of six rRNA operons, 62 tRNA genes, the RNA component of the RNAse P gene (rnpB), and the 10S RNA (ssrA) were determined earlier from the diagnostic sequence (Bolotin et al. 1999). There are six different IS elements in the IL1403 chromosome: IS981, IS982, IS983, IS904, IS905, and IS1077, present in 10, 1, 15, 9, 1, and 7 copies, respectively (Fig. 1) and totaling 42 kb. It is remarkable that one or two copies of IS904 always accompany IS1077 and that the relative orientation of the two is generally not the same. The former element might be a satellite of the latter. Another remarkable feature is that three of the IS elements are not randomly distributed over the chromosome (Fig. 1). Seven copies of IS1077 (and the associated IS904) occupy the region between 2150 and 840 kb, encompassing the replication origin, whereas 15 copies of IS983 occupy a different region, between 680 and 2270 kb. The two regions overlap by only ~150 kb. As the 10 copies of IS981 are distributed over the whole genome, the uneven distribution of three other IS elements is not caused by a particular property of the L. lactis cell. We suggest that this distribution indicates a lateral transfer of a large portion of the genome from a lactococcus donor, carrying one type of IS, to a recipient, carrying the other type. Two lines of evidence lend support to this hypothesis. First, IS1076, which corresponds to the association of IS1077 and IS904 described above, is distributed over the whole genome of the strain L. lactis ssp. cremoris MG1363 (Le Bourgeois et al. 1995) rather than being restricted to one region of the genome, as is the case in IL1403. This transposon has, therefore, no particular hot region for insertion in the lactococcal genome. Second, the restriction map of another strain, L. lactis ssp. lactis DL11, coincides with that of IL1403 in the area between rrnF (550 kb) and rrnE (1980 kb), while it is divergent elsewhere (Le Bourgeois et al. 1992). We suggest that DL11 may be close to one of the putative parental strains of IL1403. Investigations of the distribution of IS1077 and IS983 among different lactococci might allow identification of both putative parents of the IL1403 strain.

Three potential prophages, designated pi1, pi2, and pi3, were detected at positions near 460, 1050, and 1460 kb (Fig. 1). They are large (35–44 kb), encode 49–60 proteins, and are related to known temperate phages of L. lactis. Another three prophages, designated ps1, ps2, and ps3, are localized near 42, 509, and 2020 kb (Fig. 1). They are small (11–15 kb), encode only 16–23 proteins and might be satellites of the other phages, as they lack most of the genes that code for phage structural elements. A copy of IS983 is present in ps3, which might, thus, be defective. The six prophages comprise a total of 175 kb of DNA and 221 protein coding genes. Recently, Chopin et al. (2001) characterized five phages, which can be found in the supernatant of IL1403 after mitomycin C treatment, and demonstrated the correspondence between the phage DNA extracted from the supernatant and the chromosome sequence. Phage bIL285 from the supernatant corresponds to pi2, bIL286 to pi3, bIL309 to pi1, bIL310 to ps1, and bIL312 to ps2. ps3, designated also as bIL311 (Chopin et al. 2001), cannot be induced, probably because of the IS983 element present in its genome. Detecting the circular forms of DNA of these phages allowed precise determination of the integration sites. About 9.2% of the L. lactis genome is thus formed by IS elements and prophages, suggesting that they may be important for horizontal gene transfer in these bacteria.
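The 9.2% figure can be reproduced from the sizes quoted in this section (42 kb of IS elements, 175 kb of prophage DNA, and a 2,365,589 bp chromosome). The sketch below also tallies the IS copy numbers; variable names are illustrative.

```python
# Values quoted in the text.
genome_kb = 2365.589
is_copies = {
    "IS981": 10,
    "IS982": 1,
    "IS983": 15,
    "IS904": 9,
    "IS905": 1,
    "IS1077": 7,
}
is_total_kb = 42
prophage_total_kb = 175

# Total number of IS copies and the share of the genome in mobile elements.
total_is_copies = sum(is_copies.values())
mobile_fraction = (is_total_kb + prophage_total_kb) / genome_kb

if __name__ == "__main__":
    print(f"IS copies: {total_is_copies}")
    print(f"IS elements plus prophages: {mobile_fraction:.1%} of the genome")
```

The copy numbers add up to the 43 IS elements mentioned in the abstract.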

Paralogous Gene Families

We define here as a paralogous protein family a group of proteins within which each protein shares at least one homologous domain with another protein of the group. By this criterion, there are 370 paralogous families, comprising 1189 gene products, in the L. lactis genome. Among the smaller families (<10 members) there are 208 of two members, 80 of three, 36 of four, 13 of five, 13 of six, 8 of seven, 4 of eight, and 2 of nine. The larger families contain 10, 11, 15, 18, 26, and 60 members, the last corresponding to ATP-binding proteins of ABC transporters, as is the case in many bacteria. In the four smallest families, distribution of the number of proteins resembles that of B. subtilis (Kunst et al. 1997). It decreases, very approximately, twofold when the family member count increases by one (568:273:168:100 in B. subtilis and 416:240:144:65 in L. lactis for doublets, triplets, quadruplets, and quintuplets, respectively).
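The family counts above are internally consistent, as a short tally shows. The dictionary below simply restates the numbers from the paragraph, mapping family size to the number of families of that size.

```python
# Family size -> number of families, as quoted in the text.
small_families = {2: 208, 3: 80, 4: 36, 5: 13, 6: 13, 7: 8, 8: 4, 9: 2}
large_family_sizes = [10, 11, 15, 18, 26, 60]

family_count = sum(small_families.values()) + len(large_family_sizes)
member_count = (
    sum(size * n for size, n in small_families.items())
    + sum(large_family_sizes)
)

# Proteins in doublets, triplets, quadruplets, and quintuplets.
smallest_four = [size * small_families[size] for size in (2, 3, 4, 5)]

if __name__ == "__main__":
    print(family_count, member_count)
    print(smallest_four)
```

The totals match the 370 families and 1189 gene products stated in the text, and the last list reproduces the 416:240:144:65 series given for L. lactis.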

Information Processing and Gene Regulation

Information processing refers to the genes constituting replication, transcription, and translation machinery. In L. lactis, it is overall very similar to that of B. subtilis, the best characterized AT-rich gram-positive bacterium (Kunst et al. 1997). There are 67 genes involved in DNA metabolism in L. lactis. All the genes involved in DNA replication in B. subtilis are present in L. lactis, including counterparts of dnaB, dnaD, and dnaI, genes essential for initiation of replication in B. subtilis and absent in gram-negative bacteria. Two DNA-polymerase III α-chain genes, one corresponding to polC and another to dnaE of B. subtilis, were also detected in L. lactis. In contrast, E. coli has only the dnaE gene.

Transcription machinery in both L. lactis and B. subtilis comprises some 30 genes other than the σ-factors. However, the number of σ-factors differs greatly, as there are only three in L. lactis, while there are 18 in B. subtilis, pointing to a considerable difference in the mode of gene-expression regulation in the two organisms. Translation machinery comprises 119 genes in L. lactis and 131 genes in B. subtilis. There are no duplicated aminoacyl-tRNA synthetase genes in L. lactis, while there are three (for threonine, tyrosine, and histidine) in B. subtilis. Posttranslational protein modification genes mostly differ, as there are 27 such genes in B. subtilis and only 10 in L. lactis. A particular regulation of translation might also operate in L. lactis. As discussed more fully below, all the late competence genes of L. lactis seem to be controlled by a mechanism relying on leaderless mRNAs and, thus, on a particular mode of translation. Recent evidence shows that the involvement of translation initiation factor 3, present in all bacteria, in start codon recognition is important for restriction of translation in such systems (Tedin et al. 1999). This provides a link between regulation of translation and competence in L. lactis. Such an interaction has not been detected previously.

Analysis of homology allowed us to assign regulatory functions to 138 genes, half of which were classified further by their similarity to regulatory proteins of known families. The overall number of regulatory systems is about twofold lower in L. lactis than in B. subtilis, but the proportion of these genes is similar in the two organisms. Among the interesting differences is a much lower number of the two-component signal transducers in L. lactis than in B. subtilis (eight instead of 34) and of σ-factors (three instead of 18), both of which regulate complex responses to changing environmental conditions.

Energy Metabolism and Transporters

The most important industrial applications of L. lactis are based on its energy metabolism, which leads mainly to the production of high amounts of lactic acid (homolactic fermentation). Anaerobic glycolysis is the principal energy-generating process in L. lactis, and very little of the fermented sugar (~5%) is used for synthetic reactions (Poolman 1993). All the genes required for the conversion of the glucose to pyruvate are present in the genome. The pyruvate is converted into lactic acid, thus allowing the oxidation of reduced NAD, and the lactate dehydrogenase gene ldh, essential for this process, was studied intensely (Griffin et al. 1992). Three other genes, highly similar to ldh, (ldhB, ldhX and hicD) are present in the genome, but their role is not known. The product of the last gene has a high similarity (42% identity) to hydroxyisocaproate dehydrogenase and may, therefore, be involved in the catabolism of branched-chain amino acids. Lactate is transported into the growth medium, causing the efflux of protons and, thus, providing transmembrane potential indispensable for growth and energy recycling (Ten Brink et al. 1985).

Genome analysis indicates that the full citric acid cycle, gluconeogenesis enzymes, and many anaplerotic reactions do not exist in L. lactis. Unexpectedly, the functions necessary for aerobic respiration are encoded in the genome. L. lactis has men and cytABCD operons, encoding proteins required for menaquinone synthesis and cytochrome d biogenesis. It also has three genes involved in the late steps of heme synthesis (hemH, hemK, and hemN, required for oxidation of porphyrinogen and attachment of iron to heme) but not the genes required for the early steps. L. lactis may thus be able to carry out oxidative phosphorylation if the protoporphyrinogen is provided. Indeed, improved growth properties in media containing hemin were observed for certain Streptococci (Sijpesteijn 1970; Mickelson 1972). The genome analysis thus suggests the existence of aerobic respiration in this bacterium, generally considered an exclusively fermentative microorganism.

Use of L. lactis in the food industry also exploits its ability to form fermentation products other than lactate (mixed acid fermentation). The balance of products depends on activities of enzymes that act on the key metabolite generated by glycolysis, the pyruvate. A number of genes encoding such enzymes (pyruvate dehydrogenase, pdhABCD; α-acetolactate synthase, als; pyruvate-formate lyase, pfl; and lactate dehydrogenase, ldh) have been identified previously in L. lactis and confirmed by genome analysis. We detected a novel gene, poxL, encoding pyruvate oxidase, which also acts on pyruvate and might, therefore, play a role in switching between different fermentation modes.

Besides gene activity, the availability of cofactors, such as NADH and FAD, also affects the balance of different fermentation products. Artificially changing the NADH/NAD ratio in L. lactis can redirect carbon flow from lactic acid to acetoin and diacetyl (Lopez de Felipe et al. 1998). There are more than five NADH dehydrogenase genes in the L. lactis genome, which may affect the type of fermentation products. Some NADH dehydrogenases generate hydrogen peroxide, which is toxic for the cells. L. lactis has no gene encoding catalase, which can remove the toxic H2O2. However, there is a gene encoding thiol peroxidase (tpx) and two genes (ahpC and ahpF) encoding alkyl hydroperoxide reductases. These proteins could possibly act on H2O2. Active sodA, encoding superoxide dismutase, which converts oxygen radicals to H2O2, was shown to be important for the oxidative stress response (Sanders et al. 1995). Also, the gshR gene encoding glutathione reductase may be involved in the response of L. lactis to aerobic growth conditions.

Heterofermentative metabolism takes place in L. lactis when the pentose phosphate pathway is active, as in this case glycolysis generates not only a three-carbon compound that can be converted to lactate but also a two-carbon compound. We detected glucose-6P dehydrogenase (zwf), phosphogluconate dehydrogenase (gnd), and ribulose-5P epimerase (rpe), which can lead to the formation of xylulose-5P. Phosphoketolase, encoded by the ptk gene, can catalyze formation of glyceraldehyde-3P and acetyl-P, which enter the fermentation pathways that yield lactate and ethanol, respectively.

Understanding the molecular basis of the switch between different fermentation types is of interest not only for standard uses of L. lactis but also for the metabolic engineering in this organism, aiming to enhance synthesis of certain metabolites to industrially useful levels. We detected a correlation between the presence of the phosphoenolpyruvate dependent transport system (PTS) and the fermentation profile for a given carbon source. PTS systems for fructose, mannose, sucrose or trehalose, mannitol, and cellobiose are present in the genome, and the homolactic fermentation profiles were reported for growth on fructose, mannose, glucose (which uses mannose or mannitol PTS) and sucrose (Cocaign-Bousquet et al. 1996). In contrast, mixed acid or heterofermentation profiles were observed for growth on galactose, xylose, maltose, gluconate, ribose, and lactose, which are not imported by a PTS system. When L. lactis cells harbor a plasmid encoding lactose-specific PTS system, lactose fermentation becomes homolactic (Gasson 1983). Our genome analysis thus strengthens the proposal that sugar consumption rate, which is the highest when PTS system is available, determines the ability for efficient homolactic fermentation (Cocaign-Bousquet et al. 1996). The correlation of information derived from genome analysis with experimental data on fermentation product distribution indicates that critical parameters regulating the final product balance may be found by a thorough analysis of the carbon source use and transport systems.
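The correlation described above can be written down as a simple lookup: a sugar imported by a PTS is predicted to give a homolactic profile, and any other sugar a mixed acid or heterofermentative one. The sketch below encodes only the sugars and profiles named in this paragraph. The function name and the "mixed or hetero" label are choices made here, and the rule is a restatement of the observed correlation, not a mechanistic model.

```python
# Sugars with a chromosomally encoded PTS, as listed in the text
# ("sucrose or trehalose" is kept as two entries here).
PTS_SUBSTRATES = {
    "fructose", "mannose", "sucrose", "trehalose", "mannitol", "cellobiose",
}
# Glucose is reported to enter through the mannose or mannitol PTS.
PTS_IMPORTED = PTS_SUBSTRATES | {"glucose"}

# Fermentation profiles reported in the text for growth on each sugar.
REPORTED_PROFILE = {
    "fructose": "homolactic",
    "mannose": "homolactic",
    "glucose": "homolactic",
    "sucrose": "homolactic",
    "galactose": "mixed or hetero",
    "xylose": "mixed or hetero",
    "maltose": "mixed or hetero",
    "gluconate": "mixed or hetero",
    "ribose": "mixed or hetero",
    "lactose": "mixed or hetero",
}


def predicted_profile(sugar: str, plasmid_pts: frozenset = frozenset()) -> str:
    """Predict the profile from PTS availability (chromosomal or plasmid)."""
    if sugar in PTS_IMPORTED or sugar in plasmid_pts:
        return "homolactic"
    return "mixed or hetero"


if __name__ == "__main__":
    for sugar, reported in REPORTED_PROFILE.items():
        print(sugar, predicted_profile(sugar), reported)
    # Lactose with a plasmid-encoded lactose PTS (Gasson 1983).
    print(predicted_profile("lactose", frozenset({"lactose"})))
```

For every sugar listed, the prediction agrees with the reported profile, and adding a plasmid-encoded lactose PTS switches the prediction for lactose to homolactic, as observed experimentally.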

Proteases and Amino Acid Catabolism Genes

Proteases and peptidases provide a selective advantage for bacteria growing in milk, as this medium is rich in caseins and relatively poor in free amino acids. Amino acid catabolism has an impact on fermentation regulation and on the flavor of dairy products.

Genome sequence revealed 19 protease-encoding genes (Table 1). These include the membrane protease HtrA, which is responsible for degradation of the precursors of foreign exported proteins (Pouquet et al. 2000). Some 16 peptidases from LAB were characterized previously, including the products of 13 genes detected in L. lactis (Christensen et al. 1999).

Catabolism of amino acids usually starts by deamination. Arginine catabolic genes, organized in an operon near 2110 kb, encode the enzymes for the deaminase pathway as well as the arginine tRNA synthetase, suggesting complex regulation. Another operon for arginine catabolism, near 1755 kb, contains genes arcC3 and otcA. It could have a regulatory function, as it also contains the genes llrH and yrfE, representing a signal transduction system of a new type. Aspartate aminotransferase (aspC) and asparaginase (ansB) are involved in aspartate and asparagine catabolism. No genes for aspartate decarboxylase or aspartase were detected, although such enzymatic activities were identified in Lactobacillus, another prominent group of LAB (Rollan et al. 1985). Recent studies on catabolism and biosynthesis of glutamate in L. lactis identified the existence of a pathway leading to the production of γ-aminobutyrate (GABA; Sanders et al. 1998). We identified gadRCB operon for GABA production, gltBD genes for glutamate synthase, and an operon involved in citric acid metabolism: pycA, gltA, citB, and icd. Under appropriate physiological conditions, products of some of these genes might carry out glutamate catabolism, rather than biosynthesis. Serine can be directly converted to pyruvate by serine dehydratase encoded by the sdaAB operon.

The genome sequence provides an inventory of 12 aminotransferases, of which some can initiate degradation of aromatic, branched-chain, and sulfur-containing amino acids, important for cheese flavor. The specificity of seven aminotransferases (aspC, serC, argD, glmS, hisC, aspB, and arcT) can be predicted from sequence comparisons, whereas those of the other five (araT, nifZ, yeiG, bcaT, and ytjE) are less obvious. It was recently shown that araT and bcaT are involved in the degradation of aromatic and branched-chain amino acids, respectively (Yvon et al. 2000). The product of ytjE might be specific for methionine, as the gene is cotranscribed with the relevant biosynthesis genes. Degradation of tryptophan seems to proceed via indole aldehyde, as suggested by the presence of the indole pyruvate decarboxylase gene ipd. It is not clear which pathways L. lactis uses to catabolize phenylalanine and tyrosine. It is possible that phenyl pyruvate and p-OH-phenyl pyruvate are degraded further by decarboxylation. This would depend on the specificity of the phenolic acid decarboxylase encoded by pdc.

Amino Acid, Vitamin, and Nucleotide Biosynthesis

L. lactis requires certain metabolites in the growth medium, although it has a genetic potential to synthesize some of them. Synthetic medium for L. lactis should contain at least six amino acids (isoleucine, valine, leucine, histidine, methionine, and glutamic acid) and seven vitamins (biotin, pyridoxal, folic acid, riboflavin, nicotinamide, thiamine, and pantothenic acid; Jensen and Hammer 1993). L. lactis has the genes to synthesize the 20 standard amino acids and at least four cofactors (folic acid, menaquinone, riboflavin, and thioredoxin). One reason for the requirement of the compounds that can potentially be synthesized is that some of the existing genes are not functional, as was reported previously for amino acid biosynthesis genes (Godon et al. 1993). We carefully checked sequencing tracks for the genes that could contain a frameshift mutation and could not rule out the presence of a mutation in 30 of them. This relatively high level of pseudogenes in IL1403 could possibly be, at least in part, caused by the treatments used to cure the parental strain of its plasmids (Chopin et al. 1984).

Milk does not contain sufficient levels of purine compounds to support growth of L. lactis and, therefore, de novo biosynthesis is necessary (Dickely et al. 1995). We detected 57 genes involved in this metabolism. Therefore, physiological and genomic evidence shows that L. lactis has sufficient and fairly active capacities for biosynthesis and also for salvage of nucleic acid compounds.

Cell Wall Metabolism

Many L. lactis properties that are important for applications, such as phage sensitivity, stress resistance, autolysis, and mucosal immunostimulation, depend on the structure of the cell wall. There are 29 genes encoding enzymes required for the synthesis of the main cell wall component, peptidoglycan. Among these, three encode amino acid racemases: dal for alanine, murI for glutamate, and racD for aspartate. D-alanine and D-glutamate are the components of linear peptide moiety of peptidoglycan, whereas D-aspartate forms cross-bridges. There are no genes for synthesis of modified peptidoglycan, containing D-lactate or D-serine instead of D-alanine, reported for several other LAB.

Cheese ripening can be accelerated by induction of enzymes that process peptidoglycan. There are six genes related to such processing in L. lactis: dacA and dacB, encoding alanine–alanine carboxypeptidase; and acmA, B, C, and D, encoding four lysozymes. Carboxypeptidases alone cannot cause the cell lysis, as their activity does not destabilize the wall. Modulation of the level of their production can, however, influence the action of lysozymes. acmA, responsible for separation of daughter cells, was used for artificial induction of autolysis (Buist et al. 1997).

Lipoteichoic acid is another main component of the L. lactis cell wall. Neither teichoic nor teichuronic acids were detected in this microorganism (Valyasevi et al. 1990). However, there is a cluster of seven tag genes near 950 kb. Only three genes from teichuronic acid biosynthesis pathway were found: ycbK, ycbF, and ycbH, corresponding to tuaB, tuaC, and tuaG of B. subtilis. dlt operon, encoding D-alanylation of lipoteichoic acid, is of crucial importance for properties of the cell wall and whole-cell physiology. A knockout mutation in dltD causes filamentous growth and UV sensitivity and facilitates penetrability of the cells (Duwat et al. 1997).

Synthesis of extracellular polysaccharides is important for the industrial use of many LAB, as these polymers affect the texture of the fermented products. There are >20 genes involved in the biosynthesis of such molecules in the region near 200 kb. They encode functions providing activated sugars and other components involved in production of surface or extracellular polysaccharide. A plasmid that carries an operon involved in the formation of the repeating unit, linking activated sugar to the lipid carrier, export, and polymerization, was recently identified (Van Kranenburg et al. 1997). The combination of plasmid-carried and chromosomal functions presumably determines the amounts and the structure of extracellular polysaccharides.

Protein Secretion

L. lactis has only eight genes identified as implicated in protein secretion. Contrary to B. subtilis and E. coli, this bacterium does not have secDF genes, known to improve the secretion efficiency (Pogliano and Beckwith 1994; Bolhuis et al. 1998). There is only one membrane protease, HtrA, involved in degradation of hybrid exported proteins (Pouquet et al. 2000). Gene pmpA (protein maturation protein) encodes a homolog of PrsA from B. subtilis and might be involved in stabilization of secreted proteins by facilitating their folding. L. lactis was shown to secrete up to 20 mg/L of foreign protein with optimized gene constructs (Le Loir et al. 1998). This value could possibly be improved by manipulating the gene expression levels and supplying the missing components of the secretion machinery.

Competence to Genetic Transformation

Natural competence for DNA transformation has not been demonstrated in L. lactis. We detected four operons (comE, comF, comC, and comG) containing genes similar to the late competence genes from B. subtilis and S. pneumoniae. In addition, we found a gene for ComX, which is similar to the S. pneumoniae ECF-type σ-factor required for transcription of the competence genes (Lee and Morrison 1999). The regions preceding the first ORF of the four operons resemble competence promoters from S. pneumoniae and might be recognized by ComX. Three sequences are common to the regions in front of all competence operons: two of them, GTTACATT and TTTTCGTATA, lie in the −35 and −10 domains of the promoter, while the third, AGTATG, includes the ATG start codon of the first gene in each operon. The relative position of the three conserved elements indicates that all mRNAs start at the ATG codon of the first gene and are, therefore, leaderless, lacking the canonical ribosome-binding site. A search for the consensus sequence over the whole genome, using PatScan (Dsouza et al. 1997), revealed six such promoters in addition to those of the late competence operons. The genes downstream of these promoters are radA, coiA, dprA, recQ, ssbA, and yqfG. Of these, only the radA gene, encoding a DNA repair protein, has a leaderless mRNA. Three of the genes, coiA, dprA, and recQ, affect DNA transformation in S. pneumoniae, H. influenzae, and B. subtilis, respectively (Karudapuram et al. 1995; Fernandez et al. 1998; Pestova and Morrison 1998). ssbA encodes a single-strand DNA-binding protein and could be involved in the processing of transforming DNA, which enters gram-positive bacteria in single-stranded form. yqfG encodes a protein of unknown function. The existence of the competence-related genes in L. lactis indicates that this bacterium might be naturally transformable by DNA. There are no genes homologous to those involved in the early steps of competence development in S. pneumoniae, which indicates that, in L. lactis, the regulatory cascade upstream of the ComX σ-factor is very different from that in streptococci.
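The genome-wide promoter search described above can be illustrated with a short script. This is a minimal sketch, not the PatScan pattern actually used: the three conserved elements are taken from the text, but the allowed gap lengths between them (`spacer`, `to_start`) and the function name are assumptions introduced here, and only the forward strand is scanned.

```python
import re

# Conserved elements quoted in the text.
MINUS_35 = "GTTACATT"
MINUS_10 = "TTTTCGTATA"
START = "AGTATG"  # contains the ATG start codon of the first gene


def find_competence_promoters(seq, spacer=(10, 25), to_start=(0, 15)):
    """Return 0-based positions where a candidate promoter begins.

    spacer   -- assumed (min, max) gap between the -35 and -10 elements
    to_start -- assumed (min, max) gap between the -10 element and AGTATG
    Both windows are illustrative guesses, not values from the paper.
    """
    pattern = re.compile(
        "(?="
        + MINUS_35
        + "[ACGT]{%d,%d}" % spacer
        + MINUS_10
        + "[ACGT]{%d,%d}" % to_start
        + START
        + ")"
    )
    # The lookahead lets overlapping candidates be reported.
    return [m.start() for m in pattern.finditer(seq.upper())]
```

To cover both strands, the same function could be applied to the reverse complement of the sequence.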

Another difference between the L. lactis and S. pneumoniae competence systems is that leaderless mRNAs are present only in the former organism. The translation of such mRNAs requires that they start precisely at the initiation codon of the gene (Kravchenko et al. 1988; Van Etten and Janssen 1998). Synthesis of competence-related proteins would, therefore, not result from spurious transcription of the cognate genes by read-through from upstream operons. This might tighten the control of competence development and limit it to very strict environmental conditions.

Horizontal Gene Transfer between Lactococci and Gram-Negative Enteric Bacteria

We detected a gene of unknown function, designated ycdB, which appears to be present in all bacteria and some eukaryotes. The level of identity between the YcdB protein and its homolog from S. pyogenes or S. pneumoniae, which are phylogenetically close to L. lactis, is ~80%, while the identity with the homologous proteins from gram-negative bacteria is ~40%. Very surprisingly, the E. coli and S. typhimurium genomes encode not only a protein that is 40% identical to YcdB but also a protein that is 94% identical to YcdB. We conclude that this second ycdB gene has been transferred from lactococci to enteric bacteria. The divergence of the synonymous nucleotide sites in L. lactis IL1403, compared with Salmonella and E. coli, is ~10%. If the rate of nucleotide changes at such sites is ~1% per million years (Ochman et al. 1999), the genes in Salmonella/E. coli and L. lactis IL1403 started to diverge ~10 million years ago. However, comparison of the ycdB genes in different strains of lactococci and in gram-negative enteric bacteria may reveal even more closely related genes and allow us to better assess the time of the gene transfer, the species that may have been involved, and the mechanism of the transfer. Nevertheless, anticipating that closer homologs will be found, it is tempting to speculate that the transfer may have taken place in the digestive tract of ruminants, if it involved wild-type lactococci, or of humans, if it involved the domesticated lactococci, massively introduced there by cheese consumption.
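The age estimate in this paragraph is a single division of the observed synonymous-site divergence by the assumed substitution rate. As a worked sketch, with the symbols d (divergence) and r (rate) introduced here only for illustration:

```latex
t \approx \frac{d}{r} = \frac{10\%}{1\%\ \text{per Myr}} = 10\ \text{Myr}
```

Both inputs are approximate values, so the result carries the same uncertainty.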

Analysis of completely sequenced genomes, available from the NCBI server, revealed that most bacteria have only one homolog of YcdB. Some (E. coli, S. typhimurium, B. subtilis, E. faecalis, and Shewanella putrefaciens), however, have two, indicating that the family might be undergoing an expansion in which, at least for enteric bacteria, lateral gene transfer from lactococci might be a driving force. As the function of this gene is unknown, the advantage that the second copy confers is also unknown. Elucidation of the gene function would help to answer this question.


Genome Cloning, Sequencing, and Data Verification

The strain IL1403 is a plasmid-free derivative of the strain IL594, isolated from a cheese starter culture (Chopin et al. 1984). Diagnostic sequencing, involving 10,235 sequencing reactions and yielding a total of 4,687,630 bases, has been described previously (Bolotin et al. 1999). Further sequencing was carried out to ensure that each nucleotide in the genome was read at least four times and at least once on each strand. For this purpose, a collection of short-insert clones was constructed. A total of 9,888,620 bases, covering 93% of the genome, were produced by 15,578 additional sequencing reactions. To reduce the error rate to <0.01%, 978 more reactions, with an average read length of 632 bases, were carried out using genome-specific primers. The redundancy of the final assembly is 6.44.
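The reported redundancy can be roughly checked from the figures given in this paragraph. The sketch below assumes that redundancy means total sequenced bases divided by genome length, and it takes the genome size (2,365,589 bp) from the abstract; the variable and function names are illustrative only.

```python
GENOME_SIZE = 2_365_589        # bp, from the abstract
DIAGNOSTIC_BASES = 4_687_630   # diagnostic sequencing step
SHOTGUN_BASES = 9_888_620      # short-insert clone reads
PRIMER_READS = 978             # reactions with genome-specific primers
PRIMER_READ_LENGTH = 632       # average read length, bases


def redundancy(total_bases, genome_size):
    """Average number of times each genome position was read (assumed definition)."""
    return total_bases / genome_size


total_bases = DIAGNOSTIC_BASES + SHOTGUN_BASES + PRIMER_READS * PRIMER_READ_LENGTH
print(round(redundancy(total_bases, GENOME_SIZE), 2))  # prints 6.42
```

The result is close to the reported value of 6.44; the small gap may come from rounding of the average read length or from how reads were counted, which the text does not specify.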

Informatics and Gene Nomenclature

Assembly, manual correction of sequencing errors, and consensus generation were carried out concurrently with data accumulation, using the XBAP program (Dear and Staden 1991; version 14.0). To predict protein-coding regions, we used a conceptual translation of the whole genome in the six possible coding frames. Predicted proteins of >60 amino acids were checked for statistical consistency with the output of the GENMARK program (Borodovsky and McIninch 1993) using parameters for streptococcal genes. The EBI server (http://www2.ebi.ac.uk/genemark) and the pyogenes_3.xdr matrix dated November 14, 1996, were used for this analysis. The presence of a putative ribosome-binding site upstream of the 5′ end of each candidate was searched for next. As a ribosome-binding site, we considered the presence of an initiator codon (ATG, TTG, or GTG) with, upstream of it, a short sequence complementary to the 3′ end of the 16S rRNA of L. lactis (5′…GGAUCACCUCCUUUCUAA 3′) (Chiaruttini and Milet 1993).

Genome annotation was done using several homemade shell or Perl scripts, which generated convenient HTML tables linked to BLAST (Altschul et al. 1990) output files. The NCBI server (http://www.ncbi.nlm.nih.gov/Entrez) was used to generate updated bacterial protein databases. Homology analysis of YcdB with the unpublished genome sequences was carried out using the relevant NCBI server (http://www.ncbi.nlm.nih.gov/Microb_blast/unfinishedgenome.html).

The functional classification of genes followed the list of categories presented earlier (Bolotin et al. 1999). A fully automatic, computer-generated classification was used as the starting material. Each protein was then analyzed by an expert to improve the category assignment, which is presented in Table 1 and Figure 2. The expert usually used three means to confirm or alter the automated function assignment and classification: first, scrutiny of BLAST or FASTA results (obtained with different parameters), assisted by phylogenetic analysis or COGnitor (Tatusov et al. 1997); second, knowledge of particular biochemical pathways or biological systems characterized in organisms other than L. lactis IL1403 (such as protein secretion or the competence system); third, results of numerous previously published experiments in L. lactis (148 functional assignments). In addition, phage-specific proteins were classified as such because they cluster in the regions identified as prophages, and specialized databases (Quentin et al. 1999) were used to classify the ABC transporters. Although this is never absolutely explicit, the classification of gene functions provided for L. lactis IL1403 is biological rather than biochemical.
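The first gene-prediction step described above, a six-frame scan for candidate proteins longer than 60 amino acids that begin with ATG, TTG, or GTG, can be sketched as follows. This is an illustrative simplification under stated assumptions: the GENMARK consistency check and the ribosome-binding-site search are omitted, the stop codons are those of the standard bacterial code, and all function names are hypothetical rather than taken from the authors' scripts.

```python
STARTS = {"ATG", "TTG", "GTG"}   # initiator codons named in the text
STOPS = {"TAA", "TAG", "TGA"}    # assumed: standard bacterial stop codons
MIN_AA = 60                      # candidates must be longer than this

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_aa=MIN_AA):
    """Yield (start, end) of ORFs in one frame; end is just past the stop codon."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon in STARTS:
            start = i                      # first possible initiator after a stop
        elif codon in STOPS:
            # (i - start) // 3 counts codons from the initiator up to the stop.
            if start is not None and (i - start) // 3 > min_aa:
                yield start, i + 3
            start = None


def six_frame_orfs(seq, min_aa=MIN_AA):
    """Yield (strand, frame, start, end); coordinates refer to the scanned strand."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_aa):
                yield strand, frame, start, end
```

Taking the first initiator codon after each stop gives the longest candidate per open frame; choosing among alternative starts is where a ribosome-binding-site check would come in.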

Figure 2
Linear map of the Lactococcus lactis ssp. lactis IL1403 chromosome. Coding regions are shown as arrows color-coded to the assigned functional categories. IS-elements and rRNA genes are shown as black arrows with white designation numbers inside. Symbols ...

L. lactis paralogous gene families were constructed by searching each predicted protein against all predicted proteins, using BLASTP with different parameters. Alignments of proteins in the identified families were then scrutinized to decide how many proteins belong to each family. This decision was based either on the size of the homologous domains or on the similarity levels. A protein was always assigned to only one family of paralogs.
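One simple way to guarantee that each protein ends up in exactly one family is to group all-against-all hits by single linkage. The sketch below is an assumption-laden stand-in, not the authors' procedure: their family boundaries were decided by manual inspection of alignments, whereas this code merges any two proteins connected by an accepted hit. The input format (a list of protein identifiers and a list of pairwise hits that already passed some threshold) and the function name are hypothetical.

```python
def group_into_families(proteins, hits):
    """Group proteins into disjoint families from pairwise similarity hits.

    proteins -- iterable of protein identifiers
    hits     -- iterable of (protein_a, protein_b) pairs judged to be paralogs
    Returns a list of sorted families with at least two members.
    """
    proteins = list(proteins)
    parent = {p: p for p in proteins}

    def find(x):
        # Follow parent links to the family representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a        # merge the two families

    families = {}
    for p in proteins:
        families.setdefault(find(p), []).append(p)
    return [sorted(members) for members in families.values() if len(members) > 1]
```

Single linkage can chain unrelated proteins together through multidomain members, which is one reason a manual check of domain size and similarity level, as described above, remains necessary.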

We tried to keep the same gene symbols as proposed by the previous authors for ORFs with functions experimentally confirmed in L. lactis (148 genes). A y prefix with the gene symbol consistent with its position on the chromosome (Fig. 2) was kept for unascertained functions (1149 genes). Other gene symbols, consistent with those for homologs found in other bacteria, are proposed here (1017 genes).

Accessibility of Data

The nucleotide sequence of the L. lactis IL1403 genome is available from NCBI under accession no. AE005176. Updated annotations are maintained on the Génétique Microbienne (INRA) server at http://spock.jouy.inra.fr. The PatScan program of Ross Overbeek (Dsouza et al. 1997) for pattern searches in DNA and protein sequences, implemented for IL1403, and the peptide spectrum identification tool PeptOko for L. lactis proteome research are also available from this server.


We thank Jacek Bordovski and Saulius Kulakauskas for giving samples of the L. lactis IL1403 strain and Marie-Christine and Alain Chopin, Patrick Duwat, Emmanuel Jamet, Alexandra Gruss, Emmanuelle Maguin, Isabelle Poquet, Pierre Renault, and Catherine Robert for helpful discussions. We also thank the Genome Centers that contributed to the Unfinished Microbial Genome Database available for BLAST search through the NCBI server (http://www.ncbi.nlm.nih.gov/Microb_blast/unfinishedgenome.html).

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.


E-MAIL sorokine@biotec.jouy.inra.fr; FAX 33-1-34-65-25-21.

Article published on-line before print: Genome Res., 10.1101/gr.169701.

Article and publication are at www.genome.org/cgi/doi/10.1101/gr.169701.


  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410.
  • Bolhuis A, Broekhuizen CP, Sorokin A, van Roosmalen ML, Venema G, Bron S, Quax WJ, van Dijl JM. SecDF of Bacillus subtilis, a molecular Siamese twin required for the efficient secretion of proteins. J Biol Chem. 1998;273:21217–21224.
  • Bolotin A, Mauger S, Malarme K, Ehrlich SD, Sorokin A. Low-redundancy sequencing of the entire Lactococcus lactis IL1403 genome. Antonie Leeuwenhoek. 1999;76:27–76.
  • Borodovsky M, McIninch J. GENMARK: A parallel gene recognition for both DNA strands. Comput Chem. 1993;17:123–133.
  • Buist G, Karsens H, Nauta A, van Sinderen D, Venema G, Kok J. Autolysis of Lactococcus lactis caused by induced overproduction of its major autolysin, AcmA. Appl Environ Microbiol. 1997;63:2722–2728.
  • Chiaruttini C, Milet M. Gene organization, primary structure and RNA processing analysis of a ribosomal RNA operon in Lactococcus lactis. J Mol Biol. 1993;230:57–76.
  • Chopin A, Chopin MC, Moillo-Batt A, Langella P. Two plasmid-determined restriction and modification systems in Streptococcus lactis. Plasmid. 1984;11:260–263.
  • Chopin A, Bolotin A, Sorokin A, Ehrlich SD, Chopin MC. Analysis of six prophages in Lactococcus lactis IL1403: Different genetic structure of temperate and virulent phage populations. Nucleic Acids Res. 2001;29:644–651.
  • Christensen J, Dudley E, Pederson J, Steele J. Peptidases and amino acid catabolism in lactic acid bacteria. Antonie Leeuwenhoek. 1999;76:217–246.
  • Cocaign-Bousquet M, Garrigues C, Loubiere P, Lindley ND. Physiology of pyruvate metabolism in Lactococcus lactis. Antonie Leeuwenhoek. 1996;70:253–267.
  • Cousineau B, Smith D, Lawrence-Cavanagh S, Mueller JE, Yang J, Mills D, Manias D, Dunny G, Lambowitz AM, Belfort M. Retrohoming of a bacterial group II intron: Mobility via complete reverse splicing, independent of homologous DNA recombination. Cell. 1998;94:451–462.
  • Dear S, Staden R. A sequence assembly and editing program for efficient management of large projects. Nucleic Acids Res. 1991;19:3907–3911.
  • Dickely F, Nilsson D, Hansen EB, Johansen E. Isolation of Lactococcus lactis nonsense suppressors and construction of a food-grade cloning vector. Mol Microbiol. 1995;15:839–847.
  • Dsouza M, Larsen N, Overbeek R. Searching for patterns in genomic data. Trends Genet. 1997;13:497–498.
  • Duwat P, Cochu A, Ehrlich SD, Gruss A. Characterization of Lactococcus lactis UV-sensitive mutants obtained by ISS1 transposition. J Bacteriol. 1997;179:4473–4479.
  • Fernandez S, Sorokin A, Alonso JC. Genetic recombination in Bacillus subtilis 168: Effects of recU and recS mutations on DNA repair and homologous recombination. J Bacteriol. 1998;180:3405–3409.
  • Fox PF. Cheese: An overview. In: Fox PF, editor. Cheese: Chemistry, Physics and Microbiology. London: Chapman & Hall; 1989. pp. 1–36.
  • Fraser CM, Fleischmann RD. Strategies for whole microbial genome sequencing and analysis. Electrophoresis. 1997;18:1207–1216.
  • Gasson MJ. Plasmid complements of Streptococcus lactis NCDO 712 and other lactic streptococci after protoplast-induced curing. J Bacteriol. 1983;154:1–9.
  • Godon JJ, Delorme C, Bardowski J, Chopin MC, Ehrlich SD, Renault P. Gene inactivation in Lactococcus lactis: Branched-chain amino acid biosynthesis. J Bacteriol. 1993;175:4383–4390.
  • Griffin HG, Swindell SR, Gasson MJ. Cloning and sequence analysis of the gene encoding L-lactate dehydrogenase from Lactococcus lactis: Evolutionary relationships between 21 different LDH enzymes. Gene. 1992;122:193–197.
  • Jensen PR, Hammer K. Minimal requirements for exponential growth of Lactococcus lactis. Appl Environ Microbiol. 1993;59:4363–4366.
  • Karudapuram S, Zhao X, Barcak GJ. DNA sequence and characterization of Haemophilus influenzae dprA+, a gene required for chromosomal but not plasmid DNA transformation. J Bacteriol. 1995;177:3235–3240.
  • Kravchenko VV, Gileva IP, Dobrynin VN, Filipov SA, Korobko VG. Location of the initiation codon AUG in relation to the 5′-end of mRNA mediates the effectiveness of translation in E. coli cells. Bioorg Khim. 1988;14:1387–1392.
  • Kunst F, Ogasawara N, Moszer I, Albertini AM, Alloni G, Azevedo V, Bertero MG, Bessieres P, Bolotin A, Borchert S, et al. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature. 1997;390:249–256.
  • Le Bourgeois P, Lautier M, Mata M, Ritzenthaler P. Physical and genetic map of the chromosome of Lactococcus lactis subsp. lactis IL1403. J Bacteriol. 1992;174:6752–6762.
  • Le Bourgeois P, Lautier M, van den Berghe L, Gasson M, Ritzenthaler P. Physical and genetic map of the chromosome of Lactococcus lactis subsp. cremoris MG1363 chromosome: Comparison with that of Lactococcus lactis subsp. lactis IL1403 reveals a large genome inversion. J Bacteriol. 1995;177:2840–2850.
  • Lee MS, Morrison DA. Identification of a new regulator in Streptococcus pneumoniae linking quorum sensing to competence for genetic transformation. J Bacteriol. 1999;181:5004–5016.
  • Le Loir Y, Gruss A, Ehrlich SD, Langella P. A nine-residue synthetic propeptide enhances secretion efficiency of heterologous proteins in Lactococcus lactis. J Bacteriol. 1998;180:1895–1903.
  • Lopez de Felipe FL, Kleerebezem M, de Vos WM, Hugenholtz J. Cofactor engineering: A novel approach to metabolic engineering in Lactococcus lactis by controlled expression of NADH oxidase. J Bacteriol. 1998;180:3804–3808.
  • Mickelson MN. Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol. 1972;109:96–105.
  • Ochman H, Elwyn S, Moran NA. Calibrating bacterial evolution. Proc Natl Acad Sci. 1999;96:12638–12643.
  • Pestova EV, Morrison DA. Isolation and characterization of three Streptococcus pneumoniae transformation-specific loci by use of a lacZ reporter insertion vector. J Bacteriol. 1998;180:2701–2710.
  • Pogliano JA, Beckwith J. SecD and SecF facilitate protein export in Escherichia coli. EMBO J. 1994;13:554–561.
  • Poolman B. Energy transduction in lactic acid bacteria. FEMS Microbiol Rev. 1993;12:125–147.
  • Poquet I, Saint V, Seznec E, Simoes N, Bolotin A, Gruss A. HtrA is the unique surface housekeeping protease in Lactococcus lactis and is required for natural protein processing. Mol Microbiol. 2000;35:1042–1051.
  • Quentin Y, Fichant G, Denizot F. Inventory, assembly and analysis of Bacillus subtilis ABC transport systems. J Mol Biol. 1999;287:467–484.
  • Rollan G, de Nadra MCM, Holgado PR, Oliver G. Aspartate metabolism in Lactobacillus murinus CNRS 313. I. Aspartase. J Gen Appl Microbiol. 1985;31:403–409.
  • Sanders JW, Leenhouts KJ, Haandrikman AJ, Venema G, Kok J. Stress response in Lactococcus lactis: Cloning, expression analysis, and mutation of the lactococcal superoxide dismutase gene. J Bacteriol. 1995;177:5254–5260.
  • Sanders JW, Leenhouts K, Burghoorn J, Brands JR, Venema G, Kok J. A chloride-inducible acid resistance mechanism in Lactococcus lactis and its regulation. Mol Microbiol. 1998;27:299–310.
  • Schleifer KH, Kraus J, Dvorak C, Kilpper-Bälz R, Collins MD, Fischer W. Transfer of Streptococcus lactis and related streptococci to the genus Lactococcus gen. nov. Syst Appl Microbiol. 1985;6:183–195.
  • Sijpesteijn AK. Induction of cytochrome formation and stimulation of oxidative dissimilation by hemin in Streptococcus lactis and Leuconostoc mesenteroides. Antonie Leeuwenhoek. 1970;36:335–348.
  • Sorokin A, Lapidus A, Capuano V, Galleron N, Pujic P, Ehrlich SD. A new approach using multiplex long accurate PCR and yeast artificial chromosomes for bacterial chromosome mapping and sequencing. Genome Res. 1996;6:448–453.
  • Steidler L, Hans W, Schotte L, Neirynck S, Obermeier F, Falk W, Fiers W, Remaut E. Treatment of murine colitis by Lactococcus lactis secreting interleukin-10. Science. 2000;289:1352–1355.
  • Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997;278:631–637.
  • Tedin K, Moll I, Grill S, Resch A, Graschopf A, Gualerzi CO, Blasi U. Translation initiation factor 3 antagonizes authentic start codon selection on leaderless mRNAs. Mol Microbiol. 1999;31:67–77.
  • Ten Brink B, Otto R, Hansen UP, Konings WL. Energy recycling by lactate flux in growing and non-growing cells of Streptococcus cremoris. J Bacteriol. 1985;162:383–390.
  • Tettelin H, Saunders NJ, Heidelberg J, Jeffries AC, Nelson KE, Eisen JA, Ketchum KA, Hood DW, Peden JF, Dodson RJ, et al. Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science. 2000;287:1809–1815.
  • Valyasevi R, Sandine WE, Geller BL. The bacteriophage kh receptor of Lactococcus lactis subsp. cremoris KH is the rhamnose of the extracellular wall polysaccharide. Appl Environ Microbiol. 1990;56:1882–1889.
  • Van Etten WJ, Janssen GR. An AUG initiation codon, not codon-anticodon complementarity, is required for the translation of unleadered mRNA in Escherichia coli. Mol Microbiol. 1998;27:987–1001.
  • Van Kranenburg R, Marugg JD, van Swam II, Willem NJ, de Vos WM. Molecular characterization of the plasmid-encoded eps gene cluster essential for exopolysaccharide biosynthesis in Lactococcus lactis. Mol Microbiol. 1997;24:387–397.
  • Yvon M, Chambellon E, Bolotin A, Roudot-Algaron F. Characterisation and role of the branched-chain aminotransferase (BcaT) isolated from Lactococcus lactis subsp. cremoris NCDO763. Appl Environ Microbiol. 2000;66:571–577.

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press