Identification and classification of innexin gene transcripts in the central nervous system of the terrestrial slug Limax valentianus

Intercellular gap junction channels and single-membrane channels have been reported to regulate electrical synapse and the brain function. Innexin is known as a gap junction-related protein in invertebrates and is involved in the formation of intercellular gap junction channels and single-cell membrane channels. Multiple isoforms of innexin protein in each species enable the precise regulation of channel function. In molluscan species, sequence information of innexins is still limited and the sequences of multiple innexin isoforms have not been classified. This study examined the innexin transcripts expressed in the central nervous system of the terrestrial slug Limax valentianus and identified 16 transcripts of 12 innexin isoforms, including the splicing variants. We performed phylogenetic analysis and classified the isoforms with other molluscan innexin sequences. Next, the phosphorylation, N-glycosylation, and S-nitrosylation sites were predicted to characterize the innexin isoforms. Further, we identified 16 circular RNA sequences of nine innexin isoforms in the central nervous system of Limax. The identification and classification of molluscan innexin isoforms provided novel insights for understanding the regulatory mechanism of innexin in this phylum.


Introduction
Gap junction channels are formed by docking of single-cell membrane channels between adjacent cells, and that allow intercellular communication via the exchange of small molecules such as ions, nucleotides, small peptides, and micro RNA (miRNA). Previous studies have indicated that the gap junction-related proteins are involved in the regulation of brain function. Neuronal gap junctions work as electrical synapses and exhibit plasticity [1,2]. In addition, single-cell membrane channels, which do not form intercellular channels, are also involved in neuroplasticity through their interaction with chemical synapses [2,3]. The channel properties of gap junction-related proteins are regulated by protein modifications such as N-glycosylation, S-nitrosylation, and phosphorylation [4][5][6][7]. As gap junction-related proteins in animal species, three protein families, innexin in invertebrates, connexin and pannexin in vertebrates, are known. These proteins share structural features that are characterized by four transmembrane domains, two extracellular loops, and intracellular N-and C-terminal domains [8]. The hexamers, heptamers, or octamers of these proteins form a single-cell membrane channel [9,10]. Additionally, multiple isoforms of these proteins contribute to the specific regulation of channel properties and intercellular docking of single-cell membrane channels. Previous studies have also reported the differential characteristics of the three protein families [11,12]. Vertebrate connexin proteins form both single-cell membrane channels and intercellular gap junctions. On the other hand, vertebrate pannexin proteins form only single-cell membrane channels in vivo as glycosylation at their extracellular domain inhibits the formation of intercellular gap junction channels. In invertebrates, innexin is the only protein family that exhibits the functions of both connexin and pannexin.
To start the elucidation of the regulatory function of gap junction-proteins in the brains, we here focused on innexin in molluscan species. In invertebrate neurophysiological research, molluscs are useful animal species because of their large, identifiable neurons. Neural circuit studies directly related to behavioral changes have been conducted, and electrical synapses between specific neurons have been studied [13][14][15]. For instance, electrical synapses have been demonstrated to exist and play important roles in feeding and withdrawal behaviors in the gastropod mollusc Lymnaea stagnalis [16][17][18][19]. The gastropod mollusc, terrestrial slug Limax valentianus is used as a model animal for studying the neural mechanism of learning and memory [20]. Especially, the olfactory memory of Limax has been well investigated [21,22] and electrical synapses are involved in the neural mechanism of olfactory detection [23,24].
Despite the many research on neural circuit identification and electrical synapses, there are limited studies on the molluscan innexin genes. Early studies identified two putative gap junction protein genes in the pteropod mollusc Clione limacina [25], and demonstrated that the injection of mRNA of one gene altered the electrical coupling between the identified neurons [26]. Recently, eight innexin gene transcripts were identified in Lymnaea stagnalis using transcriptome data [27]. Moreover, several studies have performed genomic and transcriptome analyses of various molluscan species [28-32], however, identified innexin homologs, including multiple isoforms, have not been classified.
This study aimed to identify innexin gene transcripts in the central nervous system (CNS) using the terrestrial slug Limax valentianus. We comparative and phylogenetic analyses of the predicted amino acid sequences of molluscan species were performed to classify the innexin homologs. We here examined the characterization of molluscan innexin isoforms, and first reported circular RNAs (circRNAs) expression of gap junction-related protein in the CNS. Our findings will facilitate future research on the regulatory mechanisms of gap junctionrelated proteins in the CNS of molluscan species.

Animals
All experiments were performed using the terrestrial slugs Limax valentianus at 3-4 months post-hatching. The animals were maintained under laboratory conditions at 19˚C with a 12-h light/dark cycle. They were fed with humidified powder mixture containing the following: 520 g rat chow (Oriental Yeast, Tokyo, Japan), 500 g potato starch (Hokuren, Hokkaido, Japan), and 21 g vitamin mixture (AIN-76, Oriental Yeast). For the CNS isolation, slugs were anesthetized by a body cavity injection of Mg 2+ buffer solution containing the following (in mM): 60.0 MgCl 2 , 5.0 glucose, and 5.0 HEPES, pH 7.0. The CNS were isolated in ice-cold high-Mg 2+ saline containing the following (in mM): 35.0 NaCl, 2.0 KCl, 28.0 MgCl 2 , 4.9 CaCl 2 , 5.0 glucose, and 5.0 HEPES, pH 7.0. The isolated ganglia were transferred to a dish filled with a Limax saline solution containing the following (in mM): 70.0 NaCl, 2.0 KCl, 4.9 CaCl 2 , 4.7 MgCl 2 , 5.0 glucose, and 5.0 HEPES, pH 7.0. The isolated CNS was frozen in liquid nitrogen for RNA extraction.

Polymerase Chain Reaction (PCR)
RNA was extracted from the dissected CNS of three animals using the Nucleospin RNA XS kit and treated with DNAse I (Macherey & Nagel, Düren, Germany). Total RNA (100 ng) was reverse-transcribed using M-MLV reverse transcriptase (Invitrogen, Carlsbad, CA) and random primers, following the manufacturer's instructions. The cDNA was subjected to PCR using ExTaq DNA polymerase (Takara Co., Otsu, Japan) or PrimeSTAR GXL DNA polymerase kit (Takara Co.). The sequence-specific primers for Limax innexin mRNAs were designed based on the innexin homolog sequences identified through transcriptome shotgun assembly of Limax valentianus (paper in preparation). For amplifying circRNA, divergent primers were designed based on the verified complementary DNA (cDNA) sequences. The primer sequences are listed in supporting information (S1 Table). The amplicons were subcloned into the TOPO™ vector (Invitrogen) and subjected to nucleotide sequencing analysis.

Protein sequence analyses of predicted innexin homologs
Phylogenetic analysis was performed using the available molluscan innexin sequences in the Swissprot, Genbank, and Refseq databases with basic local alignment search tool (BLAST). Multiple sequence alignment of molluscan innexin homologs was performed using MUSCLE. The phylogenetic tree was constructed using the maximum likelihood method with MEGAX software [33].

Identification of Limax innexin homologs
To identify the gene transcripts of gap junction proteins in the Limax CNS, a local BLASTX search was performed for our transcriptome data (paper in preparation) using sequences of Aplysia innexin homologs. The cDNA sequences were cloned and subjected to nucleotide sequencing. In this study, we identified 16 Limax innexin homologs with the entire coding domain sequence (CDS) (Limax innexin 1-11 including spliced isoforms) and partial CDS of Limax innexin 12 (accession numbers LC595664-LC595679).
The deduced protein sequences were used for predicting transmembrane domain using OCTOPUS software [38]. Amino acid alignment analysis revealed that the Limax innexin homologs conserve the typical characteristics of innexin proteins, four transmembrane domains that are connected by the first and second extracellular loops and one intracellular loop (Fig 1). Around the second transmembrane domain, each Limax innexin had a and Limax innexin 11x1-11x2. The boxes show the four predicted transmembrane domains (TM1 to TM4), while the red arrows indicate two extracellular loops (EC1 and EC2). The black asterisks indicate the extracellular cysteine residues in the extracellular loops, while the red asterisk indicates the additional cysteine residue. The black arrow-heads indicate the P-X-X-X-W motif. https://doi.org/10.1371/journal.pone.0244902.g001
Another typical characteristic of innexin proteins is the conserved cysteine (Cys) residues in the extracellular loops [39]. In invertebrate innexin, two pairs of Cys form essential intramolecular disulfide bonds between the first and second extracellular loops [42]. The Limax innexin homologs had two Cys residues each in the first and second extracellular loops (Fig 1; e.g. Limax innexin 1, Cys 56 and Cys 74 in the first extracellular loop; Cys 262 and Cys 279 in the second extracellular loop).
Interestingly, Limax innexin 1-9, except Limax innexin 4, had an additional Cys between the two Cys residues in the first extracellular loop (Fig 1; e.g. Limax innexin 1, Cys 58 ). The additional Cys residue was also detected in the same region of other innexin and pannexin sequences of limited species reported in the public databases. In invertebrate innexins, the additional Cys residue was detected in six of the 12 leech innexins (Hirudo medicinalis innexins 1, 3, 6, 9, 11, and 12) but not found in the innexin sequences of insects and nematodes. In vertebrates, the additional Cys residue was detected in pannexin 3 of the clade Cetartiodactyla, such as cattle (Bos taurus NP001137556.1) and long-finned pilot whales (Globicephala melas XP030691269.1), but not in other pannexin isoforms (pannexin 1 and 2) or in connexins. The function of this additional Cys residue has not been reported in any animal species, however, these results indicate that the additional Cys residue is not a unique feature of molluscan innexins.
Sequence alignment analysis also revealed differences among the identified Limax innexins (Fig 1). The lengths of the extracellular loops were well conserved among Limax innexin 1-10 (the first extracellular loops, 53-55 amino acid residues; the second extracellular loop, 66-68 amino acid residues). On the other hand, the first extracellular loop in Limax innexin 11 was longer than that in Limax innexin 1-10 (the first extracellular loop, 69 amino acids; the second extracellular loop, 63 amino acids). This result is similar to the observation in vertebrate connexin and pannexin with different lengths of the extracellular loops [43]. Furthermore, the lengths between two Cys residues in the second extracellular loops largely differed between Limax innexin 1-10 and Limax innexin 11 (Limax innexin 1-10, 16 amino acids; Limax innexin 11, 12 amino acids). Previous studies on vertebrate connexins have reported that the amino acid sequences between the Cys are critical for the formation of functional intercellular channels [9,44,45]. By focusing on the differences between vertebrate connexins and pannexins, the isoforms of Limax innexin can be broadly divided into two groups, Limax innexin 1-10 and Limax innexin 11.
Even though the genomic data is available in Aplysia and Elysia, the number of innexin homolog genes varied among species (n = 20 for Aplysia californica; n = 14 for Elysia chlorotica). Consistent with this observation, a previous study on vertebrate connexins demonstrated that the number of connexin genes in zebrafish (n = 37) was approximately two times higher than that in humans (n = 20) and mice (n = 19) [48]. The authors suggested that gene duplication events occur continuously in each phylum, which contributes to the species-specific regulation by gap junction-related proteins.
Molecular phylogenetic analysis revealed that the identified Limax innexin gene transcripts expressed in the CNS can be classified into seven ortholog groups (

Prediction of post-translational modification sites: N-glycosylation, Snitrosylation, and phosphorylation
To classify the molluscan innexin homologs, potential protein modification sites were detected in the deduced amino acid sequences of the seven ortholog groups. The predicted modification sites in the innexin sequences of each group are shown in Figs 3-9. The conserved modification sites between orthologs are summarized in Fig 10. Potential N-glycosylation sites were detected only in the innexin homologs of Group 2 and 7. In Group 2, one potential N-glycosylation site was identified at the first extracellular loop in two of the six orthologs (Limax innexin 3, N 227 FT; Biomphalaria XP013085202.1 partial CDS sequence, N 110 YS). In Group 7, two N-glycosylation sites were identified in the first extracellular loop in all orthologs of Group 7 (Limax innexin 11 splice variants, Limax innexin 11x1, N 85 MT and N 102 ET; Limax innexin 11x2, N 85 LS and N 102 ET; Lymnaea FX187001, N 85 LS and N 102 ET; Elysia EGW08_022284 partial sequence, N 102 AT; Biomphalaria innexin-19-likex1 N-terminal partial sequence, N 118 LS and N 134 TT; Aplysia innexin unc-9-likex2, N 85 LS and N 101 LT), except Limax innexin 12 for which we did not obtain the full-length coding sequence.
The result revealed that N-glycosylation modification occurs in the molluscan innexins of certain ortholog groups. Consistent with this finding, vertebrate pannexin proteins have Nglycosylation sites at the extracellular loops, whereas nearly all connexins do not have the Nglycosylation sites. N-glycosylation of pannexins regulates the subcellular localization and intermixing of different pannexins for channel formation [49,50]. Additionally, N-glycosylation of pannexin inhibits the intercellular docking of neighboring pannexin channels and the formation of gap junctions [50,51]. To summarize the results so far, molluscan innexins of Group 7 exhibited the following two pannexin-like characteristics: N-glycosylation at the extracellular loops and longer extracellular loops compared to other homologs in Group 1-6.
We also identified S-nitrosylation sites in the molluscan innexin homologs of the seven groups. As a result, the conserved S-nitrosylation sites were found in the intracellular loops of

PLOS ONE
limited ortholog groups, Group 4 and 5 (Figs 6, 7 and 10). S-nitrosylation, which is the covalent addition of nitric oxide (NO) to the Cys residue, is reported to be involved in the regulation of channel properties, such as channel permeability and voltage-sensitive gating of vertebrate connexins and pannexins [52][53][54]. In molluscs, various studies have examined NO function in the CNS [24, [55][56][57][58][59][60]. Our findings gave a new insight that NO regulates neuronal communication in molluscan CNS by modifying the gap junction-related proteins.
Further, we identified the phosphorylation sites of multiple kinases, PKC, PKA, PKG, CK1, CK2, SRC, cdc2, CaMK, and p38MAPK (Figs 3-10). These kinases regulate the protein functions of vertebrate connexin/pannexin, such as oligomerization, channel properties, intracellular trafficking, gap junction assembly, and stability [5,6,61,62]. The results showed that the phosphorylation sites for PKC, PKA, CK1, CK2, SRC and cdc2 were relatively conserved among the orthologs in the same group (Fig 10). The phosphorylation sites for PKG, CaMK, and p38MAPK were not conserved in the ortholog groups. Although Group 4 and 5 were phylogenetically related (Fig 2), a surprisingly high number of phosphorylation sites was found in the orthologs in Group 4 compared with Group 5 (Figs 6, 7 and 10). The results showed that the protein kinase phosphorylation sites are conserved among the orthologs and that each ortholog group exhibits a distinctive phosphorylation pattern.
Exceptionally, the orthologs of Group 2 exhibited low sequence homology and variability of the protein modification sites (Figs 4 and 10). More, although Limax innexin 3 and 4 of Group 2 showed the highest sequence similarity, the protein modification sites were largely different between them (Figs 2, 4 and 10). At the extracellular loop, Limax innexin 3 had an N-glycosylation site, while Limax innexin 4 had two S-nitrosylation sites. We identified extra phosphorylation sites in the intracellular C-terminal domain of Limax innexin 3 compared to Limax innexin 4 (Fig 4; T 385 , T 391 , and S 398 for PKC; T 369 , S 405 , and S 416 for CK2; S 377 and S 398 for cdc2). Similar to this finding, the numbers of phosphorylation sites in the intracellular C-

PLOS ONE
termini show large variability among vertebrate connexin isoforms [61]. For instance, connexin 43 is phosphorylated by at least five different protein kinases at more than 12 motifs, while connexin 26 has no phosphorylation motif in this region. The phosphorylation of connexin C-terminus has been reported to regulate the trafficking, assembly, and degradation [61]. Thus, Limax innexin 3 and 4 channels may be differently regulated by phosphorylation, even though they are phylogenetically close to each other.

CircRNA identification of Limax innexin genes
CircRNA is a class of non-coding RNAs that are generated through alternative back-splicing, which connects the terminal 5' and 3' ends of linear pre-mRNA [63], and the function has not been fully elucidated. Previous studies have demonstrated the presence of circRNAs throughout animal species, such as mammals, insects, and nematodes [64,65]. In the brain tissue, cir-cRNA is especially abundant and lots of circRNA-producing genes encode synapse-related proteins [64,66].
In the RT-PCR experiments of Limax innexin transcripts in the CNS, we obtained multiple amplicons for some innexin homologs. Sequencing analysis revealed that these amplicons were circRNAs. To identify the circRNAs of the innexin genes, we designed divergent primers for the coding region of Limax innexin sequences, as a previous study showed that circRNAs mainly comprise coding exons [65]. As a result, we identified 16 circRNA sequences for nine Limax innexin genes in the CNS (Table 1). Nine circRNAs each encoded a truncated protein  lacking the N-terminal end (circINX2b, circINX2c, circINX4, circINX6b, circINX7a, cir-cINX8a, circINX8b, circINX9x2, and circINX9x4), and three circRNAs each encoded the fulllength protein (circINX2a, circINX3, and circINX7b). The functions of circRNAs generated from the connexin, pannexin, and innexin genes have not been elucidated. Thus, we also examined the circRNAs of vertebrate connexin and pannexin genes using a database that contains >32,000 human circRNAs [67], and retrieved cir-cRNA data of four connexin genes (GJA1, connexin 43; GJB3, connexin 31; GJB5, connexin 31.1; GJC1, connexin 45) and one pannexin gene (PNX1, pannexin 1). Of these, circRNAs of

PLOS ONE
The results demonstrated that circRNAs are constitutively formed from the innexin genes in the CNS of Limax, and that circRNA formation from gap junction-related protein genes is not a characteristic event in molluscs. The observation suggested that circRNAs can be involved in the regulatory mechanism of certain gap junction-related proteins.

Discussion
In this study, the innexin homologs expressed in the CNS of Limax were identified and characterized. Phylogenetic analysis demonstrated the diversity of molluscan innexin genes and the In invertebrate species, innexin is the only protein family that forms gap junctions. Even though innexin sequence data in molluscan species have been accumulated through comprehensive sequence analysis studies, the multiple isoforms have not been classified. Based on the phylogenetic analysis, we classified Limax innexin homologs into seven molluscan ortholog groups. One ortholog group, Group 7, exhibited vertebrate pannexin-like characteristics at the extracellular loops with the lengths and the glycosylation sites. In vertebrates, previous studies have demonstrated that the intercellular docking of two single-cell membrane channels is regulated by the interaction between the extracellular loops of gap junction-related proteins. Pannexin proteins mainly form single-cell membrane channels in vivo, and glycosylation of pannexin at the extracellular loop inhibits the formation of intercellular channels [12,51]. In invertebrates, only one study examined the glycosylation of innexin homologs in the yellow fever mosquito Aedes aegypti [68]. The study reported that two of the six Aedes innexins (AeInx3 and AeInx7) have putative glycosylation sites and that AeInx3 is an important component of gap junctions. However, AeInx3 exhibited multiple bands in the western blot, that indicates several stages of modifications on AeInx3. Regard to the previous studies of pannexin, the glycosylation status of AeInx3 protein can inhibit the formation of intercellular gap junction channels, and the authors mentioned the effect of glycosylation on intercellular channel formation. In our study, in addition to the pannexin-like orthologs in Group 7, Limax innexin 3 in Group 2 had an N-glycosylation site. This indicates that the N-glycosylation status of Limax innexin 3 possibly regulates the intercellular docking of the channels.
In addition to the N-glycosylation sites, we identified the phosphorylation and S-nitrosylation sites and characterized molluscan innexins of the seven ortholog groups. Each group exhibited a distinctive protein modification pattern and the protein modification sites were well conserved among the orthologs (Fig 10). The results indicate that the regulatory mechanism of innexin channels is well conserved among orthologs and the duplication of innexin genes have generated innexin isoforms with new functions, as previously reported for vertebrate connexins [48]. Exceptionally, the orthologs in Group 2 exhibited low sequence homology and did not conserve N-glycosylation sites and protein modification sites (Figs 2  and 10). The variability among the orthologs in Group 2 may arise from continuous gene duplication events that produce new functions of innexin proteins in Group 2. Further, we examined the formation of circRNAs from Limax innexin genes. We also identified circRNAs of human connexin and pannexin in the circRNA database [67]. Even though a lot of research has been conducted on gap junction-related proteins in vertebrates, there is no study that examined the existence of circRNA. So far, only one review proposed that mammalian connexin 43-derived circRNA functions as a miRNA sponge during breast cancer initiation stages [69], however, this has not been experimentally demonstrated yet. The function of circRNAs has been under investigation and several studies reported that circRNAs function as miRNA sponges [65,70]. On the other hand, recent studies further reported that several cir-cRNAs are associated with polysomes and translated [71,72]. In this study, we identified twelve of the 16 Limax innexin circRNAs encoding N-terminal truncated innexin proteins, indicating that circRNAs can be templates for translation. Interestingly, several studies on vertebrate connexin reported the unexpected functions of N-terminal truncated connexin that are generated from internal translation initiation [73][74][75]. The short connexin 43 isoform is involved in trafficking the full-length connexin 43 to the gap junction site. Additionally, the short connexin 43 isoform directly regulates N-cadherin transcription, which results in neural crest cell migration at an early developmental stage in amphibian and mammalian cells. Thus, the translation of Limax innexin circRNA may generate truncated innexin proteins that have uncharacterized roles, such as protein trafficking and regulation of gene expression.
As described above, there are multiple innexin gene products, including circRNAs, expressed in the CNS of Limax. The present data directly contribute to future analysis of gene expression using quantitative RT-PCR or in situ hybridization in the CNS. In order to elucidate the physiological functions of each innexin homologs on electrical synapses, such as local electrical junctions of bursting neurons in the Limax procerebral lobe [23,24], it would be helpful to apply gene knockout systems using siRNA (short interfering RNA) and CRISPR/ CAS9 systems to electrophysiology experiments.

Conclusion
In this study, multiple innexin transcripts were identified in the CNS of Limax and the molluscan innexin sequences were phylogenetically classified. Additionally, circRNA formation of multiple innexin genes was demonstrated, which provided novel insights for understanding the regulatory mechanism of gap-junction-related proteins. The results of this study will facilitate further research on the function of innexin proteins and contribute to the development of new research on gap junction proteins.
Supporting information S1