NGS in Hereditary Ataxia: When Rare Becomes Frequent

The term hereditary ataxia (HA) refers to a heterogeneous group of neurological disorders with multiple genetic etiologies and a wide spectrum of ataxia-dominated phenotypes. Massive gene analysis in next-generation sequencing has entered the HA scenario, broadening our genetic and clinical knowledge of these conditions. In this study, we employed a targeted resequencing panel (TRP) in a large and highly heterogeneous cohort of 377 patients with a clinical diagnosis of HA, but no molecular diagnosis on routine genetic tests. We obtained a positive result (genetic diagnosis) in 33.2% of the patients, a rate significantly higher than those reported in similar studies employing TRP (average 19.4%), and in line with those performed using exome sequencing (ES, average 34.6%). Moreover, 15.6% of the patients had an uncertain molecular diagnosis. STUB1, PRKCG, and SPG7 were the most common causative genes. A comparison with published literature data showed that our panel would have identified 97% of the positive cases reported in previous TRP-based studies and 92% of those diagnosed by ES. Proper use of multigene panels, when combined with detailed phenotypic data, seems to be even more efficient than ES in clinical practice.


Introduction
The term hereditary ataxia (HA) refers to a heterogeneous group of rare neurodegenerative disorders with a wide spectrum of ataxia-dominated phenotypes. Gait abnormalities, lack of coordination, dysarthria, and dysmetria are the most common clinical traits, associated with degeneration of Purkinje cells and/or spinocerebellar connections, often combined with atrophy of other regions of both the central and peripheral nervous systems [1].
Although these conditions have been formally classified on the basis of patterns of transmission and disease-gene relationships, different examples of commonalities with a range of clinical syndromes are now rapidly emerging. As a result, it is common to see HA overlapping with other neurological diseases (e.g., hereditary spastic paraplegia, epilepsy, and hypo- and hyperkinetic movement disorders) [2]. Autosomal dominant spinocerebellar ataxia (SCA) currently has an overall estimated prevalence of 1.5-4.0 × 10⁻⁵ [3] and includes more than 40 clinical conditions [4]. Most of these, caused by pathological trinucleotide repeat expansions in coding regions, are termed polyQ SCA. On the other hand, the forms collectively termed autosomal recessive spinocerebellar ataxia (SCAR, prevalence: 1.8-4.9 × 10⁻⁵ [3]) are caused by mutations in more than 100 genes. Changes in a similar number of genes are responsible for recessive forms in which ataxia is only part of the clinical picture.
Thanks to the advent and growth of next-generation sequencing (NGS), our knowledge of HA and its genetic pathogenesis has broadened over the past decade [5]. Since its first application in a small subset of patients [6], various clinical applications of the technique, now common and based on targeted resequencing panels (TRPs) or exome sequencing (ES), have been used in attempts to establish genetic diagnoses in several cohorts with undetermined ataxia. Overall, TRP and ES have been found to have an average diagnostic yield of 19.4% and 34.6%, respectively. However, because of differences in gene coverage, performance quality, data analysis, and global costs, together with the different needs of specific laboratories, it remains difficult to identify the most powerful technique in absolute terms.
Here, we report a large cross-sectional study of 377 highly heterogeneous patients with genetically uncharacterized HA. Using a TRP approach, we obtained a diagnostic rate of 33.2%, nearly twice that reported in other TRP-based studies and comparable to those employing ES. Overall, our data point to increasingly frequent mutation of genes until now considered only very rarely involved in HA and show that massive parallel sequencing is currently unveiling a large set of phenotypes associated with ataxia. They also indicate that TRPs are still suitable for the genetic screening of large cohorts of patients.

Results
A family history was found in 62 (16.5%) of our patients (27 autosomal dominant and 35 autosomal recessive), while the vast majority (315, 83.5%) were sporadic (Figure 1). At least one relative could be tested in 150 (39.8%) of the index cases. Overall, we tested 297 affected or unaffected relatives for segregation studies (Figure 1). Phenotypically, our cohort was highly heterogeneous, including patients with pure cerebellar ataxia; spastic ataxia; congenital ataxia; sensory ataxia; and even ataxia with seizures, myoclonus, peripheral neuropathy, or combinations of these. This reflects the broad heterogeneity of HA seen in routine clinical practice in movement disorder centers and ataxia clinics. Age at onset was also highly variable: we had childhood-/teenage-onset cases (<16 yrs; 94 index cases, 24.9%) as well as early- (<40 yrs; 81/377, 21.5%) and late-onset ones (≥40 yrs; 195/377, 51.7%); age at onset was not ascertained in seven patients (1.9%). The high level of clinical and genetic heterogeneity observed in this cohort is in line with routine referrals to neurogenetics laboratories in our country.
Variants of pathogenic or putative pathogenic significance, defined according to the criteria of the American College of Medical Genetics and Genomics, were identified in 125 patients (33.2%) (Figure 2A): 69 males (55.2%) and 56 females (44.8%). Fifty-nine patients (15.6%) had an uncertain molecular diagnosis because of the presence of variants of unknown significance (VOUS). The VOUS group also included variants detected in patients with phenotypic data not detailed enough to help with variant prioritization and/or biallelic mutations that could not be phased due to lack of parental DNA (Figure 2A).
Furthermore, patients assigned to the VOUS group also included those harboring variants of putative pathogenic significance in genes not clearly correlated with a specific phenotype, and those carrying single likely pathogenic variants in genes known to cause only recessive disorders [7]. One hundred and ninety-three patients (51.2% of the whole set) were negative on NGS analysis (Figure 2A). It is tempting to advance various hypotheses in these cases: variants falling outside the coding exons of the genes included in our TRP strategy, variants in yet-to-be-discovered new genes or in genes not canonically associated with HA, or even the possibility of another condition mimicking HA.
In our study, the diagnostic yield in cases with a family history was twice that obtained in sporadic cases (Figure 2B). Moreover, the possibility of achieving a molecular diagnosis increased with increasing age at onset (Figure 2B). Unsurprisingly, the possibility of performing confirmatory genetic studies in other relatives also increased the diagnostic rate, whereas no significant differences were detected when considering the gender of the index cases (Figure 2B).
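The outcome proportions quoted in this section follow directly from the reported counts. As a quick arithmetic check, the minimal Python sketch below recomputes them; the helper name `share` is ours and is not part of the study's pipeline.

```python
# Counts reported in the text for the 377 index cases.
COHORT = 377
outcomes = {"positive": 125, "uncertain (VOUS)": 59, "negative": 193}


def share(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# The three outcome groups should partition the whole cohort.
assert sum(outcomes.values()) == COHORT

for label, count in outcomes.items():
    print(f"{label}: {count}/{COHORT} = {share(count, COHORT)}%")
```

The three groups add up to 377, so every index case is assigned to exactly one outcome.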
Among the positive cases, our variant-filtering criteria identified 164 mutations as pathogenic or likely pathogenic (Table 1; in silico predictions, population frequencies, and ACMG classifications are listed in Table S1). Missense variants (107/164, 65.2%) were the most common, whereas frameshift, nonsense, splicing, large deletion, and in-frame insertion or deletion mutations were less common, accounting for 24 (14.6%), 15 (9.1%), 10 (6.1%), 4 (2.4%), and 4 (2.4%) of the 164 mutations, respectively (Figure 3A). Large deletions were identified on the basis of complete absence of coverage in the same SPG7 exon in two cases, suggesting a possible homozygous single-exon deletion, which was subsequently confirmed by gene-specific multiplex ligation-dependent probe amplification (MLPA) testing.
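The reasoning behind the large-deletion calls (an exon with no reads in one patient although it is well covered in the other samples) can be sketched as follows. This is only an illustration under our own assumptions: the function name, the depth threshold, the exon labels, and the toy depths are hypothetical and do not come from the study, and any such candidate would still need confirmation by MLPA, as was done here.

```python
from statistics import median


def flag_candidate_deletions(depth, min_cohort_depth=50):
    """Flag (sample, exon) pairs suggestive of a homozygous exon deletion.

    depth maps sample -> {exon: mean read depth}. A pair is flagged when the
    sample has zero depth although the median depth of the other samples is
    at least min_cohort_depth (an arbitrary, illustrative threshold).
    """
    flagged = []
    for sample, exons in depth.items():
        for exon, d in exons.items():
            others = [depth[s][exon] for s in depth if s != sample and exon in depth[s]]
            if d == 0 and others and median(others) >= min_cohort_depth:
                flagged.append((sample, exon))
    return flagged


# Toy depths with placeholder exon labels; not study data.
toy = {
    "P1": {"exon_A": 210, "exon_B": 0},
    "P2": {"exon_A": 190, "exon_B": 180},
    "P3": {"exon_A": 205, "exon_B": 175},
}
print(flag_candidate_deletions(toy))
```

Comparing against the other samples in the run helps separate a true loss of the exon from a region that is simply hard to capture in every sample.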
To better define the pathogenic role of the variants identified, we performed in silico functional investigations focusing on all novel missense variants of pathogenic significance (n = 36). Phylogenetic examination showed that all the affected residues are highly conserved across species (Figure S1), and protein domain localization analysis indicated that 28/36 (77.8%) mutations lay in regions thought to have a critical role in protein function (Table S2). Moreover, to further confirm the robustness of our results, we used multiple computational methods to predict protein stability changes upon mutation in terms of changes in folding free energy (∆∆G) between wild-type and mutant structures. Among proteins whose 3D structures were freely available in online databases, we could computationally analyze the effect of seven variants of pathogenic significance. Alongside these, we analyzed four variants in STUB1 found in this study and already known to be disease causing [61,63], which served as positive controls. Our analyses converged on a destabilizing effect on protein stability in 4/7 cases (Table S3), accompanied by changes in interatomic interactions (Figure S2).
Among the 125 cases with a positive diagnosis, we were able to study first-degree relatives in 62/125 (49.6%).
In 24/125 (19.2%) patients, SCA genes associated with non-polyQ forms were found to be causative, while SCAR genes were mutated in 40 (32.0%). Interestingly, nearly half of the positive index cases (58/125, 46.4%) harbored variants of pathogenic significance in genes not respecting zygosity rules and known to cause both SCA and SCAR (Figure 3A). Variants in 56 different genes were considered causative in our group of 125 positive cases, indicating a very high level of genetic heterogeneity. Of note, the nine most common disease-causing genes (STUB1, PRKCG, SPG7, CACNA1A, PNPLA6, SYNE1, TMEM240, CACNA1G, and ITPR1, in that order of frequency) accounted for nearly half (55/125, 44.0%) of all our positive cases (Figure 3B). STUB1 was the most common (11/125, 8.8%), followed by PRKCG (8.0%) and SPG7 (6.4%) (Figure 3B). On sorting pathogenic variants by their associated GO terms and potential disease mechanisms, we observed that mutations in genes involved in protein homeostasis and quality control (found in 24/125 patients, 19.2%) were the most frequent in our cohort. However, genes coding for ion channels (17 index cases, 13.6%) or involved in signal transduction (14, 11.2%) were also frequent (Figure 3C). Furthermore, we noticed a significant frequency of variants in genes involved in cytoskeleton and cell ultrastructure functions (in nine index cases, 7.2%), lipid metabolism (eight, 6.4%), DNA repair and maintenance (seven cases, 5.6%), transport proteins (seven, 5.6%), intracellular transport (six, 4.8%), and electron respiratory chain/oxidative metabolism (five patients, 4.0%) (Figure 3C). Few patients had pathogenic variants in genes encoding tRNA synthetases, proteins involved in DNA replication and transcription, ciliary and mitochondrial biogenesis, and homeostasis; variants in genes linked to other molecular pathways were even rarer.
Among the 59 patients with an uncertain molecular diagnosis (Table S4), most had late-onset disease (>40 yrs; 37/59, 62.7%), almost all were sporadic cases (55/59, 93.2%), and in the majority it was not possible to study biological samples from relatives for segregation analyses (54, 91.5%). A positive family history and the possibility of performing segregation analysis can of course corroborate or exclude the pathogenic role of detected gene variants. Forty-seven of the 193 negative cases (24.3%) were initially considered as VOUS, but further assessments in family relatives helped to exclude the potential involvement of "candidate" variants.
We then compared our results with those published earlier by others. Our literature analysis (conducted using PubMed and Google Scholar; latest access 4 January 2021) identified 27 published studies that applied NGS methods in heterogeneous cohorts of genetically undiagnosed HA patients. TRP strategies were employed in nine (33%) studies and ES in 15 (56%), whereas three studies (11%) employed both methods (Table 2). Overall, the index cases involved in TRP and ES studies numbered 1262 and 1179, respectively. In analyzing the literature data and calculating the mean diagnostic yield across studies, weighted by cohort size, we observed that the use of TRPs led to a mean diagnostic yield of 19.4% (range: 11-82%), whereas ES delivered a mean diagnostic rate of 34.6% (range: 20-57%) (Figure 2A). Table S5 lists the disease-causing genes in each study. We did not consider percentages of patients with VOUS, as this information is lacking in most studies. As in our study, the inclusion criteria used in the previous reports identified by our literature search were broad, allowing the inclusion of pediatric and late-onset patients, familial or sporadic ones, and cases with different patterns of inheritance. In several studies (13/27, 48%), the inclusion criteria were even more relaxed, generating highly heterogeneous cohorts. Interestingly, most studies suggest that in the presence of certain features, such as early onset, a positive family history, the availability of segregation studies, and consanguinity in the family, the likelihood of reaching a definitive genetic diagnosis is high.
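A cohort-size-weighted mean yield, as we read the description above, is equivalent to pooling all diagnosed cases over all index cases. The sketch below illustrates this under that assumption; the function name and the toy study sizes are hypothetical and are not the values in Table 2.

```python
def weighted_mean_yield(studies):
    """Return the cohort-size-weighted mean diagnostic yield in percent.

    studies is a list of (n_index_cases, n_diagnosed) pairs. Weighting each
    study's yield by its cohort size is the same as dividing the total number
    of diagnosed cases by the total number of index cases.
    """
    total_cases = sum(n for n, _ in studies)
    total_diagnosed = sum(d for _, d in studies)
    return 100 * total_diagnosed / total_cases


# Toy values for illustration only: yields of 20%, 40%, and 16%.
toy_studies = [(100, 20), (50, 20), (250, 40)]
print(round(weighted_mean_yield(toy_studies), 1))
```

With these toy values the weighted mean is lower than the simple average of the three yields, because the largest cohort has the lowest yield. This is why a small study with a very high yield has little influence on the pooled figure.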

Discussion
The diagnostic rate (33.2%) obtained in our study was nearly twofold the average weighted value described in the TRP-based studies of HA (19.4%) reported in the literature, and comparable to the value reported in those using ES as a first-tier approach (34.6%). Considering the high coverage reached with multigene panels, and the easier and faster analysis of their results compared with the more commonly used ES method (where low, not always uniform coverage might result in gaps), it can be argued, as others have done [93], that multigene panels are still worth using for quick screening of large cohorts. Our results indicate that the large multigene panel we designed would have intercepted 97.5% (235/241) of the diagnosed cases in published TRP series (excluding those with mtDNA mutations) and most of those solved by ES (92.2%; 376/408) (Table S8). The TRPs used in the literature included an average of 73 genes, and detected mutations in 45, whereas ES-based studies detected variants in 127 genes; these data further corroborate the high diagnostic power of our panel (>200 genes analyzable).
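The comparison with the literature amounts to asking, for each published diagnosed case, whether its causative gene is on the panel. A minimal sketch of that calculation is given below, assuming one causative gene per case; the function name, the three-gene panel, and the placeholder symbol `GENE_X` are illustrative only, and the real gene lists are in Tables S5 and S9.

```python
def panel_recall(case_genes, panel_genes):
    """Return (hits, total, percent) of diagnosed cases whose gene is on the panel.

    case_genes holds one causative gene symbol per published diagnosed case.
    """
    hits = sum(1 for gene in case_genes if gene in panel_genes)
    total = len(case_genes)
    return hits, total, round(100 * hits / total, 1)


# Toy example: GENE_X stands for a gene that is absent from the panel.
toy_panel = {"STUB1", "PRKCG", "SPG7"}
toy_cases = ["STUB1", "SPG7", "SPG7", "GENE_X"]
print(panel_recall(toy_cases, toy_panel))

# The percentages quoted in the text follow from the reported counts.
print(round(100 * 235 / 241, 1), round(100 * 376 / 408, 1))
```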
Nonetheless, with the costs of both techniques (ES in particular) now rapidly declining, we suggest that the most appropriate approach for the coming years, at least until we have cost-effective whole-genome sequencing (WGS) strategies suitable for routine clinical use, might be to combine ES (ensuring minimum average coverage of 100X) with an in silico panel (including the genes in our TRP) for gene prioritization.
It is reasonable to assume that the significantly higher diagnostic yield achieved in our study, compared with those described in other TRP-based approaches, mainly depends on the number of genes analyzed (285), which is nearly four times the average for panels in the literature (73) and reflects the growth in the number of HA-related genes over time. Indeed, although only about a dozen genes were responsible for the disease in roughly half of our positive cases, the genetic etiology of the remaining half is spread across more than 40 genes. This assumption is corroborated by analogous results obtained in the ES-based studies analyzed. However, the particular clinical features of the selected patients (i.e., late onset and presence of a family history) also played a pivotal role in reaching a positive molecular diagnosis.
In our cohort, STUB1, a gene originally described in SCAR16 and recently also associated with SCA48 [64], was found to be the most frequent disease-causing gene in HA [61,63], as also confirmed by others [95,96]. We cannot, however, exclude that the relative novelty of this gene might be a source of bias explaining its high rate in our study. PRKCG, a gene known to cause a rare form of ataxia (SCA14), was also found to be common [53], as was SPG7, which had a similar high frequency [97,98]. PNPLA6, a gene originally associated with HSP [47,99], was quite common, as too was TMEM240 [67,100]. These findings suggest that it is worth testing sporadic HA patients for the rarer non-polyQ SCA forms.
Protein homeostasis and quality control, especially mitochondrial assembly and signaling, emerged as the biological pathways most frequently impaired in HA [101]; furthermore, a high prevalence of mutations in genes coding for ion channels or their subunits confirms recent findings indicating their high frequency in SCA [25,90]. Conversely, mutations in genes involved in signal transduction, cell development, DNA and RNA maintenance, metabolism of complex lipids, transport proteins, intracellular transport, and electron transport appear less common [4,101]. Other cellular processes such as those involving tRNA synthetases are seldom affected [8,29].
A sizeable proportion of the patients in our study harbored VOUS, and therefore had an uncertain molecular diagnosis. Like the patients with a defined genetic diagnosis, these cases harbored mostly (~75%) missense mutations, and this fact further highlights the need, in medical genetics, for robust functional tools for variant interpretation. In the case of VOUS, in silico predictions alone cannot suffice, and family studies often remain inconclusive. Therefore, efforts should be made to include functional analyses in the process of assessing and validating the putative pathogenic role of new variants. For instance, systematic use of simple in vivo models to predict the impact of new variants (e.g., complementation assays in yeast) could be a relatively fast and efficient method [21]. Our data also highlight the difficulties in providing certain genetic diagnoses, and therefore adequate counseling, to sporadic patients, especially in cases where it is not possible to investigate close family members, or there is limited access to clinical data. Future NGS studies in HA would certainly benefit from more appropriate sample/data collection, even more robust bioinformatic tools, and technical improvements facilitating phase attribution of variants (i.e., long-read sequencing [102]).
Notably, recent literature, including studies performed by our group, clearly suggests that the significance of variants should not be inferred from the clinical features of the index case and the known pattern of zygosity, because further heterogeneity and atypical phenotypes are emerging all the time, and the list of causative genes, both dominant and recessive, is growing rapidly [64]. Moreover, thanks to the availability of large, rapidly consultable genetic data repositories shared by multiple laboratories worldwide (such as the GENESIS 2.0 platform; https://www.tgp-foundation.org/, latest access 29 December 2020) [103], and the contribution of ataxia experts, it is also becoming possible to filter rare variants absent in public databases and avoid false interpretations [104].
Current computational methods predicting changes in protein stability seem to provide a reliable tool to confirm the deleterious effects of missense variants. On the other hand, these methods require a solved 3D protein structure, which is currently not available for several proteins and is often only partial for most. Furthermore, these predictions do not take account of specific protein-protein interactions that are crucial in polygenic inheritance and in the assembly of protein complexes, which probably explains, for instance, why the p.Val571Gly mutation in AFG3L2 (inherited together with p.Ala510Val in SPG7 in one patient) is not computationally predicted to have any destabilizing effect on protein structure. However, the outcomes achieved by combining in silico and 3D-modeling studies allowed us to speculate on the potential causative role of the other variants (i.e., those whose 3D protein structure was not available).
Interestingly, the SCAR genes considered to be frequent in the pre-NGS era (e.g., ATM, SETX, and APTX) were less common than expected in our cohort, or even absent (i.e., SACS). This could be related to the prevalently retrospective nature of our study, in which patients with peculiar phenotypes, such as those resembling ataxia-telangiectasia, ataxia with oculomotor apraxia, and autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS), underwent direct Sanger sequencing prior to inclusion [105].
It is worth mentioning the translational value of our study. For instance, we identified a child harboring a de novo truncating mutation in SLC2A1, encoding the major glucose transporter in the brain [106], a critical finding that allowed the child to be promptly put on a ketogenic diet, which led to a significant clinical improvement (personal communication to FMS). Furthermore, our findings were helpful in identifying novel neuroimaging biomarkers in SCA48 [107] that could facilitate future diagnosis.
The TRP analyses did not yield a definite diagnosis in two-thirds of our patients, as is commonly the case in all neurological disorders when using NGS applications [108]. Variants in noncoding parts of the genome (e.g., the intronic RFC1 expansion) may account for unsolved cases, but functional tests remain a challenge. Furthermore, as recently observed in ataxia linked to both ultra-rare (i.e., POLR3A) and relatively more common (i.e., SPG7) genes [52,109], the presence of deep intronic mutations (not usually sought in TRP studies) might also explain these difficulties. The use of molecular karyotype analysis in sporadic cases might unveil de novo quantitative alterations not detectable by routine NGS applications. Pathogenic mutations misclassified as benign are presumably a common bias generating false-negative results in every NGS study. Indeed, systematic re-examinations of NGS data in the face of novel clinical insights, as well as more efficient combination of bioinformatic tools with systems biology information, could increase the rate of positive diagnoses [88]. Non-genetic factors, such as epigenetic, post-transcriptional or environmental factors, might also play a role; the same applies to co-occurrence of variants in different genes that may exert a synergistic effect in the development of the disease. This latter phenomenon was found in our study (in the form of digenic mutations in SPG7 and AFG3L2), and it has also been observed by others [95,110]. Intriguingly, recent studies indicate that even more complex inheritance mechanisms, such as multilocus inheritance, are emerging in inherited neurological disorders classically regarded as Mendelian [111]. Against this background, it seems clear that the genetics of HA has new surprises in store for the future.
In summary, both our own experience and our literature analysis underline that a core set of a few dozen genes is the cause of most non-polyQ forms of HA, and highlight the existence of "more common", "relatively rare", and "ultra-rare" HA genes. Our experience suggests that TRPs are still a robust tool in clinical practice, and if combined with informative clinical data are worth adopting in large-scale genetic screenings.

Patient Recruitment
All samples were collected in centers belonging to ITASPAX (the Italian Spastic Paraplegia and Ataxia Network, coordinated by AF and FMS). The patient recruitment and biological sample collection stages of the study were performed during the six-year period 2015-2020. Individuals with acquired forms of ataxia were not included. A clinical diagnosis of genetically uncharacterized HA was the only inclusion criterion used, as the aim was to obtain a sample that reflected the routine clinical practice scenario. Therefore, patients were included regardless of their age at onset, their clinical features, and the presence/absence of a family history of the disease and/or relatives available for segregation analysis.

DNA Extraction and Preliminary Analyses of Repeated Nucleotide Expansions
Genomic DNA was obtained using the MagPurix Blood DNA Extraction Kit 200 designed for the MagPurix DNA Extract (Zinexts, Zhonghe, Taiwan). Before undergoing massive parallel sequencing, all patients were tested for pathological trinucleotide expansions in SCA1, 2, 3, 6, 7, 8, 12, 17 using a TP-PCR-based method [112,113], and for the intronic GAA expansion in FXN using an established long-PCR technique [114]. Capillary sequencing for TP-PCR products was performed using a 3130xl Genetic Analyzer (Thermo Fisher Scientific, Waltham, MA, USA), and fragment analyses were performed using GeneMapper ID Software version 3.1 (Thermo Fisher Scientific).

Massive Parallel Sequencing and Data Analysis
A custom targeted resequencing panel encompassing 285 genes known to cause HA or more complex syndromes in which ataxia is a symptom was designed using SureDesign (Agilent Technologies, Santa Clara, CA, USA) (full list of genes available in Table S9). Fifty bases upstream and downstream of every coding exon were also covered, and the designed probes were predicted to cover 99.5% of the whole region of interest. Library preparation was performed following the manufacturer's instructions. Massive parallel sequencing was carried out using a NextSeq500 (Illumina, San Diego, CA, USA) sequencer.
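The region of interest described above (each coding exon plus 50 flanking bases on either side) can be illustrated with a small interval calculation. This is a sketch under our own assumptions, not the SureDesign procedure: the function name is ours, coordinates are treated as half-open intervals on a single chromosome, and the exon coordinates are invented.

```python
def pad_and_merge(exons, flank=50):
    """Extend each (start, end) exon by flank bases and merge overlaps.

    Intervals are half-open and assumed to lie on the same chromosome.
    """
    padded = sorted((max(0, start - flank), end + flank) for start, end in exons)
    merged = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous target: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Invented coordinates for illustration only.
toy_exons = [(1000, 1200), (1250, 1400), (5000, 5100)]
targets = pad_and_merge(toy_exons)
print(targets)
print(sum(end - start for start, end in targets))
```

Merging matters when two exons are separated by a short intron: after padding, their flanks overlap and would otherwise be counted twice when estimating the size of the target region.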
Variant classification was based on the American College of Medical Genetics and Genomics published guidelines [115]. Splicing variants and synonymous variants close to splicing sites were also tested using Human Splicing Finder 3.1 (http://www.umd.be/HSF/, latest access 22 December 2020) and NNSPLICE 0.9 (http://www.fruitfly.org/seq_tools/splice.html/, latest access 22 December 2020). Thus, variants were further filtered using highly stringent criteria, to identify those with a CADD score > 20 and for which the majority (more than half) of other algorithms suggested a damaging effect.
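The stringent filter described above combines a CADD threshold with a majority vote among the other predictors. A minimal sketch of that rule follows; the function name and the placeholder tool names are hypothetical (the text does not list the algorithms here), and treating a missing set of predictions as a failure is our own choice.

```python
def passes_filter(cadd, predictions, cadd_cutoff=20.0):
    """Return True if a variant meets both filtering criteria.

    cadd is the variant's CADD score; predictions maps a tool name to True
    when that tool calls the variant damaging. The variant passes when
    cadd > cadd_cutoff and strictly more than half of the tools call it
    damaging.
    """
    if not predictions:
        return False  # assumption: no predictions means no support
    damaging = sum(1 for call in predictions.values() if call)
    return cadd > cadd_cutoff and damaging > len(predictions) / 2


# Placeholder tool names; not the algorithms used in the study.
print(passes_filter(25.1, {"tool_a": True, "tool_b": True, "tool_c": False}))
print(passes_filter(25.1, {"tool_a": True, "tool_b": False}))
```

Under this reading, a tie (exactly half of the tools) does not count as a majority, and a CADD score of exactly 20 does not pass.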
Filtered variants were also explored in the PREPARE-Ataxia network genomes (https://www.prepare-ataxia.com/, latest access 29 December 2020). This was done in the GENESIS 2.0 platform (https://www.tgp-foundation.org/, latest access 29 December 2020), an affordable genome-scale analysis and data management solution for medical research containing genomic data of over 12,000 individuals with rare neurological diseases and both affected and healthy relatives, to find a match with other affected individuals, or to exclude variants present in healthy individuals in the case of dominant genes, or genes characterized by high frequency. Use of the Human Gene Mutation Database was not deemed mandatory in order to classify variants as disease causing.

Sanger Sequencing
Regions containing selected variants of interest were amplified by PCR. PCR products were purified with ExoSAP-IT™ PCR Product Cleanup Reagent (Thermo Fisher Scientific) and sequenced using the BigDye™ Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific) in a 3500xL Genetic Analyzer (Thermo Fisher Scientific). Electropherograms were analyzed using SeqScape™ Software v3.0 (Thermo Fisher Scientific).

Multiplex Ligation-Dependent Probe Amplification (MLPA) Analysis
In order to find second mutations, MLPA (MRC-Holland, Amsterdam, The Netherlands) was performed to identify deletions/duplications in patients harboring a single mutation in a frequent recessive gene. We used Salsa kits P213 for SPG7, P441 for SACS, and P163 for WFS1. Capillary electrophoresis of MLPA products was performed using a 3130xl Genetic Analyzer (Thermo Fisher Scientific), while Coffalyser software (MRC-Holland) was employed to analyze MLPA results.

Computational Analysis of Protein Stability
To predict protein stability changes upon mutation, in terms of variation in folding free energy (∆∆G) between wild type and mutant structures, we employed five different computational methods: DynaMut [116], ENCoM [117], mCSM [118], SDM [119], and DUET [120]. This analysis was performed on those variants that were novel and whose 3D structure was partially or completely solved and available in Protein Data Bank (https://www.rcsb.org/, latest access 25 May 2021). DynaMut web server (http://biosig.unimelb.edu.au/dynamut/, latest access 25 May 2021) was used to perform all predictions, and to generate images of interatomic interactions.
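One simple way to summarize the five predictions for a variant is a majority vote on the sign of ∆∆G. The sketch below is only an illustration: it assumes the convention in which a negative ∆∆G (kcal/mol) denotes destabilization, which should be checked against each tool's documentation, and both the voting rule and the numeric values are our own and are not taken from Table S3.

```python
def consensus_call(ddg_by_tool):
    """Summarize per-tool ddG predictions (kcal/mol) by majority vote.

    Assumed sign convention: a negative value means the mutation is
    predicted to destabilize the protein.
    """
    destabilizing = sum(1 for value in ddg_by_tool.values() if value < 0)
    if destabilizing > len(ddg_by_tool) / 2:
        return "destabilizing"
    return "not destabilizing"


# Invented values for illustration only.
toy_prediction = {
    "DynaMut": -0.8,
    "ENCoM": 0.1,
    "mCSM": -1.2,
    "SDM": -0.4,
    "DUET": -0.9,
}
print(consensus_call(toy_prediction))
```

A vote on the sign alone ignores the magnitude of the predicted change, so small values close to zero would deserve a closer look in practice.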

Literature Review
To analyze the state-of-the-art in HA gene testing, we looked for relevant literature published since 2013, querying PubMed and Google Scholar with pertinent keywords (latest access 4 January 2021). To allow a reliable comparison between our study and those reported in the literature, we specifically selected cohort studies involving small, medium and large cohorts of patients with a clinical diagnosis of HA but no diagnosis on common genetic tests, regardless of other specific features.

Conclusions
The application of a TRP in 377 patients with a clinical diagnosis of HA yielded a molecular diagnosis in one out of three patients, nearly twice the diagnostic yield reported in similar published studies employing TRPs, and in line with the results of ES.
Our study prompts several considerations. First, it seems that genomic results should be considered dynamic, not static, data, and we strongly encourage their sharing through appropriate platforms, especially in the case of "ultra-rare" HA genes. Second, both our experience and data published by others confirm that NGS is a far-from-perfect tool. Even though we broadened the genetic and clinical spectrum of HA, two-thirds of our cases remained unsolved, confirming the results of previous studies. It is tempting to speculate that we might have reached a sort of "plateau" in our ability to find answers in the coding parts of the genome; this clearly makes the use of WGS a much-needed next step in medical genetics. Other boundaries include zygosity and formal clinical classifications, with the latter now being almost obsolete. Moreover, consideration should also be given to the possibility of identifying additional minor variants acting as potential modifiers, or even cases of complex multilocus inheritance. Similarly, better integration of genetic data with the results of other omics approaches may unveil non-genetic causes of HA. Overcoming all these challenges will probably move us towards an even higher diagnostic rate, by allowing us to solve "cold cases" in several neurological conditions, including HA.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/ijms22168490/s1/, Table S1: In silico predictions, population frequencies, and ACMG classification of pathogenic and likely pathogenic variants in positive cases, Table S2: Protein domain localization analysis of novel missense variants of pathogenic significance, Table S3: Computational analysis of protein stability changes, Table S4: Genetic features of patients with an uncertain molecular diagnosis, Table S5: Disease-causing genes in each study analyzed in the literature, Table S6: Types of pathogenic mutations identified in all the studies analyzed, Table S7: Frequency of all disease-causing genes described in the literature, Table S8: Efficiency of our TRP applied to literature data, Table S9: Full list of genes available in the TRP employed in this study, Figure S1: Conservation analysis through species of novel variants of pathogenic significance identified in this study, Figure S2: Interatomic interactions changes of missense variants computationally investigated. Wild-type and mutant residues are colored in light-green.

Institutional Review Board Statement:
The study was conducted in accordance with Italian National Health System guidelines and the Declaration of Helsinki. It was approved by the Tuscany Regional Pediatric Ethics Committee (approval code CEP148/2016; August 2016) and by the Ethics in Research Committee of IRCCS Fondazione Stella Maris (Pisa, Italy). Storage/handling of genetic and personal data complied with Italian National Health Institute (ISS) regulations on ethical and biomedical research and with relevant current legislation.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.