Howard Hughes Medical Institute & Department of Genetics, Washington University School of Medicine, Saint Louis, Missouri 63110 USA. eddy@genetics.wustl.edu
BACKGROUND: Covariance models (CMs) are probabilistic models of RNA secondary structure, analogous to profile hidden Markov models of linear sequence. The dynamic programming algorithm for aligning a CM to an RNA sequence of length N is O(N3) in memory. This is only practical for small RNAs. RESULTS: I describe a divide and conquer variant of the alignment algorithm that is analogous to memory-efficient Myers/Miller dynamic programming algorithms for linear sequence alignment. The new algorithm has an O(N2 log N) memory complexity, at the expense of a small constant factor in time. CONCLUSIONS: Optimal ribosomal RNA structural alignments that previously required up to 150 GB of memory now require less than 270 MB.