Different stages in the A*-search algorithm. (**a**) The top-most triangle represents the Scooby-Domain domain-probability matrix for a protein sequence. Searching for protein domains in a query sequence is like travelling through a maze whose centre is the best domain prediction. In this figure, each triangle is a different path through the maze, and each level below the first triangle represents one more predicted domain region. Each ‘hotspot’ in the triangular matrix locates the exact region of the sequence with the highest probability of forming a globular domain. The three highest-scoring hotspots in the first matrix are highlighted with dots in the figure and have scores of 7, 3 and 6, respectively. They give rise to three new paths, each being the matrix recalculated for the sequence that remains after the first predicted domain region is removed from the original sequence. (**b**) Each triangle also represents a node in the search tree, and each node can branch into different paths that may lead to the solution. The highest-scoring triangle (7) is searched for new hotspots, which have scores of 5, 2 and 2. (**c**) Regardless of level, the node with the next highest score is searched next, and this continues until no further domain regions can be predicted. In this example, it is the node with a score of 6. This allows the algorithm to consider different paths through the ‘maze’ in parallel, covering a larger area and preventing the search from being confined to a ‘dead end’ path. (**d**) The next node to search, following the highest-scoring predictions, has a score of 5.
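
The expansion order in panels (a) to (d) can be illustrated with a small priority-queue search. This is only a minimal sketch, not the Scooby-Domain implementation: the node labels, the `CHILDREN` table and the `best_first_order` function are hypothetical, and it assumes that a node's priority is simply its own hotspot score, since the caption does not say how scores are combined along a path. The scores are the ones given in the figure.

```python
import heapq
from itertools import count

# Hypothetical toy tree using the scores from the figure. Each key is a node
# (a triangular matrix); its value lists the hotspots found in that matrix
# as (label, score) pairs. Labels are invented for illustration.
CHILDREN = {
    "root": [("a", 7), ("b", 3), ("c", 6)],      # panel (a)
    "a": [("a1", 5), ("a2", 2), ("a3", 2)],      # panel (b)
}

def best_first_order(children, root="root", max_expansions=3):
    """Return (label, score) pairs in the order the nodes are searched."""
    tie = count()  # insertion counter, breaks ties between equal scores
    frontier = []
    for label, score in children.get(root, []):
        # heapq is a min-heap, so scores are negated to pop the highest first.
        heapq.heappush(frontier, (-score, next(tie), label))

    order = []
    while frontier and len(order) < max_expansions:
        neg_score, _, label = heapq.heappop(frontier)
        order.append((label, -neg_score))
        # New hotspots join the same frontier, whatever their tree level.
        for child, score in children.get(label, []):
            heapq.heappush(frontier, (-score, next(tie), child))
    return order

print(best_first_order(CHILDREN))
# [('a', 7), ('c', 6), ('a1', 5)]
```

Because all unexplored nodes share one frontier, the search moves from the node scoring 7 to the sibling scoring 6 before descending to the node scoring 5, which matches the order shown in panels (b), (c) and (d).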
