
What do the sequence conservation levels (complete, high, moderate, low) mean?

Sequence conservation in the PSSM Viewer is measured using the "information content" of each column in the PSSM. For each column that contains more than one residue type, the information content C is defined as the following sum over all residue types i in the column (excluding gap characters):

C = SUM over i { f(i) * log2[ f(i) / q(i) ] }

where f(i) = weighted frequency of residue i (see help: How are the frequency bars calculated?) and q(i) = the frequency of residue i described by the BLOSUM62 matrix.

Thus, if a column has the same residue frequencies as described in the BLOSUM62 matrix, C will be zero. If a column has higher residue frequencies than described by BLOSUM62, C will be positive. High values of C therefore imply significantly more conservation than implied by BLOSUM62.
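As a rough illustration of the sum above, here is a minimal Python sketch. It is not the PSSM Viewer's actual code. The function name `information_content`, the dictionary layout, and the four-letter toy alphabet with a uniform background are all assumptions made for the example. Real use would need the 20 amino acid frequencies associated with BLOSUM62, which are not reproduced here. The sketch also assumes q(i) is the background frequency of residue i.

```python
import math

def information_content(freqs, background):
    """Sketch of C = SUM over i of f(i) * log2(f(i) / q(i)).

    freqs:      dict mapping residue -> weighted frequency f(i) in one column
    background: dict mapping residue -> background frequency q(i)
                (assumed here; the viewer is described as using BLOSUM62)
    Gap characters ('-') and residues with zero frequency are skipped.
    """
    total = 0.0
    for residue, f in freqs.items():
        if residue == "-" or f <= 0.0:
            continue  # gaps are excluded; absent residues contribute nothing
        total += f * math.log2(f / background[residue])
    return total

# Hypothetical toy background: four residue types, all equally likely.
toy_background = {"A": 0.25, "C": 0.25, "D": 0.25, "E": 0.25}

# A column that matches the background exactly carries no information.
same_as_background = dict(toy_background)

# A column restricted to two residue types is more conserved than background.
two_residue_column = {"A": 0.5, "C": 0.5}

c_same = information_content(same_as_background, toy_background)
c_two = information_content(two_residue_column, toy_background)
print(c_same, c_two)
```

With the toy background, the first column reproduces the "C will be zero" case, and the second column yields a positive C because its residues are more concentrated than the background.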
The sequence conservation values are defined as follows:
