(a) Correlation between original and corrected *Z* scores (*Z* and *Z*_{D}, respectively). Statistically significant covarying amino acid pairs (black diamonds) were distinguished from nonsignificant results (gray diamonds) using the corrected *Z*_{D} score, FDR, and a probability threshold of *P* < 0.001. The correlation between *Z* and *Z*_{D} was low (*r*^{2} = 0.63). (b) Simulations of sequences constrained by the distance scaling factor *D*. Six sets of 1,000 simulated sequences were analyzed for covarying amino acids, with *D* constrained within each set to one of the ranges 0.1 to 0.3, 0.2 to 0.4, 0.3 to 0.5, 0.4 to 0.6, 0.5 to 0.7, and 0.6 to 0.8. The *Z*_{D} and *Z* scores were plotted against each other for each set, and a binomial line of best fit was calculated. The data generated using subtype B HIV-1 integrase sequences are shown as both a scatter plot and a line of best fit. (c) Relationship between the *Z*_{D} score and *P* value for simulated data sets. FDR-corrected *P* values were estimated from the *Z*_{D} scores of each of the six simulated data sets. The minimum *Z*_{D} score required to produce a *P* value below each of a range of *P* value thresholds was plotted (the average *D* of the six data sets is shown as a solid line; *D* = 0.1 to 0.3, dashed line; *D* = 0.6 to 0.8, dotted line).
