An analysis of variance type test for comparing clusters of DNA sequences based on randomization test methodologies
A method for comparing groupings of DNA sequences is presented. It uses randomization test methods to assign significance levels to a test statistic defined in terms of the Hamming distance between two sequences. The method, which is intuitively motivated by the analysis of variance procedure, separates the variation caused by differences between clusters from the variation attributable to differences at random base pair locations within clusters. Implementation issues are discussed, and an example of the application of the method is provided.
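The general idea can be illustrated with a minimal Python sketch. The abstract does not give the exact form of the test statistic, so the statistic below (mean between-cluster Hamming distance minus mean within-cluster Hamming distance) is an assumption chosen only for illustration, as are the function names `hamming`, `between_minus_within`, and `randomization_test`. The sketch assumes aligned sequences of equal length, at least two clusters, and at least one cluster with two or more members.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def between_minus_within(seqs, labels):
    """Hypothetical statistic (not taken from the paper): mean pairwise
    Hamming distance between clusters minus the mean within clusters."""
    within, between = [], []
    for i, j in combinations(range(len(seqs)), 2):
        d = hamming(seqs[i], seqs[j])
        (within if labels[i] == labels[j] else between).append(d)
    return sum(between) / len(between) - sum(within) / len(within)


def randomization_test(seqs, labels, n_perm=999, seed=0):
    """Estimate a significance level by randomly reassigning cluster labels."""
    rng = random.Random(seed)
    observed = between_minus_within(seqs, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # cluster sizes are preserved
        if between_minus_within(seqs, shuffled) >= observed:
            hits += 1
    # Count the observed labelling itself as one of the arrangements.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    seqs = ["AAAA", "AAAT", "CCCC", "CCCG"]
    labels = [0, 0, 1, 1]
    stat, p_value = randomization_test(seqs, labels)
    print(stat, p_value)
```

Shuffling the labels keeps the cluster sizes fixed while breaking any association between cluster membership and sequence content, which is what allows the observed statistic to be compared against its randomization distribution. With only four sequences the attainable significance level is coarse; the toy data are meant to show the mechanics, not a realistic analysis.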