For permanently submitted haplotype datasets, MARKER uses an ENTROPY routine to choose an economical
set of markers that captures the full information content of the dataset. The algorithm
(due to Richard Mott) starts by picking the most informative marker, which is the one whose allele
frequency is closest to 0.5. It then repeatedly adds markers from the set, each time choosing the
marker that most increases the logical entropy of the system. The process stops once enough markers
have been chosen to distinguish all the haplotypes.
[More details soon...]
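The greedy procedure described above can be sketched roughly as follows. This is an illustrative sketch only, not MARKER's actual code: it assumes Shannon entropy (in bits) over the haplotype classes that the chosen markers distinguish, treats every row of the input as one equally weighted haplotype, and uses hypothetical names (partition_entropy, greedy_select). The real routine may define its entropy and handle ties differently.

```python
from collections import Counter
from math import log2

def partition_entropy(haplotypes, markers):
    """Shannon entropy (bits) of the classes that `markers` split the haplotypes into.
    Assumption: the page does not define its entropy; Shannon entropy is used here."""
    counts = Counter(tuple(h[m] for m in markers) for h in haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())

def greedy_select(haplotypes):
    """Return [(marker_index, entropy_gain), ...] in the order markers are chosen."""
    n_markers = len(haplotypes[0])
    # Entropy reached when every marker is used; the selection must match it.
    target = partition_entropy(haplotypes, range(n_markers))
    chosen, steps, current = [], [], 0.0
    while current < target - 1e-12:
        # Pick the unused marker that gives the largest entropy when added.
        # On the first pass this is the marker with allele frequency nearest 0.5.
        best = max(
            (m for m in range(n_markers) if m not in chosen),
            key=lambda m: partition_entropy(haplotypes, chosen + [m]),
        )
        new = partition_entropy(haplotypes, chosen + [best])
        steps.append((best, new - current))
        chosen.append(best)
        current = new
    return steps

# Hypothetical toy data: four haplotypes typed at three biallelic markers.
haps = ["100", "001", "011", "010"]
for marker, gain in greedy_select(haps):
    print(f"marker {marker}: +{gain:.3f} bits")
```

In this toy dataset marker 0 has allele frequency 0.25 and is never needed: markers 1 and 2 together already distinguish all four haplotypes.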
The page below shows the attributed entropy contributions for a test set of data, with the selected SNPs
highlighted. To get a feel for how the entropy works, you can select or deselect some of the markers
and then press "Recompute entropy" to see how well the new selection describes the set.
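As a rough illustration of what recomputing the entropy for a hand-picked selection might involve, the sketch below compares the entropy of a chosen subset of markers with that of the full marker set. It makes the same assumptions as any sketch of this routine must (Shannon entropy in bits, equally weighted haplotype rows), and the function names are hypothetical rather than taken from MARKER.

```python
from collections import Counter
from math import log2

def selection_entropy(haplotypes, selected):
    """Shannon entropy (bits) of the haplotype classes seen through `selected` markers."""
    counts = Counter(tuple(h[m] for m in selected) for h in haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())

def describe_selection(haplotypes, selected):
    """Return (entropy of selection, entropy of all markers, distinct classes seen)."""
    full = selection_entropy(haplotypes, range(len(haplotypes[0])))
    got = selection_entropy(haplotypes, selected)
    distinct = len({tuple(h[m] for m in selected) for h in haplotypes})
    return got, full, distinct

# Hypothetical toy data: four haplotypes typed at three biallelic markers.
haps = ["100", "001", "011", "010"]
for selection in ([1], [1, 2]):
    got, full, distinct = describe_selection(haps, selection)
    print(f"markers {selection}: {got:.3f} of {full:.3f} bits, {distinct} classes")
```

A selection describes the set completely when its entropy equals that of the full marker set, which is the same as saying it separates every distinct haplotype.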
Note that the Mott scores shown apply only to the original processing of the dataset:
each score is the entropy increase at the step where that marker was added, so the scores may
depend strongly on the order in which the default selection was built up.
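A small example may make the order dependence concrete. The sketch below assumes the per-step score is the gain in Shannon entropy (an assumption, as the page does not give the formula) and uses hypothetical names. With two markers that carry identical information, whichever is added first receives the whole gain and the other receives none.

```python
from collections import Counter
from math import log2

def entropy(haplotypes, markers):
    """Shannon entropy (bits) of the haplotype classes distinguished by `markers`."""
    counts = Counter(tuple(h[m] for m in markers) for h in haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())

def stepwise_gains(haplotypes, order):
    """Entropy gain credited to each marker when markers are added in `order`."""
    gains, chosen, current = {}, [], 0.0
    for m in order:
        chosen.append(m)
        new = entropy(haplotypes, chosen)
        gains[m] = new - current
        current = new
    return gains

# Hypothetical data: two haplotypes, two perfectly correlated markers.
haps = ["00", "11"]
print(stepwise_gains(haps, [0, 1]))  # marker 0 is credited with the full gain
print(stepwise_gains(haps, [1, 0]))  # marker 1 is credited with the full gain
```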