There are two approaches to testing the performance of analytical methods: empirical performance testing and simulation-based performance testing. In an empirical performance test, a method is used to analyze data collected in nature. Although a great deal can be learned from an empirical test, this approach is limited because we never know the true structure of a group of natural populations with certainty. The TOSSM project avoids this problem by using simulation-based tests, in which we simulate groups of populations that display known spatial structures. This way, we know what the true structure of the populations is and can check whether genetic methods recover it. For the TOSSM project, we have generated simulated datasets representing five different population structure 'archetypes', ranging from a single population with no spatial structure to multiple populations with varying degrees of dispersal between them. Within each archetype, there are various 'scenarios', which are alternative population parameterizations. All of the TOSSM datasets are available from the TOSSM download center.
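
The logic of a simulation-based performance test can be illustrated with a toy sketch. The code below is not the TOSSM pipeline. It assumes a much simpler model in which two sampled groups carry biallelic loci, and the "method" under test is a basic permutation test on allele frequency differences. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_samples(n_loci, n_copies, divergence, rng):
    """Draw gene copies for two sampled groups whose true structure is known.

    divergence = 0.0 means both groups come from one panmictic population;
    larger values shift allele frequencies apart (keep it <= 0.4 so that
    frequencies stay inside [0, 1]).
    """
    samples = []
    for _ in range(n_loci):
        base = rng.uniform(0.2, 0.8)
        p1 = base + divergence / 2
        p2 = base - divergence / 2
        group1 = [1 if rng.random() < p1 else 0 for _ in range(n_copies)]
        group2 = [1 if rng.random() < p2 else 0 for _ in range(n_copies)]
        samples.append((group1, group2))
    return samples


def mean_frequency_gap(samples):
    """Average absolute difference in allele frequency across loci."""
    gaps = [abs(sum(g1) / len(g1) - sum(g2) / len(g2)) for g1, g2 in samples]
    return sum(gaps) / len(gaps)


def detects_structure(samples, rng, n_perm=100, alpha=0.05):
    """Toy 'method under test': permutation test on the frequency gap."""
    observed = mean_frequency_gap(samples)
    as_extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for g1, g2 in samples:
            pooled = g1 + g2
            rng.shuffle(pooled)  # break any association with group labels
            shuffled.append((pooled[:len(g1)], pooled[len(g1):]))
        if mean_frequency_gap(shuffled) >= observed:
            as_extreme += 1
    p_value = (as_extreme + 1) / (n_perm + 1)
    return p_value < alpha


def detection_rate(divergence, n_reps=20, seed=1):
    """Fraction of simulated replicates in which structure is reported."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        samples = simulate_samples(5, 50, divergence, rng)
        hits += detects_structure(samples, rng)
    return hits / n_reps


# Because the truth is known, each rate can be scored directly:
# the first is the method's power, the second its false positive rate.
rate_structured = detection_rate(divergence=0.3)
rate_panmictic = detection_rate(divergence=0.0)
print("detected structure when present:", rate_structured)
print("detected structure when absent: ", rate_panmictic)
```

The point of the sketch is the comparison in the last lines: since the simulated structure is known, a method's answers can be scored as right or wrong, which is not possible with data collected in nature.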


The datasets were generated using the R package Rmetasim (Strand, 2002). The simulations were initialized with haplotype and allele frequencies generated by the coalescent program SimCoal (Laval and Excoffier, 2004). Each individual in each dataset possesses a 500 bp mitochondrial sequence and 30 unlinked microsatellite loci. Mutation parameters for the loci were tuned to produce genetic diversity statistics comparable to those found in eastern Pacific gray whales. The demographic parameters of the model were likewise tuned to match those of eastern Pacific gray whales. Further details on the generation and parameterization of the datasets are available in the TOSSM dataset handbook.

