Models that allow mutation rates to vary continuously across sites are more realistic, and the gamma model of Yang (1994) outperforms the invariant model. The discrete gamma model is implemented in PHASE. The continuous rate distribution is approximated by a discrete distribution, which is computationally tractable, and sites are divided into equally probable rate categories. A single parameter, $\alpha$, governs the shape of this distribution and hence the substitution rates of all categories. The mean of the gamma distribution is the average mutation rate of our substitution model, as stated earlier, and (for a mean of 1) its variance is $1/\alpha$. A small $\alpha$ indicates that rates differ strongly between sites, with a few sites having high rates and the others being practically invariant; conversely, a large $\alpha$ models weak rate heterogeneity (see figure 2.6). When $\alpha \to \infty$, the gamma model reduces to the single-rate model. The computational cost of the discrete gamma model grows roughly linearly with the number of categories: applying a discrete gamma model with $k$ categories is about $k$ times slower than using a model in which rate heterogeneity is not considered.
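The discretisation described above can be illustrated with a minimal sketch. It assumes a gamma distribution with mean 1 (shape and rate both equal to $\alpha$) and represents each of the $k$ equally probable categories by its mean rate, which is one of the two options discussed by Yang (1994); the text does not say whether PHASE uses the category mean or median, so this is not a description of PHASE's implementation. The function names `reg_lower_gamma`, `gamma_quantile` and `discrete_gamma_rates` are hypothetical, and only the Python standard library is used.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    # Series: sum over n of x^n / (a (a+1) ... (a+n)); all terms are positive.
    while term > 1e-16 * total and n < 100000:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gamma_quantile(a, p):
    """Quantile of a standard gamma (shape a, rate 1), found by bisection."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(a, hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k):
    """Mean rate of each of k equally probable categories (overall mean 1)."""
    # Category boundaries, expressed as y = alpha * rate so that y follows
    # a standard gamma distribution with shape alpha.
    cuts = [0.0] + [gamma_quantile(alpha, i / k) for i in range(1, k)]
    # For a mean-1 gamma, the integral of r * f(r) from 0 up to a boundary
    # equals P(alpha + 1, y); at the upper end (infinity) it equals 1.
    cum = [reg_lower_gamma(alpha + 1.0, y) for y in cuts] + [1.0]
    # Each category has probability 1/k, so its mean is k times its share.
    return [k * (cum[i + 1] - cum[i]) for i in range(k)]


if __name__ == "__main__":
    for alpha in (0.5, 2.0, 200.0):
        rates = discrete_gamma_rates(alpha, 4)
        print(alpha, [round(r, 4) for r in rates])
```

With $\alpha = 0.5$ and four categories the rates come out at roughly 0.03, 0.25, 0.82 and 2.89, so most of the variation is carried by a single fast category, whereas for a large value such as $\alpha = 200$ all four rates lie close to 1, consistent with the limiting single-rate behaviour. The likelihood at each site is then averaged over the $k$ category rates, which is where the roughly $k$-fold increase in running time comes from.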