The aim of this section is to assign each nucleotide (or base pair) to
a class. Each class is expected to have a different pattern of evolution.
The section consists of a sequence of integers, one per nucleotide, giving
the class of each nucleotide.
For instance, the class section of a protein coding gene may look like:
...2 3 1 2 3 1 2 3 1 2 3 1 2 ...
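The example above reads naturally as one class per codon position. As a minimal sketch under that assumption, the following Python snippet generates such a sequence. The function name codon_position_classes and its first_position parameter are hypothetical helpers for illustration and are not part of PHASE.

```python
def codon_position_classes(n_sites, first_position=1):
    """Return one class (1, 2 or 3) per site, cycling through codon positions.

    Assumption: the class of a site is simply its codon position, and
    first_position is the codon position of the first site listed.
    """
    if first_position not in (1, 2, 3):
        raise ValueError("first_position must be 1, 2 or 3")
    start = first_position - 1
    return [(start + i) % 3 + 1 for i in range(n_sites)]


# Reproduce the fragment shown above, which starts at a second codon position.
classes = codon_position_classes(13, first_position=2)
print(" ".join(str(c) for c in classes))  # 2 3 1 2 3 1 2 3 1 2 3 1 2
```

The printed line can be pasted into the class section of a data file, provided the number of values matches the length of the alignment.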
When the data file contains a class section, programs in the PHASE
package expect it to comply with the following set of rules:
Since PHASE is specifically designed for the analysis of RNA sequences with secondary structure, the most common use of the class section is to separate unpaired and base-paired sites into two distinct classes. To spare you the tedious task of writing this section by hand, the code MIXED can replace the code RNA; it tells PHASE to build the class section directly from the provided pairing mask (e.g., (((.())))..) implies 2 2 2 1 2 2 2 2 2 1 1 2). When the code MIXED is used, the class section is not compulsory: unpaired sites are automatically assigned to class 1 and paired sites to class 2.
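The automatic assignment can be pictured as a character-by-character mapping of the pairing mask. The Python sketch below illustrates that mapping; it is not PHASE's own code. The function name classes_from_mask is hypothetical, and the sketch assumes the mask uses only '.', '(' and ')' and does not check that the brackets are balanced.

```python
def classes_from_mask(mask):
    """Map a pairing mask to classes: 1 for unpaired sites, 2 for paired sites.

    Assumption: '.' marks an unpaired site and '(' or ')' a paired site;
    any other symbol is rejected rather than guessed at.
    """
    classes = []
    for symbol in mask:
        if symbol == ".":
            classes.append(1)
        elif symbol in "()":
            classes.append(2)
        else:
            raise ValueError(f"unexpected mask symbol: {symbol!r}")
    return classes


# The mask quoted in the text above.
mask = "(((.())))..)"
print(" ".join(str(c) for c in classes_from_mask(mask)))  # 2 2 2 1 2 2 2 2 2 1 1 2
```

Writing the class section explicitly in this way should only be necessary when the code RNA is kept; with MIXED, PHASE derives the same two classes itself.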
Classes are usually used to determine which model of sequence evolution PHASE applies to each nucleotide: during the phylogenetic inference, each class in the data file is handled by its own substitution model. The models are defined later, in the model section of the control file. Note that if you use the MIXED type with the automatic assignment, i.e., without a class section, you must make sure that the first and second models you declare are respectively a nucleotide substitution model and a base-pair substitution model. We will return to this point later on.