ClockTree class documentation
ClockTree is a class that implements the core algorithms for maximum likelihood time tree inference. It operates on a tree with fixed topology. All operations that reroot the tree or change its topology are part of the TreeTime class.
ClockTree docstring and constructor
- class treetime.ClockTree(*args, dates=None, debug=False, real_dates=True, precision_fft='auto', precision='auto', precision_branch='auto', branch_length_mode='joint', use_covariation=False, use_fft=True, **kwargs)[source]
ClockTree is the main class to perform the optimization of the node positions given the temporal constraints of (some) leaves.
The optimization workflow includes the inference of the ancestral sequences and branch length optimization using TreeAnc. After the optimization is done, the nodes with date-time information are arranged along the time axis, and the conversion between the branch length units and the date-time units is determined. Then, for each internal node, we compute the probability distribution of the node’s location conditional on the fixed location of the leaves that have temporal information. Finally, the most probable location of each internal node is converted to its most likely time.
- __init__(*args, dates=None, debug=False, real_dates=True, precision_fft='auto', precision='auto', precision_branch='auto', branch_length_mode='joint', use_covariation=False, use_fft=True, **kwargs)[source]
ClockTree constructor
- Parameters:
dates (dict) – {leaf_name:leaf_date} dictionary
debug (bool) – If True, debug mode is on: obsolete parameters are cleaned up less thoroughly or not at all, so that program execution can be inspected in intermediate states. In debug mode, the Python debugger is also allowed to interrupt program execution with an interactive shell if an error occurs.
real_dates (bool) – If True, some additional checks for the input dates sanity will be performed.
precision (int) – Precision can be 0 (rough), 1 (default), 2 (fine), or 3 (ultra fine). This parameter determines the number of grid points that are used for the evaluation of the branch length interpolation objects. When not specified, this will default to 1 for short sequences and 2 for long sequences with L>1e4.
precision_fft (int) – When calculating convolutions using the FFT approach, a regular discrete grid needs to be chosen. To optimize the calculation, the grid size is not set to a fixed number but is determined by the FWHM of the distributions. The number of points desired to span the FWHM of a distribution can be specified explicitly by precision_fft (default is 200).
branch_length_mode (str) – determines whether branch lengths are calculated using the ‘joint’ ML, the ‘marginal’ ML, or taken from the input tree (‘input’).
use_covariation (bool) – determines whether root-to-tip regression accounts for covariance introduced by shared ancestry.
use_fft (bool) – Use FFT for the calculation of convolution integrals if True (default). The alternative is kept to be able to reproduce previous behavior.
**kwargs – Keyword arguments passed on to the parent class (TreeAnc)
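The two precision parameters above can be read as simple rules. The sketch below illustrates them in plain Python under stated assumptions: `resolve_precision`, `fwhm`, and `fft_grid_step` are hypothetical helper names, not part of the TreeTime API, and dividing the FWHM by precision_fft is only one plausible reading of "number of points spanning the FWHM".

```python
import math

def resolve_precision(precision, seq_len):
    """Hypothetical helper: pick 1 for short and 2 for long (L > 1e4) sequences."""
    if precision == 'auto':
        return 2 if seq_len > 1e4 else 1
    return precision

def fwhm(xs, ys):
    """Full width at half maximum of a sampled, single-peaked curve."""
    half = max(ys) / 2.0
    above = [x for x, y in zip(xs, ys) if y >= half]
    return above[-1] - above[0]

def fft_grid_step(xs, ys, precision_fft=200):
    """Assumed rule: grid spacing such that precision_fft points span the FWHM."""
    return fwhm(xs, ys) / precision_fft

# A Gaussian with sigma = 1 sampled on a fine grid from -5 to 5.
xs = [i * 0.001 - 5.0 for i in range(10001)]
ys = [math.exp(-0.5 * x * x) for x in xs]
width = fwhm(xs, ys)          # close to 2 * sqrt(2 * ln 2) for sigma = 1
step = fft_grid_step(xs, ys)  # spacing of the regular FFT grid in this sketch
```

Tying the grid spacing to the width of the distribution keeps narrow distributions well resolved without wasting points on broad ones.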
Running TreeTime analysis
- ClockTree.init_date_constraints(clock_rate=None, **kwarks)[source]
Get the conversion coefficients between the dates and the branch lengths as they are used in ML computations. The conversion formula is assumed to be ‘length = k*numdate + b’. For convenience, these coefficients as well as regression parameters are stored in the ‘dates2dist’ object.
Note
Dates must be set on all nodes of the tree before calling this function.
- Parameters:
clock_rate (float) – If specified, time tree optimization will be done assuming this fixed clock rate.
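The linear conversion ‘length = k*numdate + b’ can be illustrated with an ordinary least-squares fit of root-to-tip distance against sampling date. This is a minimal sketch, not TreeTime's regression: `fit_clock` and `to_numdate` are hypothetical names, the data are made up for illustration, and covariance from shared ancestry is ignored.

```python
def fit_clock(numdates, lengths, clock_rate=None):
    """Fit length = k * numdate + b; if clock_rate is given, k is held fixed."""
    n = len(numdates)
    mean_t = sum(numdates) / n
    mean_d = sum(lengths) / n
    if clock_rate is None:
        cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(numdates, lengths))
        var = sum((t - mean_t) ** 2 for t in numdates)
        k = cov / var
    else:
        k = clock_rate
    b = mean_d - k * mean_t  # least-squares intercept for the chosen slope
    return k, b

def to_numdate(length, k, b):
    """Invert the conversion: numeric date at a given distance from the root."""
    return (length - b) / k

# Illustrative tips: sampling dates and root-to-tip distances.
tip_dates = [2000.0, 2005.0, 2010.0, 2015.0]
root_to_tip = [0.010, 0.015, 0.020, 0.025]

k, b = fit_clock(tip_dates, root_to_tip)
root_date = to_numdate(0.0, k, b)  # date at which the distance from the root is zero
```

With a fixed `clock_rate`, only the intercept is estimated, which mirrors the idea of running the optimization under a prescribed rate.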
- ClockTree.make_time_tree(time_marginal=False, clock_rate=None, **kwargs)[source]
Use the date constraints to calculate the most likely positions of unconstrained nodes.
- Parameters:
time_marginal (bool) – If True, use marginal reconstruction for node positions
**kwargs – Keyword arguments used to initialize the date constraints
Post-processing
- ClockTree.branch_length_to_years()[source]
This function sets branch lengths to reflect the date differences between parent and child nodes, measured in years. It should only be called after treetime.ClockTree.convert_dates() has been called.
- Returns:
All manipulations are done in place on the tree
- Return type:
None
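The idea of measuring each branch as the date difference between child and parent can be sketched on a toy tree. The `Node` class below is a hypothetical stand-in for a tree node with a `numdate` attribute; it is not the Bio.Phylo clade used by TreeTime, and the root's branch length is simply left untouched here.

```python
class Node:
    """Hypothetical minimal tree node carrying a numeric date."""
    def __init__(self, name, numdate, children=()):
        self.name = name
        self.numdate = numdate
        self.children = list(children)
        self.branch_length = None

def branch_length_to_years(root):
    """Set each non-root branch length to child date minus parent date, in place."""
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in parent.children:
            child.branch_length = child.numdate - parent.numdate
            stack.append(child)

leaf_a = Node("A", 2010.5)
leaf_b = Node("B", 2012.0)
internal = Node("internal", 2005.0, [leaf_b])
root = Node("root", 2000.0, [leaf_a, internal])

branch_length_to_years(root)
```

Because the function only reads `numdate`, the dates must already be assigned to every node, which is why the conversion step has to run first.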
- ClockTree.convert_dates()[source]
This function converts the estimated “time_before_present” properties of all nodes to numerical dates stored in the “numdate” attribute. Each date is further converted into a human-readable date string in the format %Y-%m-%d, assuming the usual calendar.
- Returns:
All manipulations are done in place on the tree
- Return type:
None
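Converting a numerical date (a fractional year) into a %Y-%m-%d string can be sketched with the standard library. `numdate_to_string` is a hypothetical helper, not the TreeTime routine; it assumes the fractional part is a share of that calendar year's length and only handles years supported by `datetime` (year 1 and later).

```python
import datetime

def numdate_to_string(numdate):
    """Convert a fractional year such as 2020.5 to a %Y-%m-%d string."""
    year = int(numdate // 1)
    start = datetime.date(year, 1, 1)
    # Length of this particular year, so leap years are handled.
    days_in_year = (datetime.date(year + 1, 1, 1) - start).days
    offset = int((numdate - year) * days_in_year)
    return (start + datetime.timedelta(days=offset)).strftime("%Y-%m-%d")

new_year = numdate_to_string(2020.0)
mid_year = numdate_to_string(2020.5)
```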
- ClockTree.get_confidence_interval(node, interval=(0.05, 0.95))[source]
If temporal reconstruction was done using the marginal ML mode, the entire distribution of times is available. This function determines the 90% (or other) confidence interval, defined as the range with 5% of the probability mass below it and 5% above it. Note that this interval does not necessarily contain the highest-probability position. In the absence of marginal reconstruction, it will return the uncertainty based on rate variation. If both are present, the wider interval will be returned.
- Parameters:
node (PhyloTree.Clade) – The node for which the confidence interval is to be calculated
interval (tuple, list) – Array of length two, or tuple, defining the bounds of the confidence interval
- Returns:
confidence_interval – Array with two numerical dates delineating the confidence interval
- Return type:
numpy array
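An equal-tailed interval of this kind can be read off the cumulative distribution of a density sampled on a grid. The following is a minimal sketch under stated assumptions: `cumulative`, `quantile`, and `confidence_interval` are hypothetical names, the density is given as plain lists, and the rate-variation fallback described above is not modeled.

```python
def cumulative(xs, ps):
    """Normalized cumulative distribution via the trapezoidal rule."""
    cdf = [0.0]
    for i in range(1, len(xs)):
        cdf.append(cdf[-1] + 0.5 * (ps[i] + ps[i - 1]) * (xs[i] - xs[i - 1]))
    total = cdf[-1]
    return [c / total for c in cdf]

def quantile(xs, cdf, q):
    """Linearly interpolate the position where the cumulative mass reaches q."""
    for i in range(1, len(xs)):
        if cdf[i] >= q:
            span = cdf[i] - cdf[i - 1]
            frac = 0.0 if span == 0 else (q - cdf[i - 1]) / span
            return xs[i - 1] + frac * (xs[i] - xs[i - 1])
    return xs[-1]

def confidence_interval(xs, ps, interval=(0.05, 0.95)):
    """Return the positions of the lower and upper quantiles."""
    cdf = cumulative(xs, ps)
    return [quantile(xs, cdf, q) for q in interval]

# Flat density between 2000 and 2010: the 90% interval trims half a year per side.
grid = [2000.0 + 0.1 * i for i in range(101)]
density = [1.0] * len(grid)
lower, upper = confidence_interval(grid, density)
```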
- ClockTree.get_max_posterior_region(node, fraction=0.9)[source]
If temporal reconstruction was done using the marginal ML mode, the entire distribution of times is available. This function determines the interval around the highest posterior probability region that contains the specified fraction of the probability mass. In the absence of marginal reconstruction, it will return the uncertainty based on rate variation. If both are present, the wider interval will be returned.
- Parameters:
node (PhyloTree.Clade) – The node for which the posterior region is to be calculated
fraction (float) – Float specifying how much of the posterior probability is to be contained in the region
- Returns:
max_posterior_region – Array with two numerical dates delineating the high posterior region
- Return type:
numpy array
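A highest-posterior region can be sketched by collecting grid points in order of decreasing density until the requested fraction of the mass is covered. This illustration assumes an evenly spaced grid and a single-peaked density; `max_posterior_region` here is a hypothetical stand-alone function, not the ClockTree method, and the rate-variation fallback is not modeled.

```python
import math

def max_posterior_region(xs, ps, fraction=0.9):
    """Bounds of the highest-density region holding `fraction` of the mass."""
    total = sum(ps)  # evenly spaced grid, so the spacing cancels out
    order = sorted(range(len(xs)), key=lambda i: ps[i], reverse=True)
    mass = 0.0
    kept = []
    for i in order:
        kept.append(i)
        mass += ps[i] / total
        if mass >= fraction:
            break
    return [xs[min(kept)], xs[max(kept)]]

# A one-sided, decaying density: the peak sits at the left edge of the grid.
grid = [0.01 * i for i in range(1001)]
density = [math.exp(-x) for x in grid]
region = max_posterior_region(grid, density)
```

For a skewed density like this one, the region starts at the peak itself, whereas an equal-tailed interval would cut off the lowest 5% of the mass and therefore exclude the peak.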