ClockTree class documentation

ClockTree is a class that implements the core algorithms for maximum likelihood time tree inference. It operates on a tree with fixed topology. All operations that reroot the tree or change its topology are part of the TreeTime class.

ClockTree docstring and constructor

class treetime.ClockTree(*args, dates=None, debug=False, real_dates=True, precision_fft='auto', precision='auto', precision_branch='auto', branch_length_mode='joint', use_covariation=False, use_fft=True, **kwargs)

ClockTree is the main class to perform the optimization of the node positions given the temporal constraints of (some) leaves.

The optimization workflow includes the inference of the ancestral sequences and branch length optimization using TreeAnc. After the optimization is done, the nodes with date-time information are arranged along the time axis, and the conversion between branch length units and date-time units is determined. Then, for each internal node, we compute the probability distribution of the node’s location conditional on the fixed locations of the leaves that have temporal information. Finally, the most probable location of each internal node is converted to its most likely time.

__init__(*args, dates=None, debug=False, real_dates=True, precision_fft='auto', precision='auto', precision_branch='auto', branch_length_mode='joint', use_covariation=False, use_fft=True, **kwargs)

ClockTree constructor

Parameters:
  • dates (dict) – {leaf_name:leaf_date} dictionary

  • debug (bool) – If True, debug mode is on: obsolete parameters are cleaned up less thoroughly or not at all, so that program execution can be controlled in intermediate states. In debug mode, the Python debugger is also allowed to interrupt program execution with an interactive shell if an error occurs.

  • real_dates (bool) – If True, some additional sanity checks are performed on the input dates.

  • precision (int) – Precision can be 0 (rough), 1 (default), 2 (fine), or 3 (ultra fine). This parameter determines the number of grid points that are used for the evaluation of the branch length interpolation objects. When not specified, this defaults to 1 for short sequences and 2 for long sequences with L>1e4.

  • precision_fft (int) – When calculating convolutions using the FFT approach, a regular discrete grid needs to be chosen. To optimize the calculation, the grid size is not set to a fixed number but is determined by the full width at half maximum (FWHM) of the distributions. The number of points desired to span the FWHM of a distribution can be specified explicitly by precision_fft (default is 200).

  • branch_length_mode (str) – Determines whether branch lengths are calculated using the ‘joint’ ML, the ‘marginal’ ML, or taken from the input tree (‘input’).

  • use_covariation (bool) – Determines whether the root-to-tip regression accounts for covariance introduced by shared ancestry.

  • use_fft (bool) – If True (default), use FFT to calculate convolution integrals. The alternative is kept to be able to reproduce previous behavior.

  • **kwargs – Keyword arguments passed on to the parent class (TreeAnc)
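As an illustration of the dates parameter, the following minimal sketch builds a {leaf_name: leaf_date} dictionary from CSV text using only the standard library. The column names (name, date), the helper names decimal_year and read_dates, and the choice to store dates as decimal years are assumptions made for this example; they are not taken from the ClockTree documentation above.

```python
import csv
import datetime
import io


def decimal_year(iso_date):
    """Convert 'YYYY-MM-DD' to a decimal year (hypothetical helper)."""
    d = datetime.date.fromisoformat(iso_date)
    start = datetime.date(d.year, 1, 1)
    # Length of this particular year in days (365 or 366).
    days_in_year = (datetime.date(d.year + 1, 1, 1) - start).days
    return d.year + (d - start).days / days_in_year


def read_dates(csv_text):
    """Build a {leaf_name: leaf_date} dictionary from CSV text.

    Assumes a header row with the columns 'name' and 'date'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: decimal_year(row["date"]) for row in reader}


# Hypothetical input: two leaves with known sampling dates.
example_csv = "name,date\nleaf_A,2020-01-01\nleaf_B,2020-07-02\n"
dates = read_dates(example_csv)
```

A dictionary of this shape is what the dates argument describes; whether decimal years are the expected value type should be checked against the library before use.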

Running TreeTime analysis

ClockTree.init_date_constraints(clock_rate=None, **kwarks)[source]

Get the conversion coefficients between the dates and the branch lengths as they are used in ML computations. The conversion formula is assumed to be ‘length = k*numdate + b’. For convenience, these coefficients as well as regression parameters are stored in the ‘dates2dist’ object.

Note

The tree must have dates set to all nodes before calling this function.

Parameters:

clock_rate (float) – If specified, timetree optimization will be done assuming this fixed clock rate.
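The conversion formula ‘length = k*numdate + b’ can be illustrated with a small root-to-tip regression. This is a minimal sketch using ordinary least squares in plain Python; it does not account for covariance from shared ancestry and is not the library's implementation. The function name fit_dates_to_distance and the example values are hypothetical. When clock_rate is given, the slope k is held fixed and only the intercept b is estimated.

```python
def fit_dates_to_distance(numdates, distances, clock_rate=None):
    """Fit distance = k * numdate + b by ordinary least squares.

    If clock_rate is given, k is fixed to that value and only b is fitted.
    Returns the pair (k, b).
    """
    n = len(numdates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (numdate, distance) pairs")
    mean_t = sum(numdates) / n
    mean_d = sum(distances) / n
    if clock_rate is None:
        var_t = sum((t - mean_t) ** 2 for t in numdates)
        if var_t == 0:
            raise ValueError("all dates are identical; the slope is undefined")
        cov = sum((t - mean_t) * (d - mean_d)
                  for t, d in zip(numdates, distances))
        k = cov / var_t
    else:
        k = clock_rate
    # The least-squares line passes through the point of means.
    b = mean_d - k * mean_t
    return k, b


# Hypothetical tips: sampling dates and root-to-tip distances.
numdates = [2000.0, 2005.0, 2010.0, 2015.0]
distances = [0.010, 0.015, 0.020, 0.025]

k, b = fit_dates_to_distance(numdates, distances)
# The date at which the fitted distance is zero estimates the root date.
root_date = -b / k
```

With these coefficients, any numerical date can be mapped to a branch length scale and back, which is the role the text assigns to the ‘dates2dist’ object.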

ClockTree.make_time_tree(time_marginal=False, clock_rate=None, **kwargs)

Use the date constraints to calculate the most likely positions of unconstrained nodes.

Parameters:
  • time_marginal (bool) – If True, use marginal reconstruction for node positions

  • **kwargs – Keyword arguments used to initialize the date constraints

Post-processing

ClockTree.branch_length_to_years()

This function sets the branch lengths to reflect the date differences between parent and child nodes, measured in years. It should only be called after treetime.ClockTree.convert_dates() has been called.

Returns:

All manipulations are done in place on the tree

Return type:

None
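The idea of expressing branch lengths as date differences can be sketched on a toy tree. The Node class and the function set_branch_lengths_in_years below are hypothetical stand-ins, not the library's data structures; only the attribute names numdate and branch_length follow the text. Assigning 0.0 to the root is an assumption of this sketch.

```python
class Node:
    """Minimal tree node with a numerical date (hypothetical stand-in)."""

    def __init__(self, name, numdate, children=()):
        self.name = name
        self.numdate = numdate
        self.children = list(children)
        self.branch_length = None


def set_branch_lengths_in_years(node, parent=None):
    """Set each branch length to child.numdate - parent.numdate, in place."""
    if parent is None:
        # Assumption of this sketch: the root has no parent, so use 0.0.
        node.branch_length = 0.0
    else:
        node.branch_length = node.numdate - parent.numdate
    for child in node.children:
        set_branch_lengths_in_years(child, node)


# Toy tree: a root dated 2000.0 with two leaves.
leaf_a = Node("A", 2010.5)
leaf_b = Node("B", 2005.0)
root = Node("root", 2000.0, children=[leaf_a, leaf_b])

set_branch_lengths_in_years(root)
```

As in the documented method, the sketch modifies the tree in place and returns None.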

ClockTree.convert_dates()

This function converts the estimated “time_before_present” property of each node to a numerical date stored in the “numdate” attribute. This date is further converted into a human-readable date string in the format %Y-%m-%d, assuming the usual calendar.

Returns:

All manipulations are done in place on the tree

Return type:

None
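The two conversion steps described above can be sketched with the standard library. Both helpers are hypothetical and written for illustration only: the first assumes that time_before_present and the reference date are both measured in years, and the second maps the fractional part of a decimal year onto the days of that calendar year. The library's own conversion may differ in its details.

```python
import datetime


def time_before_present_to_numdate(time_before_present, present_numdate):
    """Convert a time before present (in years) to a numerical date.

    Assumes present_numdate is the decimal-year date of the reference point.
    """
    return present_numdate - time_before_present


def numeric_date_to_string(numdate):
    """Format a decimal year as a %Y-%m-%d string."""
    year = int(numdate // 1)
    start = datetime.date(year, 1, 1)
    # Length of this particular year in days (365 or 366).
    days_in_year = (datetime.date(year + 1, 1, 1) - start).days
    # Whole days elapsed since January 1 of that year.
    offset = int((numdate - year) * days_in_year)
    return (start + datetime.timedelta(days=offset)).strftime("%Y-%m-%d")


# Hypothetical node: 10 years before a present of 2020.5.
numdate = time_before_present_to_numdate(10.0, 2020.5)
date_string = numeric_date_to_string(numdate)
```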

ClockTree.get_confidence_interval(node, interval=(0.05, 0.95))

If temporal reconstruction was done using the marginal ML mode, the entire distribution of times is available. This function determines the 90% (or other) confidence interval, defined as the range with 5% of the probability mass below it and 5% above it. Note that this interval does not necessarily contain the highest probability position. In the absence of marginal reconstruction, it will return the uncertainty based on rate variation. If both are present, the wider interval will be returned.

Parameters:
  • node (PhyloTree.Clade) – The node for which the confidence interval is to be calculated

  • interval (tuple, list) – Array of length two, or tuple, defining the bounds of the confidence interval

Returns:

confidence_interval – Array with two numerical dates delineating the confidence interval

Return type:

numpy array
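The quantile-based definition of the interval can be illustrated on a discretized distribution. This is a minimal sketch in plain Python, assuming the distribution is given as density values on a grid of numerical dates; the function name quantile_interval is hypothetical, and the sketch returns a tuple rather than a numpy array. It integrates the density with the trapezoidal rule and inverts the cumulative distribution by linear interpolation.

```python
def quantile_interval(grid, density, interval=(0.05, 0.95)):
    """Return the dates at which the cumulative probability reaches
    the two bounds given in interval."""
    # Cumulative distribution via the trapezoidal rule.
    cdf = [0.0]
    for i in range(1, len(grid)):
        area = 0.5 * (density[i] + density[i - 1]) * (grid[i] - grid[i - 1])
        cdf.append(cdf[-1] + area)
    total = cdf[-1]
    if total <= 0:
        raise ValueError("density must integrate to a positive value")
    cdf = [c / total for c in cdf]

    def inverse(q):
        # Find the first grid cell where the CDF reaches q and interpolate.
        for i in range(1, len(cdf)):
            if cdf[i] >= q:
                span = cdf[i] - cdf[i - 1]
                frac = 0.0 if span == 0 else (q - cdf[i - 1]) / span
                return grid[i - 1] + frac * (grid[i] - grid[i - 1])
        return grid[-1]

    return tuple(inverse(q) for q in interval)


# Hypothetical example: a uniform density between 2000 and 2010.
grid = [2000.0 + i for i in range(11)]
density = [1.0] * 11
lower, upper = quantile_interval(grid, density)
```

For a skewed distribution, the interval obtained this way can exclude the mode, which is the caveat noted in the description above.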

ClockTree.get_max_posterior_region(node, fraction=0.9)

If temporal reconstruction was done using the marginal ML mode, the entire distribution of times is available. This function determines the interval around the highest posterior probability region that contains the specified fraction of the probability mass. In the absence of marginal reconstruction, it will return the uncertainty based on rate variation. If both are present, the wider interval will be returned.

Parameters:
  • node (PhyloTree.Clade) – The node for which the posterior region is to be calculated

  • fraction (float) – Float specifying how much of the posterior probability is to be contained in the region

Returns:

max_posterior_region – Array with two numerical dates delineating the high posterior region

Return type:

numpy array
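One way to picture such a region is as the narrowest contiguous window of a discretized distribution that holds the requested fraction of the mass. The sketch below implements that idea with a sliding window in plain Python. It assumes the distribution is given as probability masses on an ordered grid; the function name narrowest_region is hypothetical, and the approach is not necessarily the one used by the library.

```python
def narrowest_region(grid, mass, fraction=0.9):
    """Return (start, end) of the narrowest contiguous grid window
    whose total mass is at least fraction of the overall mass."""
    target = fraction * sum(mass)
    best = None
    left = 0
    window = 0
    for right in range(len(grid)):
        window += mass[right]
        # Drop points on the left while the window keeps enough mass.
        while left < right and window - mass[left] >= target:
            window -= mass[left]
            left += 1
        if window >= target:
            width = grid[right] - grid[left]
            if best is None or width < best[1] - best[0]:
                best = (grid[left], grid[right])
    return best


# Hypothetical skewed distribution: most mass lies at the early dates.
grid = [2000.0, 2001.0, 2002.0, 2003.0, 2004.0, 2005.0]
mass = [50, 30, 10, 5, 3, 2]
region = narrowest_region(grid, mass, fraction=0.75)
```

Unlike an interval that cuts equal tails from both ends, this window always includes the densest part of the distribution.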

ClockTree.timetree_likelihood(time_marginal)

Return the likelihood of the data given the current branch lengths in the tree.