Tree Builder Help

From Ribosomal Database Project Wiki


Tree Builder uses a Java applet for displaying and saving your tree. This requires a browser capable of displaying Java applets, and Java version 1.0 or later (if needed, Download Java).


Tree Builder uses sequences aligned with RDP's aligner. A distance matrix is generated using the Jukes-Cantor corrected distance model. When generating the distance matrix, only alignment model positions are used and alignment inserts are ignored; two sequences must share at least 200 comparable positions. The tree is created using Weighbor with alphabet size 4 and length size 1000.

Weighbor Tree
Weighbor is a weighted version of Neighbor Joining that gives significantly less weight to the longer distances in the distance matrix. The weights are based on variances and covariances expected in a simple Jukes-Cantor model.
Jukes-Cantor Correction
The Jukes-Cantor distance correction is a model that accounts for the fact that, as two sequences diverge, the probability of a second substitution at any nucleotide site increases. Distance-based methods such as Weighbor compute the distance from the observed nucleotide differences; second substitutions at the same site are therefore not counted, and the distance is underestimated. Jukes and Cantor created a formula that calculates the distance taking into account more than just the individual observed differences (1969; Evol. of Protein Molecules, Academic Press).
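As an illustration of the correction described above, the standard Jukes-Cantor formula is d = -(3/4) ln(1 - (4/3) p), where p is the observed fraction of differing sites. The following is a minimal sketch, not RDP's implementation: the function name and the rule for deciding which positions are comparable (both sequences have a nucleotide, not a gap) are assumptions made for this example. The default threshold of 200 comparable positions follows the value stated on this page.

```python
import math

def jukes_cantor_distance(seq_a, seq_b, min_comparable=200):
    """Jukes-Cantor corrected distance between two aligned sequences.

    Hypothetical helper for illustration only. Returns None when fewer
    than min_comparable positions can be compared, or when the observed
    difference is too large for the correction to be defined (p >= 0.75).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    bases = set("ACGU")
    a_norm = seq_a.upper().replace("T", "U")
    b_norm = seq_b.upper().replace("T", "U")
    comparable = 0
    mismatches = 0
    for a, b in zip(a_norm, b_norm):
        # Assumption: a position is comparable only if both are nucleotides.
        if a in bases and b in bases:
            comparable += 1
            if a != b:
                mismatches += 1
    if comparable < min_comparable:
        return None
    p = mismatches / comparable  # observed (uncorrected) distance
    if p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)
```

For two sequences that differ at 10% of their comparable positions, the corrected distance is slightly larger than 0.1, reflecting the substitutions assumed to be hidden by repeated changes at the same site.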
Bootstrapping
Bootstrapping is a statistical method for estimating a sampling distribution by resampling with replacement from the original sample. In making phylogenetic trees, the approach is to create a pseudoalignment by randomly sampling positions (columns) of the original alignment. Some columns of the alignment may be selected more than once and others not at all. The pseudoalignment is as long as the original alignment and is used to create a distance matrix and a tree. The process is repeated 100 times, and a majority consensus tree is displayed showing, for each branch, the number (or percentage) of replicates in which the same group of sequences was found on each side of that branch, regardless of how the sequences are arranged within the group.
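The column resampling step described above can be sketched as follows. This is an illustrative example under stated assumptions, not the Tree Builder code: the function name and the representation of an alignment as a list of equal-length strings are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng=None):
    """Build one pseudoalignment by sampling columns with replacement.

    alignment is assumed to be a list of equal-length aligned strings.
    The same sampled columns are applied to every sequence, so each
    column of the result is an intact column of the original.
    """
    rng = rng or random.Random()
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    # Draw as many column indices as the alignment is long; repeats allowed.
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in columns) for seq in alignment]
```

Calling this function 100 times, and building a distance matrix and tree from each pseudoalignment, would give the replicates from which the consensus tree and its support values are derived.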

Quick Start

There are three steps to using the Tree Builder (more detail below and in the new video tutorial):

1. Selecting public RDP sequences using the Hierarchy Browser or Sequence Match, or selecting private data you have uploaded to myRDP.
2. Selecting the outgroup for the analysis, clicking on "Create tree", and waiting until your tree is done.
3. Viewing, manipulating, or saving your tree using the Java applet.

Making a Phylogenetic Tree

A phylogenetic tree is a graphic representation of the genealogic relationships between taxa. The Tree Builder is not a comprehensive inference package but a quick-and-dirty method of generating a tree.

Step 1 - Selecting Sequences

Limits: Tree Builder requires between 4 and 50 sequences to be selected in your Sequence Cart before it will allow you to generate a tree. Fifty sequences is an arbitrary limit to prevent bogging down the RDP server, but please note you will probably want to select significantly fewer since trees quickly become cluttered and hard to view.
Selecting private data in myRDP: From the "Tree Builder - Start" page, select "myRDP" in the top menu bar. This will take you to the "myRDP Overview" page. Logging in is necessary in order to access your private sequences. Select groups of sequences, or click on a group to select individual sequences in a group. Only sequences that have successfully completed alignment can be used with the Tree Builder. Once selected, sequences are added to your Sequence Cart and will be available to the Tree Builder program.
Adding RDP Sequences to your seqCart: You can select publicly available aligned 16S rRNA sequences from either the Sequence Match or the Hierarchy Browser applications. The selected sequences will be added to the Sequence Cart.

You can run the Sequence Match and then select sequences similar to your sequences. Click on "SEQMATCH" in the menu bar and run the program with your sequences in the cart or with a different file. Then click on "view selectable matches" and select your sequences of interest.

You can also get sequences from the Hierarchy Browser. From the "Tree Builder - Start" page, click on "BROWSER" in the menu bar. This will take you to the RDP Hierarchy Browser, where you need to define your dataset before selecting the sequences. Sequences selected using the Hierarchy Browser will be placed in your Sequence Cart. Please see the Hierarchy Browser Help for assistance with using the Hierarchy Browser.

It is recommended that you use type strains in your selection. Type strains are well characterized and link phylogeny with taxonomy. Type strains have a (T) in their description.

Step 2 - Selecting an outgroup

An outgroup is any group used in an analysis that is not included in the taxon under study. Sequences in a group should be more closely related to each other than to the outgroup. However, an outgroup should not be very distant from your taxa, because multiple mutations could have occurred at the same sites and this information would not be considered. Adding multiple outgroups generally improves the tree topology. You can select the outgroup from the Hierarchy Browser.

After you are done adding the sequences you wish to tree, return to the Tree Builder program by clicking "TREE" in the top menu or selecting "Tree Builder" on the RDP homepage. The "Tree Builder - Start" page will give you an overview of how many aligned RDP and myRDP sequences you have in your Sequence Cart. If your Sequence Cart contains between 4 and 50 sequences, a menu including the "CREATE TREE" button will appear.

Before creating the tree you need to select the alignment model (the eubacterial model is currently available; an archaeal aligner is currently being developed). You also need to select the outgroup before creating the tree. If you have multiple outgroups, select the most unrelated one.

Finally, click on "CREATE TREE". This task will take between 6 seconds and 45 minutes, depending on how many sequences were selected and on server load (during periods of extremely high server load, you may not be able to submit requests). The number of treeing requests currently waiting in the queue is displayed on the "Tree Builder - Start" page.

Step 3 - Viewing the Tree

Tree Builder uses a Java applet to display and save the tree. You must have a browser capable of running Java 1.0 or later. The commands for manipulating and saving the tree are shown on the "Tree Builder - Result" page. Please note, on Macs the Option key takes the place of the Alt key. The tree can be saved in PostScript or Newick format. PostScript is a graphics format similar to PDF and can be edited in programs such as Adobe Illustrator. A link to a free PostScript-to-PDF online converter is available on the result page. The Newick format is a simple text format accepted by many tree viewing programs such as ARB.
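To show what the Newick text format looks like, here is a small sketch that serializes a tree to a Newick string. The nested-tuple tree representation and the function name are assumptions made for this example; they are not part of Tree Builder, and the branch lengths are made-up illustrative values.

```python
def to_newick(node):
    """Serialize a tree to a Newick string.

    Each node is assumed to be a (name, branch_length, children) tuple,
    where branch_length may be None and children is a list of nodes.
    """
    def render(n):
        name, length, children = n
        text = name
        if children:
            # Internal node: children in parentheses, then optional label.
            text = "(" + ",".join(render(c) for c in children) + ")" + name
        if length is not None:
            text += ":" + format(length, "g")
        return text
    return render(node) + ";"


# Hypothetical four-sequence tree with illustrative branch lengths.
example = ("", None, [
    ("A", 0.1, []),
    ("B", 0.2, []),
    ("", 0.05, [("C", 0.3, []), ("D", 0.4, [])]),
])
print(to_newick(example))  # prints (A:0.1,B:0.2,(C:0.3,D:0.4):0.05);
```

Parentheses group the sequences on one side of a branch, commas separate siblings, the number after each colon is a branch length, and a semicolon ends the tree.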

The tree includes results from a bootstrap test using 100 replicates. Bootstrap values higher than 50% are highlighted.

1. William J. Bruno, Nicholas D. Socci, and Aaron L. Halpern (2000). Weighted Neighbor Joining: A Likelihood-Based Approach to Distance-Based Phylogeny Reconstruction, Mol. Biol. Evol. 17 (1): 189-197.
2. E. O. Wiley, D. R. Brooks, D. Siegel-Causey, V. A. Funk (1991). The Compleat Cladist: A Primer of Phylogenetic Procedures. Freely available at
