The Process:

The Classifier tool's main page can be found at RDP Classifier.

The interactive Classifier tool on the RDP web page is best for examining a small number of sequences to obtain taxonomic assignments. Its advantages are a fast turnaround time and rich options for browsing the results. Note that it allows only a single sequence file to be uploaded.

Download the sample input files . . . for this tutorial -- the sample input file contains 7 sequences.

Uploading your data . . .

To run the Classifier, select the gene "16S rRNA" from the drop-down menu. You then have two choices for submitting the test file:

The output . . . the results are shown in a taxonomic hierarchy. The hierarchy view displays, in hierarchical order, all the taxon nodes that have sequences assigned to them.


  • Change "Display Depth" to "3" to display only 3 taxonomic levels in the Hierarchy View.
  • Click the link on "Proteobacteria" to explore the sequences assigned to phylum "Proteobacteria".
  • Change "Confidence threshold" from the default value of 80% to 50%. One sequence (X67228) is now assigned to genus "Rhizobium" instead of "unclassified_Rhizobiales". For each rank assignment, the Classifier automatically estimates the classification reliability using bootstrapping. A threshold of 50% is recommended for sequences shorter than 250 bp.
  • Click "download entire hierarchy as text file" to download the assignment count for each taxon. To create a quick graphical view, import the text file into Excel as tab-delimited and sort by rank and then by number of sequences (sort by column A, then column C, largest to smallest).
  • Click the link [show assignment detail for "Proteobacteria" only ] to view the assignment detail. Each query sequence is listed by name, followed by its classification, with a confidence value displayed for each taxonomic level.
  • Save the list of sequences and assignment detail by clicking "download allrank result" or "download fixrank result" while in the assignment detail view. The "allrank" format outputs the results for all ranks applied to each sequence. The "fixrank" format outputs results only for a fixed list of ranks, in the following order: domain, phylum, class, order, family and genus. If a rank is missing from the lineage, the bootstrap value and taxon name from the immediately lower rank are reported instead. This eliminates gaps in the lineage but also introduces taxon names and ranks that do not exist, so interpret the "fixrank" results with caution.
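
The effect of the confidence threshold described above can be reproduced offline on a downloaded "fixrank" result. The following is a minimal sketch, not the Classifier's own code. It assumes each line is tab-separated, with the sequence name first, one extra field second, and then repeating taxon name / rank / confidence triples; check this against your own download before relying on it. The function names and the confidence values in the sample line are hypothetical, chosen only to mirror the X67228 example.

```python
from typing import List, Tuple

# (taxon name, rank, bootstrap confidence as a fraction between 0 and 1)
Assignment = Tuple[str, str, float]


def parse_fixrank_line(line: str) -> Tuple[str, List[Assignment]]:
    """Split one line into a sequence name and its rank assignments.

    ASSUMED layout: name, one extra field, then name/rank/confidence triples.
    """
    fields = line.rstrip("\n").split("\t")
    seq_id, rest = fields[0], fields[2:]
    usable = len(rest) - len(rest) % 3  # ignore a trailing partial triple
    assignments = []
    for i in range(0, usable, 3):
        name = rest[i].strip('"')  # taxon names may be quoted
        assignments.append((name, rest[i + 1], float(rest[i + 2])))
    return seq_id, assignments


def label_at_threshold(assignments: List[Assignment], threshold: float) -> str:
    """Walk from the top rank down and stop at the first rank below threshold."""
    last_confident = None
    for name, _rank, confidence in assignments:
        if confidence < threshold:
            if last_confident is None:
                return "unclassified"
            return "unclassified_" + last_confident
        last_confident = name
    return last_confident if last_confident is not None else "unclassified"


# Hypothetical line: the confidence values are invented for illustration.
SAMPLE_LINE = "\t".join([
    "X67228", "",
    "Bacteria", "domain", "1.0",
    "Proteobacteria", "phylum", "1.0",
    "Alphaproteobacteria", "class", "1.0",
    "Rhizobiales", "order", "0.99",
    "Rhizobiaceae", "family", "0.62",
    "Rhizobium", "genus", "0.55",
])

seq_id, ranks = parse_fixrank_line(SAMPLE_LINE)
print(seq_id, label_at_threshold(ranks, 0.80))  # X67228 unclassified_Rhizobiales
print(seq_id, label_at_threshold(ranks, 0.50))  # X67228 Rhizobium
```

With these invented values, lowering the threshold from 0.80 to 0.50 moves the sequence from "unclassified_Rhizobiales" to genus "Rhizobium", matching the behavior seen in the web interface. Because "fixrank" fills missing ranks from the next lower rank, labels produced this way carry the same caveat as the format itself.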

