
Ribosomal Database Project Release 11


Please note the RDP License Agreement and how to cite services from this site.

 

RDP Release 11 Notes


RDP Release 11 provides many significant enhancements over RDP release 10:

New Data Collection--Fungal 28S rRNA Sequences

Starting with Release 11, we provide fungal 28S rRNA sequences as part of the RDP data collections. The fungal taxonomy used by RDP is the recently published taxonomy, hand-developed from published phylogenies for different taxa and from taxonomic databases (13), with updates. The Hierarchy Browser, Classifier, SeqMatch and ProbeMatch tools have been updated to handle fungal 28S queries. Researchers can now align fungal 28S sequences using our new Fungal LSU aligner (see below).
For each release, a set of flat files containing the entire sequence collection for each of the three genes is available for download in aligned or unaligned FASTA format and in annotated GenBank format.
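
As a rough illustration of working with the downloadable FASTA files, the sketch below reads FASTA records and converts an aligned sequence to its unaligned form. It is a minimal example and not an RDP tool: the function names are hypothetical, and it assumes that "-" and "." are the only gap characters in the aligned files.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ungap(aligned):
    """Remove gap characters (assumed here to be '-' and '.')."""
    return aligned.replace("-", "").replace(".", "")


# Hypothetical two-record aligned file, held in memory for the example.
sample = ">seq1 example record\nAC--GU..\nA-C\n>seq2\n..GGCC--\n"
records = list(read_fasta(io.StringIO(sample)))
for name, aligned in records:
    print(name, ungap(aligned))
```

For a real download, the same functions can be applied to an opened file handle in place of the in-memory sample.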

New and Updated Alignment

Updated RDP Amplicon Sequence Pipeline (RDPipeline)

The RDPipeline is a new tool suite designed to replace our previous Pyrosequencing Pipeline, offering extended processing and analysis tools reflecting recent shifts in amplicon sequencing technologies and techniques (14).

New Paired-End Reads Assembler

Our paired-end reads Assembler extends the existing PANDAseq (15) program. The Assembler performs a modified statistical analysis using the Q scores to find the most likely overlap, computes assembled Q scores for the read overlap region, and handles more complex overlap layouts. The Assembler outperformed both the original PANDAseq and Mothur (16) on two defined community datasets from two different Illumina MiSeq runs (14). The Assembler can be run with multiple threads. In testing on a single CPU, the Assembler took 1.4 hours to assemble more than 16 million reads from one MiSeq run. The Assembler is integrated into Initial Processing (RDPipeline) and is available for download from the RDP GitHub repository and the RDP Resources page.
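
The general idea of using Q scores to pick the most likely overlap can be sketched as follows. This is a simplified illustration, not the Assembler's or PANDAseq's actual algorithm: it assumes the reverse read has already been reverse-complemented, scores each candidate overlap with a log-odds value against a random-sequence background, and keeps the higher-quality base at each overlapping position. All function names are hypothetical.

```python
import math


def error_prob(q):
    """Convert a Phred quality score Q into an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def overlap_score(fwd, fwd_q, rev, rev_q, k):
    """Log-odds score for overlapping the last k bases of fwd with the first k of rev."""
    offset = len(fwd) - k
    score = 0.0
    for i in range(k):
        pa = error_prob(fwd_q[offset + i])
        pb = error_prob(rev_q[i])
        # Probability that both base calls report the same base,
        # given that they come from the same true position.
        p_same = (1 - pa) * (1 - pb) + pa * pb / 3
        if fwd[offset + i] == rev[i]:
            score += math.log(p_same / 0.25)  # random sequences match 1/4 of the time
        else:
            score += math.log(max(1 - p_same, 1e-12) / 0.75)
    return score


def find_best_overlap(fwd, fwd_q, rev, rev_q, min_overlap=4):
    """Return (overlap_length, score) for the highest-scoring overlap."""
    candidates = range(min_overlap, min(len(fwd), len(rev)) + 1)
    return max(((k, overlap_score(fwd, fwd_q, rev, rev_q, k)) for k in candidates),
               key=lambda item: item[1])


def merge(fwd, fwd_q, rev, rev_q, k):
    """Join the reads, keeping the higher-quality base at each overlap position."""
    offset = len(fwd) - k
    middle = [fwd[offset + i] if fwd_q[offset + i] >= rev_q[i] else rev[i]
              for i in range(k)]
    return fwd[:offset] + "".join(middle) + rev[k:]


# Toy reads with an 8-base overlap ("GGCCTTAA") and uniform Q30 scores.
fwd, rev = "ACGTACGTGGCCTTAA", "GGCCTTAATTGGCCAA"
fwd_q, rev_q = [30] * len(fwd), [30] * len(rev)
best_k, best_score = find_best_overlap(fwd, fwd_q, rev, rev_q)
assembled = merge(fwd, fwd_q, rev, rev_q, best_k)
print(best_k, assembled)
```

The sketch omits the computation of assembled Q scores for the overlap region and the more complex overlap layouts that the Assembler handles.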

RDP GitHub repository

For researchers involved in high-volume sequencing projects, or who would like to incorporate some of our tools into their local custom workflows, we offer most RDP tools on the RDP GitHub repository. Step-by-step instructions and sample data files are provided for each tool.

 

Release 11 Update History

09/17/2014

RDP Release 11.3 consists of 3,019,928 aligned and annotated 16S rRNA sequences and 102,901 Fungal 28S rRNA sequences. The Bacteria and Archaea hierarchy model used by RDP Classifier and RDP Hierarchy Browser has been updated to training set No. 10. The update adds 14 new bacterial phyla, 1 new archaeal phylum, and 159 new genera. The former phylum Nitrospira is renamed Nitrospirae. The former candidate phyla OP11, TM7, OD1 and WS3 are now the bacterial phyla Microgenomates, Candidatus Saccharibacteria, Parcubacteria and Latescibacteria, respectively.

07/01/2014

A new Warcup Fungal ITS training set for classifying fungal ITS sequences has been released. It is available on the RDP Classifier page.

03/07/2014

RDP Release 11.2 consists of 2,929,433 aligned and annotated 16S rRNA sequences and 95,365 Fungal 28S rRNA sequences. The Fungal LSU hierarchy model used by RDP Classifier and RDP Hierarchy Browser has been updated to training set No. 11. The new Fungal LSU training set offers increased coverage of the Glomeromycota, Chytridiomycota and other basal lineages, with expanded non-fungal Eukarya phyla to better separate fungi from other eukaryotes.

10/16/2013

RDP Release 11.1 consists of 2,809,406 aligned and annotated 16S rRNA sequences and 62,860 Fungal 28S rRNA sequences. New in this release are a collection of Fungi 28S sequences, a new Fungi 28S Aligner, and an updated Bacteria and Archaea 16S Aligner. The RDP pipeline has been completely redesigned for speed and capacity. It now supports user accounts, file type validation, and both single-strand and paired-end read data. Most of the RDP tools are now available on GitHub.

Ribosomal Database Project Release 10 (RDP) Notes



 

Changes from RDP Release 9


RDP10 provides two significant enhancements over RDP9:

Archaea Sequences

This release of the RDP includes Archaea sequences. Aligned and annotated public Archaea sequence data is available from our Hierarchy Browser. The RDP Classifier and Seqmatch can be used to classify and match Archaea sequences, and users' private sequence data can be uploaded and aligned against RDP's Archaea alignment model.

Improved Alignment Strategy

RDP Bacterial and Archaeal alignments are now produced using Infernal (1), a secondary-structure based aligner that provides better support for short partial sequences and handles certain sequencing artifacts in a more intuitive manner. The aligner is trained on sets of high quality hand-aligned sequences and incorporates the conserved Bacterial and Archaeal secondary structure models of Gutell and co-workers (2).

 

Changes From RDP8.1 and Earlier Versions


Bacterial Taxonomy

Since RDP Release 9, all tools use a new RDP Hierarchy which differs significantly from the Release 8.1 and earlier Hierarchy. The RDP Hierarchy is now based on the new phylogenetically consistent higher-order bacterial taxonomy proposed by Garrity et al. (3), with additional major rearrangements that have been proposed for the Firmicutes (8) and Cyanobacteria (7). It also includes published informal classifications for well-defined lineages with few cultivated members, such as Acidobacteria, Verrucomicrobia and OP11 (4, 5, 6). Additional information for the classification of chloroplast, Korarchaeota, and Nanoarchaeum was taken from the NCBI taxonomy (9). Sequences are placed in the Hierarchy using the RDP Classifier.

Major rearrangements for Classifier training set No. 9 include the following:

Some tools also allow the user to view data in a taxonomy as classified by NCBI.

Data Updates

Release 10 is kept up-to-date by syncing its sequence data with the major sequence repositories about once every month. New public sequences are added to the RDP, deleted sequences are removed, and annotation is updated with every update. Each update of the RDP is given a specific update number, such as RDP 10.1 (RDP 10, update 1), RDP 10.2, etc.
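
Because each update both adds and removes sequences, the reported totals do not always equal the previous total plus the newly added sequences. The short sketch below checks this for a few entries copied from the Release 10 Update History. Reading the shortfall as net removals is our interpretation rather than a figure reported by RDP, and the helper name is hypothetical.

```python
# (update, new sequences added, reported total), copied from the Release 10 Update History.
updates = [
    ("10.29", 210_206, 2_320_464),
    ("10.30", 259_863, 2_578_902),
    ("10.31", 60_255, 2_639_157),
    ("10.32", 126_121, 2_765_278),
]


def net_removed(previous_total, added, new_total):
    """Sequences unaccounted for after adding `added` to the previous total."""
    return previous_total + added - new_total


# Compare each update with the one before it.
for (_, _, prev_total), (name, added, total) in zip(updates, updates[1:]):
    print(f"RDP {name}: {net_removed(prev_total, added, total):,} net removed")
```

For these entries the check prints 1,425 for RDP 10.30 and 0 for RDP 10.31 and RDP 10.32.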

Quality Checking for Public Sequence Data

Since RDP Release 9, public data added to the RDP is labeled as being either "good" or "suspect" in quality. Suspected low quality sequences may be chimeric or have other sequencing related issues. Quality checking is performed using a combination of Pintail and the RDP Seqmatch. More information is available about RDP's quality checking algorithm.

 

Notes and Clarifications about RDP Release 10

Alignments

RDP Bacterial and Archaeal alignments are created from separate alignment models. This means a user cannot download an aligned file containing both Archaea and Bacteria sequences, or run an analysis program that requires aligned data (e.g., TreeBuilder) on a mixed set of Bacteria and Archaea sequences. However, each alignment includes one cross-aligned sequence (either E. coli or Methanocaldococcus jannaschii) that can be used as an outgroup if needed for trees, etc.

Using separate alignment models for Archaea and Bacteria maximizes the number of comparable base positions within each alignment, so each alignment encodes more information. However, if you would still find a "combined" alignment with a reduced number of comparable alignment positions useful, please contact us. We are considering adding this as an option at some future point, but are unsure of its usefulness to our users.

myRDP, RDP10 and RDP9

myRDP account information--including usernames, passwords, and uploaded sequence data--is shared between the RDP10 and RDP9 websites. All myRDP user sequences were realigned using the RDP10 alignment model. Log into the RDP9 website to use the old RDP9 alignment, and log into the RDP10 website for the new Release 10 alignment models. Sequences uploaded to RDP10 to be aligned using our Archaea aligner will show up under RDP9's myRDP site as unaligned.

 

References

1. Nawrocki, E. P. and S. R. Eddy. 2007. Query-Dependent Banding (QDB) for Faster RNA Similarity Searches. PLoS Comput. Biol., 3:e56. [PubMed]

2. Cannone, J.J., Subramanian, S., Schnare, M.N., Collet, J.R., D'Souza, L.M., Du, Y., Feng, B., Lin, N., Madabusi, L.V., Muller, K.M., Pande, N., Shang, Z., Yu, N., and Gutell, R.R. (2002) The Comparative RNA Web (CRW) Site: An Online Database of Comparative Sequence and Structure Information for Ribosomal, Intron, and other RNAs. BioMed Central Bioinformatics., 3,2. [PubMed]

3. Garrity, G.M., Lilburn, T.G., Cole, J.R., Harrison, S.H., Euzeby, J., and Tindall, B.J. (2007) The Taxonomic Outline of Bacteria and Archaea. TOBA Release 7.7, March 2007. Michigan State University Board of Trustees. [http://www.taxonomicoutline.org/]

4. Barns, S. M, E. C. Cain, L. Sommerville and C. R. Kuske. 2007. Acidobacteria phylum sequences in uranium-contaminated subsurface sediments greatly expand the known diversity within the phylum. Appl Environ Microbiol. 73(9):3113-6. [PubMed]

5. Sangwan, P, X. Chen, P. Hugenholtz and P. H. Janssen. 2004. Chthoniobacter flavus gen. nov., sp. nov., the first pure-culture representative of subdivision two, Spartobacteria classis nov., of the phylum Verrucomicrobia. Appl Environ Microbiol. 70(10):5875-81. [PubMed]

6. Harris, J. K, S. T. Kelley and N. R. Pace. 2004. New Perspective on Uncultured Bacterial Phylogenetic Division OP11. Appl Environ Microbiol. 70(2):845-9. [PubMed]

7. Wilmotte, A. and M. Herdman. 2001. Phylogenetic relationships among the Cyanobacteria based on 16S rRNA sequences. Bergey's Manual of Systematic Bacteriology. 2nd Ed.. Volume 1, p. 487-493. Springer-Verlag, New York.

8. Ludwig, W., K.-H. Schleifer and W. B. Whitman. 2008. Revised Road Map to the Phylum Firmicutes. Bergey's Manual of Systematic Bacteriology. Volume 3. Springer-Verlag, New York.

9. Wheeler DL, C. Chappey, A.E. Lash, D.D. Leipe, T.L. Madden, G.D. Schuler, T.A. Tatusova, B.A. Rapp. 2000. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 28(1):10-14.

10. Tamaki,H., Tanaka,Y., Matsuzawa,H., Muramatsu,M., Meng,X.Y., Hanada,S., Mori,K. and Kamagata,Y. 2011. Armatimonas rosea gen. nov., sp. nov., of a novel bacterial phylum, Armatimonadetes phyl. nov., formally called the candidate phylum OP10. Int. J. Syst. Evol. Microbiol. 61(PT 6):1442-47. [PubMed]

11. Lee,K.C., Dunfield,P.F., Morgan,X.C., Crowe,M.A., Houghton,K.M., Vyssotski,M., Ryan,J.L., Lagutin,K., McDonald,I.R. and Stott,M.B. 2011. Chthonomonas calidirosea gen. nov., sp. nov., an aerobic, pigmented, thermophilic micro-organism of a novel bacterial class, Chthonomonadetes classis nov., of the newly described phylum Armatimonadetes originally designated candidate division OP10. Int. J. Syst. Evol. Microbiol. 61 (PT 10), 2482-2490. [PubMed]

12. Rainey, F.A., Hollwen, B.J., and Small, A. Genus I. Clostridium Prazmowski 1880, 23. 2008. Bergey's Manual of Systematic Bacteriology. Volume 3, p. 738-741. Springer-Verlag, New York.

13. Liu, K-L., Porras-Alfaro,A., Kuske,C.R., Eichorst,S. and Xie,G. (2012) Accurate, rapid taxonomic classification of fungal large subunit rRNA genes. Appl. Environ. Microbiol., 78, 1523-1533. [PubMed]

14. Cole,J.R., Wang,Q., Fish,J.A., Chai,B., McGarrell,D.M., Sun,Y., Brown,C.T., Porras-Alfaro,A., Kuske,C.R. and Tiedje,J.M. (2014) Ribosomal Database Project: Data and Tools for High Throughput rRNA Analysis. Nucl. Acids Res. 42(Database issue):D633-D642; doi: 10.1093/nar/gkt1244. [PubMed]

15. Masella,A.P., Bartram,A.K., Truszkowski,J.M., Brown,D.G. and Neufeld,J.D. (2012) PANDAseq: paired-end assembler for illumina sequences. BMC Bioinformatics, 13, 31. [PubMed]

16. Kozich,J.J., Westcott,S.L., Baxter,N.T., Highlander,S.K., Schloss,P.D. (2013) Development of a Dual-Index Sequencing Strategy and Curation Pipeline for Analyzing Amplicon Sequence Data on the MiSeq Illumina Sequencing Platform. Appl. Environ. Microbiol., 79, 5112-5120. [PubMed]

17. Rinke, C., Schwientek, P., Sczyrba, A., Ivanova, N.N., Anderson, I.J., Cheng, J.-F., Darling, A., Malfatti, S., Swan, B.K., Gies, E.A., Dodsworth, J.A., Hedlund, B.P., Tsiamis, G., Sievert, S.M., Liu, W.-T., Eisen, J.A., Hallam, S.J., Kyrpides, N.C., Stepanauskas, R., Rubin, E.M., Hugenholtz, P., and Woyke, T. Insights into the phylogeny and coding potential of microbial dark matter. Nature (2013) 499:431-437. [PubMed]

18. Garrity, G.M., and Holt, J.G. "Phylum BVIII. Nitrospirae phy. nov." In: Bergey's Manual of Systematic Bacteriology, 2nd ed., vol. 1 (The Archaea and the deeply branching and phototrophic Bacteria) (D.R. Boone and R.W. Castenholz, eds.), Springer-Verlag, New York (2001). p. 451.

 

RDP Release 10 Issues

The following are known issues with the RDP Release 10 site:

Video Tutorials

The video tutorials may not mention the multiple alignment models available.

 

Release 10 Update History

07/18/2013

RDP 11.1: 44,622 new sequences added (2,809,900 total).

05/14/2013

RDP 10.32: 126,121 new sequences added (2,765,278 total).

12/07/2012

RDP 10.31: 60,255 new sequences added (2,639,157 total).

09/19/2012

RDP 10.30: 259,863 new sequences added (2,578,902 total).

06/01/2012

RDP 10.29: 210,206 new sequences added (2,320,464 total). The hierarchy model used by RDP Classifier has been updated to training set No. 9 with minor rearrangements. The updates add the new class Negativicutes, order Selenomonadales and family Acidaminococcaceae to phylum Firmicutes, and move family Veillonellaceae to order Selenomonadales. The RDP Classifier software package has been updated to version 2.5. Starting with this release, RDP uses UCHIME with a reference database to detect chimeras.

01/13/2012

RDP 10.28: 189,079 new sequences added (2,110,258 total). The hierarchy model used by RDP Classifier has been updated to training set No. 7. There are 10,046 sequences included in the new training set (19% more than the previous training set No. 6). About 1% of public RDP sequences changed phyla. About 4.9% of sequences were classified to greater taxonomic depth in the new release, while about 4.8% were classified with confidence at lesser levels. See more rearrangement details at Taxonomy.

08/09/2011

RDP 10.27: 308,116 new sequences added (1,921,179 total).

06/13/2011

The RDP Fungal LSU Classifier (large subunit rRNA gene) is released. Our thanks to Cheryl Kuske and colleagues for providing the hierarchy model and training set.

03/28/2011

RDP 10.26: 67,383 new sequences added (1,613,063 total).

02/17/2011

RDP 10.25: 47,003 new sequences added (1,545,680 total).

01/11/2011

RDP 10.24: 15,661 new sequences added (1,498,677 total).

12/07/2010

RDP 10.23: 64,519 new sequences added (1,483,016 total).

08/30/2010

RDP 10.22: 39,073 new sequences added (1,418,497 total).

07/14/2010

RDP 10.21: 141,461 new sequences added (1,379,424 total).

05/19/2010

RDP 10.20: 0 new sequences added (1,237,963 total). Removed large datasets with only partial 454 sequences.

03/31/2010

RDP 10.19: 38,367 new sequences added (1,396,793 total). Minor updates were made to the hierarchy model used by RDP. The hierarchy model used by RDP Classifier has been updated to training set No. 6.

01/25/2010

RDP 10.18: 77,329 new sequences added (1,358,426 total). The hierarchy model used by RDP Classifier has been updated to training set No. 5. 101 taxa were removed and 544 taxa were added to the new hierarchy. About 0.9% of public RDP sequences changed phyla. About 9.6% of sequences were classified to greater taxonomic depth in the new release, while about 9.1% were classified with confidence at lesser levels.

12/09/2009

RDP 10.17: 46,053 new sequences added (1,281,097 total).

11/10/2009

RDP 10.16: 130,661 new sequences added (1,235,044 total).

10/05/2009

RDP 10.15: 30,308 new sequences added (1,104,383 total).

08/31/2009

RDP 10.14: 24,642 new sequences added (1,074,075 total).

07/28/2009

RDP 10.13: 128,790 new sequences added (1,049,433 total).

06/10/2009

RDP 10.12: 64,302 new sequences added (920,643 total).

05/07/2009

RDP 10.11: 19,527 new sequences added (856,341 total).

04/03/2009

RDP 10.10: 36,655 new sequences added (836,814 total).

03/06/2009

RDP 10.9: 20,656 new sequences added (800,159 total).

02/09/2009

RDP 10.8: 19,666 new sequences added (779,503 total).

01/14/2009

RDP 10.7: 44,200 new sequences added (759,837 total).

12/03/2008

RDP 10.6: 9,534 new sequences added (715,637 total).

10/30/2008

RDP 10.5: 15,954 new sequences added (706,103 total).

10/01/2008

RDP 10.4: 13,092 new sequences added (690,149 total).

09/04/2008

RDP 10.3: 53,883 new sequences added (677,057 total).

08/01/2008

RDP 10.2: 72,808 new sequences added (623,174 total). Probematch can now restrict searches by E. coli position or kingdom. Class Assignment Generator has been updated to also include Archaea sequences in student assignments. Release 10 is no longer beta, and replaces Release 9 as the latest RDP Release.

05/22/2008

RDP 10.1: Initial preview/beta release with 550,366 sequences.

 

Questions/comments: rdpstaff@msu.edu
Creative Commons License: Attribution-ShareAlike
