From Ribosomal Database Project Wiki

An open-source suite of command-line tools covering a complete analysis workflow for next-generation sequencing (NGS) data.

Index of Terms: Tool and task


Shannon & Chao indices Compute Shannon and Chao1 indices using clustering results
Jaccard & Sørensen Compute Jaccard or Sørensen abundance statistics using clustering results
compute rarefaction curve Generate a rarefaction plot using clustering results
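
As a rough illustration of what the Shannon and Chao1 entries compute, the sketch below derives both indices from a list of per-OTU sequence counts. The function names are hypothetical, and the bias-corrected form of Chao1 is an assumption; the RDP tool may use a different variant and reads its own cluster file format rather than a plain list.

```python
import math
from collections import Counter


def shannon_index(otu_counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts)


def chao1_index(otu_counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    counts = [c for c in otu_counts if c > 0]
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    return len(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))
```

For example, `chao1_index([10, 5, 1, 1, 2])` has five observed OTUs, two singletons, and one doubleton, so the estimate is 5 + 2/4 = 5.5.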


alignment merge Merge multiple alignments into one file
align nucleotide seq to protein alignment Transfer protein alignments to nucleotide alignments
k nearest neighbors Compute k-nearest-neighbors by pairwise alignment
Defined Community Analysis Compares input nucleotide reads to the set of known sequences for the amplification targets: compares sequencing error types, calculates the error rate, and creates an error profile.
parseErrorAnalysis.py Generates a summary output file for Defined Community Analysis


cluster Perform complete/single/average linkage clustering
dereplication Dereplicate aligned sequences
convert cluster result Convert a .clust file to a BIOM OTU table or R format
convert sequence file Convert GenBank, EMBL, FASTQ, SFF, or STO files to FASTA or unaligned FASTA format
De-align sequences Remove the alignment from a fasta file
calculate distance Compute a distance matrix from aligned or unaligned sequences (limited to 4K sequences)
hadoop distance Calculate distances using the Hadoop distance calculator
modify distance file Dump a binary distance file to flat text or a square matrix
modify merge file Convert a merges file to a newick tree or replay a merge file to create a cluster file
representative sequence Get representative sequences from a cluster file
refresh mappings Remove mapping entries for sequences that were filtered out externally
explode mapping Explode a dereplicated sequence file back to sample replicated files
demultiplex Demultiplex a tab-delimited result file using an id and sample mapping
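
The dereplication and explode-mapping entries above describe inverse operations. The sketch below is a simplified illustration only: it collapses identical sequences into one representative plus an ID mapping, then expands them again. The function names and the in-memory `(id, sequence)` record format are assumptions, not the RDP file formats, and per-sample bookkeeping is left out.

```python
def dereplicate(records):
    """Collapse identical sequences (case-insensitive).

    records: iterable of (seq_id, sequence) pairs.
    Returns (unique, mapping): unique is a list of (representative_id,
    uppercased sequence); mapping lists every member ID per representative.
    """
    rep_for_seq = {}  # uppercased sequence -> ID of the first record seen
    mapping = {}      # representative ID -> list of member IDs
    for seq_id, seq in records:
        rep = rep_for_seq.setdefault(seq.upper(), seq_id)
        mapping.setdefault(rep, []).append(seq_id)
    unique = [(rep, seq) for seq, rep in rep_for_seq.items()]
    return unique, mapping


def explode(unique, mapping):
    """Expand dereplicated records back to one record per original ID."""
    return [(member, seq) for rep, seq in unique for member in mapping[rep]]
```

Keeping the first ID seen as the representative is an arbitrary choice made here for simplicity.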


frame shift correction Produces frameshift-corrected protein and DNA sequences and an optimal global or local protein alignment
build index Build an index for metric indexed search based on the input DNA sequences using global pairwise alignment mode
convert framebot output Generates matrix stats
random sampling Randomly select a subset of sequence IDs from the sample mapping file, with the same number of sequences for each sample
translate Translate sequences from nucleotide to protein at a given reading frame
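
To make the translate entry concrete, here is a minimal sketch of translating a nucleotide sequence in a chosen forward reading frame. It assumes the standard genetic code, numbers frames 1 to 3, drops a trailing incomplete codon, and writes `X` for codons containing ambiguous bases; the tool's own conventions may differ, and the names below are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides, frame=1):
    """Translate DNA or RNA to protein in forward reading frame 1, 2, or 3."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    seq = nucleotides.upper().replace("U", "T")
    start = frame - 1
    # Step through complete codons only; a trailing partial codon is dropped.
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)
```

With this sketch, `translate("ATGGCCTAA")` gives `MA*`, where `*` marks the stop codon.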


probe & primer match Check the given primer/probe against a sequence database (or sequence file) to produce the list of matching sequences.
edit distance The total number of sequence positions that differ between a probe or primer and a sequence, counting all three types of difference: mismatch, insertion, and deletion
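
The edit distance defined above can be illustrated with the classic dynamic-programming (Levenshtein) calculation. This is a sketch of the general technique between two whole strings, with a hypothetical function name; a probe-matching tool would normally also search for the best-matching position of the probe inside a longer sequence, which is not shown here.

```python
def edit_distance(a, b):
    """Minimum number of mismatches, insertions, and deletions turning a into b."""
    # previous[j] holds the distance between the prefix of a processed so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            mismatch_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                  # deletion from a
                current[j - 1] + 1,               # insertion into a
                previous[j - 1] + mismatch_cost,  # match or mismatch
            ))
        previous = current
    return previous[-1]
```

For example, `edit_distance("ACGT", "AGT")` is 1, a single deletion.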


random sampling Randomly select a subset or subregion of sequences
reverse complement sequences Change orientation of DNA/RNA sequences
remove duplicates Remove identical sequences, or sequences that are substrings of others
select-seqs Select or exclude a list of sequences from a sequence file, or split the sequence file into smaller ones of the given size.
convert sequence file format Convert sequences to FASTA or FASTQ format
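
Changing the orientation of a sequence, as in the reverse complement entry above, amounts to complementing each base and reversing the result. The sketch below handles only A, C, G, T/U, and N; any other character, such as an alignment gap or an IUPAC ambiguity code, passes through unchanged, which is a simplification. The function name and the `rna` flag are hypothetical.

```python
# Complement tables; characters not listed are left unchanged by str.translate.
_DNA_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
_RNA_COMPLEMENT = str.maketrans("ACGUNacgun", "UGCANugcan")


def reverse_complement(seq, rna=False):
    """Return the reverse complement of a DNA (default) or RNA sequence."""
    table = _RNA_COMPLEMENT if rna else _DNA_COMPLEMENT
    return seq.translate(table)[::-1]
```

For example, `reverse_complement("ATGC")` gives `GCAT`.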


Initial process Initial processing steps include matching the raw reads to experimental samples, trimming off the tag and primer portions, and removing sequences of low quality


nearest neighbor sequence matching Train SequenceMatch and find the k nearest neighbors using reference sequences


classify Classify one or more bacterial or archaeal 16S rRNA or fungal LSU sequences
library compare Classification with a statistical test to flag taxa differing significantly between libraries
estimate accuracy Cross-validation or leave-one-out (sequence or taxon) accuracy test
merge classify result Merge classification results into a taxon assignment counts file
random sampling Randomly select a subset or subregion of sequences
remove duplicates Remove identical or any sequence contained by another sequence
taxa distance Calculate and plot the similarities between taxa
train Retrain the classifier