Insertions and deletions cause frameshifts when DNA sequences are translated to protein sequences. RDP FrameBot detects and corrects these frameshift errors. Given a query DNA read and a set of known protein sequences, FrameBot compares each target protein sequence to the query DNA sequence in both forward and reverse directions, and produces frameshift-corrected protein and DNA sequences along with an optimal global-local protein pairwise alignment.
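To illustrate why a single insertion or deletion matters, the sketch below translates a short read before and after one base is deleted. The read is a made-up example and `translate` is a hypothetical helper built on the standard genetic code; it is not part of FrameBot.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA starting at position 0, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


original = "ATGGCTAAAGGTGAACTG"          # hypothetical error-free read
deleted = original[:4] + original[5:]    # same read with one base missing

print(translate(original))  # MAKGEL
print(translate(deleted))   # MVKVN
```

Every codon downstream of the deletion is read in the wrong frame, so the translated protein no longer resembles the true one.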
FrameBot extends a dynamic programming algorithm proposed by Guan et al. (1996), "Alignments of DNA and protein sequences containing frameshift errors," Comput. Appl. Biosci. 12:31-40. In summary, FrameBot:
- Requires a set of target protein sequences
- Checks both forward and reverse directions of the query DNA
- Produces an optimal alignment between the query DNA and the target protein sequences in the presence of frameshifts
- Returns the frameshift-corrected protein and DNA query sequences
- Reports the protein pairwise alignment with the best score
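Checking both directions means the read is considered as given and as its reverse complement. The sketch below lists the six candidate reading frames of a read. It only shows what "both directions" covers; it is not FrameBot's implementation, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase ACGT string."""
    return dna.translate(COMPLEMENT)[::-1]


def six_frames(dna):
    """Yield (strand, offset, codon-aligned subsequence) for all six frames."""
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            usable = len(seq) - offset
            # Trim so the subsequence length is a multiple of three.
            yield strand, offset, seq[offset:offset + usable - usable % 3]


for strand, offset, frame in six_frames("ATGGCCA"):
    print(strand, offset, frame)
```

A frameshift-aware aligner does not have to commit to one of these frames for the whole read, because an insertion or deletion can move the correct frame partway through the sequence.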
You can adjust the length cutoff (applied after alignment) and the percent identity cutoff to filter out non-target reads. FrameBot has been tested and pre-configured for several important functional genes, including nitrogenase reductase (nifH), butyryl-CoA transferase (but), butyrate kinase (buk), dioxin/dibenzofuran dioxygenase (dxnA/dbfA1), dibenzofuran dioxygenase (dbfA2), carbazole dioxygenase (carA), cytochrome P-450 (p450), alkane hydroxylase B (alkB), and biphenyl dioxygenase (bphA).
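As a rough illustration of how the two cutoffs act together, the sketch below filters a gapped protein alignment by aligned length and percent identity. The identity definition (matches over columns where neither sequence has a gap), the function names, and the cutoff values are all assumptions made for this example; FrameBot's own calculation may differ.

```python
def percent_identity(aligned_query, aligned_target):
    """Percent of gap-free alignment columns where the residues match."""
    pairs = [
        (q, t)
        for q, t in zip(aligned_query, aligned_target)
        if q != "-" and t != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(q == t for q, t in pairs)
    return 100.0 * matches / len(pairs)


def passes_filter(aligned_query, aligned_target, min_length, min_identity):
    """Keep a read only if it meets both the length and identity cutoffs."""
    aligned_length = sum(1 for q in aligned_query if q != "-")
    identity = percent_identity(aligned_query, aligned_target)
    return aligned_length >= min_length and identity >= min_identity


# Hypothetical alignment and cutoffs, chosen only to show the behavior.
print(passes_filter("MAK-EL", "MAKGDL", min_length=5, min_identity=75.0))
print(passes_filter("MAK-EL", "MAKGDL", min_length=6, min_identity=75.0))
```

Raising either cutoff removes more non-target reads, at the cost of also discarding short or divergent reads that do belong to the gene of interest.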
If your gene is not in the drop-down list, you need to provide your own set of protein target sequences. FrameBot is computationally intensive: because it performs an all-against-all comparison between the query DNA reads and the target protein sequences, we recommend using no more than 200 protein target sequences.