
About RDP Web Services


Contents:
  :: RDP Classifier
  :: Seqmatch
  :: Sample SOAP Client (Java)
  :: Sample SOAP Client (Ruby)
  :: Sample SOAP Client (Perl)

The RDP Web Services are an alternative method for accessing RDP tools. These services are intended for programmers who wish to automate the RDP tools or parse their results programmatically. With a few exceptions, these services return the same information presented on the tools' web pages.

Note: The RDP Web Services are actively in development and the interface may change based upon user feedback. If you have any suggestions, please let us know.

These services conform to the SOAP standard. The methods are made available using JAX-WS RI 2.1, and should be interoperable with all languages that have a reasonably modern SOAP stack. If you have any problems accessing these SOAP services using your language of choice, please let us know. The services have been tested using Java, Ruby and Perl.


RDP Classifier


URL: http://rdp.cme.msu.edu/services/classifier
namespace: http://rdp.cme.msu.edu/services/classifier
WSDL: http://rdp.cme.msu.edu/services/classifier?wsdl
XML Schema: http://rdp.cme.msu.edu/services/classifier?xsd=1

There are three web service methods available for the RDP Classifier:

  1. Method Name: classifier

    Description: Classifies the contents of a sequence file using the default confidence value (80%). This method is Document/Literal (not wrapped).

    Input: classifier(String seqFileContents). A string that contains the content of a sequence file in FASTA, GenBank, or EMBL format.

    Returns: A ClassifierResult object defined as...

    public class ClassifierResult {
    
        @XmlSchemaType(name = "dateTime")
        public XMLGregorianCalendar dateRun;
        public int taxonomyVersion;
        public String taxonomyDescription;
        public String error;
        public List<Classification> classification;
    }
      
    public class Classification {
    
        public String queryID;
        public String assignmentStr;
        public List<Assignment> assignment;
    }
      
    public class Assignment {
    
        public int taxid; // See the note on taxids
        public String rank;
        public String name;
        public float confidence;
    }
      

  2. Method Name: classifierWithOptions

    Description: Classifies a list of query objects, using the indicated confidence value. This method is Document/Literal, Wrapped for WS-I compliance.

    Input: classifierWithOptions(List<Query> queries, float confidenceCutoff)

    List<Query> queries: A list of query objects defined as...

    public class Query {
    
        public String bases;
        public String id;
    }
      

    float confidenceCutoff: The classifier confidence value required to assign a sequence to a node. Passed as a decimal fraction; e.g., pass 0.8 for an 80% confidence threshold.

    Returns: The same ClassifierResult object as the method: classifier. Due to WS-I conventions, the ClassifierResult object is wrapped inside a ClassifierWithOptionsResponse object. Please see the WSDL and XML Schema for the exact implementation.

  3. Method Name: fetchTree

    Description: Returns the most current RDP Taxonomy.

    Input: Takes no parameters.

    Returns: a TaxonomyTree object.

    public class TaxonomyTree {
    
        public String versionDescription;
        public int versionNumber;
        public Node node; // Root node of the taxonomy
        public String error;
    }
    
    public class Node {
    
        public String rank;
        public String name;
        public int taxid; // See the note on taxids
        public List<Node> node; // List of child nodes
    }
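
A TaxonomyTree is a plain recursive structure, so a depth-first walk is enough to turn it into a lookup table. The Ruby sketch below is illustrative only: the Struct definitions are local stand-ins that mirror the field names listed above (they are not the objects a SOAP stack returns), the sample tree is made up, and lineages is a hypothetical helper name.

```ruby
# Local stand-ins mirroring the TaxonomyTree and Node fields documented above.
# Assumption: a real SOAP client hands back objects with the same accessors.
TaxonomyTree = Struct.new(:versionDescription, :versionNumber, :node, :error)
Node = Struct.new(:rank, :name, :taxid, :node)

# Depth-first walk that maps each taxid to its semicolon-separated lineage.
# `lineages` is a hypothetical helper, not part of the RDP services.
def lineages(node, path = [], table = {})
  current = path + [node.name]
  table[node.taxid] = current.join(';')
  (node.node || []).each { |child| lineages(child, current, table) }
  table
end

# Made-up sample data; the taxids and names are illustrative only.
genus  = Node.new('genus', 'Bacillus', 3, [])
phylum = Node.new('phylum', 'Firmicutes', 2, [genus])
root   = Node.new('norank', 'Root', 1, [phylum])
tree   = TaxonomyTree.new('sample taxonomy', 1, root, nil)

table = lineages(tree.node)
table.sort.each { |taxid, lineage| puts "#{taxid}\t#{lineage}" }
```

Because the table is keyed by taxid, it is only meaningful together with the versionNumber of the tree it was built from.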
      

About the taxid returned by the classifier web services:

The taxid value must be used with care. The taxid returned is an internal RDP id that is provided for convenience, and is only intended to make parsing results easier. All taxid numbers are regenerated with each new Taxonomy version, meaning a taxid from Taxonomy Version 3 will not point to the same node as the same taxid in Taxonomy Version 4.

If you are using the taxid values, please ensure the Taxonomy version number returned as part of your result object matches the version number you were previously working with. We suggest you do not persist, publish, or share the taxid numbers, since they will quickly become obsolete and invalidate your data.
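
One way to follow this advice is to store the taxonomy version next to any taxids held in memory and to refuse lookups when the version of a new result differs. The Ruby sketch below assumes a hypothetical TaxidCache class and made-up values; nothing in it is provided by the RDP services.

```ruby
# Hypothetical cache of taxid => name pairs, tagged with the taxonomy
# version they were read from. Nothing here is part of the RDP services.
class TaxidCache
  attr_reader :version

  def initialize(version, names)
    @version = version
    @names = names
  end

  # Look up a taxid only if the result came from the same taxonomy version;
  # otherwise raise, because the numbers no longer refer to the same nodes.
  def name_for(taxid, result_version)
    unless result_version == @version
      raise ArgumentError,
            "taxids are from taxonomy version #{@version}, " \
            "but the result uses version #{result_version}"
    end
    @names[taxid]
  end
end

cache = TaxidCache.new(3, { 42 => 'Bacillus' })   # made-up values

puts cache.name_for(42, 3)          # versions match, so the lookup proceeds

begin
  cache.name_for(42, 4)             # version changed, so the lookup is refused
rescue ArgumentError => e
  puts "refused: #{e.message}"
end
```

In a real client, the second argument would come from the taxonomyVersion field of a ClassifierResult or the versionNumber field of a TaxonomyTree.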

Limitations


Seqmatch


URL: http://rdp.cme.msu.edu/services/seqmatch
namespace: http://rdp.cme.msu.edu/services/seqmatch
WSDL: http://rdp.cme.msu.edu/services/seqmatch?wsdl
XML Schema: http://rdp.cme.msu.edu/services/seqmatch?xsd=1

Two web service methods are available for seqmatch:

  1. Method Name: seqmatch

    Description: Runs seqmatch on the provided sequence file using the default options (20 nearest matches, good quality sequences with >1200bp). This method is Document/Literal (not wrapped).

    Input: seqmatch(String seqFileContents). A string that contains the content of a sequence file in FASTA, GenBank, or EMBL format.

    Returns: A SeqmatchResult object defined as...

    public class SeqmatchResult {
    
        @XmlSchemaType(name = "dateTime")
        public XMLGregorianCalendar dateRun;
        public String rdpRelease;
        public List<QueryResult> query;
        public String error;
    }
      
    public class QueryResult {
    
        public String queryId;
        public int queryWordCount;
        public List<Match> match;
    }
      
    public class Match {
    
        public String sid;
        public String definition;
        public double sab;
        public long oligos;
        public String lineageStr;
    }
      

  2. Method Name: seqmatchWithOptions

    Description: Runs seqmatch on a list of query objects using the supplied dataset options and requested number of matches. This method is Document/Literal, Wrapped for WS-I compliance.

    Input: seqmatchWithOptions(List<Query> queries, TaxonomyType taxType, StrainType strainType, SourceType sourceType, SizeType sizeType, QualityType qualityType, int numberOfMatches)

    List<Query> queries: The same list of query objects as the method: classifierWithOptions.

    TaxonomyType taxType: Either rdpHome for the RDP Taxonomy, or ncbiHome for NCBI's taxonomy.

    StrainType strainType: One of {type, nontype, both}.

    SourceType sourceType: One of {environ, isolates, both}.

    SizeType sizeType: One of {ge1200, lt1200, both}.

    QualityType qualityType: One of {good, low, both}.

    int numberOfMatches: An integer indicating how many matches to return per query sequence.

    Returns: The same object returned by the method: seqmatch. Due to WS-I conventions, the SeqmatchResult object is wrapped inside a SeqmatchWithOptionsResponse object. Please see the WSDL and XML Schema for the exact implementation.
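
Since each dataset option accepts only a small set of values, it can help to check them locally before making a call. The Ruby sketch below copies the allowed values from the parameter list above. The helper name check_seqmatch_options, the hash keys, and the rule that numberOfMatches must be positive are assumptions made for illustration; the element names actually sent on the wire should be taken from the WSDL.

```ruby
# Allowed values copied from the parameter list above. The hash keys and the
# helper name `check_seqmatch_options` are illustrative, not part of the service.
SEQMATCH_OPTIONS = {
  :taxType     => %w[rdpHome ncbiHome],
  :strainType  => %w[type nontype both],
  :sourceType  => %w[environ isolates both],
  :sizeType    => %w[ge1200 lt1200 both],
  :qualityType => %w[good low both]
}

# Returns a list of human-readable problems; an empty list means the
# options look acceptable.
def check_seqmatch_options(opts)
  problems = []
  SEQMATCH_OPTIONS.each do |key, allowed|
    value = opts[key]
    unless allowed.include?(value)
      problems << "#{key} must be one of #{allowed.join(', ')} (got #{value.inspect})"
    end
  end
  # Assumption: a useful request asks for at least one match.
  n = opts[:numberOfMatches]
  problems << "numberOfMatches must be a positive integer" unless n.is_a?(Integer) && n > 0
  problems
end

# Option values as used in the Perl sample on this page.
sample_opts = { :taxType => 'rdpHome', :strainType => 'both', :sourceType => 'both',
                :sizeType => 'ge1200', :qualityType => 'good', :numberOfMatches => 20 }

puts check_seqmatch_options(sample_opts).inspect
puts check_seqmatch_options(sample_opts.merge(:sizeType => 'gt1200')).inspect
```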

Limitations


:: Sample SOAP Client (Java)


Download the Sample Java client.

The sample's jar file contains both the class files and the source code. This jar requires the latest versions of the JAXB and JAX-WS reference implementations, which are included with recent versions of Java. Older Java environments will need to download the JAX-WS RI from Sun. (The JAX-WS RI includes the JAXB libraries.)

Most of the Java class files are autogenerated from the SOAP services' WSDL:

/path/to/jaxws-ri/bin/wsimport.sh -s src http://rdp.cme.msu.edu/services/classifier?wsdl
/path/to/jaxws-ri/bin/wsimport.sh -s src http://rdp.cme.msu.edu/services/seqmatch?wsdl

To run the Java client and output tab-delimited text:

seqmatch:     > java -jar RdpWebServices.jar -seqmatch seqs.fasta
classifier:   > java -jar RdpWebServices.jar -classifier seqs.fasta

To output XML:

seqmatch:     > java -jar RdpWebServices.jar -seqmatch -xml seqs.fasta
classifier:   > java -jar RdpWebServices.jar -classifier -xml seqs.fasta


:: Sample SOAP Client (Ruby)


This Ruby code was developed and tested with Ruby 1.8.6 and JRuby 1.1.

Ruby code for accessing the RDP Classifier:

require 'soap/wsdlDriver'
require 'pp'

WSDL_URL = 'http://rdp.cme.msu.edu:80/services/classifier?wsdl'
classifierService = SOAP::WSDLDriverFactory.new(WSDL_URL).create_rpc_driver
# Uncomment the following lines for debugging.
#classifierService.generate_explicit_type = true
#classifierService.wiredump_dev = STDOUT

# seqFile = ARGV[0]
seqFile = "/scratch/rdp_download_4seqs.fa"

# Read the whole sequence file into a single string.
seqStr = File.read(seqFile)

result = classifierService.classifier(seqStr)

# Interpolation avoids a TypeError if a field is not returned as a String.
puts "#RDP Classifier: #{result.taxonomyDescription}"
puts "#Taxonomy Version Number: #{result.taxonomyVersion}"
puts "#Date Run: #{result.dateRun}"

if result.respond_to?('error')
  puts "#Error String: " + result.error
  exit 1
end

result.classification.each { |i| 
  puts i.queryID + "\t" + i.assignmentStr
}
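
Besides assignmentStr, each classification carries the per-rank assignment list, which can be filtered by confidence on the client side. The sketch below uses local Struct stand-ins with the field names of the Classification and Assignment types and made-up sample values. The helper deepest_confident is hypothetical, and it assumes the assignment list is ordered from the root toward the most specific rank.

```ruby
# Local stand-ins mirroring the documented Classification and Assignment fields.
Classification = Struct.new(:queryID, :assignmentStr, :assignment)
Assignment = Struct.new(:taxid, :rank, :name, :confidence)

# Returns the last assignment in the list whose confidence meets the cutoff,
# or nil if none does. Assumes the list runs from the root toward the genus.
# to_f guards against a SOAP stack that returns numbers as strings.
def deepest_confident(classification, cutoff)
  classification.assignment.select { |a| a.confidence.to_f >= cutoff }.last
end

# Made-up sample data for illustration.
sample = Classification.new('seq1', nil, [
  Assignment.new(1, 'domain', 'Bacteria', 1.0),
  Assignment.new(2, 'phylum', 'Firmicutes', 0.95),
  Assignment.new(3, 'genus', 'Bacillus', 0.62)
])

hit = deepest_confident(sample, 0.8)
puts [sample.queryID, hit.rank, hit.name, hit.confidence].join("\t")
```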

Ruby code for accessing the method classifierWithOptions:

# setup the soap driver
require 'soap/wsdlDriver'
WSDL_URL = 'http://rdp.cme.msu.edu:80/services/classifier?wsdl'
classifierService = SOAP::WSDLDriverFactory.new(WSDL_URL).create_rpc_driver
# uncomment the following lines for debugging purposes.
#classifierService.generate_explicit_type = true
#classifierService.wiredump_dev = STDOUT

# define a class that matches the query object expected by the web service method
class Query
  attr_accessor :id, :bases
end
queries = [] # a list of queries

# parse the fasta file into a list of Query objects
query = nil
File.foreach("/scratch/rdp_download_4seqs.fa") do |line|
  if line =~ /^>/
    query = Query.new
    # use the header text, without the leading '>' or the newline, as the id
    query.id = line.sub(/^>/, '').strip
    query.bases = ""
    queries << query
  elsif query
    # drop line breaks so bases holds one continuous sequence
    query.bases += line.strip
  end
end

# call the web service method with the list of queries and a confidence of 70%
response = classifierService.classifierWithOptions(:query => queries, :confidenceCutoff => 0.70)

# By convention, WS-I compliant web services wrap the result object in a response object
result = response.return

# output the results
# Interpolation avoids a TypeError if a field is not returned as a String.
puts "#RDP Classifier: #{result.taxonomyDescription}"
puts "#Taxonomy Version Number: #{result.taxonomyVersion}"
puts "#Date Run: #{result.dateRun}"

if result.respond_to?('error')
  puts "#Error String: " + result.error
  exit 1
end

result.classification.each { |i|
  puts i.queryID + "\t" + i.assignmentStr
}

Ruby code for accessing Seqmatch:

require 'soap/wsdlDriver'
require 'pp'

WSDL_URL = 'http://rdp.cme.msu.edu:80/services/seqmatch?wsdl'
seqmatchService = SOAP::WSDLDriverFactory.new(WSDL_URL).create_rpc_driver
# Uncomment the following lines for debugging.
#seqmatchService.generate_explicit_type = true
#seqmatchService.wiredump_dev = STDOUT

# seqFile = ARGV[0]
seqFile = "/scratch/rdp_download_4seqs.fa"

# Read the whole sequence file into a single string.
seqStr = File.read(seqFile)

result = seqmatchService.seqmatch(seqStr)

puts "#RDP Release: #{result.rdpRelease}"
puts "#Date Run: #{result.dateRun}"

if result.respond_to?('error')
  puts "#Error String: " + result.error
  exit 1
end

result.query.each { |i| 
  puts "#{i.queryId}\t#{i.queryWordCount}"
  i.match.each { |j|
    puts "\t#{j.sab}\t#{j.oligos}\t#{j.sid}\t#{j.definition}\t#{j.lineageStr}"
  }
}
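
A common follow-up step is to keep only the closest match for each query. The sketch below uses local Struct stand-ins with the field names of the QueryResult and Match types and made-up values. The helper best_match is hypothetical, and it assumes that a higher sab score indicates a closer match.

```ruby
# Local stand-ins mirroring the documented QueryResult and Match fields.
QueryResult = Struct.new(:queryId, :queryWordCount, :match)
Match = Struct.new(:sid, :definition, :sab, :oligos, :lineageStr)

# Returns the match with the highest sab score, or nil if there are none.
# to_f guards against a SOAP stack that returns numbers as strings.
def best_match(query_result)
  (query_result.match || []).max_by { |m| m.sab.to_f }
end

# Made-up sample data for illustration.
sample = QueryResult.new('seq1', 1400, [
  Match.new('S000000001', 'made-up match A', 0.871, 1350, 'Root;Bacteria'),
  Match.new('S000000002', 'made-up match B', 0.932, 1420, 'Root;Bacteria')
])

top = best_match(sample)
puts [sample.queryId, top.sid, top.sab].join("\t")
```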


:: Sample SOAP Client (Perl)


The following Perl code requires the module SOAP::Lite. This code was tested with Perl 5.8.8 and SOAP::Lite 0.71.

Thanks to Mike Coyne of Channing Laboratory at Harvard for help writing this Perl client.

Perl code for accessing the method seqmatchWithOptions:

#!/usr/bin/perl -w
use strict;
use SOAP::Lite; # or "use SOAP::Lite +trace => 'debug'" for debugging information

my (%seqs, $id, $comment, @order);

die "No fasta filename provided!\n" unless ($ARGV[0]);
die "No file named $ARGV[0] found!\n" unless (-e $ARGV[0]);

# read a fasta file into a hash: id as key, bases as value;
open (FASTA, $ARGV[0]) or die "Can't open fasta input file: $!\n";

 while (<FASTA>) {
    if (/^>\S+/) {
        ($id, $comment) = (split /\s+/, $_, 2);
        $id =~ s/^>//;
        chomp ($comment);
        $comment = ' ' unless ($comment);
        $seqs{$id}{comment} = $comment;
        push (@order, $id);
    } else {
        chomp;
        $seqs{$id}{seq} .= $_;
    }
}

# create the parameters to pass to the SOAP service.
my @data;
foreach my $id (@order) {
    my $bases = $seqs{$id}{seq};
    push @data, SOAP::Data->name("query" => \SOAP::Data->value(
        SOAP::Data->name("bases" => $bases)->type(''),
        SOAP::Data->name("id" => $id)->type('')));
}
push @data, SOAP::Data->name("strainType" => 'both')->type('');      # type, nontype, or both
push @data, SOAP::Data->name("sourceType" => 'both')->type('');      # environ, isolates, or both
push @data, SOAP::Data->name("sizeType" => 'ge1200')->type('');      # ge1200, lt1200, or both
push @data, SOAP::Data->name("qualityType" => 'good')->type('');     # good, low, or both
push @data, SOAP::Data->name("taxonomyType" => 'rdpHome')->type(''); # rdpHome or ncbiHome
push @data, SOAP::Data->name("numberOfResults" => '20')->type('');   # integer = matches to return per query

# create the soap object.
my $soap = SOAP::Lite->uri('http://rdp.cme.msu.edu/services/seqmatch')
    ->proxy('http://rdp.cme.msu.edu/services/seqmatch');

# call the soap web service method.
my $result = $soap->call(
    SOAP::Data->name('n1:seqmatchWithOptions')
        ->attr({'xmlns:n1' => 'http://rdp.cme.msu.edu/services/seqmatch'}) => @data)
    ->paramsin;

print "#RDP Seqmatch Results\n";
print "#RDP Release: " . $result->{'rdpRelease'} . "\n";
print "#Date Run: " . $result->{'dateRun'} . "\n";
print "\n";
print "#Query:\tqueryID\tqueryWordCount\n";
print "#Query description:\tqueryDefinition\n";
print "#Match:\tsab\toligos\tsid\tdefinition\tlineageString\n";
print "\n";

# SOAP::Lite will return an array ref if there is more than one result, otherwise just the object itself.
if (ref($result->{'query'}) eq 'ARRAY') {
    foreach my $query (@{$result->{'query'}}) {
        print "Query:\t" . $query->{'queryId'} . "\t" . $query->{'queryWordCount'} . "\n";
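        # look up the comment by this query's own id (assumes queryId echoes the submitted id)
        print "Query description: " . $seqs{$query->{'queryId'}}{comment} . "\n"
            if (($seqs{$query->{'queryId'}}{comment} || '') =~ /\w+/);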
        if (ref($query->{'match'}) eq 'ARRAY') {
            foreach my $match (@{$query->{'match'}}) {
                print "Match:\t" . $$match{'sab'} . "\t" . $$match{'oligos'}
                    . "\t" . $$match{'sid'} . "\t" . $$match{'definition'}
                    . "\t" . $$match{'lineageStr'} . "\n";
            }
        } else {
            print "Match:\t" . $query->{'match'}{'sab'} . "\t" . $query->{'match'}{'oligos'}
                . "\t" . $query->{'match'}{'sid'} . "\t" . $query->{'match'}{'definition'}
                . "\t" . $query->{'match'}{'lineageStr'} . "\n";
        }
        print "\n";
    }
} else {
    print "Query:\t" . $result->{'query'}{'queryId'} . "\t" . $result->{'query'}{'queryWordCount'} . "\n";
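    # look up the comment by this query's own id (assumes queryId echoes the submitted id)
    print "Query description: " . $seqs{$result->{'query'}{'queryId'}}{comment} . "\n"
        if (($seqs{$result->{'query'}{'queryId'}}{comment} || '') =~ /\w+/);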
    if (ref($result->{'query'}{'match'}) eq 'ARRAY') {
        foreach my $match (@{$result->{'query'}{'match'}}) {
            print "Match:\t" . $$match{'sab'} . "\t" . $$match{'oligos'}
                . "\t" . $$match{'sid'} . "\t" . $$match{'definition'}
                . "\t" . $$match{'lineageStr'} . "\n";
        }
    } else {
        print "Match:\t" . $result->{'query'}{'match'}{'sab'} . "\t" . $result->{'query'}{'match'}{'oligos'}
            . "\t" . $result->{'query'}{'match'}{'sid'} . "\t" . $result->{'query'}{'match'}{'definition'}
            . "\t" . $result->{'query'}{'match'}{'lineageStr'} . "\n";
    }
}

 

Questions/comments: rdpstaff@msu.edu
Creative Commons License: Attribution-ShareAlike
