This package exports the ExecScan method, which can be used to perform a BLAST or pattern scan from a FASTA or pattern file against one or more genomes. The output from the method is a list of 5-tuples. The first element is a location object for the matching region in the input. The second element is a location object for the matching region in the target genome. This could be a real contig-based location or it could be a location inside an identified feature. The third element is the P-score of the match. For scans, this will always be 0. The fourth and fifth elements are the alignment length and the bit score. For scans, the alignment length will be the entire length of the match and the bit score will be 0.
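As a rough illustration of the tuple layout described above, the following sketch unpacks each 5-tuple into named variables. The data is made up, the helper name format_hit is hypothetical, and the locations are shown as placeholder strings rather than real location objects.

```perl
use strict;
use warnings;

# Hypothetical results shaped like the documented 5-tuples. The location
# elements are placeholder strings here; the real elements are location objects.
my @sims = (
    ['inputLoc1', 'targetLoc1', 1e-20, 60, 118],   # BLAST-style hit
    ['inputLoc2', 'targetLoc2', 0,     40, 0],     # scan-style hit: P-score 0, bit score 0
);

# Turn one tuple into a printable line (hypothetical helper).
sub format_hit {
    my ($hit) = @_;
    my ($inputLoc, $targetLoc, $pScore, $alignLen, $bitScore) = @$hit;
    return sprintf("%s -> %s (p=%g, len=%d, bits=%s)",
                   $inputLoc, $targetLoc, $pScore, $alignLen, $bitScore);
}

print format_hit($_), "\n" for @sims;
```

The bit score is formatted with %s rather than %d because some tools return matching text in that position.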
The tool table is a hash that provides useful information about each BLAST tool. It maps each tool name to a hash reference. The fields in each of these hashes are as follows.
Type of database against which the tool runs: prot for a protein database and
dna for a DNA database.
Execution string for the tool. The variable $seqFile is presumed to be the location
of the input sequence, $db is the directory, and $options are the user-specified options.
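A table of this shape might look like the sketch below. The key names type and exec, the two entries, and the build_command helper are all assumptions made for illustration; the package's real field names and command strings may differ. The placeholders are filled in here with a simple substitution, which is only one possible way to expand the execution string.

```perl
use strict;
use warnings;

# Hypothetical tool table. The keys "type" and "exec" and both entries are
# illustrative assumptions, not the package's actual contents.
my %toolTable = (
    blastp => { type => 'prot', exec => 'blastall -p blastp -i $seqFile -d $db $options' },
    blastn => { type => 'dna',  exec => 'blastall -p blastn -i $seqFile -d $db $options' },
);

# Fill the $name placeholders of a tool's execution string from a hash of
# values. Unknown placeholders are replaced by an empty string.
sub build_command {
    my ($tool, %vars) = @_;
    my $entry = $toolTable{$tool} or die "Unknown tool: $tool\n";
    (my $command = $entry->{exec}) =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : ''/ge;
    return $command;
}

my $cmd = build_command('blastp',
                        seqFile => 'query.fasta',
                        db      => '/data/db/prot',
                        options => '-e 1e-5');
print "$cmd\n";
```

The execution strings are single-quoted so that Perl leaves the placeholders untouched until build_command expands them.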
my @sims = ExecScan($fig, $seqFile, \@genomes, $tool, $options);
Call BLAST or SCAN to search for DNA sequences or features.
A FIG-like object for accessing the data store.
Name of a file containing the input sequence. This will either be a FASTA or a scan pattern.
A list of the IDs for the target genomes of the search.
Name of the tool to use.
Options to pass to the tool, formatted for the command line.
Returns a list of 5-tuples, each consisting of a location from the input, a location in one of the target genomes or features, a match score (with 0 being the best), the alignment length, and the bit score. For some tools, the bit score will be replaced by the matching text.
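Because a match score of 0 is the best, callers will often want to filter and order the returned list. The sketch below does this on stand-in data; the helper best_hits and the cutoff value are hypothetical, and a real program would obtain @sims from ExecScan instead.

```perl
use strict;
use warnings;

# Stand-in data shaped like ExecScan output. A real call would be
#   my @sims = ExecScan($fig, $seqFile, \@genomes, $tool, $options);
my @sims = (
    ['inputLoc1', 'targetLoc1', 1e-5, 50, 90],
    ['inputLoc2', 'targetLoc2', 0,    60, 120],
    ['inputLoc3', 'targetLoc3', 0.5,  20, 15],
);

# Keep hits whose match score is at or below the cutoff and order them
# best (lowest score) first. Hypothetical helper, not part of the package.
sub best_hits {
    my ($cutoff, @hits) = @_;
    return sort { $a->[2] <=> $b->[2] } grep { $_->[2] <= $cutoff } @hits;
}

my @kept = best_hits(1e-3, @sims);
print join(", ", map { $_->[0] } @kept), "\n";
```

The sketch sorts on the third element only, since the fifth element may hold matching text rather than a number for some tools.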
my $newName = CallScanner::Canonize($name, $genomeID);
If the specified name is a contig ID, ensure it has a genome ID in front of it.
Name to fix up.
ID of the genome to be added to the contig ID, if necessary.
Returns a fixed-up name.
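One way such a fix-up could work is sketched below. It rests on two assumptions that the text above does not confirm: that feature IDs begin with "fig|", and that a qualified contig ID has the form genomeID:contigID. The function name canonize_sketch is hypothetical and is not the package's Canonize.

```perl
use strict;
use warnings;

# Sketch only. Assumes feature IDs start with "fig|" and that a qualified
# contig ID looks like "genomeID:contigID"; neither rule is confirmed by
# the documentation above.
sub canonize_sketch {
    my ($name, $genomeID) = @_;
    return $name if $name =~ /^fig\|/;   # feature ID: leave unchanged
    return $name if $name =~ /:/;        # contig ID that already has a prefix
    return "$genomeID:$name";            # bare contig ID: add the genome ID
}

print canonize_sketch('NC_000913', '83333.1'), "\n";
```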
CallScanner::VerifyDB($db, $type);
Verify that the specified FASTA file has BLAST databases. If the databases do not exist, they will be created. If they are older than the FASTA file, they will be regenerated.
Name of the FASTA file.
Type of database desired: prot for protein and dna for DNA.
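The freshness check described above can be sketched as a comparison of file modification times. The sketch assumes the index files are the ones written by the legacy formatdb tool (phr, pin, psq for protein; nhr, nin, nsq for DNA) and that they sit next to the FASTA file. The function name db_needs_rebuild is hypothetical, and the sketch only reports whether a rebuild is needed without running any formatter.

```perl
use strict;
use warnings;

# Index file suffixes written by the legacy formatdb tool, per database type.
my %suffixes = (
    prot => [qw(phr pin psq)],
    dna  => [qw(nhr nin nsq)],
);

# Return 1 if any index file is missing or older than the FASTA file,
# otherwise 0. Hypothetical helper; it does not rebuild anything itself.
sub db_needs_rebuild {
    my ($db, $type) = @_;
    my $list = $suffixes{$type} or die "Unknown database type: $type\n";
    my $fastaTime = (stat $db)[9];
    die "Cannot stat $db\n" unless defined $fastaTime;
    for my $suffix (@$list) {
        my $indexTime = (stat "$db.$suffix")[9];   # undef if the file is missing
        return 1 if !defined $indexTime || $indexTime < $fastaTime;
    }
    return 0;
}
```

A caller would run the database formatter only when this function returns true. Multi-volume databases, whose index files carry extra numeric suffixes, are not handled by this sketch.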