Title: | Visually Assessing the Specificity and Informativeness of Primer Pairs |
---|---|
Description: | Identifies potential target sequences for a given set of primers and generates phylogenetic trees annotated with the taxonomies of the predicted amplification products. |
Authors: | Jim Hester [aut], Matt Cannon [aut, cre] |
Maintainer: | Matt Cannon <[email protected]> |
License: | GPL-2 |
Version: | 1.0.7 |
Built: | 2025-02-04 04:55:07 UTC |
Source: | https://github.com/mvesuviusc/primertree |
Maps a nucleotide database accession to a taxonomy database taxId
accession2taxid(accessions)
accession2taxid(accessions)
accessions |
accessions character vector to lookup. |
named vector of taxIds.
PrimerTree results for the bryophyte trnL primers
Summarize pairwise differences.
calc_rank_dist_ave(x, ranks = common_ranks)
calc_rank_dist_ave(x, ranks = common_ranks)
x |
a primerTree object |
ranks |
ranks to show unique counts for, defaults to the common ranks |
The purpose of this function is to calculate the average number of nucleotide differences between species within each taxa of given taxonomic level.
For example, at the genus level, the function calculates the average number of nucleotide differences between all species within each genus and reports the mean of those values.
There are several key assumptions and calculations made in this function.
First, the function randomly selects one sequence from each species in the primerTree results. This is to keep any one species (e.g. human, cow, etc.) with many hits from skewing the results.
Second, for each taxonomic level tested, the function divides the sequences by each taxon at that level and calculates the mean number of nucleotide differences within that taxa, then returns the mean of those values.
Third, when calculating the average distance, any taxa for which there is only one species is omitted, as the number of nucleotide differences will always be 0.
returns a data frame of results
## Not run: calc_rank_dist_ave(mammals_16S) calc_rank_dist_ave(bryophytes_trnL) # Note that the differences between the results from these two primers # the mean nucleotide differences is much higher for the mammal primers # than the byrophyte primers. This suggests that the mammal primers have # better resolution to distinguish individual species. ## End(Not run)
## Not run: calc_rank_dist_ave(mammals_16S) calc_rank_dist_ave(bryophytes_trnL) # Note that the differences between the results from these two primers # the mean nucleotide differences is much higher for the mammal primers # than the byrophyte primers. This suggests that the mammal primers have # better resolution to distinguish individual species. ## End(Not run)
Calls clustal omega to align a set of sequences of class DNAbin. Run without any arguments to see all the options you can pass to the command line clustal omega.
clustalo(x, exec = "clustalo", quiet = TRUE, original.ordering = TRUE, ...)
clustalo(x, exec = "clustalo", quiet = TRUE, original.ordering = TRUE, ...)
x |
an object of class 'DNAbin' |
exec |
a character string with the name or path to the program |
quiet |
whether to supress output to stderr or stdout |
original.ordering |
use the original ordering of the sequences |
... |
additional arguments passed to the command line clustalo |
Filter out sequences retrieved by search_primer_pair() that are either too short or too long. The alignment and tree will be recalculated after removing unwanted reads.
filter_seqs(x, ...) ## S3 method for class 'primerTree' filter_seqs(x, min_length = 0, max_length = Inf, ...)
filter_seqs(x, ...) ## S3 method for class 'primerTree' filter_seqs(x, min_length = 0, max_length = Inf, ...)
x |
a primerTree object |
... |
additional arguments passed to methods. |
min_length |
the minimum sequence length to keep |
max_length |
the maximum sequence length to keep |
a primerTree object
filter_seqs(primerTree)
: Method for primerTree objects
## Not run: # filter out sequences longer or shorter than desired: mammals_16S_filtered <- filter_seqs(mammals_16S, min_length=131, max_length=156) ## End(Not run)
## Not run: # filter out sequences longer or shorter than desired: mammals_16S_filtered <- filter_seqs(mammals_16S, min_length=131, max_length=156) ## End(Not run)
Retrieves a fasta sequence from NCBI nucleotide database.
get_sequence( accession, start = NULL, stop = NULL, api_key = Sys.getenv("NCBI_API_KEY") )
get_sequence( accession, start = NULL, stop = NULL, api_key = Sys.getenv("NCBI_API_KEY") )
accession |
nucleotide accession to retrieve. |
start |
start base to retrieve, numbered beginning at 1. If NULL the beginning of the sequence. |
stop |
last base to retrieve, numbered beginning at 1. if NULL the end of the sequence. |
api_key |
NCBI api-key to allow faster sequence retrieval. |
an DNAbin object.
Retrieves fasta sequences from NCBI nucleotide database.
get_sequences( accession, start = NULL, stop = NULL, api_key = Sys.getenv("NCBI_API_KEY"), simplify = TRUE, .parallel = FALSE, .progress = "none" )
get_sequences( accession, start = NULL, stop = NULL, api_key = Sys.getenv("NCBI_API_KEY"), simplify = TRUE, .parallel = FALSE, .progress = "none" )
accession |
the accession number of the sequence to retrieve |
start |
start bases to retrieve, numbered beginning at 1. If NULL the beginning of the sequence. |
stop |
stop bases to retrieve, numbered beginning at 1. if NULL the stop of the sequence. |
api_key |
NCBI api-key to allow faster sequence retrieval. |
simplify |
simplify the FASTA headers to include only the genbank accession. |
.parallel |
if 'TRUE', perform in parallel, using parallel backend provided by foreach |
.progress |
name of the progress bar to use, see 'create_progress_bar' |
an DNAbin object.
Retrieve the taxonomy information from NCBI for a set of nucleotide gis.
get_taxonomy(accessions)
get_taxonomy(accessions)
accessions |
a character vector of the accessions to retrieve |
data.frame of the 'accessions, taxIds, and taxonomy
identify the point closest to the mouse click only works on single ranks
## S3 method for class 'primerTree_plot' identify(x, ...)
## S3 method for class 'primerTree_plot' identify(x, ...)
x |
the plot to identify |
... |
additional arguments passed to annotate |
plot_tree
layout a tree using ape, return an object to be plotted by
plot_tree
layout_tree_ape(tree, ...)
layout_tree_ape(tree, ...)
tree |
The |
... |
additional arguments to |
edge |
list of x, y and xend, yend coordinates as well as ids for the edges |
tips |
list of x, y, label and id for the tips |
nodes |
list of x, y and id for the nodes |
PrimerTree results for the mammalian 16S primers
Parse the primer hits
parse_primer_hits(response)
parse_primer_hits(response)
response |
a httr response object obtained from |
plots a tree, optionally with colored and labeled points by taxonomic rank
plot_tree( tree, type = "unrooted", main = NULL, guide_size = NULL, rank = NULL, taxonomy = NULL, size = 2, legend_cutoff = 25, ... )
plot_tree( tree, type = "unrooted", main = NULL, guide_size = NULL, rank = NULL, taxonomy = NULL, size = 2, legend_cutoff = 25, ... )
tree |
to be plotted, use layout_tree to layout tree. |
type |
The type of tree to plot, default unrooted. |
main |
An optional title for the plot |
guide_size |
The size of the length guide. If NULL auto detects a reasonable size. |
rank |
The rank to include, if null only the tree is plotted |
taxonomy |
A data.frame with an accession field corresponding to the tree tip labels. |
size |
The size of the colored points |
legend_cutoff |
The number of different taxa names after which the names are no longer printed. |
... |
additional arguments passed to |
plot to be printed.
plots a tree along with a series of taxonomic ranks
plot_tree_ranks( tree, taxonomy, main = NULL, type = "unrooted", ranks = common_ranks, size = 2, guide_size = NULL, legend_cutoff = 25, ... )
plot_tree_ranks( tree, taxonomy, main = NULL, type = "unrooted", ranks = common_ranks, size = 2, guide_size = NULL, legend_cutoff = 25, ... )
tree |
to be plotted, use layout_tree to layout tree. |
taxonomy |
A data.frame with an accession field corresponding to the tree tip labels. |
main |
An optional title for the plot |
type |
The type of tree to plot, default unrooted. |
ranks |
The ranks to include, defaults to all common ranks, if null print all ranks. |
size |
The size of the colored points |
guide_size |
The size of the length guide. If NULL auto detects a reasonable size. |
legend_cutoff |
The number of different taxa names after which the names are no longer printed. |
... |
additional arguments passed to |
plot_tree
to plot only a single rank or the just the
tree layout.
library(gridExtra) library(directlabels) #plot all the common ranks plot_tree_ranks(mammals_16S$tree, mammals_16S$taxonomy) #plot specific ranks, with a larger dot size plot_tree_ranks(mammals_16S$tree, mammals_16S$taxonomy, ranks=c('kingdom', 'class', 'family'), size=3)
library(gridExtra) library(directlabels) #plot all the common ranks plot_tree_ranks(mammals_16S$tree, mammals_16S$taxonomy) #plot specific ranks, with a larger dot size plot_tree_ranks(mammals_16S$tree, mammals_16S$taxonomy, ranks=c('kingdom', 'class', 'family'), size=3)
plot function for a primerTree object, calls plot_tree_ranks
## S3 method for class 'primerTree' plot(x, ranks = NULL, main = NULL, ...)
## S3 method for class 'primerTree' plot(x, ranks = NULL, main = NULL, ...)
x |
primerTree object to plot |
ranks |
The ranks to include, defaults to all common ranks, if NULL print all ranks. If 'none' just print the layout. |
main |
an optional title to display, if NULL displays the name as the title |
... |
additional arguments passed to plot_tree_ranks |
library(gridExtra) library(directlabels) #plot with all common ranks plot(mammals_16S) #plot only the class plot(mammals_16S, 'class') #plot the layout only plot(mammals_16S, 'none')
library(gridExtra) library(directlabels) #plot with all common ranks plot(mammals_16S) #plot only the class plot(mammals_16S, 'class') #plot the layout only plot(mammals_16S, 'none')
ambiguity codes, enumerate all possible combinations and combine the results.
primer_search( forward, reverse, num_aligns = 500, num_permutations = 25, ..., .parallel = FALSE, .progress = "none" )
primer_search( forward, reverse, num_aligns = 500, num_permutations = 25, ..., .parallel = FALSE, .progress = "none" )
forward |
forward primer to search by 5'-3' on plus strand |
reverse |
reverse primer to search by 5'-3' on minus strand |
num_aligns |
number of alignment results to keep |
num_permutations |
the number of primer permutations to search, if the degenerate bases cause more than this number of permutations to exist, this number will be sampled from all possible permutations. |
... |
additional arguments passed to Primer-Blast |
.parallel |
if 'TRUE', perform in parallel, using parallel backend provided by foreach |
.progress |
name of the progress bar to use, see 'create_progress_bar' |
httr response object of the query, pass to parse_primer_hits
to parse the results.
primerTree
has two main commands:
search_primer_pair
which takes a primer pair and returns an
primerTree object of the search results
plot.primerTree
a S3 method for plotting the primerTree object
obtained using search_primer_pair
Automatic primer searching Search a given primer pair, retrieving the alignment results, their product sequences, the taxonomic information for the sequences, a multiple alignment of the products
search_primer_pair( forward, reverse, name = NULL, num_aligns = 500, num_permutations = 25, simplify = TRUE, clustal_options = list(), distance_options = list(model = "N", pairwise.deletion = T), api_key = Sys.getenv("NCBI_API_KEY"), ..., .parallel = FALSE, .progress = "none" )
search_primer_pair( forward, reverse, name = NULL, num_aligns = 500, num_permutations = 25, simplify = TRUE, clustal_options = list(), distance_options = list(model = "N", pairwise.deletion = T), api_key = Sys.getenv("NCBI_API_KEY"), ..., .parallel = FALSE, .progress = "none" )
forward |
forward primer to search by 5'-3' on plus strand |
reverse |
reverse primer to search by 5'-3' on minus strand |
name |
name to give to the primer pair |
num_aligns |
number of alignment results to keep |
num_permutations |
the number of primer permutations to search, if the degenerate bases cause more than this number of permutations to exist, this number will be sampled from all possible permutations. |
simplify |
use simple names for primer hit results or complex |
clustal_options |
a list of options to pass to clustal omega, see
|
distance_options |
a list of options to pass to dist.dna, see
|
api_key |
NCBI api-key to allow faster sequence retrieval |
... |
additional arguments passed to Primer-Blast |
.parallel |
if 'TRUE', perform in parallel, using parallel backend provided by foreach |
.progress |
name of the progress bar to use, see
|
A list with the following elements,
name |
name of the primer pair |
BLAST_result |
html blast results from Primer-BLAST as
'a |
object.
taxonomy |
taxonomy for the primer products from NCBI |
sequence |
sequence of the primer products |
alignment |
multiple alignment of the primer products |
tree |
phylogenetic tree of the reconstructed from the 'multiple alignment |
## Not run: #simple search mammals_16S = search_primer_pair(name='Mammals 16S', 'CGGTTGGGGTGACCTCGGA', 'GCTGTTATCCCTAGGGTAACT') #returning 1000 alignments, allow up to 3 mismatches in primer mammals_16S = search_primer_pair(name='Mammals 16S', 'CGGTTGGGGTGACCTCGGA', 'GCTGTTATCCCTAGGGTAACT', num_aligns=1000, total_primer_specificity_mismatch=3) ## End(Not run)
## Not run: #simple search mammals_16S = search_primer_pair(name='Mammals 16S', 'CGGTTGGGGTGACCTCGGA', 'GCTGTTATCCCTAGGGTAACT') #returning 1000 alignments, allow up to 3 mismatches in primer mammals_16S = search_primer_pair(name='Mammals 16S', 'CGGTTGGGGTGACCTCGGA', 'GCTGTTATCCCTAGGGTAACT', num_aligns=1000, total_primer_specificity_mismatch=3) ## End(Not run)
Get a summary of sequence lengths from a primerTree object
seq_lengths(x, summarize = TRUE)
seq_lengths(x, summarize = TRUE)
x |
a primerTree object. |
summarize |
a logical indicating if a summary should be displayed |
a table of sequence length frequencies
# Show the counts for each length seq_lengths(mammals_16S) # Plot the distribution of lengths seqLengths <- seq_lengths(mammals_16S) barplot(seqLengths, main = "Frequency of sequence lengths for 16S mammal primers", xlab="Amplicon length (in bp)", ylab=("Frequency"))
# Show the counts for each length seq_lengths(mammals_16S) # Plot the distribution of lengths seqLengths <- seq_lengths(mammals_16S) barplot(seqLengths, main = "Frequency of sequence lengths for 16S mammal primers", xlab="Amplicon length (in bp)", ylab=("Frequency"))
Method for primerTree objects
## S3 method for class 'primerTree' seq_lengths(x, summarize = TRUE)
## S3 method for class 'primerTree' seq_lengths(x, summarize = TRUE)
x |
a primerTree object. |
summarize |
a logical indicating if a summary should be displayed |
Summarize a primerTree result, printing quantiles of sequence length and pairwise differences.
## S3 method for class 'primerTree' summary(object, ..., probs = c(0, 0.05, 0.5, 0.95, 1), ranks = common_ranks)
## S3 method for class 'primerTree' summary(object, ..., probs = c(0, 0.05, 0.5, 0.95, 1), ranks = common_ranks)
object |
the primerTree object to summarise |
... |
Ignored options |
probs |
quantile probabilities to compute, defaults to 0, 5, 50, 95, and 100 probabilities. |
ranks |
ranks to show unique counts for, defaults to the common ranks |
invisibly returns a list containing the printed results
Construct a neighbor joining tree from a dna alignment
tree_from_alignment(dna, pairwise.deletion = TRUE, ...)
tree_from_alignment(dna, pairwise.deletion = TRUE, ...)
dna |
fasta dna object the tree is to be constructed from |
pairwise.deletion |
a logical indicating if the distance matrix should be constructed using pairwise deletion |
... |
furthur arguments to dist.dna |