% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/blast.R
\name{blastClassify16S}
\alias{blastClassify16S}
\title{Classifying 16S sequences using BLAST}
\usage{
blastClassify16S(sequence, bdb)
}
\arguments{
\item{sequence}{Character vector of 16S sequences to classify.}

\item{bdb}{Name of BLAST database, see \code{\link{blastDbase16S}}.}
}
\value{
A \code{data.frame} with two columns: \code{Taxon} is the predicted taxon for each \code{sequence}
and \code{Identity} is the corresponding identity value. If a sequence has no BLAST hit, it is
classified as \code{"unclassified"}.
}
\description{
Taxonomic classification of 16S sequences based on BLAST.
}
\details{
A vector of 16S sequences (DNA) is classified by first running BLAST \code{blastn} against
a database of 16S DNA sequences, and then assigning each query the taxon of its nearest neighbour.
The nearest neighbour of a query sequence is the hit with the largest bitscore. The BLAST+
software (\url{https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs&DOC_TYPE=Download})
must be installed on the system. To check the installation, run \code{system("blastn -help")}
in the R console; the BLAST help text should appear.

The database must contain 16S sequences where the Header starts with a token specifying the taxon.
More specifically, the tokens must look like:

\code{<taxon>_1}

\code{<taxon>_2}

...etc

where \code{<taxon>} is a valid taxon name. Use \code{\link{blastDbase16S}} to build such a database.
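
As a rough illustration of the nearest-neighbour step and the token convention, the sketch below
uses a small made-up table of hits. The column names \code{query}, \code{subject} and
\code{bitscore}, and all values, are assumptions made for illustration only; this is not
necessarily how the function is implemented internally.

\preformatted{
# Toy table of BLAST hits (hypothetical column names and values)
hits <- data.frame(
  query    = c("q1", "q1", "q2"),
  subject  = c("Escherichia_1", "Bacillus_7", "Bacillus_2"),
  bitscore = c(410, 655, 580),
  stringsAsFactors = FALSE
)

# Nearest neighbour: for each query, keep the hit with the largest bitscore
best <- do.call(rbind, lapply(split(hits, hits$query), function(h) {
  h[which.max(h$bitscore), ]
}))

# Strip the trailing _<number> from the token to recover the taxon name
best$Taxon <- sub("_[0-9]+$", "", best$subject)
print(best[, c("query", "Taxon")])
}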

The identity of each alignment is also computed. This should be close to 1.0 for a classification
to be trusted. Identity values below 0.95 may indicate an uncertain classification, but a suitable
threshold varies between taxa.
}
\examples{
# Load example data; the taxon is the second token of each Header
data("small.16S")
seq <- small.16S$Sequence
tax <- sapply(strsplit(small.16S$Header, split = " "), function(x) x[2])
\dontrun{
dbase <- blastDbase16S("test",seq,tax)
reads <- amplicon(seq)
predicted <- blastClassify16S(reads[nchar(reads)>0],dbase)
print(predicted)
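
# Illustrative follow-up (a sketch, not part of the function itself):
# list classifications whose Identity falls below 0.95, the rule of thumb
# mentioned under Details. This assumes Identity is numeric; which() drops
# any missing values. A suitable threshold may differ between taxa.
uncertain <- which(predicted$Identity < 0.95)
print(predicted[uncertain, ])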
}

}
\author{
Lars Snipen.
}
\seealso{
\code{\link{blastDbase16S}}.
}

