% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/oolong.R
\name{create_oolong}
\alias{create_oolong}
\title{Generate an oolong test}
\usage{
create_oolong(
  input_model = NULL,
  input_corpus = NULL,
  n_top_terms = 5,
  bottom_terms_percentile = 0.6,
  exact_n = NULL,
  frac = 0.01,
  n_top_topics = 3,
  n_topiclabel_words = 8,
  use_frex_words = FALSE,
  difficulty = 1,
  input_dfm = NULL,
  construct = "positive",
  btm_dataframe = NULL
)
}
\arguments{
\item{input_model}{(gs/tm) an STM, WarpLDA, topicmodels or BTM object; if it is NULL, create_oolong assumes that you want to create a gold standard.}

\item{input_corpus}{(gs/tm) if input_model is not NULL, this should be the corpus (character vector or quanteda::corpus object) used to generate the model object. If both input_model and input_corpus are not NULL, topic intrusion test cases are generated. If input_model is a BTM object, this argument is ignored. If input_model is NULL, gold standard test cases are generated.}

\item{n_top_terms}{(tm) integer, number of top topic words to be included among the candidates of the word intrusion test.}

\item{bottom_terms_percentile}{(tm) double, a term is considered to be a word intruder when its theta is below the theta value at this percentile; must be within the range of 0 to 1.}

\item{exact_n}{(tm/gs) integer, number of topic intrusion test cases to generate, ignored if frac is not NULL}

\item{frac}{(tm/gs) double, fraction of test cases to be generated from the corpus}

\item{n_top_topics}{(tm) integer, number of most relevant topics to be shown alongside the intruder topic}

\item{n_topiclabel_words}{(tm) integer, number of topic words to be shown as the topic label}

\item{use_frex_words}{(tm) logical, for a STM object, use FREX words if TRUE, use PROB words if FALSE}

\item{difficulty}{(tm) double, adjust the difficulty of the test. A higher value indicates higher difficulty; must be within the range of 0 to 1. No effect for STM if use_frex_words is FALSE. Ignored for topicmodels objects.}

\item{input_dfm}{(tm) a dfm object used for training the input_model, if input_model is a WarpLDA object}

\item{construct}{(gs) string, an adjective describing the construct that you want your coders to code in the gold standard test cases.}

\item{btm_dataframe}{(tm) dataframe used for training the input_model, if input_model is a BTM object}
}
\value{
an oolong test object.
}
\description{
This function generates an oolong test object that can be used either for validating a topic model or for creating a ground truth (gold standard) for a text corpus.
}
\section{Usage}{


This function generates an oolong test object based on \code{input_model} and \code{input_corpus}. If \code{input_model} is not NULL, it generates an oolong test for a topic model (tm). If \code{input_model} is NULL but \code{input_corpus} is not NULL, it generates an oolong test for creating a gold standard (gs).
}

\section{Methods}{

An oolong object, depending on its purpose, has the following methods:
\describe{
  \item{\code{$do_word_intrusion_test()}}{(tm) launch the shiny-based word intrusion test. The coder should identify the intruder word that is not related to the other words.}
  \item{\code{$do_topic_intrusion_test()}}{(tm) launch the shiny-based topic intrusion test. The coder should identify the intruder topic that is least likely to be the topic of the document.}
  \item{\code{$do_gold_standard_test()}}{(gs) launch the shiny-based test for generating a gold standard. The coder should rate the level of the predetermined construct on a 5-point Likert scale.}
  \item{\code{$lock(force = FALSE)}}{(gs/tm) lock the object so that it cannot be changed anymore. It enables \code{\link{summarize_oolong}} and the following method.}
  \item{\code{$turn_gold()}}{(gs) convert the oolong object into a quanteda-compatible corpus.}
}
For more details, please see the overview vignette: \code{vignette("overview", package = "oolong")}
}

\examples{
## Creation of oolong test with only word intrusion test
data(abstracts_stm)
data(abstracts)
oolong_test <- create_oolong(input_model = abstracts_stm)
## Creation of oolong test with both word intrusion test and topic intrusion test
oolong_test <- create_oolong(input_model = abstracts_stm, input_corpus = abstracts$text)
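## A minimal sketch of a topic model (tm) workflow, using the methods listed in
## the Methods section. It assumes a single coder who completes both tests
## before locking. Not run because the tests are interactive shiny applications.
\dontrun{
oolong_test$do_word_intrusion_test()
oolong_test$do_topic_intrusion_test()
## Locking the object enables summarize_oolong()
oolong_test$lock()
summarize_oolong(oolong_test)
}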
## Creation of gold standard
oolong_test <- create_oolong(input_corpus = trump2k)
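## A minimal sketch of a gold standard (gs) workflow, using the methods listed
## in the Methods section. It assumes that the coder finishes coding all test
## cases before locking. Not run because the test is an interactive shiny application.
\dontrun{
oolong_test$do_gold_standard_test()
## Locking the object enables $turn_gold()
oolong_test$lock()
## Convert the coded test cases into a quanteda-compatible corpus
gold_standard <- oolong_test$turn_gold()
}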
}
\references{
Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L., & Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems (pp. 288-296).

  Song et al. (2020) In validations we trust? The impact of imperfect human annotations as a gold standard on the quality of validation of automated content analysis. Political Communication.
}
\author{
Chung-hong Chan, Marius Sältzer
}
