tknz_sent() and preprocess() now have different implementations on Windows and UNIX OSs (the previous C++ implementation has unpredictable behaviour on Windows, see #30). This fix also includes minor changes in the tknz_sent() output in some corner cases (e.g. tknz_sent("") now returns character(0), whereas it used to return "").

perplexity() gets a new argument exp that allows returning the cross-entropy per word, rather than the perplexity (its exponential).

perplexity.character() gets a new argument detailed that allows returning, alongside the total perplexity of the input document, the cross-entropies and word lengths of individual sentences. Closes #28.

?kgram_freqs.

R requirements 3.5 -> 4.0.

SystemRequirements: C++11 (see this tidyverse blog post)

verbose arguments now default to FALSE.

probability(), perplexity() and sample_sentences() are restricted to accept only language_model class objects as their model argument.

as_dictionary(NULL) now returns an empty dictionary.

.preprocess and .tknz_sent arguments to be ignored in process_sentences().

max_lines and batch_size arguments in kgram_freqs.connection().

dictionary.dictionary() with batch processing and non-trivial size constraints on vocabulary size.
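
For reference, the relation between cross-entropy per word and perplexity mentioned in the perplexity() entry can be written as below. This is a sketch of the standard definition, assuming natural logarithms and normalization by a total word count W; the symbols H, PP, W, s_i and P are notation introduced here for illustration, not taken from the package documentation.

```latex
% Assumed notation: s_i are the sentences of the input document,
% P(s_i) is the model probability of sentence s_i,
% W is the total number of words in the document.
H = -\frac{1}{W} \sum_{i} \ln P(s_i),
\qquad
\mathrm{PP} = e^{H}
```

Under this reading, the exp argument selects which of the two quantities is returned: the perplexity PP or the cross-entropy per word H.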