Help for package proximetricsR

Package {proximetricsR}


Type: Package
Title: Spectral Preprocessing and Chemometric Calibration of NIR Sensors
Version: 0.6.4
Date: 2026-06-22
Maintainer: Leonardo Ramirez-Lopez <ramirez-lopez.l@buchi.com>
BugReports: https://github.com/l-ramirez-lopez/proximetricsr/issues
Description: Provides tools to build quantitative chemometric models and applications for near-infrared (NIR) sensors. Chemometric regression models are based on partial least squares regression as described by Wold (1975) <doi:10.1016/B978-0-12-103950-9.50017-4> and modified partial least squares regression as described by Shenk and Westerhaus (1991) <doi:10.2135/cropsci1991.0011183X003100020049x>, with further discussion by Westerhaus (2014) <doi:10.1255/nirn.1492>.
License: MIT + file LICENSE
URL: https://github.com/l-ramirez-lopez/proximetricsr
VignetteBuilder: quarto
Depends: R (≥ 4.2.0)
Imports: callr, digest (≥ 0.6), foreach, mathjaxr (≥ 1.0), plotly (≥ 4.0), prospectr (≥ 0.2.10), quarto, uuid, withr, zip, readxl, jsonlite, Rcpp
Suggests: knitr, testthat, covr, doParallel, parallel, devtools
LinkingTo: Rcpp, RcppArmadillo
RdMacros: mathjaxr
NeedsCompilation: yes
LazyData: true
LazyDataCompression: xz
Encoding: UTF-8
Config/testthat/edition: 3
Config/VersionName: Saentis
Config/roxygen2/version: 8.0.0
Config/roxygen2/markdown: TRUE
Packaged: 2026-06-25 17:37:39 UTC; leo
Author: Leonardo Ramirez-Lopez [aut, cre], Claudio Orellano [aut], Nicolae Cudlenco [aut], Mai Said [aut], Mohamed Abushosha [aut], Marcal Plans [aut]
Repository: CRAN
Date/Publication: 2026-06-30 20:30:07 UTC

Overview of the proximetricsR package

Description

NIR calibration and application tools for BUCHI ProxiMate and ProxiScout devices.

Details

This is package version 0.6.4 (Saentis).

This package provides R functions for spectral pre-processing, NIR model calibration, and reading/writing files for BUCHI ProxiMate and ProxiScout devices. The calibration algorithms (fit_plsr, fit_xlsr) and the pre-treatment constructors (prep_smooth, prep_snv, prep_resample, prep_derivative) reproduce the corresponding algorithms in BUCHI NIRWise PLUS (version 1.1.3000.0), guaranteeing numerical compatibility between models built with this package and those built in NIRWise PLUS.

The ProxiScout preprocessing functions are also numerically equivalent to those of the "BUCHI Modeller" software. The regression method in the Modeller is the classical PLS regression; however, the other PLS algorithms implemented in proximetricsR (modified PLS, standard PLS, and XLS) can also be used to generate models for ProxiScout devices.

The functions available for ProxiMate spectral data are:

The functions available for reading generic spectral data files are:

The functions available for spectral pre-processing are:

The functions available for calibrating NIR regression models are:

The functions available for writing ProxiMate files are:

The functions available for reading and editing ProxiMate application files are:

The functions available for ProxiScout devices are:

The functions available for creating plots are:

Other functions:

A typical example dataset for a ProxiMate device can be found in:

Author(s)

Leonardo Ramirez-Lopez, Claudio Orellano, Nicolae Cudlenco, Mai Said, Mohamed Abushosha, Marcal Plans

See Also

Useful links:


NIRWise PLUS modeling methods (basic)

Description

internal function

Usage

.calibrate_basic(
  X,
  Y,
  group = NULL,
  method = fit_plsr(ncomp = min(15, dim(X))),
  control = calibration_control(),
  return_inputs = TRUE,
  sample_labels = NULL,
  verbose = TRUE
)

Value

An internal object containing the fitted model and validation results.


Computes the NIRWise QVAL statistic

Description

QVAL indicates how much the response variable (y) predicted in cross-validation deviates from the fitted y (i.e. the fitted y values obtained when all calibration observations are used to fit the model).

Usage

.calibration_statistics(
  y,
  fitted_y,
  predicted_y_in_cv = NULL,
  scaled_scores,
  ncomp
)

Arguments

y

a matrix of one column with the response variable.

fitted_y

a matrix with the estimated response variable for each component.

predicted_y_in_cv

the cross-validation estimates of the response variable for every component.

scaled_scores

a matrix of the scaled scores of the model.

ncomp

a vector for each included component.

Value

A list containing calibration statistics including residuals, predicted values, Mahalanobis distance, and Q-values.

See Also

calibrate


A method for estimating the model

Description

Compute partial least squares (PLS) or extended partial least squares (XLS) regression models for a response variable and its associated set of predictors based on the methods available in the BUCHI NIRWise PLUS calibration software.

Usage

.estimate_model(X, Y, method = fit_plsr(ncomp = min(15, dim(X))))
## S3 method for class 'spectral_fit'
predict(object, newdata, ...)

Arguments

X

a numeric matrix of spectral data.

Y

a matrix of one column with the response variable.

method

an object of class fit_constructor specifying the regression method, as returned by fit_plsr or fit_xlsr.

object

an object of class spectral_fit.

newdata

a matrix containing new spectral data.

...

not currently used.

Details

The regression method (PLS or XLS) and its parameters are controlled entirely through the method argument. See fit_plsr and fit_xlsr for the available methods and their options.

Value

For .estimate_model, an object of class spectral_fit, which is a list with the following elements:

For predict, a list with one element:

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

fit_plsr, fit_xlsr, calibrate


NIRcannabis

Description

Selected samples of cannabis NIR measurements for demo purposes. The dataset contains absorbance spectra of 80 cannabis samples measured between 1001 nm and 1700 nm at a 3 nm interval. A total number of four reference vectors is included: "CBDA" (Cannabidiolic acid), "THCA" (Tetrahydrocannabinolic acid), "CBD" (Cannabidiol) and "THC" (Tetrahydrocannabinol).

Usage

data("NIRcannabis")

Format

A data.frame containing 80 observations of four response variables, with their corresponding spectral data.

Details

This dataset is an example for a typical data file for ProxiMate applications, with a total of 80 cannabis samples, selected as a subset of a larger database. It contains the following rows for each observation:

Source

BUCHI Labortechnik AG.


A function for adding application metadata to a list of spectral_model objects

Description

This function has two use cases:

i. If object (a list of spectral_model objects) is passed to the function, it returns the same object with the specified application metadata added to it.

ii. Otherwise, the function can be used to create a list of application metadata that can be used as input for the argument metadata of the proximate_write_nax function.
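
The two calling patterns above can be illustrated with a small base R stand-in. This is only a sketch of the pattern, not the package's implementation: the function attach_metadata and its reduced argument list are hypothetical, and it assumes the metadata is stored in a metadata element of the returned list.

```r
# Hypothetical stand-in illustrating the two calling patterns; this is not
# the implementation used by the package
attach_metadata <- function(object, name = "Untitled", view = "Up") {
  metadata <- list(name = name, view = view)
  if (missing(object)) {
    return(metadata)           # use case ii: return the metadata on its own
  }
  object$metadata <- metadata  # use case i: return the object with metadata added
  object
}

# Use case ii: no object supplied, so only the metadata list is returned
meta_only <- attach_metadata(name = "CBDA Downview", view = "Down")

# Use case i: a (placeholder) list of models is returned with metadata attached
models <- attach_metadata(list(model_1 = "placeholder"), name = "CBDA Downview")
names(models)
```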

Usage

add_application_metadata(
  object, key = UUIDgenerate(),
  name = c(name = "Untitled", alias = NULL),
  view = c("Up", "Down"), measurement_mode = c("DrIwr", "TrIwr"),
  measurement_time = 15,
  absorbmask_low = c(min = 0, max = 0),
  absorbmask_high = c(min = 0, max = 0),
  rotate_sample = TRUE,
  selectable = TRUE, created, changed,
  composition = NULL,
  description = "created with proximetricsR",
  sop = "",
  presentation_id = "Default"
)

Arguments

object

an optional object, consisting of a list of objects of class spectral_model. See details.

key

a string for the key of the application. Defaults to a newly generated key using UUIDgenerate.

name

a character vector of length at most 2, giving the name and alias of the application. Defaults to "Untitled".

view

a string for the type of view in the application. Has to be either "Up" (default) or "Down".

measurement_mode

a string, indicating how the samples were measured. Has to be either Diffuse Reflection ("DrIwr", default) or Transflection ("TrIwr").

measurement_time

a numeric for the time each sample in the application should be measured, in seconds. Defaults to 15 seconds.

absorbmask_low

a vector of numerics of length 2 for the minimum and maximum of the lower absorbance mask. Defaults to a vector of zeros.

absorbmask_high

a vector of numerics of length 2 for the minimum and maximum of the higher absorbance mask. Defaults to a vector of zeros.

rotate_sample

a logical. Should the sample be rotated? Defaults to TRUE.

selectable

a logical, whether the application should be selectable. Defaults to TRUE.

created

a string of date and time of the creation of the application. Default is the current date and time of the system. See details for the format in which it has to be provided.

changed

a string of date and time when the application was changed. Defaults to the current date and time of the system. See details for the format in which it has to be provided.

composition

an optional string for the composition of the application. Defaults to NULL.

description

an optional string for the description of the application. Defaults to "created with proximetricsR".

sop

a string for the standard operating procedure (sop) for this particular application. Defaults to an empty character.

presentation_id

a string for the sample presentation ID of the application. Default is "Default".

Details

This function has two functionalities:

The application metadata is required for the import of an application into a ProxiMate device.

This two-fold functionality makes it possible to add application metadata during the construction of the models, or after the model-building processes have been finished. In the former case, a list of models of class spectral_model must be passed in object. The object returned by this function then contains the same list of models, including the specified metadata. Models can also be added to or removed from that list without changing the application metadata. In the latter case, the value returned by this function may be passed to the metadata argument of proximate_write_nax.

A lot of the parameters can be left unchanged and may be adjusted at a later stage of the application development (e.g. in a ProxiMate device). However, several parameters are of great importance for a successful migration of the application:

The parameter view describes if the spectrum is measured by either up-view "Up" or down-view "Down".

The measurement_mode describes how the samples are measured, with the following possibilities: Diffuse Reflection "DrIwr" or Transflection "TrIwr".

The parameters created and changed must contain the date (YYYY-MM-DD) and time (HH:MM:SS), separated by a single "T" (without any spaces). For example, the following code returns the correct format (both created and changed default to this value):

gsub(" ", "T", format(Sys.time()))
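
As a minimal sketch of the required layout, the following base R code builds a timestamp with an explicit format string and checks it against the pattern described above. The helper is_valid_stamp is hypothetical and not part of proximetricsR; it only checks the layout, not whether the date is a real calendar date.

```r
# Build a timestamp in the "YYYY-MM-DDTHH:MM:SS" layout with an explicit format
stamp <- format(Sys.time(), "%Y-%m-%dT%H:%M:%S")

# Hypothetical helper (not part of the package): checks the layout only
is_valid_stamp <- function(x) {
  grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}$", x)
}

is_valid_stamp(stamp)
is_valid_stamp("2026-06-22 17:37:39")  # uses a space instead of "T"
```

An explicit format string avoids any dependence on session options that change how format(Sys.time()) prints, such as fractional seconds.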

Value

Either the list of spectral_model objects with the added application metadata (if object is provided), or the application metadata as a named list.

Author(s)

Claudio Orellano, Leonardo Ramirez-Lopez

See Also

calibrate, proximate_write_nax

Examples


data(NIRcannabis)

# Downview Absorbance of CBDA in percentage
downview_metadata <- add_application_metadata(
  name = "CBDA Downview",
  view = "Down",
  measurement_mode = "DrIwr"
)

# Create a simple model with default model metadata
simple_model <- calibrate(CBDA ~ spc,
  data = NIRcannabis, preprocess = preprocess_recipe(),
  method = fit_plsr(5), control = calibration_control(),
  metadata = add_model_metadata(), verbose = FALSE
)

# Two ways to add application metadata to a list of spectral_model objects:
model_list <- list(simple_model)

# Using the add_application_metadata 'object' argument
model_list <- add_application_metadata(
  object = model_list,
  name = "CBDA Downview",
  view = "Down",
  measurement_mode = "TrIwr"
)

# Adding it manually
model_list$metadata <- downview_metadata

# Alternatively, if you are creating an application, you can also pass
# application metadata to 'proximate_write_nax':
proximate_write_nax(
  object = model_list,
  path = tempdir(),
  metadata = downview_metadata,
  tsv_name = "some_tsv",
  empty_tsv_name = "another_tsv",
  report = TRUE,
  verbose = FALSE
)


A function for adding model metadata to a spectral_model object

Description

This function has two use cases:

i. If object (being a spectral_model object) is passed to the function, it returns the same object with the specified model metadata added to it.

ii. Otherwise, the function creates a list of model metadata that can be used as input for the argument metadata of the calibrate function.

Usage

add_model_metadata(
  object, key = UUIDgenerate(), created, changed,
  name = c("", NULL), sort_order = 1, tol_min = NULL,
  tol_max = NULL, decimal_places = 2, unit = "",
  mahal_limit = 5, corrections = c(bias = 0, slope = 1),
  limit_min = NULL, limit_max = NULL, target = NULL,
  wavelength_range = c("Nir", "Vis", "Nir+Vis"),
  predict_type = "Calibration", arguments = rep("", 4)
)

Arguments

object

an optional object of class spectral_model. See details.

key

a string for the key of the model. Defaults to a newly generated key using UUIDgenerate.

created

a string for date and time of the addition of the model to the application. Default is the current date and time of the system. See details for the format in which it has to be provided.

changed

a string for date and time when the model has been changed. Default is the current date and time of the system. See details for the format in which it has to be provided.

name

a vector of character strings of length 2 for the name and alias of the property. If object is given or an object returned by this function is passed to calibrate, defaults to the name of the property (but not the alias). Otherwise, defaults to an empty character.

sort_order

a numeric, indicating the order in which the properties are shown on a ProxiMate device. Defaults to 1.

tol_min

an optional numeric for the minimum error tolerance. Defaults to NULL.

tol_max

an optional numeric for the maximal error tolerance. Defaults to NULL.

decimal_places

a numeric for the decimal precision of the measurements of the property. Defaults to 2.

unit

a string for the units in which the reference values of the property are measured. Defaults to an empty character.

mahal_limit

a numeric for the maximum Mahalanobis distance allowed. Defaults to 5.

corrections

a vector of numerics of length 2 for bias and slope corrections. Defaults to no corrections, i.e. c(0, 1).

limit_min

an optional numeric for the lower limit of the reference values. Defaults to NULL.

limit_max

an optional numeric for the upper limit of the reference values. Defaults to NULL.

target

an optional numeric for the desired predicted reference values. Defaults to NULL.

wavelength_range

a string for the considered wavelength range of the spectrum. Must be one of "Nir" (default), "Vis" or "Nir+Vis".

predict_type

a string for the prediction type of the model. Defaults to "Calibration".

arguments

a vector of maximal length 4. Contains additional arguments to be saved into the metadata. Defaults to a vector of empty characters of length 4.

Details

This function has two functionalities:

This two-fold functionality makes it possible to add metadata either during the construction of the model or after model building has finished. In the former case, the value returned by this function is passed to the metadata argument of calibrate. In the latter case, the fitted model has to be passed in object, and the value returned by this function is the model including the chosen metadata.

A lot of the parameters can be left unchanged and may be adjusted at a later stage of the application development (e.g. in a ProxiMate device).

The parameters created and changed must contain the date (YYYY-MM-DD) and time (HH:MM:SS), separated by a single "T" (without any spaces). For example, the following code returns the correct format (both created and changed default to this value):

gsub(" ", "T", format(Sys.time()))

Value

Either the spectral_model object with the added property metadata (if object is provided), or the property metadata, which is a named list.

Author(s)

Claudio Orellano, Leonardo Ramirez-Lopez

See Also

calibrate, proximate_write_nax

Examples


data(NIRcannabis)

# Downview Absorbance of CBDA in percentage
downview_metadata <- add_model_metadata(
  name = "CBDA",
  unit = "%",
  arguments = "Example metadata"
)

# Three ways to add metadata to spectral_model object:
# As a direct argument
simple_model <- calibrate(CBDA ~ spc,
  data = NIRcannabis, preprocess = preprocess_recipe(),
  method = fit_plsr(5), control = calibration_control(),
  metadata = downview_metadata
)

# Passing the model to add_model_metadata
simple_model <- add_model_metadata(
  object = simple_model,
  name = "CBDA",
  unit = "%",
  arguments = "Example metadata"
)

# Adding it directly (not recommended)
simple_model$metadata <- downview_metadata


Adds a chosen number of columns with entries equal to zero to a matrix object

Description

Adds a chosen number of columns with entries equal to zero to a matrix object

Usage

add_zero_cols(object, n_zero_cols)

Arguments

object

matrix

n_zero_cols

an integer for the number of columns to be added

Value

the matrix with n_zero_cols zero-columns added to the left and right side of object
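
The padding described above can be sketched in base R as follows. The function pad_zero_cols is a hypothetical re-implementation for illustration only, and it assumes that n_zero_cols columns are added on each side of the matrix.

```r
# Hypothetical re-implementation for illustration; assumes n_zero_cols columns
# of zeros are added on EACH side of the matrix
pad_zero_cols <- function(object, n_zero_cols) {
  stopifnot(is.matrix(object), n_zero_cols >= 0)
  zeros <- matrix(0, nrow = nrow(object), ncol = n_zero_cols)
  cbind(zeros, object, zeros)
}

m <- matrix(1:6, nrow = 2)
padded <- pad_zero_cols(m, 2)
dim(padded)
```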


Calibrate a spectral model

Description

Produce calibrations for predictive partial least squares (pls) or extended partial least squares (xls) models using cross-validation and outlier detection. Reproduces the modeling methods in NIRWise PLUS calibration software.

Usage

## S3 method for class 'formula'
calibrate(formula, data, group = NULL,
          preprocess = preprocess_recipe(prep_snv()),
          method,
          metadata = NULL,
          return_inputs = TRUE,
          ...,
          na_action = na.pass)

## Default S3 method:
calibrate(X, Y, data = NULL, group = NULL,
          preprocess = preprocess_recipe(prep_snv()),
          method = fit_plsr(ncomp = min(15, dim(X))),
          control = calibration_control(),
          metadata = NULL,
          skip_indices = NULL,
          return_inputs = TRUE,
          verbose = TRUE,
          ...)

## S3 method for class 'spectral_model'
predict(object, newdata, ncomp = object$final_ncomp,
          verbose = TRUE, ...)

Arguments

...

not currently used.

formula

an object of class formula which represents the basic model to be calibrated.

data

a data.frame containing the data of the variables in the model. Must be provided if using S3 method for class formula. Otherwise, optional; however, if using proximate_write_nax for the returned object, this parameter will be required.

X

a numeric matrix of spectral data. The names of the columns must be equivalent to wavelengths, such that they can be coerced to class numeric.

Y

a matrix of one column with the response variable. The column must be named.

group

an optional factor (or character vector that can be coerced to factor by as.factor) that assigns a group/class label to each observation in X (e.g. groups can be given by spectra collected from the same batch of measurements, from the same observation, from observations with very similar origin, etc). This is taken into account for cross-validation for pls tuning (factor optimization) to avoid pseudo-replication. When one observation is selected for cross-validation, all observations of the same group are removed and assigned to validation. The length of the vector must be equal to the number of observations in X.

preprocess

a preprocess_recipe object as returned by the preprocess_recipe function, indicating the pretreatments to be applied on the spectra before the regression steps.

method

an object of class fit_constructor, as returned by fit_plsr or fit_xlsr, indicating what type of regression method to use along with its parameters.

control

a calibration_control object as returned by the calibration_control function, indicating how some aspects of the calibration process must be conducted (e.g. cross-validation and outlier detection).

metadata

either NULL or an object as returned by method add_model_metadata. Contains the specifications for the metadata of the model. Defaults to NULL.

skip_indices

a vector of integers for the indices in the input data to be skipped for the regression. Defaults to NULL.

return_inputs

a logical. For calibrate methods, indicates if the input data should be attached to the returned object. Note that this data is crucial for creating an application file.

verbose

a logical indicating whether or not to print a progress bar for the iterations of the validation along with messages of the execution of the cross-validation. For the predict method, messages about the progress are printed. Default is TRUE. Note: In case parallel processing is used, these progress bars are not printed.

object

an object of class spectral_model.

newdata

a data.frame containing the new spectral data of the variables in the model, of similar form as data. Alternatively, can also be a matrix of spectra.

ncomp

a vector for the number of components to be used in the prediction. Default is object$final_ncomp i.e. the optimized number of components found in the object passed to predict.

na_action

a function to specify the action to be taken if NAs are found in the object passed in data. Default is na.pass.
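
The requirements on X and Y for the default method (column names of X coercible to numeric wavelengths, Y a named one-column matrix) can be checked before calling calibrate. The following is a minimal base R sketch; check_inputs is a hypothetical helper that is not part of the package, and the toy data are random values used only for illustration.

```r
# Toy inputs standing in for real spectra (random values, for illustration)
set.seed(1)
X <- matrix(rnorm(20), nrow = 4,
            dimnames = list(NULL, c("1001", "1004", "1007", "1010", "1013")))
Y <- matrix(c(1.2, 0.8, 1.5, 1.1), ncol = 1, dimnames = list(NULL, "CBDA"))

# Hypothetical pre-flight check, not a function of the package
check_inputs <- function(X, Y) {
  wav <- suppressWarnings(as.numeric(colnames(X)))
  stopifnot(
    is.matrix(X), is.numeric(X),
    length(wav) == ncol(X), !anyNA(wav),         # column names are wavelengths
    is.matrix(Y), ncol(Y) == 1,                  # one-column response
    !is.null(colnames(Y)), nzchar(colnames(Y)),  # the response column is named
    nrow(X) == nrow(Y)                           # one response value per spectrum
  )
  invisible(TRUE)
}

check_inputs(X, Y)
```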

Details

The resulting object of the calibrate functions provides a complete list of calibration results.

By using the group argument one can specify groups of observations that have something in common (e.g. observations with very similar origin). The purpose of group is to avoid biased cross-validation results due to pseudo-replication. This argument makes it possible to select calibration points that are independent of the validation ones. In this regard, the p argument of the object passed to control (created with the calibration_control function) refers to the percentage of groups of observations (rather than single observations) to be retained in each sampling iteration.
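
The idea of sampling at the group level can be sketched in base R. The helper split_by_group below is hypothetical and is not the package's resampling code; in particular, rounding the number of retained groups down with floor is an assumption made for the example.

```r
# Hypothetical helper: draws one leave-group-out split at the group level
split_by_group <- function(group, p = 0.75) {
  group <- as.factor(group)
  lev <- levels(group)
  n_cal <- max(1, floor(p * length(lev)))  # number of groups kept (assumed rounding)
  cal_groups <- sample(lev, n_cal)         # groups retained for calibration
  in_cal <- group %in% cal_groups
  list(calibration = which(in_cal), validation = which(!in_cal))
}

set.seed(1)
group <- rep(c("a", "b", "c", "d"), each = 3)  # e.g. 3 replicate scans per sample
idx <- split_by_group(group, p = 0.75)

# No group appears on both sides of the split
intersect(group[idx$calibration], group[idx$validation])
```

Because whole groups are moved together, replicate spectra of the same sample never end up in both the calibration and the validation set.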

The regression algorithms implemented here correspond to the partial least squares ("pls") and extended partial least squares ("xls") methods in NIRWise PLUS calibration software. Note that in these particular regression algorithms, the Y-loading of each component is constantly equal to 1, and therefore not considered.

The calibration_statistics matrix retrieved in the final_model and also in the initial_fit outputs includes a column named Q_value. This value can be used to assess model overfitting. For each observation, \(q_i\) is computed as follows:

\[s = \sqrt{\frac{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}{n - 1}}\]

\[q_i = \frac{\left| 2 y_i - \hat{y}_i - \ddot{y}_i \right|}{s}\]

where, for the ith observation, \(y_i\) is the observed value, \(\hat{y}_i\) is the fitted value (using a model with all the observations) and \(\ddot{y}_i\) is the value predicted during cross-validation.
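
The two formulas above can be sketched in base R for a single number of components. The helper q_value and the numeric values are made up for illustration and are not taken from the package.

```r
# Hypothetical helper implementing the two formulas above for one component
q_value <- function(y, y_fitted, y_cv) {
  n <- length(y)
  s <- sqrt(sum((y - y_fitted)^2) / (n - 1))  # spread of the calibration residuals
  abs(2 * y - y_fitted - y_cv) / s
}

y        <- c(10.0, 12.0, 9.5, 11.0)  # observed values (made up)
y_fitted <- c(10.2, 11.7, 9.6, 11.1)  # fitted with all observations
y_cv     <- c(10.5, 11.2, 9.8, 11.3)  # predicted in cross-validation
q <- q_value(y, y_fitted, y_cv)
q
```

The numerator can be read as the sum of the calibration residual and the cross-validation residual, since 2y - fitted - cv = (y - fitted) + (y - cv); observations whose cross-validation prediction drifts far from the fitted value therefore receive large Q-values.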

Value

For calibrate(), an object of class spectral_model which is a list with the following elements:

For predict(), the output is an object of class spectral_prediction, which is a list with the following elements:

Parallel cross-validation

The cross-validation loop is implemented with foreach, so it can be parallelised transparently by registering a parallel backend before calling calibrate. Set allow_parallel = TRUE in calibration_control (the default) and register a backend, for example:

cl <- parallel::makeCluster(parallel::detectCores() - 1L)
doParallel::registerDoParallel(cl)

model <- calibrate(...)

parallel::stopCluster(cl)

When no parallel backend is registered, foreach falls back silently to sequential execution regardless of the allow_parallel setting. Note that progress bars are suppressed during parallel execution.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

preprocess_recipe,

fit_plsr,

fit_xlsr,

calibration_control,

calibrate_models

Examples


data("NIRcannabis")
simple_model <- calibrate(CBDA ~ spc,
  data = NIRcannabis, preprocess = preprocess_recipe(prep_snv()),
  method = fit_xlsr(5), control = calibration_control("kfold"),
  verbose = FALSE
)

method <- fit_plsr(15)
control <- calibration_control(validation_type = "kfold", number = 3, folds = "sequential")
pretreats <- preprocess_recipe(
  prep_resample(grid = c(1001, 1700, 5)),
  prep_derivative(m = 2, w = 9, p = 5, algorithm = "nwp"),
  prep_snv(),
  prep_smooth(w = 5, algorithm = "moving-average"),
  device = "proximate"
)
skip_indices <- c(5, 13, 21, 73)
# With formula
complex_model_formula <- calibrate(
  CBDA ~ spc,
  data = NIRcannabis, preprocess = pretreats, method = method,
  control = control, skip_indices = skip_indices, verbose = FALSE
)
# Default method: Y must be a one-column matrix with a column name
Y <- matrix(NIRcannabis$CBDA)
colnames(Y) <- "CBDA"
complex_model_default <- calibrate(
  X = NIRcannabis$spc, Y = Y, data = NIRcannabis, preprocess = pretreats,
  method = method, control = control, skip_indices = skip_indices, verbose = FALSE
)

# Predict the skipped indices
predict(complex_model_formula,
  newdata = NIRcannabis[skip_indices, ],
  ncomp = complex_model_formula$final_ncomp,
  verbose = FALSE
)


Calibrate models for multiple response variables

Description

Calibrate independent models (iteratively) for multiple properties with optimization of both the pre-processing recipe (based on a list of different recipes) and the regression method. This function uses calibrate to construct the list of models.

Usage

calibrate_models(
  formulas,
  data, group = NULL,
  preprocess_recipes,
  methods,
  control = calibration_control(seed = 1),
  metadata_list = NULL,
  skip_indices_list = NULL,
  return_inputs = TRUE,
  ...,
  na_action = na.pass,
  verbose = TRUE,
  save_all = FALSE
)

## S3 method for class 'spectral_multimodel'
predict(object, newdata, verbose = TRUE, ...)

Arguments

formulas

a list containing one or more objects of class formula where each of them represents the model to be calibrated.

data

a data.frame containing the data of the variables in the model (as in the calibrate function).

group

an optional factor (or character vector that can be coerced to factor by as.factor) that assigns a group/class label to each observation in X (e.g. groups can be given by spectra collected from the same batch of measurements, from the same observation, from observations with very similar origin, etc). This is taken into account for cross-validation for pls tuning (factor optimization) to avoid pseudo-replication. When one observation is selected for cross-validation, all observations of the same group are removed and assigned to validation. The length of the vector must be equal to the number of observations in X.

preprocess_recipes

a list with one or more objects of class preprocess_recipe that are to be tested for finding the optimal one for each model in the list passed to formulas.

methods

a list containing one or more objects of class fit_constructor as returned by fit_plsr or fit_xlsr, indicating what type of regression method to use along with its parameters.

control

a calibration_control object as returned by the calibration_control function, indicating how some aspects of the calibration process must be conducted (e.g. cross-validation and outlier detection). Default is calibration_control(seed = 1). See details.

metadata_list

a list containing the specifications for the metadata of each model in formulas given in the same order. Each element in the list should be defined as in the metadata argument of calibrate using the add_model_metadata function. Defaults to NULL.

skip_indices_list

a list of vectors of integers for the indices in the input data to be skipped for the computation of each of the models in formulas. The vectors in this list must be provided in the same order as their corresponding counterparts in formulas. Defaults to NULL. If a list is passed, its components must be set to numeric() for those formulas where there are no indices to be skipped.

return_inputs

a logical. For calibrate methods, indicates if the input data should be attached to the returned object. Note that this data is crucial for creating an application file.

...

arguments to be passed to the calibrate method. Not currently used for the predict.spectral_multimodel method.

na_action

a function to specify the action to be taken if NAs are found in the object passed in data. Default is na.pass.

verbose

a logical indicating whether or not to print a progress bar for the iterations of the validation along with messages of the execution of the cross-validation. For the predict method, messages about the progress are printed. Default is TRUE. Note: In case parallel processing is used, these progress bars are not printed.

save_all

a logical indicating if all the models tested (with the different pre-processing recipes) are to be saved. Default is FALSE.

object

an object of class spectral_multimodel.

newdata

a data.frame containing the new spectral data of the variables in the model, of similar form as data. Alternatively, can also be a matrix of spectra.

Details

The object passed to the control argument should indicate a seed for the random number generator (RNG). This allows the function to use the same cross-validation groups (for leave-group-out cross-validation, see calibration_control) across the same formula with different recipes. This enables proper model comparisons.
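
The effect of fixing the seed can be illustrated in base R. The helper make_folds is hypothetical and uses a simple random fold assignment rather than the package's own resampling; the point is only that the same seed reproduces the same partition, so every recipe is evaluated on identical validation sets.

```r
# Hypothetical helper: random assignment of n observations to k folds
make_folds <- function(n, k, seed) {
  set.seed(seed)  # fixing the seed makes the partition reproducible
  sample(rep(seq_len(k), length.out = n))
}

# Same seed for two different recipes: identical partitions
folds_recipe_1 <- make_folds(n = 80, k = 10, seed = 1)
folds_recipe_2 <- make_folds(n = 80, k = 10, seed = 1)
identical(folds_recipe_1, folds_recipe_2)
```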

Value

A list of class "spectral_multimodel" containing the following objects:

For predict(), a list with the following elements:

Parallel cross-validation

The cross-validation loop inside each call to calibrate is implemented with foreach, so it can be parallelised transparently by registering a parallel backend before calling calibrate_models. Set allow_parallel = TRUE in calibration_control (the default) and register a backend, for example:

cl <- parallel::makeCluster(parallel::detectCores() - 1L)
doParallel::registerDoParallel(cl)

result <- calibrate_models(...)

parallel::stopCluster(cl)

When no parallel backend is registered, foreach falls back silently to sequential execution regardless of the allow_parallel setting. Note that progress bars are suppressed during parallel execution.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

calibrate,

preprocess_recipe,

fit_plsr,

fit_xlsr,

calibration_control

Examples


data("NIRcannabis")
# the list of formulas for the models to be built
app_formulas <- list(THC ~ spc, THCA ~ spc, CBD ~ spc, CBDA ~ spc)

# the list of pre-processing recipes to be tested
precipes <- list(
  recipe_1 = preprocess_recipe(
    prep_resample(grid = c(1001, 1700, 2)),
    prep_snv(),
    prep_derivative(m = 1, w = 9, p = 7, algorithm = "nwp"),
    device = "proximate"
  ),
  recipe_2 = preprocess_recipe(
    prep_resample(grid = c(1001, 1700, 2)),
    prep_snv(),
    prep_derivative(m = 2, w = 11, p = 9, algorithm = "nwp"),
    device = "proximate"
  )
)

optimized_app <- calibrate_models(
  formulas = app_formulas,
  data = NIRcannabis,
  preprocess_recipes = precipes,
  methods = list(fit_plsr(15, type = "nwp")),
  return_inputs = TRUE,
  save_all = FALSE
)

optimized_app


A function that controls the calibration of models

Description

This function is used to further control some aspects of the calibration of models (with the calibrate function) such as cross-validation and outlier detection.

Usage

calibration_control(validation_type = c("lgo", "loo", "kfold", "none"),
                    number = ifelse(validation_type == "lgo", 100, 10),
                    p = 0.75,
                    folds = c("random", "sequential"),
                    tuning_parameter = c("rmse", "rsq", "none"),
                    learning_rates = c(maximum = 1.1, sequential = 1.05),
                    remove_outliers = 0,
                    cal_residual_limit = 2.5,
                    mahalanobis_limit = 5,
                    val_residual_limit = 3.5,
                    allow_parallel = TRUE,
                    fix_pls_factors = TRUE,
                    fixed_components = 0,
                    replacements = TRUE,
                    seed = NULL)

Arguments

validation_type

a character string indicating the type of cross-validation (cv) to be conducted. Options are: "lgo" for leave-group-out cv (default), "loo" for leave-one-out cv, "kfold" for k-fold cv and "none" for no validation. See details.

number

an integer indicating the number of sampling iterations or sub-sample groups for the selected validation_type argument. Default is 100 for leave-group-out cv and 10 for k-fold cross-validation. This parameter is ignored for leave-one-out cv.

p

a numeric value indicating the proportion of calibration observations to be retained at each sampling iteration at each local segment when "lgo" is selected in the validation_type argument. Default is 0.75 (i.e. 75 percent of the observations).

folds

a character string indicating the way folds are created (valid only when validation_type = "kfold"). Options are: "random" (default) or "sequential".

tuning_parameter

a character string indicating which cross-validation statistic to use for the optimization of the included number of components. Options are: "rmse" (default, minimization of the root mean squared error), "rsq" (maximization of the coefficient of determination) or "none" (no tuning). Does not apply when validation_type = "none".

learning_rates

a vector of length 2 for additional control over the selection of the optimal number of components. See details for its use. Defaults to c(1.1, 1.05).

remove_outliers

an integer indicating the maximum number of times the model should automatically detect and remove outliers. Each time, a new model is fitted with the outliers removed, until either no more outliers are found or the number of iterations given by remove_outliers has been reached. The outliers found and removed in each step, as well as the first and last computed models, are recorded. Outliers are detected based on the limits set in the arguments cal_residual_limit, mahalanobis_limit and val_residual_limit. Setting remove_outliers to 0 (default) disables automatic outlier removal, whereas setting it to Inf removes outliers until no more are found.

cal_residual_limit

a numeric value which indicates the upper limit of the standardized residuals for the fitted response variable. Observations with absolute residuals above this limit are labeled as "calibration outliers". The standardized calibration residuals are calculated as the absolute differences between the reference values and their corresponding fitted values divided by the standard deviation of these absolute differences. Default is 2.5 (as in NIRWise PLUS calibration software).

mahalanobis_limit

a numeric value which indicates the upper limit of the squared Mahalanobis distances of each sample in the score space to zero. Observations with squared Mahalanobis distance above this limit are labeled as "Mahalanobis outliers". The squared Mahalanobis distances are calculated as the squared Euclidean distance of the standardized scores to the origin. Default is 5 (as in NIRWise PLUS calibration software).

val_residual_limit

a numeric value which indicates the upper limit of the standardized residuals for cross-validation predictions of the response variable. This applies only to "kfold" or "loo" cross-validation. Observations with absolute residuals above this limit are labeled as "validation outliers". The standardized validation residuals are calculated as the absolute differences between the reference values and their corresponding cross-validated predictions divided by the standard deviation of these absolute differences. Default is 3.5 (as in NIRWise PLUS calibration software).

allow_parallel

a logical indicating if parallel execution is allowed. If TRUE, parallelization is applied to the cross-validation procedure. The parallelization of this for loop is implemented using the foreach function of the foreach package. Default is TRUE.

fix_pls_factors

a logical. This parameter only has an influence on the produced application files, where it indicates whether the final number of factors of the model should be fixed. Note that this has no influence on the model in R itself, as the optimal number of components inside the model remains the same (but it does influence the exported files). Default is TRUE.

fixed_components

a numerical value indicating a fixed number of components to be used in the model (i.e. no optimization of the components). The default value is 0, which indicates that the number of components is not fixed and it uses the one selected by the function.

replacements

a logical. Only used in case validation_type is selected as "lgo". Specifies if the sampling for the calibration sets must be done with replacements. See details for a more thorough explanation. Defaults to TRUE.

seed

an integer that can be used in any of the validation methods to obtain reproducible results, using the set.seed function. In case it is selected as NULL, no seed will be set. Note that the seed will not be reset, and future random computations can be affected. Furthermore, this parameter is meant as a way to provide reproducibility, and should not be used to simply select the seed with the best results. Default is NULL.
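The outlier rules described under cal_residual_limit and mahalanobis_limit can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper names std_abs_residuals and sq_mahalanobis and the toy data are hypothetical, and the scores are assumed to be already centred.

```r
# Hypothetical helpers illustrating the documented outlier rules.

# Standardized residuals: absolute differences between reference and fitted
# values, divided by the standard deviation of those absolute differences.
std_abs_residuals <- function(reference, fitted) {
  abs_diff <- abs(reference - fitted)
  abs_diff / sd(abs_diff)
}

# Squared Mahalanobis distance to the origin: squared Euclidean norm of the
# scores after scaling each score column to unit standard deviation
# (assumption: the scores are already centred).
sq_mahalanobis <- function(scores) {
  scaled <- scale(scores, center = FALSE, scale = apply(scores, 2, sd))
  rowSums(scaled^2)
}

# Toy data (made up for illustration)
reference <- c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
fitted    <- c(1.1, 1.9, 3.2, 3.9, 5.1, 9.0)
scores    <- cbind(c(-1, 0, 1, 0.5, -0.5, 0), c(0.2, -0.2, 0.1, -0.1, 0, 0))

# Flag observations using the default limits of calibration_control
cal_outlier <- std_abs_residuals(reference, fitted) > 2.5
md_outlier  <- sq_mahalanobis(scores) > 5
```

In this toy example only the last observation exceeds the calibration residual limit, and no observation exceeds the Mahalanobis limit.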

Details

This package extends the cross-validation methods implemented in the NIRWise PLUS software, which is based only on k-fold cross validation.

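The validation methods available for assessing the predictive performance of the models are:

  • "lgo": leave-group-out cross-validation (default). At each of the number sampling iterations, a proportion p of the observations is retained for calibration (with or without replacements).

  • "loo": leave-one-out cross-validation.

  • "kfold": k-fold cross-validation with number folds, created either randomly or sequentially (see folds).

  • "none": no validation.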

For each validation type (except "none"), the optimal number of factors is not necessarily the one at the minimum RMSE or the maximum \(R^2\) (depending on the tuning_parameter). Since RMSE often decreases monotonically and \(R^2\) often increases monotonically as the number of components increases, the additional parameter learning_rates \(\gamma = (\gamma_{max}, \gamma_{seq})\) (its maximum and sequential elements) is included to fine-tune the choice of the number of factors:

For RMSE, consider the index at which the minimum of all computed RMSE values is attained:

\[n_{min} = \arg\min_{n} RMSE_n\]

Then, among all \(1 < n < n_{min}\) fulfilling

\[RMSE_{n} < RMSE_{n_{min}} \cdot \gamma_{max}\] \[RMSE_{n} < RMSE_{n+1} \cdot \gamma_{seq}\]

we take the smallest \(n\) as the optimal number of components.

For \(R^2\), a similar approach is taken, but with maxima instead of minima:

\[n_{max} = \arg\max_{n} R^2_n\]

Then, we take the smallest \(1 < n < n_{max}\) still satisfying

\[R^2_{n} > R^2_{n_{max}} \cdot \gamma_{max}^{-1}\] \[R^2_{n} > R^2_{n+1} \cdot \gamma_{seq}^{-1}\]

Note that in this case, we take the inverse of the learning rates. Furthermore, setting learning_rates = c(1, 1) retains the global minimum of RMSE or the global maximum of \(R^2\), respectively.
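The RMSE-based selection rule described above can be sketched in base R. This is a minimal illustration, not the package's code: the function name select_ncomp_rmse and the RMSE values are hypothetical, and falling back to the global minimum when no smaller candidate qualifies is an assumption.

```r
# Hypothetical helper: pick the number of components from a vector of
# cross-validated RMSE values using the two learning rates.
select_ncomp_rmse <- function(rmse, gamma_max = 1.1, gamma_seq = 1.05) {
  n_min <- which.min(rmse)
  # Candidates strictly between 1 and n_min, as in the rule above
  candidates <- setdiff(seq_len(n_min), c(1L, n_min))
  for (n in candidates) {
    ok_max <- rmse[n] < rmse[n_min] * gamma_max
    ok_seq <- rmse[n] < rmse[n + 1] * gamma_seq
    if (ok_max && ok_seq) return(n)
  }
  # Assumption: fall back to the global minimum if no candidate qualifies
  n_min
}

# Toy RMSE curve (made up): the global minimum is at 5 components
rmse <- c(2.00, 1.20, 1.05, 1.02, 1.00, 1.01)

n_default <- select_ncomp_rmse(rmse)                              # 3
n_strict  <- select_ncomp_rmse(rmse, gamma_max = 1, gamma_seq = 1) # 5
```

With the default rates, three components are already within 10 percent of the minimum RMSE and within 5 percent of the next value, so the smaller model is preferred; with both rates set to 1 the global minimum is kept.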

Value

a list of class calibration_control mirroring the specified parameters

Author(s)

Leonardo Ramirez-Lopez

See Also

calibrate, calibrate_models

Examples


# 5-fold cross-validation with sequential sampling
calibration_control(
  validation_type = "kfold",
  number = 5,
  folds = "sequential"
)

# leave-one-out cross-validation
calibration_control(validation_type = "loo")

# 100 leave-group-out validations with 60% samples retained, with replacements
calibration_control(
  validation_type = "lgo",
  number = 100,
  p = 0.6,
  replacements = TRUE
)

# 2 leave-group-out iterations with 75% samples retained, no replacements
calibration_control(
  validation_type = "lgo",
  number = 2,
  p = 0.75,
  replacements = FALSE
)

# Same as before, but removing any outlier that is found
calibration_control(
  validation_type = "lgo",
  number = 2,
  p = 0.75,
  replacements = FALSE,
  remove_outliers = Inf
)

# no validation, gives warning
calibration_control(validation_type = "none")


Extract the property names from a given data.frame

Description

This function extracts the column names of the properties in x. A property in this context is a response vector of numerical values that can later be calibrated for prediction (for example with calibrate).

Usage

extract_property_names(x)

Arguments

x

a data.frame, as normally obtained by proximate_read_data, read_spc, proxiscout_read_data, or some other data parsing function.

Details

Depending on the class of x, the names of the properties are identified differently. For all cases, only columns which contain numerical values (including NA) are considered as potential properties.

If x is of class proximate_data, the property names are identified as follows:

If x is of class proxiscout_data, property names are identified as columns that contain only numerical values (including NA) and are not matched by any of the following, case-insensitive regex (each wrapped by ^ and $):

If x is of neither class, all columns with numerical values are considered to be properties.

Value

A character vector, containing only the names of numerical properties. If no property names were identified, return a character vector of length 0.
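For the generic case described in Details (x of neither device class), the idea of keeping every numeric column can be sketched in base R. The data frame below is made up, and this sketch does not reproduce the class-specific rules for proximate_data or proxiscout_data.

```r
# Toy data frame (made up): one character column and two numeric columns
x <- data.frame(
  sample_id = c("a", "b", "c"),
  THC = c(0.5, NA, 1.2),
  CBD = c(3.1, 2.8, NA),
  stringsAsFactors = FALSE
)

# Keep only the names of columns holding numeric values (NA allowed)
property_names <- names(x)[vapply(x, is.numeric, logical(1))]
property_names
```

When no column is numeric, the same expression yields a character vector of length 0, in line with the documented return value.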


Fitting method constructors

Description

These functions create configuration objects that specify the regression method to be used within calibrate.

Usage

fit_plsr(ncomp, type = c("nwp", "standard", "modified"))

fit_xlsr(ncomp, type = c("nwp", "standard", "modified"), min_w = 3, max_w = 15)

Arguments

ncomp

a positive integer indicating the maximum number of PLS components to use.

type

a character string indicating the algorithm variant. One of "nwp" (default), "standard", or "modified".

  • "nwp": replicates the NIRWise PLUS method, which uses correlation-based weights with an additional slope correction applied to the weights and scores.

  • "standard": standard PLS using standardised covariances between spectra and reference values as weights.

  • "modified": modified PLS using correlations between spectra and reference values as weights.

min_w

a positive integer indicating the minimum window size for the XLS algorithm. Default is 3.

max_w

a positive integer indicating the maximum window size for the XLS algorithm. Must be greater than min_w. Default is 15.

Details

There are two regression methods available:

Partial least squares (fit_plsr)

Uses PLS regression. The only parameter optimised is the number of components (ncomp). Three algorithm variants are available via type: "nwp", "standard", and "modified".

Extended partial least squares (fit_xlsr)

Uses the XLS algorithm. In addition to ncomp and type, the window range (min_w, max_w) controls the local smoothing applied within the algorithm.

Value

An object of class c("fit_plsr", "fit_constructor") or c("fit_xlsr", "fit_constructor") containing the specified parameters, to be passed to calibrate.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

calibrate, calibrate_models

Examples

# PLS as in NIRWise PLUS
fit_plsr(ncomp = 15)

# Standard PLS with 15 components
fit_plsr(ncomp = 15, type = "standard")

# Modified PLS with 15 components
fit_plsr(ncomp = 15, type = "modified")

# XLS as in NIRWise PLUS
fit_xlsr(ncomp = 10)

# Standard XLS with custom window range
fit_xlsr(ncomp = 10, type = "standard", min_w = 5, max_w = 20)


get file name

Description

internal

Usage

get_fname(x)

Arguments

x

a file name

Value

A character string with the file name without path or extension.


get the tsv info

Description

internal

Usage

get_info_tsvs(files)

Arguments

files

a vector of tsv files with their paths

Value

A data frame containing information about the TSV files.


Summary of spectral_model

Description

Gets a summary of an object of class spectral_model

Usage

get_model_summary(x, ...)

Arguments

x

an object of class spectral_model (as returned by the calibrate function).

...

arguments to be passed to methods (not functional).

Value

A list containing a summary of the model.

Author(s)

Leonardo Ramirez-Lopez


get info from nad files

Description

internal

Usage

get_nad_info(file)

Arguments

file

a nad file

Value

A list containing metadata extracted from the NAD file, including properties, units, geometry, and measurement mode information.


ProxiScout standard wavenumbers

Description

Returns the standard wavenumbers used by ProxiScout NIR scanners.

Usage

get_proxiscout_wavenumbers()

Details

The standard wavenumbers of ProxiScout (see https://www.si-ware.com/) NIR scanners range from approximately 3921.569 cm^{-1} to 7407.407 cm^{-1} in steps (resolution) of around 13.61655 cm^{-1}. This is equivalent to a spectral range of 1350 to 2550 nm, with a varying resolution that starts from 2.486189 nm at 1350 nm and ends with a resolution of 8.823525 nm at 2550 nm.

Value

A numeric vector containing the standard wavenumbers of ProxiScout NIR scanners.

Examples

# Get the complete set of ProxiScout wavenumbers
wavs <- get_proxiscout_wavenumbers()

# Get the corresponding wavelengths (nm)
wavelengths_nm <- 10000000 / wavs

# Display the range of wavenumbers
range(wavs)

A function to assign values to sample distribution strata

Description

For internal use only! This function takes a continuous variable, creates n strata based on its distribution and assigns the corresponding stratum to every value.

Usage

get_sample_strata(y, n = NULL, probs = NULL)

Arguments

y

a matrix of one column with the response variable.

n

the number of strata.

Value

a data table with the input y and the corresponding strata to every value.
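One way to picture distribution-based strata is to cut y at its quantiles. The sketch below is an assumption about the general idea only: the helper assign_strata is hypothetical, and the package may build its strata differently.

```r
# Hypothetical sketch: assign each value of y to one of n quantile-based strata
assign_strata <- function(y, n) {
  breaks <- quantile(y, probs = seq(0, 1, length.out = n + 1), na.rm = TRUE)
  data.frame(
    y = y,
    # unique() guards against duplicated break points in heavily tied data
    strata = cut(y, breaks = unique(breaks), include.lowest = TRUE, labels = FALSE)
  )
}

y <- c(1, 2, 3, 4, 5, 6, 7, 8)
strata_table <- assign_strata(y, n = 4)
strata_table
```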


A function for stratified calibration/validation sampling

Description

for internal use only! This function selects samples based on provided strata.

Usage

get_samples_from_strata(
  y,
  original_order,
  strata,
  samples_per_strata,
  sampling_for = c("calibration", "validation"),
  replacement = FALSE
)

Arguments

y

the vector of reference values

original_order

a matrix of one column with the response variable.

strata

the number of strata.

sampling_for

sampling to select the calibration samples ("calibration") or sampling to select the validation samples ("validation").

replacement

logical indicating if sampling with replacement must be done.

Value

a list with the indices of the calibration and validation samples.


Test if a string can be coerced to a numeric

Description

Based on the code found at https://stackoverflow.com/a/21154566/2292993

Usage

is_numeric_like(
  x,
  na_strings = c("", ".", "NA", "na", "N/A", "n/a", "NaN", "nan")
)

Value

A logical vector indicating whether each element can be coerced to numeric.


Function to locate the serial number index

Description

This function locates the index of a given serial number in a list of serial numbers.

Usage

locate_serialnumber_index(serial_numbers, serialnumber)

Arguments

serial_numbers

A vector of serial numbers.

serialnumber

A single serial number to be located within the list.

Details

The function first checks if the list of serial numbers is empty and returns 1 if true. If the serial number is found in the list, it returns the last index where the serial number appears. If the serial number is not found, it calculates the ASCII value difference between the given serial number and each element in the list using the string_diff function. If all differences are larger than 256^3, it returns 1. Otherwise, it returns the index of the element with the smallest ASCII value difference.

Value

The index of the serial number in the list, or the index of the closest match based on ASCII value difference.


Converts a matrix into a string in style of .cal files

Description

Converts a matrix into a string in style of .cal files

Usage

matrix_cal_string(object)

Arguments

object

matrix to be converted into string

Value

character of length 1, where rows are separated by ';' and the values within each row are separated by ','


Converts a matrix into a string in style of .prj files

Description

Converts a matrix into a string in style of .prj files

Usage

matrix_prj_string(object, transp = FALSE)

Arguments

object

matrix to be converted into string

transp

a logical. Should the matrix be transposed?

Value

character of length 1, where rows are separated by "\n" and the values within each row are separated by "\t"
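The documented output layout can be reproduced with a few lines of base R. This is a sketch of the format only: the helper to_prj_string is hypothetical, and the number formatting used by the package when writing .prj files may differ.

```r
# Hypothetical helper: tab-separated values within a row, newline between rows
to_prj_string <- function(object, transp = FALSE) {
  if (transp) object <- t(object)
  paste(apply(object, 1, paste, collapse = "\t"), collapse = "\n")
}

m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)

s_plain  <- to_prj_string(m)                 # "1\t2\t3\n4\t5\t6"
s_transp <- to_prj_string(m, transp = TRUE)  # "1\t4\n2\t5\n3\t6"
```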


A function to construct optimal strata for the samples, based on the distribution of the given y

Description

for internal use only! This function computes the optimal strata from the distribution of the given y

Usage

optim_sample_strata(y, n)

Arguments

y

a matrix of one column with the response variable.

n

number of samples that must be sampled.

Value

a list with two data frames: sample_strata contains the optimal strata, whereas samples_to_get contains information on how many samples per stratum are supposed to be drawn.


Get the package version info

Description

returns package info.

Usage

pkg_info(pkg = "proximetricsR")

Arguments

pkg

the package name, i.e. "proximetricsR"

Value

A matrix containing package version and related information from the DESCRIPTION file.


Plot results of a given model

Description

Creates an HTML file containing a number of useful analytical plots, rendered from the Quarto file "model_plot_template.qmd", for the given model x of class spectral_model.

Usage

## S3 method for class 'spectral_model'
plot(
  x, validations = NULL, output_file = x$target_variable,
  output_dir = NULL,
  spectral = c("weights", "coefficients", "scores", "mahalanobis"),
  cv = c("error", "response", "residuals", "qq", "distributions"),
  regression = NULL,
  validation = if (!is.null(validations)) "all" else NULL,
  verbose = TRUE, open_file = TRUE, ...
)

Arguments

x

an object of class "spectral_model". This model should be generated using the calibrate function.

validations

an optional object of class "spectral_validation". This object, if provided, should be generated using validate_prediction. Default is NULL.

output_file

a character string for the name of the generated file. Default is the target name saved in model x.

output_dir

a string for the directory in which the file is generated. Default is NULL, which writes the file to tempdir().

spectral

a character vector of spectral plots to include, "all" to include every spectral plot, or NULL to skip the section entirely. Available names: "raw", "preprocessed", "weights", "loadings", "coefficients", "scores", "scores_3d", "scaled_scores", "mahalanobis".

cv

a character vector of cross-validation plots to include, "all" to include every CV plot, or NULL to skip the section entirely. Available names: "error", "response", "response_overview", "residuals", "qq", "q_values", "distributions".

regression

a character vector of regression analysis plots to include, "all" to include every regression plot, or NULL (default) to skip the section entirely. Available names: "response", "response_overview", "residuals", "qq", "residuals_vs_fitted", "scale_location", "leverage".

validation

a character vector of validation plots to include, "all" to include every validation plot, or NULL to skip. Available names: "predicted_vs_reference". Defaults to "all" when validations is supplied, NULL otherwise.

verbose

a logical. When TRUE (default), prints the path of the generated file. Pandoc output is always suppressed.

open_file

a logical, indicating whether the file should automatically be opened in a browser after compilation. Defaults to TRUE.

...

additional graphical parameters. See details.

Details

This function creates an HTML file by rendering the Quarto file 'model_plot_template.qmd' using quarto::quarto_render(). This generates an .html file named after output_file in the directory specified by output_dir. Note that any existing file with the same name in that directory will be overwritten.

The file opens automatically in the default browser of the system if open_file is set to TRUE.

Depending on the size of the provided dataset, the produced file might take a long time to process, and the files can quickly get quite large. The four section arguments (spectral, cv, regression, validation) control which plots are included. Each accepts a character vector of plot names, "all" to include the entire section, or NULL to skip it. For example, to render every available plot:

plot(x, spectral = 'all', cv = 'all', regression = 'all',
     validation = 'all')

The available plots per section are as follows (defaults marked with *):

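spectral

"raw", "preprocessed", "weights"*, "loadings", "coefficients"*, "scores"*, "scores_3d", "scaled_scores", "mahalanobis"*

cv

"error"*, "response"*, "response_overview", "residuals"*, "qq"*, "q_values", "distributions"*

Only available if the calibration used cross-validation. For leave-group-out cross-validation, only "error" is available.

regression

"response", "response_overview", "residuals", "qq", "residuals_vs_fitted", "scale_location", "leverage" (none by default)

These plots do not necessarily indicate model performance: more components generally improve the fit but may overfit. They are useful for identifying outliers and are similar to plot.lm.

validation

"predicted_vs_reference"* (default only when validations is supplied)

Only available when validations is supplied (an object of class spectral_validation from validate_prediction).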

Most of the above plots contain a slider, which may be used to adjust the number of components considered. The sliders start at the optimal number of components (if any calibration control was applied) or at the maximum number of components (otherwise).

The plots are constructed with the help of the plotly package. As such, the possibilities to manipulate the plots are as in that package. The arrangement of the plots is controlled by the quarto package.

Additional graphical parameters may be supplied to this function through the ellipsis argument .... These arguments are passed to some of the scatter and layout functions of plotly, more precisely as attributes of the add_trace and layout functions. However, the following arguments are always ignored:
c("p", "sliders", "x", "x0", "dx", "y", "y0", "dy", "visible", "type", "name", "hovertext", "text", "mode"),
as well as arguments that can be passed to both add_trace and layout. The "line" attribute is ignored when plotting markers and vice versa. Some plots ignore the ellipsis argument altogether.

Possible attributes of these functions may be found by using the function schema of plotly.

Value

NULL. The plots are written to the generated HTML file, which is opened in a browser window if open_file = TRUE.

Author(s)

Claudio Orellano, Leonardo Ramirez-Lopez

Examples


data("NIRcannabis")
control <- calibration_control(validation_type = "kfold", number = 3, folds = "sequential")
prepro_recipe <- preprocess_recipe(
  prep_resample(grid = c(1001, 1700, 2)),
  prep_snv(),
  prep_derivative(m = 1, w = 9, p = 7, algorithm = "nwp"),
  device = "proximate"
)
skips <- c(5, 13, 21, 73)
my_model <- calibrate(CBDA ~ spc,
  data = NIRcannabis, preprocess = prepro_recipe,
  method = fit_plsr(15), control = control, skip = skips, verbose = FALSE
)

plot(my_model, output_dir = tempdir())
# Include every available plot in every section
plot(my_model, output_dir = tempdir(),
  spectral = "all", cv = "all", regression = "all", validation = "all"
)
# Custom section selection
plot(
  my_model,
  output_file = "example_plot",
  output_dir = tempdir(),
  spectral = c("weights", "scores"),
  cv = "all",
  regression = NULL
)
# Make predictions and validate
preds <- predict(my_model, NIRcannabis[skips, ])
validations <- validate_prediction(preds, NIRcannabis$CBDA[skips])
# Plot validation section only
plot(
  my_model,
  output_dir = tempdir(),
  output_file = "example_plot",
  validations = validations,
  spectral = NULL,
  cv = NULL,
  regression = NULL
)


Derivative constructor for spectral preprocessing

Description

Creates a preprocessing constructor for computing first or second order derivatives of spectral data. The constructor is intended to be passed to preprocess_recipe and executed via process.

Three algorithms are supported: Savitzky-Golay ("savitzky-golay"), Norris-Gap/Gap-Segment ("gap-segment"), and the derivative pre-treatment from BUCHI NIRWise PLUS software ("nwp").

Usage

prep_derivative(m, w, p, algorithm = c("savitzky-golay", "gap-segment", "nwp"))

Arguments

m

An integer indicating the derivative order. Must be 1 (first derivative) or 2 (second derivative).

w

A positive odd integer indicating the filter window size. For "gap-segment", w indicates the gap size (spacing between points over which the derivative is computed).

p

An integer. For "savitzky-golay", indicates the polynomial order and must satisfy p < w and p >= m. For "gap-segment" and "nwp", indicates the segment or smoothing window size and must be a positive odd integer.

algorithm

A character string specifying the algorithm. One of "savitzky-golay" (default), "gap-segment", or "nwp". See Details.

Details

Savitzky-Golay ("savitzky-golay"): fits a polynomial of order p within a moving window of size w and differentiates analytically. Implemented via savitzkyGolay.

Gap-Segment ("gap-segment"): computes the derivative over a gap of w points, with optional averaging over a segment of p points. When p = 1 this reduces to the standard Norris-Gap derivative. Implemented via gapDer.

NWP ("nwp"): reproduces the "DG" derivative pre-treatment from BUCHI NIRWise PLUS calibration software. A moving average of window p is applied first (pre-smoothing), followed by differentiation. For first order, a gap derivative with gap w is used. For second order, a centered second difference with spacing half_w is computed:

\[d^2x_i = \frac{2x_i - (x_{i+h} + x_{i-h})}{2h}\]

where \(h = half_w\). Edge columns affected by the window are removed from the output.

For the "nwp" algorithm, the NIRWise PLUS half-window conventions are: \[half_w = (w + 1) / 2\] \[half_s = (p - 1) / 2\] These are stored internally for device file serialization and are not user-facing parameters.
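The second-order formula above can be checked on a toy vector with base R. This sketch covers only the differencing step exactly as written in the formula: the helper second_diff is hypothetical, the pre-smoothing step is omitted, and h is passed directly rather than derived from w.

```r
# Hypothetical helper: centered second difference with spacing h, written
# exactly as in the formula above. The h points on each edge are dropped.
second_diff <- function(x, h) {
  stopifnot(length(x) > 2 * h)
  idx <- (h + 1):(length(x) - h)
  (2 * x[idx] - (x[idx + h] + x[idx - h])) / (2 * h)
}

# Toy "spectrum": a pure quadratic, so the result is constant
x <- (1:9)^2
d2 <- second_diff(x, h = 2)
d2
```

For a quadratic \(x_i = i^2\) the numerator equals \(-2h^2\), so every retained value is \(-h\); with h = 2 the nine input points give five output values, reflecting the removed edge columns.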

Value

An object of class preprocessing to be used in preprocess_recipe and executed by process. The object is a list containing the method name, all parameters, and (for algorithm = "nwp") the NIRWise PLUS half-window values (half_w, half_s) required for device file serialization.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

preprocess_recipe, process

Examples

data("NIRcannabis")
X <- NIRcannabis$spc

# Savitzky-Golay first derivative, window 11, polynomial order 3
sg <- prep_derivative(m = 1, w = 11, p = 3, algorithm = "savitzky-golay")

# Gap-Segment second derivative, gap 9, segment 3
gs <- prep_derivative(m = 2, w = 9, p = 3, algorithm = "gap-segment")

# NWP first derivative, window 5, pre-smoothing 11
nwp <- prep_derivative(m = 1, w = 5, p = 11, algorithm = "nwp")

# Apply via preprocess_recipe
recipe <- preprocess_recipe(sg, device = "unspecified")
X_der <- process(X, recipe)


Detrending constructor for spectral preprocessing

Description

Creates a preprocessing constructor for detrending spectral data. The constructor is intended to be passed to preprocess_recipe and executed via process.

Usage

prep_detrend(p = 2)

Arguments

p

A positive integer specifying the polynomial order used for fitting. Must be >= 1. Default is 2.

Details

For each spectrum, a polynomial of order p is fitted using the column wavelengths as the explanatory variable (or integer indices if column names are not numeric). The residuals from this fit are returned as the detrended spectrum, removing wavelength-dependent baseline effects.

This constructor always performs pure polynomial detrending without a prior SNV transformation. Users who want the full Barnes et al. (1989) procedure (SNV followed by detrending) should chain prep_snv before prep_detrend in their recipe.

The computation is delegated to detrend with snv = FALSE.
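The idea of polynomial detrending (keep the residuals of a polynomial fit against wavelength) can be sketched for a single spectrum in base R. The helper detrend_one and the toy spectrum are hypothetical; the package delegates the actual computation to detrend, whose internals may differ.

```r
# Hypothetical helper: fit a polynomial of order p to one spectrum and
# return the residuals as the detrended spectrum.
detrend_one <- function(spectrum, wav, p = 2) {
  fit <- lm(spectrum ~ poly(wav, degree = p))
  unname(residuals(fit))
}

wav <- seq(1000, 1100, by = 10)

# Toy spectrum (made up): quadratic baseline plus a small alternating signal
signal   <- rep(c(0.01, -0.01), length.out = length(wav))
baseline <- 0.5 + 0.002 * (wav - 1000) + 1e-5 * (wav - 1000)^2
detrended <- detrend_one(baseline + signal, wav, p = 2)

# A pure quadratic baseline is removed completely (up to rounding)
baseline_only <- detrend_one(baseline, wav, p = 2)
```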

Value

An object of class preprocessing to be used in preprocess_recipe and executed by process.

Author(s)

Leonardo Ramirez-Lopez

References

Barnes RJ, Dhanoa MS, Lister SJ. 1989. Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Applied Spectroscopy, 43(5): 772-777.

See Also

prep_snv, preprocess_recipe, process

Examples

data("NIRcannabis")
X <- NIRcannabis$spc

# Pure polynomial detrend
dt <- prep_detrend(p = 2)
recipe <- preprocess_recipe(dt, device = "unspecified")
X_dt <- process(X, recipe)

# Barnes et al. (1989): SNV followed by detrend
recipe_barnes <- preprocess_recipe(
  prep_snv(), prep_detrend(p = 2), device = "unspecified"
)
X_barnes <- process(X, recipe_barnes)


Resampling constructor for spectral preprocessing

Description

Creates a preprocessing constructor for resampling spectral data to a new wavelength grid. The constructor is intended to be passed to preprocess_recipe and executed via process.

Usage

prep_resample(grid)

Arguments

grid

Either a numeric vector of length 3 specifying the target wavelength grid as c(min_wav, max_wav, resolution), or the character string "proxiscout" to resample to the standard NeoSpectra wavenumber grid (see Details).

When grid is a numeric vector:

  • grid[1] (min_wav): minimum wavelength of the target grid.

  • grid[2] (max_wav): maximum wavelength; must be greater than min_wav.

  • grid[3] (resolution): spacing between wavelengths; must be positive.

Extrapolation beyond the range of the input wavelengths is never allowed.

Details

User-defined grid (grid = c(min_wav, max_wav, resolution)): resamples spectra to the specified target grid using natural spline interpolation via resample. Column names of X must be coercible to numeric wavelength values. This mode is compatible with the "proximate" device.

NeoSpectra grid (grid = "proxiscout"): resamples spectra to the standard wavenumber grid of NeoSpectra NIR scanners (approx. 3921.569 to 7407.407 cm^{-1}, ~256 channels at ~13.617 cm^{-1} steps). Only wavenumbers overlapping with the input range are retained. This mode is compatible with the "proxiscout" device.
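To illustrate the user-defined grid, the sketch below expands c(min_wav, max_wav, resolution) into target wavelengths and interpolates one toy spectrum with a natural spline. It uses stats::spline as a stand-in and builds the grid with seq(); both are assumptions for illustration and may not match the package's internals.

```r
# Toy input spectrum on a 5 nm grid (made up)
wav_in   <- seq(1000, 1700, by = 5)
spectrum <- sin(wav_in / 100)

# Target grid built from c(min_wav, max_wav, resolution)
grid    <- c(1001, 1700, 2)
wav_out <- seq(grid[1], grid[2], by = grid[3])

# No extrapolation: the target grid must lie within the input range
stopifnot(min(wav_out) >= min(wav_in), max(wav_out) <= max(wav_in))

# Natural spline interpolation onto the target wavelengths
resampled <- spline(wav_in, spectrum, xout = wav_out, method = "natural")$y
```

Because 1700 is not reachable from 1001 in steps of 2, seq() stops at 1699 in this sketch, giving 350 target wavelengths.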

Value

An object of class preprocessing to be used in preprocess_recipe and executed by process.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

preprocess_recipe, process, get_proxiscout_wavenumbers

Examples

data("NIRcannabis")
X <- NIRcannabis$spc

# User-defined grid (proximate)
rs <- prep_resample(grid = c(1001, 1700, 2))
recipe <- preprocess_recipe(rs, device = "proximate")
X_rs <- process(X, recipe)


Smoothing constructor for spectral preprocessing

Description

Creates a preprocessing constructor for smoothing spectral data. The constructor is intended to be passed to preprocess_recipe and executed via process.

Two algorithms are supported: Savitzky-Golay ("savitzky-golay") and moving average ("moving-average").

Usage

prep_smooth(w, p = NULL, algorithm = c("savitzky-golay", "moving-average"))

Arguments

w

A positive odd integer specifying the filter window size.

p

An integer specifying the polynomial order. Required when algorithm = "savitzky-golay". Must satisfy p < w and p >= 0. Ignored for "moving-average".

algorithm

A character string specifying the smoothing algorithm. One of "savitzky-golay" (default) or "moving-average". See Details.

Details

Savitzky-Golay ("savitzky-golay"): fits a polynomial of order p within a moving window of size w and returns the zero-order coefficient (i.e. the smoothed value). Implemented via savitzkyGolay with m = 0.

Moving average ("moving-average"): computes a simple moving average of window size w using movav. Edge values are handled using progressively narrower windows so the output has the same number of columns as the input. This reproduces the "Smooth" pre-treatment from BUCHI NIRWise PLUS.

For "moving-average", the NIRWise PLUS half-window convention is: \[half_w = (w - 1) / 2\] stored internally for device file serialization and not user-facing.
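A moving average that keeps the number of points by narrowing the window at the edges can be sketched as follows. The helper moving_average is hypothetical, and shrinking the window symmetrically around each edge point is an assumption; the package's edge handling may differ in detail.

```r
# Hypothetical helper: moving average of window w over a single spectrum,
# using the half-window convention half_w = (w - 1) / 2.
moving_average <- function(x, w) {
  half_w <- (w - 1) / 2
  n <- length(x)
  vapply(seq_len(n), function(i) {
    # Shrink the window symmetrically near the edges (assumption)
    k <- min(half_w, i - 1, n - i)
    mean(x[(i - k):(i + k)])
  }, numeric(1))
}

x  <- c(1, 2, 4, 8, 16, 32, 64)
ma <- moving_average(x, w = 3)
ma
```

The output has the same length as the input; the first and last values are left unchanged because their window shrinks to a single point.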

Value

An object of class preprocessing to be used in preprocess_recipe and executed by process. The object is a list containing the method name and all parameters. For algorithm = "moving-average", the NIRWise PLUS half-window value (half_w) is also stored for device file serialization.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

preprocess_recipe, process

Examples

data("NIRcannabis")
X <- NIRcannabis$spc

# Savitzky-Golay smoothing, window 11, polynomial order 3
sg <- prep_smooth(w = 11, p = 3, algorithm = "savitzky-golay")

# Moving average smoothing, window 7
ma <- prep_smooth(w = 7, algorithm = "moving-average")

# Apply via preprocess_recipe
recipe <- preprocess_recipe(sg, device = "proxiscout")
X_smooth <- process(X, recipe)


Standard Normal Variate constructor for spectral preprocessing

Description

Creates a preprocessing constructor for applying Standard Normal Variate (SNV) normalisation to spectral data. The constructor is intended to be passed to preprocess_recipe and executed via process.

Usage

prep_snv()

Details

SNV normalises each spectrum row-wise by subtracting its mean and dividing by its standard deviation:

\[SNV_i = \frac{x_i - \bar{x}_i}{s_i}\]

where \(x_i\) is the signal of the \(i\)th observation, \(\bar{x}_i\) is its mean and \(s_i\) its standard deviation. Implemented via standardNormalVariate.
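The formula above can be sketched in base R as follows. The helper snv_sketch is hypothetical and only illustrates the arithmetic; it assumes the sample standard deviation (n - 1 denominator, as computed by sd()) and is not the package's implementation.

```r
# Hypothetical helper (not part of the package): row-wise SNV.
# Assumption: the standard deviation uses the n - 1 denominator of sd().
snv_sketch <- function(X) {
  stopifnot(is.matrix(X), ncol(X) > 1)
  row_mean <- rowMeans(X)
  row_sd <- apply(X, 1, sd)
  # Vectors of length nrow(X) are recycled down the columns,
  # so each row is centered and scaled by its own statistics.
  (X - row_mean) / row_sd
}

X_demo <- matrix(c(1, 2, 3, 4,
                   10, 20, 40, 80), nrow = 2, byrow = TRUE)
X_snv_demo <- snv_sketch(X_demo)
```

After the transformation every row has mean 0 and standard deviation 1, which can be checked with rowMeans(X_snv_demo) and apply(X_snv_demo, 1, sd).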

Value

An object of class preprocessing to be used in preprocess_recipe and executed by process.

Author(s)

Leonardo Ramirez-Lopez with code from Antoine Stevens

References

Barnes RJ, Dhanoa MS, Lister SJ. 1989. Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Applied spectroscopy, 43(5): 772-777.

See Also

preprocess_recipe, process

Examples

data("NIRcannabis")
X <- NIRcannabis$spc

snv <- prep_snv()
recipe <- preprocess_recipe(snv)
X_snv <- process(X, recipe)


Reflectance/absorbance conversion constructor for spectral preprocessing

Description

Creates a preprocessing constructor for converting spectral data between reflectance and absorbance. The constructor is intended to be passed to preprocess_recipe and executed via process.

Usage

prep_transform(to = c("absorbance", "reflectance"))

Arguments

to

A character string specifying the target unit. Either "absorbance" (default) or "reflectance".

Details

Conversion follows Beer's Law:

\[A = -\log_{10}(R)\]

where \(A\) is absorbance and \(R\) is reflectance.

When converting to absorbance, all values in X must be strictly positive. A warning is issued if the resulting absorbance contains small negative values, which may indicate precision or scaling issues in the input.

Note that no check is performed on whether the input is actually in the expected unit (the transformation is applied as specified).
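A minimal base R sketch of the conversion in both directions is given here for illustration. The two helper functions are hypothetical names and not part of the package; the inverse \(R = 10^{-A}\) follows directly from the formula above.

```r
# Hypothetical helpers (not part of the package).
reflectance_to_absorbance <- function(R) {
  # log10 is only defined for strictly positive reflectance values
  stopifnot(all(R > 0))
  -log10(R)
}

absorbance_to_reflectance <- function(A) {
  10^(-A)
}

R_demo <- matrix(c(0.5, 0.1, 0.25, 1), nrow = 2)
A_demo <- reflectance_to_absorbance(R_demo)
R_back <- absorbance_to_reflectance(A_demo)
```

Applying one conversion after the other recovers the input up to floating-point precision. As noted above, neither direction can tell whether the input is actually in the expected unit.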

Value

An object of class preprocessing to be used in preprocess_recipe and executed by process.

Author(s)

Leonardo Ramirez-Lopez

See Also

preprocess_recipe, process

Examples

data("NIRcannabis")
X <- NIRcannabis$spc # absorbance

tr <- prep_transform(to = "reflectance")
recipe <- preprocess_recipe(tr, device = "proxiscout")
X_ref <- process(X, recipe)


Wavelength trimming constructor for spectral preprocessing

Description

Creates a preprocessing constructor for trimming spectral data to a specified wavelength band. The constructor is intended to be passed to preprocess_recipe and executed via process.

Usage

prep_wav_trim(
  band,
  trim_constant_edges = FALSE
)

Arguments

band

A numeric vector of length 2 giving the minimum and maximum wavenumber/wavelength to retain. Columns of X outside this range are dropped. Pass c() (empty vector) to skip band trimming and only apply trim_constant_edges.

trim_constant_edges

A logical. If TRUE, constant or zero-valued columns at the left and right edges are removed after band trimming. Default is FALSE.

Details

Band trimming retains only those columns whose names (coerced to numeric) fall within [min(band), max(band)]. If no columns fall within the band the original matrix is returned with a warning.

Constant edge trimming scans inward from each edge and drops columns that are identical to their immediate neighbour or are all zero. If trimming would leave fewer than two columns the step is skipped with a warning.
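The band trimming rule can be sketched in base R as shown here. The helper band_trim_sketch is hypothetical and covers only the band step (not constant edge trimming); it assumes that the column names of X are coercible to numeric, as the text requires.

```r
# Hypothetical helper (not part of the package): keep only the columns
# whose numeric names fall within [min(band), max(band)].
band_trim_sketch <- function(X, band) {
  wav <- as.numeric(colnames(X))
  keep <- wav >= min(band) & wav <= max(band)
  if (!any(keep)) {
    warning("no columns fall within the band; returning X unchanged")
    return(X)
  }
  X[, keep, drop = FALSE]
}

wav_demo <- as.character(seq(900, 1900, by = 200))
X_demo <- matrix(seq_len(12), nrow = 2, dimnames = list(NULL, wav_demo))
X_trim_demo <- band_trim_sketch(X_demo, band = c(1000, 1800))
```

Because min() and max() are used, the order of the two values in band does not matter in this sketch.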

Value

An object of class preprocessing to be used in preprocess_recipe and executed by process.

Author(s)

Claudio Orellano and Leonardo Ramirez-Lopez

See Also

preprocess_recipe, process

Examples

data("NIRcannabis")
X <- NIRcannabis$spc

tr <- prep_wav_trim(band = c(1000, 1800))
recipe <- preprocess_recipe(tr, device = "proxiscout")
X_trim <- process(X, recipe)


Create a string from a single preprocess step

Description

Create a string from a single preprocess step

Usage

prepro_to_string(p)

Arguments

p

an object of class preprocess_recipe

Value

a string with the preprocess parameters added


Build and execute spectral preprocessing recipes

Description

The preprocess_recipe function assembles an ordered sequence of preprocessing steps into a recipe, while process executes the recipe on a spectral data matrix.

Usage

preprocess_recipe(..., device)

process(X, recipe, device)

Arguments

...

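one or more objects of class preprocessing as returned by any of the following constructor functions: prep_smooth, prep_snv, prep_derivative, prep_resample, prep_detrend, prep_transform, prep_wav_trim.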

The order in which the objects are provided defines the order of execution. If no arguments are provided, an empty recipe is returned and process will return the input data unchanged.

device

a character string specifying the target device: "unspecified" (no validation), "proximate", or "proxiscout". When "proximate" or "proxiscout" is specified, preprocess_recipe validates that all steps are compatible with that device and raises an informative error if not. Pass "unspecified" to skip validation explicitly.

device is required whenever the recipe contains any preprocessing step, with one exception: a recipe containing only a single prep_snv step does not require device, because SNV is device-agnostic (identical behaviour for both "proximate" and "proxiscout"). In that case device defaults to "unspecified".

X

a numeric matrix of spectral data to be preprocessed (samples in rows, wavelengths in columns).

recipe

an object of class preprocess_recipe as returned by preprocess_recipe. A single object of class preprocessing is also accepted and treated as a one-step recipe.

Value

For preprocess_recipe, an object of class preprocess_recipe with three components: steps (the ordered list of preprocessing step objects), device (the target device string), and preprocessing_order (a simplified string summarising the sequence of applied transformations).

For process, a numeric matrix of preprocessed spectral data. The applied recipe is stored as the attribute "preprocess_recipe" on the returned matrix and can be retrieved with attr(result, "preprocess_recipe").
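To illustrate why the order of the steps matters, the idea of an ordered recipe can be sketched in base R as a list of functions applied one after the other. The helper apply_steps and the two toy steps are hypothetical; they are not the internals of preprocess_recipe or process and perform no device validation.

```r
# Hypothetical helper (not part of the package): apply a list of
# matrix -> matrix functions in the order in which they are given.
apply_steps <- function(X, steps) {
  Reduce(function(acc, step) step(acc), steps, X)
}

# Two toy steps operating on rows (samples) of a spectral matrix
center_rows <- function(X) X - rowMeans(X)
first_diff <- function(X) t(diff(t(X)))

X_demo <- matrix(c(1, 2, 4, 7,
                   2, 2, 3, 5), nrow = 2, byrow = TRUE)

res_a <- apply_steps(X_demo, list(center_rows, first_diff))
res_b <- apply_steps(X_demo, list(first_diff, center_rows))
res_empty <- apply_steps(X_demo, list())
```

In this sketch res_a and res_b differ, since centering before differencing is not the same as centering afterwards, and an empty list of steps returns the input unchanged, mirroring the behaviour described for an empty recipe.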

Author(s)

Leonardo Ramirez-Lopez

See Also

prep_smooth, prep_snv, prep_derivative, prep_resample, prep_detrend, prep_transform, prep_wav_trim

Examples

data("NIRcannabis")
X <- NIRcannabis$spc

# SNV alone — no device needed (SNV is device-agnostic)
recipe_snv <- preprocess_recipe(prep_snv())
X_snv <- process(X, recipe_snv)

# Any other combination requires device
recipe <- preprocess_recipe(
  prep_smooth(w = 7, p = 1, algorithm = "savitzky-golay"),
  prep_snv(),
  prep_derivative(m = 1, w = 5, p = 2, algorithm = "savitzky-golay"),
  device = "proxiscout"
)

X_proc <- process(X, recipe)
attr(X_proc, "preprocess_recipe")

Print method for an object of class nax

Description

Prints the contents of an object of class nax

Usage

## S3 method for class 'nax'
print(x, ...)

Arguments

x

an object of class nax

...

not yet functional.

Value

No return value, called for side effects.

Author(s)

Leonardo Ramirez-Lopez


Print method for an object of class spectral_fit

Description

Prints the content of an object of class spectral_fit

Usage

## S3 method for class 'spectral_fit'
print(x, ...)

Arguments

x

an object of class spectral_fit (as returned by the .estimate_model function).

...

arguments to be passed to methods (not functional).

Value

Returns x invisibly.

Author(s)

Leonardo Ramirez-Lopez


Print method for an object of class spectral_model

Description

Prints the content of an object of class spectral_model

Usage

## S3 method for class 'spectral_model'
print(x, ...)

Arguments

x

an object of class spectral_model (as returned by the calibrate function).

...

arguments to be passed to methods (not functional).

Value

No return value, called for side effects.

Author(s)

Leonardo Ramirez-Lopez


Print method for an object of class spectral_multimodel

Description

Prints the content of an object of class spectral_multimodel

Usage

## S3 method for class 'spectral_multimodel'
print(x, ...)

Value

Returns x invisibly.


Print method for an object of class spectral_prediction

Description

Prints the content of an object of class spectral_prediction

Usage

## S3 method for class 'spectral_prediction'
print(x, ...)

Arguments

x

an object of class spectral_prediction (as returned by the predict function).

...

arguments to be passed to methods (not functional).

Value

No return value, called for side effects.

Author(s)

Claudio Orellano


Print method for an object of class spectral_validation

Description

Prints the content of an object of class spectral_validation

Usage

## S3 method for class 'spectral_validation'
print(x, ...)

Arguments

x

an object of class spectral_validation (as returned by the validate_prediction function).

...

arguments to be passed to methods (not functional).

Value

No return value, called for side effects.

Author(s)

Claudio Orellano


Prepare data for augmenting a nax application

Description

This function collects all the data required prior to updating a nax application.

Usage

proximate_add2nax(formulas = NULL, data, metadata_list = NULL, skip_indices_list = NULL)

Arguments

formulas

a list containing one or more objects of class formula where each of them represents the model to be calibrated.

data

a data.frame containing the data of the variables in the model (as in the calibrate function).

metadata_list

a list containing the specifications for the metadata of each model in formulas, given in the same order. Each element in the list should be defined as in the metadata argument of calibrate using the add_model_metadata function. Defaults to NULL.

skip_indices_list

a list of vectors of integers for the indices in the input data to be skipped for the computation of each of the models in formulas. The vectors in this list must be provided in the same order as their corresponding counterparts in formulas. Defaults to NULL. If a list is passed, its components must be filled with numeric() for those formulas where there are no indices to be skipped.

Value

A list mirroring the objects passed to the function.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

proximate_recalibrate_nax

calibrate,

preprocess_recipe,

fit_plsr,

fit_xlsr,

calibration_control,

calibrate_models


Create a data frame for NIRWise PLUS applications

Description

Create a data frame of class "proximate_data", similar to proximate_read_data, but without the need for a file. Instead, data can be supplied directly from R.

Usage

proximate_data(
  spc, id, properties = NULL, row = seq_len(nrow(spc)), check = "True", date = Sys.time(),
  snr = NULL, barcode = "", note = "", begin = Sys.time(), end = Sys.time(),
  recipe = "", coeffs = NULL
)

Arguments

spc

A matrix containing the spectral data. Note that the names of the columns must indicate the wavelengths at which the spectra were measured. Hence, the column names must be convertible to numerical values.

id

A vector of length equal to the number of rows of spc. Corresponds to the ID of the spectra, and must be provided.

properties

Either NULL (default) or a matrix containing numerical values, indicating the reference values of each property, where the column names correspond to the names of the properties. If a matrix is provided, it must contain the same number of rows as spc, but can contain NA values.

row

A vector of length equal to the number of rows of spc. Contains the row number of the observation.

check

A vector of characters with length equal to the number of rows of spc or a single character. Must only contain characters "True" or "False". Defaults to "True".

date

A vector of length equal to the number of rows of spc or a single character. Indicates the date when the measurement was taken. Format should be: "year-month-day hour:min:sec". If an object inheriting from "POSIXct" is supplied, it is formatted automatically. Defaults to Sys.time().

snr

A vector of length equal to the number of rows of spc, a single character or NULL. Indicates the serial number of the instrument the measurement was taken with. Defaults to NULL, in which case the serial numbers are all equal to "0000000000".

barcode

A vector of length equal to the number of rows of spc or a single character. Contains the barcodes for the measurement. Defaults to an empty string.

note

A vector of length equal to the number of rows of spc or a single character. Contains notes for the measurements. Defaults to an empty string.

begin

A vector of length equal to the number of rows of spc or a single character. Contains the time and date of the beginning of the measurement process. Format should be: "year-month-day hour:min:sec". If an object inheriting from "POSIXct" is supplied, it is formatted automatically. Defaults to the system's current date and time.

end

A vector of length equal to the number of rows of spc or a single character. Contains the time and date of the end of the measurement process. Format should be: "year-month-day hour:min:sec". If an object inheriting from "POSIXct" is supplied, it is formatted automatically. Defaults to the system's current date and time.

recipe

A vector of length equal to the number of rows of spc or a single character. Contains the recipe for the measurements. Defaults to an empty string.

coeffs

A list with exactly three entries. This parameter is ignored if the wavelength resolution of spc is constant. For non-constant resolution, it must be supplied. See Details for how this parameter needs to be defined. Default is NULL.

Details

This function provides an alternative way of creating a data.frame with the necessary structure that is required by many functions of this package. In particular, this function does not require any already existing files like proximate_read_data.

Note that only the first two arguments to this function are required for creating the data frame. However, the properties argument should most often also be provided, as these contain the necessary reference values for the process of modeling and creating an application with the spectral data.

Most parameters of this function can either have length equal to the number of rows of spc or length equal to one. In the latter case, the value is recycled for every row of the returned data frame.

Furthermore, we emphasize that the column names of matrix spc must contain the wavelength ranges of the spectra.

In case these spectra do not have a constant resolution, the function requires additional information on how the spectral wavelength range can be recovered. In that case, the parameter coeffs is mandatory and must contain the polynomial coefficients that were used to obtain the wavelengths. More information, including an example, can be found in vignette(ProxiMate-Structure-of-the-application-files). A concrete example is also given below.

The coeffs must be a named list with exactly 3 entries: X1, X2, X3. In ProxiMate data files (.tsv), they can be seen at columns #X1, #X2, #X3. Note that both X1 and X2 must be vectors of either length 1 or 2, containing the start and end pixels respectively, while X3 is a list of length 1 or 2, containing polynomial coefficients as vectors of arbitrary length. The entries of the coeffs can either be for a near-infrared only (i.e. length 1), or for both the visible and near-infrared range (i.e. length 2).

The coefficients are attached to the returned data.frame as an attribute "coeffs".

Value

A data.frame of class proximate_data containing all the metadata, response variables and spectra. The spectra are returned in a matrix embedded in the data.frame which can be accessed as ...$spc.

Author(s)

Claudio Orellano

Examples

data("NIRcannabis")
dat <- NIRcannabis

# Reconstruct NIRcannabis with properties in a different order
spc <- dat$spc
properties <- matrix(
  c(dat$CBD, dat$CBDA, dat$THC, dat$THCA),
  ncol = 4, dimnames = list(NULL, c("CBD", "CBDA", "THC", "THCA"))
)
datc <- proximate_data(
  spc, dat$ID, properties, dat$ROW,
  date = dat$Date, snr = dat$SNR, barcode = dat$Barcode,
  note = dat$Note, begin = dat$Begin, end = dat$End, recipe = dat$Recipe
)

# They are similar to each other (except the order of properties):
dat_refs <- which(names(dat) %in% c("Reference", colnames(properties)))
datc_refs <- which(names(datc) %in% c("Reference", colnames(properties)))
all.equal(dat[, -dat_refs], datc[, -datc_refs]) # TRUE

# In case of non-constant wavelengths, have to pass the coefficients to the function.
# Coefficients are usually given as #X1, #X2, #X3 in ProxiMate .tsv files,
# e.g. using coefficients example of vignette(Structure-of-the-application-files):
coeffs <- list(
  X1 = c(823, 4),
  X2 = c(1074, 272),
  X3 = list(
    c(0, 0, 0, -3.618926e-05, 2.137782, -1.333363e+03),
    c(2.04E-10, -1.28E-07, 2.80E-05, -4.76e-3, 3.89, 880.06)
  )
)

# You can extract the wavelengths in nm using these coefficients like this:
# Note that NIR pixels must be shifted by one to the right, as they are zero-based
pixel_seq <- list((coeffs$X1[1]:coeffs$X2[1]), (coeffs$X1[2]:coeffs$X2[2]) + 1)
vis_wavs <- mapply(
  pixel_seq[[1]],
  FUN = function(x) coeffs$X3[[1]] %*% c(x^5, x^4, x^3, x^2, x^1, 1)
)
nir_wavs <- mapply(
  pixel_seq[[2]],
  FUN = function(x) coeffs$X3[[2]] %*% c(x^5, x^4, x^3, x^2, x^1, 1)
)
wavs <- c(vis_wavs, nir_wavs)

# Above coefficients now have to be passed to the proximate_data()
# function since there are non-constant wavelengths.

# If we (wrongly) assume that NIRcannabis has such wavelengths:
rand_mat <- matrix(rnorm((length(wavs) - ncol(spc)) * nrow(spc)), nrow = nrow(spc))
spc <- cbind(rand_mat, spc)
colnames(spc) <- wavs

# Now we can create data object with coefficients
datcc <- proximate_data(
  spc, dat$ID, properties, dat$ROW,
  date = dat$Date, snr = dat$SNR, barcode = dat$Barcode,
  note = dat$Note, begin = dat$Begin, end = dat$End, recipe = dat$Recipe,
  coeffs = coeffs
)

# Coefficients can be viewed with
attr(datcc, "coeffs")

Merge datasets of class proximate_data

Description

This function allows you to quickly merge two separate datasets of class proximate_data into a single one. The first dataset must be of class proximate_data, while the second may be any kind of list-like format, but must contain at least columns named spc and ID.

Usage

proximate_merge(x)

Arguments

x

a list containing objects of class proximate_data, obtained from proximate_read_data or via proximate_data. The first element in the list is used as the reference for aligning the spectral data of the remaining elements. See details.

Details

This function provides a way to merge different datasets into a single table.

In cases where the first dataset in the list (the one used as reference for spectral alignment) has spectral data with a spectral range outside the limits of another dataset, the spectral data of that dataset will not be extrapolated. Instead, the spectral variables outside those limits are filled with NAs.

The function checks for any of the standard names of a .tsv file of ProxiMate, identifying any unexpected column names as properties.

Properties that are contained in both datasets are merged into a single column. Otherwise, the column of a property that is contained in only one of the datasets is filled up with NA.
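The no-extrapolation rule for spectral alignment can be illustrated with a base R sketch. The helper align_to_reference is hypothetical, and linear interpolation via stats::approx is an assumption made only for this illustration; the interpolation method actually used by proximate_merge is not stated here and may differ.

```r
# Hypothetical helper (not part of the package): align the spectra in
# 'spc' to the wavelengths of a reference dataset.
# Assumption: linear interpolation; rule = 1 makes approx() return NA
# for reference wavelengths outside the range covered by 'spc'.
align_to_reference <- function(spc, ref_wav) {
  wav <- as.numeric(colnames(spc))
  out <- t(apply(spc, 1, function(y) {
    approx(x = wav, y = y, xout = ref_wav, rule = 1)$y
  }))
  colnames(out) <- ref_wav
  out
}

ref_wav <- c(1000, 1100, 1200, 1300)
other <- matrix(c(1, 2, 3), nrow = 1,
                dimnames = list(NULL, c("1050", "1150", "1250")))
aligned <- align_to_reference(other, ref_wav)
```

In this sketch the reference wavelengths 1000 and 1300 lie outside the range of the second dataset, so the corresponding columns of aligned are NA, while the inner wavelengths are interpolated.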

Value

a data.frame of class proximate_data, containing the merged data.

Author(s)

Claudio Orellano

See Also

proximate_read_data, proximate_data

Examples

# to do

Read model parameters from ProxiMate .cal files

Description

Reads the metadata and model parameters from one or more .cal files generated by BUCHI ProxiMate sensors. The function extracts the preprocessing recipe, regression method, PLS weights, loadings, scores, intercepts, and bias terms required to project new spectra into the score space and produce predictions. Spectral regression coefficients are not retrieved directly; predictions are computed in the score space via predict.read_cal.

Usage

proximate_read_cal(file, ignore_version = FALSE)

## S3 method for class 'read_cal'
predict(object, newdata, get_comp = c("optimal", "all"),
        get_scores = FALSE, bias_index = 1, ...)

Arguments

file

a character vector of .cal file paths.

ignore_version

a logical. If FALSE (default), files with no version information or created with NIRWise PLUS prior to version 1.0 raise an error. If TRUE, such files are read with a warning instead; predictions from these files may deviate from those produced on the instrument.

object

an object of class read_cal as returned by proximate_read_cal().

newdata

a matrix of new spectral data to predict from. Column names must be coercible to the wavelengths used in the model.

get_comp

a character string. Either "optimal" (default) to return predictions only for the optimal number of components, or "all" to return predictions for every available component.

get_scores

a logical indicating whether PLS scores should be returned alongside predictions. Default is FALSE.

bias_index

the index of the bias to be applied in the list of biases. These are generated in NIRWise PLUS based on the number of files containing the calibration data. Default = 1.

...

not currently used.

Value

For proximate_read_cal(), a list of class "read_cal" with the following elements:

For predict.read_cal(), a list with the following elements:

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

proximate_recalibrate_nax, proximate_read_nax


Read ProxiMate (.tsv) files

Description

This function imports .tsv files generated by BUCHI ProxiMate sensors.

Usage

proximate_read_data(file)

Arguments

file

A string indicating the name (and path) of the .tsv file. A textConnection can also be passed.

Value

A data.frame containing all the metadata, response variables and spectra in the tsv file. The spectra are returned in a matrix embedded in the data.frame which can be accessed as ...$spc.

Author(s)

Leonardo Ramirez-Lopez

Examples

data("NIRcannabis")
filename <- file.path(tempdir(), "NIRcannabis.tsv")
# Need to produce a tsv file before we can read it
proximate_write_data(
  x = NIRcannabis,
  file = filename,
  properties = c("CBDA", "THCA", "CBD", "THC")
)
# Equivalent to dataset NIRcannabis
dat <- proximate_read_data(filename)

Reads and summarizes ProxiMate spectroscopic applications (nax files)

Description

This function reads and summarizes the main aspects of BUCHI ProxiMate applications, which are files with the extension .nax. In addition, the file is retained as raw binary in the object generated by this function.

Usage

proximate_read_nax(file, ignore_version = FALSE)

Arguments

file

a character vector containing the .nax file name (and path).

ignore_version

a logical passed to proximate_read_cal. If FALSE (default), embedded .cal files with no version information or created with NIRWise PLUS prior to version 1.0 raise an error. Set to TRUE to read them with a warning instead.

Value

A list of class nax which contains the following objects:

If any of the above components is encrypted, a character string indicating so is returned. If the rtf calibration reports are not present in the nax, NULL is returned for rtf_info.

Author(s)

Leonardo Ramirez-Lopez


Recalibrate a nax file

Description

This function updates a nax file

Usage

proximate_recalibrate_nax(x, 
                preprocess_recipes = NULL,
                methods = NULL,
                control = calibration_control(seed = 1),
                name,
                add = NULL)

Arguments

x

an object of class nax as returned by the proximate_read_nax function.

preprocess_recipes

an optional list with one or more objects of class preprocess_recipe that are to be tested for finding the optimal one for each model in the list passed to formulas.

methods

an optional list containing one or more objects of class fit_constructor, as returned by one of the fit_constructors functions, indicating what type of regression method to use along with its parameters.

control

a calibration_control object as returned by the calibration_control function, indicating how some aspects of the calibration process must be conducted (e.g. cross-validation and outlier detection). Default is calibration_control(seed = 1). See details.

name

a vector of length at most 2, consisting of characters for the name and alias of the application. Defaults to "Untitled".

add

an optional object of class nax_augment as returned by the proximate_add2nax function.

Value

A list of class "spectral_multimodel". See calibrate_models function.

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

proximate_add2nax

calibrate,

preprocess_recipe,

fit_plsr,

fit_xlsr,

calibration_control,

calibrate_models


Write NIRWise PLUS readable tab-separated files

Description

This function writes tab-separated value files in a format readable by the NIRWise PLUS software. These files contain visible and near-infrared absorbance spectra along with response variables and meta-information (e.g. sample ID, date, comments, etc.).

Usage

proximate_write_data(x, file, id, spc, spc_round = 8, barcode = "", properties = NULL,
          note = "", recipe = "", created, snr)

Arguments

x

a data.frame of spectral data and metadata, for which the tab separated value file should be generated. See details.

file

a character for the path (and name) in which the tsv will be saved.

id

a vector of characters of length equal to the number of observations in x or of length 1. Each entry gives an observation a specific ID. If length is 1, this entry is recycled for every observation.

spc

either a character or a vector of integers. Specifies where the spectra can be found inside x. Default is "spc".

spc_round

an integer. To how many decimal places should the spectra be rounded? Defaults to 8 decimal places.

barcode

a vector of characters of length equal to the number of observations in x or of length 1. Each entry specifies the barcode for each observation; if length is 1, the entry is recycled for every observation. Default is an empty character.

properties

a vector of characters of arbitrary length. Which properties in x are to be added to the tsv? Note that any missing reference values for the properties are set to 0. Default is NULL.

note

a vector of characters of length equal to the number of observations in x or of length 1. The vector corresponds to the notes for each observation, or, if the vector is of length one, to all notes of all observations. Defaults to an empty character.

recipe

a vector of characters of length equal to the number of observations in x or of length 1. This vector corresponds to the recipe for each observation, or, if the vector is of length one, to all recipes of all observations. Defaults to an empty character.

created

a vector of characters of length equal to the number of observations in x. This vector should contain the date and time of when each observation was measured. If not provided and not contained in x, this parameter will be set to the current date and time of the system.

snr

a vector of characters, corresponding to the serial number of the device on which the measurement was taken. If not provided and not found in x, this parameter will be set to a vector of character of length equal to the number of rows in x, where each individual character is given by 0000000000.

Details

This function creates a tab separated value file, which is readable by both NIRWise PLUS software and the proximate_read_data function.

The main use is to transform an existing data object into a format which is readable by NIRWise PLUS. Therefore, if some data of the given object x is already in the correct form, the corresponding values can be supplied by passing the specific column of x to this function; for example, by passing note = x$Note.

Value

Invisibly returns NULL. Called for its side effect of writing a tab-separated value file to file.

Author(s)

Leonardo Ramirez-Lopez

Examples


data("NIRcannabis")
filename <- file.path(tempdir(), "NIRcannabis.tsv")

proximate_write_data(
  x = NIRcannabis,
  file = filename,
  id = NIRcannabis$ID,
  spc = "spc",
  spc_round = 8,
  barcode = NIRcannabis$Barcode,
  properties = c("CBDA", "THCA", "CBD", "THC"),
  note = NIRcannabis$Note,
  recipe = NIRcannabis$Recipe,
  created = NIRcannabis$Begin
)

# Since we do not change anything, the following produces the same tsv:
proximate_write_data(
  x = NIRcannabis,
  file = filename,
  properties = c("CBDA", "THCA", "CBD", "THC")
)
# Delete the file
file.remove(filename)



Write calibration (.cal), project (.prj) and report (.rtf) files to a specified directory

Description

This function writes native ProxiMate calibration, project and report files from a spectral_model object.

Usage

proximate_write_model(object, path, tsv_paths, application_name = "Untitled",
                      cal = TRUE, prj = TRUE, rtf = TRUE,
                      verbose = TRUE, internal_prj_path = NULL)

Arguments

object

a list of models of class spectral_model. These models should be generated using the calibrate function.

path

a string for the directory in which the files should be saved.

tsv_paths

a vector of character strings for the paths (including the names) of the tsv data files. See details.

application_name

a string with the name of the generated files. Defaults to "Untitled".

cal

a logical. Should a calibration file (.cal) be written? Default is TRUE.

prj

a logical. Should a project file (.prj) be written? Default is TRUE.

rtf

a logical. Should a report in rich text format (.rtf) be written? Default is TRUE.

verbose

a logical. Should progress bars for the generated files be printed? Default is TRUE.

internal_prj_path

a string. Only used for changing the path printed on the first line of the project file. This is necessary mainly for calls from proximate_write_nax, as it creates the project file in a temporary file, which would also store that temporary path into the project file. This argument allows you to overwrite that path individually. Otherwise, this parameter may be ignored. If NULL (default), will be set to path.

Details

This function generates files with extensions ".prj" (project file), ".cal" (calibration file), and ".rtf" (report) for the provided models of class spectral_model in the argument object. Each file type can be individually enabled or disabled via the cal, prj, and rtf arguments. All files will be named according to the chosen name of the application (given by application_name). Note that in contrast to proximate_write_nax, the metadata does not influence the name of the application. This allows models to be passed directly to this function without the need for metadata. Additionally, the name of the response variable is automatically added to the names of the produced files, so that all generated files have unique names.

Value

Invisibly returns NULL. Called for its side effect of writing calibration, project and/or report files to path.

Author(s)

Claudio Orellano, Leonardo Ramirez-Lopez

Examples


data("NIRcannabis")
control <- calibration_control(validation_type = "kfold", number = 3, folds = "sequential")
amodel <- calibrate(CBDA ~ spc,
  data = NIRcannabis, preprocess = preprocess_recipe(),
  method = fit_plsr(5), control = control, verbose = FALSE
)

proximate_write_model(
  object = list(amodel),
  path = tempdir(),
  tsv_paths = tempfile(fileext = ".tsv"),
  application_name = "Untitled",
  cal = TRUE, prj = TRUE, rtf = TRUE,
  verbose = FALSE
)


Create an application file for the given list of models

Description

This function provides a flexible way to create an application file (.nax) which can be deployed into ProxiMate sensors.

Usage

proximate_write_nax(
   object, path, metadata, tsv_name, empty_tsv_name, spc = "spc",
   external_properties = NULL, report = TRUE, verbose = TRUE,
   internal_prj_path = NULL
)

Arguments

object

a list of objects of class spectral_model, as generated by calibrate. Note that at least one of these models must contain the relevant input data (i.e. run with return_inputs = TRUE in calibrate method).

path

a character for the directory in which the file should be produced.

metadata

an object of class application_metadata, as generated by add_application_metadata. If not provided (and object does not contain the metadata either), default values are used with a warning.

tsv_name

an optional character. If not supplied, this parameter is set to the name of the application plus the current date. See details.

empty_tsv_name

an optional character. For ProxiMate applications, this argument should be different from tsv_name; otherwise the application cannot be imported into the device. If missing, it is set to the name of the application. See details.

spc

a character to indicate the column name of the spectra used in the data provided to the object list of models. This parameter is passed to the proximate_write_data method. Defaults to "spc".

external_properties

a list for additional files to be included in the application file. Defaults to NULL. See details for how these files may be added.

report

a logical. Should reports of the models be generated and added to the file? Defaults to TRUE.

verbose

a logical indicating whether progress bars during the creation of the file should be printed. Defaults to TRUE.

internal_prj_path

a string. Only used for changing the path printed on the first line of each project file. For almost all cases, this argument can be ignored. The only case where you should adjust this parameter is when you are creating the application (.nax) file in a certain folder, but actually want to move it to another one (e.g. on a different platform). If NULL (default), the path is set equal to path, with "Calibrations/" added at the end.

Details

This function is capable of generating an application (.nax) file, which contains compressed data files for the application. All files inside this .nax file are organized in a fixed way, such that they are importable into a ProxiMate device. For that, all models to be imported should be in a list, and each individual model should be generated using the calibrate function, preferably with the input data saved in it. This can easily be done by calling the method with return_inputs = TRUE. Note that at least one model in object must contain input data, otherwise an error will occur.

Furthermore, note that the data argument in calibrate for all models in one single application must be from one single data set. In particular, one single data.frame must suffice to describe the inputs of all models in object. The data that is actually used to train these models can still be different, e.g. by specifying the rows that you want to exclude from a certain model (see skip argument of calibrate). An error will be thrown if this is not the case.

The directory path is created automatically (if it does not exist). Inside, the application file is generated, which contains the following compressed files: a file for the metadata (.nad), project (.prj) and calibration (.cal) files for all the provided models in object, possibly report (.rtf) files (as indicated by the report argument), a tab-separated value (.tsv) file of the spectral data, and an empty tab-separated value (.tsv) file.

The metadata file (.nad) is required for a successful import of the application into a ProxiMate device. This requires metadata in every model, which should be added using add_model_metadata prior to the call of this function. Otherwise, default values for the model metadata will be used with a warning. Furthermore, application specific metadata is required, which can be either specified by providing the argument metadata, or included in the list of models object (see add_application_metadata), where the former option will take precedence. If neither option is available, default values of add_application_metadata are used with a warning.

Furthermore, this function provides a way of adding separately generated project and calibration files through the parameter external_properties. Note that these files have to be either in the directory of the provided path or in a sub-directory "Calibrations" thereof. External properties must be provided as a list containing model metadata (using the add_model_metadata method) in order to be added properly to the application file.

These external files must also be named according to the naming convention of the rest of the models used. In particular, the function searches the provided path and the sub-directory "Calibrations" for files named with the following format: app_name.property_name.cal, app_name.property_name.prj and (if report is TRUE) app_name.property_name.rtf, where the app_name is taken from the application metadata, and the property_name from model metadata passed to external_properties. If the files cannot be found, a warning will be displayed.
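The naming convention above can be sketched in base R. This is only an illustration: the helper expected_external_files() is hypothetical and not part of the package. It builds the file names implied by the convention and checks both candidate directories.

```r
# Hypothetical helper (not part of proximetricsR): builds the file names
# implied by the naming convention and reports which of them exist.
expected_external_files <- function(path, app_name, property_name, report = TRUE) {
  exts <- c("cal", "prj", if (report) "rtf")
  file_names <- paste(app_name, property_name, exts, sep = ".")
  # Candidate locations: `path` itself and its "Calibrations" sub-directory
  dirs <- c(path, file.path(path, "Calibrations"))
  found <- vapply(
    file_names,
    function(f) any(file.exists(file.path(dirs, f))),
    logical(1)
  )
  data.frame(file = file_names, found = unname(found))
}

res <- expected_external_files(tempdir(), "app", "THCA")
res$file
```

Running such a check before calling proximate_write_nax may help to spot a misnamed file early, instead of relying on the warning.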

An example for adding an external property is given in the example section below.

Note that if an application file for the given application already exists, the files inside the compressed application file are updated, but already present files are not deleted.

Value

Invisibly returns NULL. Called for its side effect of writing a .nax application file to path.

Author(s)

Claudio Orellano, Leonardo Ramirez-Lopez

Examples


data("NIRcannabis")
control <- calibration_control(validation_type = "kfold", number = 3, folds = "sequential")
# Models for application files must have model metadata!
model_metadata <- add_model_metadata(unit = "%")
modell <- calibrate(CBDA ~ spc,
  data = NIRcannabis, preprocess = preprocess_recipe(),
  method = fit_plsr(15), control = control,
  metadata = model_metadata, verbose = FALSE
)

app_metadata <- add_application_metadata(name = "app")
proximate_write_nax(
  object = list(modell),
  path = tempdir(),
  metadata = app_metadata,
  tsv_name = "some_tsv",
  empty_tsv_name = "another_tsv",
  report = TRUE,
  verbose = FALSE
)

# Another model
modelr <- calibrate(THCA ~ spc,
  data = NIRcannabis, preprocess = preprocess_recipe(),
  method = fit_plsr(15), control = control,
  metadata = model_metadata, verbose = FALSE
)

# Generate some files to be added separately
proximate_write_model(
  object = list(modelr),
  path = tempdir(),
  tsv_paths = tempdir(),
  application_name = "app",
  cal = TRUE, prj = TRUE, rtf = TRUE,
  verbose = FALSE
)

# Now add them using external properties. Requires a name for the property!
proximate_write_nax(
  object = list(modell),
  path = tempdir(),
  metadata = app_metadata,
  tsv_name = "some_tsv",
  empty_tsv_name = "another_tsv",
  external_properties = list(add_model_metadata(unit = "%", name = "THCA")),
  report = TRUE,
  verbose = FALSE
)


Read and parse ProxiScout data from CSV or XLSX files

Description

Reads spectral data files in either .csv or .xlsx format, identifies spectral data columns based on numeric column names, converts reflectance values from percentages to absolute units, and stores them in a matrix under the spc column.

Usage

proxiscout_read_data(file, references_file)

Arguments

file

A character string specifying the path to the input file. The file must have either a .csv or a .xlsx extension.

references_file

An optional character string specifying the path to a file containing reference values. See details.

Details

This function allows the user to give the path to one or two files at once.

If two file paths are given, file is assumed to contain the spectral data, while references_file contains only the reference values. Both files must have a column whose name matches the regex sample, and the entries of these columns must coincide (excluding potential repetition identifiers). The files are then merged by that column.
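A minimal base R sketch of such a merge, using made-up data in place of the two files. The case-insensitive match on "sample" is an assumption of this sketch, and the handling of repetition identifiers is not shown.

```r
# Toy data standing in for the two files (all values are made up)
spectra <- data.frame(
  sampleName = c("s1", "s2", "s3"),
  `4000` = c(41.2, 39.8, 40.5),
  `4010` = c(42.0, 40.1, 41.3),
  check.names = FALSE
)
references <- data.frame(
  Sample = c("s3", "s1", "s2"),
  protein = c(12.1, 10.4, 11.7)
)

# Locate the key column in each table by matching "sample"
# (case-insensitive here, which is an assumption) and merge on it
key_x <- grep("sample", names(spectra), ignore.case = TRUE, value = TRUE)[1]
key_y <- grep("sample", names(references), ignore.case = TRUE, value = TRUE)[1]
merged <- merge(spectra, references, by.x = key_x, by.y = key_y)
merged
```

The row order of the two inputs does not need to agree, since the rows are matched by the key column.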

If only file is given, it must contain the spectral columns, and may or may not contain reference values.

In general, inside file, any column AFTER the spectra is identified as a prediction, and these columns are collected into a matrix called predictions (if any exist). Columns that appear BEFORE the spectral data columns, contain numerical values, and do not have typical column names (see extract_property_names for more details) are identified as reference values.
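The positional rule can be illustrated with a short base R sketch. The column layout below is hypothetical, and the additional checks (numeric content, exclusion of typical column names) are deliberately left out.

```r
# Hypothetical column layout of an input file
cols <- c("sampleName", "deviceId", "protein",
          "4000.0", "4010.0", "4020.0", "protein_pred")

# Spectral columns are those whose names parse as numbers
is_spectral <- !is.na(suppressWarnings(as.numeric(cols)))
first_spc <- min(which(is_spectral))
last_spc <- max(which(is_spectral))

before_spectra <- cols[seq_len(first_spc - 1)]      # candidates for reference values
after_spectra <- cols[seq_along(cols) > last_spc]   # treated as predictions
before_spectra
after_spectra
```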

The function:

Value

A data.frame where:

Note

This function assumes spectral column names follow a strict numeric pattern (e.g. "3921.0") and removes any prefixed characters such as "X" that may be added by read.csv. These names are converted to numeric and used as column names of the spectral matrix.
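The name clean-up described in this note can be sketched in base R as follows. The values are made up, and the object names are illustrative only.

```r
# Column names as read.csv() would mangle them (check.names = TRUE)
raw_names <- make.names(c("3921.0", "3925.5", "3930.0"))
raw_names

# Remove the "X" prefix and convert to numeric
wavenumbers <- as.numeric(sub("^X", "", raw_names))

# Made-up spectra, stored as a single matrix column named spc
spc <- matrix(c(41.2, 39.8, 42.0, 40.1, 43.1, 41.0), nrow = 2)
colnames(spc) <- wavenumbers
out <- data.frame(sampleName = c("s1", "s2"))
out$spc <- spc
dim(out$spc)
```

Storing the spectra as one matrix column keeps the data.frame narrow and lets formulas such as CBDA ~ spc refer to all spectral variables at once.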

Author(s)

Leonardo Ramirez-Lopez, Claudio Orellano


ProxiScout repetition pattern

Description

Returns the pattern that can be used to identify repetitions in the sample ID of ProxiScout data files.

Usage

proxiscout_repetition_pattern()

Value

A character that can be used as a regex for identifying repetitions in ProxiScout data files.

Author(s)

Claudio Orellano


Write data files for ProxiScout devices

Description

This function writes comma-separated files in a format compatible with ProxiScout-related software, which typically requires two separate comma-separated files: one for the spectra and another for the reference values. These files are created inside the specified directory (argument path).

Usage

proxiscout_write_data(x, path, file_prefix, properties = NULL, spc = "spc")

Arguments

x

a data.frame of spectral data for which to write the data files. Typically, this is returned by proxiscout_read_data and of class "proxiscout_data".

path

a character for the directory in which the files will be saved.

file_prefix

a character for the prefix of the generated files. The files are then named as [file_prefix]_spectra.csv and [file_prefix]_properties.csv. Default is proxiscout_export.

properties

a vector of characters of arbitrary length. Which properties in x are to be added to the csv? Default is NULL.

spc

either a character or a vector of integers. Specifies where the spectra can be found inside x. Default is "spc".

Details

This function creates up to two comma-separated files in the directory path, which are usable by ProxiScout-related software. These files are named according to the file_prefix argument: one contains the spectra together with the sample names and device ID, and the other contains the reference values together with the sample names.

Typically, the data provided to this function is imported with proxiscout_read_data and of class "proxiscout_data", but it is also possible to construct a data.frame by hand and provide it to this function.

The properties argument specifies which columns in x are the reference values written to the [file_prefix]_properties.csv file. If empty (default), this file is not created, as it would only contain sample names. Any row in the provided properties that only contains NA values is dropped. In general, NA values are written as an empty string ("").

The sample names are detected automatically from x as the column with a name that contains "sample". If none are detected, the function will throw an error. This column will be named "Sample Name" in the [file_prefix]_spectra.csv file, and "sampleName" in the [file_prefix]_properties.csv file.

Similarly, the device ID is a required column and is identified as having a "device" string inside the name of the column. This column is only written into the [file_prefix]_spectra.csv file, with the fixed name "Device Id".

All other columns in either file correspond only to the spectra or to the reference values, respectively. In particular, any other columns in x are dropped.
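The handling of NA values in the properties file can be approximated with base R. This is a sketch under assumptions: the data are made up, the output file name is arbitrary, and the column renaming described above is omitted.

```r
# Made-up reference data with missing values
x <- data.frame(
  sampleName = c("s1", "s2", "s3"),
  protein = c(12.1, NA, NA),
  moisture = c(8.3, NA, 7.9)
)
properties <- c("protein", "moisture")

# Drop rows in which every requested property is NA
keep <- rowSums(!is.na(x[, properties, drop = FALSE])) > 0
props <- x[keep, c("sampleName", properties)]

# Write the remaining NA values as empty strings
out_file <- file.path(tempdir(), "sketch_properties.csv")
write.csv(props, out_file, na = "", row.names = FALSE)
csv_lines <- readLines(out_file)
csv_lines
```

Here sample s2 is dropped because both of its properties are missing, while s3 is kept with an empty protein field.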

Value

A character with the paths to the created files.

Author(s)

Leonardo Ramirez-Lopez, Claudio Orellano


Write a calibration model to ProxiScout JSON format

Description

Serializes a model of class spectral_model (including its preprocessing recipe) into a JSON format that can be imported into the NeoSpectra NIR Hub and deployed on ProxiScout sensors (see Details).

Usage

proxiscout_write_model(object, file = NULL)

Arguments

object

an object of class spectral_model that contains the preprocessing recipe and final model to be serialized.

file

an optional character string with the path (including file name) where the JSON output should be written. If NULL (default), no file is written and the JSON string is returned. If a path is provided, the JSON is written to that file and returned invisibly.

Details

The JSON output produced by this function can be imported into the NeoSpectra NIR Hub and used within a ProxiScout application. Once imported, the NeoSpectra Scan mobile app linked to a ProxiScout sensor can access the model and use it to compute and display spectral predictions.

The JSON pipeline always begins with two hardware-specific steps that are added automatically, regardless of the preprocessing recipe in object: (1) scaling raw reflectance from the 0–100 range reported by the sensor to the 0–1 range, and (2) averaging repeated scans of the same sample. These steps precede any user-defined preprocessing.
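For orientation, the effect of these two steps on a spectral matrix can be reproduced in base R. This is only a sketch with made-up values: it does not reflect how the steps are encoded in the JSON, and the grouping vector sample_id is an assumption about how repeated scans are identified.

```r
# Made-up percent reflectance: two scans of sample "a", one of sample "b"
sample_id <- c("a", "a", "b")
spc_pct <- matrix(
  c(40, 42, 50,
    60, 62, 70),
  nrow = 3,
  dimnames = list(NULL, c("4000", "4010"))
)

# Step 1: rescale from the 0 to 100 range to the 0 to 1 range
spc_abs <- spc_pct / 100

# Step 2: average repeated scans of the same sample
sums <- rowsum(spc_abs, group = sample_id)
counts <- as.vector(table(sample_id)[rownames(sums)])
spc_mean <- sums / counts
spc_mean
```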

Constraints and supported preprocessing steps:

Value

If file = NULL (default), the JSON string is returned visibly so it can be inspected or assigned to a variable. If file is specified, the JSON string is written to that file and returned invisibly (i.e. it is not printed to the console, following the standard R convention for functions called primarily for their side effect).
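The visible versus invisible return convention can be demonstrated with a small stand-in function. write_or_return() is hypothetical and only mimics the behaviour described above; it is not the package function.

```r
# Hypothetical writer (not the package function) showing the return convention
write_or_return <- function(json, file = NULL) {
  if (is.null(file)) {
    return(json)          # visible: auto-prints at the console
  }
  writeLines(json, file)
  invisible(json)         # invisible: not printed, but still assignable
}

json <- '{"model":"sketch"}'
vis_null <- withVisible(write_or_return(json))$visible
out_file <- tempfile(fileext = ".json")
vis_file <- withVisible(write_or_return(json, out_file))$visible
c(vis_null, vis_file)
```

In both cases the result can be assigned to a variable; only the automatic printing at the console differs.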

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

calibrate, get_proxiscout_wavenumbers, prep_resample

Examples


data("NIRcannabis")
control <- calibration_control(
  validation_type = "kfold", number = 3, folds = "sequential"
)
recipe <- preprocess_recipe(
  prep_resample(grid = "proxiscout"),
  prep_snv(),
  prep_derivative(m = 1, w = 11, p = 2, algorithm = "savitzky-golay"),
  device = "proxiscout"
)
model <- calibrate(
  THCA ~ spc,
  data = NIRcannabis, preprocess = recipe,
  method = fit_plsr(10), control = control, verbose = FALSE
)

json_model <- proxiscout_write_model(model)
json_model

proxiscout_write_model(model, file = file.path(tempdir(), "my_model.json"))


Extract information from RTF calibration reports

Description

For internal use only.

Usage

read_rtfs(files)

Arguments

files

a vector of rtf files with their paths

Value

A list containing extracted information from the RTF calibration reports.


Read and format spectral data from a file

Description

This function reads spectral data from a file and extracts the spectral columns based on a specified prefix, or a range of columns. It can handle various delimiters and decimal separators.

Usage

read_spc(file, sep = "\t", dec = ".", header = TRUE, spectra_prefix = "", 
         spectra_starts = NA, spectra_ends = NA, ...)

Arguments

file

a character string specifying the path to the file containing the spectral data.

sep

a character string indicating the field separator character. Defaults to "\t".

dec

a character string used for decimal points. Defaults to ".".

header

a logical value indicating whether the file contains the names of the variables as its first line. Defaults to TRUE.

spectra_prefix

a character string specifying the prefix used for spectral column names. If empty, the function will use column indices instead.

spectra_starts

an integer indicating the starting column index for the spectral data, used when spectra_prefix is not specified.

spectra_ends

an integer indicating the ending column index for the spectral data, used when spectra_prefix is not specified. If not provided, defaults to the last column.

...

additional arguments passed to read.table.

Details

The function reads a file and extracts the spectral data based on either a column name prefix or specified column indices. The spectral data is returned as a matrix in the spc column of the resulting data frame.

Value

a data frame with the original data and a matrix of spectral data stored in the spc column.

Author(s)

Leonardo Ramirez-Lopez

Examples


# write a file with spectra
data("NIRsoil", package = "prospectr")
spc_small <- NIRsoil$spc[1:5, ]
colnames(spc_small) <- paste0("X", colnames(spc_small))
tmp_df <- data.frame(ID = 1:5, Nt = NIRsoil$Nt[1:5], spc_small, check.names = FALSE)
tmp_file <- tempfile(fileext = ".txt")
write.table(tmp_df, file = tmp_file, sep = "\t", row.names = FALSE)

# read that
result <- read_spc(tmp_file, spectra_prefix = "X")



A function to create calibration and validation sample sets for k-fold cross-validation

Description

For internal use only! This function implements k-fold sampling based on either a random or sequential selection of observations. If group is provided, the sampling is done based on the groups. This function is used to create groups for k-fold cross-validations.

Usage

sample_kfold(
  N,
  number,
  group = NULL,
  sampling = c("random", "sequential"),
  seed = NULL
)

Arguments

N

the total number of observations.

number

the number of folds.

group

the labels indicating the group each observation belongs to.

sampling

a character vector indicating how to sample. Options are: "random" (default) or "sequential" (the one used in NIRWise PLUS).

seed

an integer for random number generator (default NULL).

Value

a list with two matrices (hold_in and hold_out) giving the indices of the observations in each column. The number of columns represents the number of sampling repetitions.
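The structure of such a result can be illustrated with a simplified base R sketch. kfold_sketch() is hypothetical and not the package's implementation: it ignores group, assumes that N is a multiple of number, and interprets "sequential" as contiguous blocks of observations, which is an assumption.

```r
# Hypothetical sketch (not the package's sample_kfold): assumes N is a
# multiple of `number` so that every fold has the same size.
kfold_sketch <- function(N, number, sampling = c("random", "sequential"), seed = NULL) {
  sampling <- match.arg(sampling)
  stopifnot(N %% number == 0)
  idx <- seq_len(N)
  if (sampling == "random") {
    if (!is.null(seed)) set.seed(seed)
    idx <- sample(idx)
  }
  # Each column holds the observations left out in one fold
  hold_out <- matrix(idx, ncol = number)
  # The complementary observations form the calibration set of that fold
  hold_in <- apply(hold_out, 2, function(out) setdiff(seq_len(N), out))
  list(hold_in = hold_in, hold_out = hold_out)
}

folds <- kfold_sketch(N = 6, number = 3, sampling = "sequential")
folds$hold_out
folds$hold_in
```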


A function to create calibration and validation sample sets for leave-one-out cross-validation

Description

for internal use only! If group is provided, the sampling is done based on the groups.

Usage

sample_loo(N, group = NULL)

Arguments

N

the total number of observations.

group

the labels indicating the group each observation belongs to.

Value

a list with two matrices (hold_in and hold_out) giving the indices of the observations in each column. The number of columns represents the number of sampling repetitions.


Simple leave-one-out sampling

Description

For internal use only

Usage

sample_loo_basic(N)

Value

A list with hold_in (calibration indices) and hold_out (validation indices) matrices.
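A minimal base R sketch of leave-one-out index matrices is shown below. loo_sketch() is hypothetical and only illustrates the shape of the output; it assumes N is at least 3 so that hold_in is a matrix.

```r
# Hypothetical sketch (not the package's sample_loo_basic)
loo_sketch <- function(N) {
  # One left-out observation per column
  hold_out <- matrix(seq_len(N), nrow = 1)
  # All remaining observations per column
  hold_in <- vapply(seq_len(N), function(i) seq_len(N)[-i], integer(N - 1))
  list(hold_in = hold_in, hold_out = hold_out)
}

loo <- loo_sketch(4)
loo$hold_out
loo$hold_in
```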


A function to create calibration and validation sample sets for leave-group-out cross-validation

Description

for internal use only! This is stratified sampling based on the values of a continuous response variable (y). If group is provided, the sampling is done based on the groups and the average of y per group. This function is used to create calibration and validation groups for leave-group-out cross-validations (or leave-group-of-groups-out cross-validation if group argument is provided).

Usage

sample_stratified(y, p, number, group = NULL, replacement = FALSE, seed = NULL)

Arguments

y

a matrix of one column with the response variable.

p

the percentage of samples (or groups if group argument is used) to retain in the validation_indices set.

number

the number of sample groups to be created.

group

the labels for each sample in y indicating the group each observation belongs to.

replacement

a logical indicating whether sampling with replacement is required for the calibration set.

seed

an integer for random number generator (default NULL).

Value

a list with two matrices (hold_in and hold_out) giving the indices of the observations in each column. The number of columns represents the number of sampling repetitions.
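One common way to stratify on a continuous response is to rank the values, cut the ranks into strata, and draw one observation per stratum. The sketch below follows that idea for a single repetition. It is not the package's algorithm: stratified_sketch() is hypothetical, p is taken as a fraction between 0 and 1 (an assumption), and group and replacement are ignored.

```r
# Hypothetical sketch of stratified sampling on a continuous response
stratified_sketch <- function(y, p, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  y <- as.vector(y)
  n_val <- max(1, round(length(y) * p))
  # Assign each observation to one of n_val strata of near-equal size,
  # based on the rank of its response value
  strata <- ceiling(rank(y, ties.method = "first") * n_val / length(y))
  # Draw one observation at random from every stratum
  hold_out <- vapply(
    split(seq_along(y), strata),
    function(i) i[sample.int(length(i), 1)],
    integer(1)
  )
  list(
    hold_in = setdiff(seq_along(y), hold_out),
    hold_out = unname(hold_out),
    strata = strata
  )
}

y <- c(5.1, 1.2, 3.3, 9.8, 7.4, 2.0, 6.6, 8.9, 4.5, 0.7)
s <- stratified_sketch(y, p = 0.3, seed = 42)
s$hold_out
```

Because one observation is drawn from each stratum, the validation set covers the low, middle and high range of y.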


Calculate filter for Savitzky-Golay

Description

Calculate filter for Savitzky-Golay

Usage

sgf(p, n, m = 0)

Value

A numeric matrix containing the Savitzky-Golay filter coefficients.
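For reference, Savitzky-Golay coefficients for the centre point of a window can be derived from a least-squares polynomial fit. The sketch below assumes that p is the polynomial order, n the (odd) window length, and m the derivative order, which is the convention of signal::sgolay but is not documented here. sg_center_coef() is hypothetical and returns only the centre-point weights, not the full matrix returned by sgf.

```r
# Hypothetical sketch: centre-point Savitzky-Golay weights via least squares
sg_center_coef <- function(p, n, m = 0) {
  stopifnot(n %% 2 == 1, p < n, m <= p)
  h <- (n - 1) / 2
  A <- outer(-h:h, 0:p, "^")          # n x (p + 1) polynomial design matrix
  pinv <- solve(crossprod(A), t(A))   # least-squares pseudo-inverse
  # Row m + 1 gives the m-th polynomial coefficient; scale by m! for the derivative
  factorial(m) * pinv[m + 1, ]
}

smooth5 <- sg_center_coef(p = 2, n = 5)
round(smooth5 * 35)   # -3 12 17 12 -3
deriv5 <- sg_center_coef(p = 2, n = 5, m = 1)
round(deriv5 * 10)    # -2 -1 0 1 2
```

The weights are applied as a dot product with the n spectral values centred on the point of interest.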


Simple k-fold sampling

Description

For internal use only

Usage

simple_kfold_sampling(
  N,
  number,
  sampling = c("random", "sequential"),
  seed = NULL
)

Value

A list with two matrices (hold_in and hold_out) giving the indices of the observations in each column for each fold.


The spectral_fit class

Description

An object of class spectral_fit represents a fitted PLS or XLS regression model for a single component sequence. It is produced internally by calibrate and is accessible via object$final_model$model.

A spectral_fit object is a list with the following elements:

Author(s)

Leonardo Ramirez-Lopez and Claudio Orellano

See Also

calibrate, fit_plsr, fit_xlsr


Calculate the ASCII Value Difference Between Two Strings

Description

This function calculates the difference between two strings based on their ASCII values.

Usage

string_diff(s1, s2)

Arguments

s1

A character string.

s2

A character string.

Details

The function ensures that both strings are of the same length by padding them with spaces if necessary. It then computes the difference between the ASCII values of corresponding characters in the strings.
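The padding and comparison steps can be sketched in base R as follows. ascii_diff_sketch() is hypothetical: it returns the per-character absolute differences and leaves open how string_diff aggregates them.

```r
# Hypothetical sketch (not the package's string_diff)
ascii_diff_sketch <- function(s1, s2) {
  width <- max(nchar(s1), nchar(s2))
  # Right-pad the shorter string with spaces
  s1 <- formatC(s1, width = width, flag = "-")
  s2 <- formatC(s2, width = width, flag = "-")
  # Compare the character codes position by position
  abs(utf8ToInt(s1) - utf8ToInt(s2))
}

d1 <- ascii_diff_sketch("abc", "abd")
d2 <- ascii_diff_sketch("a", "ab")
d1
d2
```

In the second call, the padded space (code 32) is compared with "b" (code 98), giving a difference of 66.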

Value

The absolute difference between the ASCII values of the characters in the two strings.


Generates a template for .prj files

Description

Generates a template for .prj files

Usage

template(file_version = c("prj", "cal"), v1 = TRUE)

Arguments

file_version

a string, either 'prj' or 'cal', indicating which type of template should be produced.

v1

a boolean. For cal files, should the line for version 1.0 be printed?

Value

a vector of characters, which may be filled with correct values of the computed model.


Validate predictions of class 'spectral_prediction'

Description

Calculate several prediction validation statistics for a prediction of class 'spectral_prediction'.

Usage

validate_prediction(prediction, reference)

Arguments

prediction

an object of class 'spectral_prediction', as returned by the predict function.

reference

a vector or a matrix with one column, containing the response variable.

Value

An object of class "spectral_validation", which is a list containing the following validation statistics of the prediction:

Author(s)

Claudio Orellano

Examples

data("NIRcannabis")
skips <- c(10, 25, 37)
simple_model <- calibrate(CBDA ~ spc,
  data = NIRcannabis, preprocess = preprocess_recipe(),
  method = fit_plsr(5), control = calibration_control("kfold"),
  skips = skips, verbose = FALSE
)

# Predict the skipped indices
pred <- predict(simple_model,
  newdata = NIRcannabis[skips, ],
  ncomp = simple_model$final_ncomp,
  verbose = FALSE
)

# Validate skipped indices
validate_prediction(pred, NIRcannabis$CBDA[skips])

Writes metadata required for a ProxiMate application file

Description

Internal function for generating a metadata file required for creating a ProxiMate application.

Usage

write_nad(object, path, application_meta, external_properties = NULL, verbose = TRUE)

Arguments

object

a list of models of class spectral_model for which the metadata files should be generated.

path

a string for the directory in which the files will be saved.

application_meta

a list of class application_metadata, containing the metadata for the application. See add_application_metadata.

external_properties

a list of external properties. More details in proximate_write_nax. Defaults to NULL.

verbose

a logical. Should messages about the generated files be printed? Default is TRUE.

Details

This function takes a list of models of class spectral_model and generates the corresponding metadata file (.nad). This file allows the ProxiMate calibration software to import a .nax file. Thus, the main purpose of this file is to be added to the zip structure of an application file (.nax). See proximate_write_nax for more details on how this file is used.

Note that it is crucial for all provided models to have some metadata added.

Value

Invisibly returns NULL. Called for its side effect of writing a .nad metadata file to path.

Author(s)

Claudio Orellano, Leonardo Ramirez-Lopez
