\name{GGIR}
\alias{GGIR}
\title{
  Shell function for analysing an accelerometer dataset.
}
\description{
  This function is designed to help users operate all steps of the
  analysis. It helps to generate and structure milestone data,
  and produces user-friendly reports. The function acts as a shell with
  calls to \link{g.part1}, \link{g.part2}, \link{g.part3},
  \link{g.part4} and \link{g.part5}.
}
\usage{
GGIR(mode = 1:5,
     datadir = c(),
     outputdir = c(),
     studyname = c(),
     f0 = 1, f1 = 0,
     do.report = c(2, 4, 5, 6),
     configfile = c(),
     myfun = c(),
     verbose = TRUE, ...)
}
\arguments{
  \item{mode}{
    Numeric (default = 1:5).
    Specify which of the five parts need to be run, e.g., mode = 1 runs only
    \link{g.part1}, while mode = 1:5 runs the whole GGIR pipeline,
    from \link{g.part1} to \link{g.part5}.
    Optionally, mode can also include the number 6 to tell GGIR to run \link{g.part6},
    which is currently under development.
  }
  \item{datadir}{
    Character (default = c()).
    Directory where the accelerometer files are stored, e.g.,
    "C:/mydata", or a vector of accelerometer file paths, e.g.,
    c("C:/mydata/myfile1.bin", "C:/mydata/myfile2.bin").
  }
  \item{outputdir}{
    Character (default = c()).
    Directory where the output needs to be stored. Note that this
    function will attempt to create folders in this directory and use
    those folders to keep output.
  }
  \item{studyname}{
    Character (default = c()).
    If the datadir is a folder, then the study will be given the name of the
    data directory. If datadir is a list of filenames then the studyname as specified
    by this input argument will be used as name for the study.
  }
  \item{f0}{
    Numeric (default = 1).
    File index to start with (default = 1). Index refers to the filenames sorted
    in alphabetical order.
  }
  \item{f1}{
    Numeric (default = 0).
    File index to finish with. The default value 0 is interpreted as the
    number of files available.
  }
  \item{do.report}{
    Numeric (default = c(2, 4, 5, 6)).
    For which parts to generate a summary spreadsheet: 2, 4, 5, and/or 6.
    A report will be generated based on the available milestone data. When creating
    milestone data with multiple machines it is advisable to turn the report
    generation off when generating the milestone data (\code{do.report = c()}),
    and then to merge the milestone data and turn report generation back
    on while setting overwrite to FALSE.
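
    A minimal sketch of this multi-machine workflow is given below. The paths
    and file index ranges are hypothetical, and the wrapper functions
    \code{run_batch} and \code{make_reports} are only introduced here for
    illustration; they are not part of GGIR.
\preformatted{
# Hypothetical helper: process files f0 to f1 without generating reports.
run_batch <- function(f0, f1) {
  GGIR::GGIR(datadir = "C:/mydata", outputdir = "C:/myresults",
             f0 = f0, f1 = f1, do.report = c())
}
# Machine A could run run_batch(1, 50) and machine B run_batch(51, 100).

# Hypothetical helper: after merging the milestone data from all machines
# into one output folder, generate the reports from the existing milestone data.
make_reports <- function() {
  GGIR::GGIR(datadir = "C:/mydata", outputdir = "C:/myresults",
             do.report = c(2, 4, 5), overwrite = FALSE)
}
}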
  }
  \item{configfile}{
    Character (default = c()).
    Configuration file previously generated by function GGIR. See details.
  }
  \item{myfun}{
    List (default = c()).
    External function object to be applied to raw data. See package vignette for detailed tutorial with
    examples on how to use the function embedding:
    https://cran.r-project.org/package=GGIR/vignettes/ExternalFunction.html
  }
  \item{verbose}{
    Boolean (default = TRUE).
    Whether console messages should be printed. Note that warnings and errors are
    always printed; warnings and messages can be suppressed with suppressWarnings() or suppressMessages().
  }
  \item{...}{
    Any of the parameters used in GGIR. Given the large number of parameters used in GGIR
    we have grouped them in objects that start with "params_". These are documented in the
    details section. You cannot provide these objects as argument to function GGIR, but
    you can provide the parameters inside them as input to function GGIR.
  }
}
\value{
  The function provides no values; it only ensures that other functions are called
  and that their output is stored. Further, a configuration file is stored containing
  all the argument values used to facilitate reproducibility.
}
\details{
  Once you have used function GGIR, the output directory (outputdir) will be filled
  with milestone data and results. Function GGIR stores all the explicitly
  entered argument values and default values for the arguments that are not explicitly
  provided in a csv-file named config.csv stored in the root of the output folder.
  The config.csv file is accepted as input to GGIR with argument \code{configfile}
  to replace the specification of all the arguments, except \code{datadir} and \code{outputdir}.

  The practical value of this is that it eases the replication of analysis, because
  instead of having to share your R script, sharing your config.csv file will be
  sufficient. Further, the config.csv file contributes to the reproducibility
  of your data analysis.

  Note: When combining a configuration file with explicitly provided argument
  values, the explicitly provided argument values will overrule
  the argument values in the configuration file. If a parameter is neither provided
  via the configuration file nor as input then GGIR uses its default parameter values which
  can be inspected with command \code{print(load_params())}, and if you are specifically
  interested in a certain subgroup of parameters, e.g., physical activity, then you
  can do \code{print(load_params()$params_phyact)}. These defaults are part of the GGIR
  code and cannot be changed by the user.
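
  As an illustration, the sketch below re-runs part 2 from a stored
  configuration file while overruling one parameter. The paths are hypothetical
  and need to be replaced by your own, and the wrapper function
  \code{rerun_from_config} is only introduced here for illustration; it is not
  part of GGIR.
\preformatted{
# Hypothetical helper: re-run part 2 with the settings stored in configfile.
rerun_from_config <- function(configfile) {
  GGIR::GGIR(mode = 2,
             datadir = "C:/mydata",
             outputdir = "C:/myresults",
             configfile = configfile,
             overwrite = TRUE)  # explicitly provided, so it overrules config.csv
}
# Example call (hypothetical path):
# rerun_from_config("C:/myresults/config.csv")
}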

  The parameters that can be used in GGIR are:

    \subsection{params_general}{
    A list of parameters used across all GGIR parts that do not fall in any of the other
    categories.
    \describe{

      \item{overwrite}{
        Boolean (default = FALSE).
        Do you want to overwrite analysis for which milestone
        data exists? If overwrite = FALSE, then milestone data from a previous analysis will
        be used if available and visual reports will not be created again.}

      \item{dayborder}{
        Numeric (default = 0).
        Hour at which days start and end (dayborder = 4 would mean 4 am).}

      \item{do.parallel}{
        Boolean (default = TRUE).
        Whether to use multi-core processing (only works if at least 4 CPU cores are available).}

      \item{maxNcores}{
        Numeric (default = NULL).
        Maximum number of cores to use when argument do.parallel is set to TRUE.
        GGIR by default uses either the maximum number of available cores or the number of files to
        process (whichever is lower), but this argument allows you to set a lower maximum.}

      \item{acc.metric}{
        Character (default = "ENMO").
        Which one of the acceleration metrics do you want to use for all acceleration magnitude
        analyses in GGIR part 5 and the visual report? For example: "ENMO", "LFENMO", "MAD",
        "NeishabouriCount_y", or "NeishabouriCount_vm". Only one acceleration metric can be specified
        and the selected metric needs to have been calculated in part 1 (see \link{g.part1})
        via arguments such as \code{do.enmo = TRUE} or \code{do.mad = TRUE}.}

      \item{part5_agg2_60seconds}{
        Boolean (default = FALSE).
        Whether to aggregate epochs to 60 seconds as part of the GGIR
        \link{g.part5} analysis. Aggregation is done by averaging.
        Note that when working with count metrics such as Neishabouri counts this
        means that the threshold can stay the same as in part 2, because the
        threshold is expressed relative to the original epoch size, even if averaged
        per minute. For example, if we want to use a cut-point of 100 counts per minute
        with a 5 second epoch, then we specify \code{mvpathreshold = 100 * (5/60)}
        (approximately 8.33) as well as \code{threshold.mod = 100 * (5/60)}, regardless
        of whether we set part5_agg2_60seconds to TRUE or FALSE.}

      \item{print.filename}{
        Boolean (default = FALSE).
        Whether to print the filename before analysing it (in case do.parallel = FALSE).
        Printing the filename can be useful to investigate
        problems (e.g., to verify which file is being read).}

      \item{desiredtz}{
        Character (default = "", i.e., system timezone).
        Timezone in which device was configured and experiments took place.
        If experiments took place in a different timezone, then use this
        argument for the timezone in which the experiments took place and
        argument \code{configtz} to specify where the device was configured.
        Use the "TZ identifier" as specified at
        \href{https://en.wikipedia.org/wiki/Zone.tab}{https://en.wikipedia.org/wiki/Zone.tab}
        to set desiredtz, e.g., "Europe/London".}

      \item{configtz}{
        Character (default = "", i.e., system timezone).
        At the moment only functional for GENEActiv .bin, AX3 cwa, ActiGraph .gt3x,
        and ad-hoc csv file format.
        Timezone in which the accelerometer was configured. Only use this argument
        if the timezone of configuration and timezone in which recording took
        place are different. Use the "TZ identifier" as specified at
        \href{https://en.wikipedia.org/wiki/Zone.tab}{https://en.wikipedia.org/wiki/Zone.tab}
        to set configtz, e.g., "Europe/London".}

      \item{sensor.location}{
        Character (default = "wrist").
        To indicate sensor location, default is wrist. If it is hip, the HDCZA algorithm for sleep detection
        also requires longitudinal axis of sensor to be between -45 and +45 degrees.}

      \item{windowsizes}{
        Numeric vector, three values
        (default = c(5, 900, 3600)).
        To indicate the lengths of the windows as in c(window1, window2, window3):
        window1 is the short epoch length in seconds, by default 5, and this is the time 
        window over which acceleration and angle metrics are calculated;
        window2 is the long epoch length in seconds for which non-wear and signal clipping 
        are defined, default 900 (expected to be a multiple of 60 seconds);
        window3 is the window length of data used for non-wear detection and by default 3600 seconds.
        So, when window3 is larger than window2 we use overlapping windows,
        while if window2 equals window3 non-wear periods are assessed by non-overlapping windows.}

      \item{idloc}{
        Numeric (default = 1).
        If idloc = 1 the code assumes that ID number is stored in the obvious header field. Note that for ActiGraph data
        the ID is never stored in the file header.
        For values 2, 5, 6, and 7, GGIR looks at the filename and extracts the character string preceding the first
        occurrence of a "_" (idloc = 2), " " (space, idloc = 5), "." (dot, idloc = 6),
        and "-" (idloc = 7), respectively.
        Values 3 and 4 are skipped: they were used for one study in 2012 and are no
        longer actively maintained, but the corresponding legacy code has not been removed.}

      \item{expand_tail_max_hours}{
        Numeric (default = NULL).
        This parameter has been replaced by \code{recordingEndSleepHour}.}
        
      \item{recordingEndSleepHour}{
        Numeric (default = NULL).
        Hour of the day at or after which the recording needs to end for GGIR to expand the
        \link{g.part1} output with synthetic data, in order to trigger sleep detection for the last night.
        Using argument \code{recordingEndSleepHour} implies the assumption that the
        participant fell asleep at or before the end of the recording if the recording
        ended at or after \code{recordingEndSleepHour} hour of the last day.
        This assumption may not always hold true and should be used with caution.
        The synthetic data for metashort entails: timestamps continuing
        regularly, zeros for acceleration metrics other than EN, one for EN.
        Angle columns are created such that they trigger the sleep detection using
        the equation: \code{round(sin((1:length_expansion) / (900/epochsize))) * 15}.
        To keep track of the tail expansion \link{g.part1} stores the length of the expansion in
        the RData files, which is then passed via \link{g.part2}, \link{g.part3},
        and \link{g.part4} to \link{g.part5}. In \link{g.part5} the tail expansion
        size is included as an additional variable in the csv-reports.
        In the \link{g.part4} csv-report the last night is omitted, because we know
        that sleep estimates from the last night will not be trustworthy. Similarly,
        in the \link{g.part5} output columns related to the sleep assessment will
        be omitted for the last window to avoid biasing the averages. Further,
        the synthetic data are also ignored in the visualizations and time series
        output to avoid biased output.}

      \item{dataFormat}{
        Character (default = "raw").
        To indicate what the format is of the data in datadir.
        Alternatives: ukbiobank_csv, actiwatch_csv, actiwatch_awd,
        actigraph_csv, and sensewear_xls, which correspond to epoch level data
        files from, respectively, UK Biobank in csv format, Actiwatch in csv
        format, Actiwatch in awd format, ActiGraph csv format, and Sensewear in
        xls format (also works with xlsx). Here, the assumed epoch size for
        UK Biobank csv data is 5 seconds.
        The epoch size for the other non-raw data formats is
        flexible, but make sure that you set the first value of argument
        \code{windowsizes} accordingly. Also, when working with
        non-raw data formats, specify argument \code{extEpochData_timeformat} as
        documented below. For ukbiobank_csv non-wear is a column in the data itself;
        for actiwatch_csv, actiwatch_awd, actigraph_csv, and sensewear_xls non-wear
        is detected as 60 minute rolling zeros. The length of this window can be
        modified with the third value of argument \code{windowsizes} expressed in
        seconds.}
        
      \item{maxRecordingInterval}{
        Numeric (default = NULL).
        To indicate the maximum gap in hours between repeated measurements with the same
        ID for the recordings to be appended. The assumption is that the
        ID can be matched, so make sure argument \code{idloc} is set correctly.
        If argument \code{maxRecordingInterval} is set to NULL (default) recordings
        are not appended. If recordings overlap then GGIR will use the data from
        the latest recording. If recordings are separated then the time gap between
        the recordings is filled with data points that resemble monitor not worn.
        The maximum value of \code{maxRecordingInterval} is 504 (21 days). Only recordings from the
        same accelerometer brand are appended. The part 2 csv report
        will show the number of appended recordings, the sampling rate for each, the time overlap or gap,
        and the filenames of the respective recordings.}
        
      \item{extEpochData_timeformat}{
        Character (default = "\%d-\%m-\%Y \%H:\%M:\%S").
        To specify the time format used in the external epoch level data when
        argument \code{dataFormat} is set to "actiwatch_csv", "actiwatch_awd",
        "actigraph_csv" or "sensewear_xls". For example, "\%Y-\%m-\%d \%I:\%M:\%S \%p" for
        "2023-07-11 01:24:01 PM" or "\%m/\%d/\%Y \%H:\%M:\%S" for "07/11/2023 13:24:01".
        }
      }
    }

    \subsection{params_rawdata}{
    A list of parameters used for reading and pre-processing
    raw data, excluding parameters related to metrics as those are in
    the params_metrics object.
    \describe{

      \item{backup.cal.coef}{
        Character (default = "retrieve").
        Option to use backed-up calibration coefficient instead of
        deriving the calibration coefficients when analysing the same file twice.
        Argument backup.cal.coef has two use cases. Use case 1: If the auto-calibration
        fails then the user has the option to provide back-up
        calibration coefficients via this argument. The value of the argument needs to
        be the name and directory of a csv-spreadsheet with the following column names
        and subsequent values: "filename" with the names of accelerometer files on which
        the calibration coefficients need to be applied in case auto-calibration fails;
        "scale.x", "scale.y", and "scale.z" with the scaling coefficients; "offset.x",
        "offset.y", and "offset.z" with the offset coefficients; and
        "temperature.offset.x", "temperature.offset.y", and "temperature.offset.z"
        with the temperature offset coefficients. This can be useful for analysing
        short lasting laboratory experiments with insufficient sphere data to perform
        the auto-calibration, but for which calibration coefficients can be derived
        in an alternative way. It is the user's responsibility to compile the
        csv-spreadsheet.
        Use case 2: The user wants to avoid performing the auto-calibration repeatedly
        on the same file. If backup.cal.coef is set to "retrieve" (default) then
        GGIR will look for the "data_quality_report.csv" file in the output folder
        QC, which holds the previously generated calibration coefficients. If you
        do not want this to happen, then delete the data_quality_report.csv from the
        QC folder or set backup.cal.coef to "redo".}

      \item{minimumFileSizeMB}{
        Numeric (default = 2).
        Minimum file size in MB required to enter processing.
        This argument can help to prevent short uninformative
        files from entering the analyses. Given that a typical accelerometer
        collects several MBs per hour, the default setting should only skip
        the very tiny files.}

      \item{do.cal}{
        Boolean (default = TRUE).
        Whether to apply auto-calibration or not by \link{g.calibrate}.
        Recommended setting is TRUE.}

      \item{imputeTimegaps}{
        Boolean (default = TRUE).
        To indicate whether timegaps larger than 1 sample should be imputed.
        Currently only used for .gt3x data and ActiGraph .csv format, where timegaps
        can be expected as a result of ActiGraph's idle sleep mode configuration.}

      \item{spherecrit}{
        Numeric (default = 0.3).
        The minimum required acceleration value (in g) on both sides of 0 g
        for each axis. Used to judge whether the sphere is sufficiently populated.}

      \item{minloadcrit}{
        Numeric (default = 168).
        The minimum number of hours the code needs to read for the
        autocalibration procedure to be effective (only sensitive to
        multiples of 12 hours, other values will be rounded up).
        After loading these hours, extra data is only loaded if the
        calibration error has not been reduced to under 0.01 g.}

      \item{printsummary}{
        Boolean (default = FALSE).
        If TRUE will print a summary of the calibration procedure in
        the console when done.}

      \item{chunksize}{
        Numeric (default = 1).
        Value to specify the size of chunks to be
        loaded as a fraction of an approximately 12 hour period for auto-calibration
        procedure and as fraction of 24 hour period for the metric calculation, e.g., 
        0.5 equals 6 and 12 hour chunks, respectively. 
        For machines with less than 4 GB of RAM or with less than 2 GB of memory per process
        when using \code{do.parallel = TRUE} a value below 1 is recommended.
        The value is constrained by GGIR to not be lower than 0.05. Please note that
        setting 0.05 will not produce output when the third value of parameter windowsizes
        is 3600.}

      \item{dynrange}{
        Numeric (default = NULL).
        Dynamic range of the accelerometer in \emph{g}, e.g., 8.}

      \item{interpolationType}{
        Integer (default = 1).
        To indicate type of interpolation to be used
        when resampling time series (mainly relevant for Axivity sensors),
        1=linear, 2=nearest neighbour.}

      \item{rmc.file}{
        Character (default = NULL).
        Filename of file to be read if it is in the working directory,
        or full path to the file otherwise.
      }

      \item{rmc.nrow}{
        Numeric (default = NULL).
        Number of rows to read, same as the nrows argument in \link[utils]{read.csv} and in \link[data.table]{fread}.
        The whole file is read by default (i.e., rmc.nrow = Inf).}

      \item{rmc.skip}{
        Numeric (default = 0).
        Number of rows to skip, same as skip argument in \link[utils]{read.csv} and in \link[data.table]{fread}.}

      \item{rmc.dec}{
        Character (default = ".").
        Decimal used for numbers, same as dec argument in \link[utils]{read.csv} and in \link[data.table]{fread}.}

      \item{rmc.firstrow.acc}{
        Numeric (default = NULL).
        First row (number) of the acceleration data.}

      \item{rmc.firstrow.header}{
        Numeric (default = NULL).
        First row (number) of the header. Leave blank if the file does not have a header.}

      \item{rmc.header.length}{
        Numeric (default = NULL).
        If file has header, specify header length (number of rows).}

      \item{rmc.col.acc}{
        Numeric, three values
        (default = c(1, 2, 3)).
        Vector with the three column numbers in which the acceleration signals
        are stored.}

      \item{rmc.col.temp}{
        Numeric (default = NULL).
        Scalar with column (number) in which the temperature is stored.
        Leave in default setting if no temperature is available. The temperature
        will be used by \link{g.calibrate}.}

      \item{rmc.col.time}{
        Numeric (default = NULL).
        Scalar with column (number) in which the timestamps are stored.
        Leave in default setting if timestamps are not stored.}

      \item{rmc.unit.acc}{
        Character (default = "g").
        Character with unit of acceleration values: "g", "mg", or "bit".}

      \item{rmc.unit.temp}{
        Character (default = "C").
        Character with unit of temperature values: (K)elvin, (C)elsius, or (F)ahrenheit.}

      \item{rmc.unit.time}{
        Character (default = "POSIX").
        Character with unit of timestamps: "POSIX", "UNIXsec" (seconds since origin, see argument \code{rmc.origin}),
        "character", or "ActivPAL" (exotic timestamp format only used in the ActivPAL
        activity monitor).}

      \item{rmc.format.time}{
        Character (default = "\%Y-\%m-\%d \%H:\%M:\%OS").
        Character giving a date-time format as used by \link[base]{strptime}.
        Only used when rmc.unit.time is "character" or "POSIX".}

      \item{rmc.bitrate}{
        Numeric (default = NULL).
        If the unit of acceleration is bit then provide the bit rate, e.g., 12 bit.}

      \item{rmc.dynamic_range}{
        Numeric or character (default = NULL).
        If the unit of acceleration is bit then provide the dynamic range deviation
        in g from zero, e.g., +/-6g would mean this argument needs to be 6. If you give this
        argument a character value the code will search the file header for elements with
        a name equal to the character value and use the corresponding numeric value
        next to it as dynamic range.}

      \item{rmc.unsignedbit}{
        Boolean (default = TRUE).
        If rmc.unsignedbit = TRUE, bits are only positive numbers;
        if rmc.unsignedbit = FALSE, bits are both positive and negative.}

      \item{rmc.origin}{
        Character (default = "1970-01-01").
        Origin of time when unit of time is UNIXsec, e.g., 1970-1-1.}

      \item{rmc.desiredtz}{
        Character (default = NULL).
        Timezone in which experiments took place. This argument is scheduled to
        be deprecated and is now used to overwrite \code{desiredtz} if not provided.}

      \item{rmc.configtz}{
        Character (default = NULL).
        Timezone in which device was configured. This argument is scheduled to
        be deprecated and is now used to overwrite \code{configtz} if not provided.}

      \item{rmc.sf}{
        Numeric (default = NULL).
        Sample rate in Hertz, if this is stored in the file header then that will be used
        instead (see argument \code{rmc.headername.sf}).}

      \item{rmc.headername.sf}{
        Character (default = NULL).
        If file has a header: Row name under which the sample frequency can be found.}

      \item{rmc.headername.sn}{
        Character (default = NULL).
        If file has a header: Row name under which the serial number can be found.}

      \item{rmc.headername.recordingid}{
        Character (default = NULL).
        If file has a header: Row name under which the recording ID can be found.}

      \item{rmc.header.structure}{
        Character (default = NULL).
        Used to split the header name from the header value, e.g., ":" or " ".}

      \item{rmc.check4timegaps}{
        Boolean (default = FALSE).
        To indicate whether gaps in time should be imputed with zeros.
        Some sensing equipment provides accelerometer data with gaps in time. The rest of
        GGIR is not designed for this; by setting this argument to TRUE the gaps
        in time will be filled with zeros.}

      \item{rmc.col.wear}{
        Numeric (default = NULL).
        If external wear detection outcome is stored as part of the data then this can be used by GGIR.
        This argument specifies the column in which the wear detection (Boolean) is stored.}

      \item{rmc.doresample}{
        Boolean (default = FALSE).
        To indicate whether to resample the data based on the available timestamps and extracted
        sample rate from the file header.}

      \item{rmc.noise}{
        Numeric (default = 13).
        Noise level of acceleration signal in m\emph{g}-units, used when working with
        ad-hoc .csv data formats
        using \link{read.myacc.csv}. Function \link{read.myacc.csv} does not take rmc.noise as argument,
        but when interacting with \link{GGIR} or \link{g.part1} rmc.noise is used.}
        
      \item{rmc.scalefactor.acc}{
        Numeric (default = 1).
        Value used to scale the acceleration signals via multiplication.
        For example, if data is provided in m/s^2 then by setting this to 1/9.81
        we would derive gravitational units.
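
        As a small worked example in base R, with values made up for
        illustration, multiplying a signal by 1/9.81 expresses it in
        gravitational units:
\preformatted{
acc_ms2 <- c(0, 4.905, 9.81)     # hypothetical acceleration values in m/s^2
scalefactor <- 1 / 9.81          # value one would pass as rmc.scalefactor.acc
acc_g <- acc_ms2 * scalefactor   # the same values expressed in g
}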
      }

      \item{frequency_tol}{
        Numeric (default = 0.1).
        Passed on to readAxivity from the GGIRread package.
        Represents the frequency tolerance as a fraction between 0 and 1. When the relative bias
        per data block is larger than this fraction, the data block will be imputed
        by lack of movement, with the gravitational orientation guessed from the most recent
        valid data block. Only applicable to Axivity .cwa data.
      }
      \item{nonwear_range_threshold}{
        Numeric (default = 150).
        Used to define the maximum value range per axis for non-wear
        detection, used in combination with brand specific standard deviation per
        axis.
      }
      
    }
  }

  \subsection{params_metrics}{
  A list of parameters used to specify the signal metrics that need to be extracted in GGIR \link{g.part1}.
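
  To give a feel for what such metrics represent, the sketch below computes EN,
  ENMO, and MAD for a handful of made-up samples in base R, following the
  formulas listed under \code{do.en}, \code{do.enmo}, and \code{do.mad} below.
  It assumes that \eqn{r_i} in the MAD formula is the vector magnitude, and it
  is not the code GGIR uses internally (see \link{g.applymetrics} for that).
\preformatted{
# Three hypothetical samples of x, y, z acceleration in g
acc <- cbind(x = c(0, 0.6, 0), y = c(0, 0.8, 0), z = c(1, 1.2, 0.5))
en   <- sqrt(rowSums(acc^2))       # Euclidean norm per sample
enmo <- pmax(en - 1, 0)            # EN minus one, negative values set to zero
mad  <- mean(abs(en - mean(en)))   # mean amplitude deviation of the norm
}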
  \describe{

      \item{do.anglex}{
        Boolean (default = FALSE).
        If TRUE, calculates the angle of the X axis relative to the horizontal:
        \deqn{angleX = (\tan{^{-1}\frac{acc_{rollmedian(x)}}{\sqrt{(acc_{rollmedian(y)})^2 +
        (acc_{rollmedian(z)})^2}}}) * 180/\pi}}

      \item{do.angley}{
        Boolean (default = FALSE).
        If TRUE, calculates the angle of the Y axis relative to the horizontal:
        \deqn{angleY = (\tan{^{-1}\frac{acc_{rollmedian(y)}}{\sqrt{(acc_{rollmedian(x)})^2 +
        (acc_{rollmedian(z)})^2}}}) * 180/\pi}}

      \item{do.anglez}{
        Boolean (default = TRUE).
        If TRUE, calculates the angle of the Z axis relative to the horizontal:
        \deqn{angleZ = (\tan{^{-1}\frac{acc_{rollmedian(z)}}{\sqrt{(acc_{rollmedian(x)})^2 +
        (acc_{rollmedian(y)})^2}}}) * 180/\pi}}

      \item{do.zcx}{
        Boolean (default = FALSE).
        If TRUE, calculates metric zero-crossing count for x-axis. For computation specifics
        see source code of function \link{g.applymetrics}.}

      \item{do.zcy}{
        Boolean (default = FALSE).
        If TRUE, calculates metric zero-crossing count for y-axis. For computation specifics
        see source code of function \link{g.applymetrics}.}

      \item{do.zcz}{
        Boolean (default = FALSE).
        If TRUE, calculates metric zero-crossing count for z-axis. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.enmo}{
        Boolean (default = TRUE).
        If TRUE, calculates the metric: \deqn{ENMO = \sqrt{acc_x^2 + acc_y^2 + acc_z^2} - 1}
        (if ENMO < 0, then ENMO = 0).}

      \item{do.lfenmo}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric \code{ENMO} over the low-pass filtered accelerations
        (for computation specifics see source code of function \link{g.applymetrics}).
        The filter bound is defined by the parameter \code{hb}.}

      \item{do.en}{
        Boolean (default = FALSE).
        If TRUE, calculates the Euclidean Norm of the raw accelerations:
        \deqn{EN = \sqrt{acc_x^2 + acc_y^2 + acc_z^2}}}

      \item{do.mad}{
        Boolean (default = FALSE).
        If TRUE, calculates the Mean Amplitude Deviation:
        \deqn{MAD = \frac{1}{n}\Sigma|r_i - \overline{r}|}}

      \item{do.enmoa}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric:
        \deqn{ENMOa = \sqrt{acc_x^2 + acc_y^2 + acc_z^2} - 1} (if ENMOa < 0, then ENMOa = |ENMOa|).}

      \item{do.roll_med_acc_x}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.roll_med_acc_y}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.roll_med_acc_z}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.dev_roll_med_acc_x}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.dev_roll_med_acc_y}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.dev_roll_med_acc_z}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.bfen}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.hfen}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.hfenplus}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.lfen}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.lfx}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.lfy}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.lfz}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.hfx}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.hfy}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.hfz}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.bfx}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.bfy}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.bfz}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric. For computation
        specifics see source code of function \link{g.applymetrics}.}

      \item{do.brondcounts}{
        Boolean (default = FALSE).
        This option has been deprecated (October 2022) due to issues with the
        activityCounts package that we used as a dependency.
        If TRUE, the metric was calculated via R package activityCounts.
        We called them BrondCounts because there is a large number of activity count
        definitions in the physical activity and sleep research field. By calling them
        \emph{brondcounts} we clarify that these are the counts proposed by Jan \enc{Brønd}{Brond}
        and implemented in R by Ruben Brondeel. The \emph{brondcounts} are intended to be an
        imitation of the counts produced by the closed-source ActiLife software by ActiGraph.}

      \item{do.neishabouricounts}{
        Boolean (default = FALSE).
        If TRUE, calculates the metric via R package actilifecounts, which is an 
        implementation of the algorithm used in the closed-source software ActiLife 
        by ActiGraph (methods published in doi: 10.1038/s41598-022-16003-x). We name
        them NeishabouriCount after the first author of the paper (instead of
        ActiLifeCounts) because it is uncertain whether ActiLife will keep using
        this same algorithm over time. To use the Neishabouri counts for the physical
        activity intensity classification in part 5 (i.e., as the metric to which
        \code{threshold.lig}, \code{threshold.mod}, and \code{threshold.vig} are
        applied), the \code{acc.metric}
        argument needs to be set as one of the following: "NeishabouriCount_x",
        "NeishabouriCount_y", "NeishabouriCount_z", "NeishabouriCount_vm" to use the 
        counts in the x-, y-, z-axis or vector magnitude, respectively.}

      \item{lb}{
        Numeric (default = 0.2).
        Lower boundary of the frequency filter (in Hertz) as used in the filter-based metrics.}

      \item{hb}{
        Numeric (default = 15).
        Higher boundary of the frequency filter (in Hertz) as used in the filter-based metrics.}

      \item{n}{
        Numeric (default = 4).
        Order of the frequency filter as used in the filter-based metrics.}

      \item{zc.lb}{
        Numeric (default = 0.25).
        Used for zero-crossing counts only. Lower boundary of cut-off frequency filter.}

      \item{zc.hb}{
        Numeric (default = 3).
        Used for zero-crossing counts only. Higher boundary of cut-off frequencies in filter.}

      \item{zc.sb}{
        Numeric (default = 0.01).
        Stop band used for calculation of zero-crossing counts. Value is the acceleration threshold
        in g units below which acceleration will be rounded to zero.}

      \item{zc.order}{
        Numeric (default = 2).
        Used for zero-crossing counts only. Order of frequency filter.}

      \item{zc.scale}{
        Numeric (default = 1).
        Used for zero-crossing counts only. Scaling factor to be applied after
        counts are calculated (GGIR part 3).}

      \item{actilife_LFE}{
        Boolean (default = FALSE).
        If TRUE, calculates the NeishabouriCount metric with the low-frequency extension filter
        as proposed in the closed source ActiLife software by ActiGraph. Only applicable to
        the metric NeishabouriCount.}
    }
  }
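
  As a minimal sketch of how the count-based metric in params_metrics might be
  combined with the part 5 intensity thresholds: the directories are hypothetical
  and the threshold values are placeholders for illustration only, since this
  documentation does not provide count-based cut-points.

  \preformatted{
# Hypothetical paths; replace with your own directories
GGIR(datadir = "C:/mydata",
     outputdir = "C:/myresults",
     mode = 1:5,
     do.neishabouricounts = TRUE,
     # Use the vector magnitude of the counts for the part 5 classification
     acc.metric = "NeishabouriCount_vm",
     # Placeholder values, not recommended cut-points
     threshold.lig = 100,
     threshold.mod = 2500,
     threshold.vig = 10000)
  }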

  \subsection{params_cleaning}{
    A list of parameters used across all GGIR parts related to masking or
    imputing data, abbreviated as "cleaning".
    \describe{

      \item{do.imp}{
        Boolean (default = TRUE).
        Whether to impute missing values (e.g., suspected of monitor non-wear
        or clipping) or not by \link{g.impute} in GGIR \link{g.part2}.
        Recommended setting is TRUE.}

      \item{TimeSegments2ZeroFile}{
        Character (default = NULL).
        Takes path to a csv file that has columns "windowstart" and "windowend"
        to refer to the start and end time of a time window in format
        "2024-10-12 20:00:00", and "filename" of the GGIR
        milestone data file without the "meta_" segment of the name. GGIR part 2
        uses this to set all acceleration values in these time windows to zero and
        the non-wear classification to zero (meaning sensor worn). Motivation: When the
        accelerometer is not worn during the night GGIR automatically labels those
        periods as invalid, while the user may like to treat them as zero movement.
        Disclaimer: This functionality was developed in 2019. With hindsight it
        is not generic enough and in need of revision. Please contact GGIR
        maintainers if you would like us to invest time in improving this
        functionality.}

      \item{data_cleaning_file}{
        Character (default = NULL).
        Optional path to a csv file you create that holds four columns:
        ID, day_part5, relyonguider_part4, and night_part4. ID should hold the
        participant ID. Columns day_part5 and night_part4 allow you to specify which
        day(s) and night(s) need to be excluded from \link{g.part5} and 
        \link{g.part4}, respectively. When including multiple day(s)/night(s)
        create a new line for each day/night.
        These day(s)/night(s) are excluded regardless of whether the rest of GGIR
        considers them valid. Column relyonguider_part4 allows you
        to specify for which nights \link{g.part4} should fully rely on the
        guider. See also package vignette.}

      \item{excludefirstlast.part5}{
        Boolean (default = FALSE).
        If TRUE then the first and last window (waking-waking,
        midnight-midnight, or sleep onset-onset) are ignored in \link{g.part5}.}

      \item{excludefirstlast}{
        Boolean (default = FALSE).
        If TRUE then the first and last night of the measurement are ignored for
        the sleep assessment in \link{g.part4}.}

      \item{excludefirst.part4}{
        Boolean (default = FALSE).
        If TRUE then the first night of the measurement is ignored for the sleep
        assessment in \link{g.part4}.}

      \item{excludelast.part4}{
        Boolean (default = FALSE).
        If TRUE then the last night of the measurement is ignored for the sleep
        assessment in \link{g.part4}.}

      \item{includenightcrit}{
        Numeric (default = 16).
        Minimum number of valid hours per night (24 hour window between noon and noon),
        used for sleep assessment in \link{g.part4}.}

      \item{minimum_MM_length.part5}{
        Numeric (default = 23).
        Minimum length in hours of a MM day to be included in the cleaned \link{g.part5} results.}
        
      \item{study_dates_file}{
        Character (default = c()).
        Full path to csv file containing the first and last date of the expected 
        wear period for every study participant (dates are provided per individual).
        Expected format of the file is: First row with column headers followed
        by one row per recording. There should be three columns: first column is
        recording ID, which needs to match with the ID GGIR extracts from the
        accelerometer file; second column should contain the first date of the
        study; and third column the last date of the study. Date columns should be
        by default in format "23-04-2017", or in the date format specified by
        argument \code{study_dates_dateformat} (below). If not specified (default),
        then GGIR uses the first and last day of the recording as beginning
        and end of the study. Note that these dates are used on top of the
        \code{data_masking_strategy} selected.}
        
      \item{study_dates_dateformat}{
        Character (default = "%d-%m-%Y").
        To specify the date format used in the \code{study_dates_file} as used 
        by \link[base]{strptime}.}
        
      \item{strategy}{
        Deprecated and replaced by \code{data_masking_strategy}. If \code{strategy} 
        is specified then its value is passed on and used for \code{data_masking_strategy}.}
        
      \item{data_masking_strategy}{
        Numeric (default = 1).
        How to deal with knowledge about study protocol.
        \code{data_masking_strategy = 1} selects data based on \code{hrs.del.start}
        and \code{hrs.del.end}.
        \code{data_masking_strategy = 2} uses only the data between the first
        midnight and the last midnight.
        \code{data_masking_strategy = 3} selects the most active X days in the file, where X is
        specified by argument \code{ndayswindow} and the days are a series
        of 24-h blocks starting any time in the day (a number of hours at the beginning and end
        of this period can be deleted with arguments \code{hrs.del.start} and \code{hrs.del.end}).
        \code{data_masking_strategy = 4} uses only the data after the first midnight.
        \code{data_masking_strategy = 5} is similar to \code{data_masking_strategy = 3}, but it selects X complete
        calendar days where X is specified by argument \code{ndayswindow}
        (a number of hours at the beginning and end of this period can be deleted with
        arguments \code{hrs.del.start} and \code{hrs.del.end}).}

      \item{hrs.del.start}{
        Numeric (default = 0).
        How many HOURS after start of experiment did wearing
        of monitor start? Used in GGIR \link{g.part2} when \code{data_masking_strategy = 1}.}

      \item{hrs.del.end}{
        Numeric (default = 0).
        How many HOURS before the end of the experiment did
        wearing of monitor definitely end? Used in GGIR \link{g.part2} when \code{data_masking_strategy = 1}.}

      \item{maxdur}{
        Numeric (default = 0).
        How many DAYS after start of experiment did experiment
        definitely stop? (set to zero if unknown).}

      \item{ndayswindow}{
        Numeric (default = 7).
        If \code{data_masking_strategy} is set to 3 or 5, then this is the size of 
        the window as a
        number of days. For data_masking_strategy 3 value can be fractional, e.g. 7.5,
        while for data_masking_strategy 5 it needs to be an integer.}

      \item{includedaycrit.part5}{
        Numeric (default = 2/3).
        Inclusion criterion used in part 5 for the number of valid hours during the
        waking hours of a day.
        When the value is smaller than or equal to 1 it is used as a fraction of waking hours;
        when the value is above 1 it is used as the absolute number of valid hours required.
        Do not confuse this argument with argument \code{includedaycrit} which is
        only used in GGIR part 2 and applies to the entire day.}

      \item{segmentWEARcrit.part5}{
        Numeric (default = 0.5).
        Fraction of \code{qwindow} segment expected to be valid in part 5, where 
        0.3 indicates that at least 30 percent of the time should be valid.}
        
      \item{segmentDAYSPTcrit.part5}{
        Numeric vector of length 2 (default = c(0.9, 0)).
        Inclusion criteria for the proportion of the segment that should be 
        classified as day (awake) and spt (sleep period time) to be considered 
        valid. If you are interested in comparing time spent in behaviour then it
        is better to set one of the two numbers to 0, and the other defines the 
        proportion of the segment that should be classified as day or spt, respectively.
        The default setting would focus on waking hour
        segments and includes all segments that overlap for at least 90 percent 
        with waking hours. In order to shift focus to the SPT you could use
        c(0, 0.9) which ensures that all segments that overlap for at least
        90 percent with the SPT are included.
        Setting both to zero would be problematic when comparing time spent in
        behaviours between days or individuals: A complete segment
        would be averaged with an incomplete segment (someone going to bed or waking up
        in the middle of a segment) by which it is no longer clear whether the person
        is less active or sleeps more during that segment. Similarly it is not
        clear whether the person has more wakefulness during SPT for a segment or
        woke up or went to bed during the segment.
      }
      \item{includedaycrit}{
        Numeric (default = 16).
        Minimum required number of valid hours in calendar day specific to analysis 
        in part 2. If you specify two values as in c(16, 16) then the first value 
        will be used in part 2 and the second value will be used in part 5 and 
        applied as a criterion on the full part 5 window. Note that this is then 
        applied in addition to parameter includedaycrit.part5 which only looks 
        at valid data during waking hours.}

      \item{max_calendar_days}{
        Numeric (default = 0).
        The maximum number of calendar days to include (set to zero if unknown).}

      \item{nonWearEdgeCorrection}{
        Boolean (default = TRUE).
        If TRUE then the non-wear detection around the edges of the recording (first
        and last 3 hours) is corrected following the description in vanHees2013, as
        has been the default since then. This functionality is advisable when working 
        with sleep clinic or exercise lab data typically lasting less than a day.
      }

      \item{nonwear_approach}{
        Character (default = "2023").
        Whether to use the traditional version of the non-wear detection algorithm 
        (nonwear_approach = "2013") or the new version (nonwear_approach = "2023"). 
        The 2013 version would use the longsize window (windowsizes[3], one hour 
        as default) to check the conditions for nonwear identification and would 
        flag as nonwear the mediumsize window (windowsizes[2], 15 min as default) 
        in the middle. The 2023 version differs in that it flags as nonwear
        the full longsize window. For the 2013 method the longsize window is centered
        on the centre of the mediumsize window, while in the 2023 method the longsize window
        is aligned with its left edge to the left edge of the mediumsize window.
      }
    }
  }
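
  The study dates file and masking options in params_cleaning can be sketched as
  follows. This is a minimal illustration under stated assumptions: the recording
  IDs, dates, column names, and directories are all hypothetical, and the file
  only follows the three-column layout described for \code{study_dates_file}.

  \preformatted{
# Hypothetical recording IDs and dates in the default "23-04-2017" style
study_dates = data.frame(ID = c("101", "102"),
                         first_date = c("23-04-2017", "02-05-2017"),
                         last_date = c("30-04-2017", "09-05-2017"))
study_dates_path = file.path(tempdir(), "study_dates.csv")
write.csv(study_dates, file = study_dates_path, row.names = FALSE)

# Hypothetical data and output directories
GGIR(datadir = "C:/mydata",
     outputdir = "C:/myresults",
     study_dates_file = study_dates_path,
     # Select the most active 7 days within the expected wear period
     data_masking_strategy = 3,
     ndayswindow = 7,
     hrs.del.start = 0,
     hrs.del.end = 0)
  }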

  \subsection{params_phyact}{
    A list of parameters related to physical activity as used in GGIR \link{g.part2} and GGIR \link{g.part5}.
    \describe{

      \item{mvpathreshold}{
        Numeric (default = 100).
        Acceleration threshold for MVPA estimation in GGIR \link{g.part2}.
        This can be a single number or a vector of numbers,
        e.g., \code{mvpathreshold = c(100, 120)}.
        In the latter case the code will estimate MVPA separately for each threshold.
        If this variable is left blank, e.g., \code{mvpathreshold = c()}, then
        MVPA is not estimated.}

      \item{mvpadur}{
        Numeric (default = 10).
        The bout duration(s) for which MVPA will be calculated. Only used in GGIR \link{g.part2}.}

      \item{boutcriter}{
        Numeric (default = 0.8).
        A number between 0 and 1, it defines what fraction of a bout needs to be above the
        mvpathreshold, only used in GGIR \link{g.part2}.}

      \item{threshold.lig}{
        Numeric (default = 40).
        In \link{g.part5}: Threshold for light physical activity to
        separate inactivity from light. Value can be one number or a vector of multiple
        numbers, e.g., \code{threshold.lig = c(30, 40)}. If multiple numbers are entered then
        analysis will be repeated for each combination of threshold values. Threshold is
        applied to the first metric in the milestone data, so if you have only specified
        \code{do.enmo = TRUE} then it will be applied to ENMO.}

      \item{threshold.mod}{
        Numeric (default = 100).
        In \link{g.part5}: Threshold for moderate physical activity
        to separate light from moderate. Value can be one number or a vector of
        multiple numbers, e.g., \code{threshold.mod = c(100, 120)}.
        If multiple numbers are entered then analysis will be repeated for each
        combination of threshold values. Threshold is applied to the first metric in the
        milestone data, so if you have only specified \code{do.enmo = TRUE}
        then it will be applied to ENMO.}

      \item{threshold.vig}{
        Numeric (default = 400).
        In \link{g.part5}: Threshold for vigorous physical activity
        to separate moderate from vigorous. Value can be one number or a vector of
        multiple numbers, e.g., \code{threshold.vig = c(400, 500)}. If multiple numbers are
        entered then analysis will be repeated for each combination of threshold values.
        Threshold is applied to the first metric in the milestone data, so if you
        have only specified \code{do.enmo = TRUE} then it will be applied to ENMO.}

      \item{boutdur.mvpa}{
        Numeric (default = c(1, 5, 10)).
        Duration(s) of MVPA bouts in minutes to be extracted.
        Bouts are identified in order from the longest to the shortest duration.
        In the default setting, it will start with the 10 minute bouts, followed by 5 minute
        bouts in the rest of the data, and followed by 1 minute bouts in the rest of the data.}

      \item{boutdur.in}{
        Numeric (default = c(10, 20, 30)).
        Duration(s) of inactivity bouts in minutes to be extracted.
        Inactivity bouts are detected in the segments of the data which
        were not labelled as sleep or MVPA bouts.
        Bouts are identified in order from the longest to the shortest duration.
        In the default setting, it will start with the identification of 30 minute bouts,
        followed by 20 minute bouts in the rest of the data, and followed by 10 minute
        bouts in the rest of the data. Note that we use the term inactivity instead
        of sedentary behaviour for the lowest intensity level of behaviour. The reason
        for this is that GGIR currently does not attempt to classify the activity type
        sitting, and using the term sedentary behaviour would fail to communicate that.
      }

      \item{boutdur.lig}{
        Numeric (default = c(1, 5, 10)).
        Duration(s) of light activity bouts in minutes
        to be extracted. Light activity bouts are detected in the segments of the data
        which were not labelled as sleep, MVPA, or inactivity bouts.
        Bouts are identified in order from the longest to the shortest duration.
        In the default setting, this will start with the identification of
        10 minute bouts, followed by 5 minute bouts in the rest of the data, and followed
        by 1 minute bouts in the rest of the data.}

      \item{boutcriter.mvpa}{
        Numeric (default = 0.8).
        A number between 0 and 1, it defines what fraction of a bout needs to be above the \code{threshold.mod}.}

      \item{boutcriter.in}{
        Numeric (default = 0.9).
        A number between 0 and 1, it defines what fraction of a bout needs to be below the \code{threshold.lig}.}

      \item{boutcriter.lig}{
        Numeric (default = 0.8).
        A number between 0 and 1, it defines what fraction of a bout needs to be between
        the \code{threshold.lig} and the \code{threshold.mod}.}

      \item{frag.metrics}{
        Character (default = NULL).
        Fragmentation metric to extract. Can be "mean", "TP", "Gini",
        "power", "CoV", "NFragPM", or "all" to extract all of the above metrics.
        See package vignette for description of fragmentation metrics.}

      \item{part6_threshold_combi}{
        Character (default = "40_100_120").
        To indicate the threshold combination derived in
        part 5 to be used for part 6.
      }
    }
  }
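
  To illustrate what "each combination of threshold values" means for the part 5
  thresholds in params_phyact, the sketch below enumerates the combinations for
  the example vectors given in the argument descriptions. The label format is an
  assumption inferred from the \code{part6_threshold_combi} default, and the
  variable names are hypothetical; this is not GGIR's internal code.

  \preformatted{
# All combinations of the example threshold vectors
combis = expand.grid(threshold.lig = c(30, 40),
                     threshold.mod = c(100, 120),
                     threshold.vig = c(400, 500))
# Number of times the part 5 analysis would be repeated: 2 x 2 x 2
nrow(combis)

# Assumed label format light_moderate_vigorous
combi_labels = paste(combis$threshold.lig, combis$threshold.mod,
                     combis$threshold.vig, sep = "_")
  }

  The number of analyses grows multiplicatively with the length of each vector,
  which is worth keeping in mind for processing time and output size.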

  \subsection{params_sleep}{
    A list of parameters used to configure the sleep analysis as performed in
    GGIR \link{g.part3} and \link{g.part4}.
    \describe{

      \item{relyonguider}{
        Boolean (default = FALSE).
        Sustained inactivity bouts (sib) that overlap with the guider are 
        labelled as sleep. If \code{relyonguider = FALSE} and the sib overlaps only
        partially with the guider then it is the sib that defines the edge 
        of the SPT window and not the guider.
        If \code{relyonguider = TRUE} and the sib overlaps only partially with the 
        guider then it is the guider that defines the edge of the SPT window 
        and not the sib. If participants were instructed NOT to wear the
        accelerometer during waking hours and \code{ignorenonwear = FALSE} then
        set \code{relyonguider = TRUE}; in all other scenarios set it to FALSE.}

      \item{relyonsleeplog}{
        Boolean (default = FALSE).
        Do not use, now replaced by argument relyonguider.
        Values provided to argument relyonsleeplog will be passed on to
        argument relyonguider to preserve functionality of old R scripts.}

      \item{def.noc.sleep}{
        Numeric (default = 1).
        The time window during which sustained
        inactivity will be assumed to represent sleep, e.g., \code{def.noc.sleep = c(21, 9)}.
        This is only used if no sleep log entry is available. If
        left blank, \code{def.noc.sleep = c()}, then the 12 hour window centred
        at the least active 5 hours of the 24 hour period will be used
        instead. Here, L5 is hardcoded and will not change by changing
        argument winhr in function \link{g.part2}. If def.noc.sleep is filled
        with a single integer, e.g., \code{def.noc.sleep = c(1)}, then the window
        will be detected based on built-in algorithms.
        See argument \code{HASPT.algo} from \link{HASPT} for specifying which of the
        algorithms to use.}

      \item{sleepwindowType}{
        Character (default = "SPT").
        To indicate type of information in the sleeplog, "SPT" for sleep period time.
        Set to "TimeInBed" if sleep log recorded time in bed to enable calculation
        of sleep latency and sleep efficiency.}

      \item{nnights}{
        Numeric (default = NULL).
        This argument has been deprecated.}

      \item{loglocation}{
        Character (default = NULL).
        Path to csv file with sleep log information.
        See package vignette for how to format this file.}

      \item{colid}{
        Numeric (default = 1).
        Column number in the sleep log spreadsheet in which the participant ID code is stored.}

      \item{coln1}{
        Numeric (default = 2).
        Column number in the sleep log spreadsheet where the onset of the first night starts.}

      \item{ignorenonwear}{
        Boolean (default = TRUE).
        If TRUE then ignore detected monitor non-wear periods to avoid
        confusion between monitor non-wear time and sustained inactivity.}

      \item{constrain2range}{
        Deprecated, used to be a Boolean (default = TRUE).
        Whether or not to constrain the range of
        threshold used in the diary free sleep period time window detection.}

      \item{HASPT.algo}{
        Character (default = "HDCZA").
        To indicate what algorithm should be used for the sleep period time detection.
        Default "HDCZA" is the Heuristic algorithm looking at Distribution of Change in Z-Angle as
        described in van Hees et al. 2018. Other options include
        "HorAngle", which is based on HDCZA but replaces the non-movement detection of
        the HDCZA algorithm by looking for time segments where the angle of the
        longitudinal sensor axis relative to the horizontal plane is
        between -45 and +45 degrees, and "NotWorn", which is also based on HDCZA
        but looks for time segments when a rolling average of acceleration
        magnitude is below 5 per cent of its standard deviation. See the
        Cookbook vignette in the Annexes of https://wadpac.github.io/GGIR/
        for more detailed guidance on how to use "NotWorn".}
    
      \item{HDCZA_threshold}{
        Numeric (default = c()).
        If \code{HASPT.algo} is set to "HDCZA" and HDCZA_threshold is NOT NULL
        (e.g., HDCZA_threshold = 0.2), then that value will be used as threshold
        in the 6th step in the diagram of Figure 1 in van Hees
        et al. 2018 Scientific Reports (doi: 10.1038/s41598-018-31266-z). However,
        doing so has not been supported by research yet and is only intended to
        facilitate methodological research, so we advise sticking with the default in
        line with the publication. Further, if HDCZA_threshold is set to a numeric vector of
        length 2, e.g. c(10, 15), that will be used as percentile and 
        multiplier for the above mentioned 6th step.
      }

      \item{HASPT.ignore.invalid}{
        Boolean (default = FALSE).
        To indicate whether invalid time segments should be ignored in the
        heuristic guiders. If \code{FALSE} (default), the imputed 
        angle or activity metric during the invalid time segments are used.
        If \code{TRUE}, invalid time segments are ignored (i.e., they cannot 
        contribute to the guider). If \code{NA}, then invalid time segments are 
        considered to be no movement segments and can contribute to the guider.
        When HASPT.algo is "NotWorn", HASPT.ignore.invalid is automatically set to
        NA.
        }

      \item{HASIB.algo}{
        Character (default = "vanHees2015").
        To indicate which algorithm should be used to define the
        sustained inactivity bouts (i.e., likely sleep).
        Options: "vanHees2015", "Sadeh1994", "Galland2012".}

      \item{Sadeh_axis}{
        Character (default = "Y").
        To indicate which axis to use for the Sadeh1994 algorithm, and other algorithms
        that rely on count-based actigraphy such as Galland2012.}

      \item{sleeplogsep}{
        Character (default = NULL).
        This argument is deprecated.}

      \item{nap_model}{
        Character (default = NULL).
        To specify classification model. Currently the only option is "hip3yr", which
        corresponds to a model trained with hip data in 3-3.5 year olds and parent diary data.}

      \item{longitudinal_axis}{
        Integer (default = NULL).
        To indicate which axis is the longitudinal axis.
        If not provided, the function will estimate longitudinal axis as the axis
        with the highest 24 hour lagged autocorrelation. Only used when
        \code{sensor.location = "hip"} or \code{HASPT.algo = "HorAngle"}.}

      \item{anglethreshold}{
        Numeric (default = 5).
        Angle threshold (degrees) for sustained inactivity periods detection.
        The algorithm will look for periods of time (\code{timethreshold})
        in which the angle variability is lower than \code{anglethreshold}.
        This can be specified as multiple thresholds, each of which will be implemented, e.g.,
        \code{anglethreshold = c(5,10)}.}

      \item{timethreshold}{
        Numeric (default = 5).
        Time threshold (minutes) for sustained inactivity periods detection.
        The algorithm will look for periods of time (\code{timethreshold})
        in which the angle variability is lower than \code{anglethreshold}.
        This can be specified as multiple thresholds, each of which will be implemented, e.g.,
        \code{timethreshold = c(5,10)}.}

      \item{possible_nap_window}{
        Numeric (default = c(9, 18)).
        Numeric vector of length two with range in clock hours during which naps are
        assumed to take place, e.g., \code{possible_nap_window = c(9, 18)}. Currently
        used in the context of an explorative nap classification algorithm that
        was trained in 3.5 year olds.}

      \item{possible_nap_dur}{
        Numeric (default = c(15, 240)).
        Numeric vector of length two with range in duration (minutes) of a nap,
        e.g., \code{possible_nap_dur = c(15, 240)}. Currently
        used in the context of an explorative nap classification algorithm that
        was trained in 3.5 year olds.}
        
      \item{sleepefficiency.metric}{
        Numeric (default = 1).
        If 1 (default), sleep efficiency is calculated as detected sleep time during
        the SPT window divided by log-derived time in bed. If 2, sleep efficiency is
        calculated as detected sleep time during the SPT window divided by detected
        duration in sleep period time plus sleep latency (where sleep latency refers
        to the difference between time in bed and sleep onset).
        \code{sleepefficiency.metric} is only considered
        when argument \code{sleepwindowType = "TimeInBed"}.}
        
      \item{possible_nap_edge_acc}{
        Numeric (default = Inf).
        Maximum acceleration before or after the SIB for the nap to be considered.
        By default this will allow all possible naps.
      }
    }
  }
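
  A minimal sketch of a sleep analysis configuration that combines several of the
  params_sleep arguments, assuming a sleep log that records time in bed. All paths
  are hypothetical and the sleep log is assumed to be formatted as described in
  the package vignette.

  \preformatted{
# Hypothetical paths; replace with your own directories and sleep log
GGIR(datadir = "C:/mydata",
     outputdir = "C:/myresults",
     mode = 1:4,
     loglocation = "C:/mysleeplog.csv",
     colid = 1,   # column with the participant ID
     coln1 = 2,   # column where the first night starts
     # Sleep log holds time in bed, enabling sleep latency and efficiency
     sleepwindowType = "TimeInBed",
     sleepefficiency.metric = 2,
     HASIB.algo = "vanHees2015",
     anglethreshold = 5,
     timethreshold = 5)
  }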

  \subsection{params_247}{
    A list of parameters related to the description of 24/7 behaviours that do not fall
    under conventional physical activity or sleep outcomes. These parameters are used
    in GGIR \link{g.part2} and GGIR \link{g.part5}:
    \describe{

      \item{qwindow}{
        Numeric or character (default = c(0, 24)).
        To specify windows over which all variables are calculated, e.g., acceleration 
        distribution, number of valid hours, LXMX analysis, MVPA.
        If numeric, qwindow should have length two. With \code{qwindow = c(0, 24)}
        all variables will only be calculated over the full 24 hours in a day. With
        \code{qwindow = c(8, 24)} variables will be calculated over the windows 0-8, 8-24 and 0-24.
        All days in the recording will be segmented based on these values.
        If you want to use a day specific segmentation in each day then you can set
        qwindow to be the full path to an activity diary file (character). Expected
        format of the activity diary is: First row with column headers followed by one row
        per recording. The first column is the recording ID, which needs to match with the
        ID GGIR extracts from the accelerometer file. This is followed by a date column in
        format "23-04-2017", where the date format is specified by argument
        \code{qwindow_dateformat} (below). Use the character combination date,
        Date or DATE in the column name. This is followed by one or multiple columns
        with start times for the activity types in that day, in format hours:minutes:seconds.
        The header of the column will be used as label for each activity type.
        Insert a new date column before continuing with activity types for the next day.
        Leave missing values empty. If an activity log is used then individuals who
        do not appear in the activity log will still be processed with value
        \code{qwindow = c(0, 24)}. Dates with no activity log data can be skipped;
        there is no need to have a column with the date followed by a column with the next
        date. If times in the activity diary are not a multiple of the short window
        size (epoch length), the next epoch is considered (e.g., with an epoch of 5
        seconds, 8:00:02 will be redefined as 8:00:05 in the activity log).
        When using the qwindow functionality in combination with GGIR part 5 then
        make sure to check that arguments \code{segmentWEARcrit.part5} and 
        \code{segmentDAYSPTcrit.part5} are specified to your research needs.
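
        A minimal sketch of the numeric form, assuming hypothetical data and
        output folders; only \code{qwindow} is the point of this example:
        \preformatted{
# Hypothetical paths; replace with your own folders.
GGIR(datadir = "C:/mydata",
     outputdir = "C:/myresults",
     mode = 1:2,
     # Day segmented at 8:00, so variables are reported
     # for the windows 0-8, 8-24 and 0-24.
     qwindow = c(8, 24))
        }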
        }

      \item{qwindow_dateformat}{
        Character (default = "%d-%m-%Y").
        To specify the date format used in the activity log as used by \link[base]{strptime}.}

      \item{M5L5res}{
        Numeric (default = 10).
        Resolution of L5 and M5 analysis in minutes.}

      \item{winhr}{
        Numeric (default = 5).
        Vector of window size(s) (unit: hours) of LX and MX analysis, i.e., the
        number of consecutive hours X over which the least (LX) and most (MX)
        active periods are identified.}

      \item{qlevels}{
        Numeric (default = NULL).
        Vector of percentiles for which values need to be extracted. These need to
        be expressed as a fraction of 1, e.g., c(0.1, 0.5, 0.75). There is no limit
        to the number of percentiles. If left empty then percentiles will not be extracted.
        Distribution will be derived from short epoch metric data. Argument qlevels
        can, for example, be used for the MX-metrics (e.g., Rowlands et al.) as discussed in the
        \href{https://cran.r-project.org/package=GGIR/vignettes/GGIR.html}{main package vignette}.}

      \item{ilevels}{
        Numeric (default = NULL).
        Levels for acceleration value frequency distribution in m\emph{g}, e.g., 
        \code{ilevels = c(0,100,200)}. There is no limit to the number of levels. 
        If left empty then the intensity levels will not be extracted. Distribution 
        will be derived from short epoch metric data.}

      \item{iglevels}{
        Numeric (default = NULL).
        Levels for acceleration value frequency distribution
        in m\emph{g} used for intensity gradient calculation (according to the method by
        Rowlands 2018). By default this argument is empty and the intensity gradient
        calculation is not done. The user can either provide any single value to
        make the intensity gradient use the bins \code{iglevels = c(seq(0,4000,by=25), 8000)}
        or specify their own distribution. There is no restriction on the
        number of levels.}

      \item{IVIS_windowsize_minutes}{
        Numeric (default = 60).
        Window size of the Intradaily Variability (IV) and Interdaily
        Stability (IS) metrics in minutes; a whole number of windows needs to
        add up to 24 hours.}

      \item{IVIS_epochsize_seconds}{
        Numeric (default = NULL).
        This argument is deprecated.}

      \item{IVIS.activity.metric}{
        Numeric (default = 2).
        Metric used for activity calculation.
        Value = 1, uses continuous scaled acceleration.
        Value = 2, tries to collapse acceleration into a binary score of rest
        versus active to try to simulate the original approach.}

      \item{IVIS_acc_threshold}{
        Numeric (default = 20).
        Acceleration threshold to distinguish inactive from active.}

      \item{qM5L5}{
        Numeric (default = NULL).
        Percentiles (quantiles) to be calculated over L5 and M5 window.}

      \item{MX.ig.min.dur}{
        Numeric (default = 10).
        Minimum MX duration needed in order for intensity gradient to be calculated.}

      \item{LUXthresholds}{
        Numeric (default = c(0, 100, 500, 1000, 3000, 5000, 10000)).
        Vector with numeric sequence corresponding to
        the thresholds used to calculate time spent in LUX ranges.}

      \item{LUX_cal_constant}{
        Numeric (default = NULL).
        If both LUX_cal_constant and LUX_cal_exponent are provided, LUX values
        are converted based on the formula y = constant * exp(x * exponent).}

      \item{LUX_cal_exponent}{
        Numeric (default = NULL).
        If both LUX_cal_constant and LUX_cal_exponent are provided, LUX values
        are converted based on the formula y = constant * exp(x * exponent).}

      \item{LUX_day_segments}{
        Numeric (default = NULL).
        Vector with hours at which the day should be segmented for
        the LUX analysis.}

      \item{L5M5window}{
        Argument deprecated after version 1.5-24.
        This argument used to define the start and end time, in 24 hour clock hours,
        over which L5M5 needs to be calculated. Now this is done with argument qwindow.}

      \item{cosinor}{
        Boolean (default = FALSE). Whether to apply the cosinor analysis from the ActCR package.}
        
      \item{part6CR}{
        Boolean (default = FALSE) to indicate whether circadian rhythm analysis should be run by part 6.
      }
      \item{part6HCA}{
        Boolean (default = FALSE) to indicate whether Household Co-Analysis should
        be run by part 6.
      }
      \item{part6Window}{
        Character vector with length two (default = c("start", "end")) to indicate
        the start and the end of the time series to be used for circadian rhythm analysis
        in part 6. In other words, this parameter is not used for Household Co-Analysis.
        Alternative values are: "Wx", "Ox", "Hx", where "x" is a number to indicate
        the xth wakeup, onset or hour of the recording. Negative values for "x"
        are also possible and will count relative to the end of the recording. For example,
        c("W1", "W-1") goes from the first till the last wakeup, c("H5", "H-5") 
        ignores the first and last 5 hours, and c("O2", "W10") goes from the second
        onset till the 10th wakeup time.
      }
    }
  }

  \subsection{params_output}{
    A list of parameters used to specify whether and how GGIR stores its output at various stages of the
    process.
    \describe{

      \item{storefolderstructure}{
        Boolean (default = FALSE).
        Store folder structure of the accelerometer data.}

      \item{do.part2.pdf}{
        Boolean (default = TRUE).
        In \link{g.part2}: Whether to generate a pdf for \link{g.part2}.}

      \item{do.part3.pdf}{
        Boolean (default = TRUE).
        In \link{g.part3}: Whether to generate a pdf for \link{g.part3}.}

      \item{timewindow}{
        Character (default = c("MM", "WW")).
        In \link{g.part5}: Timewindow over which summary statistics are derived.
        Value can be "MM" (midnight to midnight), "WW" (waking time to waking time),
        "OO" (sleep onset to sleep onset), or any combination of them.}

      \item{save_ms5rawlevels}{
        Boolean (default = FALSE).
        In \link{g.part5}: Whether to save the time series classification (levels)
        as csv or RData files (as defined by \code{save_ms5raw_format}). Note that 
        time stamps will be stored in the column \code{timenum} in UTC format (i.e., 
        seconds from 1970-01-01). To convert timenum to time stamp format, you 
        need to specify your desired time zone, e.g., 
        \code{as.POSIXct(mdat$timenum, tz = "Europe/London")}.}

      \item{save_ms5raw_format}{
        Character (default = "csv").
        In \link{g.part5}: To specify how data should be stored: either "csv" or 
        "RData". Only used if \code{save_ms5rawlevels = TRUE}.}

      \item{save_ms5raw_without_invalid}{
        Boolean (default = TRUE).
        In \link{g.part5}: To indicate whether to remove invalid days from the 
        time series output files. Only used if \code{save_ms5rawlevels = TRUE}.}

      \item{epochvalues2csv}{
        Boolean (default = FALSE).
        In \link{g.part2}: If TRUE then epoch values are exported to a csv file.
        Here, non-wear time is imputed where possible.}

      \item{do.sibreport}{
        Boolean (default = FALSE).
        In \link{g.part4}: To indicate whether to generate a report for the sustained
        inactivity bouts (SIB). If set to TRUE and an advanced sleep diary is
        available in part 4, then part 5 will use this to generate summary statistics
        on the overlap between self-reported nonwear and napping with SIB. Here,
        SIB can be filtered based on argument \code{possible_nap_edge_acc} and the first value
        of \code{possible_nap_dur}.}

      \item{do.visual}{
        Boolean (default = TRUE).
        In \link{g.part4}: If TRUE, the function will generate a pdf with a visual
        representation of the overlap between the sleeplog entries and the accelerometer
        detections. This can be used to visually verify that the sleeplog entries do
        not come with obvious mistakes.}

      \item{outliers.only}{
        Boolean (default = FALSE).
        In \link{g.part4}: Only used if \code{do.visual = TRUE}. If FALSE,
        all available nights are included in the visual representation of the data and sleeplog.
        If TRUE, then only nights with a difference in onset or waking time
        larger than the value of argument \code{criterror} will be included.}

      \item{criterror}{
        Numeric (default = 3).
        In \link{g.part4}: Only used if \code{do.visual = TRUE} and \code{outliers.only = TRUE}.
        criterror specifies the minimum difference in hours
        between sleep log and accelerometer estimate for the night to be
        included in the visualisation.}

      \item{visualreport}{
        Boolean (default = TRUE).
        If TRUE, then generate visual report based on combined output
        from \link{g.part2} and \link{g.part4}. Please note that the visual report
        was initially developed to provide something to show to study participants
        and not for data quality checking purposes. Over time we have improved
        the visual report to also be useful for QC-ing the data. However, some of
        the scorings as shown in the visual report are created for the visual report
        only and may not reflect the scorings in the main GGIR analyses as reported in the
        quantitative csv-reports. Most of our effort in the past 10 years has gone
        into making sure that the csv-reports are correct, while the visualreport has
        mostly been a side project. This is unfortunate and we hope to find funding
        in the future to design a new report specifically for the purpose of
        QC-ing the analyses done by GGIR.}

      \item{viewingwindow}{
        Numeric (default = 1).
        Centre the day as displayed around noon (\code{viewingwindow = 1}) or around 
        midnight (\code{viewingwindow = 2}) in the visual report generated with 
        \code{visualreport = TRUE}.}

      \item{week_weekend_aggregate.part5}{
        Boolean (default = FALSE).
        In \link{g.part5}: To indicate whether week and weekend-days aggregates
        should be stored. This is turned off by default as it generates a
        large number of extra columns in the output report.}

      \item{dofirstpage}{
        Boolean (default = TRUE).
        To indicate whether a first page with histograms summarizing the whole
        measurement should be added in the file summary reports generated with \code{visualreport = TRUE}.}

      \item{sep_reports}{
        Character (default = ",").
        Value used as sep argument in \link[data.table]{fwrite} for writing csv reports.}
        
      \item{dec_reports}{
        Character (default = ".").
        Value used as dec argument in \link[data.table]{fwrite} for writing csv reports.}

      \item{sep_config}{
        Character (default = ",").
        Value used as sep argument in \link[data.table]{fwrite} for writing csv config file.}
        
      \item{dec_config}{
        Character (default = ".").
        Value used as dec argument in \link[data.table]{fwrite} for writing csv config file.}

      \item{visualreport_without_invalid}{
        Boolean (default = TRUE).
        If TRUE, then reports generated with \code{visualreport = TRUE} only show
        the windows with sufficiently valid data according to \code{includedaycrit}
        when \code{viewingwindow = 1} or \code{includenightcrit} when \code{viewingwindow = 2}.}
    }
  }
}
\examples{
\dontrun{
  mode = c(1,2,3,4,5)
  datadir = "C:/myfolder/mydata"
  outputdir = "C:/myresults"
  studyname = "test"
  f0 = 1
  f1 = 2
  GGIR(#-------------------------------
       # General parameters
       #-------------------------------
       mode = mode,
       datadir = datadir,
       outputdir = outputdir,
       studyname = studyname,
       f0 = f0,
       f1 = f1,
       overwrite = FALSE,
       do.imp = TRUE,
       idloc = 1,
       print.filename = FALSE,
       storefolderstructure = FALSE,
       #-------------------------------
       # Part 1 parameters:
       #-------------------------------
       windowsizes = c(5,900,3600),
       do.cal = TRUE,
       do.enmo = TRUE,
       do.anglez = TRUE,
       chunksize = 1,
       printsummary = TRUE,
       #-------------------------------
       # Part 2 parameters:
       #-------------------------------
       data_masking_strategy = 1,
       ndayswindow = 7,
       hrs.del.start = 1,
       hrs.del.end = 1,
       maxdur = 9,
       includedaycrit = 16,
       M5L5res = 10,
       winhr = c(5,10),
       qlevels = c(c(1380/1440),c(1410/1440)),
       qwindow = c(0,24),
       ilevels = c(seq(0,400,by=50),8000),
       mvpathreshold = c(100,120),
       #-------------------------------
       # Part 3 parameters:
       #-------------------------------
       timethreshold = c(5,10),
       anglethreshold = 5,
       ignorenonwear = TRUE,
       #-------------------------------
       # Part 4 parameters:
       #-------------------------------
       excludefirstlast = FALSE,
       includenightcrit = 16,
       def.noc.sleep = 1,
       loglocation = "D:/sleeplog.csv",
       outliers.only = FALSE,
       criterror = 4,
       relyonguider = FALSE,
       colid = 1,
       coln1 = 2,
       do.visual = TRUE,
       #-------------------------------
       # Part 5 parameters:
       #-------------------------------
       # Key functions: Merging physical activity with sleep analyses
       threshold.lig = c(30,40,50),
       threshold.mod = c(100,120),
       threshold.vig = c(400,500),
       excludefirstlast.part5 = FALSE,
       boutcriter = 0.8,
       boutcriter.in = 0.9,
       boutcriter.lig = 0.8,
       boutcriter.mvpa = 0.8,
       boutdur.in = c(10,20,30),
       boutdur.lig = c(1,5,10),
       boutdur.mvpa = c(1,5,10),
       timewindow = c("WW"),
       #-----------------------------------
       # Report generation
       #-------------------------------
       do.report = c(2,4,5))
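
  # Sketch (not part of the GGIR call above) of how the qlevels fractions
  # passed above relate to MX-metrics, assuming a complete 1440-minute day:
  # the acceleration above which the most active X minutes are spent
  # corresponds to the percentile (1440 - X) / 1440.
  # The variable names below are made up for illustration.
  mx_minutes = c(60, 30)
  qlevels_mx = (1440 - mx_minutes) / 1440
  # qlevels_mx holds the same fractions as c(1380/1440, 1410/1440)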

       # For externally derived Actiwatch data in .AWD format:
       GGIR(datadir = "/media/actiwatch_awd", # folder with epoch level .AWD file
          outputdir = "/media/myoutput",
          dataFormat = "actiwatch_awd",
          extEpochData_timeformat = "\%m/\%d/\%Y \%H:\%M:\%S",
          mode = 1:5,
          do.report = c(2, 4, 5),
          windowsizes = c(60, 900, 3600), # 60 is the expected epoch length
          visualreport = FALSE,
          outliers.only = FALSE,
          overwrite = TRUE,
          HASIB.algo = "Sadeh1994",
          def.noc.sleep = c()) # <= because we cannot use HDCZA for ZCY

       # For externally derived Actiwatch data in .CSV format:
       GGIR(datadir = "/media/actiwatch_csv", # folder with epoch level .CSV file
          outputdir = "/media/myoutput",
          dataFormat = "actiwatch_csv",
          extEpochData_timeformat = "\%m/\%d/\%Y \%H:\%M:\%S",
          mode = 1:5,
          do.report = c(2, 4, 5),
          windowsizes = c(15, 900, 3600), # 15 is the expected epoch length
          visualreport = FALSE,
          outliers.only = FALSE,
          HASIB.algo = "Sadeh1994",
          def.noc.sleep = c()) # <= because we cannot use HDCZA for ZCY

       # For externally derived UK Biobank data in .CSV format:
       GGIR(datadir = "/media/ukbiobank",
           outputdir = "/media/myoutput",
           dataFormat = "ukbiobank_csv",
           extEpochData_timeformat = "\%m/\%d/\%Y \%H:\%M:\%S",
           mode = c(1:2),
           do.report = c(2),
           windowsizes = c(5, 900, 3600), # We know that data was stored in 5 second epoch
           desiredtz = "Europe/London", # We know that data was collected in the UK
           visualreport = FALSE,
           overwrite = TRUE)
        
       # For externally derived ActiGraph count data in .CSV format assuming
       # a study protocol where sensor was not worn during the night:
       GGIR(datadir = "/examplefiles",
           outputdir = "",
           dataFormat = "actigraph_csv",
           mode = 1:5,
           do.report = c(2, 4, 5),
           windowsizes = c(5, 900, 3600),
           threshold.in = round(100 * (5/60), digits = 2),
           threshold.mod = round(2500 * (5/60), digits = 2),
           threshold.vig = round(10000 * (5/60), digits = 2),
           extEpochData_timeformat = "\%m/\%d/\%Y \%H:\%M:\%S",
           do.neishabouricounts = TRUE,
           acc.metric = "NeishabouriCount_x",
           HASPT.algo = "NotWorn",
           HASIB.algo = "NotWorn",
           do.visual = TRUE,
           includedaycrit = 10,
           includenightcrit = 10,
           visualreport = FALSE,
           outliers.only = FALSE,
           save_ms5rawlevels = TRUE,
           ignorenonwear = FALSE,
           HASPT.ignore.invalid = FALSE,
           save_ms5raw_without_invalid = FALSE)
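
       # Sketch of converting the timenum column, stored when
       # save_ms5rawlevels = TRUE, into time stamps in a chosen time zone.
       # The data frame below is made up for illustration; in practice mdat
       # would hold the time series stored by part 5. The origin is given
       # explicitly because R versions before 4.3.0 require it for numeric input.
       mdat = data.frame(timenum = c(1492905600, 1492905605))
       mdat$timestamp = as.POSIXct(mdat$timenum, origin = "1970-01-01",
                                   tz = "Europe/London")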
           
           
       # For externally derived SenseWear data in .xls format:
        GGIR(datadir = "C:/yoursenseweardatafolder",
            outputdir = "D:/youroutputfolder",
            mode = 1:5,
            windowsizes = c(60, 900, 3600),
            threshold.in = 1.5,
            threshold.mod = 3,
            threshold.vig = 6,
            dataFormat = "sensewear_xls",
            extEpochData_timeformat = "\%d-\%b-\%Y \%H:\%M:\%S",
            HASPT.algo = "NotWorn",
            desiredtz = "America/New_York",
            overwrite = TRUE,
            do.report = c(2, 4, 5),
            visualreport = FALSE)
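
       # Sketch of the conversion formula documented for LUX_cal_constant and
       # LUX_cal_exponent: y = constant * exp(x * exponent).
       # The calibration values and raw light values below are invented for
       # illustration only and are not recommended settings.
       LUX_cal_constant = 2
       LUX_cal_exponent = 0.01
       x = c(0, 100, 200)
       y = LUX_cal_constant * exp(x * LUX_cal_exponent)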
           
  }
}
\author{
  Vincent T van Hees <v.vanhees@accelting.com>
}
\references{
  \itemize{
    \item van Hees VT, Gorzelniak L, Dean Leon EC, Eder M, Pias M, et al. (2013) Separating
    Movement and Gravity Components in an Acceleration Signal and Implications for the
    Assessment of Human Daily Physical Activity. PLoS ONE 8(4): e61691.
    doi:10.1371/journal.pone.0061691
    \item van Hees VT, Fang Z, Langford J, Assah F, Mohammad A, da Silva IC, Trenell MI,
    White T, Wareham NJ, Brage S. Auto-calibration of accelerometer data for
    free-living physical activity assessment using local gravity and temperature:
    an evaluation on four continents. J Appl Physiol (1985). 2014 Aug 7
    \item van Hees VT, Sabia S, et al. (2015) A novel, open access method to
    assess sleep duration using a wrist-worn accelerometer, PLoS ONE, November 2015
  }
}
