% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/refine_chapter_overview.R
\name{refine_chapter_overview}
\alias{refine_chapter_overview}
\title{Processes A 'chapter_overview' Data Frame}
\usage{
refine_chapter_overview(
  chapter_overview = NULL,
  data = NULL,
  chunk_templates = NULL,
  label_separator = " - ",
  name_separator = NULL,
  single_y_bivariates_if_indep_cats_above = 3,
  single_y_bivariates_if_deps_above = 20,
  always_show_bi_for_indep = NULL,
  hide_bi_entry_if_sig_above = 1,
  hide_chunk_if_n_below = 10,
  hide_variable_if_all_na = TRUE,
  keep_dep_indep_if_no_overlap = FALSE,
  organize_by = c(".chapter_number", ".variable_label_prefix_dep",
    ".variable_name_indep", ".template_name"),
  arrange_section_by = c(.chapter_number = FALSE, chapter = FALSE, .variable_position_dep
    = FALSE, .variable_position_indep = FALSE, .template_name = FALSE),
  na_first_in_section = TRUE,
  max_width_obj = 128,
  max_width_chunk = 128,
  max_width_file = 64,
  max_width_folder_name = 12,
  sep_obj = "_",
  sep_chunk = "-",
  sep_file = "-",
  filename_prefix = "",
  ...,
  progress = TRUE,
  variable_group_dep = ".variable_group_dep",
  variable_group_prefix = NULL,
  n_range_glue_template_1 = "{n}",
  n_range_glue_template_2 = "[{n[1]}-{n[2]}]",
  log_file = NULL
)
}
\arguments{
\item{chapter_overview}{\emph{What goes into each chapter and sub-chapter}

\verb{obj:<data.frame>|obj:<tbl_df>} // Required

Data frame (or tibble, possibly grouped). One row per chapter. Should
contain the columns 'chapter' and 'dep', and optionally 'indep' (independent
variables) and other informative columns as needed.}

\item{data}{\emph{Survey data}

\verb{obj:<data.frame>|obj:<tbl_df>|obj:<srvyr>} // Required

A data frame (or a srvyr-object) containing the columns referred to in the
'dep' and 'indep' columns of \code{chapter_overview}.}

\item{chunk_templates}{\emph{Chunk templates}

\verb{obj:<data.frame>|obj:<tbl_df>|NULL} // \emph{default:} \code{NULL} (\code{optional})

Must contain columns \code{name} (user-specified unique name for the template),
\code{template} (the chunk template as a \code{{glue}}-specification), \code{variable_type_dep}
and optionally \code{variable_type_indep}. The latter two are list-columns of
prototype vectors specifying which data the template will be applied to.
Can optionally contain columns whose names match the default options for
the function. These will then override the default function-wide options
for the specific template.}

\item{label_separator}{\emph{Variable label separator}

\verb{scalar<character>} // \emph{default:} \code{" - "} (\code{optional})

String used to split variable labels into main question and sub-item.}

\item{name_separator}{\emph{Variable name separator}

\verb{scalar<character>} // \emph{default:} \code{NULL} (\code{optional})

String used to split column names in data into main question and sub-item.}

\item{single_y_bivariates_if_indep_cats_above}{\emph{Single y bivariates if indep-cats above ...}

\verb{scalar<integer>} // \emph{default:} \code{3} (\code{optional})

Figures and tables for bivariates can become very long if the independent
variable has many categories. This argument specifies the number of indep categories
above which only single y bivariates should be shown.}

\item{single_y_bivariates_if_deps_above}{\emph{Single y bivariates if dep-vars above ...}

\verb{scalar<integer>} // \emph{default:} \code{20} (\code{optional})

Figures and tables for bivariates can become very long if there are many dependent
variables in a battery/question matrix. This argument specifies the number of dep variables
above which only single y bivariates should be shown. Set to 0 to always show single y bivariates.}

\item{always_show_bi_for_indep}{\emph{Always show bivariate for indep-variable}

\verb{vector<character>} // \emph{default:} \code{NULL} (\code{optional})

Specific combinations with a by-variable where bivariates should always be shown.}

\item{hide_bi_entry_if_sig_above}{\emph{p-value threshold for hiding bivariate entry}

\verb{scalar<double>} // \emph{default:} \code{1} (\code{optional})

Hide a bivariate entry if its p-value is above this value.
The default of 1 shows all entries.}

\item{hide_chunk_if_n_below}{\emph{Hide result if N below}

\verb{scalar<integer>} // \emph{default:} \code{10} (\code{optional})

Hide a result if N for a given dataset is below this value.
NOTE: Exceptions are made for chr_table and chr_plot as these are
typically exempted in the first place. This might change in the future with
a separate argument.}

\item{hide_variable_if_all_na}{\emph{Hide variable from outputs if containing all NA}

\verb{scalar<boolean>} // \emph{default:} \code{TRUE} (\code{optional})

Whether to remove variables if all values are NA.}

\item{keep_dep_indep_if_no_overlap}{\emph{Keep dep-indep if no overlap}

\verb{scalar<boolean>} // \emph{default:} \code{FALSE} (\code{optional})

Whether to keep dep-indep rows if there is no overlap.}

\item{organize_by}{\emph{Grouping columns}

\verb{vector<character>} // \emph{default:} \code{c(".chapter_number", ".variable_label_prefix_dep", ".variable_name_indep", ".template_name")} (\code{optional})

Column names used for identifying chapters and sections.}

\item{arrange_section_by}{\emph{Sorting columns}

\verb{vector<character>} or \verb{named vector<logical>} // \emph{default:} a named logical vector with all elements \code{FALSE}, see Usage (\code{optional})

Column names used for sorting sections within each organize_by group. Can include
any column present in the output dataframe (both original and generated columns).
If character vector, will assume all are to be arranged in ascending order.
If a named logical vector, FALSE will indicate ascending, TRUE descending.
An error will be thrown if any specified column does not exist in the output.
Defaults to sorting in ascending order (alphabetical) for commonly needed
variable name/label info, and in descending order for chunk_templates as one
typically wants \emph{u}nivariates before \emph{b}ivariates.}

\item{na_first_in_section}{\emph{Whether to place NAs first when sorting}

\verb{scalar<logical>} // \emph{default:} \code{TRUE} (\code{optional})

Default ascending and descending sorting with \code{dplyr::arrange()} is to place
NAs at the end. This would have placed univariates at the end, etc. Thus,
saros places NAs first in the section. Set this to FALSE to override.}

\item{max_width_obj, max_width_chunk, max_width_file}{\emph{Maximum object width}

\verb{scalar<integer>} // \emph{default:} \code{128}, \code{128} and \code{64} respectively (\code{optional})

Maximum width for names of objects (in R/Python environment),
chunks (#| label: ) and optional files. Note that variable labels are
always replaced with variable names, to avoid very long file names.
Note for filenames: OneDrive has a maximum path length of about
400 characters, which is quickly exceeded by a long base path,
long file names (if labels are used as part of the structure), and the
hashing done by Quarto's \code{cache: true} feature. Consider setting
max_width_file lower than you would otherwise have preferred.}

\item{max_width_folder_name}{\emph{Maximum clean folder name length}

\verb{scalar<integer>} // \emph{default:} \code{12} (\code{optional})

Whereas \code{max_width_file} truncates the file name, this argument truncates
the folder name. It does not affect the report or chapter names on the website,
only the folders.}

\item{sep_obj, sep_chunk, sep_file}{\emph{Separator string}

\verb{scalar<character>} // \emph{default:} \code{"_"}, \code{"-"} and \code{"-"} respectively (\code{optional})

Separator to use between grouping variables. Defaults to underscore for
object names and hyphen for chunk labels and file names.}

\item{filename_prefix}{\emph{Prefix string for all qmd filenames}

\verb{scalar<character>} // \emph{default:} \code{""} (\code{optional})

In a mesos setup it can be useful to prefix these files (and related sub-folders) with an underscore
(\code{filename_prefix = "_"}), as other stub files will include these main qmd files.}

\item{...}{\emph{Dynamic dots}

<\href{https://rlang.r-lib.org/reference/dyn-dots.html}{\code{dynamic-dots}}>

Arguments forwarded to the corresponding functions that create the elements.}

\item{progress}{\emph{Whether to display progress message}

\verb{scalar<logical>} // \emph{default:} \code{TRUE}

Mostly useful when hide_bi_entry_if_sig_above < 1.}

\item{variable_group_dep}{\emph{Name for the variable_group_dep column}

\verb{scalar<string>} // \emph{default:} \code{".variable_group_dep"}

This column is used to group variables that are part of the same bivariate analysis.}

\item{variable_group_prefix}{\emph{Set a prefix to more easily find it in your labels}

\verb{scalar<string>} // \emph{default:} \code{NULL}

By default, the .variable_group column is just integers. If you wish to
use this as part of your object/label/filename numbering scheme, a number
by itself will not be very informative. Hence you could set a prefix such
as "Group" to distinguish this column from other columns in the chapter_structure.}

\item{n_range_glue_template_1, n_range_glue_template_2}{\verb{scalar<string>} // \emph{default:} \verb{"\{n\}"} and \verb{"[\{n[1]\}-\{n[2]\}]"} (\code{optional})

Glue templates for the n_range columns to be created.}

\item{log_file}{\emph{Path to log file}

\verb{scalar<string>} // \emph{default:} \code{NULL} (\code{optional})

Path to log file. Set to NULL to disable logging.}
}
\value{
A grouped tibble (data.frame) with columns that fall into two main categories:

\strong{Input columns (from user data):}
\itemize{
\item \code{chapter} (character): Chapter name (input)
\item \code{dep} (character): Dependent variable selector (input)
\item \code{indep} (character, optional): Independent variable selector (input)
}

\strong{Constructed columns (all start with a dot):}
\itemize{
\item \code{.variable_name}, \code{.variable_position} (character/integer): Variable name and position
\item \code{.variable_label}, \code{.variable_label_prefix}, \code{.variable_label_suffix} (character): Variable label and its components
\item \code{.variable_type}, \code{.variable_type_dep}, \code{.variable_type_indep} (character): Variable type(s)
\item \code{.variable_name_dep}, \code{.variable_name_indep} (character): Names of dependent/independent variables
\item \code{.variable_label_prefix_dep}, \code{.variable_label_prefix_indep} (character): Label prefixes for dep/indep
\item \code{.variable_group_dep} (character/factor): Grouping variable for bivariate analysis
\item \code{.variable_group_id} (integer): Numeric group identifier for bivariate analysis
\item \code{.chapter_number} (integer): Chapter number
\item \code{.template_name} (character): Name of chunk template used
\item \code{.obj_name}, \code{.chunk_name}, \code{.file_name} (character): Object, chunk, and file names (for output)
\item \code{.n}, \code{.n_range} (integer/character): Sample size and range
\item \code{.n_cats_dep}, \code{.n_cats_indep} (integer): Number of categories for dep/indep
\item \code{.max_chars_labels_dep}, \code{.max_chars_labels_indep} (integer): Max label length for dep/indep
\item \code{.max_chars_cats_dep}, \code{.max_chars_cats_indep} (integer): Max category label length for dep/indep
\item \code{.n_dep}, \code{.n_indep} (integer): Number of dep/indep variables in group
\item \code{.bi_test}, \code{.p_value} (character/numeric): Statistical test name and p-value for bivariates
\item \code{.keep_bi_rows} (logical): Whether bivariate row is kept
\item Other columns may be present depending on chunk templates and options.
}

\strong{Row count estimate:}
\itemize{
\item The number of rows in the output depends on the number of chapters,
dep/indep combinations, and chunk templates. Typically, it is the sum of
all unique variable combinations specified in \code{chapter_overview}, expanded
by chunk templates and filtered by significance and other options.
For a simple overview, expect one row per variable per chapter; for
bivariates, one row per dep-indep pair.
}

\strong{Grouping variables:}
\itemize{
\item The columns used for grouping (i.e., \code{dplyr::grouped_df}) are determined
by the \code{organize_by} argument. By default, this includes
\code{.chapter_number}, \code{.variable_label_prefix_dep}, \code{.variable_name_indep},
and \code{.template_name}, but can be customized. These columns define how
the output is grouped for further analysis or reporting.
}

See function source and documentation for details on each column's meaning and usage.
}
\description{
Processes A 'chapter_overview' Data Frame
}
\examples{
ref_df <- refine_chapter_overview(
  chapter_overview = ex_survey_ch_overview
)
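
# Not run: a hedged sketch rather than a verified example. It assumes the
# example dataset `ex_survey` is shipped with the package alongside
# `ex_survey_ch_overview`; substitute your own survey data otherwise.
\dontrun{
ref_df2 <- refine_chapter_overview(
  chapter_overview = ex_survey_ch_overview,
  data = ex_survey,
  label_separator = " - ",            # split labels into main question and sub-item
  hide_bi_entry_if_sig_above = 0.05,  # hide bivariate entries with p-values above 0.05
  hide_chunk_if_n_below = 10,         # hide results based on fewer than 10 observations
  log_file = NULL                     # NULL disables logging
)
# The grouping columns are documented to follow `organize_by`.
dplyr::group_vars(ref_df2)
}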
}
