% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/GenAlgControl.R
\name{genAlgControl}
\alias{genAlgControl}
\title{Set control arguments for the genetic algorithm}
\usage{
genAlgControl(populationSize, numGenerations, minVariables, maxVariables,
  elitism = 10L, mutationProbability = 0.01, crossover = c("single",
  "random"), maxDuplicateEliminationTries = 0L, verbosity = 0L,
  badSolutionThreshold = 2, fitnessScaling = c("none", "exp"))
}
\arguments{
\item{populationSize}{The number of "chromosomes" in the population (between 1 and 2^16)}

\item{numGenerations}{The number of generations to produce (between 1 and 2^16)}

\item{minVariables}{The minimum number of variables in the variable subset (between 0 and p - 1 where p is the total number of variables)}

\item{maxVariables}{The maximum number of variables in the variable subset (between 1 and p, and greater than \code{minVariables})}

\item{elitism}{The number of absolute best chromosomes to keep across all generations (between 1 and min(\code{populationSize} * \code{numGenerations}, 2^16))}

\item{mutationProbability}{The probability of mutation (between 0 and 1)}

\item{crossover}{The crossover type to use during mating (see details). Partial matching is performed.}

\item{maxDuplicateEliminationTries}{The maximum number of tries to eliminate duplicates
(a value of \code{0} or \code{NULL} means that no checks for duplicates are done).}

\item{verbosity}{The level of verbosity. 0 means no output at all, 2 is very verbose.}

\item{badSolutionThreshold}{The worst child must not be more than \code{badSolutionThreshold} times worse than the worst parent.
If less than 0, the child must be even better than the worst parent. If the algorithm cannot find a better child
for a long time, it issues a warning and continues with the last child found.}

\item{fitnessScaling}{How the fitness values are internally scaled before the selection probabilities are assigned
to the chromosomes. See the details for possible values and their meaning.}
}
\value{
An object of type \code{\link{GenAlgControl}}
}
\description{
The population must be large enough to allow the algorithm to explore the whole solution space. If
the initial population is not diverse enough, the chance of finding the global optimum is very small.
Thus, the more variables there are to choose from, the larger the population has to be.
}
\details{
The initial population is generated randomly. Every chromosome uses between \code{minVariables} and
\code{maxVariables} variables; the number of variables is uniformly distributed over this range.
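
As a rough illustration of this initialization, the following R sketch draws one random chromosome.
It is not the package's internal code: the function \code{randomChromosome} and the representation
of a chromosome as a logical vector of length \code{p} are assumptions made for this example only.

\preformatted{
# Hypothetical helper, for illustration only (not part of the package)
randomChromosome <- function(p, minVariables, maxVariables) {
    # Draw the subset size uniformly from minVariables, ..., maxVariables
    sizes <- minVariables:maxVariables
    size <- sizes[sample.int(length(sizes), 1)]
    # Mark that many randomly chosen variables as selected
    chromosome <- logical(p)
    chromosome[sample.int(p, size)] <- TRUE
    chromosome
}

set.seed(1)
chromosome <- randomChromosome(p = 50, minVariables = 5, maxVariables = 12)
sum(chromosome)  # number of selected variables, between 5 and 12
}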

If the mutation probability (\code{mutationProbability}) is greater than 0, a random number of
variables is added to or removed from each offspring chromosome; this number follows a truncated geometric distribution.
The resulting distribution of the total number of variables in the subset is then no longer exactly uniform, but close to it
(the smaller the mutation probability, the closer to uniform the distribution). This should not be a problem for most
applications.

The user can choose between \code{single} and \code{random} crossover for the mating process. With single crossover,
a single randomly chosen position marks where both parent chromosomes are split. The first child
then combines the 1st part of the 1st parent with the 2nd part of the 2nd parent, and the second child
combines the 2nd part of the 1st parent with the 1st part of the 2nd parent.
With random crossover, a random number of random positions is drawn, and the values at these positions are transferred
from one parent to the other to generate the children.
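
The two crossover types can be sketched in R as follows. This is a minimal illustration, not the
package's internal code: the helper names \code{singleCrossover} and \code{randomCrossover} are hypothetical,
chromosomes are assumed to be logical vectors, and the split position and swap positions are passed in
explicitly instead of being drawn at random.

\preformatted{
# Hypothetical helpers, for illustration only (not part of the package)
singleCrossover <- function(parent1, parent2, split) {
    n <- length(parent1)
    stopifnot(length(parent2) == n, split >= 1, split < n)
    first <- seq_len(split)   # positions before the split
    second <- (split + 1):n   # positions after the split
    list(
        child1 = c(parent1[first], parent2[second]),
        child2 = c(parent2[first], parent1[second])
    )
}

randomCrossover <- function(parent1, parent2, positions) {
    child1 <- parent1
    child2 <- parent2
    # Exchange the values at the chosen positions between the parents
    child1[positions] <- parent2[positions]
    child2[positions] <- parent1[positions]
    list(child1 = child1, child2 = child2)
}

p1 <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
p2 <- c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)

single <- singleCrossover(p1, p2, split = 2)
single$child1  # TRUE TRUE TRUE FALSE FALSE TRUE
single$child2  # FALSE TRUE FALSE FALSE TRUE FALSE

rand <- randomCrossover(p1, p2, positions = c(1, 5))
rand$child1    # FALSE TRUE FALSE FALSE FALSE FALSE
rand$child2    # TRUE TRUE TRUE FALSE TRUE TRUE
}

Note that in this sketch a child can end up with a different number of selected variables than either parent.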

Elitism is a method of enhancing the GA by keeping track of very good solutions. The parameter \code{elitism}
specifies how many "very good" solutions should be kept.

Before the selection probabilities are determined, the fitness values \eqn{f} of the chromosomes are
standardized to z-scores (\eqn{z = (f - \mu) / \sigma}{z = (f - mu) / sd}), where \eqn{\mu}{mu} and \eqn{\sigma}{sd}
are the mean and standard deviation of the fitness values. Scaling the standardized values afterwards with
the exponential function can help the algorithm find good solutions faster. When
\code{fitnessScaling} is set to \code{"exp"}, the standardized fitness \eqn{z} is transformed to \eqn{\exp(z)}{exp(z)}.
This gives good solutions an even higher selection probability and bad solutions
an even lower selection probability.
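
The scaling described above can be reproduced with a few lines of R. This is only a sketch with made-up
fitness values; in particular, the last step assumes that selection probabilities are proportional to the
scaled fitness, which this documentation does not state.

\preformatted{
# Made-up fitness values of four chromosomes, for illustration only
fitness <- c(0.2, 0.5, 0.9, 1.4)

# Standardize to z-scores
z <- (fitness - mean(fitness)) / sd(fitness)

# Exponential scaling, as for fitnessScaling = "exp"
scaled <- exp(z)

# Assumption: selection probability proportional to the scaled fitness
prob <- scaled / sum(scaled)
round(prob, 3)
}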
}
\examples{
ctrl <- genAlgControl(populationSize = 100, numGenerations = 15, minVariables = 5,
    maxVariables = 12, verbosity = 1)
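
# A second control object (the name ctrlExp is chosen here for illustration),
# sketched to show the non-default options listed in the usage section:
# random crossover and exponential fitness scaling.
ctrlExp <- genAlgControl(populationSize = 100, numGenerations = 15, minVariables = 5,
    maxVariables = 12, crossover = "random", fitnessScaling = "exp")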

evaluatorSRCV <- evaluatorPLS(numReplications = 2, innerSegments = 7, testSetSize = 0.4,
    numThreads = 1)

evaluatorRDCV <- evaluatorPLS(numReplications = 2, innerSegments = 5, outerSegments = 3,
    numThreads = 1)

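# Generate demo data: 200 observations of 50 variables
set.seed(12345)
X <- matrix(rnorm(10000, sd = 1:5), ncol = 50, byrow = TRUE)
# The response depends on 8 of the 50 variables (columns 1, 7, ..., 43)
y <- drop(-1.2 + rowSums(X[, seq(1, 43, length.out = 8)]) + rnorm(nrow(X), 1.5))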

resultSRCV <- genAlg(y, X, control = ctrl, evaluator = evaluatorSRCV, seed = 123)
resultRDCV <- genAlg(y, X, control = ctrl, evaluator = evaluatorRDCV, seed = 123)

subsets(resultSRCV, 1:5)
subsets(resultRDCV, 1:5)
}
