\documentclass{article} %% Defines my own fxns \renewcommand{\th}{\textsuperscript{th}\xspace} \newcommand{\nd}{\textsuperscript{nd}\xspace} \newcommand{\st}{\textsuperscript{st}\xspace} \newcommand{\rd}{\textsuperscript{rd}\xspace} \newcommand{\sq}{\textsuperscript{2}\xspace} %% end fxns \pagestyle{headings} \usepackage[round]{natbib} %% Sweave Package for incorporating R Code \usepackage{Sweave} \DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\small,fontshape=sl} \DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\small} \DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\small,fontshape=sl} \bibliographystyle{plainnat} \begin{document} \title{Trading with the \texttt{portfolio} package} \author{by Jeff Enos, Daniel Gerlanc, and David Kane} %%\VignetteIndexEntry{Using the tradelist class} %%\VignetteDepends{portfolio} \maketitle \SweaveOpts{echo=TRUE, quiet=TRUE} \setcounter{secnumdepth}{3} <<echo=FALSE, results=hide>>= ## Sets display options options(width = 75, digits = 2, scipen = 5) set.seed(0) ## Loads the package library(portfolio) @ <<echo=FALSE, results=hide>>= ## data saved for this example ## save(portfolios, misc, data.list, mvCandidates, file = "tradelist.RData", compress = TRUE) ## loads the dataset for this vignette load("tradelist.RData") p.current <- portfolios[["p.current.abs"]] p.target <- portfolios[["p.target.abs"]] data <- data.list[["data.abs"]] sorts <- list(alpha = 1.5, ret.1.d = 1) tl <- new("tradelist", orig = p.current, target = p.target, sorts = sorts, turnover = 2000, chunk.usd = 2000, data = data, to.equity = FALSE) @ \begin{abstract} \label{abstract} Given a set of current holdings and a target portfolio, that is, a set of desirable holdings to which we would be willing to switch if trading were free, and that our reasons for trading can be captured with one or more rank orderings, the \texttt{portfolio} package provides a way to use multiple measures of desirability to determine which trades or portions of trades to do. \end{abstract} \section{Introduction} What should we trade now? This question is much more difficult than it might first appear, and yet thousands of individuals and firms controlling trillions of dollars must answer it each day. Consider a simple example. Imagine that the investment universe is restricted to 10 securities and that our portfolio must hold 5 equal-weighted long positions. At any given point in time, we will hold one of those portfolios. The simplest possible ``trade'' is to do nothing, keeping the same portfolio in the next period that we hold in the current one. A period can be 5 minutes or 5 months or any length of time. The next simplest trade is a single position swap. Trade one of our 5 current holdings for one of the 5 securities not in the portfolio. There are 25 such trades. Continuing up the complexity scale, there are 100 trades in which we replace 2 securities in the portfolio with 2 securities not in the portfolio. Considering all sets of possible trades, there are 252 options (including the option of no trading), which is equal to the total number of possible portfolios, $10 \choose 5$. In a world of perfect information, we would know the future returns for each of the 10 securities in the universe. Given this information, and some preferences with regard to risk and return, we could examine all 252 options and determine which was best. Unfortunately, in a real world example with thousands of securities in the universe and possibly hundreds in the portfolio, there is no way to consider every possible portfolio. \section{Complications} The problem of choosing the set of trades to perform, or to which target portfolio to trade, is difficult because of the sheer number of possible solutions. As a result, it is impossible to look at every set of possible trades, or each target portfolio that results from these trades. Even then, suppose we could arrive at a single, desirable target portfolio. There are still complications when determining exactly which portions of the resulting trades should be done. \begin{itemize} \item{\bf{Liquidity}}: Even if it were simple to determine the target portfolio, it may be difficult to get there. Imagine that moving to the target portfolio requires that we trade one million shares of IBM; however, suppose IBM typically trades 100,000 shares per day. How are we going to buy all the necessary shares in one day? Even if we bought the entire day's volume (an impossibility) it would take us ten days to get the entire position. \item{\bf{Price Impact}}: Although commission and spread may be linear in trade volume, price impact is not. We are a participant in the market, and every time we trade we impact the price. Price impact is generally small if we trade a modest portion, say 10\%, of volume. But if we trade more, then the price will move against us. Over some range, price impact increases more than linearly. \item{\bf{Trade Costs}}: Trading is not free so we will want to do less of it in the real world than we might care to do in theory. Basic trading costs (including commissions and spread) tend to enter the calculation linearly. Trade twice as much and we pay twice the costs. \item{\bf{Turnover}}: Turnover is the flip-side of holding period. In an ideal world, holding period would be endogenous. We would select the holding period which maximised the risk-adjusted return of the portfolio. But, in the real world, almost all portfolios have targeted holding periods to which we much adhere. We are only allowed a certain amount of turnover. \item{\bf{Ranking Trades}}: We may have multiple criteria for ranking trades. Some criteria may be more appropriate for ranking certain types of trades under specific circumstances. In the case where we have a large number of criteria, how do we choose the most appropriate criterion for each trade? \end{itemize} None of these problems is impossible to overcome, but all of them conspire to make a general solution to the trading problem extremely difficult. Therefore, we simplify. \section{Key Simplifying Assumptions} \label{simplifying assumption} The \texttt{portfolio} package makes three major simplifying assumptions. The first is that we have created a ``target'' or ``ideal'' portfolio, a set of positions that is desirable and to which we would be willing to switch if trading were free. This assumption is implausible but it does serve to make the problem tractable. If we only consider trades which move us closer to the target portfolio, it is much easier to handle the other difficulties associated with turnover, liquidity and the like. Instead of looking at all possible buys, for example, we only need to analyse buys for securities in which the target portfolio has more shares than the current portfolio. The second simplifying assumption is that different criteria for trading can be captured with a rank ordering. We discard the information used to create the ranks. The third simplifying assumption is that no one type of trade is intrinsically better than another type of trade. All things equal, buys, sells, covers, and shorts are equally preferable. \section{Implementation} \SweaveOpts{echo=FALSE, quiet=TRUE} Our simplifying assumptions allow us to solve the trading problem much more easily, but implementing the solution still requires many steps. Consider a simple example where we already have a small portfolio consisting of positions in various equities. We have been given an additional \$1,000 to invest in the portfolio, and we must invest this \$1,000 over the course of one trading day. This is not a realistic scenario, but having a set amount of time in which to trade will simplify our example. Throughout the document, we will refer to our present holdings as the ``current'' portfolio. The ``target portfolio'' is an ideal set of holdings to which we would immediately switch if trading were free as per the first simplifying assumption. Note that in this simple example the only trades we will be considering are buys. \subsection{Current and target holdings} Our current portfolio consists of shares of \Sexpr{nrow(p.current@shares)} companies, IBM (International Business Machines), GM (General Motors) and EBAY (EBay). <<p.current@shares, echo=FALSE>>= p.current@shares[, c("shares", "price")] @ The \texttt{shares} column expresses how many shares of each stock are in the portfolio, and the \texttt{price} column expresses the most recent price of that equity.\footnote{For simplicity, we use US dollars.} The market value of the current portfolio can be calculated by summing the products of the shares and prices. As per the simplifying assumption, we provide a target portfolio. <<p.target@shares>>= p.target@shares[, c("shares", "price")] @ We would like to buy more shares of GM and take positions in SCHW (Charles Schwab Inc.), MSFT (Microsoft), and GOOG (Google). The market value of the target portfolio is \$\Sexpr{prettyNum(portfolio:::mvLong(p.target),big.mark=",")}. \subsection{Portfolio difference} The portfolio difference may be understood as the trades that would change our current holdings into our target holdings. If trading were free and instantaneous, we would immediately complete all these trades and reach our target portfolio. Alas, trading is not free, and we will most likely not complete all the orders in one day. Some of them probably require that we purchase a large portion of the daily trading volume (over 10\%), at which point the trade may become significantly less desirable. From the portfolio difference, we determine our \emph{candidate trades}. \begin{description} \item{\bf{candidate trades}}: The complete set of trades we would have to make to trade from our current portfolio to the target portfolio. If trading were free, we would make all of these trades right now. \end{description} Below, we list the candidate trades. <<echo=FALSE>>= tl@candidates[, c("side", "shares", "mv")] @ The \texttt{side} column expresses what type of trade we will be making.\protect\footnote{In later examples, S will represent a sell, X will represent a short and C will represent a cover.} All the candidate trades are buys so the \texttt{side} column only contains \texttt{B}. The \texttt{shares} column expresses the number of shares of each stock we must buy to reach the target portfolio. The \texttt{mv} column expresses the effect that the candidate trade will have on the value of the portfolio. Buys, which increase the value of our portfolio, have a positive value. Sells, which decrease the value of the portfolio, have a negative value. As the market value of the target portfolio (\$\Sexpr{prettyNum(portfolio:::mvLong(p.target),big.mark=",")}) is greater than the market value of the original portfolio (\$\Sexpr{prettyNum(portfolio:::mvLong(p.current), big.mark = ",")}), we would have to invest an additional \$\Sexpr{prettyNum(portfolio:::mvLong(p.target)-portfolio:::mvLong(p.current),big.mark=",")} to trade from our current portfolio to our target portfolio. However, we only have \$1,000 with which we may buy additional shares. Therefore, we have to decide which subset of the candidate trades we will make. One of our simplifying assumptions is that we would instantly switch to the target portfolio if trading were free. This implies that all of the candidate trades are desirable. However, they are not all equally desirable. Some trades are better than others. We want to determine which candidate trades or subsets of the candidate trades yield the most utility on the margins. If we had unlimited funds or could freely trade between our current and target portfolios, we would not have to express preferences amongst trades. However, in the real world, we must decide, given a set of possible trades, which trades we should make first. One way to do this involves assigning each trade a value of overall desirability. For example, one could use the values of a \emph{signal}, calculated for each stock, as the measure of desirability for each trade. \begin{description} \item{\bf{signal}}: a value, most likely generated by some sort of quantitative model, which expresses the relative quality of the candidate trades. \end{description} In our example, we assign to trades values of a signal called alpha. When we associate trades with the values of alpha we say that we ``sort by alpha'' or ``use alpha as a \emph{sort.}'' Like portfolio construction, signal generation is beyond the scope of this document. In this example, the alpha signal is already calculated and provided for use in a sort. In the table below, the candidate at the top of the data frame has the highest value for alpha and is therefore the most desirable trade with respect to this signal. <<shows alpha sort>>= tmp <- data.frame(side = tl@candidates[, "side"], alpha = tl@ranks[, "alpha"]) row.names(tmp) <- tl@candidates$id tmp <- tmp[order(tmp$alpha, decreasing = TRUE),] tmp @ Based on the above signal values, MSFT is the best trade, SCHW is the second best trade, and GM is the worst trade with an alpha value of \Sexpr{data[match("GM",data[["id"]]),"alpha"]}. \subsection{Preliminary ranks} We determine which trades are most desirable by generating an overall measure of desirability for each trade. The first step in generating this value involves creating a \emph{rank ordering} of the trades for each sort we have created. A definition of this term follows: \begin{description} \item{\bf{rank ordering}}: a linear, relational ordering of the candidates, where each candidate is assigned a rank from the set $1, 2, 3\dots{}n$ where $n$ is the number of candidate trades. Trade 1 provides the greatest utility and trade $n$ provides the least utility. In creating a rank ordering we discard cardinal information such as a signal and replace it with a whole number ranking. \end{description} We rank and order the candidates by the signal called alpha below: <<trades ranked and ordered by alpha>>= ## for buys, ranks by the inverse because lower values are better tl@ranks$rank <- rank(-tl@ranks$alpha, na.last = TRUE, ties.method = "random") ## removes the "ret.1.d" column for successful row binding later on alpha <- tl@ranks[,!names(tl@ranks) %in% "ret.1.d"] ## appends a column so we know what sort these values come from alpha$sort <- "alpha" alpha[order(alpha$rank), c("rank", "side", "alpha", "shares", "mv")] @ While the alpha column provides an absolute measure of desirability, rank expresses the relative desirability amongst trades. We say that we lose \emph{cardinal information} when we use ranks. \begin{description} \item{\bf{cardinal information}}: The values used to create a rank ordering. The creation of ranks abstracts these values and replaces them with an ordering that reflects the value of an element relative to other elements in the rank ordering. \end{description} In some cases we may want to use more than one measure of desirability. We may have more than one source of cardinal information. Imagine that we want to use both alpha and one-day return as the cardinal information in our sorts. If we believe in one day reversal, we would assign higher ranks to both orders to sell stocks with positive one-day returns and to orders to buy stocks with negative one-day returns. However, we associate more desirable buys with greater sort values. To account for this, the inverse of one-day return is used as the cardinal information for a one-day reversal sort. Therefore, if the one-day return for GM is $-0.10$, the value used in the one-day reversal sort is $0.10$. Below, the table on the left shows the different stocks' one-day return. The table on the right shows the ranks and input values in the one-day reversal sort ret.1.d. \begin{verbatim} side one.day.ret side rank ret.1.d (sort) GM B -0.10 GM B 1 0.10 GOOG B -0.01 GOOG B 2 0.01 MSFT B 0.01 MSFT B 3 -0.01 SCHW B 0.02 SCHW B 4 -0.02 \end{verbatim} <<ret.1.d sort>>= tmp <- tl@ranks[order(tl@ranks$ret.1.d), c("side","ret.1.d")] tmp <- cbind(rank = 1:nrow(tmp), tmp) tmp$ret.1.d <- tmp$ret.1.d[order(tmp$ret.1.d, decreasing = TRUE)] row.names(tmp) <- tl@candidates$id @ \Sexpr{row.names(tmp)[1]} has the highest rank according to one-day reversal because it has the most negative return of all the buys. \subsubsection{The problem of multiple sorting criteria} When we combine the sorts in a single data frame, it is not clear which sort values we should use. If we order by alpha we get the following set of ranks: <<two sorts sorting by alpha>>= tmp.1 <- tl@ranks[order(tl@ranks$alpha, decreasing = TRUE), c("alpha", "ret.1.d")] tmp.1 <- tmp.1 <- cbind(rank = 1:nrow(tmp.1), tmp.1) tmp.1 @ Ranking by the inverse of one-day return yields another ordering: <<two sorts sorting by ret.1.d>>= tmp.2 <- tl@ranks[order(tl@ranks$ret.1.d, decreasing = TRUE), c("alpha", "ret.1.d")] tmp.2 <- cbind(rank = 1:nrow(tmp.2), tmp.2) tmp.2 @ When we use multiple sorts, there is no obvious way in which we would order the trades by desirability. When sorting by alpha, \Sexpr{row.names(tmp.1)[1]} is the most desirable trade, and when sorting by the inverse of one-day return, \Sexpr{row.names(tmp.2)[1]} is the most desirable trade. We cannot easily compare or combine the two sorts because we do not know what the exact relationship is between one-day reversal and alpha, and because the sorts are on different numeric scales. <<trades ranked and ordered by ret.1.d>>= ## we don't actually show any of these values right here ## for buys, ranks by the inverse because lower values are better tl@ranks$rank <- rank(-tl@ranks$ret.1.d, na.last = TRUE, ties.method = "random") ## removes the "alpha" column for successful row binding later on ret.1.d <- tl@ranks[,!names(tl@ranks) %in% "alpha"] ## appends a column so we know what sort these values come from ret.1.d$sort <- "ret.1.d" @ %% As per our second simplifying assumption, we lose a certain amount of %% data when we replace cardinal information with ranks. In doing this %% the \texttt{portfolio} package makes the important assumption, that %% all our different criteria for trading can be captured with a rank %% ordering. Anyone who uses the package should be aware of this %% assumption. This assumption and the assumption that the user provides %% a target portfolio are the most significant assumptions we make. %% However, if we did not only consider ranks, it would be difficult to %% compare sorts. We would have to force the user to provide a function %% that expresses the relationship between the sorts. Writing such a %% function is hard, especially if the we use multiple sorts. To %% facilitate the comparison of sorts, we introduce a weighting scheme. \subsection{Weighting sorts} \label{words weighting sorts} At this point we face two problems. First, we have measures of desirability that are on totally different numeric scales. Inverse of return is in percent return, and alpha is in some other units. In order to work with both variables at the same time, we transform each measure into a series of ranks. The second problem we face is that the two variables we're using may not be equally important. As sorts express preferences amongst trades, weights express preferences amongst sorts. By assigning each sort a weight, we express how important that sort is relative to other sorts. To illustrate some weighting examples, let's consider the scenario in which we have assigned a weight of 1 to both the alpha and one-day reversal sorts. By assigning the same weight to both sorts we assert that they are equally important. Assigning a weight directly affects the sort rankings by causing them to be divided by the weight. However, we have assigned both of the sorts a weight of 1 so the ranks remain the same. <<weighted alpha>>= ## saves off alpha$rank alpha.rank.orig <- alpha$rank alpha$rank <- alpha$rank alpha[order(alpha$rank), c("rank", "side", "alpha", "shares", "mv")] @ The ranks for one-day return remain the same because one-day reversal has a weight of 1. <<unweighted ret.1.d>>= ret.1.d[order(ret.1.d$rank), c("rank", "side", "ret.1.d", "shares", "mv")] @ Having divided the original \emph{raw ranks} by weight, we now have \emph{weighted ranks}. \begin{description} \item{\bf{raw ranks}}: the original, linearly spaced ranks, built on the scale $1, 2, 3\dots{}n$ \item{\bf{weighted ranks}}: the raw ranks divided by sort weight. \end{description} We now have two ranks associated with each candidate, one from the alpha sort and another from the one-day reversal sort. To illustrate that we have duplicate ranks for each sort, we combine the equally-weighted alpha and one-day reversal sorts to form a single data frame. <<unweighted ranks>>= ## sets the ranks of alpha to the original, unweighted ranks alpha$rank <- alpha.rank.orig ## subsets out the "alpha" and "ret.1.d" columns so that both data frames have the same set of columns alpha <- alpha[,!names(alpha) %in% "alpha"] ret.1.d <- ret.1.d[,!names(ret.1.d) %in% "ret.1.d"] overall.ranks <- rbind(alpha, ret.1.d) overall.ranks <- overall.ranks[order(overall.ranks$rank), c("id", "sort", "rank", "side", "shares", "mv")] row.names(overall.ranks) <- paste(overall.ranks$id, overall.ranks$sort, sep = ".") overall.ranks[, c("rank", "sort", "side", "shares", "mv")] @ The row names contain the equity ticker symbols and the name of the sort that generated the rank. For each rank there are two candidates, one of which has been associated with a rank from alpha and the other which has been associated with a rank from one-day reversal. In cases such as this where we have equally weighted sorts there will be a candidate trade from each sort at every rank. If we use $n$ sorts, we will have $n$ ranks associated with each candidate. We only want one rank associated with each candidate. So that each candidate only has one rank associated with it, we assign each rank the best rank generated for it by any sort. We have done this in the data frame below. <<unweighted w/out duplicates>>= ranks <- alpha top.ranks <- aggregate(overall.ranks[c("rank")], by = list(id = overall.ranks$id), min) ranks$rank <- top.ranks$rank[match(ranks$id, top.ranks$id)] ranks[order(ranks$rank), c("rank", "shares", "mv")] @ Both GM and MSFT have been assigned a rank of one. This occurs because MSFT has been ranked 1 by the alpha sort and GM has been ranked 1 by the one-day reversal sort. SCHW has been ranked 2 by the alpha sort and GOOG has been ranked 3 by the alpha sort. When we equally weight the sorts we are equally likely to use ranks from either sort. This behaviour is logical because assigning sorts equal weights suggests that they are equally important. However, the sorts may not always be equally important. In the next example we use a weighting scheme that causes us to use one sort to the exclusion of the other. Let's say that we do not want to consider one-day reversal. To ignore all of the one-day reversal values, we make alpha 10 times more important than one-day reversal. Therefore, we will consider 10 ranks from alpha for every one rank from one-day reversal. As there are only \Sexpr{nrow(tl@candidates)} candidate trades, we will choose the rankings in alpha over all ranks in the one-day reversal sort. <<unbalanced weights>>= ## Assigns one sort, alpha, a much higher weight than the other sort ## restores the original alpha rankings alpha$rank <- alpha.rank.orig ## weights the alpha rankings by 10 alpha$rank <- alpha$rank / 10 overall.ranks <- data.frame() overall.ranks <- rbind(alpha, ret.1.d) overall.ranks <- overall.ranks[order(overall.ranks$rank), c("id", "sort", "rank", "side", "shares", "mv")] row.names(overall.ranks) <- paste(overall.ranks$id, overall.ranks$sort, sep = ".") overall.ranks[c("rank", "side", "shares", "mv")] @ Creating this unbalanced weighting causes us to stack the alpha ranks on top of the one-day reversal ranks. Since we always assign the lowest rank from all trades to a sort, we will consider the alpha ranks before any of the one-day reversal ranks. <<unbalanced w/out duplicates>>= top.ranks <- do.call(rbind, lapply(split(overall.ranks, overall.ranks$id), function(x) { x[which.min(x$rank),] })) top.ranks <- top.ranks[order(top.ranks$rank),] top.ranks[c("rank","sort","shares","mv")] ranks <- alpha top.ranks <- aggregate(overall.ranks[c("rank")], by = list(id = overall.ranks$id), min) ranks$rank <- top.ranks$rank[match(ranks$id, top.ranks$id)] ## ranks[order(ranks$rank), c("rank", "sort", "shares", "mv")] @ Making the alpha sort 10 times as important as the one-day reversal sort causes us to only use ranks from the alpha sort. We do not even consider the number 1 ranked one-day reversal trade until we examine all the alpha values ranked in the top ten. As we only have 4 candidate trades, we do not consider any trades from one-day reversal. The last weighting we will consider falls somewhere in between the previous two. We weight the alpha sort by an additional 50\%, and as a result divide all of the ranks in the alpha sort by 1.5. <<mixed weighting>>= ## returns alpha$rank to original level alpha$rank <- alpha.rank.orig alpha$rank <- alpha$rank / 1.5 overall.ranks <- data.frame() overall.ranks <- rbind(alpha, ret.1.d) overall.ranks <- overall.ranks[order(overall.ranks$rank), c("id", "sort", "rank", "side", "shares", "mv")] row.names(overall.ranks) <- paste(overall.ranks$id, overall.ranks$sort, sep = ".") overall.ranks[c("rank", "side", "shares", "mv")] @ This causes us to consider 3 ranks from the alpha sort for every 2 ranks from the one-day reversal sort. <<mixed w/out duplicates>>= top.ranks <- do.call(rbind, lapply(split(overall.ranks, overall.ranks$id), function(x) { x[which.min(x$rank),] })) top.ranks <- top.ranks[order(top.ranks$rank),] top.ranks[c("rank","sort","shares","mv")] ranks <- alpha top.ranks <- aggregate(overall.ranks[c("rank")], by = list(id = overall.ranks$id), min) ranks$rank <- top.ranks$rank[match(ranks$id, top.ranks$id)] tmp <- ranks[order(ranks$rank), c("rank", "sort", "shares", "mv")] ## tmp @ We use three of the ranks from the alpha sort and one rank from the one-day reversal sort. This is the weighting scheme that we will use in the rest of the example. To review, the ranking process has four steps. First, we ranked each trade according to both alpha and one-day reversal to generate raw ranks. Second, we weighted these ranks. Third, we combined the alpha and one-day reversal ranks. Fourth, we eliminated duplicates by associating each trade with the lowest rank assigned to it by either alpha or one-day reversal. We call these ranks preliminary ranks because they are not the final values we use to determine the desirability of each trade. Nonetheless, we must generate preliminary ranks before we can arrive at final ranks, the calculation of which we describe in the next section. \subsection{Generating synthetic ranks} \label{generating synthetic ranks} Consider a scenario with 100 candidate trades. If trade 1 is $X$ better than trade 2, is trade 99 $X$ better than trade 100? Most portfolio managers would argue that the difference in utility between trade 1 and trade 2 is greater than the difference in utility between trade 99 and trade 100. However, with raw ranks, we make no assertion of how much better one trade is than another trade. To express the tendency for us to derive more utility from the most highly ranked trades, we synthesise yet another set of values from the weighted ranks. We call these values synthetic ranks.\protect\footnote{We are abusing the term ``ranks'' by using it in several different contexts.} \begin{description} \item{\bf{synthetic ranks}}: values generated by mapping the weighted ranks to a truncated normal distribution ($> 85^{th}$ percentile on $N(0,1)$). \end{description} First, we re-rank the weighted ranks: <<>>= tmp$rank <- rank(tmp$rank, ties.method = "first") tmp[order(tmp$rank),c("rank","shares","mv")] @ Next, we evenly distribute the ranks on the interval $[0.85,1)$ such that the best ranked trades are closest to 1 and the worst ranked trades are closest to $0.85$: <<scaling ranks>>= ## a hacked version of the scaling function in calcRanks, built only ## for a list of all buys r.max <- max(tmp$rank) + 1 r.mult <- 0.15 r.add <- 0.85 tmp$rank.s <- (r.mult * tmp$rank[nrow(tmp):1] / r.max) + r.add ## Saves off rank.s for later use rank.s <- tmp tmp[c("rank","shares","mv","rank.s")] @ We list the scaled ranks in \texttt{rank.s}. Next, we map to a truncated normal distribution.\protect\footnote{$> 85^{th}$ percentile of $N(0,1)$} <<synthesised ranks>>= tmp$rank.t <- qnorm(tmp$rank.s) tmp[c("rank", "shares", "mv","rank.s", "rank.t")] @ The \texttt{rank.t} column lists the ranks mapped to a truncated normal distribution. MSFT has the best rank and GOOG has the worst rank. We might expect to see a \texttt{rank.t} of approximately 3.5 for the best ranked trade, but because we only have \Sexpr{nrow(tmp)} candidates and the scaled values are evenly spaced on the interval $[0.85,1)$, the normalised value of the best ranked trade is not as great as it would be if we had 100 trades. Recall that synthetic ranks express the tendency for there to be greater differences in desirability between adjacent, highly ranked trades ($1,2,3\dots{}$) than between adjacent, poorly ranked trades: \\* \begin{table}[!htbp] \begin{tabular}[c]{|rr|rr|rr|} \hline rank & $\Delta$ & $N(0,1)$ & $\Delta$ & $> 85^{th}$ of $N(0,1)$ & $\Delta$ \\ \hline 1 & 1 & 3.50 & 1.17 & 3.50 & 0.53 \\ 2 & 1 & 2.32 & 0.27 & 2.96 & 0.21 \\ 3 & 1 & 2.05 & 0.17 & 2.74 & 0.13 \\ 4 & 1 & 1.88 & 0.13 & 2.61 & 0.10 \\ 5 & 1 & 1.75 & 0.11 & 2.51 & 0.08 \\ . & . & . & . & . & . \\ . & . & . & . & . & . \\ 48 & 1 & 0.05 & 0.03 & 1.46 & 0.01 \\ 49 & 1 & 0.02 & 0.02 & 1.45 & 0.01 \\ 50 & 1 & 0.00 & 0.02 & 1.44 & 0.01 \\ 51 & 1 & -0.02 & 0.02 & 1.43 & 0.01 \\ 52 & 1 & -0.05 & 0.03 & 1.42 & 0.01 \\ . & . & . & . & . & . \\ . & . & . & . & . & . \\ 96 & 1 & -1.64 & 0.11 & 1.06 & 0.00 \\ 97 & 1 & -1.75 & 0.13 & 1.06 & 0.00 \\ 98 & 1 & -1.88 & 0.17 & 1.06 & 0.00 \\ 99 & 1 & -2.05 & 0.27 & 1.06 & 0.00 \\ 100 & - & -2.32 & - & 1.06 & - \\ \hline \end{tabular} \caption[Synthetic rank distributions]{Creating synthetic ranks using a linear distribution, a normal distribution, and a truncated normal distribution. Delta columns express the difference in desirability between adjacent trades.\label{distribution table}} \end{table} Table \ref{distribution table} expresses the differences amongst distributions we might use to rank 100 trades. The \texttt{rank} column contains the raw ranks for the 5 best trades, the 5 middle-ranked trades, and the 5 worst trades. In this example the ranks on $[1,100]$ are spaced on intervals of one. The rank difference between every trade is the same. The difference between trade 1 and trade 2 is the same as the difference between trade 99 and trade 100. The normal distribution column $(N(0,1))$ expresses what happens when we normalise the raw ranks. The normal distribution correctly expresses our belief that there is a large difference in desirability between the best ranked trades. However, use of the normal distribution would incorrectly suggest that there are similarly large desirability differences between the worst trades. We get these results when using the normal distribution because the best and worst ranked trades lie in the tails of the distribution. We do not want large differences in desirability amongst the worst ranked trades. The desirability differences decrease until we reach trade 50, then increase again as we move towards the other tail of the distribution. We want desirability to remain the same on the margin past the 50th trade. To address the problems associated with normalising to $N(0,1)$, we normalise to a normal distribution truncated below the 85th percentile. In the right\-most delta $(\Delta)$ column, the synthetic rank differences between the best ranked trades are over 50 times greater than the synthetic rank differences between the middle ranked trades. Every trade ranked worse than 50 has a similar synthetic rank difference. Although the subset $[0.85,1)$ is slightly arbitrary, (we could have set the lower extreme to be 0.84, 0.86, or another similar value) it serves our purpose of expressing large differences in desirability where we find the best buys, on one tail, and small differences in desirability amongst the worst buys, on the other. Recall the steps we have taken towards generating our final synthetic rank. First, we converted the sort values to raw ranks. Second, we converted the raw ranks to weighted ranks. Third, we scaled the weighted ranks to $[0.85,1)$ to generate scaled weights. Lastly, we mapped the scaled weights to a truncated normal distribution for our final synthetic rank. By only using the $85^{th}$ percentile and above, we express our belief that the differences in desirability between the best ranked trades is much greater than the differences in desirability between the worst ranked trades. If the costs associated with trading any stock, all things being equal, were the same, we would not care about the difference in utility between trades. We would move down the trade list from best to worst until we reached our allotted turnover. However, our trading influences prices and may reduce the desirability of a trade. \subsection{Chunks, synthetic rank, and trade-cost adjustment} \label{Chunks, synthetic rank, and trade-cost adjustment} We want to know at what point the cost of trading an equity exceeds the utility of trading that equity. In the \texttt{portfolio} package, we use synthetic rank to represent utility. Determining the cost of purchasing an additional share is impossible if our smallest trading unit is an entire order so we break each order into \emph{chunks}. \begin{description} \item{\bf{chunk}}: A portion of a candidate trade. \end{description} We break candidate trades into chunks by market value. Each chunk has a market value of approximately \$\Sexpr{tl@chunk.usd}: <<chunk w/out tca.rank>>= tl@chunks[order(-tl@chunks$rank.t), c("side", "shares", "mv", "alpha", "ret.1.d", "rank.t", "chunk.shares", "chunk.mv")] @ The candidate trades are broken into \Sexpr{nrow(tl@chunks)} chunks. The number following the ticker in the row name expresses the chunk number for that particular equity. The \texttt{chunks.mv} column expresses the market value of each chunk. The \texttt{chunk.shares} column expresses how many shares are in each chunk. \subsubsection{Trade-cost adjustment of individual chunks} As we trade a greater percentage of the average daily volume, the price of the trades will increase. To reflect this phenomenon, we penalise the synthetic ranks of the chunk as we trade greater percentages of the daily volume. We call this penalty \emph{trade-cost adjustment}. \begin{description} \item{\bf{trade-cost adjustment}}: Lowering a chunk's rank because of trading volume. \end{description} To fix this idea, let's first examine the daily volumes of our candidate trades.\protect\footnote{The \texttt{volume} column represents some measure of past trading volume such as the average trading volume over the last 30 days. A daily measure of \texttt{volume} is not required; we would use whatever measure is natural for the frequency with which we trade.} <<trading volume>>= trading.volume <- data.frame(rank.t = tl@ranks$rank.t, volume = tl@data$volume[match(tl@ranks$id, tl@data$id)], shares = tl@ranks$shares) row.names(trading.volume) <- tl@ranks$id trading.volume[order(-trading.volume$rank.t),] @ The trades we want to make for MSFT, SCHW, and GOOG involve less than 3\% of the daily trading volume. However, we want to trade 100\% of the daily trading volume of GM. We would probably not be able to purchase all of these shares in one day, and even if we could, we would affect prices significantly. Moving into the position over several days would be better. We use a trade-cost adjustment function to express how increasing trade costs reduce the desirability of candidate trades. To better approximate utility, we penalise synthetic ranks at the chunk level. Doing this allows us to better determine at which point the cost of trading an additional chunk is greater than the utility derived by trading an additional chunk. We perform trade-cost adjustment on the chunks by keeping track of what percentage of the daily volume we have traded with each additional chunk. In the trade-cost adjustment function used in this example, the first chunk to cross the threshold of 15\% of the daily trading volume is penalised by a fixed amount. All subsequent chunks are penalised by that amount, and any further chunks that pass 30\% or 45\% percent of the daily trading volume receive further penalties. The function used in this example also prevents any adjustment on the first chunk of a candidate trade. Below, we can see that the second chunk of the trade for GM has been trade-cost adjusted: <<chunks w/ tca.rank>>= tl@chunks[order(-tl@chunks$rank.t), c("side", "mv", "alpha", "ret.1.d", "rank.t", "chunk.shares", "chunk.mv", "tca.rank")] @ The \texttt{tca.rank} column expresses the synthetic rank adjusted for trade costs. Since GM is the only candidate for which we want to purchase more than 15\% of the daily trading volume, it is the only candidate for which we trade-cost adjust the chunks. Every chunk of GM beyond the first has been trade-cost adjusted. This will cause us to consider the chunks of other candidate trades before we trade additional chunks of GM: <<ordered chunks w/ tca.rank>>= tl@chunks[order(tl@chunks$tca.rank, decreasing = TRUE), c("side", "mv", "alpha", "ret.1.d", "rank.t", "chunk.shares", "chunk.mv", "tca.rank")] @ As MSFT is the best ranked candidate and does not receive a trade-cost penalty, we would trade all the shares of MSFT before considering the other candidates.\protect\footnote{Assuming that derived turnover is greater than the market value of all the candidate trades.} Having completed all the chunks of MSFT, we would consider the first chunk of GM, the only chunk which has not been trade-cost adjusted. Subsequently, we would trade all the chunks of SCHW and GOOG, the candidate trades ranked 3 and 4. Lastly, we trade the penalised chunk of GM. \subsubsection{Synthetic rank and trade-cost adjustment of small portfolios} \label{Synthetic rank and trade-cost adjustment of small portfolios} In this example, trade-cost adjustment decreases the desirability of the second chunk of GM in a non-trivial way. Although GM is ranked second as a candidate trade, every other candidate trade would be made before we completed all the chunks of GM. When we consider such a small number of trades, we assume that all of the trades are of approximately equal quality; the difference in utility between candidate trades is fairly small. This occurs because the scaled ranks are evenly distributed on $[0.85,1)$: <<rank.s>>= rank.s[c("rank","shares","mv","rank.s")] @ When we only have \Sexpr{nrow(tl@candidates)} candidates, none of the scaled ranks will be very close to $1$, and consequently, none of the synthetic ranks will fall at the extreme tail of the normal distribution: <<rank.t>>= rank.t <- rank.s rank.t$rank.t <- qnorm(rank.t$rank.s) rank.t[c("rank","shares","mv","rank.s","rank.t")] @ Consequently, the difference in utility between candidate trades will be small when there are few candidate trades. Heuristically, this seems correct because if we are making very few trades, we would most likely derive similar utility from any of them. Therefore, it makes sense for us to trade the other three candidates if the costs associated with trading GM are large. \subsubsection{Synthetic rank and trade-cost adjustment of large portfolios} \label{Synthetic rank and trade-cost adjustment of large portfolios} Moving away from our example for a moment, imagine that we have a large current and target portfolio, the trade list for which contains 100 candidate trades. When we evenly distribute the scaled ranks on the interval $[0.85,1)$, we have more synthetic ranks at the extreme tail: <<misc$rank.s>>= misc$rank.s @ The row names express the equity ticker symbols. \texttt{rank} is the raw rank. \texttt{rank.s} is the scaled rank, and \texttt{rank.t} is the synthetic rank. The best ranked trade, \Sexpr{row.names(misc[["rank.s"]])[1]}, has a scaled rank value very close to one and a synthetic rank close to three. This indicates that the best rank falls at the tail of the normal distribution. The worst ranked candidates not only have low synthetic ranks, but they also have very small differences in synthetic rank. If we trade-cost adjust one of the poorly ranked candidates we will most likely not trade it until we have traded all other candidates not penalised by trade cost adjustment. On the other hand, we would still trade \Sexpr{row.names(misc[["rank.s"]])[1]}, \Sexpr{row.names(misc[["rank.s"]])[2]}, or \Sexpr{row.names(misc[["rank.s"]])[3]}, even if some of the chunks had been trade-cost adjusted. Let's quickly review how we generate the final, synthetic ranks. The preliminary values from which we draw the raw ranks are the sorts we define. In this example, we defined sorts for alpha and one-day reversal. In creating raw ranks, we ignore the underlying values used by the sorts. At this point, we still have a different set of raw ranks for each sort. To express preferences amongst the sorts, we apply weights to the sorts. This step yields weighted ranks. From the sets of weighted ranks, we associate with each candidate the best weighted rank from any sort. Next, we scale the buys to the interval $[0.85,1)$. This step yields scaled ranks. From scaled ranks, we generate synthetic ranks by mapping the scaled ranks to a truncated normal distribution. Next, we break the candidates into chunks and perform trade-cost adjustment as necessary. This yields trade-cost adjusted ranks which are the final measure of chunks' desirability. \subsection{Sorting theory} \label{sorting theory} Chooing the best candidate when we have multiple measures of desirability is difficult. Consider the situation where we must choose ten stocks to trade. In our example, assuming that we use some type of formula to generate alpha, we might be able to incorporate our other sorts into the formula for alpha. Instead of having alpha and one-day reversal as distinct sorts, we would only have one sort, alpha, which would also take one-day reversal into account. For this to work, however, we would have to write a function that accounted for the the ordering of every trade by every sort. Furthermore, this function would have to take into account our preference for certain sorts over other sorts. To elaborate on how difficult it is to create such a function, let us consider the situation where we must choose our ten favourite trades, in no particular order, using the data in the table below. \begin{table}[!htbp] \begin{tabular}[c]{|r|r|r|r|r|r|} \hline symbol & raw rank & alpha & symbol & raw rank & one-day return \\ \hline IBM & 1 & 1.57 & HPQ & 1 & -0.063 \\ MS & 2 & 1.26 & SUNW & 2 & -0.056 \\ EBAY & 3 & 1.24 & AET & 3 & -0.041 \\ CBBO & 4 & 1.21 & YHOO & 4 & -0.036 \\ SCHW & 5 & 1.15 & T & 5 & -0.014 \\ PAYX & 6 & 1.12 & CVX & 6 & -0.011 \\ HAL & 7 & 1.12 & GOOG & 7 & -0.011 \\ AMD & 8 & 1.10 & PAYX & 8 & -0.002 \\ MSFT & 9 & 0.99 & CBBO & 9 & 0.003 \\ CVX & 10 & 0.96 & HAL & 10 & 0.009 \\ AET & 11 & 0.92 & QCOM & 11 & 0.011 \\ HPQ & 12 & 0.81 & EBAY & 12 & 0.014 \\ QCOM & 13 & 0.77 & SCHW & 13 & 0.029 \\ GOOG & 14 & 0.65 & AAPL & 14 & 0.036 \\ YHOO & 15 & 0.64 & MS & 15 & 0.041 \\ \hline \end{tabular} \caption[alpha and one-day return ranks]{The alpha and one-day returns of candidates suggest different rank orderings. All of the candidates are buys.\label{theory table 1}} \end{table} Table \ref{theory table 1} has a row for each of 15 candidates, their alpha and one-day reversal values, and the raw ranks we would generate from these values. All of the candidates are buys so greater alpha values are better and lesser one-day reversal values are better. One portfolio manager might decide that she wants to make trades based only on alpha. She chooses the top ten trades according to alpha. A second portfolio manager may want to make trades based only on one-day return. She chooses the top ten trades according to one-day return. The third portfolio manager considers both alpha and one-day return and choose her favorite trades by examining both. Portfolio manager three believes in buying equities which have had price decreases of greater than 4\% during the previous trading day. Consequently, she would buy HPQ, SUNW, and AET. She would fill her remaining orders using the top 7 trades according to alpha. How would the third portfolio manager write a function that expresses her trading preferences? What if some days she acted like the first portfolio manager and on other days like the second portfolio manager? How would she account for a change in preference for one of the sorts? Our solution allows any of these portfolio managers to express her trading preferences without having to write a function that relates the different measures of desirability. Instead, she would use the weighting function that the \texttt{portfolio} package provides. She would examine the trade list created using different weighting schemes and adjust the weights until the utility derived from the last candidate traded was greater than the cost of the first trade \emph{not} made. For example, the portfolio manager may decide that YHOO is a better reversal trade than the last alpha trade and revise the weighting scheme so that she makes one less alpha trade and one more reversal trade. \begin{table}[!htbp] \begin{tabular}[c]{|r|r|r|r|r|r|} \hline symbol & raw rank & alpha & symbol & raw rank & ret.1.d \\ \hline IBM & 1 & 1.57 & HPQ & 1 & -0.063 \\ MS & 2 & 1.26 & SUNW & 2 & -0.056 \\ EBAY & 3 & 1.24 & AET & 3 & -0.041 \\ CBBO & 4 & 1.21 & YHOO & 4 & -0.036 \\ \cline{5-5} SCHW & 5 & 1.15 & T & 5 & -0.014 \\ PAYX & 6 & 1.12 & CVX & 6 & -0.011 \\ \cline{2-2} HAL & 7 & 1.12 & GOOG & 7 & -0.011 \\ AMD & 8 & 1.10 & PAYX & 8 & -0.002 \\ MSFT & 9 & 0.99 & CBBO & 9 & 0.003 \\ CVX & 10 & 0.96 & HAL & 10 & 0.009 \\ AET & 11 & 0.92 & QCOM & 11 & 0.011 \\ HPQ & 12 & 0.81 & EBAY & 12 & 0.014 \\ QCOM & 13 & 0.77 & SCHW & 13 & 0.029 \\ GOOG & 14 & 0.65 & AAPL & 14 & 0.036 \\ YHOO & 15 & 0.64 & MS & 15 & 0.041 \\ \hline \end{tabular} \caption[Trading Preferences II]{Portfolio manager 3 revises her trading preferences.\label{theory_table_2}} \end{table} What ultimately matters is the last candidate we decide to trade and the first candidate we decide not to trade. By using rank orders instead of underlying values, we do not have to combine the different sorts. Instead, we can express our preferences for different, possibly unrelated criteria through the use of a weighting scheme we provide in \texttt{portfolio}. \subsection{Pairing trades} Let us return to discussing trade list construction. In practise, most equity portfolios must be maintained at a specific market value. One logical way to achieve this result would be to pair desirable buys and sells of equal market value, which is what we do in the \texttt{portfolio} package. We call these pairings of buys and sells a swap: \begin{description} \item{\bf{swap}}: A pairing of a buy and sell or short and cover of similar market market value and desirability. \end{description} We have already created the framework to create swaps; we break the candidates into chunks of similar market value and then rank these chunks individually. If our candidate trades included buys and sells, we would simply match the most desirable buys with the most desirable sells. However, our candidate trades are all buys, and we want to increase the market value of our portfolio by \$1,000. \subsubsection{Dummy chunks} If we want to increase the market value of the portfolio, we must buy more than we sell. Therefore, we do not want to pair a buy with a sell. We just want buys. The situation where we just want buys or sells is a special case. The \texttt{portfolio} package is structured so that we must also trade in pairs. To work within the package framework we introduce the concept of \emph{dummy chunks}: \begin{description} \item{\bf{dummy chunk}}: A \emph{fake} buy or sell chunk that we pair with a real buy or sell chunk in situations where we want to increase or decrease the market value of the portfolio. \end{description} As our example only contains buys, we have paired every buy with a dummy sell.\protect\footnote{We only show the head of the swaps table.} <<swaps table>>= head(tl@swaps[, c("tca.rank.enter", "tca.rank.exit", "rank.gain")]) @ In the table above, the row names express the chunk ticker symbols that form the swap. To the left of the comma is an enter chunk, and to the right of the comma is an exit chunk.\protect\footnote{Enter chunks are either a buy or short. A buy allows us to take a long position and a short allows us to take a short position. Exit chunks are either sells or covers. A sell allows us to exit a long position and a cover allows us to exit a short position.} The exit chunks all have a symbol \texttt{NA.0} because they are dummy sells. The \texttt{tca.rank.enter} column expresses the trade-cost adjusted rank of the enter chunk, the buy, and the \texttt{tca.rank.exit} column expresses the trade-cost adjusted rank of the exit chunk, the dummy sell. The \texttt{rank.gain} column expresses the difference in trade-cost adjusted rank between the enter and the exit, the buy and dummy sell. We have spent considerable time discussing the generation of all types of ranks for buys, but we have not yet discussed ranking sells. For sells, better ranks are more negative. Therefore, a great sell might have a synthetic rank of -3.5. Recall that our goal is to make the trades which yield the most utility. In spending our \$1,000, we want to trade the best chunks. So that we make the best buys when increasing the market value of the portfolio, we assign the dummy sells an arbitrarily high rank. In the table above, the dummy sells have a trade-cost adjusted rank of -10,000. We match the best the buys and sells by calculating rank gain. As no real sells will yield the same rank gain that the pairing of buy and a dummy sell yields, we create pairs with all the dummy sells before even considering other sells. As there are no sells in this example, all the swaps consist of a buy and a dummy sell. Let's quickly review why we create swaps. We want to maximise utility by making the candidate trades or portions of candidate trades that yield the greatest utility. Generally, we want to maintain the portfolio equity at a constant level. A logical way to do this involves pairing buys and sells of similar market value. To maximise utility, we should pair the most best ranked buys and sells. In special cases, we want to increase or decrease the market value of our portfolio. In order to do this, we must make more of one type of trade. However, this would require that we have swaps that contain only a buy or sell. Since we cannot have a swap of only one trade, we introduce dummy trades. As dummy trades have an arbitrarily high synthetic rank they pair with the best buys and sells to ensure that we choose the most useful candidates in changing the market value of the portfolio. \subsection{Accounting for turnover} \emph{Note: this and subsequent sections need to account for change in turnover application. Now all swaps are done such that the total market value of trades goes up to but doesn't exceed the turnover amount. In the meantime I have adjusted the example's turnover to \$2,000 so that at least one chunk is done, although now Sweave chunks will be inconsistent with the text.} \\ As we stated earlier, holding period would be endogenous if we could always set it to maximise risk-adjusted return. However, most real world portfolios have a set holding period and consequently, a set turnover. There is no real concept of turnover or holding period in this example. We have \$1,000 to invest in our portfolio over the course of a single day. Although this additional investment does not represent turnover, we can view our \$1,000 as representing a daily turnover of \$1,000. We want to make the best ranked trades until the cumulative market value of these trades exceeds the money we have to invest. Analogously, we would say that we want to make the best ranked trades until we exceed turnover. As our turnover in this example is \$\Sexpr{tl@turnover}, all of our trades will not have a market value greater than \$\Sexpr{tl@turnover}: <<>>= tl@swaps.actual[, c("tca.rank.enter", "tca.rank.exit", "rank.gain")] @ MSFT is the the best ranked trade. Consequently, we choose swaps of MSFT before choosing other swaps. We make \Sexpr{nrow(tl@swaps.actual)} because each swap has a value of approximately \$\Sexpr{tl@chunk.usd}, and our turnover is \$\Sexpr{tl@turnover}. \subsection{Actual orders} We do not want to submit two orders for 8 shares of MSFT. Before submitting the trade list, we must roll-up the swaps into larger orders. We first remove the dummy chunks: <<remove idiots>>= tl@chunks.actual[, c("side", "mv", "alpha", "ret.1.d", "rank.t", "chunk.shares", "chunk.mv", "tca.rank")] @ Then we combine the chunks to form a single order per candidate: <<>>= tl@actual[, !names(tl@actual) %in% c("id")] @ We now have an order for \Sexpr{tl@actual[1,"shares"]} shares of \Sexpr{tl@actual[1,"id"]}, which is the sum of the chunks of \Sexpr{tl@actual[1,"id"]}. Having discussed in words the process of trade list creation, we describe, step-by-step, the process of building a \texttt{tradelist} object in R. %% \section{A simple example} %% \SweaveOpts{echo=FALSE, quiet=TRUE} %% Assume that we already have a small portfolio consisting of positions %% in various equities. We have been given an additional \$1,000 to %% invest in the current portfolio, and we must invest this \$1,000 over %% the course of one trading day. We only consider buys. This is not %% a realistic scenario, but it is simple. %% Recall our first simplifying assumption that we already have a %% ``target portfolio'', an ideal set of holdings that we would %% immediately switch to if trading were free. We use the \$1,000 to move %% towards the target portfolio. %% %\subsection{Holdings} %% %\subsubsection{Current portfolio} %% %\subsubsection{Target portfolio} %% %\subsubsection{Portfolio difference and candidate trades} %% \subsection{Current and target holdings} %% Our current portfolio consists of shares of %% \Sexpr{nrow(p.current@shares)} companies, IBM (International Business %% Machines), GM (General Motors) and EBAY (EBay). %% <<p.current@shares, echo=FALSE>>= %% p.current@shares[, c("shares", "price")] %% @ %% The \texttt{shares} column expresses how many shares of each stock are in %% the portfolio, and the \texttt{price} column expresses the most recent %% price of that equity.\footnote{For simplicity, we express monetary %% values in this document as US dollars.} The market value of the %% current portfolio can be calculated by summing the products of the %% shares and prices, and it is %% \$\Sexpr{prettyNum(portfolio:::mvLong(p.current),big.mark=",")}. %% As per our simplifying assumption the user already has a target %% portfolio. %% <<p.target@shares>>= %% p.target@shares[, c("shares", "price")] %% @ %% We would like to buy more shares of GM and take positions in SCHW %% (Charles Schwab Inc.), MSFT (Microsoft), and GOOG (Google). The %% market value of the target portfolio is %% \$\Sexpr{prettyNum(portfolio:::mvLong(p.target),big.mark=",")}. %% The target portfolio expresses the positions we want to take with our %% \$1,000. It is an ideal set of holdings that we would immediately %% switch to if trading were free. The positions in the target portfolio %% are all desirable, otherwise we would not switch to them. However, it %% is not given that we can move to the target portfolio. Liquidity, %% price impact, and turnover complicate trading. At best, we will move %% in the direction of the target portfolio. There are a variety of ways %% to do this. The \texttt{portfolio} package helps us to decide which %% is the best. %% \subsection{Portolio difference and candidate trades} %% The portfolio difference may be understood as the trades that would %% change our current holdings into our ideal holdings. From the %% portfolio difference, we determine our \emph{candidate trades}. %% \begin{description} %% \item{\bf{candidate trades}}: The set of trades to move from %% the current portfolio to the target portfolio. If trading were free, %% we would make all of these trades. %% \end{description} %% \subsection{Expressing preferences amongst trades} %% Part of our simplifying assumption is that we would instantly switch %% to the target portfolio if trading were free. This implies that all %% of the candidate trades are desirable. However, they are not all %% equally desirable. Some trades are better than others. We want to %% determine which candidate trades or subsets of the candidate trades %% yield the most utility. We might do this by assigning each stock some %% measure of utility. %% This approach works fine when we only have one measure of utility, but %% fails when we have multiple measures of utility. Let's say that we %% have two measures of utility, \emph{alpha} and \emph{one-day return}. %% Alpha is a measure of utility. Using some quantitative model, we %% generate a measure of alpha for the stocks in our portfolio. %% Each stock has some value associated with it which represents alpha. %% Positive alpha values indicate that we should buy a stock; the greater %% the magnitude, the better the buy. EBAY (EBay) is the best buy and %% IBM (International Business Machines) is worst buy. The opposite %% applies for sells. Negative alpha values indicate that we should sell %% stock. %% Based on these alpha values, we decide to buy all of these stocks. %% When we associate trades with alpha, we say that we sort by alpha or %% use alpha as a \emph{sort.} %% \begin{description} %% \item{\bf{sort}}: a set of values associated with a set of positions. %% Higher values suggest positive future performance and negative values %% suggest poor future performance. Based on a sort we can determine %% what type of trade is most desirable for each position. Therefore, we %% want to buy or cover positions with positive sort value and sell or %% short positions with negative sort values. %% \end{description} %% Like portfolio construction, alpha generation is beyond the scope of %% this document; we provide alpha values and will be using alpha as a %% sort. We associate good buys with greater alpha values. The %% candidate at the head of the data frame has the highest alpha value %% and is therefore the most desirable trade. %% \subsection{Preliminary ranks} %% We determine which trades are most desirable by generating an overall %% measure of desirability for each trade. The first step in generating %% this value involves creating a \emph{rank ordering} of the trades for each %% sort we have created. %% \begin{description} %% \item{\bf{rank ordering}}: a linear ordering of the candidates, %% where each candidate is assigned a rank from the set $1, 2, %% 3\dots{}n$. Trade 1 provides the greatest utility and trade $n$ %% provides the least utility. %% \end{description} %% <<trades ranked and ordered by alpha>>= %% ## removes the "ret.1.d" column for successful row binding later on %% alpha <- tl@ranks[,!names(tl@ranks) %in% "ret.1.d"] %% ## appends a column so we know what sort these values come from %% alpha$sort <- "alpha" %% @ %% We lose \emph{cardinal information} when we use ranks. Cardinal %% information is the set of values we use to create the ranks. In some %% cases we may want to use more than one measure of desirability. We %% may have more than one source of cardinal information. Imagine that %% we want to use both alpha and one-day return as sorts. If we believe %% in one day reversal, we would assign higher ranks to both orders to %% sell stocks with positive one-day returns and to orders to buy stocks %% with negative one-day returns. However, we associate more desirable %% buys with greater sort values. To account for this, we have taken the %% inverse of all one-day return values. Therefore, if the return %% one-day return for GM appears to be $0.10$, it should really be %% $-0.10$. The table on the left shows the one-day return values after %% we have taken their inverse. The table to the right contains that %% actual one-day return values. %% \begin{verbatim} %% rank side ret.1.d rank side ret.1.d %% GM 1 B 0.10 GM 1 B -0.10 %% GOOG 2 B 0.01 GOOG 2 B -0.01 %% MSFT 3 B -0.01 MSFT 3 B 0.01 %% SCHW 4 B -0.02 SCHW 4 B 0.02 %% \end{verbatim} %% <<ret.1.d sort>>= %% tmp <- tl@ranks[order(tl@ranks$ret.1.d), c("side","ret.1.d")] %% tmp <- cbind(rank = 1:nrow(tmp), tmp) %% tmp$ret.1.d <- tmp$ret.1.d[order(tmp$ret.1.d, decreasing = TRUE)] %% row.names(tmp) <- tl@candidates$id %% @ %% \Sexpr{row.names(tmp)[1]} has the highest rank according to one-day return %% because it has the most negative return of all the buys, but for the %% \texttt{portfolio} package to properly process our one-day reversal %% sort, we must take the inverse of all the one-day return values. %% \subsubsection{The problem of multiple sorting criteria} %% When we combine the sorts in a single data frame, it is not clear %% which sort values we should use. If we order by alpha we get the %% following set of ranks: %% <<two sorts sorting by alpha>>= %% tmp.1 <- tl@ranks[order(tl@ranks$alpha, decreasing = TRUE), c("alpha", "ret.1.d")] %% tmp.1 <- tmp.1 <- cbind(rank = 1:nrow(tmp.1), tmp.1) %% tmp.1 %% @ %% Ordering by one-day return yields another ordering: %% <<two sorts sorting by ret.1.d>>= %% tmp.2 <- tl@ranks[order(tl@ranks$ret.1.d, decreasing = TRUE), c("alpha", "ret.1.d")] %% tmp.2 <- cbind(rank = 1:nrow(tmp.2), tmp.2) %% tmp.2 %% @ %% When we use multiple sorts, there is no obvious way by hich we would %% order the trades by desirability. When sorting by alpha, %% \Sexpr{row.names(tmp.1)[1]} is the most desirable trade, and when %% sorting by one-day return, \Sexpr{row.names(tmp.2)[1]} is the most %% desirable trade. We cannot easily compare or combine the two sorts %% because we do not know what the exact relationship is between one-day %% reversal and alpha. Neither sort is even on the same numeric scale. %% Should we alternate between using values from alpha and one-day %% return? How would we decide how often to alternate between the sorts? %% The way in which we express preferences amongst trades in the %% \texttt{portfolio} package represents our answer to these questions. %% In sections \ref{words weighting sorts} through \ref{generating %% synthetic ranks} we discuss our method for ranking trades when we %% multiple measures of desirability. In section \ref{sorting theory} we %% will discuss the reasoning behind our mtehod of ordering trades. %% <<trades ranked and ordered by ret.1.d>>= %% ## we don't actually show any of these values right here %% ## for buys, ranks by the inverse because lower values are better %% tl@ranks$rank <- rank(-abs(tl@ranks$ret.1.d), na.last = TRUE, ties.method = "random") %% ## removes the "alpha" column for successful row binding later on %% ret.1.d <- tl@ranks[,!names(tl@ranks) %in% "alpha"] %% ## appends a column so we know what sort these values come from %% ret.1.d$sort <- "ret.1.d" %% @ %% %% As per our second simplifying assumption, we lose a certain amount of %% %% data when we replace cardinal information with ranks. In doing this %% %% the \texttt{portfolio} package makes the important assumption, that %% %% all our different criteria for trading can be captured with a rank %% %% ordering. Anyone who uses the package should be aware of this %% %% assumption. This assumption and the assumption that the user provides %% %% a target portfolio are the most significant assumptions we make. %% %% However, if we did not only consider ranks, it would be difficult to %% %% compare sorts. We would have to force the user to provide a function %% %% that expresses the relationship between the sorts. Writing such a %% %% function is hard, especially if the we use multiple sorts. To %% %% facilitate the comparison of sorts, we introduce a weighting scheme. %% \subsection{Weighting sorts} %% \label{words weighting sorts} %% As sorts express preferences amongst stocks, weights express %% preferences amongst sorts. A weight is a measure of how important a %% sort is relative to other sorts. Say that alpha and one-day return %% are equally important to us. We assert this by assigning them both %% the same weight. Assigning a weight divides the sort values by that %% weight. We assign a weight of one. %% <<weighted ranks>>= %% ## saves off alpha$rank %% alpha.rank.orig <- alpha$rank %% ## creates a column w/ 1-day return %% alpha$ret.1.d <- ret.1.d$ret.1.d %% ## orders each column individually %% alpha[order(alpha$rank), c("alpha","ret.1.d")] %% @ %% However, the ranks for one-day return remain the same because one-day %% return has a weight of 1. %% <<unweighted ret.1.d>>= %% @ %% Having divided the original, \emph{raw ranks} by weight, we now have %% \emph{weighted ranks}. %% \begin{description} %% \item{\bf{raw ranks}}: the original, linearly spaced ranks, built on %% the scale $1, 2, 3\dots{}n$ %% \item{\bf{weighted ranks}}: the raw ranks divided by the weights of %% the sorts. %% \end{description} %% We now have two ranks associated with each candidate, one from the %% alpha sort and another from the one-day return sort. To illustrate %% that we have duplicate ranks for each sort, we bind the %% equally-weighted alpha and one-day return sorts to form a single data %% frame. %% <<unweighted ranks>>= %% ## sets the ranks of alpha to the original, unweighted ranks %% alpha$rank <- alpha.rank.orig %% ## subsets out the "alpha" and "ret.1.d" columns so that both data frames have the same set of columns %% alpha <- alpha[,!names(alpha) %in% "alpha"] %% ret.1.d <- ret.1.d[,!names(ret.1.d) %in% "ret.1.d"] %% overall.ranks <- rbind(alpha, ret.1.d) %% overall.ranks <- overall.ranks[order(overall.ranks$rank), c("id", "sort", "rank", "side", "shares", "mv")] %% row.names(overall.ranks) <- paste(overall.ranks$id, overall.ranks$sort, sep = ".") %% overall.ranks[, c("rank", "sort", "side", "shares", "mv")] %% @ %% Equally weighted sorts produce candidate trades from each sort at every %% rank. Using $x$ sorts produces $x$ ranks for each candidate. We %% assign each candidate the best rank generated for it by any sort: %% <<unweighted w/out duplicates>>= %% ranks <- alpha %% top.ranks <- aggregate(overall.ranks[c("rank")], by = list(id = overall.ranks$id), min) %% ranks$rank <- top.ranks$rank[match(ranks$id, top.ranks$id)] %% ranks[order(ranks$rank), c("rank", "shares", "mv")] %% @ %% Both GM and MSFT have been assigned a rank of one. This occurs %% because MSFT has been ranked 1 by the alpha sort and GM has been %% ranked 1 by the one-day return sort. SCHW has been ranked 2 by the %% alpha sort and GOOG has been ranked 3 by the alpha sort. %% When we equally weight the sorts we are equally likely to use ranks %% from either sort. This behavior is logical because assigning sorts %% equal weights suggests that they are equally important. However, the %% sorts may not always be equally important. In the next example we use %% a weighting scheme that causes us to use one sort to the exlusion of %% the other. %% To ignore one-day return we make alpha 10 times more important. We %% consider 10 ranks from alpha for every one rank from one-day return. %% As there are only \Sexpr{nrow(tl@candidates)} candidate trades, we %% choose the rankings in alpha over the ranks in the one-day return %% sort. %% <<unbalanced weights>>= %% ## Assigns one sort, alpha, a much higher weight than the other sort %% ## restores the original alpha rankings %% alpha$rank <- alpha.rank.orig %% ## weights the alpha rankings by 10 %% alpha$rank <- alpha$rank / 10 %% overall.ranks <- data.frame() %% overall.ranks <- rbind(alpha, ret.1.d) %% overall.ranks <- overall.ranks[order(overall.ranks$rank), c("id", "sort", "rank", "side", "shares", "mv")] %% row.names(overall.ranks) <- paste(overall.ranks$id, overall.ranks$sort, sep = ".") %% overall.ranks[c("rank", "side", "shares", "mv")] %% @ %% This extreme weighting stacks the alpha ranks on top of the one-day %% return ranks. Since we always assign the lowest rank from all trades %% to a sort, we consider the alpha ranks before any of the one-day %% return ranks. %% <<unbalanced w/out duplicates>>= %% ranks <- alpha %% top.ranks <- aggregate(overall.ranks[c("rank")], by = list(id = overall.ranks$id), min) %% ranks$rank <- top.ranks$rank[match(ranks$id, top.ranks$id)] %% ranks[order(ranks$rank), c("rank", "sort", "shares", "mv")] %% @ %% Making the alpha 10 times as important as the one-day return sort %% causes us to only use ranks from the alpha sort. We do not even %% consider the number 1 ranked one-day return trade until we examine all %% the alpha sorts ranked in the top ten. %% The last weighting we consider falls somewhere in between the %% previous two. %% <<mixed weighting>>= %% ## returns alpha$rank to original level %% alpha$rank <- alpha.rank.orig %% alpha$rank <- alpha$rank / 1.5 %% overall.ranks <- data.frame() %% overall.ranks <- rbind(alpha, ret.1.d) %% overall.ranks <- overall.ranks[order(overall.ranks$rank), c("id", "sort", "rank", "side", "shares", "mv")] %% row.names(overall.ranks) <- paste(overall.ranks$id, overall.ranks$sort, sep = ".") %% overall.ranks[c("rank", "side", "shares", "mv")] %% @ %% Having assigning a weight of 1.5 to alpha we divide each alpha value %% by 1.5. %% <<mixed w/out duplicates>>= %% ranks <- alpha %% top.ranks <- aggregate(overall.ranks[c("rank")], by = list(id = overall.ranks$id), min) %% ranks$rank <- top.ranks$rank[match(ranks$id, top.ranks$id)] %% tmp <- ranks[order(ranks$rank), c("rank", "shares", "mv")] %% tmp %% @ %% We use three of the ranks from the alpha sort and one %% rank from the one-day return sort. This is the weighting scheme that %% we use in the rest of the example. %% To review, the ranking process has four steps. First, we ranked each %% trade according to both alpha and one-day return to generate raw %% ranks. Second, we weight these ranks. Third, we combined the alpha %% and one-day return ranks. Fourth, we eliminated duplicates by %% associating each trade with the lowest rank assigned to it by either %% alpha or one-day return. We call these ranks preliminary ranks %% because they are not the final values we use to determine the %% desirability of each trade. Nonetheless, we must generate preliminary %% ranks before we can arrive at final ranks, the calculation of which we %% describe in the next section. %% \subsection{Generating synthetic ranks} %% \label{generating synthetic ranks} %% If trade 1 is $X$ better than the trade 2, then is trade 99 $X$ better %% than trade 100? Most portfolio managers would argue that the %% difference in utility between trade 1 and trade 2 is greater than the %% difference in utility between trade 99 and trade 100. However, with %% raw ranks, we make no assertion of how much better one trade is than %% another trade. To express the tendency for us to derive more utility %% from the most highly ranked trades, we synthesise yet another set of %% values from the weighted ranks. We call these values synthetic %% ranks.\protect\footnote{We are abusing the term ``ranks'' by using it %% in several different contexts.} %% \begin{description} %% \item{\bf{synthetic ranks}}: values generated by mapping the weighted %% ranks to a truncated normal distribution ($> 85^{th}$ percentile on %% $N(0,1)$). %% \end{description} %% First, we re-rank the weighted ranks: %% <<>>= %% tmp$rank <- rank(tmp$rank, ties.method = "first") %% tmp[order(tmp$rank),] %% @ %% Next, we evenly distribute the ranks on the interval interval %% $[0.85,1)$ such that the best ranked trades are closest to 1 and the %% worst ranked trades are closest to $0.85$: %% <<scaling ranks>>= %% ## a hacked version of the scaling function in calcRanks, built only %% ## for a list of all buys %% r.max <- max(tmp$rank) + 1 %% r.mult <- 0.15 %% r.add <- 0.85 %% tmp$rank.s <- (r.mult * tmp$rank[nrow(tmp):1] / r.max) + r.add %% ## Saves off rank.s for later use %% rank.s <- tmp %% tmp %% @ %% We list the scaled ranks in \texttt{rank.s}. Next, we map to the a %% truncated normal distribution.\protect\footnote{$> 85^{th}$ percentile %% of $N(0,1)$} %% <<synthesised ranks>>= %% tmp$rank.t <- qnorm(tmp$rank.s) %% tmp[,c("rank", "rank.s", "rank.t", "shares", "mv")] %% @ %% The \texttt{rank.t} column lists the ranks mapped to the truncated %% normal distribution. \Sexpr{tl@ranks[1,"id"]} has the best rank and %% \Sexpr{tl@ranks[nrow(tl@ranks),"id"]}, has the lowest rank. We might %% expect to see a \texttt{rank.t} of approximately 3.5 for the best %% ranked trade, but because we only have \Sexpr{nrow(tmp)} candidates %% and the scaled values are evenly spaced on the interval $[0.85,1)$, %% the normalised value of the best ranked trade is not as great as %% it would be if we had 100 trades. %% \emph{preliminary ranks section} %% We mentioned in section \ref{} that synthetic ranks express the %% greater differences in utility between adjacent, highly ranked trades %% ($1,2,3\dots{}$) than between adjacent, poorly ranked trades: \\* %% \begin{table}[!htbp] %% \begin{tabular}[c]{|rr|rr|rr|} %% \hline %% rank & $\Delta$ & $N(0,1)$ & $\Delta$ & $> 85^{th}$ of $(0,1)$ & $\Delta$ \\ %% \hline %% 1 & 1 & 3.50 & 1.17 & 3.50 & 0.53 \\ %% 2 & 1 & 2.32 & 0.27 & 2.96 & 0.21 \\ %% 3 & 1 & 2.05 & 0.17 & 2.74 & 0.13 \\ %% 4 & 1 & 1.88 & 0.13 & 2.61 & 0.10 \\ %% 5 & 1 & 1.75 & - & 2.51 & - \\ %% . & . & . & . & . & . \\ %% . & . & . & . & . & . \\ %% 48 & 1 & 0.05 & 0.03 & 1.46 & 0.01 \\ %% 49 & 1 & 0.02 & 0.02 & 1.45 & 0.01 \\ %% 50 & 1 & 0.00 & 0.02 & 1.44 & 0.01 \\ %% 51 & 1 & -0.02 & 0.02 & 1.43 & 0.01 \\ %% 52 & 1 & -0.05 & - & 1.41 & - \\ %% . & . & . & . & . & . \\ %% . & . & . & . & . & . \\ %% 96 & 1 & -1.64 & 0.11 & 1.06 & 0.00 \\ %% 97 & 1 & -1.75 & 0.13 & 1.06 & 0.00 \\ %% 98 & 1 & -1.88 & 0.17 & 1.06 & 0.00 \\ %% 99 & 1 & -2.05 & 0.27 & 1.06 & 0.00 \\ %% 100 & - & -2.32 & - & - & NA \\ %% \hline %% \end{tabular} %% \caption[Synthetic rank distributions]{Creating synthetic ranks %% using a linear distribution, a normal distribution, and a %% truncated normal distribution. Delta columns express the %% difference in utility between adjacent trades.\label{distribution %% table}} %% \end{table} %% Table \ref{distribution table} expresses the differences amongst %% distributions we might use to rank 100 trades. The \texttt{Raw Rank} %% column contains the raw ranks for the 5 best trades, the 5 %% middle-ranked trades, and the 5 worst trades. In this example the %% ranks on $[1,100]$ are spaced at intervals of one. The alpha %% difference between every trade is the same. If we use raw rank as a %% measure of alpha, we derive the same utility from every trade. Trade %% 1 is one better than trade 2, and trade 99 is one better than trade %% 100. %% The normal distribution column $(N(0,1))$ expresses what happens when %% we normalise the raw ranks. The normal distribution correctly %% expresses our belief that there is a large difference in %% alpha between the best ranked trades. However, use of the normal %% distribution would incorrectly suggest that there are similarly large %% alpha differences between the worst trades. We get these results when %% using the normal distribution because the best and worst ranked trades %% form the tails of the distribution. We do not want large %% differences in alpha amongst the worst rank trades. The %% alpha differences decrease until we reach trade 50, then increase %% again as we move towards the other tail of the distribution. We want %% alpha to remain the same on the margin past the 50th trade. %% To address the problems of a normal and linear distribution, we use a %% truncated normal distribution, $> 85^{th} \% of N(0,1)$. In the %% right\-most delta $(\Delta)$ column, the alpha differences %% between the best ranked trades is over 50 times greater than the %% alpha differences between the worst ranked trades. Every %% trade ranked worse than 50 has a similar alpha difference. %% Although the subset $[0.85,1)$ is slightly arbitrary, (we could have %% set the lower extreme to be 0.84, 0.85, or another similar value) it %% serves our purpose of expressing large differences in alpha %% and where we find the best buys, at one tail, and small differences in %% alpha amongst the worst buys. %% Recall the steps we have taken towards in generating our final measure %% of rank, synthetic rank. First, we converted the sort values to raw %% ranks. Second, we converted the raw ranks to weighted ranks. Third, %% we scaled the weighted ranks to $[0.85,1)$ to generate scaled weights. %% Lastly, we mapped the scaled weights to the truncated normal %% distribution. By only using the $85^{th}$ percentile and above, we %% express our belief that the differences in alpha between the best %% ranked trades is much greater than the differences in alpha between %% the worst ranked trades. %% If the costs associated with trading any stock, all things being %% equal, were the same, we would not care about the difference in %% utility between any trades. We would move down the trade list from %% best to worst until we match the allotted turnover. However, our %% trading influences prices and may reduce the desirability of a trade. %% \subsection{Chunks, synthetic rank, and trade-cost adjustment} %% \label{Chunks, synthetic rank, and trade-cost adjustment} %% We want to know at what point the cost of trading an equity exceeds %% the utility of trading that equity. In the \texttt{portfolio} %% package, we use synthetic rank to represent utility. Determining the %% cost of purchasing an additional share is impossible if our smallest %% trading unit is an entire order so we break each order into %% \emph{chunks}. %% \begin{description} %% \item{\bf{chunk}}: A portion of a candidate trade. %% \end{description} %% We break candidate trades into chunks by market value. Each chunk has %% a market value of approximately \$\Sexpr{tl@chunk.usd}: %% <<chunk w/out tca.rank>>= %% tl@chunks[, c("side", "shares", "mv", "alpha", "ret.1.d", "rank.t", "chunk.shares", "chunk.mv")] %% @ %% The candidate trades are broken into \Sexpr{nrow(tl@chunks)} chunks. %% The number following the period in the row name expresses the chunk %% number for that particular equity. The \texttt{chunks.mv} column %% expresses the market value of each chunk. The \texttt{chunk.shares} %% column expresses how many shares each chunk consists of. %% \subsubsection{Trade-cost adjustment of individual chunks} %% As we trade greater percentages of the average daily volume, the price %% of the equity will increase. To reflect this phenomenon, we penalise %% the synthetic ranks of the chunk as we trade greater percentages of %% the daily volume. We call this penalty \emph{trade-cost adjustment}. %% \begin{description} %% \item{\bf{trade-cost adjustment}}: Lowering a chunk's rank because of %% trading volume. %% \end{description} %% To fix this idea, let's first examine the daily volumes of our %% candidate trades.\protect\footnote{The \texttt{volume} column %% represents some measure of past trading volume such as the average %% trading volume over the last 30 days. A daily measure of %% \texttt{volume} is not required; we would use whatever measure is %% natural for the frequency with which we trade.} %% <<trading volume>>= %% trading.volume <- data.frame() %% trading.volume <- cbind(rank.t = tl@ranks[, c("rank.t")], volume = tl@data[match(tl@ranks$id, tl@data$id), c("volume")], shares = tl@ranks[, "shares"]) %% row.names(trading.volume) <- tl@ranks$id %% trading.volume %% @ %% \emph{This must be updated as I change the portfolio} %% The trades we want to make for both MSFT, SCHW, and GOOG involve less %% than 3\% of the daily trading volume. However, we must trade 100\% of %% the daily trading volume. We would probably not be able to purchase %% all these shares in one day, and even if we could, we would affect %% prices significantly. Moving into the position over several days %% would be better. %% We use a trade-cost adjustment function to express how increasing %% trade costs reduce the desirability of candidate trades. To better %% approximate utility, we penalise synthetic ranks at the chunks level. %% Doing this allows us to better determine at which point the cost of %% trading an additional chunk is greater than the utility derived by %% trading an additional chunk. We perform trade-cost adjustment on the %% chunks by keeping track of what percentage of the daily volume we have %% traded with each additional chunk. The first chunk to cross the %% threshold of 15\% of the daily trading volume is penalised by a fixed %% amount. All subsequent chunks are penalised by that amount, and any %% further chunks that pass 30\% or 45\% percent of the daily trading %% volume receive further penalties. The chunks of GM, an illiquid %% equity, have been trade-cost adjusted. %% <<chunks w/ tca.rank>>= %% head(tl@chunks[, c("side", "mv", "alpha", "ret.1.d", %% "rank.t", "chunk.shares", "chunk.mv", "tca.rank")]) %% @ %% The \texttt{tca.rank} column expresses the synthetic rank adjusted for %% trade-cost. As the only candidate for which we want to purchase more %% than 15\% of the daily trading volume is GM, it is the only candidate %% for which we trade-cost adjust the chunks. Every chunk of GM beyond %% the first has been trade-cost adjusted. This will cause us to %% consider the chunks of other candidate trades before we trade %% additional chunks of GM: %% <<ordered chunks w/ tca.rank>>= %% tl@chunks[order(tl@chunks$tca.rank, decreasing = TRUE), c("side", %% "mv", "alpha", "ret.1.d", "rank.t", "chunk.shares", %% "chunk.mv", "tca.rank")] %% @ %% As MSFT is the best ranked candidate and does not receive a trade-cost %% penalty, we would trade all the shares of MSFT before considering the %% other candidates.\protect\footnote{Assuming that derived turnover is %% greater than the market value of all the candidate trades.} Having %% completed all the trades of MSFT, we would consider the first chunk of %% GM, the only chunk which has not been trade-cost adjusted. %% Subsequently, we would trade all the chunks of SCHW and GOOG, the %% candidate trades ranked 3 and 4. Lastly, we trade the penalised %% chunks of GM. %% \subsubsection{Synthetic rank and trade-cost adjustment of small portfolios} %% \label{Synthetic rank and trade-cost adjustment of small portfolios} %% In this example, trade-cost adjustment decreases the desirability of %% the chunks of GM in a non-trivial way. Although GM is ranked 2nd as a %% candidate trade, every other candidate trade would be made before we %% completed all the chunks of GM. When we consider such a small number %% of trades, we assume that all of the trades are of approximately equal %% quality; the difference in utility between candidate trades %% is fairly small. This occurs because the scaled ranks are evenly %% distributed on $[0.85,1)$: %% <<rank.s>>= %% rank.s %% @ %% When we only have \Sexpr{nrow(tl@candidates)} candidates, none of %% the scaled ranks will be very close to $1$, and consequently, none of %% the synthetic ranks will fall at the extreme tail %% of the normal distribution: %% <<rank.t>>= %% rank.t <- rank.s %% rank.t$rank.t <- qnorm(rank.t$rank.s) %% rank.t %% @ %% Consequently, the difference in utility between candidate %% trades will be small when there are few candidate trades. %% Heuristically, this seems correct because if we are making very few %% trades, we would most likely derive similar utility from any of %% them.\protect\footnote{This does not exclude our expressing a %% preference amongst the sorts.} Therefore, it makes sense for us to %% trade the other three candidates if the costs associated with trading %% GM are non\-trivial. %% \subsubsection{Synthetic rank and trade-cost adjustment of large portfolios} %% \label{Synthetic rank and trade-cost adjustment of large portfolios} %% Moving away from our example for a moment, imagine that we have a %% large current and target portfolio, the trade list for which contains %% 100 candidate trades. When we have a large portfolio, we tend to %% view the differences in utility between candidates in the %% manner we described in section \ref{}. When evenly distribute the %% scaled ranks on the interval $[0.85,1)$, we have more ranks at %% the extreme tail: %% <<misc$rank.s>>= %% misc$rank.s %% @ %% The row names express the equity ticker symbols. \texttt{rank} is the %% raw rank. \texttt{rank.s} is the scaled rank, and \texttt{rank.t} is %% the synthetic rank. The best ranked trade %% \Sexpr{row.names(misc[["rank.s"]])[1]}, has a scaled rank value very %% close to one, \Sexpr{row.names(misc[["rank.s"]][["rank.s"]])[1]}, and %% a synthetic rank close to three. This indicates that the best rank %% falls at the tail of the normal distribution. The worst ranked %% candidates not only have low synthetic ranks, but they also have very %% small differences in synthetic rank. If we trade-cost adjust %% one of the poorly ranked candidates we will most likely not trade it %% until we have traded all other candidates not penalised by trade cost %% adjustment. On the other hand, we would still trade %% \Sexpr{row.names(misc[["rank.s"]])[1]}, %% \Sexpr{row.names(misc[["rank.s"]])[2]}, or %% \Sexpr{row.names(misc[["rank.s"]])[3]}, even if some of the chunks had %% been trade-cost adjusted: %% Here we have a subset of the hypothetical chunk table for the 100 %% candidate example. For this example, the GOOG candidate has been %% broken up into 2 chunks and the IBM candidate has been broken up into %% 4 chunks. The ranks of 2$^{nd}$, 3$^{rd}$, and %% 4$^{th}$ chunks of IBM have been penalised for trade costs. %% Therefore, we trade the first chunk of IBM, followed by all the chunks %% of GOOG. Subsequently, we trade the remaining chunks of IBM because %% the trade-cost adjusted rank of its chunks is still greater than the %% un-penalised synthetic rank of the next most desirable candidate, GM. %% Let's quickly review how we generate the final, synthetic ranks. The %% preliminary values from which we draw the raw ranks are the sorts we %% define. In this example, we defined sorts for alpha and one-day %% return. In creating raw ranks abstract away the underlying values %% provided by the sorts. At this point, we still have a different set %% of raw ranks for each sorts. To express preferences amongst the %% sorts, we apply weights to the sorts. This step yields weighted %% ranks. From the sets of weighted ranks, we associate with each %% candidate the best weighted rank from any sort. Next, we scale the %% buys to the interval $[0.85,1)$. This step yields scaled ranks. From %% scaled ranks, we generate synthetic ranks by mapping the scaled ranks %% to a truncated normal distribution. Next, we break the candidates %% into chunks and perform trade-cost adjustment as necessary. This %% yields trade-cost adjusted ranks which are the final measure of a %% chunks desirability. %% \subsection{Sorting theory} %% \label{sorting theory} %% Chooing the best candidate when we have multiple measures of %% desirability is difficult. Consider the situation where we must %% choose ten stocks to trade. %% Assuming that we use some type of formula to generate alpha, we might %% be able to incorporate our other sorts into the formula for alpha. %% Instead of having alpha and one-day return as distinct sorts, we would %% only have one sort, alpha, which would also take one-day return into %% account. For this to work, however, we would have to write a function %% that accounted for the the ordering of every trade by every sort. %% Furthermore, this function would have to take into account our %% preference for certain certain sorts over other sorts. To elaborate %% on the difficulty of this creating such a function, let us consider %% the situation where we must choose our ten favourite trades, in no %% particular order, using the data in the table below. %% \begin{table}[!htbp] %% \begin{tabular}[c]{|r|r|r|r|r|r|} %% \hline %% symbol & raw rank & alpha & symbol & raw rank & ret.1.d \\ %% \hline %% IBM & 1 & 1.57 & HPQ & 1 & -0.063 \\ %% MS & 2 & 1.26 & SUNW & 2 & -0.056 \\ %% EBAY & 3 & 1.24 & AET & 3 & -0.041 \\ %% CBBO & 4 & 1.21 & YHOO & 4 & -0.036 \\ %% SCHW & 5 & 1.15 & T & 5 & -0.014 \\ %% PAYX & 6 & 1.12 & CVX & 6 & -0.011 \\ %% HAL & 7 & 1.12 & GOOG & 7 & -0.011 \\ %% AMD & 8 & 1.10 & PAYX & 8 & -0.002 \\ %% MSFT & 9 & 0.99 & CBBO & 9 & 0.003 \\ %% CVX & 10 & 0.96 & HAL & 10 & 0.009 \\ %% AET & 11 & 0.92 & QCOM & 11 & 0.011 \\ %% HPQ & 12 & 0.81 & EBAY & 12 & 0.014 \\ %% QCOM & 13 & 0.77 & SCHW & 13 & 0.029 \\ %% GOOG & 14 & 0.65 & AAPL & 14 & 0.036 \\ %% YHOO & 15 & 0.64 & MS & 15 & 0.041 \\ %% \hline %% \end{tabular} %% \caption[alpha and one-day return ranks]{The alpha and one-day %% returns of candidates suggest different rank orderings. All of the %% candidates are buys.\label{theory table 1}} %% \end{table} %% Table \ref{theory table 1} has a row for each of 15 candidates, their %% alpha and one-day return values, and the raw ranks we would generate %% from these values. All of the candidates are buys so greater alpha %% values are better and lesser one-day return values are better. %% One portfolio manager might decide that she wants to make trades based %% only on alpha. She chooses the top ten trades according to alpha. A %% second portfolio manager may want to make trades based only on one-day %% return. She chooses the top ten trades according to one-day return. %% The third portfolio manager considers both alpha and one-day return %% and choose her favorite trades by examining both. %% Portfolio manager three believes in buying equities which have had %% price decreases of greater than 4\% during the previous trading day. %% Consequently, she would buy HPQ, SUNW, and AET. She would fill her %% remaining orders using the top 7 trades according to alpha. %% How would the third portfolio manager write a function that expresses %% her trading preferences? What if some days she acted like the first %% portfolio manager and on other days like the second portfolio manager? %% How would she account for a change in preference for one of the sorts? %% Our solution allows any of these portfolio managers to express her %% trading preferences without having to write a function that relates %% the different measures of desirability. Instead, she would use the %% weighting function that the \texttt{portfolio} package provides. She %% would examine the trade list created using different weighting schemes %% and adjust the weights until the utility derived from the %% last candidate traded was greater than the cost of the first %% trade \emph{not} made. %% For example, the portfolio manager may decide that YHOO is a better %% reversal trade than the last alpha trade and revise the weighting %% scheme so that she makes one less alpha trade and one more reversal %% trade. %% \begin{table}[!htbp] %% \begin{tabular}[c]{|r|r|r|r|r|r|} %% \hline %% symbol & raw rank & alpha & symbol & raw rank & ret.1.d \\ %% \hline %% IBM & 1 & 1.57 & HPQ & 1 & -0.063 \\ %% MS & 2 & 1.26 & SUNW & 2 & -0.056 \\ %% EBAY & 3 & 1.24 & AET & 3 & -0.041 \\ %% CBBO & 4 & 1.21 & YHOO & 4 & -0.036 \\ \cline{5-5} %% SCHW & 5 & 1.15 & T & 5 & -0.014 \\ %% PAYX & 6 & 1.12 & CVX & 6 & -0.011 \\ \cline{2-2} %% HAL & 7 & 1.12 & GOOG & 7 & -0.011 \\ %% AMD & 8 & 1.10 & PAYX & 8 & -0.002 \\ %% MSFT & 9 & 0.99 & CBBO & 9 & 0.003 \\ %% CVX & 10 & 0.96 & HAL & 10 & 0.009 \\ %% AET & 11 & 0.92 & QCOM & 11 & 0.011 \\ %% HPQ & 12 & 0.81 & EBAY & 12 & 0.014 \\ %% QCOM & 13 & 0.77 & SCHW & 13 & 0.029 \\ %% GOOG & 14 & 0.65 & AAPL & 14 & 0.036 \\ %% YHOO & 15 & 0.64 & MS & 15 & 0.041 \\ %% \hline %% \end{tabular} %% \caption[Trading Preferences II]{Portfolio manager 3 revises her %% trading preferences.\label{theory_table_2}} %% \end{table} %% What ultimately matters is the last candidate we decide to trade and %% the first candidate we decide not to trade. By using rank orders %% instead of underlying values, we do not have to combine the different %% sorts. Instead, we can express our preferences for different, %% possibly unrelated criteria through the use of a weighting scheme we %% provide in \texttt{portfolio}. %% \subsection{Pairing trades} %% Let us return to discussing trade list construction. In practise, %% most equity portfolios must be maintained at a specific market value. %% One logical way to achieve this result would be to pair desirable buys %% and sells of equal market value, and this is what we do in the %% \texttt{portfolio} package. We call these pairings of buys and sells %% a swap: %% \begin{description} %% \item{\bf{swap}}: A pairing of a buy and sell or short and cover of %% similar market market value and desirability. %% \end{description} %% We have already created the framework to create this swaps; we break %% the candidates into chunks of similar market value and then rank these %% chunks individually. If our candidate trades included buys and sells, %% we would simply match the most desirable buys with the most desirable %% sells. However, our candidate trades are all buys, and we want to %% increase the market value of our portfolio by \$1,000. %% \subsubsection{Dummy chunks} %% If we want to increase the market value of the portfolio, we must buy %% more than we sell. Therefore, we do not want to pair a buy with a %% sell. We just want buys. The situation where we just want buys or %% sells is a special case. The \texttt{portfolio} package is structured %% so that we must also trade in pairs. To work within the package %% framework we introduce the concept of \emph{dummy chunks}: %% \begin{description} %% \item{\bf{dummy chunk}}: A \emph{fake} buy or sell chunk that we pair with %% a real buy or sell chunk in situations where we want to increase or %% decrease the market value of the portfolio. %% \end{description} %% As our example only contains buys, we have paired every buy with a %% dummy sell.\protect\footnote{We only show the head of the swaps table.} %% <<swaps table>>= %% head(tl@swaps[, c("tca.rank.enter", "tca.rank.exit", %% "rank.gain")]) %% @ %% In the table above, the row names express the chunk ticker symbols %% that form the swap. To the left of the comma is an enter chunk, and %% to the right of the comma is an exit chunk.\protect\footnote{Enter %% chunks are either a buy or short. A buy allows us to take a long %% position and a short allows us to take a short position. Exit chunks %% are either sells or covers. A sell allows us to exit a long position %% and a cover allows us to exit a short position.} The exit chunks all %% have a symbol \texttt{NA.0} because they are dummy sells. The %% \texttt{tca.rank.enter} column expresses the trade-cost adjusted rank %% of the enter chunk, the buy, and the \texttt{tca.rank.exit} column %% expresses the trade-cost adjusted rank of the exit chunk, the dummy %% sell. The \texttt{rank.gain} column expresses the difference in %% trade-cost adjusted rank between the enter and the exit, the buy and %% dummy sell. %% We have spent considerable time discussing the generation of all types %% of ranks for buys, but we have not yet discussed ranking sells. For %% sells, better ranks are more negative. Therefore, a great sell might %% have a synthetic rank of -3.5. In section \ref{}, we discuss how %% we generate the ranks for the sells. For now, just note that better %% sells have more negative ranks. %% Recall that our goal is to make the trades which yield the most %% utility. In spending our \$1,000, we want to trade the best chunks. %% So that we make the best buys when increasing the market value of the %% portfolio, we assign the dummy sells an arbitrarily high rank. In the %% table above, the dummy sells have a trade-cost adjusted rank of %% -10,000. We match the best the buys and sells by calculating rank %% gain. As no real sells will yield the same rank gain that the pairing %% of buy and a dummy sell yields, we create pairs with all the dummy %% sells before even considering other sells. As there are no sells in %% this example, all the swaps consist of a buy and a dummy sell. %% Let's quickly review why we create swaps. We want to maximise utility %% by making the candidate trades or portions of candidate trades that %% yield the greatest utility. Generally, we want to maintain %% the portfolio equity at a constant level. A logical way to do this %% involves pairing buys and sells of similar market value. To maximise %% utility, we should pair the most best ranked buys and sells. In %% special cases, we want to increase or decrease the market value %% of our portfolio. In order to do this, we must make more of one type %% of trade. However, this would require that we have swaps that contain %% only a buy or sell. Since we cannot have a swap of only one trade, we %% introduce dummy trades. As dummy trades have an arbitrarily high %% synthetic rank they pair with the best buys and sells to ensure that %% we choose the most useful candidates in changing the market value of %% the portfolio. %% \subsection{Accounting for turnover} %% As we stated in section \ref{}, holding period would be endogenous if %% we could always set it to maximise risk-adjusted return. However, %% most real world portfolios have a set holding period and consequently, %% a set turnover. There is no real concept of turnover or holding %% period in this example. We have \$1,000 to invest in our portfolio %% over the course of a single day. Although this additional investment %% does not represent turnover, we can view our \$1,000 as representing a %% daily turnover of \$1,000. We want to make the best ranked trades %% until the cumulative market value of these trades exceeds the money we %% have to invest. Analogously, we would say that we want to make the %% best ranked trades until we exceed turnover. %% As our turnover in this example is \$\Sexpr{tl@turnover}, all of our %% trades will not have a market value greater than %% \$\Sexpr{tl@turnover}: %% <<>>= %% tl@swaps.actual[, c("tca.rank.enter", "tca.rank.exit", %% "rank.gain")] %% @ %% MSFT is the the best ranked trade. Consequently, we choose swaps of %% MSFT before choosing other swaps. We make %% \Sexpr{nrow(tl@swaps.actual)} because each swap has a value of %% approximately \$\Sexpr{tl@chunk.usd}, and our turnover is %% \$\Sexpr{tl@turnover}. %% \subsection{Actual orders} %% We do not want to submit two orders for 8 shares of MSFT. Before %% submitting the trade list, we must roll-up the swaps into larger %% orders. We first remove the dummy chunks: %% <<remove idiots>>= %% tl@chunks.actual[, c("side", "mv", "alpha", "ret.1.d", "rank.t", %% "chunk.shares", "chunk.mv", "tca.rank")] %% @ %% Then we combine the chunks to form a single order per candidate: %% <<>>= %% tl@actual[, !names(tl@actual) %in% c("id")] %% @ %% We now have an order for \Sexpr{tl@actual[1,"shares"]} shares of %% \Sexpr{tl@actual[1,"id"]}, which is the sum of the chunks of %% \Sexpr{tl@actual[1,"id"]}. Having discussed in words the process of %% trade list creation, we describe, step-by-step, the process of %% building a \texttt{tradelist} object in R. \section{Creating a long-only tradelist in R} \label{a long-only tradelist} \SweaveOpts{echo=TRUE, quiet=TRUE} <<echo=FALSE>>= ## Clears the search list. rm(list = ls()) load("tradelist.RData") ## prepares data for this example p.current <- portfolios[["p.current.lo"]] p.target <- portfolios[["p.target.lo"]] data <- data.list[["data.lo"]] ## Original Equity, Target Equity oe <- portfolio:::mvShort(p.current) + portfolio:::mvLong(p.current) te <- portfolio:::mvShort(p.target) + portfolio:::mvLong(p.target) ## Creates the sorts list sorts <- list(alpha = 1, ret.1.d = 1.1) ## Creates the tradelist so we can use different measures tl <- new("tradelist", orig = p.current, target = p.target, chunk.usd = 2000, sorts = sorts, turnover = 30250, target.equity = te, data = data) ## Necessary turnover to make all the candidate trades nt <- mvCandidates(tl) @ To create a \texttt{tradelist}, we need four main pieces. The first two pieces necessary to create a \texttt{tradelist} are \texttt{portfolio} objects. One of these portfolios is our current portfolio. Our current portfolio is a superset of the previous holdings. The major difference between the two portfolios is that the current portfolio in this example includes positions that we sell. This \texttt{portfolio}, named \texttt{p.current}, consists of \Sexpr{nrow(p.current@shares)} positions and has a market value of \$\Sexpr{prettyNum(oe,big.mark=",")}. <<prep-p.current.shares, echo=FALSE>>= p.current.shares <- p.current@shares[, c("shares", "price")] @ <<p.current.shares>>= p.current.shares @ The target portfolio is a superset of the previous target portfolio. It contains \Sexpr{nrow(p.current.shares)} positions and has a market value of \$\Sexpr{prettyNum(te,big.mark=",")}. <<echo=FALSE>>= p.target.shares <- p.target@shares[, c("shares", "price")] @ <<>>= p.target.shares @ We calculate the portfolio difference to determine the candidate trades.\protect\footnote{The data frame is a subset of the \texttt{candidates} data frame. We often take subsets of data frames so that they fit better on the page. If we do so we indicate this by prepending the name of the data frame with \texttt{sub}.} <<echo=false>>= sub.candidates <- tl@candidates[,!names(tl@candidates) %in% "id"] @ <<long-only candidates>>= sub.candidates @ The candidate buys are the same as before and we have 3 candidate sells. The market value is signed and expresses the net effect a candidate has on the dollar value of a portfolio. \subsection{Assigning weights} We assign weights to the sorts by creating a list. <<>>= sorts <- list(alpha = 1, ret.1.d = 1.1) @ We assign a weight of 1 to alpha and a weight of 1.1 to one-day return. \subsection{Passing additional information to \texttt{tradelist}} \label{paitt} The fourth item is a data frame. The \texttt{portfolio} package requires that this data frame contain columns for \texttt{id}, \texttt{volume}, \texttt{price.usd}, and the sorts: <<echo=false>>= row.names(data) <- data$id sub.data <- data[, c("id", "volume", "price.usd", "alpha", "ret.1.d")] @ <<necessary data>>= sub.data @ \texttt{volume} expresses some measure of average trading volume. \texttt{price.usd} is the most recent price of the security in US dollars. We must also include the sorts we define in \texttt{sorts}, \texttt{alpha} and \texttt{ret.1.d}. \subsection{Calling \texttt{new}} \label{lo new} We use \texttt{p.current}, \texttt{p.target}, the \texttt{sorts}, and \texttt{data} as arguments to \texttt{new}. <<new long-only tl>>= tl <- new("tradelist", orig = p.current, target = p.target, chunk.usd = 2000, sorts = sorts, turnover = 30250, data = data) @ <<echo=false>>= tl <- new("tradelist", orig = p.current, target = p.target, chunk.usd = 2000, sorts = sorts, turnover = 30250, target.equity = 47500, data = data) @ In this call, the \texttt{new} method for \texttt{tradelist} accepts 8 parameters:\footnote{The \texttt{new} method of \texttt{tradelist} can accept more parameters, but they are optional.} The first argument, \texttt{"tradelist"}, specifies the name of the object that we want to create. The argument to the \texttt{orig} parameter, \texttt{p.current}, is the current portfolio. The argument to the \texttt{target} parameter, \texttt{p.current}, is the target portfolio. The \texttt{sorts} parameter accepts the \texttt{sorts} list we created earlier. We create chunks with a granularity of of \$\Sexpr{prettyNum(tl@chunk.usd,big.mark=",")}. The \texttt{data} parameter accepts the data frame we created earlier with columns for \texttt{id}, \texttt{volume}, \texttt{price.usd}, and the sorts. The \texttt{turnover} parameter accepts an integer argument which expresses the maximum market value all orders made in one session. In the previous example we only had \$1,000 with which we could buy stocks. In this example, we can both buy and sell equities. We might sell an equity and use the proceeds to buy another equity. However, the turnover restriction applies to sells just as much as buys. If we have a turnover of \$1,000, we may make \$1,000 worth of buys, \$1,000 worth of sells, or something in between. For this example, we have set the turnover equal to the unsigned market value of all the candidate trades. This means that we take the absolute value of all market values, which is \$\Sexpr{prettyNum(nt,big.mark=",")}. Having set \texttt{turnover} to this value, we complete every candidate trade. We have demonstrated how to create a simple \texttt{tradelist} in R. In the next section we examine the \texttt{tradelist} that we have constructed. In doing so, we learn how the \texttt{tradelist} generation algorithm works. \section{The \texttt{tradelist} algorithm} \label{the tradelist algorithm} The \texttt{tradelist} code provides an algorithm, divisible into seven smaller steps, that generates a set of trades that will move the current, original portfolio towards an ideal, target portfolio. The seven steps in the algorithm correspond to the following methods of the \texttt{tradelist} class: \texttt{calcCandidates}, \texttt{calcRanks}, \texttt{calcChunks}, \texttt{calcSwaps}, \texttt{calcSwapsActual}, \texttt{calcChunksActual}, and \texttt{calcActual}. The user never needs to directly call any of these methods when using the \texttt{portfolio} package. A call to the \texttt{new} method of the \texttt{tradelist} class invokes the \texttt{initialize} method of \texttt{tradelist}. The \texttt{initialize} method then calls the seven methods serially. The first step of the \texttt{tradelist} algorithm involves determining which types of orders we must make in order to trade towards the target portfolio. \subsection{The \texttt{calcCandidates} method} \label{the calcCandidates method} As stated in our simplifying assumption, we only consider trades that bring us closer to the target portfolio. To determine candidate trades we calculate which positions have changed. If a position has changed, we determine what type of trade the candidate is (buy or sell) by taking the portfolio difference to generate a list of candidate trades. <<tl@candidates>>= tl@candidates @ Given the data stored in the \texttt{candidates} data frame and the \texttt{data} data frame, the \texttt{portfolio} package can generate the trade list. \subsection{The \texttt{calcRanks} Method} \label{calcranks} \label{the calcRanks method} Ranking the trades is possibly the most complicated task delegated to the \texttt{tradelist} class. When the rank-generating algorithm returns, the \texttt{ranks} data frame \texttt{tradelist} will contain the synthetic rank, \texttt{rank.t}, for each trade. \subsubsection{Interpretation of sort values} When we define a sort, we express our preference for purchasing different stocks. Lesser values express a preference for selling or shorting a position and greater values express a preference for buying or covering a position. In the previous example we only saw positive alpha values because all the candidates were buys. If the values were not positive, we might question why the trade was even a candidate. Recall our first simplifying assumption that all of the can\-di\-dates are de\-sir\-able and the \texttt{portfolio} package only helps us to determine which are the most desirable. In real life, we want to create a sort using meaningful values that express our trading preferences. One such value is one-day return. \subsubsection{Creating raw ranks for a long-only portfolio} \label{Creating raw ranks for a long-only portfolio} The first step in creating ranks is generating raw ranks. We break the trades into separate data frames by side and rank the trades within each side because one type of trade is no than another type of trade. <<echo=false>>= ranks <- tl@rank.sorts$ret.1.d ranks <- split(ranks, ranks$side) ranks$B$rank <- 1:nrow(ranks$B) ranks$S$rank <- 1:nrow(ranks$S) ranks @ The \texttt{\$B} data frame shows the buys ranked with other buys and the \texttt{\$S} data frame shows the sells ranked with other sells. The most desirable buys are those associated with the greatest values in \texttt{ret.1.d}. The most desirable sells are those associated with the least value in \texttt{ret.1.d}. Therefore, \Sexpr{ranks[["B"]][1,]} ranked 1 amongst buys, is the most desirable buy, and \Sexpr{ranks[["S"]][1,]}, ranked 1 amongst sells, is the most desirable sell.\footnote{We have taken the inverse of all the one-day return values so that the \texttt{portfolio} package interprets them correctly. If we believe one-day reversal, the best buys have negative one-day returns and the best sells have positive one-day returns. Buy low, sell high. However, the \texttt{portfolio} package interprets greater values as indicative of the best buys and lesser values as indicate of the best sells.} \subsubsection{Interleaving} \label{interleaving} We now have two tables of ranks and there are still multiple trades at each rank: a buy and sell ranked number one, number two and so on. Combining the two tables of ranks by type leaves us with duplicates: <<echo=false>>= tmp <- rbind(ranks$B, ranks$S)[order(rbind(ranks$B, ranks$S)[["rank"]]),] tmp[,!names(tmp) %in% "id"] @ We argue that there is no natural way to choose between the best buy and best sell. To deal with this ambiguity, we always break ties in rank between a buy and sell by assigning the buy the higher rank. In the following table, we create new raw ranks to eliminate the duplicates. <<echo=false>>= tl@rank.sorts[["alpha"]][,!names(tl@rank.sorts[["alpha"]]) %in% "id"] @ Notice that each candidate has a unique rank and that the rows alternate between buy and sell candidates. The best ranked candidate trade is a buy because we broke the tie for first between the best ranked buy and sell by assigning the buy the higher rank. This pattern repeats throughout the data frame because we have ties at every rank except the last. We call this process of alternating between the best ranked buys and sells \emph{interleaving}. \begin{description} \item{\bf{interleaving}}: The process of breaking the trades up by side and ranking them with other trades of the same type, thereby yielding multiple trades at each rank. We always break ties in rank with the following ordering: Buys, Sells, Covers, Shorts (B, S, C, X). \end{description} \subsubsection{Weighted ranks} \label{lo weighted ranks} Having interleaved the candidates, we divide the new raw ranks by the weight assigned to one-day return, \Sexpr{sorts[["ret.1.d"]]}. <<lo weighted ranks, echo=false>>= ranks <- tl@rank.sorts[["ret.1.d"]] ranks[["rank"]] <- ranks[["rank"]]/sorts[["ret.1.d"]] ranks @ We assigned alpha a weight of 1 so the ranks remain the same. <<>>= tl@rank.sorts[["alpha"]] @ We combine the alpha and one-day return ranks into a single data frame. <<prep-duplicates, echo=false>>= alpha <- tl@rank.sorts[["alpha"]] ret.1.d <- tl@rank.sorts[["ret.1.d"]] alpha <- alpha[,!names(alpha) %in% "alpha"] ret.1.d <- ret.1.d[,!names(ret.1.d) %in% "ret.1.d"] duplicates <- rbind(alpha, ret.1.d) duplicates <- duplicates[order(duplicates$id),] row.names(duplicates) <- 1:nrow(duplicates) @ <<duplicates, echo=false>>= duplicates @ To remove duplicates, we assign each candidate the best weighted rank associated with it by any sort. <<prep-top.ranks, echo=false>>= tl.ranks <- tl@ranks @ <<top.ranks, echo=false>>= top.ranks <- aggregate(duplicates[c("rank")], by = list(id = duplicates$id), min) tl.ranks$rank <- top.ranks$rank[match(tl.ranks$id, top.ranks$id)] tl.ranks[order(tl.ranks$rank), !names(tl@ranks) %in% c("id", "alpha", "ret.1.d", "rank.t")] @ And we re-rank the candidates. <<echo=false>>= tl.ranks$rank <- rank(tl.ranks$rank) tl.ranks <- tl.ranks[, !names(tl.ranks) %in% c("id", "alpha", "ret.1.d")] tl.ranks[order(tl.ranks$rank), !names(tl@ranks) %in% c("id", "alpha", "ret.1.d", "rank.t")] @ \subsubsection{Mapping to the truncated normal distribution} Having weighted the ranks we create synthetic ranks from a truncated normal distribution. When we only have buys, we scale the weighted ranks to $[0.85,1)$. This gives us the positive tail of the normal distribution. We associate more negative values with better sells so we want to map sells to the negative tail of the normal distribution. To do this, we scale sells to the interval $(0,0.15]$. <<scaled.ranks.lo, echo=false>>= misc$scaled.ranks.lo @ We map the scaled ranks to the normal distribution. <<pre.rank.t, echo=false>>= tl.ranks <- tl@ranks[order(tl@ranks$rank.t),!names(tl.ranks) %in% "id"] @ <<rank.t>>= tl.ranks @ \texttt{rank.t} expresses the synthetic rank. All of the sells have a negative \texttt{rank.t} because they have been mapped to the negative tail of the normal distribution, while all of the buys have a positive \texttt{rank.t} because they have been mapped to the other tail. As described in section \ref{Synthetic rank and trade-cost adjustment of large portfolios}, the synthetic ranks do not fall at the extreme tail of the normal distribution. \subsection{The \texttt{calcChunks} Method} \label{calcChunks lo} Having calculated synthetic ranks, the \texttt{portfolio} package creates the chunks table. We defined the market value of each chunk by specifying the \texttt{chunk.usd} parameter in the call to \texttt{new}. The addition of sells does not have a dramatic effect on the manner in which we generate the chunk table besides contributing negative trade-cost adjusted ranks. <<prep-chunks, echo=false>>= sub.chunks <- tl@chunks[, c("side", "rank.t", "chunk.shares", "chunk.mv", "tca.rank")] @ <<chunks>>= sub.chunks @ Most chunks have an unsigned market value of approximately \$\Sexpr{prettyNum(tl@chunk.usd,big.mark=",")}. The only chunks of market value significantly less than \$\Sexpr{prettyNum(tl@chunk.usd,big.mark=",")} are the final chunks of a candidate. These chunks are the remainders left after dividing the rest of the order into \$\Sexpr{prettyNum(tl@chunk.usd,big.mark=",")} chunks. If we order the chunks by \texttt{tca.rank}, the second chunk of GM has been severely penalised for trade costs. <<ordered chunks>>= head(sub.chunks[order(sub.chunks[["tca.rank"]]),]) @ GM has a more negative \texttt{tca.rank} than any of the buys or sells, indicating that this is the last chunk we would trade. \subsection{The \texttt{calcSwaps} Method} \label{calcSwaps long-only} The \texttt{calcSwaps} works in as it did in the previous example, the main difference being that we pair real buy chunks with real sell chunks. We determine which trades to pair for a swap by calculating \emph{rank gain}. \begin{description} \item{\bf{rank gain}}: The difference in \texttt{tca.rank} between a buy and a sell. As the most desirable buys have a very positive \texttt{tca.rank} and the most desirable sells have a very negative \texttt{tca.rank}, the best swaps have great \texttt{rank.gain} values. \end{description} Buys with high \texttt{tca.rank} have been matched with sells with low \texttt{tca.rank}. <<prep-swaps,echo=FALSE>>= swaps.sub <- tl@swaps[, c("side.enter", "tca.rank.enter", "side.exit", "tca.rank.exit", "rank.gain")] @ <<>>= swaps.sub @ We have paired almost all of the buy chunks with real sell chunks. The only buy we have not paired with a real sell chunk is the second chunk of GM. As the target portfolio (\$\Sexpr{prettyNum(te,big.mark=",")}) has approximately the same market value as the current portfolio (\$\Sexpr{prettyNum(oe,big.mark=",")}), we will not introduce any dummy chunks to account for over or under-investment. We pair GM with a dummy chunk only because we have run out of real sell chunks to match it with. As we would rather make swaps which contain a real buy and sell chunk, we assign the dummy sell chunk a poor \texttt{tca.rank} which yields a low \texttt{rank.gain} value. Consequently, we will not consider this trade until we have considered all of the other trades. \subsection{The \texttt{calcSwapsActual} Method} \label{calcSwapsActual} The remaining steps of the \texttt{tradelist} algorithm clean up the \texttt{tradelist} for final use. In the \texttt{calcSwapsActual} method we remove the most poorly ranked swaps that exceed turnover. When we created the \texttt{tradelist}, we set \texttt{turnover} to be \$\Sexpr{prettyNum(tl@turnover,big.mark=",")}, the unsigned market value of all the candidate trades. A \texttt{turnover} of \$\Sexpr{prettyNum(tl@turnover,big.mark=",")} will allow us to complete every trade. <<prep-swaps.actual, echo=FALSE>>= sub.swaps.actual <- tl@swaps.actual[, c("side.enter", "tca.rank.enter", "side.exit", "tca.rank.exit", "rank.gain")] @ <<swaps.actual>>= sub.swaps.actual @ Right now, turnover does not cause any swaps to be dropped because it is greater than the unsigned market value of all the candidate trades, which is \$\Sexpr{prettyNum(nt,big.mark=",")}. We can cause some swaps to be dropped by setting \texttt{turnover} to a value less than \$\Sexpr{prettyNum(nt,big.mark=",")}. <<pre-turnover.text, echo=false>>= tl.bak <- tl @ <<>>= tl@turnover <- 30250 - tl@chunk.usd @ <<echo=FALSE>>= tl <- portfolio:::calcSwapsActual(tl) sub.swaps.actual <- tl@swaps.actual[, c("side.enter", "tca.rank.enter", "side.exit", "tca.rank.exit", "rank.gain")] @ When we set turnover to a value equal to one chunk less (\Sexpr{tl@chunk.usd} than the difference in market value between the original and target portfolios, the \texttt{calcSwapsActual} method excises the swap with the lowest \texttt{tca.rank}. <<sub.swaps.actual>>= sub.swaps.actual @ <<restores tl, echo=false>>= tl <- tl.bak @ We have removed the third chunk of GM from the list. \subsection{The \texttt{calcChunksActual} Method} \label{calcChunksActual} Our \texttt{tradelist} is almost complete, but first we must change the swaps back into chunks. In addition, we do not want to include any orders for dummy chunks, so we will remove those when we turn the swaps back into chunks. <<echo=FALSE>>= sub.chunks.actual <- tl@chunks.actual[,!names(tl@chunks.actual) %in% c("id", "orig", "target", "shares", "mv")] @ <<sub.chunks.actual>>= sub.chunks.actual @ All of the dummy chunks have been removed. \subsection{The Final Step: Actual Orders} In the last step of \texttt{tradelist} generation, we ``roll-up'' the actual chunks for each security to form one order per security. <<prep-tl.actual, echo=false>>= tl.actual <- tl@actual[, !names(tl@actual) %in% c("id")] @ <<tl.actual>>= tl.actual @ No rows for chunks remain in the \texttt{actual} data frame. \section{A Long-Short Example} <<echo=FALSE>>= ## clear the workspace for this example rm(list = ls()) load("tradelist.RData") ## Set portfolios for long-short example p.current <- portfolios[["p.current.ls"]] p.target <- portfolios[["p.target.ls"]] ## retrieves data for the long-short portfolio data <- data.list$data.ls ## Creates the sorts list sorts <- list(alpha = 1, ret.1.d = 1/2) ## Original Equity, Target Equity oe <- portfolio:::mvShort(p.current) + portfolio:::mvLong(p.current) te <- portfolio:::mvShort(p.target) + portfolio:::mvLong(p.target) ## Creates the tradelist so we can use different measures tl <- new("tradelist", orig = p.current, target = p.target, chunk.usd = 2500, sorts = sorts, turnover = 36825, target.equity = te, data = data) ## Necessary turnover to make all the candidate trades nt <- mvCandidates(tl) @ For the most part, the \texttt{portfolio} package treats one-sided and long-short portfolios similarly. The major difference is that we now have to take four types of trades into consideration, buys, sells, shorts, and covers. \subsection{Current and target portfolios} Our current portfolio is a superset of the holdings in the previous example. This example's current portfolio includes positions that we will short and cover. The current portfolio, \texttt{p.current}, consists of \Sexpr{nrow(p.current@shares)} positions and has a market value of \$\Sexpr{prettyNum(oe,big.mark=",")}. <<prep-p.current.shares, echo=false>>= p.current.shares <- p.current@shares[, !names(p.current@shares) %in% "id"] @ <<>>= p.current.shares @ The target portfolio is a superset of the target portfolio we used in the two previous examples. It contains all the positions in the previous target portfolio plus positions that we short or cover. <<prep-p.target.shares, echo=false>>= p.target.shares <- p.target@shares[, !names(p.target@shares) %in% "id"] @ <<p.target.shares>>= p.target.shares @ The target portfolio, \texttt{p.target}, contains \Sexpr{nrow(p.target@shares)} positions and has a market value of \$\Sexpr{prettyNum(te,big.mark=",")}. We assume that we have the additional funds necessary to increase the market value of the portfolio. \subsection{Candidate trades} We calculate the portfolio difference to determine what the candidate trades will be: <<prep-sub.candidates, echo=false>>= sub.candidates <- tl@candidates[,!names(tl@candidates) %in% "id"] @ <<sub.candidates>>= sub.candidates @ We now have buy, sell, cover, and short candidates (B, S, C, X). Buys and covers have positive market values because they increase the value of the portfolio, and sells and shorts have negative market values because they decrease the value of the portfolio. Notice that all the candidate trades necessary to reach the target positions for HAL and YHOO are not on the candidate list. We do not include all the candidate trades to reach these positions because they involve side changes. \subsubsection{Side changes and restrictions} A side change occurs when a position changes from long to short or short to long. The \texttt{portfolio} package does not allow a side change to occur during a single trading session.\footnote{Writing code so that we make a side change without creating a box position is hard. We will address this in future versions of the \texttt{portfolio} package} For a side change to occur, we must make two types of trades. We must either sell first, then short, or cover first, then buy. We only allow the first of one of these trades to occur during a single trading session. The second trade is added to the restricted list so that it may be performed during a later session. The two trades that involve side changes have been added to the \texttt{restricted} list. <<prep-restricted, echo=FALSE>>= row.names(tl@restricted) <- 1:nrow(tl@restricted) @ <<restricted>>= tl@restricted @ We have added the buy candidates for HAL and YHOO to the restricted data frame so that we do not accidentally enter a box position. The \texttt{reason} column explains why these candidates have been added to \texttt{restricted}. During this trading session we will attempt to exit the short positions for HAL and YHOO by covering these positions. In a subsequent trading session we will attempt to enter a long position by buying these equities. \subsection{Creating sorts and assigning them weights} Like in the previous example, we name the sorts and assign them weights by creating a list. <<>>= sorts <- list(alpha = 1, ret.1.d = 1/2) @ We assigned a weight of \Sexpr{sorts[["alpha"]]} to alpha and a weight of \Sexpr{sorts[["ret.1.d"]]} to one-day return. \subsection{Passing additional information to \texttt{tradelist}} We must pass a data frame with columns for \texttt{id}, \texttt{price.usd}, \texttt{volume}, \texttt{alpha}, and \texttt{ret.1.d} in the call to \texttt{new}: <<echo=false>>= row.names(data) <- data$id sub.data <- data[, c("id", "volume", "price.usd", "alpha", "ret.1.d")] @ <<necessary data>>= sub.data @ Aside from having information about additional equities, this data frame does not differ greatly from the one we passed to new in section \ref{lo new}. \subsection{Calling \texttt{new}} \label{ls new} Having gathered the components necessary to build a tradelist \texttt{tradelist}, we make a call to \texttt{new}: <<new ls tradelist>>= tl <- new("tradelist", orig = p.current, target = p.target, chunk.usd = 2000, sorts = sorts, turnover = 36825, data = data) @ <<echo=false>>= tl <- new("tradelist", orig = p.current, target = p.target, chunk.usd = 2000, sorts = sorts, turnover = 36825, target.equity = te, data = data) @ We pass 8 arguments as parameters to the \texttt{new} method. The parameters are similar to those in section \ref{lo new} with the exception of turnover which we have set to \$\Sexpr{prettyNum(tl@turnover,big.mark=",")}. The value of the candidate trades in this example is greater than the value of the candidate trades in the previous example so we must set \texttt{turnover} higher if we want to complete all of the candidate trades. \section{The \texttt{tradelist} algorithm, long-short} The way the \texttt{portfolio} package builds a long-short \texttt{tradelist} is similar to the way it builds a long-only \texttt{tradelist}. We will walk through the process of creating a long-short \texttt{tradelist} with \texttt{portfolio} and discuss the differences between creating long-only and long-short trade list. \subsection{Calculating ranks} We calculate the ranks for a long-short portfolio in much the same way we do so for a long-only portfolio. The main difference we must take into is the need to rank four types of trades with other trades of the same type. In previous examples we ranked buys and sells separately. Now we rank buys, sells, covers, and shorts separately. \subsubsection{Raw ranks with a long-short \texttt{tradelist}} As per our third simplifying assumption, we do not favour one type of trade over another type of trade. As a consequence, we split and rank the trades separately. <<echo=false>>= ranks <- tl@rank.sorts$alpha ranks <- split(ranks, ranks$side) ranks$B$rank <- 1:nrow(ranks$B) ranks$S$rank <- 1:nrow(ranks$S) ranks$X$rank <- 1:nrow(ranks$X) ranks @ Like on page \pageref{Creating raw ranks for a long-only portfolio}, the \texttt{\$B} data frame shows the buys ranked with other buys and the \texttt{\$S} data frame shows the sells ranked with other sells. The \texttt{\$C} and \texttt{\$X} data frames show covers and shorts ranked with other shorts. \subsubsection{Interleaving} The last step left us with \Sexpr{length(ranks)} sets of ranks, one for each type of trade. Up to four trades will share each rank when we combine these data frames to form a list of overall rankings and the trades will be interleaved using groups of up to four.\protect\footnote{Some of the groups may not include one trade of every type.} <<echo=false>>= tmp <- do.call(rbind, lapply(ranks, function(x) {x})) tmp <- tmp[order(tmp$rank),] tmp[,!names(tmp) %in% "id"] @ As per the third simplifying assumption, there is no natural way to choose between the best buy, sell, cover, or short. To deal with this ambiguity, we always break ties in rank between a buy, sell, cover, and short by assigning the buy the highest rank, the sell the second highest rank, the cover the third highest rank, and the short the worst rank: <<echo=false>>= tl@rank.sorts[["alpha"]][,!names(tl@rank.sorts[["alpha"]]) %in% "id"] @ Once again, each candidate has a unique rank and the rows appear in groups of buys, sells, covers, and shorts. The pattern repeats throughout he data frame because we have ties at every rank except for the last. There is no tie at the last rank because we have an odd number of candidates. \subsubsection{Weighted ranks} Having interleaved the separate rankings by type, we calculate weighted ranks. <<ls weighted ranks, echo=false>>= ranks <- tl@rank.sorts[["alpha"]] ranks[["rank"]] <- ranks[["rank"]]/sorts[["alpha"]] ranks @ We double the one-day return ranks to reflect that one-day return is less important than alpha. (Recall that lesser ranks are better.) <<>>= tl@rank.sorts[["ret.1.d"]] @ We assign each candidate the best weighted rank from either sort. We combine the data frame of the candidates ranked by alpha with the data frame of the candidates ranked by one-day return: <<prep-duplicates, echo=false>>= alpha <- tl@rank.sorts[["alpha"]] ret.1.d <- tl@rank.sorts[["ret.1.d"]] alpha <- alpha[,!names(alpha) %in% "alpha"] ret.1.d <- ret.1.d[,!names(ret.1.d) %in% "ret.1.d"] duplicates <- rbind(alpha, ret.1.d) duplicates <- duplicates[order(duplicates$id),] row.names(duplicates) <- 1:nrow(duplicates) @ <<duplicates, echo=false>>= duplicates @ To remove duplicates, we assign each candidate the best weighted rank associated with it by any sort. <<prep-top.ranks, echo=false>>= tl.ranks <- tl@ranks @ <<top.ranks, echo=false>>= top.ranks <- aggregate(duplicates[c("rank")], by = list(id = duplicates$id), min) tl.ranks$rank <- top.ranks$rank[match(tl.ranks$id, top.ranks$id)] tl.ranks[order(tl.ranks$rank), !names(tl@ranks) %in% c("id", "alpha", "ret.1.d", "rank.t")] @ Once again we generate raw ranks: <<echo=false>>= tl.ranks$rank <- rank(tl.ranks$rank) tl.ranks <- tl.ranks[, !names(tl.ranks) %in% c("id", "alpha", "ret.1.d")] tl.ranks[order(tl.ranks$rank), !names(tl.ranks) %in% c("id", "alpha", "ret.1.d", "rank.t")] @ Having created weighted ranks, we prepare for the creation of synthetic ranks. \subsubsection{Mapping to the truncated normal distribution} We create synthetic ranks from by mapping the ranks to a truncated normal distribution. We scale buys and covers to the the $85^{th}$ percentile and above and sells and shorts to the $15^{th}$ percentile and below ($(0, 0.15]\cup[0.85,1)$). <<scaled.ranks.ls, echo=false>>= misc$scaled.ranks.ls @ Finally, we map the values to the truncated normal distribution: <<pre.rank.t, echo=false>>= tl.ranks <- tl@ranks[order(tl@ranks$rank.t),!names(tl.ranks) %in% "id"] @ <<rank.t>>= tl.ranks @ \subsection{Calculating chunks} Calculating chunks for a long-short portfolio functions in almost the same manner as it would for a long-only portfolio. We set the market value of each chunk to be \Sexpr{prettyNum(tl@chunk.usd,big.mark=",")} in the call to \texttt{new}. <<prep-chunks, echo=false>>= sub.chunks <- tl@chunks[, c("side", "rank.t", "chunk.shares", "chunk.mv", "tca.rank")] @ <<chunks>>= sub.chunks @ Aside from the addition of cover and short chunks, the chunk table should appear exactly as it does in section \ref{calcChunks lo}. \subsection{Calculating Swaps} Swaps work slightly differently with a long-short tradelist than with a long-only tradelist. In a long-only tradelist we only have to pair buys and sells, but in a long-short tradelist we have to pair buys, sells, shorts, and covers. The \texttt{calcSwaps} method accounts for this by matching trades within a side. We pair shorts with covers and buys with sells: <<prep-swaps,echo=FALSE>>= swaps.sub <- tl@swaps[, c("side.enter", "tca.rank.enter", "side.exit", "tca.rank.exit", "rank.gain")] @ <<>>= swaps.sub @ In the \texttt{side.enter} column we list buys (\texttt{B}) and shorts (\texttt{X}) because the only way to enter a side is by initially buying or shorting a stock. Sells and covers move us closer to exiting the position which is why we put these trades in the \texttt{side.exit} column. Like in previous examples, the labels describe the swaps. The value to the left of the comma is the name of buy or short and the name to the right of the comma is the name of a sell or cover. The number following the period is the chunk number of the stock involved in the trade. Dummy chunks work similarly for long-short portfolios as they do for long-only portfolios. The main difference is that we must create dummy shorts and covers to pair with real covers and shorts. We create \Sexpr{length(grep("NA.0",row.names(swaps.sub)))} dummy chunks. The dummy chunks at the head of the swaps table exist because the current portfolio has a lesser market value than the target portfolio. To increase the market value of the current portfolio we want to make more buys and covers than sells. The dummy chunks at the tail of the table were created because we ran out of shorts and buys to match with real covers and sells. We assign this type of dummy trade a poor trade-cost adjusted rank. \subsection{The \texttt{calcSwapsActual} Method} The \texttt{calcSwapsActual} method works in almost exactly the same way as it does for a long-only tradelist. <<>>= sub.swaps.actual <- tl@swaps.actual[, c("side.enter", "tca.rank.enter", "side.exit", "tca.rank.exit", "rank.gain")] @ <<>>= sub.swaps.actual @ We do not remove any swaps because we set the \texttt{turnover} equal to the unsigned market value of the candidate trades. If we decrease \texttt{turnover}, some of the swaps will be excised. <<echo=false>>= tl.bak <- tl @ <<>>= tl@turnover <- nt - tl@chunk.usd @ We set turnover to equal the turnover necessary to complete all of the candidate trades (\texttt{nt}), minus the maximum size of a chunk. This guarantees that we do not make trade the worst swap, in this case \Sexpr{row.names(tl@swaps)[nrow(tl@swaps)]}. By lowering \texttt{turnover} we caused the worst ranked swap to be removed. <<echo=false>>= tl <- portfolio:::calcSwapsActual(tl) @ <<echo=false>>= sub.swaps.actual <- tl@swaps.actual[, c("side.enter", "tca.rank.enter", "side.exit", "tca.rank.exit", "rank.gain")] @ <<>>= sub.swaps.actual @ <<echo=false>>= ## restores tl to pre-swaps value tl <- tl.bak @ \subsection{Calculating actual chunks} The \texttt{calcchunksActual} method works similarly to the way it does for a long-only tradelist: <<echo=FALSE>>= sub.chunks.actual <- tl@chunks.actual[,!names(tl@chunks.actual) %in% c("id", "orig", "target", "shares", "mv")] @ <<sub.chunks.actual>>= sub.chunks.actual @ We have changed the swaps back into chunks. The additional work for a long-short portfolio involves converting buy/sell and short/cover swaps into chunks instead of just dealing with buy/sell chunks. \subsection{The \texttt{calcActual} Method} The \texttt{calcActual} method works almost exactly the same way it does for a long-only tradelist: <<>>= tl@actual @ We ``roll-up'' all the chunks into single orders. \section{Conclusion} With intelligently defined sorts, the \texttt{portfolio} package is a powerful tool for managing equity portfolios. Nonetheless, the \texttt{tradelist} code could stand for improvement in certain areas, particularly the area of trade-cost adjustment. The current method of using discrete and static boundaries for determining trade-adjusted rank should be replaced by a trade-cost adjustment function. Nonetheless, we believe that our package makes the difficult problem of trading a little bit easier. \end{document} %% \subsection{Expressing preferences amongst trades} %% We want to buy stocks that will increase in price. Buy low, sell %% high. We maintain a list of stocks which we believe will increase in %% price. With each stock we associate a value, \emph{alpha}, which %% predicts future changes in price. %% \begin{table}[!htbp] %% \begin{tabular}[c]{rr|rr} %% stock & alpha & stock & alpha \\ %% \hline %% EBAY & 2.50 & AMD & -3.02 \\ %% MSFT & 2.49 & AET & -2.84 \\ %% SCHW & 2.12 & QCOM & -2.20 \\ %% GOOG & 1.85 & HAL & \\ %% GM & 1.57 & AAPL & \\ %% IBM & 0.75 & HPQ & \\ %% YHOO & 0.23 & SUNW & \\ %% \hline %% \end{tabular} %% \end{table} %% Table \ref{portfolio difference} shows our current portfolio, target %% portfolio, and the diffence between the two. %% \begin{table}[!htbp] %% \begin{tabular}[c]{rrr|rrr|rr} %% stock & shares & price & stock & shares & price & stock & $\Delta$ shares \\ %% \hline %% IBM & 10 & 10 & IBM & 20 & 10 & EBAY & 10 \\ %% GM & 10 & 15 & GM & 20 & 15 & GM & 10 \\ %% EBAY & 10 & 20 & EBAY & 20 & 20 & GOOG & 10 \\ %% GOOG & 40 & 10 & GOOG & 50 & 10 & IBM & 10 \\ %% MSFT & 10 & 15 & MSFT & 20 & 15 & MSFT & 10 \\ %% SCHW & 0 & 20 & SCHW & 15 & 20 & SCHW & 15 \\ %% \hline %% \end{tabular} %% \caption[portfolio difference]{Current portfolio, target portfolio, %% and portfolio difference (right, center, left).\label{portfolio %% difference}} %% \end{table}