An introduction to quantitative portfolio management
Mean-variance Optimization (MVO) is a portfolio selection method that Harry Markowitz introduced in his 1952 paper “Portfolio Selection.”
To simplify the problem, let us first derive a function that defines the utility of a single investment:
\[U = E[R] - \dfrac{\gamma}{2}\varepsilon\]where \(E[R]\) and \(\varepsilon\) are the expected returns and estimation error, respectively, and \(\gamma\) is a tuning parameter representing an investor’s risk aversion.
Estimating the expected returns of an asset is notoriously difficult as it is impossible to see into the future. Moreover, if perfect predictions of future returns were possible, the estimation error would be zero, making investor utility equal to future returns. That is, the risk of any given investment is defined by how inaccurate future predictions are. A simple method to estimate future asset returns is by taking the average of past returns (e.g., the mean of the past 30 months will be the forecast of the following month’s returns). Likewise, a simple method to estimate the prediction error is by taking the variances of past returns. In other words, we reformulate the equation above to derive the mean-variance utility function as
\[U = \mu - \dfrac{\gamma}{2}\sigma^2\]where \(\mu\) and \(\sigma^2\) are the mean and variance, respectively.
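To make the formula concrete, here is a minimal sketch that evaluates the mean-variance utility of a single asset from a short history of returns. The function name `mean_variance_utility`, the made-up return series, and the choice of the sample variance (`ddof=1`) are assumptions for illustration only.

```python
import numpy as np

def mean_variance_utility(returns, gamma=1.0):
    """U = mu - (gamma / 2) * sigma^2, estimated from past returns."""
    returns = np.asarray(returns, dtype=float)
    mu = returns.mean()            # historical mean as the return forecast
    sigma2 = returns.var(ddof=1)   # sample variance as the error proxy
    return mu - 0.5 * gamma * sigma2

# Hypothetical monthly returns for one asset (made-up numbers)
past_returns = [0.02, -0.01, 0.03, 0.00, 0.01]
utility = mean_variance_utility(past_returns, gamma=2.0)
print(utility)
```

A larger \(\gamma\) penalizes the variance term more heavily, so the same asset looks less attractive to a more risk-averse investor.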
The mean-variance function incorporates the assumption that two assets with the same mean and variance of returns should be equally desirable to an investor. Therefore, mean-variance utility is arguably short-sighted in scope. By failing to recognize that investors are likely interested in many other performance characteristics, mean-variance utility cannot accurately estimate an investor’s utility. For example, higher-order statistical properties of past returns (e.g., skewness and kurtosis) may be useful metrics to evaluate the error of forecasted returns.
It is important to clarify that the mean-variance utility function does not assume the normal distribution of asset returns, as is commonly mistaken. Mean-variance utility assumes that investors only care about the mean and variance of asset returns, even if they concede higher moments will better account for forecasting errors. That is, \(\sigma^2 \neq \varepsilon\), and other metrics may impact the desirability of an asset (e.g., ESG scores). Moreover, many alternative utility functions are preferred conceptual frameworks for what investors truly value. Nonetheless, MVO continues to be the most influential optimization method in academia, likely because of its simplicity and strict quantitative definition.
Now that we understand how the mean-variance function defines investor utility for a single asset, how can we modify it to determine the utility of a portfolio of multiple assets? The concept is functionally the same; we need to derive a function for the mean and variance of portfolio returns. Let us first derive the mean function as the sum of asset mean returns multiplied by their weights:
\[\begin{aligned} \text{Portfolio Mean Return} &= w_{1}\mu_{1} + w_{2}\mu_{2} + w_{3}\mu_{3} + \dots + w_{N}\mu_{N}\\ &= \sum_{i=1}^{N}w_{i}\mu_{i} \end{aligned}\]where \(w_{i}\) and \(\mu_{i}\) are the weight and mean of asset \(i\), respectively, and \(N\) is the total number of assets in the portfolio.
Technically speaking, this is all we need to continue. However, let us convert the mean portfolio returns equation into matrix form to make the math easier. Additionally, GPUs are particularly efficient at matrix operations, so most practitioners try to take advantage of GPU performance by converting large datasets into matrices.
\[\text{Portfolio Mean Return} = x^{\top}\mu\]where \(x^{\top}\) is the transpose of the vector of asset weights, \(x = (w_{1}, \dots, w_{N})^{\top}\), and \(\mu\) is the vector of asset mean returns. Finally, it must be noted that these values are time-dependent:
\[\text{Portfolio Mean Return} = x_{t}^{\top}\mu_{t}\]where the values at time \(t\) are estimated using data from preceding periods.
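As a quick sanity check, the sketch below computes the portfolio mean return both as an explicit weighted sum and as a dot product. The weights and mean returns are made-up numbers used purely for illustration.

```python
import numpy as np

# Hypothetical weights and mean returns for three assets
w = np.array([0.5, 0.3, 0.2])
mu = np.array([0.010, 0.020, 0.005])

# Summation form: w_1 * mu_1 + ... + w_N * mu_N
mean_sum = sum(w_i * mu_i for w_i, mu_i in zip(w, mu))

# Matrix form: x^T mu (for 1-D arrays, @ is the dot product)
mean_dot = w @ mu

print(mean_sum, mean_dot)
```

Both forms give the same number; the matrix form simply scales better as the number of assets grows.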
Now that we’ve got a matrix definition for the portfolio mean returns, the same logic can be applied to derive the portfolio variance:
\[\begin{aligned} \text{Portfolio Variance of Returns} &= var\left(\sum_{i=1}^{N}w_{i}\mu_{i}\right)\\ &= \sum_{i=1}^{N}\sum_{j=1}^{N}w_{i}w_{j}cov(\mu_{i},\mu_{j})\\ \end{aligned}\]where variance expansion rules are applied. To convert into matrix form, we derive:
\[\begin{aligned} & \begin{pmatrix} w_{1} & w_{2} & \dots & w_{N} \end{pmatrix} \begin{pmatrix} var(\mu_{1}) & cov(\mu_{1}, \mu_{2}) & \dots & cov(\mu_{1}, \mu_{N})\\ cov(\mu_{2}, \mu_{1}) & var(\mu_{2}) & \dots & cov(\mu_{2}, \mu_{N})\\ \vdots & \vdots & \ddots & \vdots\\ cov(\mu_{N}, \mu_{1}) & cov(\mu_{N}, \mu_{2}) & \dots & var(\mu_{N})\\ \end{pmatrix} \begin{pmatrix} w_{1}\\ w_{2}\\ \vdots\\ w_{N}\\ \end{pmatrix}\\ &= x^{\top}{\Sigma}x\\ \end{aligned}\]where \({\Sigma}\) is the variance-covariance matrix of the universe of assets.
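The equivalence between the double sum and the quadratic form is easy to check numerically. The sketch below uses simulated returns (an assumption, not real data) and compares both expressions with the variance of the realized portfolio return series.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated returns: 60 periods (rows) x 3 assets (columns)
returns = rng.normal(0.01, 0.05, size=(60, 3))
x = np.array([0.5, 0.3, 0.2])  # hypothetical portfolio weights

# Variance-covariance matrix of the assets (columns are variables)
Sigma = np.cov(returns, rowvar=False)

# Double-sum form: sum_i sum_j w_i * w_j * cov(i, j)
n = len(x)
var_double_sum = sum(x[i] * x[j] * Sigma[i, j]
                     for i in range(n) for j in range(n))

# Matrix form: x^T Sigma x
var_matrix = x @ Sigma @ x

# Direct check: sample variance of the portfolio's own return series
var_direct = np.var(returns @ x, ddof=1)

print(var_double_sum, var_matrix, var_direct)
```

All three numbers agree up to floating-point error, because `np.cov` and `np.var(..., ddof=1)` both use the same sample (N-1) normalization.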
Now that we have derived the portfolio returns mean and variance, they can be plugged into our mean-variance utility function as:
\[U = x_{t}^{\top}\mu_{t} - \dfrac{\gamma}{2}x_{t}^{\top}{\Sigma}_{t}x_{t}\]where we obtain the utility of a portfolio of assets in linear algebra format.
The ultimate goal of MVO is to find a vector of asset weights, \(x_{t}\), such that this function returns the highest possible value. To visualize the optimization procedure, let’s take a generic parabola and draw a tangent line at the point where the y-value is at its maximum. Note that the values are entirely arbitrary, and the shape of the curve may be completely different in a real-world scenario (concavity will still hold, though). Also, the x-axis represents various combinations of asset weights, which doesn’t make sense in a multivariate framework. Nonetheless, we’re just trying to visualize the optimization process:
where we can see that, at the maximum, \(\dfrac{df}{dx} = \dfrac{dy}{dx} = \dfrac{0}{\text{change in }x} = 0\). That is, if we construct a portfolio such that the slope of the tangent line to the mean-variance utility function is equal to zero, our portfolio is optimized.
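The same idea can be checked numerically in one dimension. The sketch below assumes a single asset with made-up values for \(\mu\), \(\sigma^2\), and \(\gamma\), so that utility is a simple concave parabola in the weight \(x\). It finds the maximum by brute force on a grid and compares it with the point where the slope is zero.

```python
import numpy as np

# Hypothetical single-asset inputs (arbitrary illustrative values)
mu, sigma2, gamma = 0.01, 0.04, 1.0

def utility(x):
    # One-dimensional mean-variance utility: a concave parabola in x
    return x * mu - 0.5 * gamma * sigma2 * x**2

def slope(x):
    # Derivative of the utility with respect to x
    return mu - gamma * sigma2 * x

# Brute force: evaluate utility on a grid and take the best point
grid = np.linspace(-1.0, 1.5, 2501)
x_grid = grid[np.argmax(utility(grid))]

# Calculus: the slope is zero at x = mu / (gamma * sigma2)
x_star = mu / (gamma * sigma2)

print(x_grid, x_star, slope(x_star))
```

The grid search and the zero-slope condition land on the same weight, which is the intuition the derivative-based derivation formalizes.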
Because the mean-variance utility function is concave, we can solve this maximization problem in closed form by setting its derivative with respect to \(x_{t}\) to zero and solving for \(x_{t}\) (minimizing the negative of the utility would be equivalent):
\[\begin{aligned} \max_{x_{t}}\; U &= x_{t}^{\top}\mu_{t} - \dfrac{\gamma}{2}x_{t}^{\top}{\Sigma}_{t}x_{t}\\ 0 &= \dfrac{\partial}{\partial x_{t}} \left(x_{t}^{\top}\mu_{t} - \dfrac{\gamma}{2}x_{t}^{\top}{\Sigma}_{t}x_{t}\right)\\ 0 &= \mu_{t} - \gamma{\Sigma}_{t}x_{t}\\ x_{t} &= \dfrac{1}{\gamma}{\Sigma}_{t}^{-1}\mu_{t} \end{aligned}\]where \(x_{t}\) is the optimized vector of asset weights. The third line uses the fact that \({\Sigma}_{t}\) is symmetric, so the derivative of \(x_{t}^{\top}{\Sigma}_{t}x_{t}\) is \(2{\Sigma}_{t}x_{t}\).
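To convince ourselves that the closed-form solution really is the maximum, here is a small sketch with made-up inputs. The values of `mu`, `Sigma`, and `gamma` are hypothetical. The sketch checks that the gradient vanishes at the solution and that randomly nudging the weights never improves utility.

```python
import numpy as np

gamma = 2.0
# Hypothetical mean returns and a positive-definite covariance matrix
mu = np.array([0.010, 0.020, 0.005])
Sigma = np.array([[0.040, 0.006, 0.002],
                  [0.006, 0.090, 0.004],
                  [0.002, 0.004, 0.010]])

def utility(x):
    # U = x^T mu - (gamma / 2) * x^T Sigma x
    return x @ mu - 0.5 * gamma * (x @ Sigma @ x)

# Closed form: x = (1 / gamma) * Sigma^{-1} mu
# (solve the linear system rather than forming the inverse explicitly)
x_star = np.linalg.solve(Sigma, mu) / gamma

# First-order condition: mu - gamma * Sigma x should be zero
gradient = mu - gamma * Sigma @ x_star

# Random perturbations around x_star should all lower the utility
rng = np.random.default_rng(0)
perturbed = [utility(x_star + 0.01 * rng.standard_normal(3))
             for _ in range(100)]

print(x_star, gradient, max(perturbed) <= utility(x_star))
```

Using `np.linalg.solve` instead of `np.linalg.inv` is a numerical-stability choice; both implement the same formula.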
We’re done! Of course, there are many ways to modify this process (e.g., constrain weights to non-negative values, etc.), but this is the core intuition and math behind possibly the most important financial concept for portfolio management. Now let’s dive into a real-world implementation using Python!
Mean-variance optimization is a widely used technique in portfolio management, but a lot of the inspiration for this post comes from DeMiguel et al. (2009).
First let’s import the packages we’re going to be using.
Then, we’ll use the Pandas DataReader to load our industry portfolios. These are pre-constructed portfolios that we’ll treat as individual assets. We’ll also download the Fama-French factors to obtain the risk-free rate and market factor. Finally, we subtract the risk-free rate from our industry portfolios, so all returns are in excess of the risk-free rate.
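The download itself depends on a network connection, so the sketch below only illustrates the last step, subtracting the risk-free rate from every portfolio. The column names and numbers are made up, and `df_industries`, `risk_free`, and `df_excess` are hypothetical variable names rather than the ones used in the original notebook.

```python
import pandas as pd

dates = pd.date_range("2020-01-01", periods=3, freq="MS")

# Made-up returns for two hypothetical industry portfolios
df_industries = pd.DataFrame(
    {"Food": [1.2, -0.5, 0.8], "Mines": [2.0, 0.3, -1.1]},
    index=dates,
)

# Made-up risk-free rate for the same dates
risk_free = pd.Series([0.1, 0.1, 0.2], index=dates, name="RF")

# Subtract the risk-free rate from every column, aligned on the date index
df_excess = df_industries.sub(risk_free, axis=0)

print(df_excess)
```

Passing `axis=0` makes pandas align the series with the row index, so each date's risk-free rate is removed from all portfolios on that date.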
First, create Python code for the equation we derived in the introduction, \(x_{t} = \dfrac{1}{\gamma}{\Sigma}_{t}^{-1}\mu_{t}\):
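A minimal sketch of such a function might look like the following. The name `mean_variance_weights`, its parameters, and the simulated input window are assumptions for illustration, not the original notebook's code. The optional `normalize` flag rescales the weights by the absolute value of their sum.

```python
import numpy as np

def mean_variance_weights(window, gamma=1.0, normalize=True):
    """Solve x = (1 / gamma) * Sigma^{-1} mu for one window of returns.

    window: 2-D array-like, rows are time periods and columns are assets.
    """
    window = np.asarray(window, dtype=float)
    mu = window.mean(axis=0)               # vector of mean returns
    Sigma = np.cov(window, rowvar=False)   # variance-covariance matrix
    weights = np.linalg.solve(Sigma, mu) / gamma
    if normalize:
        # Divide by |sum of weights| so each weight keeps its sign
        weights = weights / np.abs(weights.sum())
    return weights

# Simulated window: 120 periods x 4 assets (not real data)
rng = np.random.default_rng(42)
window = rng.normal(0.01, 0.05, size=(120, 4))

raw = mean_variance_weights(window, normalize=False)
weights = mean_variance_weights(window)
print(raw, weights)
```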
This function takes a window of data and returns the weights given by our closed-form mean-variance solution. For our implementation, let’s just set \(\gamma = 1\), but feel free to research what a good risk-aversion parameter might be. Also, the equation derived in the introduction only solves for unnormalized weights, which need not sum to one. So, we add a function parameter to normalize these weights such that they sum to one. The absolute value of the sum of the weights is used as the denominator so that, should the weights sum to a negative value, the sign of an asset’s allocation is not lost.
Now that we have our mean-variance function, let’s create a rolling window to calculate the weights of our portfolio at each time step:
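Here is one way the rolling estimation could be sketched. The window length of 120 periods, the simulated returns, and the helper name `mean_variance_weights` are all assumptions. At each time step the weights are estimated from the most recent `window_size` rows only.

```python
import numpy as np

def mean_variance_weights(window, gamma=1.0):
    # Closed-form weights, normalized by the absolute value of their sum
    mu = window.mean(axis=0)
    Sigma = np.cov(window, rowvar=False)
    raw = np.linalg.solve(Sigma, mu) / gamma
    return raw / np.abs(raw.sum())

# Simulated returns: 200 periods x 4 assets (not real data)
rng = np.random.default_rng(1)
returns = rng.normal(0.01, 0.05, size=(200, 4))
window_size = 120  # assumed estimation window

# Row t holds the weights estimated from rows t-window_size+1 .. t
rolling_weights = np.full_like(returns, np.nan)
for t in range(window_size - 1, len(returns)):
    window = returns[t - window_size + 1 : t + 1]
    rolling_weights[t] = mean_variance_weights(window)

print(rolling_weights[window_size - 1 :][:3])
```

The first `window_size - 1` rows stay `NaN` because there is not yet enough history to estimate the mean and covariance.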
After calculating the weights at each time step, we multiply them by the corresponding asset returns to obtain the final portfolio returns. We also shift the weights by one time step to obtain the out-of-sample returns. That is, the mean-variance optimal portfolio at time \(t\) is applied to returns at time \(t+1\).
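The shift is the step that is easiest to get wrong, so here is a tiny sketch with made-up weights and returns. The frames `df_excess` and `df_weights`, and the column labels in `df_returns`, are hypothetical stand-ins rather than the original notebook's objects.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2020-01-01", periods=5, freq="MS")

# Made-up excess returns for two assets
df_excess = pd.DataFrame(
    {"A": [0.01, 0.02, -0.01, 0.03, 0.00],
     "B": [0.00, -0.02, 0.01, 0.01, 0.02]},
    index=dates,
)

# Made-up weights; the first row is NaN, as if the window were not yet full
df_weights = pd.DataFrame(
    {"A": [np.nan, 0.6, 0.5, 0.4, 0.7],
     "B": [np.nan, 0.4, 0.5, 0.6, 0.3]},
    index=dates,
)

# Same-period product: weights estimated at t applied to returns at t
in_sample = (df_weights * df_excess).sum(axis=1, min_count=1)

# Shift weights forward one step: weights from t are applied to returns at t+1
out_sample = (df_weights.shift(1) * df_excess).sum(axis=1, min_count=1)

df_returns = pd.DataFrame(
    {"MV (In-Sample)": in_sample, "MV (Out-Sample)": out_sample}
)
print(df_returns)
```

`min_count=1` keeps rows with no valid weights as `NaN` instead of silently turning them into zero returns.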
Your df_returns should look something like this:
There is no unified method of interpreting the results. However, academic papers usually produce a range of tables and graphs to highlight common metrics and compare them to some relevant benchmarks. I won’t cover the details of every metric we use, but hopefully, the code will speak for itself.
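As a rough guide to how a few of these metrics can be computed, here is a sketch on a made-up return series. The function name `performance_summary` and the conventions used are assumptions and may differ from those behind the reported numbers. The assumed conventions are monthly data, arithmetic annualization, a per-period Sharpe ratio on excess returns, and drawdowns measured on compounded wealth.

```python
import numpy as np

def performance_summary(returns, periods_per_year=12):
    """A few common metrics for a series of per-period excess returns."""
    r = np.asarray(returns, dtype=float)
    # Compounded wealth path, starting from an initial value of 1
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + r)))
    running_max = np.maximum.accumulate(wealth)
    return {
        "Sharpe Ratio": r.mean() / r.std(ddof=1),
        "Annualized Returns": r.mean() * periods_per_year,
        "Annualized Volatility": r.std(ddof=1) * np.sqrt(periods_per_year),
        "Maximum Drawdown": (wealth / running_max - 1.0).min(),
    }

# Made-up monthly returns with one large loss
monthly = [0.10, -0.50, 0.20, 0.05]
summary = performance_summary(monthly)
print(summary)
```

In this toy series the largest peak-to-trough loss is the single 50% drop, so the maximum drawdown equals -0.5.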
Metric | 1/N | MV (In-Sample) | MV (Out-Sample) | MV (Full-Sample) |
---|---|---|---|---|
Sharpe Ratio | 0.172825 | 0.228963 | 0.0385552 | 0.238449 |
Sortino Ratio | 0.229944 | 0.307529 | 0.0421003 | 0.38138 |
Annualized Returns | 0.0809591 | 0.216877 | -0.0528296 | 0.127804 |
Annualized Volatility | 0.149322 | 0.308717 | 0.377176 | 0.162132 |
Maximum Drawdown | -0.496305 | -0.830373 | -0.996717 | -0.313587 |
Skewness | -0.621345 | 0.980511 | -1.12029 | -0.0537011 |
Kurtosis | 2.42816 | 14.248 | 22.2821 | 1.1225 |
VaR (0.05) | -0.0678418 | -0.095657 | -0.125124 | -0.0612105 |
CVaR (0.05) | -0.0990815 | -0.175262 | -0.259063 | -0.0935619 |
CEQ | 0.00652066 | 0.0164338 | -0.00172963 | 0.010065 |
Annualized Turnover | 0.483863 | 40.5315 | 45.7085 | 3.98412 |
We can see that mean-variance investing absolutely dominates the 1/N portfolio for in-sample testing, but as expected, the estimation error inherent in our out-of-sample testing completely destroys returns. Feel free to click on the legend items to add/remove them; the graph will automatically rescale.
Thanks for reading. I hope this was informative!
Feel free to download the Jupyter notebook I used to create this post.