Nov 30, 2024 by Christoph Mark
In finance, the performance of an asset is often quantified by alpha (the excess returns above a benchmark return) and beta (the volatility or risk of the asset relative to a benchmark). These metrics are estimated from historical data and are often based on only short track records. Even if a long series of historical returns is available for an asset, older data may no longer represent the current market dynamics due to regime switches. In this post, we develop time-varying, probabilistic versions of alpha and beta, and go through the process of using these new metrics to build a dynamic stocks/bonds portfolio to improve on the classic 60/40 portfolio. To accomplish this, we employ probabilistic programming, a technique to build probabilistic models with ease in Python and infer parameters from noisy data.
In our toy example, we want to invest in the S&P 500 via the ETF SPY and hedge against market crashes by investing in TLT, an ETF of 20+ year treasury bonds. Note that we will not discuss whether TLT really is a good hedge against market crashes in this post; we use this pair of assets simply because it is often praised as a stable long-term investment in the form of the 60/40 portfolio. We will not do any intra-day trading; our goal is to use daily data to detect regime switches on longer time scales, often multiple years.
Next, we need to define alpha and beta. We do this via the Capital Asset Pricing Model (CAPM), in our case also called the Single Index Model (SIM). In particular, we model the log-return $r_t$ of TLT as follows:
$$r_t - r_f = \alpha + \beta \, (r_{m,t} - r_f) + \epsilon_t$$
where $r_f$ denotes the risk-free rate, $r_{m,t}$ denotes the return of the market (in our case, the return of SPY), and $\epsilon_t$ denotes Gaussian noise with standard deviation $\sigma$. This equation implies some rather unrealistic assumptions. First, it assumes that the asset log-returns (minus the risk-free rate) are linearly related to the market log-returns (minus the risk-free rate), whereas financial assets are known to feature highly non-linear relations. Second, the CAPM assumes that the random fluctuations follow a Gaussian distribution, even though it is known that financial time series commonly follow heavy-tailed distributions (for which extreme values occur more frequently). Third, the model defined above assumes that $\alpha$ and $\beta$ are constants that do not change over time. The performance of many financial assets, however, will depend on macro-economic conditions, product life-cycles, changes in management, etc.
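To make the constant-parameter model concrete, here is a minimal sketch that simulates asset log-returns from it. The function name `simulate_sim_returns` and all numerical values are hypothetical and chosen only for illustration; they are not estimates for SPY or TLT.

```python
import numpy as np

def simulate_sim_returns(market_returns, risk_free, alpha, beta, sigma, seed=0):
    """Draw asset log-returns from the constant-parameter single index model:

        r_asset = r_f + alpha + beta * (r_market - r_f) + noise,
        noise ~ Normal(0, sigma^2)
    """
    rng = np.random.default_rng(seed)
    market_returns = np.asarray(market_returns, dtype=float)
    noise = rng.normal(0.0, sigma, size=market_returns.shape)
    return risk_free + alpha + beta * (market_returns - risk_free) + noise

# Hypothetical daily values (illustration only, not fitted to real data)
rng = np.random.default_rng(42)
market = rng.normal(0.0004, 0.01, size=2500)   # synthetic "market" log-returns
asset = simulate_sim_returns(
    market, risk_free=0.0001, alpha=0.0002, beta=-0.3, sigma=0.008
)
```

Because alpha, beta and sigma are fixed for the whole series, every simulated day follows the same linear relation to the market. This is exactly the third assumption criticized above.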
Before we see how to overcome at least some of the issues described above, we turn the formula into a Python function that specifies the likelihood function of the model. This likelihood function returns the probability (density) of observing a given log-return, given the values of the parameters $\alpha$, $\beta$, and $\sigma$ (the standard deviation of the noise), and the additional data points $r_{m,t}$ (the market log-return) and $r_f$ (the risk-free rate). This likelihood function can then be used to infer parameter values from data, for example by Maximum Likelihood Estimation or Bayesian inference. We will discuss a special case of Bayesian inference for this model below.
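Written as a formula rather than as code, the likelihood of a single asset log-return $r_t$, given the market log-return $r_{m,t}$, the risk-free rate $r_f$, and the parameters $\alpha$, $\beta$, and $\sigma$ (the standard deviation of the Gaussian noise), is the Gaussian density
$$p(r_t \mid \alpha, \beta, \sigma, r_{m,t}, r_f) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\left(r_t - r_f - \alpha - \beta \, (r_{m,t} - r_f)\right)^2}{2\sigma^2}\right)$$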
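The likelihood described above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not necessarily the implementation used later in this post: the function names `sim_log_likelihood` and `fit_mle` are hypothetical, the code works with the log of the density (summed over all observations) for numerical stability, and the data are synthetic.

```python
import numpy as np

def sim_log_likelihood(asset_returns, market_returns, risk_free, alpha, beta, sigma):
    """Sum of Gaussian log-densities of the observed asset log-returns."""
    asset_returns = np.asarray(asset_returns, dtype=float)
    market_returns = np.asarray(market_returns, dtype=float)
    # Model mean: r_f + alpha + beta * (r_market - r_f)
    mean = risk_free + alpha + beta * (market_returns - risk_free)
    resid = asset_returns - mean
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (resid / sigma) ** 2)

def fit_mle(asset_returns, market_returns, risk_free):
    """Maximum likelihood estimates; under Gaussian noise these coincide
    with an ordinary least-squares fit of the excess returns."""
    x = np.asarray(market_returns, dtype=float) - risk_free
    y = np.asarray(asset_returns, dtype=float) - risk_free
    beta, alpha = np.polyfit(x, y, 1)                      # slope, intercept
    sigma = np.sqrt(np.mean((y - alpha - beta * x) ** 2))  # MLE divides by n
    return alpha, beta, sigma

# Synthetic data with hypothetical "true" parameters (illustration only)
rng = np.random.default_rng(0)
risk_free = 0.0001
market = rng.normal(0.0004, 0.01, size=2000)
asset = risk_free + 0.0002 - 0.3 * (market - risk_free) + rng.normal(0.0, 0.008, size=2000)

alpha_hat, beta_hat, sigma_hat = fit_mle(asset, market, risk_free)
ll_hat = sim_log_likelihood(asset, market, risk_free, alpha_hat, beta_hat, sigma_hat)
```

Maximum Likelihood Estimation picks the parameter values that maximize this function, so `ll_hat` is at least as large as the log-likelihood evaluated at any other parameter values, including the ones used to generate the data. Bayesian inference instead combines the same likelihood with a prior distribution over the parameters.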