Trading Volatility Roughness — Rethinking Statistical Arbitrage
Market neutral strategies account for a significant part of quant-based strategies. Just think about your typical equity long-short/pair trading/vol relative-value. All these strategies find their edge in detecting value (or mispricing) in idiosyncratic risk while eliminating the market beta. While there’s more than one way to skin a cat (or find value in one asset relative to another one), I think that there is a superior method that can be easily applied to different types of stat arb strategies.
Inspired by the 2018 paper “Buy Rough, Sell Smooth” (Paul Glasserman, Pu He), I started toying with the idea that there is a better way to replace the good old “realized-implied” volatility trading. Instead of comparing volatility levels, this approach measures the underlying volatility dynamics in terms of path roughness/smoothness.
Understanding Roughness in volatility
Before we discuss the application to this line of thinking, let’s first understand what roughness in volatility is…
We usually model assets in the financial market as following geometric Brownian motion, where the driving increments are drawn at random and are independent of each other (i.e. corr(x0, x1) = 0). In fact, the Brownian motion that drives GBM is just a special case of fractional Brownian motion (fBm) with H = 0.5, where H is the Hurst exponent, which controls the autocorrelation of the increments. In the general case, fBm has the following covariance function:
$$\mathbb{E}\left[B_H(t)\,B_H(s)\right] = \tfrac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right)$$
where H ∈ (0,1).
As H → 0, the price path becomes rougher (increments are negatively correlated), while as H → 1 (increments are positively correlated) the price path becomes smoother. Put simply, the rougher the price path, the faster the mean reversion (or decay in trend), while a smoother path is consistent with long memory in the time series.
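To make the rough/smooth distinction concrete, here is a minimal sketch that simulates fBm paths for three values of H and checks the sign of the lag-1 autocorrelation of the increments. The function names, the path length (500), and the Cholesky-based simulation are my own choices for illustration; nothing here comes from the paper.

```python
import numpy as np

def fgn_autocov(n, hurst):
    """Autocovariance of unit-step fBm increments at lags 0..n-1."""
    k = np.arange(n, dtype=float)
    return 0.5 * (np.abs(k + 1) ** (2 * hurst)
                  - 2 * np.abs(k) ** (2 * hurst)
                  + np.abs(k - 1) ** (2 * hurst))

def simulate_fbm(n, hurst, rng):
    """Exact fBm path on an integer time grid via Cholesky factorization."""
    gamma = fgn_autocov(n, hurst)
    idx = np.arange(n)
    cov = gamma[np.abs(idx[:, None] - idx[None, :])]  # Toeplitz covariance
    increments = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    return np.cumsum(increments)

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rng = np.random.default_rng(0)
lag1 = {}
for hurst in (0.1, 0.5, 0.9):
    path = simulate_fbm(500, hurst, rng)
    lag1[hurst] = lag1_autocorr(np.diff(path))
    print(f"H={hurst}: lag-1 autocorrelation of increments = {lag1[hurst]:+.2f}")
```

The theoretical lag-1 autocorrelation of fBm increments is 2^(2H−1) − 1, so it is negative for H < 0.5, zero at H = 0.5, and positive for H > 0.5. The sample values should show the same signs, up to estimation noise.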
Back to the paper in question: the researchers found that a strategy that is long the roughest-volatility stocks and short the smoothest-volatility stocks had an excess return of 6%/year over the 17 years between 2001 and 2018 (and was profitable in 13 of those 17 years).
The paper’s authors describe implied roughness as the speed at which implied vol decays across the ATM skew in strikes and maturities, so a steeper strike skew and a steeper ATM skew term structure imply rough volatility. It can also be calibrated from traded option prices, using MC simulation while solving for the Hurst index, though this method can be daunting if we want to process a large number of assets. With either method, we essentially find that a rougher volatility path is consistent with fast decay of the skew (on the strike axis) and of the ATM skew term structure (on the time axis).
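One common way to turn the skew term structure into a number is the power-law approximation from the rough volatility literature, where the ATM skew decays roughly like τ^(H − 1/2) in maturity τ. The sketch below fits that exponent by a log-log regression. The function name and the input numbers are hypothetical, and the skews are generated from the power law itself, so this only illustrates the mechanics and is not necessarily the exact estimator used in the paper.

```python
import numpy as np

def implied_hurst_from_skew(maturities, atm_skews):
    """Fit |skew| ~ c * tau**(H - 0.5) by least squares in log-log space."""
    log_tau = np.log(np.asarray(maturities, dtype=float))
    log_skew = np.log(np.abs(np.asarray(atm_skews, dtype=float)))
    slope, _intercept = np.polyfit(log_tau, log_skew, 1)
    return slope + 0.5

# Hypothetical inputs: maturities in years and ATM skews generated from the
# power law itself with H = 0.15, so the fit should simply recover that value.
maturities = np.array([1 / 12, 2 / 12, 3 / 12, 6 / 12, 1.0])
true_h = 0.15
atm_skews = -0.8 * maturities ** (true_h - 0.5)

implied_h = implied_hurst_from_skew(maturities, atm_skews)
print(f"implied H = {implied_h:.3f}")
```

On real surfaces the fit will be noisy, so the choice of tenors and how the ATM skew is measured (for example from risk reversals) matters a lot more than the regression itself.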
After understanding the “implied roughness” notion (derived from the implied volatility surface), we should understand the “realized roughness” notion. Unlike our “normally used” realized volatility measures (take your pick of any discrete-time volatility estimator, like GKYZ/Parkinson/time-based realized volatility), when we estimate our realized volatility path we generally use some integrated volatility measure (for example, a realized kernel). Without going into the nitty-gritty of how integrated variance/volatility is estimated, it is essentially a measurement that is derived from high-frequency data and is less sensitive to noise in the time series (and to the choice of sampling frequency/window) than the instantaneous volatility (our usual volatility measures).
Once we generate the historical integrated volatility estimate, we can derive our Hurst index using the scaling properties of the underlying volatility process (for a more detailed explanation, see the presentation on this by Jim Gatheral). Generally speaking, it was found that the Hurst exponent of the volatility of financial assets ranges, on average, between 0.1 and 0.2, which means that their volatility process follows a rougher path than it would if it were driven by standard Brownian motion (H = 0.5).
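The scaling approach can be sketched as follows: compute the q-th absolute moment of log-volatility increments at several lags, and read H off the slope of a log-log regression, since that moment scales like lag^(qH). The function name, the choice q = 2, and the lags 1 to 20 are assumptions of mine. The input here is a synthetic random walk (known H = 0.5) used purely as a sanity check, not a real volatility series.

```python
import numpy as np

def hurst_from_scaling(log_vol, lags, q=2.0):
    """Estimate H from m(q, lag) = mean(|x[t+lag] - x[t]|**q) ~ lag**(q*H)."""
    log_vol = np.asarray(log_vol, dtype=float)
    moments = [np.mean(np.abs(log_vol[lag:] - log_vol[:-lag]) ** q)
               for lag in lags]
    slope, _intercept = np.polyfit(np.log(lags), np.log(moments), 1)
    return slope / q

# Sanity check on a synthetic series with known H: a plain random walk
# has H = 0.5. Real log-volatility would come from an integrated-variance
# estimate instead (hypothetical input, not shown here).
rng = np.random.default_rng(1)
random_walk = np.cumsum(rng.standard_normal(20_000))
lags = np.arange(1, 21)

h_hat = hurst_from_scaling(random_walk, lags)
print(f"estimated H = {h_hat:.2f}")
```

Feeding in the log of a daily integrated-volatility estimate instead of the random walk would give the realized roughness discussed above. The estimate is sensitive to the lag range and to noise in the volatility proxy.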
A typical framework for a long-short equity model can be built on earnings releases (NLP/ML analysis), fundamental analysis, or some factor model. All these techniques have seen their alpha decay over the years as more funds and investors started using the same practices in their analysis, and as the market itself changed fundamentally in recent years: companies’ financial health has become less important than market positioning, HFT market-making (which affects short-term volatility), and FOMO-like behavior (especially by retail investors, whose share of market activity has grown steadily). This environment of decaying alpha and diminishing returns for hedge funds (which see quant strategies like market-neutral/long-short as their “bread and butter”) calls for a reassessment of how to approach quant/data analysis (especially given that high-frequency data is widely available and computation power keeps increasing exponentially).
As suggested by the paper, going long the stocks with the lowest Hurst index (roughest volatility path) and selling the stocks with the highest Hurst index (smoothest volatility path) was a significantly profitable trading strategy. This doesn’t seem evident at first. But if we think about it, implied roughness prices in a premium for short-term idiosyncratic risk, which can explain why buying stocks (not options) with high roughness and selling stocks with low roughness is profitable: the market-implied distribution prices the short-term specific risk at a premium.
One possible explanation for this phenomenon is the illiquidity premium (which can be estimated by measures like Average Daily Volume or the Amihud illiquidity measure). Still, because all the selected stocks already had high option volume, illiquidity cannot explain the alpha. That leaves a market premium for short-term idiosyncratic risk as the remaining explanation for the significant alpha (which is essentially saying, before earnings releases: “The market is buying protection for downside surprise”).
This roughness in volatility also falls in line with the fact that, on average, the market exhibits low values of the Hurst index (equity indices generally realize H around 0.1–0.2). Still, around global risk-off or market-wide events (like FOMC announcements or economic data releases), the level of H rises sharply.
From Delta One to Volatility strategies
As noted above, the paper that inspired me to toy with the idea of rough vs. smooth trading focused on trading stocks directly (meaning a Delta One type of trade), based on the risk-neutral market expectations of the underlying roughness (as implied by the options traded in the market). As I focus primarily on volatility trading, it only made sense to bring that line of thinking into my research process.
Since my research on that topic is still far from complete, I will not share the actual results and analysis just yet. That said, I will share some of my findings so far, and the way I approach it compared to the simple “sell high/buy low implied” or “realized-implied” statistical arbitrage.
My focus over the years has primarily been FX and Interest Rates. My initial research covered the G10 universe, using Bloomberg mid volatility points (three years’ worth of daily ATM, RR, and BF quotes for liquid tenors), as well as 5-min intraday data.
Most relative value strategies in volatility space consist of selling volatility in an underperforming asset (usually relative to its realized volatility) and buying volatility in a performing asset (or one whose realized-implied spread is narrower). In most cases, the volatility being sold will have the higher implied volatility, making it a positive carry trade (the theta of the short leg will more than compensate for the negative theta on the long leg).
This strategy has two significant drawbacks:
1. Most RV/vol arb strategies are constructed with vanilla options, which makes them extremely sensitive to the underlying spot path (in both gamma and vega terms).
2. Realized/historical volatility is far from being an accurate predictor of future realized returns (and volatility). The further out we try to forecast, the worse our prediction is going to be.
As we can see, the greater the lag in our regression, the worse the fit, meaning that comparing today’s implied to the realized volatility over the equivalent period will not give us an accurate assessment of whether implied volatility is priced at a premium (or a discount).
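The lag regression behind this point can be sketched in a few lines. The function name is my own, and the input is a synthetic AR(1) series (persistence 0.9) standing in for a realized volatility series, so the numbers it prints are not the results from my data. It only shows how the fit is measured at each lag.

```python
import numpy as np

def r_squared_by_lag(series, lags):
    """R^2 of regressing series[t + lag] on series[t], for each lag."""
    series = np.asarray(series, dtype=float)
    out = {}
    for lag in lags:
        corr = np.corrcoef(series[:-lag], series[lag:])[0, 1]
        out[lag] = corr ** 2  # univariate OLS: R^2 = squared correlation
    return out

# Hypothetical stand-in for a realized-volatility series: an AR(1) process
# with persistence 0.9. Real data would replace this.
rng = np.random.default_rng(2)
vol_proxy = np.zeros(5_000)
for t in range(1, len(vol_proxy)):
    vol_proxy[t] = 0.9 * vol_proxy[t - 1] + rng.standard_normal()

r2 = r_squared_by_lag(vol_proxy, lags=(1, 5, 20))
for lag, value in r2.items():
    print(f"lag {lag:>2}: R^2 = {value:.2f}")
```

For an AR(1) process the R² falls off geometrically with the lag. Real volatility is more persistent than that, but the explanatory power still drops as the horizon grows.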
Path independent strategy
As noted above, when we initiate a vol arb strategy (let’s assume long-short vol between two currency pairs), we generally have a view on the future realized volatility spread compared to the current implied spread (it doesn’t really matter to us if they both underperform their implied, as long as the spread between them works in our favor).
The biggest issue we face when structuring this kind of trade with vanilla options is that when we dynamically hedge our options book we become extremely path-dependent (unless the book consists of a variance swap replicating portfolio). So it might be that we forecasted the volatility correctly but ended up not realizing that vol (as we hedge discretely).
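The path dependence is easiest to see in the standard approximation for the P&L of a delta-hedged option. It is written here under simplifying assumptions of my own choosing: zero rates, deltas computed at the implied volatility σ_imp, and rebalancing at discrete times t_i with step Δt.

```latex
\text{P\&L} \;\approx\; \sum_i \tfrac{1}{2}\,\Gamma_{t_i}\, S_{t_i}^{2}
\left[ \left( \frac{\Delta S_{t_i}}{S_{t_i}} \right)^{2} - \sigma_{\text{imp}}^{2}\,\Delta t \right]
```

Each period’s realized-minus-implied variance is weighted by the dollar gamma at that moment. If the large moves happen while spot is far from the strike (low gamma), they contribute very little, which is how a correct volatility forecast can still produce a disappointing P&L.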
The main reason is that our gamma exposure has a bell-shaped profile (peaking at the strike and decaying as we drift away from it). One solution would be to create a portfolio that consists of equally spaced strangles. Let’s say we buy (or sell) 10 strangles in equal amounts. This is what our gamma/vega exposure will look like:
Now even if we go crazy and use 20 strangles, we will end up with the same type of exposure…
So you are probably asking now how we can “flatten” our gamma exposure. Well, the trick is simply to weight the different strikes in inverse proportion to the moneyness² (sounds complicated, but it really is not). How does that help? It overweights the low-delta strikes (the far OTM puts/calls) and underweights the near-the-money ones, creating a flat exposure that is independent of the spot path.
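Here is a minimal sketch of that flattening effect. It assumes Black-Scholes gamma with zero rates and a flat 20% vol, a 3-month tenor, and a strip of strikes from 50 to 200. It also reads the moneyness² weighting as the 1/K² weighting familiar from variance swap replication, which is my interpretation. All names and parameters are hypothetical.

```python
import numpy as np

def bs_gamma(spot, strike, vol, tenor):
    """Black-Scholes gamma with zero rates (identical for calls and puts)."""
    sd = vol * np.sqrt(tenor)
    d1 = (np.log(spot / strike) + 0.5 * sd**2) / sd
    pdf = np.exp(-0.5 * d1**2) / np.sqrt(2.0 * np.pi)
    return pdf / (spot * sd)

def dollar_gamma_profile(spots, strikes, weights, vol, tenor):
    """Portfolio dollar gamma (gamma * spot^2) for each spot level."""
    g = bs_gamma(spots[:, None], strikes[None, :], vol, tenor)
    return (g * weights[None, :]).sum(axis=1) * spots**2

def relative_spread(profile):
    """(max - min) / mean of a profile: 0 means perfectly flat."""
    return (profile.max() - profile.min()) / profile.mean()

strikes = np.arange(50.0, 201.0, 1.0)   # one option per strike
spots = np.linspace(80.0, 120.0, 41)    # spot levels to scan
vol, tenor = 0.20, 0.25

equal_w = np.ones_like(strikes)         # same notional at every strike
inv_sq_w = 1.0 / strikes**2             # weights proportional to 1/K^2

equal_profile = dollar_gamma_profile(spots, strikes, equal_w, vol, tenor)
flat_profile = dollar_gamma_profile(spots, strikes, inv_sq_w, vol, tenor)

print(f"equal weights: spread = {relative_spread(equal_profile):.2%}")
print(f"1/K^2 weights: spread = {relative_spread(flat_profile):.2%}")
```

The flatness only holds while spot stays well inside the range of strikes we actually hold, which is why in practice the wings of the strip matter.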
So this construction of our options book essentially makes us indifferent to the underlying spot path. An easier solution would be (given it’s possible) to trade OTC products like VolSwaps (or VarSwaps).
Analyzing historical volatility dynamic
Obviously, when it comes to finding trading signals in vol arb strategies, we generally look for value in owning volatility in one asset, while selling volatility in another asset (as the asset we are short funds the holding of the long leg). This strategy essentially tries to capture idiosyncratic volatility factors in the asset we are long while assuming the asset we are short will have a high beta to the market (and little to no specific volatility). As said, looking at historical volatility doesn’t help us as we try to find value in trades that have longer horizons (usually the periods I trade range between 1-month to 3-month, so forecasting for 1day-1week doesn’t give much value). Furthermore, when we concentrate on realized-implied volatility, we ignore other aspects of the implied distribution (namely skew and kurtosis).
The above reasons are perfect examples of where volatility roughness/smoothness analysis can come in quite handy…
Much as described in the original paper, we break down our data into implied roughness and realized roughness. Ideally, we would like to buy vol in currency pairs that have higher implied roughness (yet realize lower roughness), and sell the exact opposite (lower implied roughness, higher realized roughness). Now if that sounds far too complicated, hear me out…
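As a rough sketch of how that rule could be turned into a ranking, the snippet below scores each pair by realized H minus implied H (remember that lower H means rougher). The function name, the pairs, and every number are hypothetical and purely illustrative. They are not my research results.

```python
import numpy as np

def roughness_spread_signal(pairs, implied_h, realized_h):
    """Rank pairs by realized H minus implied H (lower H = rougher).

    A large positive spread means the surface prices more roughness than
    is being realized (buy-vol candidate per the rule above); a large
    negative spread is the opposite (sell-vol candidate).
    """
    spread = (np.asarray(realized_h, dtype=float)
              - np.asarray(implied_h, dtype=float))
    order = np.argsort(spread)
    return pairs[order[-1]], pairs[order[0]], dict(zip(pairs, spread))

# Hypothetical numbers purely for illustration, not research results.
pairs = ["EURUSD", "USDJPY", "AUDUSD", "USDNOK"]
implied_h = [0.20, 0.12, 0.15, 0.08]
realized_h = [0.14, 0.13, 0.18, 0.22]

buy_vol, sell_vol, spreads = roughness_spread_signal(pairs, implied_h, realized_h)
print(f"buy vol: {buy_vol}, sell vol: {sell_vol}")
```

In practice the two H estimates come from different data (the surface versus high-frequency spot), so some normalization across pairs would probably be needed before comparing them directly.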
We know that implied roughness in vol captures the behavior of the skew and term structure (fast-decaying vol across strikes and terms). This means that if we find a pair trade (two currency pairs, where we buy one and sell the other) in which we take advantage of mispriced roughness (which is far more difficult to assess correctly, hence where the value lies), we are more likely to exploit mispricing in volatility than if we were to exploit realized-implied mispricing (which, as we already saw, is very hard to forecast correctly for longer periods).
So far (again, these are not final results, but I am definitely seeing significance), I have found that less liquid currency pairs with rougher paths do outperform pairs with high liquidity and smoother paths (mostly in the front end of the curve). Now, this could be attributed to many factors (illiquidity premium, the inability of large players to scale positions, and higher friction, to name a few).
Another very interesting finding (which is consistent with the original paper) is that in periods where H moves above 0.5 the strategy doesn’t generate profit. These periods were noticeable in 2016 with Brexit and the US elections, in mid-2019 with the US/China trade war, and in March 2020. This finding can actually serve as a good indicator of when to exit these positions, as there seems to be a high degree of correlation between their performance and the “market Hurst Index”.
Obviously, there is much more to explore in that field, and I feel like I have barely scratched the surface of the applications of such analysis. With the introduction of digital assets (which should be very good candidates given their extreme nature), it feels natural to implement that kind of analysis and exploit mispricing in roughness between different assets.
Feel free to share your thoughts/ideas/comments.