The TRADE 60 | Page 27

[ THOUGHT LEADERSHIP | LIQUIDMETRIX ]

Chris Sparrow, head of research
Melinda Bui, director of trading analytics

our expected impact changes for different trading strategies. We use our past data to 'learn' how the impact varies with strategy and then take the resultant model to use in the scenario analysis. This is why having a large repository of order data is so crucial: we need to see a lot of outcomes to be able to properly train, test and validate a model.

We want to create features (also known as analytics) that can distil a large amount of data into a smaller amount of data. We can also enrich the data by applying our domain knowledge. We do this by aggregating and combining the data into features that describe the business process we want to model. As an example, we may want to separately aggregate all of the volume from liquidity-removing trades and liquidity-providing trades to compute a ratio of active to total volume.

If we want to be able to model how our orders interact with the various order books that make up the market, we need to prepare the data, both market data and our order data. Rather than pass every new order book update to the model, we instead compute time-weighted liquidity metrics, such as depth of liquidity, on a scale of minutes rather than microseconds. This process turns highly granular order book messages into aggregated analytics (features) that describe liquidity in a more macro way.

We can apply similar approaches to our order data. Engineering order features for our model starts with a question we want to answer: what is the relationship (if any) between the trading strategy and the implementation shortfall of our order? The first task is to come up with the appropriate features that we think will help us identify a relationship. It can help to look at a volume profile that shows aggregated volume binned into 15-minute intervals. We can also show, by color, the volume done actively versus passively.
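The two features described above, the active-to-total volume ratio and the 15-minute volume profile, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the fill records here are invented, and the (timestamp, quantity, liquidity-flag) layout is an assumption about how such data might be stored.

```python
from collections import defaultdict

# Hypothetical fill records for one order:
# (seconds since order start, shares, liquidity flag)
# "A" = active (liquidity-removing), "P" = passive (liquidity-providing).
fills = [
    (30, 500, "A"), (400, 300, "P"), (950, 700, "A"),
    (1300, 400, "P"), (2100, 600, "A"),
]

# Ratio of active to total volume, as described in the text.
active = sum(qty for _, qty, flag in fills if flag == "A")
total = sum(qty for _, qty, _ in fills)
active_ratio = active / total

# Volume profile: aggregate fills into 15-minute (900-second) bins,
# keeping active and passive volume separate so they can be colored.
profile = defaultdict(lambda: [0, 0])  # bin index -> [active, passive]
for t, qty, flag in fills:
    profile[int(t // 900)][0 if flag == "A" else 1] += qty

print(f"active/total volume ratio: {active_ratio:.2f}")
for b in sorted(profile):
    a, p = profile[b]
    print(f"15-min bin {b}: active={a} passive={p}")
```

The same aggregation pattern extends naturally to a per-venue breakdown, at the cost of a harder-to-read chart, as the article notes.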
We may even want to show the venues where we got the volume (but that would make the chart very hard to read). The trading strategies that could be employed range from trading most of the volume close to the start of the order to trading most of the volume near the end of the order. Other strategies would trade their volume at a more consistent pace.

One feature we can try is to compute the time-weighted unfilled portion of the order. If most of the volume is traded at the beginning of the order, then the pace of trading slows down as the order proceeds. This proposed feature would have a small value, because most of the order volume is executed in a short amount of time at the beginning of the order. If most of the volume were traded near the end of the order, the feature would have a large value. A more symmetric strategy, like a VWAP, would have a value close to 50%, since the volume profile is close to being symmetric in time. Figure 1 shows some examples of trading strategy volume profiles along with the time-weighted average unfilled portion of the order. We will call the value of the proposed feature 'exposure'. The exposure is the area under the curve shown in Figure 1.

When we examine the trading profiles of various strategies, we also notice that some of the strategies may be symmetric, but they don't trade consistently through the lifetime of the order: there are some periods with a lot more volume and other periods with no volume at all. We can create another feature to measure this by fitting the time-weighted unfilled portion of the order with a linear model. We can define a 'roughness' feature by computing the variance of the difference between the model and the data. If the trading strategy is smooth, like a VWAP in a liquid security, the roughness will be low, whereas for strategies that may be symmetric but not consistent, the roughness will be high.
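The exposure and roughness features can both be derived from the unfilled-portion curve. The sketch below is one plausible reading of the construction described above, not the authors' code: exposure is computed as the normalised area under the piecewise-constant unfilled fraction, and roughness as the variance of residuals after an ordinary least-squares line fit to that curve. The fill data, sampling grid size and function name are all hypothetical.

```python
def exposure_and_roughness(fills, duration):
    """fills: list of (time, qty) executions; duration: order lifetime in seconds.

    exposure  = time-weighted average unfilled fraction of the order
                (the area under the unfilled-portion curve, as in Figure 1)
    roughness = variance of residuals after fitting the sampled
                unfilled-fraction curve with a straight line
    """
    total = sum(q for _, q in fills)

    # Piecewise-constant unfilled fraction: value fracs[i] holds on
    # the interval [times[i], times[i+1]).
    times, fracs = [0.0], [1.0]
    filled = 0
    for t, q in sorted(fills):
        filled += q
        times.append(float(t))
        fracs.append(1.0 - filled / total)
    times.append(float(duration))
    fracs.append(fracs[-1])

    # Exposure: integrate the step function, normalise by duration.
    area = sum(fracs[i] * (times[i + 1] - times[i]) for i in range(len(times) - 1))
    exposure = area / duration

    # Sample the step function on a regular grid for the roughness fit.
    n = 100
    xs = [duration * i / (n - 1) for i in range(n)]
    ys, j = [], 0
    for x in xs:
        while j + 1 < len(times) and times[j + 1] <= x:
            j += 1
        ys.append(fracs[j])

    # Ordinary least-squares line fit (pure Python, no dependencies).
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    roughness = sum(r * r for r in residuals) / n
    return exposure, roughness


# A VWAP-like order over one hour: equal fills every five minutes.
vwap = [(t, 100) for t in range(300, 3601, 300)]
exp_v, rough_v = exposure_and_roughness(vwap, 3600)

# A symmetric but bursty order: all volume at the start and the end.
bursty = [(60, 600), (3540, 600)]
exp_b, rough_b = exposure_and_roughness(bursty, 3600)

print(f"VWAP-like:  exposure={exp_v:.3f} roughness={rough_v:.5f}")
print(f"bursty:     exposure={exp_b:.3f} roughness={rough_b:.5f}")
```

Both toy strategies are symmetric in time, so both exposures come out near 50%, but the bursty profile sits far from its fitted line for stretches of the order, which is exactly the behaviour the roughness feature is meant to flag.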