Chapter 16. Implementing High-Frequency Trading Systems

Once high-frequency trading models have been identified, they are back-tested to ensure their viability. The back-testing software should be a "paper"-based prototype of the eventual live system: the back-testing engine should run on tick-by-tick data to reenact past market conditions, and the main functionality code from the back-testing modules should then be reused in the live system.

To support statistically significant inferences, the model "training" period T should be sufficiently large. By a common rule of thumb based on the central limit theorem (CLT), 30 observations is the bare minimum for any statistical significance, and 200 observations is considered a reasonable number. Because intra-day data is strongly seasonal (price and volatility changes recur at specific times throughout the day), benchmark high-frequency models are back-tested on several years of tick-by-tick data.
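The sample-size thresholds above can be turned into a quick check on a data set. The sketch below is only illustrative: the function names, the labels "insufficient", "minimal", and "reasonable", and the choice of hourly buckets are assumptions made for this example, as is the idea of applying the thresholds per time-of-day bucket to account for intra-day seasonality.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable

MIN_OBS = 30          # bare minimum quoted in the text
REASONABLE_OBS = 200  # "reasonable" number quoted in the text


def classify_sample_size(n: int) -> str:
    """Label a sample size against the 30 / 200 rule-of-thumb thresholds.

    The three labels are hypothetical names chosen for this sketch.
    """
    if n < MIN_OBS:
        return "insufficient"
    if n < REASONABLE_OBS:
        return "minimal"
    return "reasonable"


def observations_per_hour(timestamps: Iterable[datetime]) -> Dict[int, int]:
    """Count observations in each hour-of-day bucket (assumed bucket size)."""
    return dict(Counter(ts.hour for ts in timestamps))


def adequacy_by_hour(timestamps: Iterable[datetime]) -> Dict[int, str]:
    """Apply the thresholds separately to every hour-of-day bucket."""
    counts = observations_per_hour(timestamps)
    return {hour: classify_sample_size(n) for hour, n in counts.items()}
```

Under these assumptions, a bucket that sees only one usable observation per trading day would need roughly 200 trading days to reach the "reasonable" level, which is one way to see why multi-year tick archives are used.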

The main difference between the live trading model and the back-test model should be the origin of the quote data. The back-test system includes a historical quote-streaming module that reads historical tick data from archives and feeds it sequentially to the module containing the main functionality. In the live trading system, a different quote module receives real-time tick data originating at the broker-dealers.
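One way to picture this separation is a shared core module that consumes ticks from an interchangeable quote source. The following is a minimal sketch, not the book's implementation: `Tick`, `QuoteSource`, `HistoricalQuoteSource`, `LiveQuoteSource`, `CoreModel`, and `run` are hypothetical names, the toy signal rule is a placeholder for real model logic, and the live source simply wraps an iterator because the actual broker-dealer connection is outside the scope of the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Protocol


@dataclass(frozen=True)
class Tick:
    timestamp: float  # seconds since some epoch (assumed convention)
    bid: float
    ask: float


class QuoteSource(Protocol):
    """Anything that can stream ticks to the core model."""

    def ticks(self) -> Iterator[Tick]: ...


class HistoricalQuoteSource:
    """Back-test source: replays archived ticks sequentially by timestamp."""

    def __init__(self, archive: Iterable[Tick]) -> None:
        self._archive = sorted(archive, key=lambda t: t.timestamp)

    def ticks(self) -> Iterator[Tick]:
        yield from self._archive


class LiveQuoteSource:
    """Live source: forwards ticks from a real-time feed.

    The feed is a plain iterator here; a real broker-dealer connection
    would replace it and is not modeled in this sketch.
    """

    def __init__(self, feed: Iterator[Tick]) -> None:
        self._feed = feed

    def ticks(self) -> Iterator[Tick]:
        yield from self._feed


class CoreModel:
    """Main-functionality module, shared unchanged by both systems.

    Toy rule (placeholder only): signal BUY when the mid price rises,
    SELL when it falls.
    """

    def __init__(self) -> None:
        self.signals: List[str] = []
        self._last_mid: Optional[float] = None

    def on_tick(self, tick: Tick) -> None:
        mid = (tick.bid + tick.ask) / 2
        if self._last_mid is not None:
            if mid > self._last_mid:
                self.signals.append("BUY")
            elif mid < self._last_mid:
                self.signals.append("SELL")
        self._last_mid = mid


def run(source: QuoteSource, model: CoreModel) -> List[str]:
    """Feed every tick from the given source into the same core model."""
    for tick in source.ticks():
        model.on_tick(tick)
    return model.signals
```

Because `run` and `CoreModel` never look at where the ticks come from, swapping `HistoricalQuoteSource` for `LiveQuoteSource` is the only change between a back-test and a live session in this sketch.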

Except for differences in receiving quotes, both live and back-test systems should be identical; they can be ...
