14 Data and Alpha Design

By Weijia Li

Data plays a central role in alpha design. First, we need basic data to run a simulation: the price and volume of each security. No matter what kind of alpha idea you want to backtest, you need this basic data to calculate statistics such as return, Sharpe ratio, and turnover. Without these statistics, we cannot tell whether an alpha idea is good. Second, data itself can inspire alpha ideas. For example, you can plot the price/volume data for some stocks and check whether any pattern repeats in history, or apply technical analysis to the same data. If you have access to company earnings data, one natural idea is to trade stocks based on company earnings.
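To make the role of basic data concrete, here is a minimal sketch of how return, Sharpe ratio, and turnover might be computed from closing prices and a set of positions. The function name `backtest_stats`, the sample prices and positions, and the conventions used (dollar positions held from one close to the next, turnover measured as dollars traded divided by book size, 252 trading days per year) are all assumptions made for illustration. They are not definitions taken from the text, and other conventions are common.

```python
import math
import statistics


def backtest_stats(prices, positions, periods_per_year=252):
    """Compute simple backtest statistics from closing prices and positions.

    prices[t][i]    : closing price of stock i on day t
    positions[t][i] : dollar position in stock i held from the close of
                      day t to the close of day t + 1 (assumed convention)
    """
    if len(positions) != len(prices) - 1:
        raise ValueError("need exactly one row of positions per holding period")

    daily_returns = []
    daily_turnover = []
    for t, held in enumerate(positions):
        # Book size: total dollars invested, long plus short.
        book = sum(abs(p) for p in held)
        if book == 0:
            raise ValueError("book size must be non-zero on every day")

        # PnL: each position times the close-to-close return of its stock.
        pnl = sum(
            pos * (p1 / p0 - 1.0)
            for pos, p0, p1 in zip(held, prices[t], prices[t + 1])
        )
        daily_returns.append(pnl / book)

        # Turnover: dollars traded to move from yesterday's positions
        # to today's, as a fraction of book size (assumed definition).
        if t > 0:
            traded = sum(abs(new - old) for new, old in zip(held, positions[t - 1]))
            daily_turnover.append(traded / book)

    mean_ret = statistics.mean(daily_returns)
    vol = statistics.stdev(daily_returns) if len(daily_returns) > 1 else 0.0
    sharpe = mean_ret / vol * math.sqrt(periods_per_year) if vol > 0 else float("nan")

    return {
        "daily_returns": daily_returns,
        "annualized_return": mean_ret * periods_per_year,
        "sharpe": sharpe,
        "turnover": statistics.mean(daily_turnover) if daily_turnover else 0.0,
    }


# Hypothetical data: two stocks, four days of closing prices.
prices = [
    [100.0, 50.0],
    [102.0, 49.0],
    [101.0, 50.0],
    [103.0, 51.0],
]
# Hypothetical dollar-neutral positions, one row per holding period.
positions = [
    [1000.0, -1000.0],
    [1000.0, -1000.0],
    [-1000.0, 1000.0],
]

stats = backtest_stats(prices, positions)
print(stats)
```

Only prices are needed in this sketch. In a fuller simulation, volume would typically enter through liquidity filters or position limits, which are left out here.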


Finding new data has always been a critical skill for an alpha researcher. People always prefer alphas with good performance and low correlation, and a new dataset can serve both purposes. Sometimes the signals we get from one set of data are not strong enough, even after we try our best to improve them. If we can get another set of data and look at companies from a different angle, we may be able to improve the original signals. We always want to create uncorrelated alphas to diversify the alpha pool. However, even when the alpha ideas are different, alpha signals from the same dataset can still be highly correlated. There is an intrinsic correlation between the signals due ...
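One way to check how correlated two alphas are is to compare their daily PnL series. The sketch below computes a plain Pearson correlation. The helper `pearson_correlation` and the three PnL series are hypothetical and invented for illustration, and measuring correlation on daily PnL (rather than on positions or raw signals) is an assumption, not a method stated in the text.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily PnL series (made-up numbers).
# alpha_a and alpha_b stand for two ideas built on the same dataset.
alpha_a = [1.0, -0.5, 0.8, -0.2, 0.4]
alpha_b = [0.9, -0.4, 0.7, -0.3, 0.5]
# alpha_c stands for an idea built on a different dataset.
alpha_c = [0.4, 0.3, 0.0, -0.2, 0.0]

corr_ab = pearson_correlation(alpha_a, alpha_b)
corr_ac = pearson_correlation(alpha_a, alpha_c)
print(f"same dataset:      {corr_ab:.2f}")
print(f"different dataset: {corr_ac:.2f}")
```

In this made-up example the two alphas from the same dataset move together closely, while the one from a different dataset is only loosely related to them, so it would add more diversification to the alpha pool.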
