The Handbook of News Analytics in Finance

3.4 A FRAMEWORK FOR REAL-TIME NEWS ANALYTICS

The core of our real-time news analysis engine is a scoring method that assesses the relative volume and significance of news within a specific category. For instance, we wish to identify periods when the volume of news about foreign exchange markets is abnormally high, or when there is a flurry of macroeconomic news announcements.

For a given topic, say foreign exchange news, the scoring procedure has the following parameters:

A list of keywords/key phrases and real-valued weights: (W₁, γ₁), …, (W_k, γ_k).
A rolling window size, l (typically about 5–10 minutes).
A calibration rolling window size, L (typically about 90 days).

The keywords list and the last l minutes of news are used to create a raw score, and this score is normalized/calibrated using statistics about the news over the last L days (as described below).

3.4.1 Assigning scores to news

The score at a given point in time, t, is assigned as follows: Let (w₁, …, w_k) be the vector of keyword frequencies in the time interval [t – l, t) (i.e., w_i is the number of times word/phrase W_i has appeared in the last l minutes). The raw score at time t is then defined to be the weighted sum of these frequencies:

    raw score(t) = γ₁w₁ + γ₂w₂ + … + γ_k w_k
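The computation can be sketched in a few lines of Python. This is a minimal illustration, assuming the raw score is the weighted sum of keyword frequencies over [t – l, t). The function name `raw_score`, the `(timestamp, text)` representation of news items, and the case-insensitive substring matching are hypothetical choices made for the example, not details taken from the text.

```python
from collections import Counter
from datetime import datetime, timedelta


def raw_score(news, keywords, t, window_minutes):
    """Weighted keyword count over the half-open interval [t - l, t).

    news:           iterable of (timestamp, text) pairs (assumed layout)
    keywords:       dict mapping keyword/phrase W_i to its weight gamma_i
    window_minutes: the rolling window size l, in minutes
    """
    start = t - timedelta(minutes=window_minutes)
    counts = Counter()
    for timestamp, text in news:
        # Keep only items inside [t - l, t); an item stamped exactly t is excluded.
        if start <= timestamp < t:
            lowered = text.lower()
            for phrase in keywords:
                # Simplification: plain substring counting, no tokenization.
                counts[phrase] += lowered.count(phrase.lower())
    # Assumed form of the raw score: sum over i of gamma_i * w_i.
    return sum(gamma * counts[phrase] for phrase, gamma in keywords.items())
```

Substring counting is the crudest possible matcher (it would count "euro" inside "European"), so a production system would presumably tokenize the text first.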

In this form, the raw score will tend to be high when news volume is high, and so we calibrate/normalize the score using the calibration rolling window: We maintain a record of the scores that have been assigned ...
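One way to picture such a calibration step is the sketch below. The normalization rule is not specified in the passage above, so the sketch assumes, purely for illustration, that a new raw score is ranked against the raw scores recorded over the last L days (an empirical percentile). The class name `RollingCalibrator` and its method are hypothetical.

```python
from bisect import bisect_left, insort
from collections import deque
from datetime import datetime, timedelta


class RollingCalibrator:
    """Ranks each new raw score against scores from the last L days.

    Assumption: calibration by empirical percentile; the source may use
    a different statistic.
    """

    def __init__(self, calibration_days=90):
        self.window = timedelta(days=calibration_days)
        self.history = deque()   # (timestamp, score), oldest first
        self.sorted_scores = []  # the same scores, kept in sorted order

    def update(self, timestamp, score):
        # Evict scores that have fallen out of the calibration window.
        cutoff = timestamp - self.window
        while self.history and self.history[0][0] < cutoff:
            _, old = self.history.popleft()
            del self.sorted_scores[bisect_left(self.sorted_scores, old)]
        # Fraction of retained scores strictly below the new one.
        if self.sorted_scores:
            rank = bisect_left(self.sorted_scores, score) / len(self.sorted_scores)
        else:
            rank = 0.0  # no history yet, so nothing to compare against
        # Record the new score for future calibrations.
        self.history.append((timestamp, score))
        insort(self.sorted_scores, score)
        return rank
```

A calibrated value near 1 then indicates a raw score that is high relative to recent history, rather than high merely because overall news volume is high.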


The Handbook of News Analytics in Finance by Gautam Mitra, Leela Mitra
