To aid the study of news sentiment, DJNA has constructed an archive going back to 1987 that is not survivor-biased. The archive comprises tens of millions of financial news stories, each of which is assigned numerous DJNA attribution tags, including the historical identification of relevant companies. DJNA tracks around 27,000 historical companies. An event study must include companies that are no longer trading; otherwise the results carry the built-in assumption that companies never go bankrupt or get acquired. Using historical ticker symbols, we match each company-relevant news story to the CRSP security identifiers of the securities that traded under those tickers when the story was published. This bit of bookkeeping is a necessary preliminary to conducting a proper survivor-unbiased study. Because CRSP is the pricing source we shall use, only US companies will be considered.2 Moreover, given the schedule of CRSP updates, the latest date of any data point in our universe (related to events or performance) is December 31, 2008.
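
The ticker-matching step can be pictured as a point-in-time lookup: a ticker alone does not identify a security, because tickers are retired and reused, so the publication date of the story has to select among the securities that have carried that symbol. The following is only a minimal sketch of that idea, not the procedure actually used with the DJNA archive or CRSP. The `TickerSpan` record, the function names, the ticker `ABC`, and the identifier values are all hypothetical, and the sketch assumes that the date ranges recorded for any one ticker do not overlap.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TickerSpan:
    """One period during which a security traded under a ticker (hypothetical layout)."""
    ticker: str
    start: date        # first day under this ticker
    end: date          # last day under this ticker, inclusive
    security_id: int   # stand-in for a CRSP security identifier


def build_index(spans: List[TickerSpan]) -> Dict[str, List[TickerSpan]]:
    """Group spans by ticker and sort each group by start date."""
    index: Dict[str, List[TickerSpan]] = {}
    for span in spans:
        index.setdefault(span.ticker, []).append(span)
    for group in index.values():
        group.sort(key=lambda s: s.start)
    return index


def match_story(index: Dict[str, List[TickerSpan]],
                ticker: str, published: date) -> Optional[int]:
    """Return the security that traded under `ticker` on `published`, or None."""
    group = index.get(ticker, [])
    starts = [s.start for s in group]
    # Locate the last span that starts on or before the publication date.
    i = bisect_right(starts, published) - 1
    if i >= 0 and group[i].end >= published:
        return group[i].security_id
    return None


# Made-up data: the ticker is retired in 1995 and reused by another company in 1998.
index = build_index([
    TickerSpan("ABC", date(1987, 1, 2), date(1995, 6, 30), 10001),
    TickerSpan("ABC", date(1998, 3, 1), date(2008, 12, 31), 20002),
])

early = match_story(index, "ABC", date(1990, 5, 1))   # first holder of the ticker
late = match_story(index, "ABC", date(2003, 1, 15))   # later holder of the same ticker
gap = match_story(index, "ABC", date(1996, 1, 1))     # nobody traded as ABC on this date
```

In this made-up example the 1990 story resolves to the first security even though that company no longer trades under the ticker, which is the behavior a survivor-unbiased study needs; a lookup keyed only on current tickers would either drop the story or attach it to the wrong company.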

The atomic measure of sentiment to be used is the DJNA MCQ ranking. If a news story mentions a company in a negative light, then the company receives a negative MCQ ranking; if a news story mentions a company in a positive light, then the company receives a positive MCQ ranking. We make this notion slightly more precise below.

Let N denote the universe of all ...
