News has always been a key factor in investment decisions. It is well established that company‐specific, macroeconomic and political news strongly influences financial markets. As technology advances and market participants become more connected, the volume and frequency of news are growing rapidly. In fact, more data was created in the past two years than in the previous 5000 years of human history, and it is estimated that in 2017 alone we created even more (Landro 2016). A significant portion of this data comes from news sources, making it humanly impossible to process all news‐related information manually.
This burgeoning abundance of news data, combined with significant developments in machine learning (ML), has brought the application of natural language processing (NLP) in finance to the fore. NLP is a subfield of artificial intelligence concerned with programming computers to process natural language corpora in order to gain useful insights. It appears in different forms across many disciplines and under various names, including (but not limited to) textual analysis, text mining, computational linguistics and content analysis (Loughran and McDonald 2016).
Using news data effectively in finance requires identifying relevant news in a timely manner. Major news can have a significant impact on the market and investor ...