1.3 TURNING QUALITATIVE TEXT INTO QUANTIFIED METRICS AND TIME-SERIES
A central task in news analysis is to extract the informational content of news. Converting qualitative text into a machine-readable form is challenging. We may wish to determine a story's sentiment, that is, whether its informational content is positive or negative. We may go further and try to identify “by how much” the story is positive or negative, assigning a quantified sentiment score or index to each story. A major difficulty in this process is identifying the context in which a story's language is to be judged.

Sentiment may be defined in terms of how positively or negatively a human (or group of humans) interprets a story; that is, the emotive content of the story for that human. In particular, standards can be defined using experts to classify stories. Some of RavenPack's classifiers are calibrated using language training sets developed by finance experts. In addition, dictionary-based algorithms which use psychology-based interpretations of words may be used.

Since different groups of people are affected by events differently and interpret the same events differently, conflicts may arise. Moniz, Brar, and Davis (2009) give the example of the term “dividend cuts”. A dictionary-based algorithm may classify this as a negative term. In contrast, market analysts may interpret it positively, believing it indicates the company is ...
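To make the idea of a quantified sentiment score concrete, the following is a minimal sketch of a dictionary-based scorer whose per-story scores are averaged into a daily time-series. Everything in it is an assumption made for illustration: the word weights in `GENERAL_LEXICON`, the positive weight given to “dividend cuts” in `EXPERT_PHRASES`, the function names, and the sample stories and dates. It does not represent RavenPack's classifiers or any published lexicon.

```python
from collections import defaultdict
from datetime import date
import re

# Hypothetical toy lexicon: word -> sentiment weight (illustrative values only).
GENERAL_LEXICON = {"cuts": -1.0, "loss": -1.0, "growth": 1.0, "strong": 1.0}
# Hypothetical expert overrides for multi-word terms judged in a finance context.
EXPERT_PHRASES = {"dividend cuts": 0.5}


def score_story(text, lexicon, phrases=None):
    """Mean weight of all matched terms in the story; 0.0 if nothing matches."""
    lowered = text.lower()
    weights = []
    # Match expert phrases first and remove them, so their individual
    # words are not scored a second time by the general lexicon.
    for phrase, weight in (phrases or {}).items():
        weights.extend([weight] * lowered.count(phrase))
        lowered = lowered.replace(phrase, " ")
    for token in re.findall(r"[a-z]+", lowered):
        if token in lexicon:
            weights.append(lexicon[token])
    return sum(weights) / len(weights) if weights else 0.0


def daily_series(stories, lexicon, phrases=None):
    """Average story scores per day, returned as a date-ordered dict."""
    by_day = defaultdict(list)
    for day, text in stories:
        by_day[day].append(score_story(text, lexicon, phrases))
    return {day: sum(s) / len(s) for day, s in sorted(by_day.items())}


# Made-up headlines and dates, used only to exercise the functions.
stories = [
    (date(2009, 3, 2), "Company announces dividend cuts"),
    (date(2009, 3, 2), "Strong growth reported"),
    (date(2009, 3, 3), "Quarterly loss widens"),
]

dictionary_view = daily_series(stories, GENERAL_LEXICON)
expert_view = daily_series(stories, GENERAL_LEXICON, EXPERT_PHRASES)
```

Under these assumed weights the two views disagree on the first day: the plain dictionary scores the “dividend cuts” story as negative, while the expert override scores it as mildly positive, so the daily average shifts upward. This is the kind of interpretive conflict the passage describes.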