Chapter 6
Uncertainty of an Event and its Markers in Natural Language Processing
6.1. Introduction
Suppose a given event is extracted by automated means from a textual document: how are we to determine the level of certainty that should be associated with it? Has the event already happened, or is it simply predicted? If it is described as having already happened, how much trust should we place in its actual occurrence, on the basis not only of the reliability of the source, but also of the semantic and temporal markers in the text from which it is extracted?
The studies presented in this chapter use linguistic analysis to model the uncertainty linked to detected events. The first step was to create a theoretical model reflecting how uncertainty is expressed in written texts. The second step, implementation, consisted of using a reference corpus of nearly 15,000 articles and 13 million words to construct dictionaries and grammars covering all of the uncertainty cases that might be expressed in the selected texts. These dictionaries and grammars map textual forms to features drawn from the previously defined model of uncertainty. Uncertainty detection was then combined with the detection of named entities and events in the texts, so that the detection and characterization of the different forms of uncertainty is limited to those relating to events. The third and final step was technological implementation: a free-text analysis software module was developed. This module ...
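To make the idea of matching dictionary entries to features more concrete, the following is a minimal sketch in Python, not the module described in this chapter. Everything in it is an assumption made for illustration: the marker dictionary `MARKERS`, its feature names (`certainty`, `kind`), the `EVENT_TRIGGERS` list standing in for a real event extractor, and the `annotate` function. It only shows how uncertainty markers might be kept when they occur in a sentence that also mentions an event, and ignored otherwise.

```python
import re
from dataclasses import dataclass

# Hypothetical dictionary: surface form -> features of an uncertainty model.
# The entries and feature names are illustrative, not taken from the chapter.
MARKERS = {
    "allegedly": {"certainty": "low", "kind": "hearsay"},
    "reportedly": {"certainty": "low", "kind": "hearsay"},
    "probably": {"certainty": "medium", "kind": "modal"},
    "is expected to": {"certainty": "medium", "kind": "prediction"},
    "confirmed": {"certainty": "high", "kind": "assertion"},
}

# Hypothetical trigger words standing in for a real event detector.
EVENT_TRIGGERS = ("attack", "meeting", "election", "explosion")


@dataclass
class Annotation:
    sentence: str
    event: str
    marker: str
    features: dict


def annotate(text):
    """Return uncertainty annotations only for sentences that mention an event."""
    annotations = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        lowered = sentence.lower()
        events = [t for t in EVENT_TRIGGERS if re.search(rf"\b{t}\b", lowered)]
        if not events:
            # Markers in sentences without an event are deliberately ignored.
            continue
        for marker, features in MARKERS.items():
            if re.search(rf"\b{re.escape(marker)}\b", lowered):
                for event in events:
                    annotations.append(Annotation(sentence, event, marker, features))
    return annotations


if __name__ == "__main__":
    sample = (
        "The attack reportedly took place on Monday. "
        "The weather will probably improve. "
        "The election is expected to be held in May."
    )
    for a in annotate(sample):
        print(a.event, a.marker, a.features)
```

In this sketch the second sample sentence contains a marker ("probably") but no event trigger, so it produces no annotation. A real system would replace the keyword lists with proper grammars and an event extractor, and would relate each marker to a specific event rather than to every event in the sentence.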