CHAPTER 2

Sentiment Analysis: Providing Categorical Insight into Unstructured Textual Data

Carol Haney, Toluna

Sentiment, in text, is the appearance of subjectivity. That subjectivity may be an opinion or an emotion the author communicates and the reader then perceives. Both components—author intent and reader perception—are important in determining sentiment in text. An example of opinion is “the weather is mild today,” where “mild” is the author's interpretation of the temperature. What is mild to someone in the Arctic zone may be downright chilly to someone who lives near the Equator. On the other hand, objective facts are those independent of the author's interpretations. For example, “the weather is 76 degrees today with 0% humidity” is objective and, as such, has neutral sentiment.

In the online survey and social media domains, a significant amount of digital data is unstructured; that is, these data are considered text. Textual data are difficult to analyze because they do not have a fixed numeric, nominal, or ordinal structure. Textual data, if structured, almost always map to a categorical (or nominal) structure. All textual data, if separated into atomic units of an idea, have an associated sentiment, which may be positive, negative, or neutral. This chapter discusses both the conceptual implications of applying sentiment to textual data and the operational steps for applying structure to the text under consideration. Digital data, especially online, increasingly ...
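To make the idea of mapping text to a categorical structure concrete, the following is a minimal sketch, not the method developed in this chapter. It assumes that a sentence is an acceptable stand-in for an "atomic unit of an idea" and uses two tiny, hypothetical word lists (`POSITIVE` and `NEGATIVE`) invented purely for illustration. Treating "mild" as positive is itself an assumption, since the polarity a reader perceives depends on context. The function names `split_units`, `label_unit`, and `label_text` are likewise hypothetical.

```python
import re

# Hypothetical toy lexicons, invented for illustration only.
# A real analysis would rely on a much larger, validated resource.
POSITIVE = {"mild", "pleasant", "great", "love"}
NEGATIVE = {"chilly", "awful", "terrible", "hate"}


def split_units(text):
    """Split text into rough 'atomic units' (assumed here to be sentences)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def label_unit(unit):
    """Assign one categorical label: positive, negative, or neutral."""
    words = re.findall(r"[a-z']+", unit.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No subjective cue words found (or cues cancel out): treat as neutral.
    return "neutral"


def label_text(text):
    """Return (unit, label) pairs, giving the text a nominal structure."""
    return [(unit, label_unit(unit)) for unit in split_units(text)]


if __name__ == "__main__":
    sample = (
        "The weather is mild today. "
        "The weather is 76 degrees today with 0% humidity."
    )
    for unit, label in label_text(sample):
        print(f"{label:8s} | {unit}")
```

Under these assumptions, the opinion sentence receives a non-neutral label because it contains a subjective cue word, while the purely factual sentence falls through to neutral. A word list of this kind captures only surface cues; it cannot account for author intent or reader perception, which is why it should be read as an illustration of the output structure and not as a workable classifier.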
