Chapter 12. Text Mining
"I think it's much more interesting to live not knowing than to have answers which might be wrong." | ||
-- Richard Feynman |
The world is awash in textual data. If you Google, Bing, or Yahoo how much of the data is unstructured, that is, in a textual format, estimates would range from 80 to 90 percent. The real number doesn't matter. What does matter is that a large proportion of the data is in a text format. The implication is that anyone seeking to find insights in the data must develop the capability to process and analyze text.
When I first started out as a market researcher, I used to manually pore through page after page of moderator-led focus groups and interviews with the hope of capturing some qualitative insight—an ...
Get R: Unleash Machine Learning Techniques now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.