Note: Page numbers followed by f indicate figures; those followed by t indicate tables.


    on insurance claim notes, 971
    in tokenization, 47
Abstracts, roots of, 5–6
Accuracy, of text classifiers, 889
Acronyms, in tokenization, 47
Active time, 951
Aircraft accident, latent semantic indexing of, 938, 938t, 939f
Aircraft asking price, See Cessna aircraft asking price
Airline consumer sentiment, R mining of Twitter for, 134
    algorithm sanity check for, 138–139
    American Customer Satisfaction Index compared with R results, 144–147, 145f, 146f, 147f
    comparing score distributions, 142, 143f
    data.frames for, 139–140
    establishing sentiment, 137
    extracting text from Tweets, 135–136
    graphing results, 147–148, 147f, 148f ...

