Let us begin by telling you what this book is not about. It is not about the spoken word, in any way, shape, or form. In the title Mining the Talk, “the talk” means words on the page, or to be more precise, words on the electronic page, not words out of the mouth. The reason we call it “talk” is not to be cute, but to emphasize the informal nature of the data being mined. Most data that is mined, or searched, or graphed is meant to be used in this way. That is usually why the data was collected in the first place. The collection and the analysis of the data go hand in hand. Not so with the type of data we refer to as “talk”: talk is put on the earth simply to be read. It is casual, unstructured, unpredictable, and diverse. You ...

