Text classification for question tags

This section covers supervised learning. We frame the task of assigning tags to a question as a text classification problem, and we apply this approach to a dataset of questions from Stack Exchange.
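To make the framing concrete, here is a minimal sketch of tag assignment as supervised text classification: each question is a piece of text, each tag is a class label, and a model learns the mapping from labelled examples. The classifier is a toy bag-of-words Naive Bayes written with the standard library only; it is an illustrative choice, not necessarily the method used on the real dataset. The `NaiveBayesTagger` class, the `tokenize` helper, and the training questions and tags are all hypothetical. For simplicity the sketch predicts a single tag per question, although a Stack Exchange question can carry several.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTagger:
    """Toy multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, documents, tags):
        # Count how many training questions carry each tag (class priors).
        self.tag_counts = Counter(tags)
        # Count how often each word appears in questions with a given tag.
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, tag in zip(documents, tags):
            tokens = tokenize(text)
            self.word_counts[tag].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.tag_counts.values())
        vocab_size = len(self.vocabulary)
        best_tag, best_score = None, -math.inf
        for tag, count in self.tag_counts.items():
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(count / total_docs)
            total_words = sum(self.word_counts[tag].values())
            for token in tokens:
                numerator = self.word_counts[tag][token] + 1
                score += math.log(numerator / (total_words + vocab_size))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


# Hypothetical training data: made-up questions and tags, not the real dataset.
train_questions = [
    "Which episode of the series had the fake patient?",
    "Why did the season finale end with a cliffhanger episode?",
    "How do I sort a list of tuples in Python?",
    "What does this Python function return for an empty list?",
]
train_tags = ["tv", "tv", "python", "python"]

tagger = NaiveBayesTagger().fit(train_questions, train_tags)
predicted = tagger.predict("What's the episode where he fooled the team?")
print(predicted)
```

With a realistic dataset, the same structure applies: the merged title and body form the input text, and the tags attached by users serve as the training labels.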

Before introducing the details of text classification, let's consider the following question from the Movies & TV Stack Exchange website (title and body of the question have been merged):

"What's the House MD episode where he hired a woman to fake dead to fool the team? I remember a (supposedly dead) woman waking up and giving a high-five to House. Which episode was this from?"

The preceding question asks for details about a particular episode of the popular TV series House, M.D. As described earlier, questions on Stack ...

