4 Textual similarity

This chapter covers

  • Representing data for authorship analysis with deep learning
  • Applying classifiers to authorship attribution
  • Understanding the merits of MLPs and CNNs for authorship attribution
  • Verifying authorship with Siamese networks

One of the most common tasks in natural language processing (NLP) is determining whether two texts are similar. Applications include

  • Document retrieval—Determining query-result similarity

  • Topic labeling—Assigning a topic to an unlabeled text based on similarity with a set of labeled texts

  • Authorship analysis—Determining whether a text is written by a certain author, based on texts attributed to that author
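All three applications reduce to scoring how close one text is to another. As a rough illustration only, the sketch below scores two texts by the cosine similarity of their word-count vectors and uses that score for a toy version of topic labeling. The bag-of-words representation, the whitespace tokenizer, and the helper names (`bag_of_words`, `cosine_similarity`, `most_similar_label`) are assumptions made for this example; they are not the deep learning methods this chapter develops.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercased, whitespace-separated tokens (a deliberately crude tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a, counts_b = bag_of_words(text_a), bag_of_words(text_b)
    # Only words occurring in both texts contribute to the dot product.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_label(text, labeled_texts):
    """Return the label whose reference text scores highest against `text`."""
    return max(labeled_texts, key=lambda label: cosine_similarity(text, labeled_texts[label]))
```

With hypothetical reference texts such as `{"pets": "the cat and the dog", "finance": "stocks and bonds"}`, calling `most_similar_label("the cat sat", ...)` picks `"pets"`, because only that reference shares words with the input. Word counts ignore word order and meaning, which is one reason to look at learned representations instead.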

We will approach the topic of text similarity from the perspective ...

