May 2020
Intermediate to advanced
404 pages
10h 52m
English
You will often come across the term corpus while studying NLP. In everyday usage, a corpus is a collection of writings by a single author or from a single genre of literature. In NLP, the dictionary definition is broadened slightly: a corpus is a collection of written text documents that can all be grouped together by some chosen criterion. Such criteria include author, publisher, genre, type of writing, time period, and other features associated with written texts.
For example, a collection of Shakespeare's works and the threads on a forum about a given topic can each be considered a corpus.
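To make the idea of grouping documents by a chosen criterion concrete, here is a minimal Python sketch. The `documents` list, its metadata fields, and the `build_corpora` helper are all hypothetical names introduced for illustration; the texts are short placeholder quotations, not a real dataset.

```python
from collections import defaultdict

# Hypothetical toy documents: each has metadata and a short placeholder text.
documents = [
    {"author": "Shakespeare", "genre": "tragedy",
     "text": "To be, or not to be"},
    {"author": "Shakespeare", "genre": "comedy",
     "text": "The course of true love never did run smooth"},
    {"author": "Austen", "genre": "novel",
     "text": "It is a truth universally acknowledged"},
]


def build_corpora(docs, key):
    """Group document texts into corpora keyed by one metadata field."""
    corpora = defaultdict(list)
    for doc in docs:
        corpora[doc[key]].append(doc["text"])
    return dict(corpora)


# The same documents yield different corpora depending on the criterion.
by_author = build_corpora(documents, "author")
by_genre = build_corpora(documents, "genre")

print(sorted(by_author))
print(len(by_author["Shakespeare"]))
```

Grouping by `"author"` places both Shakespeare texts in one corpus, while grouping by `"genre"` separates them, which illustrates that what counts as a corpus depends on the criterion you choose.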