Glossary
- agglomerative
-
Agglomerative clustering is a type of hierarchical clustering that produces clusters starting with single instances that are iteratively aggregated by similarity until all belong to a single group.
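As a rough illustration of this bottom-up process, the sketch below clusters a handful of one-dimensional points. The function name `agglomerate`, the sample data, and the choice of single linkage (distance between the closest members of two clusters) are all assumptions made for the example; they are not taken from the glossary.

```python
def agglomerate(points):
    """Repeatedly merge the two closest clusters; return every intermediate state."""
    # Start with one cluster per instance.
    clusters = [[p] for p in points]
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage (an assumption): distance between closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append([list(c) for c in clusters])
    return history


history = agglomerate([1.0, 1.5, 5.0, 5.2, 9.0])
for step in history:
    print(step)
```

Each printed step has one fewer cluster than the previous one, ending with a single group. In practice a library implementation such as Scikit-Learn's `AgglomerativeClustering` would likely be used instead of a hand-written loop.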
- application programming interface (API)
-
An application programming interface formally defines how software components communicate. A data API might provide users with a systematic way to read or fetch information from the internet. The Scikit-Learn API exposes generalized access to machine learning algorithms implemented via class inheritance.
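The idea of a shared interface implemented through inheritance can be sketched in a few lines. The `Estimator` and `MeanRegressor` classes below are hypothetical and are not Scikit-Learn classes; they only mimic the `fit`/`predict` convention that Scikit-Learn estimators follow.

```python
from abc import ABC, abstractmethod


class Estimator(ABC):
    """Hypothetical base class: every model exposes the same two methods."""

    @abstractmethod
    def fit(self, X, y):
        """Learn from training data and return self."""

    @abstractmethod
    def predict(self, X):
        """Return one prediction per row of X."""


class MeanRegressor(Estimator):
    """Toy model that always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


model = MeanRegressor().fit([[1], [2], [3]], [2.0, 4.0, 6.0])
predictions = model.predict([[10], [20]])
print(predictions)
```

Because every subclass honors the same method signatures, calling code can swap one model for another without changing how it trains or predicts.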
- bag-of-words (BOW)/continuous bag-of-words (CBOW)
-
Bag-of-words is a method of encoding text such that every document in the corpus is transformed into a vector whose length equals the number of terms in the corpus vocabulary. The primary insight of a bag-of-words representation is that meaning and similarity are encoded in vocabulary.
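A minimal sketch of this encoding follows, assuming whitespace tokenization, lowercasing, and raw term counts as the vector values. The function name `bag_of_words` and the three sample documents are made up for the example.

```python
from collections import Counter


def bag_of_words(corpus):
    """Return the sorted vocabulary and one count vector per document."""
    # Naive tokenization (an assumption): lowercase, split on whitespace.
    tokenized = [doc.lower().split() for doc in corpus]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # One position per vocabulary term, so every vector has the same length.
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


corpus = ["the cat sat", "the dog sat down", "the cat saw the dog"]
vocabulary, vectors = bag_of_words(corpus)
print(vocabulary)
for vector in vectors:
    print(vector)
```

Every vector has as many entries as there are vocabulary terms, regardless of document length. Scikit-Learn's `CountVectorizer` provides a comparable count-based encoding with more careful tokenization.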
- baleen
-
Baleen is an open source automated ingestion service for blogs to construct a corpus for natural language processing research.
- betweenness centrality
-
Given a node N in a graph G, the betweenness centrality indicates how connected G is as a result of N. Betweenness centrality is computed as the ratio of the shortest paths in G that include N to the total number of shortest paths in G.
- bias
-
Bias is one of two sources of error in supervised learning problems, computed as the difference between an estimator’s predicted value and the true value. High bias indicates that the estimator’s ...
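Returning to the betweenness centrality entry above, the ratio it describes can be sketched for a small unweighted, undirected graph. This example takes the glossary wording literally: it counts the shortest paths between pairs of other nodes and reports the share that pass through the node of interest. The function names and sample graphs are hypothetical, and library implementations such as NetworkX's `betweenness_centrality` sum per-pair fractions and normalize differently, so their values may not match.

```python
from collections import deque
from itertools import combinations


def all_shortest_paths(graph, source, target):
    """Return every shortest path from source to target (unweighted graph)."""
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                preds[nbr] = [node]
                queue.append(nbr)
            elif dist[nbr] == dist[node] + 1:
                # Another equally short route into nbr.
                preds[nbr].append(node)
    if target not in dist:
        return []
    # Walk predecessor links back from the target to the source.
    partial, complete = [[target]], []
    while partial:
        path = partial.pop()
        if path[0] == source:
            complete.append(path)
        else:
            partial.extend([p] + path for p in preds[path[0]])
    return complete


def shortest_path_ratio(graph, node):
    """Share of shortest paths between other node pairs that include node."""
    total = through = 0
    others = [n for n in graph if n != node]
    for s, t in combinations(others, 2):
        for path in all_shortest_paths(graph, s, t):
            total += 1
            through += node in path
    return through / total if total else 0.0


# Star graph: B sits between every other pair of nodes.
star = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
hub_score = shortest_path_ratio(star, "B")
leaf_score = shortest_path_ratio(star, "A")
print(hub_score, leaf_score)
```

In the star graph every shortest path between two leaves runs through the hub, so the hub scores highest and the leaves score lowest.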