1.4. The Concept of “Collection”

A collection is a group of resources that have been selected for some purpose. Similar terms are set (mathematics), aggregation (data modeling), dataset (science and business), and corpus (linguistics and literary analysis).

We prefer collection because it has fewer specialized meanings. Collection is typically used to describe personal sets of physical resources (my stamp or record album collection) as well as digital ones (my collection of digital music). We distinguish law libraries from software libraries, knowledge management systems from data warehouses, and personal stamp collections from coin collections primarily because they contain different kinds of resources. Similarly, we distinguish document collections by resource type, contrasting narrative document types like novels and biographies with transactional ones like catalogs and invoices, with hybrid forms like textbooks and encyclopedias in between.

A collection can contain identifiers for resources along with or instead of the resources themselves, which enables a resource to be part of more than one collection, like songs in playlists.
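As a minimal sketch of this idea, the Python below stores each song once and lets playlists hold only identifiers. The song titles, the identifiers (`s1`, `s2`, `s3`), and the helper names `resolve` and `collections_containing` are all hypothetical, chosen only for illustration.

```python
# Hypothetical resource store: identifier -> resource (here, a song title).
songs = {
    "s1": "Blue in Green",
    "s2": "So What",
    "s3": "Naima",
}

# Each collection holds identifiers, not copies of the songs themselves.
playlists = {
    "late_night": ["s1", "s3"],
    "classics": ["s2", "s1"],
}


def resolve(playlist_name):
    """Return the resources that a playlist's identifiers refer to."""
    return [songs[song_id] for song_id in playlists[playlist_name]]


def collections_containing(song_id):
    """Return the names of all playlists that include the given identifier."""
    return sorted(name for name, ids in playlists.items() if song_id in ids)
```

Because `s1` is stored once and referenced twice, it belongs to both playlists without being duplicated, and removing it from one playlist leaves the other untouched.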

A collection itself is also a resource. Like other resources, a collection can have description resources associated with it. An index is a description resource that records the locations and frequencies of terms in a document collection so that the collection can be searched efficiently.
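One simple way to picture such an index is as a mapping from each term to the documents and word positions where it occurs. The sketch below assumes whitespace tokenization and lowercase matching, which real search systems refine considerably. The function names `build_index` and `term_frequency` and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each term to {doc_id: [word positions]} for a document collection."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        # Assumption: terms are lowercase, whitespace-separated words.
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def term_frequency(index, term, doc_id):
    """Return how many times a term occurs in one document."""
    return len(index.get(term, {}).get(doc_id, []))


# A tiny hypothetical collection of two documents.
documents = {
    "d1": "the cat sat on the mat",
    "d2": "the dog sat",
}
index = build_index(documents)
```

Looking up a term in `index` finds every document that contains it without rescanning the documents themselves, which is what makes searching the collection efficient.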

Because collections are an important and frequently used kind ...
