Documents, Fields, and Boosts
Documents
The best way to think of an index is as a searchable array of documents. A Ferret document
is a collection of fields representing a chunk of
data that you want to make searchable. Whether that chunk of data is a
database row, a Word document, or an MP3 file doesn’t matter. They are
all just documents to Ferret. A Ferret document can be represented by the
Ferret::Document class. This class extends Ruby’s Hash class, adding only a
boost attribute. In fact, as you saw in Example 1-2, documents can also be
plain Hashes, where the key is the name of the field and the value is the
data stored in the field.
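As a rough illustration of that description, a class that extends Hash and
adds only a boost attribute might look like the sketch below. SketchDocument
is a hypothetical stand-in written for this example, not Ferret's own
implementation; the default boost of 1.0 and the field names and values are
likewise assumptions.

```ruby
# Hypothetical stand-in for illustration only; this is not Ferret's source.
class SketchDocument < Hash
  # The one thing added on top of Hash: a per-document boost.
  attr_accessor :boost

  # A default boost of 1.0 is an assumption made for this sketch.
  def initialize(boost = 1.0)
    super()        # start as an empty Hash with no default value
    @boost = boost
  end
end

# Fields are stored exactly as in a Hash: field name => field data.
doc = SketchDocument.new
doc[:title]   = "Programming Ruby"
doc[:content] = "A guide to the Ruby language"
doc.boost     = 2.0

# The same fields as a plain Hash, which has no boost attribute.
plain = { :title => "Programming Ruby", :content => "A guide to the Ruby language" }

puts doc.keys.inspect
puts doc.boost
puts plain.respond_to?(:boost)
```

Because the sketch inherits from Hash, everything that works on a Hash (key
lookup, iteration over fields) works on it unchanged; the only difference
from the plain Hash is that it can carry a boost.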
The term “document” can be quite confusing. We often need to talk about the
idea of a document in an index, as distinct from the Document class that
implements it. A document can represent a PDF or a text document, or it can
represent something like a movie or a product. Make note of the formatting
we use to distinguish documents from the Document class.
Earlier we mentioned that Documents have a boost attribute, but we didn’t
say what boost was for. The boost attribute gives a document a higher
weighting in the results of a search. By using the boost attribute, you can
make more important documents appear ...