Textbases
Abstract
Textbase is the current buzzword for document management systems, which deal with data kept as text, as opposed to traditional structured data, relationships, or temporal models. Text is the oldest form of data we use. Documents can be free text or semi-structured. The problem is that text can be treated as strings that have only syntax; that is, patterns of characters that can be mathematically defined and mechanically manipulated by relatively simple algorithms. Words, however, have semantics, and handling semantics requires human judgment or insanely complicated algorithms that can learn and make humanlike judgments. Most of the important business rules (laws, contracts, rules, definitions, communications, etc.) are ...