Information Architecture for the World Wide Web, Second Edition
by Louis Rosenfeld, Peter Morville
Name
Automated Categorization
Synopsis
Software that uses human-defined rules or pattern-matching algorithms to automatically assign controlled vocabulary metadata to documents. This is equivalent to assigning documents to categories within a taxonomy.
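As a rough illustration of the rule-based approach described above, the following minimal sketch assigns controlled vocabulary terms to a document by keyword matching. The category names, the patterns, and the `categorize` function are all hypothetical and are not taken from any of the products listed in this entry; commercial tools use far richer rules or statistical pattern matching.

```python
import re
from typing import Dict, List

# Hypothetical human-defined rules: each controlled vocabulary term maps to
# keyword patterns suggesting that a document belongs under that term.
RULES: Dict[str, List[str]] = {
    "Finance": [r"\binvoice\b", r"\bbudget\b", r"\bquarterly earnings\b"],
    "Human Resources": [r"\bbenefits\b", r"\bvacation policy\b", r"\bhiring\b"],
}


def categorize(text: str, rules: Dict[str, List[str]] = RULES) -> List[str]:
    """Return every category with at least one matching pattern, sorted by name."""
    matched = []
    for category, patterns in rules.items():
        # A single case-insensitive pattern hit is enough to assign the category.
        if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
            matched.append(category)
    return sorted(matched)


if __name__ == "__main__":
    print(categorize("Hiring plan and budget for next year"))
```

Because the sketch only matches surface strings, it also shows the limits noted under Comments: a document can receive a term without being about that topic, and anything that is not text cannot be matched at all.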
Synonyms
Automated classification, automated indexing, automated tagging.
Examples
Interwoven Metatagger, http://www.interwoven.com/products/content_intelligence/index.html
Applied Semantics Auto-Categorizer, http://www.appliedsemantics.com/as_solutions_autocat.shtml
Inktomi Search CCE Module (Content Classification Engine), http://www.inktomi.com/products/search/products/ultraseek/cce/
SemioTagger, http://www.semio.com/products/semiotagger.asp
Comments
We see great potential to integrate human expertise in designing taxonomies with software that populates those taxonomies quickly, consistently, and inexpensively. However, note that this software:
Works best on full-text document collections
Can’t index images, applications, or other multimedia
Does not adjust for user needs or business goals
Does not understand meaning
Resources
“Extracting Value from Automated Classification Tools” by Kat Hagedorn, http://argus-acia.com/white_papers/classification.html
“Word Wranglers” by Katherine C. Adams, http://www.intelligentkm.com/feature/010101/feat1.shtml