Semantic Modeling for Data

Book description

What value does semantic data modeling offer? Suppose that, as an information architect or data science professional, you have an abundance of the right data and the technology to extract business gold, but you still fail. The reason? Bad data semantics.

In this practical and comprehensive field guide, author Panos Alexopoulos takes you on an eye-opening journey through semantic data modeling as applied in the real world. You’ll learn how to master this craft to increase the usability and value of your data and applications. You’ll also explore the pitfalls to avoid and the dilemmas to overcome in building high-quality and valuable semantic representations of data.

  • Understand the fundamental concepts, phenomena, and processes related to semantic data modeling
  • Examine the quirks and challenges of semantic data modeling and learn how to effectively leverage the available frameworks and tools
  • Avoid mistakes and bad practices that can undermine your efforts to create good data models
  • Learn about model development dilemmas, including representation, expressiveness and content, development, and governance
  • Organize and execute semantic data initiatives in your organization, tackling technical, strategic, and organizational challenges

Table of contents

  1. Preface
    1. Who Should Read This Book
    2. What to Expect in This Book
    3. Book Outline
    4. Conventions Used in This Book
    5. O’Reilly Online Learning
    6. How to Contact Us
    7. Acknowledgments
  2. I. The Basics
  3. 1. Mind the Semantic Gap
    1. What Is Semantic Data Modeling?
    2. Why Develop and Use a Semantic Data Model?
    3. Bad Semantic Modeling
    4. Avoiding Pitfalls
    5. Breaking Dilemmas
  4. 2. Semantic Modeling Elements
    1. General Elements
      1. Entities
      2. Relations
      3. Classes and Individuals
      4. Attributes
      5. Complex Axioms, Constraints, and Rules
      6. Terms
    2. Common and Standardized Elements
      1. Lexicalization and Synonymy
      2. Instantiation
      3. Meaning Inclusion and Class/Relation Subsumption
      4. Part-Whole Relation
      5. Semantic Relatedness
      6. Mapping and Interlinking Relations
      7. Documentation Elements
    3. Summary
  5. 3. Semantic and Linguistic Phenomena
    1. Ambiguity
    2. Uncertainty
    3. Vagueness
    4. Rigidity, Identity, Unity, and Dependence
    5. Symmetry, Inversion, and Transitivity
    6. Closed- and Open-World Assumptions
    7. Semantic Change
    8. Summary
  6. 4. Semantic Model Quality
    1. Semantic Accuracy
    2. Completeness
    3. Consistency
    4. Conciseness
    5. Timeliness
    6. Relevancy
    7. Understandability
    8. Trustworthiness
    9. Availability, Versatility, and Performance
    10. Summary
  7. 5. Semantic Model Development
    1. Development Activities
      1. Setting the Stage
      2. Deciding What to Build
      3. Building It
      4. Ensuring It’s Good
      5. Making It Useful
      6. Making It Last
    2. Vocabularies, Patterns, and Exemplary Models
      1. Upper Ontologies
      2. Design Patterns
      3. Standard and Reference Models
      4. Public Models and Datasets
    3. Semantic Model Mining
      1. Mining Tasks
      2. Mining Methods and Techniques
    4. Summary
  8. II. The Pitfalls
  9. 6. Bad Descriptions
    1. Giving Bad Names
      1. Setting a Bad Example
      2. Why We Give Bad Names
      3. Pushing for Clarity
    2. Omitting Definitions or Giving Bad Ones
      1. When You Need Definitions
      2. Why We Omit Definitions
      3. Good and Bad Definitions
      4. How to Get Definitions
    3. Ignoring Vagueness
      1. Vagueness Is a Feature, Not a Bug
      2. Detecting and Describing Vagueness
    4. Not Documenting Biases and Assumptions
      1. Keeping Your Enemies Close
    5. Summary
  10. 7. Bad Semantics
    1. Bad Identity
      1. Bad Synonymy
      2. Bad Mapping and Interlinking
    2. Bad Subclasses
      1. Instantiation as Subclassing
      2. Parts as Subclasses
      3. Rigid Classes as Subclasses of Nonrigid Classes
      4. Common Superclasses with Incompatible Identity Criteria
    3. Bad Axioms and Rules
      1. Defining Hierarchical Relations as Transitive
      2. Defining Vague Relations as Transitive
      3. Complementary Vague Classes
      4. Mistaking Inference Rules for Constraints
    4. Summary
  11. 8. Bad Model Specification and Knowledge Acquisition
    1. Building the Wrong Thing
      1. Why We Get Bad Specifications
      2. How to Get the Right Specifications
    2. Bad Knowledge Acquisition
      1. Wrong Knowledge Sources
      2. Wrong Acquisition Methods and Tools
    3. A Specification and Knowledge Acquisition Story
      1. Model Specification and Design
      2. Model Population
    4. Summary
  12. 9. Bad Quality Management
    1. Not Treating Quality as a Set of Trade-Offs
      1. Semantic Accuracy Versus Completeness
      2. Conciseness Versus Completeness
      3. Conciseness Versus Understandability
      4. Relevancy to Context A Versus Relevancy to Context B
    2. Not Linking Quality to Risks and Benefits
    3. Not Using the Right Metrics
      1. Using Metrics with Misleading Interpretations
      2. Using Metrics with Little Comparative Value
      3. Using Metrics with Arbitrary Value Thresholds
      4. Using Metrics That Are Actually Quality Signals
      5. Measuring Accuracy of Vague Assertions in a Crisp Way
      6. Equating Model Quality with Information Extraction Quality
    4. Summary
  13. 10. Bad Application
    1. Bad Entity Resolution
      1. How Entity Resolution Systems Use Semantic Models
      2. When Knowledge Can Hurt You
      3. How to Select Disambiguation-Useful Knowledge
      4. Two Entity Resolution Stories
    2. Bad Semantic Relatedness
      1. Why Semantic Relatedness Is Tricky
      2. How to Get the Semantic Relatedness You Really Need
      3. A Semantic Relatedness Story
    3. Summary
  14. 11. Bad Strategy and Organization
    1. Bad Strategy
      1. What Is a Semantic Model Strategy About?
      2. Buying into Myths and Half-Truths
      3. Underestimating Complexity and Cost
      4. Not Knowing or Applying Your Context
    2. Bad Organization
      1. Not Building the Right Team
      2. Underestimating the Need for Governance
    3. Summary
  15. III. The Dilemmas
  16. 12. Representation Dilemmas
    1. Class or Individual?
    2. To Subclass or Not to Subclass?
    3. Attribute or Relation?
    4. To Fuzzify or Not to Fuzzify?
      1. What Fuzzification Involves
      2. When to Fuzzify
      3. Two Fuzzification Stories
    5. Summary
  17. 13. Expressiveness and Content Dilemmas
    1. What Lexicalizations to Have?
    2. How Granular to Be?
    3. How General to Be?
    4. How Negative to Be?
    5. How Many Truths to Handle?
    6. How Interlinked to Be?
    7. Summary
  18. 14. Evolution and Governance Dilemmas
    1. Model Evolution
      1. Remember or Forget?
      2. Run or Pace?
      3. React or Prevent?
      4. Knowing and Acting on Your Semantic Drift
    2. Model Governance
      1. Democracy, Oligarchy, or Dictatorship?
      2. A Centralization Story
    3. Summary
  19. 15. Looking Ahead
    1. The Map Is Not the Territory
    2. Being an Optimist, but Not Naïve
    3. Avoiding Tunnel Vision
    4. Avoiding Distracting Debates
      1. Semantic Versus Nonsemantic Frameworks
      2. Symbolic Knowledge Representation Versus Machine Learning
    5. Doing No Harm
    6. Bridging the Semantic Gap
  20. Bibliography
  21. Glossary
  22. Index

Product information

  • Title: Semantic Modeling for Data
  • Author(s): Panos Alexopoulos
  • Release date: August 2020
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492054276