Before continuing on a journey through information management, it is important to have a common understanding of how information and data can be described in a structured way. Even unstructured data, such as documents, contains or relates to some form of structured data, such as the fields in a database.
Apart from truly random data sets, which have only limited value, every data set or document contains relationships. These relationships could exist between database fields, through a structure within a document, or as assumed associations between Web pages through keywords.
A very useful mathematical tool called graph theory can be applied to gain a much deeper understanding of data. Graph theory describes networks of nodes. In mathematics, network theory is formally called graph theory, so for the balance of this chapter, consider the words network and graph to be synonymous.
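To make the idea of a network of nodes concrete, the following is a minimal sketch in Python. It assumes a small, invented set of Web pages and keywords (the page names, the keywords, and the `build_network` helper are all hypothetical) and connects any two pages that share at least one keyword. It is one possible illustration, not a prescribed method.

```python
from itertools import combinations

# Hypothetical pages and their keywords, invented purely for illustration.
pages = {
    "home": {"products", "contact"},
    "catalog": {"products", "pricing"},
    "support": {"contact", "faq"},
    "blog": {"news"},
}


def build_network(items):
    """Return a mapping of node -> set of connected nodes.

    Two nodes are connected when their keyword sets overlap.
    """
    network = {name: set() for name in items}
    # Examine every unordered pair of nodes exactly once.
    for a, b in combinations(items, 2):
        if items[a] & items[b]:  # at least one shared keyword
            network[a].add(b)
            network[b].add(a)
    return network


network = build_network(pages)
for node in sorted(network):
    print(node, "->", sorted(network[node]))
```

In this sketch the connections are undirected: if one page is linked to another, the reverse also holds. A page with no shared keywords, such as the hypothetical "blog" page, still appears in the network as a node without any connections.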
Each node in the graph is called a vertex. The connections between vertices are called edges. We aren't talking about physical networking of any kind. Rather, we are discussing network theory in the abstract and applying theory from our mathematical colleagues to the newer science of information management, and specifically to data modeling. The rules of data modeling are often lost in the detail of the individual business problem, so it is very useful to have some tools to help abstract the problem. Hence, each vertex ...