CHAPTER 2 DATA RESOURCES

Data and a data resource differ in scale: data can be understood as a single item that exists on its own in cyberspace, whereas a data resource is a collection of similar data that has been gathered and stored at a certain scale. The relationship is analogous to the phone numbers in a city: the data is a single resident's phone number, and the data resource is the phone numbers of all residents in that city.

Data resources are conventionally classified along two axes. By data organization, they are either general or dedicated; by accessibility, they are either sensitive or available for public use. General data resources refer to the data assembled in database systems (Oracle, SQL Server, DB2, etc.) during the early stages of information processing. Dedicated data resources refer to geographic data, medical images (X-ray film, MRI and CT scans, etc.), and multimedia, all of which are processed by dedicated equipment or software. The term “sensitive” is purely descriptive here; whether a data resource counts as sensitive or publicly available is determined by criteria set out in law.
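As a rough illustration of the two-axis classification described above, the following Python sketch tags a data resource with an organization type and an accessibility level. The names `Organization`, `Accessibility`, `DataResource`, and `classify`, as well as the sample resources and their labels, are hypothetical and chosen only for illustration; in practice the accessibility label would be decided by legal criteria, not by the programmer.

```python
from dataclasses import dataclass
from enum import Enum


class Organization(Enum):
    GENERAL = "general"      # held in a general-purpose database system
    DEDICATED = "dedicated"  # processed by dedicated equipment or software


class Accessibility(Enum):
    SENSITIVE = "sensitive"
    PUBLIC = "public"


@dataclass(frozen=True)
class DataResource:
    """A collection of similar data items plus its two classification labels."""
    name: str
    organization: Organization
    accessibility: Accessibility
    records: tuple = ()


def classify(resource):
    """Return the (organization, accessibility) pair as plain strings."""
    return (resource.organization.value, resource.accessibility.value)


# Hypothetical examples: the labels below are assumptions, not legal judgments.
phone_book = DataResource(
    name="city phone directory",
    organization=Organization.GENERAL,
    accessibility=Accessibility.PUBLIC,
    records=("555-0100", "555-0101", "555-0102"),  # each record is one datum
)
ct_archive = DataResource(
    name="hospital CT scans",
    organization=Organization.DEDICATED,
    accessibility=Accessibility.SENSITIVE,
)

print(classify(phone_book))
print(classify(ct_archive))
```

Here a single entry in `records` plays the role of one resident's phone number, while the `DataResource` as a whole corresponds to the directory for the entire city.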

In this chapter, we discuss the typical applications of data resources in data science and classify data resources into seven types under various specific domains.

2.1 SCIENTIFIC DATA

Data is currently a major object of study in scientific research. Data science has already taken shape as a discipline, and is now supporting all research ...
