Chapter 3. Social Data

What is social data?

From Fluidinfo’s perspective, it is data from many easily identifiable sources that, because of the way it is stored, can be shared, repurposed, enhanced, annotated, augmented, and easily queried in an environment that encourages open participation.

This chapter aims to show the benefits of publishing social data. The following topics will be of particular interest:

  • How different sources map their data to Fluidinfo

  • Conventions that have emerged for organizing data

  • How data is shared, reused, and enhanced between different sources and applications

Three different domains of data will be used to explore these issues: social networking data based on Twitter, O’Reilly’s book catalog, and articles from some technology-related blogs.

Twitter and Social Data

Twitter, because of its popularity and influence, boasts one of the most “crunched” datasets on the Internet. Dozens of services, public and for hire, process Twitter data. But metadata about Twitter users and tweets is not widely shared, and in this section we’ll show how Fluidinfo can expose interesting facts about Twitter’s use.

Walled Gardens of Data

Twitter[12] is a social networking site whose users tweet messages of no more than 140 characters. Users follow one another to subscribe to each other’s streams of tweets. Millions of users around the world share tweets about every imaginable subject.

Twitter’s API allows developers to get at the service’s data. For example, it ...
