1 From Trace to Web Data: An Ontology of the Digital Footprint

The growth of new masses of digital data continues to prompt a great deal of reflection from a wide range of actors: researchers, engineers, journalists, business leaders and others. Big data presents itself at first sight as a technological solution, offered by digital companies and computer research laboratories, to a problem that is not always clearly expressed. It takes the form of media and commercial discourses, largely prospective in nature, about what the abundance of digital data could change. What distinguishes these discourses from those surrounding other technological and social changes is that they are de facto discourses on knowledge. They frequently adopt a system of enunciation and legitimation inspired by scientific research, and more specifically by the natural sciences, from which they borrow the notions of data, model, hypothesis and method. In an article emblematic of the rhetoric of big data, and since refuted many times, Chris Anderson (2008), then editor-in-chief of the trade magazine Wired, wrote:

“There is now a better way. Petabytes allow us to say: ‘Correlation is enough’. We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot”.

In a similar vein, Big ...
