Decentralize now?

Can a technology reboot reopen the Web?

By Simon St. Laurent

July 12, 2016

Rain drops on glass (source: Mike via Flickr)

The Web opened with an explosion of DIY openness, but over the last few years more and more of us have turned to centralized services for our needs. We still center much of our work on web technologies, but much of the “everyone their own publisher” joy has faded.

A few weeks ago, a few hundred people gathered at the Internet Archive for the Decentralized Web Summit, asking explicitly about “Locking the Web Open.” Is it possible to build a system which doesn’t eventually funnel control over communications into a very few hands? Can we create a new Web fundamentally more open than the old Web?

The Web (and Internet) dream

The Web, much like previous media, promised to democratize communications. Everyone could read. The original proposal centered on information-sharing needs within a single organization, so “copyright enforcement and data security [were] of secondary importance at CERN, where information exchange is still more important than secrecy.”

That simple shift in priorities paved the way for a much simpler hypertext system than its predecessors, one that exploded beyond its original planned home into a world that didn’t always appreciate the cavalier approach the Web took toward either copyright or security. Alan Kay can complain bitterly that “the Web was done by amateurs,” but that grand (grotesque?) simplification made it possible.

It also included this now tantalizing requirement, which in many ways set the Web on its path to growth:

Non-Centralisation

Information systems start small and grow. They also start isolated and then merge. A new system must allow existing systems to be linked together without requiring any central control or coordination.

That architectural approach and minimal reverence for traditional approaches to controlling information quickly brought at least the first half of Stewart Brand’s famous quote to prominence:

Information wants to be free. Information also wants to be expensive. …That tension will not go away.

It also fit well with John Perry Barlow’s Declaration of the Independence of Cyberspace. The Web was off and running!

Even today, these aspects dominate some parts of the Web conversation, like the classic “What is the Open Web?” The IndieWeb movement works on tools and approaches to help people connect while staying in control of their own content. The dream is still very much alive, as shown by a piece on “Team Web,” and the “What the Web Means to Mozilla (WTWMTM)” document Mitchell Baker shared in the opening of the Decentralized Web Summit.

Paying for it

The tricky part of all of those proclamations is that the Web skipped out on the option of paying content creators for their work. While it avoided, say, the complexity of Xanadu, and sites certainly developed the power to accept credit cards and other forms of payment, payment wasn’t built into the Web. It was an extra step requiring a major commitment. Free content was easy, while paid content was hard, so we learned to work with variations on free content.

You can do a lot with free. The Web rapidly flooded with conversation, with people talking to each other who previously had a much harder time finding each other. That conversation inspired ideas like The Cluetrain Manifesto, which suggested a future of networked conversations fueling markets.

Much of that conversation spiraled down, burdened by advertisers trying to reach every possible customer in the hope that cheap ads would convert enough people to customers to pay off. The finances have long seemed shaky, the performance is routinely terrible, and even as Flash fades, the security isn’t great. Bruce Lawson recently gave a Velocity keynote exploring how the ad spiral took off and what we might do to escape it. Advertising is slowing the Web and adding surveillance while failing to be profitable enough to support it.

Another group of promotional conversations started out more promising, but was often corrupted by people who wanted to steer the conversation toward their own set of products and often did it badly. If a company’s own efforts couldn’t drive enough conversation, what about encouraging bloggers to write about products? Extending product placement conversations might generate traffic, but it often reduces trust in the Web as a whole.

Yet other conversations were simply blocked or driven underground because they used content other people owned (largely through copyright) as their foundation. While O’Reilly (my employer and the place publishing this article) has argued that digital piracy is progressive taxation, other intellectual property owners have been rather more ferocious about insisting that the Web’s default of free distribution damages rather than helps their business models.

Instead of free distribution, these IP owners want Digital Rights Management (DRM), perhaps better described from a Web perspective as Digital Restriction Management. The Web is about sharing, while DRM is largely about stopping sharing: adding restrictions to what had been open connections. DRM relies on a combination of legal enforcement and technical integration, with the World Wide Web Consortium’s Encrypted Media Extensions as the point of contact and friction.

Power laws and information empires

From the beginning, the wide-open everywhere Web had a problem. As the number of web sites grew, it quickly became hard to find things. “Yet Another Hierarchically Organized Oracle” (Yahoo!) got its start adding order to the chaos, and search engines trawled the Web as well. Google figured out how to use the chaos to its advantage (using links to evaluate content relevance), and established dominance in search.

The other classic bottleneck for finding content came not from the Web, but from the older Internet infrastructure. Domain names were the classic digital real estate battlefield, and even lent a suffix to the name of the first Internet-driven business boom (and bust): dot-com.

While you could theoretically host a site on any computer hooked up to the Internet, that required too much of most people. Dialup connections, themselves a key point of centralization for the Internet, didn’t have the speed to support local hosting, and putting sites closer to bigger connections created valuable real estate. My first site went up on Panix, a New York ISP, and my second went up (yes, really) on America Online. AOL was trying to keep people in its walled garden, and offering websites was part of that. Other services, like GeoCities, offered easy publishing to grow their neighborhoods quickly.

Getting bigger faster was, of course, the key. Dot-coms raced to “disintermediate brick and mortar.” If Amazon could undercut booksellers through centralization and logistical economies of scale, all kinds of businesses were up for grabs. Pets.com seems to get all the flak, but it was far from alone in its race to stake out a claim to as many users as possible to build its fortunes. Conversations about the need for more competition are still very much alive.

On the web tools front, the Browser Wars were a similar race to accumulate users through new features, creative marketing, and connecting browsers with operating systems.

In the relative quiet of the dot-com bust, there was a period when it didn’t seem like the digital world was chasing real estate, a period when people were figuring out crazy new ways to talk with each other over this Web thing. “Weblogs” and “blogging” were the new opportunity for anyone—at least anyone with enough technical bent to get started—to share information. Reverse chronological order set minimal expectations, and it grew from there. Comments appeared, and connections between blogs, and standards for sharing feeds like RSS and Atom, and once again everyone could publish …

At just that point, though, Clay Shirky popped up to point out that while everyone could publish, not everyone had an audience. Power laws applied to weblogs, making their readership wildly unequal. People’s choices and recommendations and link sharing led to a situation where a few people had the vast majority of the traffic.
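To get a feel for how lopsided a power law is, here is a small sketch. It assumes, purely for illustration, that the blog at rank r receives traffic proportional to 1/r (an exponent of 1) and that there are 1,000 blogs; neither figure comes from Shirky's data, and the function name is made up for this example.

```python
def traffic_share(top_n: int, total_blogs: int, exponent: float = 1.0) -> float:
    """Fraction of all traffic captured by the top_n blogs, assuming the blog
    at rank r receives traffic proportional to 1 / r**exponent."""
    # Assumed model: one weight per blog, ordered from rank 1 downward.
    weights = [1.0 / rank ** exponent for rank in range(1, total_blogs + 1)]
    return sum(weights[:top_n]) / sum(weights)


share = traffic_share(top_n=10, total_blogs=1000)
print(f"Top 10 of 1,000 blogs: {share:.0%} of traffic")
```

Under those assumptions, the top 1 percent of blogs collect about 39 percent of the readership. Setting the exponent to 0 gives every blog an equal share, which is the world the "everyone can publish" dream quietly assumed.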

A few years later, Tim Wu’s book The Master Switch told similar stories across a much wider range of “information empires”: telephones, film, radio, television, and the Internet. I hear The Master Switch cited regularly as an inspiration, though I also listen for the next part: are people using it as a guide to empire building, or are they, like Wu, joining the resistance?

Let us then, not fail to protect ourselves from the will of all who might seek domination of those resources we cannot do without. If we do not take this moment to secure our sovereignty over the choices that the information age has allowed us to enjoy, we cannot reasonably blame its loss on those who are free to enrich themselves by taking it from us in a manner history has foretold.

Unfortunately, while Wu’s warning indeed raised the profile of network neutrality, it’s had less effect on the Web. The rise of social media, perhaps especially Facebook, has continued along imperial lines. Concentrating digital real estate remains the core of the business game. Even with the absolute best of intentions, that much control over what people see can seem like a problem in itself. Mix in national politics, and it gets even more complicated.

What would a breakthrough look like?

When I arrived at the Decentralized Web Summit, the place was already buzzing with “The Web’s Creator Looks to Reinvent It,” a New York Times article about meetings the day before.

Wendy Hanamura of the Internet Archive and Mitchell Baker of Mozilla welcomed the crowd, and Baker provided a solid foundation for the day with the “What the Web Means to Mozilla (WTWMTM)” document.

The next two talks felt very much like the patriarchs addressing the next generation. After blessing the crowd (“May all of your packets land in the right bitbucket”), Vint Cerf, “Father of the Internet,” spoke on archiving, achieving permanence in a world he had helped build from transient packets. Tim Berners-Lee, “Father of the Web,” took a more critical look at aspects of the Web that had taken turns he didn’t love:

“Individual personal data has been locked up in these silos,” especially in social networks, and
Surveillance models of advertising: “the deal – it’s a myth that it has to be, it’s a myth that everyone’s happy with it, and it’s a myth that it’s optimal.”

Berners-Lee saw cause for optimism in the Solid (Social Linked Data) project he’s been working on at MIT, which explicitly promises that users will own their own data, and be able to get in and out of it. He also noted techniques like using cryptographic hashes as identifiers, at least “if you believe that the blockchain isn’t another centralization we’re liable to fall into.”
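The idea of using cryptographic hashes as identifiers can be sketched in a few lines. This is a minimal illustration, not how Solid or any particular project does it: the `ContentStore` class and `content_id` function are hypothetical names, and SHA-256 is simply an assumed choice of hash.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the content itself (assumed here: SHA-256, hex)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory store keyed by content hash (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # maps identifier -> bytes

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._blobs[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blobs[cid]
        # Anyone can re-hash what they received and check it against the name,
        # so the reader does not have to trust whoever served the bytes.
        if content_id(data) != cid:
            raise ValueError("content does not match its identifier")
        return data


store = ContentStore()
cid = store.put(b"Locking the Web open")
print(cid, store.get(cid))
```

Because the name is computed from the bytes, the same content gets the same identifier no matter which machine holds it, which is what lets a name stop pointing at a particular server.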

That day saw two other keynotes. Brewster Kahle delivered a mostly optimistic talk about ways to lock the Web open, while Cory Doctorow explored what needs to be done now to minimize potential future rot, using the present Web’s turn toward DRM as an example of failure. While the earlier talks explored what was missing, these two laid out parameters identifying success and failure.

The rest of the day was panels. Most focused on key problems that needed decentralized solutions to make this all work.

They also showed some shorter videos on enterprise and creative projects built on decentralized models. The conference site also includes video of the workshops and lightning talks that went much deeper into specific fields and projects.

By the end of that day, I had a vision of a very different web, a web of servers combining secure identification with shared publishing responsibility. Instead of posting files or data to “a web server” under my control, I’d just be publishing to the web (using identifiers linked to my identifiers), and letting the web sort out what goes where, optimizing its own storage and delivery structures. That architectural shift takes me into what feels like uncharted territory, but it is clearly technically and financially possible.

While the things published would technically just be data, code is also data, and that code could publish more data. While the structure of this new Web would be very different, with the server side focusing more exclusively on data management, its capabilities would be largely similar.
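One way a network could "sort out what goes where" without a central coordinator is to store each item on the peers whose identifiers are closest to the item's hash. The sketch below uses XOR distance, the metric used by Kademlia-style distributed hash tables. That choice is an assumption made for illustration, not something the summit speakers prescribed, and the function names and peer names are hypothetical.

```python
import hashlib
from typing import List


def key_of(data: bytes) -> int:
    """Map arbitrary bytes (content or a peer's name) into a shared key space."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def closest_peers(content_key: int, peer_ids: List[int], replicas: int = 2) -> List[int]:
    """Pick the peers whose ids are nearest to the content key by XOR distance."""
    return sorted(peer_ids, key=lambda pid: pid ^ content_key)[:replicas]


# Hypothetical peers; in a real network the ids would come from their public keys.
peers = [key_of(name) for name in (b"peer-a", b"peer-b", b"peer-c", b"peer-d")]
holders = closest_peers(key_of(b"my latest post"), peers)
print(f"{len(holders)} of {len(peers)} peers would hold this post")
```

Every participant who knows the same set of peers computes the same answer, so a publisher and a later reader agree on where to look without asking any central directory.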

Conflicts

The glowing vision of a decentralized publishing platform comes with a few challenges, both technical and social.

Much of the vision around naming and identity, a key part of making this system run on a large scale, centered on blockchain approaches. Ethereum and specifically the Decentralized Autonomous Organization were held up as models for programmatic but secure approaches. Recordkeeping could be powerful stuff!

However, it was the wrong month to be promoting the DAO’s smart contracts: a flaw in the contract logic let an attacker drain a large share of the DAO’s funds, and Ethereum’s reputation is dented. As the DAO sorts itself out and blockchain technology matures, this may become less of an issue, but for now a colleague asked me what these tools can do without the blockchain.

The blockchain isn’t required to use, for example, the peer-to-peer InterPlanetary File System (IPFS), though the more detailed IPFS whitepaper suggests IPFS as a place to store blockchain information.

Blockchain may open doors, however. One of the features of MaidSafe that was popular in my lunch conversation at the summit is its ability to give “coins” to users who provide storage. Building a payment network into the system from the beginning makes it easier to talk people into providing hosting and possibly more. The current Internet and Web certainly have business models, but they run alongside the technology rather than being built into it.

The social challenges may be larger, however.

Even as Vint Cerf was describing his dream of a self-archiving Web, comments on Twitter and in the room were asking the difficult question of whether everything actually should be archived. The right to be forgotten goes over badly with many tech visionaries and some archivists, but the European Union has upheld it. While the vision of a censor-proof Web appeals to some, questions about what should happen when people want to remove their own content suggest that mechanisms for deletion aren’t only the demands of censors. (Is a version control mechanism enough?)

Priorities are also a tricky question. Tim Berners-Lee pronouncing walled gardens a curse while letting DRM slip through the W3C doesn’t go over well with everyone. Payment systems, which the current Web standards process just got around to addressing, raise other complex questions.

Also, while it was clear that the people attending the summit (including me!) were eager to see and use a decentralized web, it’s much less clear that a broader audience craves decentralization. Diaspora still exists, but hardly caught on. Searching the attendee list for “Facebook” brings up nothing.

(Plus, of course, the event itself used the centralized Slack for chat rather than less centralized IRC.)

Hope

The Web is changing.

Most of the technical side of that conversation is about change within the current set of tools, questioning whether the best way forward is through standards conversations or more open experimental models, and what kind of trust is possible. The social side of that conversation often focuses on questions like the growing importance of video and what the shift away from text might imply.

The Decentralized Web conversation is different. While it connects neatly with many traditional Web technology conversations (hyperlinks are at the core), it looks more toward peer-to-peer approaches than the classic HTTP client-server model. IPFS, for example, can use WebRTC as a transport between browsers; many people think of WebRTC for audio and video, but it also provides peer-to-peer data channels. The focus tends to be more on where data lives, the server side of the traditional conversation, than on where it is viewed, the client side. While most of us think of the Web as the surface we see in browsers, the structure underneath shapes everything.

The technical side of the decentralized web is promising, if not yet settled. Despite the DAO’s recent problems and a lot of work yet to be done, it’s been clear for a long time that peer-to-peer architectures can distribute content efficiently. These approaches can coexist easily with the traditional HTTP Web, and reuse the content tools of the Web. Neocities is running IPFS in production today, as an experiment alongside HTTP.

(The Decentralized Web is far from the only possible sidecar for the Web. The Seif Project also aims to promote a different approach to development, targeted at highly individualized and hopefully not shared conversations.)

The social side is unsettled. We live in a development world deeply trained to see servers as central distribution points, even when they’re containerized and running in the cloud. Does this model solve enough concrete problems to drive developers to rethink current practices? Might platform co-ops drive adoption?

For now, I’d suggest exploring the technology and pondering what we might do with it. This list of technologies is a good place for developers to start.
