What is a blockchain?
Unpacking the complexity of blockchain, term by term.
So, what is a blockchain? It’s a complicated question because the inventor of bitcoin, the pseudonymous Satoshi Nakamoto, didn’t use the term in the original bitcoin paper. For many, “the blockchain” is nothing more than a shorthand for “how bitcoin works.” But more usefully, the blockchain is a distributed ledger, shared by untrusted participants, with strong guarantees about accuracy and consistency. What does that mean? Let’s unpack it term by term:
A ledger: If you’ve browsed antiquarian bookstores, you may have seen piles of books from the 19th century in which accountants entered transactions by hand. Those are ledgers. Ledgers are lists of transactions: items sold, and for how much; items purchased, and for how much. Those transactions are dated (timestamped) and ordered. Ledgers are strictly append-only: transactions can be added, but you can’t go back and edit or delete them. A blockchain can have ledger entries that are significantly more complex than credits and debits, but the concept is the same: a set of ordered entries to which new entries can be added, but old entries can be neither deleted nor modified.
Shared: Ledgers handwritten in books are obviously not shared. The only people who can make entries are the accounting staff, and the only people who can read the ledger are those who have physical access to the books. That’s very 19th century, and we should be saying “clerks” rather than “accountants.” (And pronouncing it “clarks.”) The point of a blockchain is that anyone can add entries to the ledger. More precisely, anyone with the appropriate software can put entries into a pool of entries that will eventually be checked for consistency and added to the ledger.
Distributed: Blockchains aren’t centralized. There’s no central administration that decides who has access, and what rules they must follow. There is no single point of control, and also no single point of failure. Many participants in the blockchain have copies of the entire ledger. Those copies are updated whenever blocks are added.
Untrusted participants: This is perhaps the most radical idea in the blockchain. Anyone can add entries, including people and organizations that don’t trust each other. In enterprise applications, requiring a certain amount of trust allows some important optimizations, but the concept of “untrusted participants” is fundamental to a blockchain. The technical term for protocols that produce agreement among untrusted partners is “Byzantine fault tolerance” (BFT) or “Byzantine agreement.”
Accuracy and consistency: Despite untrusted participants, a blockchain makes strong guarantees about the ledger’s accuracy. Specifically, participants can’t delete or modify entries that have already been placed in the ledger. The copies of the distributed ledger aren’t always in strict agreement, but disagreements are quickly resolved automatically. In the past, accuracy was guaranteed by access control: only the clerks were allowed to make entries in the books. While many “enterprise blockchains” add access control, and that addition can make the blockchain much more efficient, it’s important to understand that the blockchain is all about maintaining an accurate ledger with participants that you don’t trust, possibly even participants who are hostile and who would want to corrupt the ledger.
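To make the terms above a little more concrete, here is a minimal Python sketch of a ledger that is append-only, accepts entries through a pending pool, and makes edits to old entries detectable by linking each entry to the hash of the one before it. It is an illustration under stated assumptions, not how bitcoin or any real blockchain is implemented: the names ToyLedger, submit, commit_pending, and verify are hypothetical, there is a single copy rather than a distributed network, and there is no agreement protocol among untrusted participants.

```python
import hashlib
import json
import time


def entry_hash(entry: dict) -> str:
    """Hash an entry's canonical JSON form with SHA-256."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ToyLedger:
    """Append-only ledger with hash-linked entries (hypothetical, illustrative only)."""

    def __init__(self):
        self._entries = []  # accepted entries, in order
        self.pending = []   # pool of submitted entries not yet checked

    def submit(self, data: dict) -> None:
        """Anyone may put an entry into the pending pool."""
        self.pending.append(data)

    def commit_pending(self, is_valid=lambda data: True) -> int:
        """Check pooled entries, append the valid ones, return how many were added."""
        added = 0
        for data in self.pending:
            if is_valid(data):
                prev = self._entries[-1]["hash"] if self._entries else "0" * 64
                body = {"index": len(self._entries), "timestamp": time.time(),
                        "data": data, "prev_hash": prev}
                self._entries.append({**body, "hash": entry_hash(body)})
                added += 1
        self.pending = []
        return added

    def entries(self) -> tuple:
        """Return the accepted entries; there is deliberately no edit or delete method."""
        return tuple(self._entries)

    def verify(self) -> bool:
        """Recompute every hash and link; a change to any old entry breaks the chain."""
        prev = "0" * 64
        for e in self._entries:
            body = {k: e[k] for k in ("index", "timestamp", "data", "prev_hash")}
            if e["prev_hash"] != prev or e["hash"] != entry_hash(body):
                return False
            prev = e["hash"]
        return True
```

In this sketch, calling submit() and then commit_pending() mirrors the "pool of entries that will eventually be checked for consistency" idea, and verify() returns False if anyone alters an entry that was already accepted. A real blockchain adds the hard part that this toy omits: getting many mutually distrustful copies to agree on the same sequence of entries.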
This definition omits several things that are associated with bitcoin, and since blockchain is often a shorthand for “how bitcoin is implemented,” it’s worth being explicit about what I’ve left out. So, here are some important features that don’t necessarily belong to a blockchain.
Users: It’s counterintuitive, but users and user IDs are never stored in the bitcoin blockchain. Users are an abstraction maintained by bitcoin wallets. Bitcoin has a concept of a “user address,” but that name is misleading. User addresses are public, but they don’t identify users. Wallets should generate a new address for each transaction, making it difficult to associate any address with a specific user. Reusing addresses makes it possible to de-anonymize users.
Privacy: It’s hard to discuss bitcoin without also discussing privacy (more properly, pseudonymity). But if a blockchain doesn’t have a concept of a user, it really can’t have a concept of privacy, either. Users have privacy precisely because the bitcoin blockchain can track transactions without knowing anything about the users behind them.
The logic behind bitcoin privacy is counterintuitive, and, frankly, brilliant. It works by keeping public data structures and protocols clean from any knowledge of users.
Having said that, it is almost inconceivable that anyone would create (or use) a blockchain that doesn’t have a notion of privacy somewhere: if not in the blockchain itself, then in the applications layered on top of it. And, while bitcoin has often been criticized for enabling criminal transactions, enterprises developing blockchain apps have privacy requirements at least as stringent as those of the bad guys. If Bank of America is going to use a blockchain to clear financial transactions with Santander, you can bet they don’t want their friends at Wells Fargo watching. Any blockchain developer has to be aware of user identity and privacy.
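The separation between users and addresses described above can be sketched in a few lines of Python. This is a simplified illustration, not bitcoin’s actual address scheme: real wallets derive addresses from public keys, whereas this sketch simply hashes a random secret with SHA-256. The ToyWallet class and its methods are hypothetical names introduced for the example.

```python
import hashlib
import secrets


class ToyWallet:
    """Keeps the link between a user and their addresses off the ledger (illustrative only)."""

    def __init__(self, owner: str):
        self.owner = owner  # known only to the wallet, never written to the ledger
        self._keys = {}     # address -> secret held privately by the wallet

    def new_address(self) -> str:
        """Create a fresh random secret and derive a public address from it."""
        secret = secrets.token_bytes(32)
        # Assumption: a truncated SHA-256 digest stands in for a real address format.
        address = hashlib.sha256(secret).hexdigest()[:40]
        self._keys[address] = secret
        return address

    def owns(self, address: str) -> bool:
        """Only the wallet can tell which addresses belong to its user."""
        return address in self._keys
```

If a wallet calls new_address() for every transaction, the public records contain only unrelated-looking addresses; the knowledge that they belong to the same person stays inside the wallet. Reusing a single address would put the same identifier on many public records, which is what makes de-anonymization possible.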
While this definition mentions neither blocks nor chains, it gives you a good sense of what a blockchain can do. With all the hype surrounding blockchains, it’s important to understand what they can and can’t do. They aren’t magic; they are solutions to a specific set of important problems. If you’re building applications that span enterprises, and that need to keep accurate records in the presence of untrusted partners, you should be thinking about blockchains.