








Blockchain
Basics
A Non-Technical Introduction
in 25 Steps
―
Daniel Drescher
Blockchain Basics: A Non-Technical Introduction in 25 Steps
Daniel Drescher
Frankfurt am Main, Germany
ISBN-13 (pbk): 978-1-4842-2603-2
ISBN-13 (electronic): 978-1-4842-2604-9
DOI 10.1007/978-1-4842-2604-9
Library of Congress Control Number: 2017936232
Copyright © 2017 by Daniel Drescher
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Managing Director: Welmoed Spahr
Editorial Director: Todd Green
Acquisitions Editor: Susan McDermott
Development Editor: Laura Berendson
Technical Reviewer: Laurence Kirk
Coordinating Editor: Rita Fernando
Copy Editor: Mary Bearden
Compositor: SPi Global
Indexer: SPi Global
Artist: SPi Global
Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.
For information on translations, please e-mail rights@apress.com, or visit http://www.apress.com/rights-permissions.
Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales web page at http://www.apress.com/bulk-sales.
Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book’s product page, located at www.apress.com/9781484226032.
For more detailed information, please visit http://www.apress.com/source-code.
Printed on acid-free paper
Apress Business: The Unbiased Source of Business Information
Apress business books provide essential information and practical advice, each written for practitioners by recognized experts. Busy managers and professionals in all areas of the business world, and at all levels of technical sophistication, look to our books for the actionable ideas and tools they need to solve problems, update and enhance their professional skills, make their work lives easier, and capitalize on opportunity.
Whatever the topic on the business spectrum—entrepreneurship, finance, sales, marketing, management, regulation, information technology, among others—Apress has been praised for providing the objective information and unbiased advice you need to excel in your daily work life. Our authors have no axes to grind; they understand they have one job only—to deliver up-to-date, accurate information simply, concisely, and with deep insight that addresses the real needs of our readers.
It is increasingly hard to find information—whether in the news media, on the Internet, and now all too often in books—that is even-handed and has your best interests at heart. We therefore hope that you enjoy this book, which has been carefully crafted to meet our standards of quality and unbiased coverage.
We are always interested in your feedback or ideas for new titles. Perhaps you’d even like to write a book yourself. Whatever the case, reach out to us at editorial@apress.com and an editor will respond swiftly. Incidentally, at the back of this book, you will find a list of useful related titles. Please visit us at www.apress.com to sign up for newsletters and discounts on future purchases.
The Apress Business Team
Contents
About the Author
About the Technical Reviewer
Introduction
Stage I: Terminology and Technical Foundations
Step 1: Thinking in Layers and Aspects
Step 2: Seeing the Big Picture
Step 3: Recognizing the Potential
Stage II: Why the Blockchain Is Needed
Step 4: Discovering the Core Problem
Step 5: Disambiguating the Term
Step 6: Understanding the Nature of Ownership
Step 7: Spending Money Twice
Stage III: How the Blockchain Works
Step 8: Planning the Blockchain
Step 9: Documenting Ownership
Step 10: Hashing Data
Step 11: Hashing in the Real World
Step 12: Identifying and Protecting User Accounts
Step 13: Authorizing Transactions
Step 14: Storing Transaction Data
Step 15: Using the Data Store
Step 16: Protecting the Data Store
Step 17: Distributing the Data Store Among Peers
Step 18: Verifying and Adding Transactions
Step 19: Choosing a Transaction History
Step 20: Paying for Integrity
Step 21: Bringing the Pieces Together
Stage IV: Limitations and How to Overcome Them
Step 22: Seeing the Limitations
Step 23: Reinventing the Blockchain
Stage V: Using the Blockchain, Summary, and Outlook
Step 24: Using the Blockchain
Step 25: Summarizing and Going Further
Index
About the Author
Daniel Drescher is an experienced banking professional who has held positions in electronic security trading in several banks. His recent activities have focused on automation, machine learning, and big data in the context of security trading. Among others, Daniel holds a doctorate in econometrics from the Technical University of Berlin and an MSc in software engineering from the University of Oxford.
About the Technical Reviewer
Laurence Kirk, after a successful career writing low-latency financial applications for the City of London, was captivated by the potential of distributed ledger technology. He moved to Oxford to study for his master’s degree and set up Extropy.io, a consultancy working with startups to develop applications on the Ethereum platform. Passionate about distributed technology, he now works as a developer, evangelist, and educator about Ethereum.
Introduction
This introduction answers the most important question that every author has to answer: Why should anyone read this book? Or more specifically: Why should anyone read another book about the blockchain? Continue reading and you will learn why this book was written, what you can expect from this book, what you cannot expect from this book, for whom the book was written, and how the book is structured.
Why Another Book About the Blockchain?
The blockchain has received a lot of attention in the public discussion and in the media. Some enthusiasts claim that the blockchain is the biggest invention since the emergence of the Internet. Hence, a lot of books and articles have been written in the past few years about the blockchain. However, if you want to learn more about how the blockchain works, you may find yourself lost in a universe of books that either quickly skim over the technical details or that discuss the underlying technical concepts at a highly formal level. The former may leave you unsatisfied because they fail to explain the technical details necessary to understand and appreciate the blockchain, while the latter may leave you unsatisfied because they already require the knowledge you want to acquire.
This book fills the gap that exists between purely technical books about the blockchain, on the one hand, and the literature that is mostly concerned with specific applications or discussions about its expected economic impact or visions about its future, on the other hand.
This book was written because a conceptual understanding of the technical foundations of the blockchain is necessary in order to understand specific blockchain applications, evaluate business cases of blockchain startups, or follow the discussion about its expected economic impacts. Without an appreciation of the underlying concepts, it will be impossible to assess the value or the potential impact of the blockchain in general or understand the added value of specific blockchain applications. This book focuses on the underlying concepts of the blockchain since a lack of understanding of a new technology can lead to being carried away with the hype and being disappointed later on because of unrealistic, unsubstantiated expectations.
This book teaches the concepts that make up the blockchain in a nontechnical fashion and in a concise and comprehensible way. It addresses the three big questions that arise when being introduced to a new technology: What is it? Why do we need it? How does it work?
What You Cannot Expect from This Book
The book is deliberately agnostic to the application of the blockchain. While cryptocurrencies in general and Bitcoin in particular are prominent applications of the blockchain, this book explains the blockchain as a general technology. This approach has been chosen in order to highlight generic concepts and technical patterns of the blockchain instead of focusing on a specific and narrow application case. Hence, this book is:
• Not a text specifically about Bitcoin or any other cryptocurrency
• Not a text solely about one specific blockchain application
• Not a text about proving the mathematical foundations of the blockchain
• Not a text about programming a blockchain
• Not a text about the legal consequences and implications of the blockchain
• Not a text about the social, economic, or ethical impacts of the blockchain on our society or humankind in general
However, some of these points are addressed to some extent at appropriate points in this book.
What You Can Expect from This Book
This book explains the technical concepts of the blockchain such as transactions, hash values, cryptography, data structures, peer-to-peer systems, distributed systems, system integrity, and distributed consensus in a nontechnical fashion.
The didactical approach of this book is based on four elements:
• Conversational style
• No mathematics and no formulas
• Incremental steps through the problem domain
• Use of metaphors and analogies
Conversational Style
This book is deliberately written in a conversational style. It does not use mathematical or computer science jargon in order to avoid any hurdle for nontechnical readers. However, the book introduces and explains the necessary terminology needed to join the discussion and to understand other publications about the blockchain.
No Mathematics and No Formulas
Major elements of the blockchain such as cryptography and algorithms are based on complex mathematical concepts, which in turn come with their own demanding and sometimes frightening mathematical notation and formulas.
However, this book deliberately does not use any mathematical notation or formulas in order to avoid any unnecessary complexity or hurdle for nontechnical readers.
Incremental Steps Through the Problem Domain
The chapters in this book are called steps for a good reason. These steps form a learning path that incrementally builds the knowledge about the blockchain.
The order of the steps was chosen carefully. They cover the fundamentals of software engineering, explain the terminology, point out the reason why the blockchain is needed, and explain the individual concepts that make up the blockchain as well as their interactions. Calling the individual chapters steps highlights their dependence and their didactical purpose. They form a logical sequence to be followed instead of being chapters that could be read independently.
Use of Metaphors and Analogies
Each step that introduces a new concept starts with a pictorial explanation by referring to a situation from real life. These metaphors serve four major purposes. First, they prepare the reader for introduction to a new technical concept. Second, by connecting a technical concept to an easy-to-understand real-world scenario, the metaphors reduce the mental hurdle to discover a new territory. Third, metaphors allow learning new concepts by similarities and analogies. Finally, metaphors provide rules of thumb for memorizing new concepts.
How This Book Is Organized
This book consists of 25 steps grouped into five major stages that all together form a learning path, which incrementally builds your knowledge of the blockchain. These steps cover some fundamentals of software engineering, explain the required terminology, point out the reasons why the blockchain is needed, explain the individual concepts that make up the blockchain as well as their interactions, consider applications of the blockchain, and mention areas of active development and research.
Stage I: Terminology and Technical Foundations
Steps 1 to 3 explain major concepts of software engineering and set the terminology necessary for understanding the succeeding steps. By the end of Step 3, you will have gained an overview of the fundamental concepts and an appreciation of the big picture in which the blockchain is located.
Stage II: Why the Blockchain Is Needed
Steps 4 to 7 explain why the blockchain is needed, what problem it solves, why solving this problem is important, and what potential the blockchain has. By the end of Step 7, you will have gained a good understanding of the problem domain in which the blockchain is located, the environment in which it provides the most value, and why it is needed in the first place.
Stage III: How the Blockchain Works
The third stage is the centerpiece of this book since it explains how the blockchain works internally. Steps 8 to 21 guide you through 15 distinct technical concepts that all together make up the blockchain. By the end of Step 21, you will have reached an understanding of all the major concepts of the blockchain, how they work in isolation, and how they interact in order to create the big machinery that is called the blockchain.
Stage IV: Limitations and How to Overcome Them
Steps 22 to 23 focus on major limitations of the blockchain, explain their reasons, and sketch possible ways to overcome them. By the end of Step 23, you will understand why the original idea of the blockchain as explained in the previous steps may not be suitable for large-scale commercial applications, what changes were made to overcome these limitations, and how these changes altered the properties of the blockchain.
Stage V: Using the Blockchain, Summary, and Outlook
Steps 24 and 25 consider how the blockchain can be used in real life and what questions should be addressed when selecting a blockchain application. This stage also points out areas of active research and further development. By the end of Step 25, you will have gained a well-grounded understanding of the blockchain and you will be well prepared to read more advanced texts or to become an active part in the ongoing discussion about the blockchain.
Accompanying Material
The website www.blockchain-basics.com offers accompanying material for some of the steps of this book.
Stage I: Terminology and Technical Foundations
This stage explains major concepts of software engineering and establishes a way to organize and standardize our communication about technology.
This learning stage also introduces the concepts of software architecture and integrity and how they relate to the blockchain. By the end of this stage, you will have gained an understanding of the purpose of the blockchain and its potential.
Step 1: Thinking in Layers and Aspects
Analyzing systems by separating them into layers and aspects
This step lays the foundation of our learning path through the blockchain by introducing a way to organize and standardize our communication about technology. This step explains how you can analyze a software system and why it is important to consider a software system as a composition of layers. Furthermore, this step illustrates what you can gain from considering different layers in a system and how this approach helps us to understand the blockchain. Finally, this step provides a short introduction to the concept of software integrity and highlights its importance.
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_1
The Metaphor
Do you have a mobile phone? I would guess yes, as most people now have at least one. How much do you know about the different wireless communication protocols that are used to send and receive data? How much do you know about electromagnetic waves that are the foundation of mobile communication? Well, most of us do not know very much about these details because it is not necessary to know them in order to use a mobile phone and most of us do not have the time to learn about them. We mentally separate the mobile phone into the parts we need to know and the parts that can be ignored or taken for granted.
This approach to technology is not restricted to mobile phones. We use it all the time when we learn how to use a new television set, a computer, a washing machine, and so forth. However, these mental partitions are highly individual since what is considered important and what is not depends on our individual preferences, the specific technology, and our goals and experiences. As a result, your mental partition of a mobile phone may differ from my mental partition of the same mobile phone. This typically leads to problems in communication in particular when I try to explain to you what you should know about a certain mobile phone. Hence, unifying the way of partitioning a system is the key point when teaching and discussing technology. This step explains how to partition or layer a system and hence sets the basis for our communication about the blockchain.
Layers of a Software System
The following two ways of partitioning a system are used throughout this book:
• Application vs. implementation
• Functional vs. nonfunctional aspects
Application vs. Implementation
Mentally separating the user’s needs from the technical internals of a system leads to a separation of the application layer from the implementation layer. Everything that belongs to the application layer is concerned with the user’s needs (e.g., listening to music, taking photos, or booking hotel rooms).
Everything that belongs to the implementation layer is concerned with making these things happen (e.g., converting digital information into acoustic signals, recognizing the color of a pixel in a digital camera, or sending messages over the Internet to a booking system). Elements of the implementation layer are technical by nature and are considered a means to an end.
Functional vs. Nonfunctional Aspects
Distinguishing between what a system does and how it does what it does leads to the separation of functional and nonfunctional aspects. Examples of functional aspects are sending data over a network, playing music, taking photos, and manipulating individual pixels of a picture. Examples of nonfunctional aspects are a beautiful graphical user interface, fast-running software, and an ability to keep user data private and safe. Other important nonfunctional aspects of a system are security and integrity. Integrity means that a system behaves as intended, and it involves many aspects such as security and correctness.1 There is a nice way to remember the difference between functional and nonfunctional aspects of a system by referring to grammar usage in the English language: verbs describe actions or what is done, while adverbs describe how an action is done. For example, a person can walk quickly or slowly. In both cases, the action of “walk” is identical but how the action is performed differs. As a rule of thumb, one can say that functional aspects are similar to verbs, while nonfunctional aspects are similar to adverbs.
Considering Two Layers at the Same Time
Identifying functional and nonfunctional aspects as well as separating application and implementation layer can be done at the same time, which leads to a two-dimensional table. Table 1-1 illustrates the result of mentally layering a mobile phone in this way.
Table 1-1. Example of Mentally Layering a Mobile Phone

Layer | Functional Aspects | Nonfunctional Aspects
Application | Taking photos; Making phone calls; Sending e-mails; Browsing the Internet; Sending chat messages | The graphical user interface looks beautiful; Easy to use; Messages are sent fast
Implementation | Saving user data internally; Making a connection to the nearest mobile connector; Accessing pixels in the digital camera | Store data efficiently; Saving energy; Maintaining integrity; Ensure user privacy

1 Chung, Lawrence, et al. Non-functional requirements in software engineering. Vol. 5. New York: Springer Science & Business Media, 2012.
Table 1-1 may explain the visibility (or the lack of it) of specific elements of a system to its users. Functional aspects of the application layer are the most obvious elements of a system, because they serve obvious needs of the users. These elements are typically the ones users learn about. On the other hand, the nonfunctional aspects of the implementation layer are rarely seen as major elements of the system. They are typically taken for granted.
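For readers who like to see an idea written down in code, the two-dimensional layering of Table 1-1 can be sketched as a small lookup structure. This is only an illustration: the dictionary layout and the helper name `classify` are assumptions made here, while the entries themselves come from the table.

```python
# Cells of Table 1-1, keyed by (layer, aspect). The dictionary layout and the
# helper name `classify` are illustrative assumptions, not part of the book.
MOBILE_PHONE = {
    ("application", "functional"): [
        "Taking photos", "Making phone calls", "Sending e-mails",
        "Browsing the Internet", "Sending chat messages",
    ],
    ("application", "nonfunctional"): [
        "The graphical user interface looks beautiful", "Easy to use",
        "Messages are sent fast",
    ],
    ("implementation", "functional"): [
        "Saving user data internally",
        "Making a connection to the nearest mobile connector",
        "Accessing pixels in the digital camera",
    ],
    ("implementation", "nonfunctional"): [
        "Store data efficiently", "Saving energy",
        "Maintaining integrity", "Ensure user privacy",
    ],
}


def classify(table, element):
    """Return the (layer, aspect) cell that contains the given element."""
    for cell, elements in table.items():
        if element in elements:
            return cell
    raise KeyError(element)


print(classify(MOBILE_PHONE, "Taking photos"))          # ('application', 'functional')
print(classify(MOBILE_PHONE, "Maintaining integrity"))  # ('implementation', 'nonfunctional')
```

Looking up "Maintaining integrity" lands in the implementation layer and the nonfunctional column, which is exactly the corner of the table that users rarely notice.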
Integrity
Integrity is an important nonfunctional aspect of any software system. It has three major components2:
• Data integrity: The data used and maintained by the system are complete, correct, and free of contradictions.
• Behavioral integrity: The system behaves as intended and it is free of logical errors.
• Security: The system is able to restrict access to its data and functionality to authorized users only.
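To make these three components a little more tangible, here is a minimal sketch of a toy account system. Everything in it (the class name `ToyAccountSystem`, its methods, and the sample accounts) is hypothetical and chosen only for illustration; real systems enforce integrity in far more elaborate ways.

```python
class ToyAccountSystem:
    """Hypothetical example system; not taken from the book."""

    def __init__(self, balances, authorized_users):
        self.balances = dict(balances)
        self.authorized_users = set(authorized_users)

    def data_is_consistent(self):
        # Data integrity: every balance is a whole, non-negative number.
        return all(
            isinstance(balance, int) and balance >= 0
            for balance in self.balances.values()
        )

    def transfer(self, user, sender, receiver, amount):
        # Security: only authorized users may use this functionality.
        if user not in self.authorized_users:
            raise PermissionError(f"{user} is not authorized")
        # Behavioral integrity: reject requests that make no logical sense.
        if sender not in self.balances or receiver not in self.balances:
            raise ValueError("unknown account")
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid amount")
        self.balances[sender] -= amount
        self.balances[receiver] += amount


system = ToyAccountSystem({"alice": 50, "bob": 20}, authorized_users={"alice"})
system.transfer("alice", "alice", "bob", 30)
print(system.balances)              # {'alice': 20, 'bob': 50}
print(system.data_is_consistent())  # True
```

Each check corresponds to one bullet above: the permission check stands for security, the validity checks for behavioral integrity, and `data_is_consistent` for data integrity.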
Most of us may take integrity of software systems for granted because most of the time we luckily interact with systems that keep their integrity. This is due to the fact that programmers and software engineers have invested a lot of time and effort into the development of systems to achieve and maintain integrity. As a result, we may be a bit spoiled when it comes to appreciating the work done by software engineers to create systems that maintain a high level of integrity. But our feelings may change as soon as we interact with a system that fails to do so. These are the occasions when you face a loss of data or illogical software behavior, or realize that strangers were able to access your private data. These are the occasions when your mobile phone, your computer, your e-mail software, your word processor, or your spreadsheet calculator makes you angry and forget your good manners! On these occasions, we begin to realize that software integrity is a highly valuable commodity. Hence, it should not come as a surprise that software professionals spend a lot of their time working on this seemingly tiny nonfunctional aspect of the implementation layer.
2 Boritz, J. Efrim. IS practitioners’ views on core concepts of information integrity. International Journal of Accounting Information Systems 6.4 (2005): 260–279.
Outlook
This step provided an introduction to some general principles of software engineering. In particular, the concepts of integrity and functional vs. nonfunctional aspects as well as application vs. implementation of a software system were illustrated. Understanding these concepts will help you appreciate the wider scope in which the blockchain exists. The next step will present the bigger picture by using the concepts introduced in this step.
Summary
• Systems can be analyzed by separating them into:
  • Application and implementation layer
  • Functional and nonfunctional aspects
• The application layer focuses on the user’s needs, while the implementation layer focuses on making things happen.
• Functional aspects focus on what is done, while nonfunctional aspects focus on how things are done.
• Most users are concerned with the functional aspects of the application layer of a system, while nonfunctional aspects of a system, in particular those of the implementation layer, are less visible to users.
• Integrity is an important nonfunctional aspect of any software system and it has three major elements:
  • Data integrity
  • Behavioral integrity
  • Security
• Most software failures, such as losses of data, illogical behavior, or strangers accessing one’s private data, are the result of violated system integrity.
Step 2: Seeing the Big Picture
Software architecture and its relation to the blockchain
This step not only provides the big picture in which the blockchain is located, but it also highlights its location within the big picture. In order to allow you to see the big picture, this step introduces the concept of software architecture and explains its relation to the concept of separating a system into layers and aspects. In order to help you recognize the location of the blockchain within the big picture, this step highlights the relationship between the blockchain and software architecture. Finally, this step points out the core purpose of the blockchain in just one sentence. Appreciating its purpose is a cornerstone in understanding the blockchain and understanding the course of the succeeding steps.
The Metaphor
Have you ever bought a car? Most of us have. Even if you have never bought a car, you probably know that cars are equipped with different types of engines (e.g., diesel, gasoline, or electric engine). This is an example of the process of modularization, which is the result of applying the idea of layering to cars. Having the choice among different engines when buying a car can result in amazing differences in the vehicle. Two cars that look identical from the outside can differ dramatically with respect to the power of their engines and hence have very different driving performance. Additionally, your choice of the engine will have an impact on other characteristics of the car, like its price, its operational costs, the type of fuel consumed, the exhaust system, and the dimensions of the brakes. With this picture in mind, understanding the role of the blockchain within the big picture will be much easier.
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_2
A Payment System
Let’s apply the concept of layering to a payment system. Table 2-1 shows some of the user’s needs as well as some of the nonfunctional aspects of both the application and the implementation layers.
Table 2-1. Aspects and Layers of a Payment System

Layer | Functional Aspects | Nonfunctional Aspects
Application | Deposit money; Withdraw money; Transfer money; Monitor account balance | The graphical user interface looks beautiful; Easy to use; Transfer of money is done fast; System has many participants
Implementation | ? | Available 24 hours a day; Fraud resistant; Maintaining integrity; Ensure user privacy

Have you spotted the question mark in that part of the table where you normally see information about the technology used to make the system work? This space was left blank on purpose. It is the place where you decide which “engine” should be used to run your system. The next section will tell you a bit more about the engine equivalent in software systems.
Two Types of Software Architecture
There are many ways to implement software systems. However, one of the fundamental decisions when implementing a system concerns its architecture, the way in which its components are organized and related to one another.
The two major architectural approaches for software systems are centralized and distributed.1
In centralized software systems, the components are located around and connected with one central component. In contrast, the components of distributed systems form a network of connected components without having any central element of coordination or control.
Figure 2-1 depicts these two contrary architectures. The circles in the figure represent system components, also called nodes, and the lines represent connections between them. At this point, it is not important to know the details of what these components do and what information is exchanged between the nodes. The important point is the existence of these two different ways of organizing software systems. On the left-hand side of Figure 2-1, a distributed architecture is illustrated where components are connected with one another without having a central element. It is important to see that none of the components is directly connected with all other components. However, all components are connected with one another at least indirectly. The right-hand side of Figure 2-1 illustrates a centralized architecture where each component is connected to one central component. The components are not connected with one another directly. They only have one direct connection to the central component.
Figure 2-1. Distributed (left) vs. centralized (right) system architecture
1 Tanenbaum, Andrew S., and Maarten Van Steen. Distributed systems: principles and paradigms. Upper Saddle River, NJ: Pearson Prentice Hall, 2007.
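The two properties that distinguish the architectures in Figure 2-1 can also be checked mechanically. The following sketch describes each system as a set of nodes with their direct connections. The helper names and the ring-shaped distributed layout are assumptions chosen for simplicity; the figure itself shows a less regular network.

```python
from collections import deque


def make_star(center, others):
    """Centralized layout: every node is connected only to the center."""
    graph = {center: set(others)}
    for node in others:
        graph[node] = {center}
    return graph


def make_ring(nodes):
    """One simple distributed layout: each node has two direct neighbors."""
    count = len(nodes)
    return {
        node: {nodes[(i - 1) % count], nodes[(i + 1) % count]}
        for i, node in enumerate(nodes)
    }


def is_connected(graph):
    """True if every node can reach every other node, at least indirectly."""
    if not graph:
        return True
    start = next(iter(graph))
    seen, queue = {start}, deque([start])
    while queue:
        for neighbor in graph[queue.popleft()]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == set(graph)


def central_nodes(graph):
    """Nodes that are directly connected to all other nodes."""
    return [n for n, neighbors in graph.items() if neighbors == set(graph) - {n}]


centralized = make_star("hub", ["a", "b", "c", "d"])
distributed = make_ring(["a", "b", "c", "d", "e"])

print(central_nodes(centralized), is_connected(centralized))  # ['hub'] True
print(central_nodes(distributed), is_connected(distributed))  # [] True
```

Both systems are connected, but only the centralized one contains a node with a direct connection to everyone else.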
The Advantages of Distributed Systems
The major advantages of a distributed system over single computers are2:
• Higher computing power
• Cost reduction
• Higher reliability
• Ability to grow naturally
Higher Computing Power
The computing power of a distributed system is the result of combining the computing power of all connected computers. Hence, distributed systems typically have more computing power than each individual computer. This has been proven true even when comparing distributed systems comprised of computers of relatively low computing power with isolated super computers.
Cost Reduction
The price of mainstream computers, memory, disk space, and networking equipment has fallen dramatically during the past 20 years. Since distributed systems consist of many computers, the initial costs of distributed systems are higher than the initial costs of individual computers. However, the costs of creating, maintaining, and operating a super computer are still much higher than the costs of creating, maintaining, and operating a distributed system. This is particularly true since replacing individual computers of a distributed system can be done with no significant overall system impact.
Higher Reliability
The increased reliability of a distributed system is based on the fact that the whole network of computers can continue operating even when individual machines crash. A distributed system does not have a single point of failure. If one element fails, the remaining elements can take over. Hence, a single super computer typically has a lower reliability than a distributed system.
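The idea of a single point of failure can be illustrated with a small experiment: remove one machine at a time and check whether the remaining machines can still reach one another. The two five-machine networks and all function names below are hypothetical. The `central` network merely stands in for any system that depends on one machine, and the ring is just one possible distributed layout.

```python
def remove_node(graph, failed):
    """Return a copy of the network without the failed node."""
    return {
        node: neighbors - {failed}
        for node, neighbors in graph.items()
        if node != failed
    }


def is_connected(graph):
    """Check whether every remaining node can still reach every other one."""
    if not graph:
        return True
    start = next(iter(graph))
    seen, todo = {start}, [start]
    while todo:
        for neighbor in graph[todo.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                todo.append(neighbor)
    return seen == set(graph)


def survives_any_single_crash(graph):
    """True if the network stays connected no matter which one node crashes."""
    return all(is_connected(remove_node(graph, node)) for node in graph)


# Hypothetical five-machine networks, chosen only for illustration.
central = {"hub": {"a", "b", "c", "d"},
           "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
ring = {"a": {"e", "b"}, "b": {"a", "c"}, "c": {"b", "d"},
        "d": {"c", "e"}, "e": {"d", "a"}}

print(survives_any_single_crash(central))  # False: losing "hub" isolates the rest
print(survives_any_single_crash(ring))     # True
```

The sketch only looks at connectivity. Whether the remaining machines can actually take over the work of a crashed one depends on the software running on them.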
2 Tanenbaum, Andrew S., and Maarten Van Steen. Distributed systems: principles and paradigms. Upper Saddle River, NJ: Pearson Prentice Hall, 2007.
Ability to Grow Naturally
The computing power of a distributed system is the result of the aggregated computing power of its constituents. One can increase the computing power of the whole system by connecting additional computers with the system. As a result, the computing power of the whole system can be increased incrementally on a fine-grained scale. This supports the way in which the demand for computing power increases in many organizations. The incremental growth of distributed systems is in contrast to the growth of the computing power of individual computers. Individual computers provide identical power until they are replaced by a more powerful computer. This results in a discontinuous growth of computing power, which is only rarely appreciated by the consumers of computing services.
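A few lines of arithmetic show the difference between the two growth patterns. The numbers are made up, the units are arbitrary, and the sketch deliberately ignores any overhead caused by connecting machines; it only contrasts adding small machines period by period with replacing one large machine once.

```python
def distributed_capacity(initial_nodes, power_per_node, nodes_added_per_period, periods):
    """Total computing power per period when machines are added step by step."""
    return [
        (initial_nodes + nodes_added_per_period * period) * power_per_node
        for period in range(periods)
    ]


def single_machine_capacity(initial_power, replacement_power, replaced_in_period, periods):
    """Computing power per period when one machine is replaced a single time."""
    return [
        initial_power if period < replaced_in_period else replacement_power
        for period in range(periods)
    ]


# Hypothetical figures: both systems start at 40 units and end at 80 units.
print(distributed_capacity(4, 10, 1, 5))      # [40, 50, 60, 70, 80]
print(single_machine_capacity(40, 80, 4, 5))  # [40, 40, 40, 40, 80]
```

Both systems reach the same power in the end, but the distributed one gets there in small steps, while the single machine stays flat and then jumps.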
The Disadvantages of Distributed Systems
The disadvantages of distributed systems compared to single computers are:
• Coordination overhead
• Communication overhead
• Dependency on networks
• Higher program complexity
• Security issues
Coordination Overhead
Distributed systems do not have central entities that coordinate their members. Hence, the coordination must be done by the members of the system themselves. Coordinating work among coworkers in a distributed system is challenging and costs effort and computing power that cannot be spent on the genuine computing task, hence, the term coordination overhead.
Communication Overhead
Coordination requires communication. Hence, the computers that form a distributed system have to communicate with one another. This requires the existence of a communication protocol and the sending, receiving, and processing of messages, which in turn costs effort and computing power that cannot be spent on the genuine computing task, hence the term communication overhead.
Dependency on Networks
Any kind of communication requires a medium. The medium is responsible for transferring information between the entities communicating with one another. Computers in distributed systems communicate by means of messages passed through a network. Networks have their own challenges and adversities, which in turn impact the communication and coordination among computers that form a distributed system. However, without any network, there will be no distributed system, no communication, and therefore no coordination among the nodes, thus the dependency on networks.
Higher Program Complexity
Solving a computational problem involves writing programs and software. Due to the disadvantages mentioned previously, any software in a distributed system has to solve additional problems such as coordination, communication, and the use of networks. This increases the complexity of the software.
Security Issues
Communication over a network means sending and sharing data that are critical for the genuine computing task. However, sending information through a network implies security concerns as untrustworthy entities may misuse the network in order to access and exploit information. Hence, any distributed system has to address security concerns. The less restricted the access to the network over which the distributed nodes communicate is, the higher the security concerns are for the distributed system.
Distributed Peer-to-Peer Systems
Peer-to-peer networks are a special kind of distributed system. They consist of individual computers (also called nodes), which make their computational resources (e.g., processing power, storage capacity, data, or network bandwidth) directly available to all other members of the network without having any central point of coordination. The nodes in the network are equal concerning their rights and roles in the system. Furthermore, all of them are both suppliers and consumers of resources.
Peer-to-peer systems have interesting applications such as file sharing, content distribution, and privacy protection. Most of these applications utilize a simple but powerful idea: turning the computers of the users into nodes that make up the whole distributed system. As a result, the more users or customers use the software, the larger and more powerful the system becomes. This idea, its consequences, and its challenges are discussed in the following steps.
Mixing Centralized and Distributed Systems
Centralized and distributed systems are architectural antipodes. Technical antipodes have always inspired engineers to create hybrid systems that inherit the strength of their parents. Centralized and distributed systems are no exception to this. There are two archetypical ways of combining these antipodes, and they need to be understood since they will become important when learning about blockchain applications in the real world. They are centrality within a distributed system and the distributed system inside the center.
The graphic on the left-hand side of Figure 2-2 illustrates an architecture that establishes a central component within a distributed system. At first glance, the components seem to form a distributed system. However, all of the circles are connected with the larger circle located in the middle. Hence, such a system appears to be distributed only when viewed superficially; in reality, it is a centralized system.
Figure 2-2. Mixing distributed with centralized architecture
The graphic on the right-hand side of Figure 2-2 illustrates the opposite approach. Such a system appears to be a centralized system at first glance, because all the circles in the periphery have only one direct connection to a large central component. However, the central component contains a distributed system inside. The components in the periphery may not even be aware of the distributed system that lives within the central component.
What these two approaches have in common is that it is hard to determine their true nature. Are they distributed or centralized? It may not be necessary to give these architectures unique names. However, it is important to point out their dual nature. This is particularly important because it may not be easy to spot the centrality or the distributed nature within them. I will come back to this point later when I discuss the way the blockchain is commercialized.
Identifying Distributed Systems
The emergence of hybrid architectures makes it hard to identify distributed systems clearly. Formulating a generally accepted definition of distributed systems is beyond the scope of this book. However, for the course of this book it is important to have an idea of what a distributed system is and how it differs from other software systems. If you are in doubt whether or not a system is distributed, look for a single component (e.g., a database, a name or user registry, a login or logoff component, or an emergency switch-off button) that could terminate the whole system. If you find such a component, the system under consideration is not distributed.
■ Note If one single component exists, e.g., a single switch-off button that can bring down the whole system, then the system is not distributed.
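The rule of thumb above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a system is modeled as a map of components and their connections, and "bringing down the whole system" is approximated as cutting the remaining components off from one another. The names star, ring, and single_points_of_failure are hypothetical and do not come from any particular tool.

```python
from collections import deque

def reachable(links, start, removed):
    """Return all components reachable from start when `removed` is switched off."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

def single_points_of_failure(links):
    """List every component whose removal disconnects the remaining components."""
    critical = []
    for candidate in links:
        others = [node for node in links if node != candidate]
        if not others:
            continue
        # If the survivors can no longer all reach each other, the candidate was critical.
        if reachable(links, others[0], candidate) != set(others):
            critical.append(candidate)
    return critical

# Hypothetical system in which every component talks only to a central hub.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
# Hypothetical system in which every component has two neighbors and no center.
ring = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}

print(single_points_of_failure(star))  # ['hub']
print(single_points_of_failure(ring))  # []
```

In the first system the hub plays the role of the switch-off button, so by the rule of thumb it would not count as distributed. The second system has no such component.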
The Purpose of the Blockchain
When designing a software system, one can choose which architectural style will be used, similar to choosing an engine for a car. The architectural decision can be made independently of the functional aspects of the application layer. As a result, one can create distributed as well as centralized systems with identical functionality on the application layer. The architecture is only a means to an end when it comes to implementing a system. Hence, a payment system, as was proposed in Table 2-1, can be implemented as a distributed or centralized system.
Each of the two architectural concepts has its own advantages and disadvantages and its own specific way of doing things. Choosing a specific architecture has consequences for how you will achieve the functional and nonfunctional aspects of a system. In particular, the two architectural concepts have very different approaches to ensuring integrity. And this is the point where the blockchain enters the picture. The blockchain is a tool for achieving integrity in distributed software systems. Hence, it can be seen as a tool to achieve a nonfunctional aspect of the implementation layer.
■ Note The purpose of the blockchain is to achieve and maintain integrity in distributed systems.
Outlook
Achieving integrity in a distributed system is very technical and it may sound a bit boring. However, what makes this achievement exciting for many people is what the distributed system will do and what kind of centralized system it replaces. The next step explains how a peer-to-peer system has changed our world and why the blockchain as a tool for achieving integrity in distributed software systems has the potential to change the world too.
Summary
• The architecture of a software system determines how its components are organized and related to one another.
• Centralized and distributed software architectures can be seen as antipodes.
• A distributed system consists of a number of independent computers that cooperate with one another by using a communication medium in order to achieve a specific objective without having any centralized element of control or coordination.
• As a rule of thumb, one can state that as soon as a system has a single component that could bring down the whole system it is not distributed, regardless of how complex its architecture looks.
• The blockchain is part of the implementation layer of a distributed software system.
• The purpose of the blockchain is to ensure a specific nonfunctional aspect of a distributed software system, that is, achieving and maintaining its integrity.
3
Recognizing the Potential
How peer-to-peer systems may change the world
This step deepens our understanding of the purpose of the blockchain by considering a specific kind of distributed system: the peer-to-peer system. As a result, this step will help you understand why there is so much excitement about the blockchain among technologists and business professionals alike.
This step also points out the major area of application in which the blockchain is expected to provide the most value. Additionally, this step discusses some consequences of peer-to-peer systems in the real world.
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_3
The Metaphor
Can you remember the last time you bought a CD for yourself in a music store or in a department store? Most people have not bought actual CDs for a long time now, because the music industry went through a dramatic change. Nowadays, people download individual songs from music portals, share mp3 files among friends, or use music streams on their mobile devices instead of buying CDs. This change started with the emergence of a piece of software that allowed people to share their music files with one another. But what was so special about that software? This is what one of its inventors had to say about this:
This system, what’s most interesting about it is, you’re interacting with peers, you’re exchanging information with a person down the street.
—Shawn Fanning, cofounder of Napster
What Fanning and his coworkers invented was a peer-to-peer system for sharing music. Back in the late 1990s, this software ushered in a new era for the established business model of the music industry. This step explains what the emergence of Napster, the decline of CD sales, and the dramatic changes of the music industry have to do with the blockchain.
How a Peer-to-Peer System Changed a Whole Industry
The music industry has worked for a long time in the following way: musicians made contracts with studios, which recorded the songs, produced and marketed the music records on a variety of media (e.g., vinyl, tape, or CD), which in turn were sold to the customers via a variety of distribution channels, including department stores and specialized shops. The studios actually worked as intermediaries between musicians and people who enjoy listening to music. Music studios could maintain their role as intermediaries due to their exclusive knowledge and skills in producing, marketing, and distributing records. However, in the first decade of the 2000s, the environment in which the music studios operated changed dramatically.
The digitalization of music, the availability of recording equipment at affordable prices, the growing spread of privately used PCs, and the emergence of the Internet made music studios dispensable. The three functions of music studios (producing, marketing, and distributing records) could be done by the artists and the consumers themselves. Napster played a major role in the replacement of the music studios as intermediaries. With Napster, people no longer relied on the music studios to get the latest hits. It was possible to share individual music files with people all over the world without the need to buy any CDs. The peer-to-peer approach of Napster, actually being a kind of a digital sharing bazaar for mp3 files, gave consumers access to a wider range of music than ever before, making the music studios partly dispensable and causing them significant losses.1
The Potential of Peer-to-Peer Systems
The Napster case taught us that peer-to-peer systems have the potential to reshape whole industries based on a simple idea: replacing the middleman with peer-to-peer interactions. In the case of the music industry, the traditional studios and their marketing and distribution channels that acted as the middlemen between artists and consumers have been replaced by peer-to-peer file sharing systems. The major characteristics that made the music industry so vulnerable to being replaced by peer-to-peer systems are the immaterial nature of music and the low costs of copying and transferring data.
The power of peer-to-peer systems is not restricted to the music industry. Each industry that mainly acts as a middleman between producers and customers of immaterial or digital goods and services is vulnerable to being replaced by a peer-to-peer system. This statement may sound a bit abstract, but you may discover many middlemen for immaterial and digital goods and services around you once you recognize the largest of them all: the financial industry.
What is it that you have in your bank account or on your credit or debit card? Is it really money? The money you own has been turned into immaterial bits and bytes long ago. Only a small amount of actual money exists as physical banknotes and coins. The vast majority of the world’s money and assets exists as immaterial bits and bytes in the centralized information technology systems of the financial industry. Banks and many other players of the financial industry are just middlemen between producers and consumers of bits and bytes that make up our money and our wealth. The act of borrowing, lending, or transferring money from one account to another is just the transfer of an immaterial good operated by middlemen, also called intermediaries. It is amazing how many middlemen are involved in seemingly simple transactions (e.g., transferring money from one bank account to another one in a different country involves up to five middlemen, which all need their processing time and impose their own fees). As a result, something as simple as transferring an amount of money from one bank account to another in a different country involves a long processing time and incurs high transaction costs. In a peer-to-peer system, the same transfer would be much simpler, and it would take less time and cost less, since it could be processed as what it is: a transfer of bits and bytes between two peers or nodes.
1 Hong, Seung-Hyun. The effect of Napster on recorded music sales: evidence from the consumer expenditure survey. Stanford Institute for Economic Policy Research Working Paper (2004): 3–18; Leyshon, Andrew. Scary monsters? Software formats, peer-to-peer networks, and the spectre of the gift. Environment and Planning D: Society and Space 21.5 (2003): 533–558.
The advantage of peer-to-peer systems over centralized systems is that direct interactions occur between contractual partners instead of indirect interactions through a middleman. Hence, processing takes less time and costs are lower.
The advantages of peer-to-peer systems are not restricted to money transfer. Every industry that mainly acts as a middleman between producers and customers of immaterial or digital goods and services is vulnerable to being replaced by a peer-to-peer system. As digitalization continues, more and more items of everyday life and an increasing number of goods and services will become immaterial and will benefit from the efficiencies of peer-to-peer systems. Advocates of peer-to-peer systems argue that digitalization and peer-to-peer networks will affect almost all aspects of our lives, such as payments, money saving, loans, and insurance, as well as the issuance and validation of birth certificates, driving licenses, passports, identity cards, educational certificates, patents, and labor contracts. Most of them already exist in digital form in centralized systems run by institutions that are nothing other than middlemen between natural suppliers and customers.
■ Note Replacing the middleman is also called disintermediation. It is considered a serious threat to many businesses and companies that mainly act as intermediaries between different groups of people, such as buyers and sellers, borrowers and lenders, or producers and consumers.
Terminology and the Link to the Blockchain
Now that you have learned about the potential of peer-to-peer systems, it is necessary to clarify the terminology of the problem domain and to explain its relation to the blockchain. In particular, the following points need to be discussed:
• The definition of a peer-to-peer system
• Architecture of peer-to-peer systems
• The link between peer-to-peer systems and the blockchain
The Definition of a Peer-to-Peer System
Peer-to-peer systems are distributed software systems that consist of nodes (individual computers), which make their computational resources (e.g., processing power, storage capacity, or information distribution) directly available to one another. When joining a peer-to-peer system, users turn their computers into nodes of the system that are equal concerning their rights and roles. Although users may differ with respect to the resources they contribute, all the nodes in the system have the same functional capability and responsibility. Hence, the computers of all users are both suppliers and consumers of resources.2
For example, in a peer-to-peer file sharing system, the individual files are stored on the users’ machines. When someone wants to download a file in such a system, he or she is downloading it from another person’s machine, which could be the next door neighbor or someone located halfway around the world.
Architecture of Peer-to-Peer Systems
Peer-to-peer systems are distributed computer systems by construction since they are made of individual nodes that share their computational resources with one another. However, there are also peer-to-peer systems that still utilize elements of centralization. Centralized peer-to-peer systems maintain central nodes to facilitate the interaction between peers, to maintain directories that describe the services offered by the peer nodes, or to perform look-ups and identification of the nodes.3 Centralized peer-to-peer systems typically utilize a hybrid architecture, such as the one that was illustrated on the left-hand side of Figure 2-2. Such an architecture makes it possible to combine the advantages of centralized and distributed computing. On the other hand, purely distributed peer-to-peer systems do not have any element of central control or coordination. Hence, all nodes in those systems perform the same tasks, acting both as providers and consumers of resources and services.
An example of a centralized peer-to-peer system is Napster, which maintained a central database of all nodes connected with the system and the songs available on these nodes.
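The division of labor in a centralized peer-to-peer system can be sketched as follows. This is a minimal illustration, not a description of how Napster was actually implemented: the classes CentralDirectory and Peer, their methods, and the file names are all hypothetical. The point it shows is that the central component only answers the question of who has a file, while the file itself moves directly from one peer to another.

```python
class CentralDirectory:
    """Hypothetical central index: it knows who has which file but stores no files."""

    def __init__(self):
        self.index = {}  # file name -> set of peer names offering it

    def register(self, peer_name, file_names):
        for file_name in file_names:
            self.index.setdefault(file_name, set()).add(peer_name)

    def lookup(self, file_name):
        return sorted(self.index.get(file_name, set()))


class Peer:
    """Hypothetical node that is both a supplier and a consumer of files."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)  # file name -> content

    def serve(self, file_name):
        return self.files[file_name]

    def download(self, file_name, directory, peers):
        # Step 1: ask the central directory which peers offer the file.
        owners = [p for p in directory.lookup(file_name) if p != self.name]
        if not owners:
            return None
        # Step 2: fetch the file directly from another peer, not from the center.
        content = peers[owners[0]].serve(file_name)
        self.files[file_name] = content
        # Step 3: this peer now supplies the file as well.
        directory.register(self.name, [file_name])
        return content


directory = CentralDirectory()
alice = Peer("alice", {"song.mp3": b"la la la"})
bob = Peer("bob", {})
peers = {"alice": alice, "bob": bob}
for peer in peers.values():
    directory.register(peer.name, peer.files)

print(bob.download("song.mp3", directory, peers))  # b'la la la'
print(directory.lookup("song.mp3"))                # ['alice', 'bob']
```

If the directory in this sketch were switched off, no peer could find any file, which is exactly the element of centralization that a purely distributed peer-to-peer system avoids.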
2 Tanenbaum, Andrew S., and Maarten van Steen. Distributed systems: principles and paradigms. Upper Saddle River, NJ: Pearson Prentice Hall, 2007.
3 Eberspächer, Jörg, and Rüdiger Schollmeier. First and second generation of peer-to-peer systems. In Peer-to-peer systems and applications. Berlin Heidelberg: Springer Verlag, 2005: 35–56.
The Link Between Peer-to-Peer Systems and the Blockchain
As discussed in Step 2, the blockchain can be considered a tool for achieving and maintaining integrity in distributed systems. Purely distributed peer-to-peer systems may use the blockchain in order to achieve and to maintain system integrity. Hence, the link between purely distributed peer-to-peer systems and the blockchain is its usage for achieving and maintaining integrity in purely distributed systems.
The Potential of the Blockchain
The relation between purely distributed peer-to-peer systems and the blockchain is that the former use the latter as a tool to achieve and maintain integrity. Hence, the argument that explains the excitement about and the potential of the blockchain is: Purely distributed peer-to-peer systems have a huge commercial potential as they can replace centralized systems and change whole industries due to disintermediation. Since purely distributed peer-to-peer systems may use the blockchain for achieving and maintaining integrity, the blockchain becomes important as well. However, what excites people most is the disintermediation. The blockchain is only a means to an end that helps to achieve that.
■ Note The excitement about the blockchain is based on its ability to serve as a tool for achieving and maintaining integrity in purely distributed peer-to-peer systems that have the potential to change whole industries due to disintermediation.
Outlook
This step explained what peer-to-peer systems are and highlighted their potential to change whole industries due to disintermediation. Additionally, this step pointed out that the excitement about the blockchain is due to its ability to help purely distributed peer-to-peer systems fulfill their tasks. However, the question of why achieving and maintaining integrity in distributed systems is so important has not been answered yet. The next step will discuss that question in more detail.
Summary
• Peer-to-peer systems consist of computers, which make their computational resources directly available to one another.
• The advantage of peer-to-peer systems is their ability to allow users to interact directly with one another instead of interacting indirectly through middlemen.
• Replacing middlemen with peer-to-peer systems increases processing speed and reduces costs.
• Peer-to-peer systems can be centralized or purely distributed.
• Purely distributed peer-to-peer systems form a network of equal members that interact directly with one another without having any central coordination.
• Napster demonstrated the power of peer-to-peer systems as its file sharing system ushered in a new era for the business model of the traditional music industry, which mainly acted as a middleman between artists and consumers.
• Every industry that mainly acts as a middleman between producers and customers of immaterial or digital goods and services is vulnerable to being replaced by peer-to-peer systems.
• A huge part of our financial system is simple intermediation between suppliers and consumers of money, which mainly exists as a digital or immaterial good. Hence, digitalization and peer-to-peer systems may reshape the financial industry in a similar fashion as Napster reshaped the music industry.
• As digitalization continues, more aspects of our everyday lives and an increasing number of goods and services will become immaterial and will benefit from the advantages of peer-to-peer systems.
• The excitement about the blockchain is based on its ability to serve as a tool for achieving and maintaining integrity in purely distributed peer-to-peer systems that have the potential to change whole industries due to disintermediation.
II
Why the Blockchain Is Needed
This stage explains the problem that the blockchain is supposed to solve and why solving this problem is important. This stage also deepens your understanding of the problem domain in which the blockchain is located, the environment in which it provides the most value, and its relation to trust, integrity, and the management of ownership. By the end of this stage, you will have gained a deeper understanding of the purpose of the blockchain and you will have reached a differentiated understanding of the term blockchain itself.
4
Discovering the Core Problem
How to herd a group of independent computers
The previous two steps pointed out the purpose of the blockchain in general and highlighted its importance for purely distributed peer-to-peer systems in particular. It turned out that maintaining integrity in distributed systems is the major purpose of the blockchain. But why is maintaining integrity in distributed systems, and in purely distributed peer-to-peer systems in particular, such a challenge? This step answers that question by discovering the subtle relation between trust and integrity of purely distributed peer-to-peer systems. As a result, this step will deepen your understanding of the importance of integrity and uncover the major problem to be solved by the blockchain. Finally, this step describes the environment in which the blockchain is expected to provide the most value.
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_4
The Metaphor
Many languages have a pictorial saying for describing the situation when someone tries to organize a chaotic group of individuals. For example, in English one would describe such a situation as trying to herd cats, as it illustrates the challenges of herding a group of obstinate and intractable animals that do not accept or even recognize a central authority. Does the problem of trying to organize a group of individuals who do not accept or recognize a central authority sound familiar? It happens that this is exactly the situation of a purely distributed peer-to-peer system, which consists of individual and independent nodes without having any kind of central control or coordination. This step explains a major challenge of purely distributed peer-to-peer systems and how it relates to the blockchain.
Trust and Integrity in Peer-to-Peer Systems
Trust and integrity are two sides of the same coin. In the context of software systems, integrity is a nonfunctional aspect of a system: the property of being safe, complete, consistent, correct, and free of corruption and errors. Trust is the firm belief of humans in the reliability, truth, or ability of someone or something without evidence, proof, or investigation. Trust is given in advance and will increase or decline based on the results of interactions on an ongoing basis.
With respect to peer-to-peer systems, this means that people will join and continue to contribute to a system if they trust it and if the results of interacting with the system on an ongoing basis confirm and reinforce their trust. Integrity of the system is needed in order to fulfill the expectations of the users and reinforce their trust in the system. If the trust of the users is not reinforced by the system due to a lack of integrity, the users will abandon the system, which, as a result, will eventually cause it to terminate. Due to the importance of trust for the existence of peer-to-peer systems, the major question is: How do we achieve and maintain integrity in a purely distributed peer-to-peer system?
Achieving and maintaining integrity in purely distributed systems depends on a variety of factors. Two of the most important are:
• Knowledge about the number of nodes or peers
• Knowledge about the trustworthiness of the peers
The chances of achieving integrity in a distributed peer-to-peer system are higher if the number of nodes as well as their trustworthiness are known. This situation is comparable to running a private club that adheres to high moral standards and utilizes a rigorous on-boarding process for new members. However, the worst circumstances for achieving integrity in a distributed peer-to-peer system are given when the number of nodes and their trustworthiness are unknown. This is the case when running a purely distributed peer-to-peer system on the Internet that is open to everyone.
Integrity Threats in Peer-to-Peer Systems
For simplicity, one can consider two major integrity threats in peer-to-peer systems:
• Technical failures
• Malicious peers
Technical Failures
Peer-to-peer systems consist of the individual computers of their users, which communicate via a network. All hardware and software components of a computer system as well as any component of a computer network carry the inherent risk of failing or creating errors. Hence, any distributed system has to face the problem that its components may fail or may produce wrong results by chance.
Malicious Peers
Malicious members are the second integrity threat in peer-to-peer systems. This source of untrustworthiness is not a technical problem, but rather a problem caused by the goals of the individuals who decide to exploit the system for their own purposes. One could say that this threat is more related to sociology and group dynamics than to technology. Dishonest and malicious peers pose the most severe threat to a peer-to-peer system, because they attack the foundation on which any peer-to-peer system is built: trust. As soon as users can no longer trust their peers, they will turn away and stop contributing computational resources to the system. Hence, the number of members will decline and the whole system will become less attractive to the remaining members, which in turn will accelerate the decline of the system until it is eventually abandoned completely.
The Core Problem to Be Solved by the Blockchain
Achieving integrity and trust in the best of all conditions is easy. The real challenge is to achieve integrity and trust in a distributed system in the worst of all conditions. And this is the problem that the blockchain is supposed to solve.
The core problem to be solved by the blockchain is achieving and maintaining integrity in a purely distributed peer-to-peer system that consists of an unknown number of peers with unknown reliability and trustworthiness. This problem is not a new one. It is actually a well-known and widely discussed problem in computer science. By utilizing a metaphor from the military, the problem is widely known as the Byzantine generals problem.1
1 Lamport, Leslie, Robert Shostak, and Marshall Pease. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems (TOPLAS) 4.3 (1982): 382–401.
■ Note The problem to be solved by the blockchain is achieving and maintaining integrity in a purely distributed peer-to-peer system that consists of an unknown number of peers with unknown reliability and trustworthiness.
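To see why unknown trustworthiness is such a problem, consider a toy illustration in Python. It is not the mechanism used by the blockchain; it only assumes a naive approach in which a node asks all peers for a value and believes the most common answer. The values TRUE_BALANCE and FORGED_BALANCE and both function names are hypothetical.

```python
from collections import Counter

TRUE_BALANCE = 100    # hypothetical correct value reported by honest peers
FORGED_BALANCE = 0    # hypothetical wrong value reported by malicious peers

def collect_answers(honest, malicious):
    """Simulate asking every peer: honest peers tell the truth, malicious peers lie."""
    return [TRUE_BALANCE] * honest + [FORGED_BALANCE] * malicious

def majority_answer(answers):
    """Naive rule: believe whatever value most peers reported."""
    value, _count = Counter(answers).most_common(1)[0]
    return value

# Best of all conditions: most peers are honest, so the naive rule works.
print(majority_answer(collect_answers(honest=7, malicious=3)))  # 100

# Worst of all conditions: dishonest peers dominate and the wrong value wins.
print(majority_answer(collect_answers(honest=3, malicious=7)))  # 0
```

A node in an open system cannot know which of the two situations it is in, because it knows neither how many peers exist nor how many of them are honest. That uncertainty is the core problem described above.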
Outlook
This step highlighted the importance of integrity and trust in peer-to-peer systems. Furthermore, this step pointed out the core problem to be solved by the blockchain and emphasized its importance for achieving integrity and trust in peer-to-peer systems. However, a definition of the term blockchain is still missing. This will be the subject of the next step.
Summary
• Integrity and trust are major concerns of peer-to-peer systems.
• People will join and continue to contribute to a peer-to-peer system if they trust it and if the results of interacting with the system on an ongoing basis confirm and reinforce that trust.
• As soon as people lose trust in a peer-to-peer system, they will abandon it, which in turn will cause the system to terminate eventually.
• Major integrity threats in peer-to-peer systems are:
  • Technical failures
  • Malicious peers
• Achieving integrity in a peer-to-peer system depends on:
  • The knowledge about the number of peers
  • The knowledge about the trustworthiness of the peers
• The core problem to be solved by the blockchain is achieving and maintaining integrity in a purely distributed peer-to-peer system that consists of an unknown number of peers with unknown reliability and trustworthiness.
5
Disambiguating the Term
Four ways to define the blockchain
In the preceding steps you learned about the major purpose of the blockchain and the relation between trust and integrity of the software system. As a result, you gained a well-grounded appreciation of the purpose of the blockchain, but you are still missing a definition of the term blockchain itself. This step will turn your attention to the definition of the term and explain its different usages. This step will present a provisional definition of blockchain, which will guide you through the remainder of this book. Finally, this step explains why the management of ownership is a prominent application case of the blockchain.
The Term
In this discussion about the blockchain, the term is used as follows:
• As a name for a data structure
• As a name for an algorithm
• As a name for a suite of technologies
• As an umbrella term for purely distributed peer-to-peer
systems with a common application area
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_5
Step 5 | Disambiguating the Term
A Data Structure
In computer science and software engineering, a data structure is a way to organize data regardless of their concrete informational content. You can think about a data structure in terms of a floor plan for a building in architecture. A floor plan for a building addresses separating and connecting space with walls, floors, and stairs regardless of their concrete usage. When used as a name for a data structure, blockchain refers to data put together into units called blocks. One can think of these blocks much like pages in a book. These blocks are connected to one another like a chain, hence the name blockchain. In relation to a book, the words and sentences are the information to be stored.
They are written on different pages instead of being written on a large spool.
The pages are connected with one another via their position in the book and via the page numbers. You can determine if someone removed a page from the book by checking whether the page numbers continue without leaving out a number. Furthermore, the information on the pages as well as the pages within the book are ordered. The ordering is an important detail, which will be used extensively. Additionally, the chaining of the data blocks in the data structure is achieved by using a very special numbering system, which differs from the page numbering in ordinary books.
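The book analogy can be made concrete with a small sketch. The following Python example is only an illustration under stated assumptions: the names Block, build_chain, and is_intact are hypothetical, and using a SHA-256 hash of the preceding block as the "special numbering system" is an assumption made for this sketch rather than something this step has established.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One unit of data, comparable to a page in a book (hypothetical name)."""
    number: int        # position in the chain, like a page number
    content: str       # the information stored in this block
    previous_ref: str  # reference to the preceding block ("" for the first)

    def reference(self) -> str:
        # Assumption: the reference is a SHA-256 hash over the block's fields.
        payload = f"{self.number}|{self.content}|{self.previous_ref}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(contents):
    """Put each piece of content into a block and link it to its predecessor."""
    chain = []
    previous_ref = ""
    for number, content in enumerate(contents):
        block = Block(number, content, previous_ref)
        chain.append(block)
        previous_ref = block.reference()
    return chain


def is_intact(chain):
    """Check that no block was removed, reordered, or altered."""
    previous_ref = ""
    for expected_number, block in enumerate(chain):
        if block.number != expected_number:
            return False
        if block.previous_ref != previous_ref:
            return False
        previous_ref = block.reference()
    return True


chain = build_chain(["page one", "page two", "page three"])
```

In this sketch, removing the middle block or changing its content makes is_intact return False, which mirrors checking whether the page numbers of a book continue without a gap.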
An Algorithm
In software engineering, the term algorithm refers to a sequence of instructions to be completed by a computer. These instructions often involve data structures. When used as a name for an algorithm, blockchain refers to a sequence of instructions that negotiates the informational content of many blockchain-data-structures in a purely distributed peer-to-peer system, similar to a democratic voting schema.
A Suite of Technologies
When used to refer to a suite of technologies, blockchain refers to a combination of the blockchain-data-structure, the blockchain-algorithm, as well as cryptographic and security technologies that combined can be used to achieve integrity in purely distributed peer-to-peer systems, regardless of the application goal.
An Umbrella Term for Purely Distributed Peer-to-Peer
Systems with a Common Application Area
Blockchain can also be used as an umbrella term for purely distributed peer-to-peer systems of ledgers that utilize the blockchain-technology-suite. Note that in this context blockchain refers to a purely distributed system as a whole instead of referring to a software unit that is part of a purely distributed system.
The Usage of the Term in This Book
Throughout the rest of this book, blockchain is used as shorthand for the umbrella term for purely distributed peer-to-peer systems of ledgers that utilize the blockchain-technology-suite. If any other meaning is intended, I will indicate this by explicitly using the term blockchain-data-structure, blockchain-algorithm, or blockchain-technology-suite.
■ Note The technology that is nowadays regarded as blockchain was proposed in 2008 under the pseudonym Satoshi Nakamoto,1 whose true identity has not yet been revealed.
Provisional Definition
The following definition is not complete. It still lacks important details that have not yet been presented. However, this definition serves as an intermediate step toward a more complete understanding of the term:
The blockchain is a purely distributed peer-to-peer system of ledgers that utilizes a software unit that consists of an algorithm, which negotiates the informational content of ordered and connected blocks of data together with cryptographic and security technologies in order to achieve and maintain its integrity.
The Role of Managing Ownership
The provisional definition does not say anything about Bitcoin or managing ownership of cryptographic money. This may come as a surprise since many articles and books written about the blockchain claim that its purpose is to manage ownership of digital currencies. The truth is, managing ownership of cryptographic money is a very prominent and natural application case of the blockchain, but it is not the only one. The blockchain has a wide and diverse range of applications. However, there are two reasons why the management of ownership of digital goods is the most discussed application of the blockchain.
1Nakamoto, Satoshi. Bitcoin: a peer-to-peer electronic cash system. 2008. https://bitcoin.org/
First, it is the easiest to understand and to explain. Second, it is the use case with the most impact on the economy. The concept of ownership and the enforcement of ownership rights are core elements of almost every human society (even some animals have the concept of ownership and fight over its enforcement). A huge proportion of the activities of banks, insurance companies, custodians, lawyers, courts, solicitors, and consulates are concerned with just the management of ownership rights or their enforcement. Hence, managing ownership is a multibillion dollar market, and any technical innovation that could change the way we manage ownership will have a huge impact.
It turns out that the blockchain can indeed dramatically change the way we manage ownership.
The Application Area of the Blockchain in This
Book
The blockchain as a technology suite used for managing distributed peer-to-peer systems of ledgers can have many specific applications, such as managing ownership of digital goods or cryptographic currencies. This book deliberately does not focus on one specific application, because discussing a single application case in great detail would distract from the core concepts. To make it easier for you to understand the blockchain, however, this book considers the general application case of managing and clarifying ownership regardless of the specific good whose ownership is managed. As a result, the general goal of managing and clarifying ownership will provide some mental guidance along your learning path and help to create a mental picture of the blockchain.
Outlook
This step clarified the term blockchain and provided a provisional definition.
This book considers the general application case of managing and clarifying ownership in order to explain the blockchain, but there really needs to be a discussion of ownership in more detail. A more detailed understanding of ownership will help you to understand the functioning of the blockchain. The next step will explore the foundation of ownership in more detail.
Summary
• The term blockchain is ambiguous; it has different mean-
ings for different people depending on the context.
• Blockchain can refer to:
• A data structure
• An algorithm
• A suite of technologies
• A group of purely distributed peer-to-peer systems
with a common application area
• Managing and clarifying ownership is the most prominent
application case of the blockchain but is not the only one.
• The blockchain is a purely distributed peer-to-peer sys-
tem of ledgers that utilizes a software unit that consists
of an algorithm, which negotiates the informational con-
tent of ordered and connected blocks of data together
with cryptographic and security technologies in order to
achieve and maintain its integrity.
6
Understanding
the Nature of
Ownership
Why we know what we own
Step 5 provided a preliminary definition of the blockchain and insight into why the management of ownership is regarded as its most prominent application case. This step deepens the relation between the blockchain and its prominent use case of managing ownership. In particular, this step reveals the connection between trust and integrity of purely distributed peer-to-peer systems, on the one hand, and managing ownership, on the other hand. In addition, this step also provides some general insights into the nature of ownership and introduces basic security concepts.
The Metaphor
Imagine the following situation. At home you are packing an apple into your bag for lunch. On your way to the office, you decide to go into a supermarket to buy a sandwich and some cookies. At the checkout, you open your bag to collect the items you are buying. Just at this moment the supermarket employee looks at you and sees the apple in your bag, which happens to be the same kind of apple sold at the supermarket. What would the employee be thinking at this moment? He could falsely conclude from his observation that you may have stolen the apple from his store. Unfortunately, the supermarket does not have any surveillance cameras or security personnel, and you are the only customer at this moment. So how could you prove that you did not steal the apple?
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_6
Step 6 | Understanding the Nature of Ownership
Ownership and Witnesses
Have you ever thought about what makes you the owner of the things that belong to you? Probably, because you are still thinking about the apple in the supermarket story! So what makes you the owner of the apple in your bag?
How can you prove that you have not stolen it from the supermarket?
So imagine you are in front of a court that is hearing your alleged apple-theft case. How would you prove that you are the owner of the apple? In the supermarket example, you would be cleared if no one could testify that you had stolen the apple. However, being cleared of the suspicion of being a thief is not proof of ownership. So let’s stick to the question of proving your ownership.
It would be of great help if someone could testify that you had bought the apple before you went to the supermarket. Luckily, you remember the shop where you bought the apple, and the employee who sold the apple to you is willing to testify to this. But you underestimated the prosecutor. He cross-examines your witness and asks hard questions: Can he remember the apple he sold to you? Can he identify the specific apple he sold to you as the apple found in your bag? Can he identify you as the person who bought that particular apple? And finally, why does he remember all these details in the first place? Could it be possible that you paid the witness money for testifying to your innocence? So this comes down to a basic principle: having one witness is good, but having many independent witnesses is the key to convincing the prosecutor of your innocence.
The last point is extremely important. The more independent witnesses who testify to the same fact, the higher the chance that this fact is indeed true. It turns out that this idea will be one of the core concepts of the blockchain.
Foundations of Ownership
Taking the findings of the previous section to a more abstract level, one can state that proving ownership involves three elements:
• An identification of the owner
• An identification of the object being owned
• A mapping of the owner to the object
The testimony of witnesses accomplishes all of these. Historically, eyewitnesses have often been the only means of clarifying these elements. However, relying on the oral testimony of witnesses is time-consuming. As a result, witnesses have largely been replaced by documents issued by trustworthy entities.
Nowadays, we can identify people with ID cards, birth certificates, and driver’s licenses. Serial numbers, production dates, production certificates, or a detailed description can be used to identify objects. These documents do not change once they are created because the identities of people and objects do not change.
The mapping between owners and objects is typically done with a ledger or register. This is not a document that stays constant once created. Every transfer of ownership needs to be documented in such a register because an outdated register or ledger cannot be a trustworthy witness for testifying ownership. The importance of having an up-to-date and orderly managed register has led to the development of special institutions in many societies.
The more valuable certain kinds of objects are, the higher the chance for the existence of a government-regulated ledger that documents the ownership of those objects. Most of these ledgers are open to everyone in order to make it easy to verify ownership and provide easy access to clarify ownership. You may do some research on your own to identify some of these ledgers in your country and to what they testify. I found ledgers for documenting ownership of real estate, patents, ships, airplanes, and companies. I even found registers for marriages, births, and deaths.
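The three elements of a proof of ownership can be sketched as a tiny register in code. This is a minimal illustration, not a description of any real registry: the class name Ledger, its methods, and the identifiers such as "house-1" and "alice" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """A register mapping identified objects to identified owners (hypothetical)."""
    owner_of: dict = field(default_factory=dict)  # object id -> owner id
    history: list = field(default_factory=list)   # every documented transfer

    def register(self, object_id, owner_id):
        """Document the first owner of an object."""
        if object_id in self.owner_of:
            raise ValueError(f"{object_id} is already registered")
        self.owner_of[object_id] = owner_id
        self.history.append((object_id, None, owner_id))

    def transfer(self, object_id, seller_id, buyer_id):
        """Document a transfer; only the current owner may hand over an object."""
        if self.owner_of.get(object_id) != seller_id:
            raise PermissionError(f"{seller_id} does not own {object_id}")
        self.owner_of[object_id] = buyer_id
        self.history.append((object_id, seller_id, buyer_id))

    def prove_ownership(self, object_id, claimed_owner_id):
        """Answer whether the ledger testifies to the claimed ownership."""
        return self.owner_of.get(object_id) == claimed_owner_id


ledger = Ledger()
ledger.register("house-1", "alice")
ledger.transfer("house-1", "alice", "bob")
```

The sketch shows why the register cannot stay constant: after the transfer, it testifies for "bob" and no longer for "alice", and the history list keeps the documented sequence of transfers.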
Figure 6-1 depicts the relation of the different concepts involved when designing software for managing ownership.
Figure 6-1. Concepts of ownership
In Figure 6-1, the concepts in the top layers are more general than those in the lower layers. The concepts on each layer can be seen as realizations of the concepts in layers above them. For example, the proof of ownership requires identification of owners and property alike as well as the mapping between owners and property. The use of ownership requires identification as well as authentication and authorization to ensure that only the legitimate person uses the property. The boxes in the very bottom row represent the implementation layer. They show, for example, that password and signature are concepts used to implement authentication and authorization. A ledger can be seen as a concrete implementation of a mapping between owners to their property.
A Short Detour to Security
Figure 6-1 used three major security related concepts that need to be explained in more detail, as their meaning in the context of software systems might be a bit different from their common usage:
• Identification
• Authentication
• Authorization
The meaning and interrelation of these three concepts can be illustrated by a real-world example. Suppose you attempt to buy a bottle of wine in a liquor shop. Liquor shops are not allowed to sell alcoholic drinks to those who are underage. How does the liquor shop ensure that it sells wine only to the right people? The liquor shop accomplishes this by using identification, authentication, and authorization. Here is how this works.
Identification
Identification just means to claim to be someone by stating a name or anything else that could be used as an identifier.1 In the liquor shop example, one could claim to be a certain person by stating a name. Identification does not prove that you really are who you claim to be. Identification does not involve the proof that you are not underage. Identification just means claiming to be a certain person.
Authentication
The purpose of authentication is to prevent someone from claiming to be someone else. Authentication means verifying or proving that you really are who you claim to be.1 This proof can be provided by something you have or something you know (e.g., an ID card, a driver’s license, or some details of the life of the person you claim to be). It is important that the proof of your claimed identity is uniquely connected to you (e.g., a photograph of your face, a fingerprint, or something else that identifies you uniquely). In the liquor shop example, this means that you can prove that you really are who you claimed to be by showing a driver’s license that contains a photograph of you. Comparing your face with the photograph on the driver’s license accomplishes the verification. If you look like the person in the photograph, the authentication is successful. Otherwise, the authentication fails. Checking one’s face against the photograph on the driver’s license aims to prevent someone from using someone else’s driver’s license.
Authorization
Authorization means granting access to specific resources or services due to the characteristics or properties of one’s identity.1 Authorization is the consequence of both a successful authentication and evaluation of one’s characteristics or rights. In the liquor shop example, authorization means to decide whether you are allowed to buy a bottle of wine based on the date of birth shown on your driver’s license. The shop assistant will refuse to sell you a bottle of wine if you are too young based on the date of birth shown on your driver’s license. Note that in this case the refusal is not due to a failed authentication. Identification and authentication worked well, and because of the correct identification, the shop assistant can identify you as an underage person. Hence, authorization is always the result of evaluating the characteristics or properties of the previously authenticated identity against some rules.
1Van Tilborg, Henk, and Sushil Jajodia, eds. Encyclopedia of cryptography and security. New York: Springer Science & Business Media, 2014.
■ Note Identification means claiming to be someone. Authentication means proving that you really are who you claimed to be. Authorization means getting access to something due to the previously authenticated identity.
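The liquor shop example can be written down as three separate checks. The following Python sketch is only an illustration: the names DriversLicense, authenticate, authorize, and may_buy_wine are hypothetical, the photo is reduced to a plain string, and the minimum age of 18 is an assumption, since the text does not state a legal age.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DriversLicense:
    name: str            # the identity the document vouches for
    photo: str           # stand-in for a photograph of the holder's face
    date_of_birth: date


def authenticate(claimed_name, face, license_):
    """Verify the claim: the license must match both the name and the face."""
    return license_.name == claimed_name and license_.photo == face


def authorize(license_, today, minimum_age=18):
    """Evaluate a property of the authenticated identity against a rule."""
    dob = license_.date_of_birth
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    age = today.year - dob.year - (0 if had_birthday else 1)
    return age >= minimum_age  # minimum_age=18 is an assumed rule


def may_buy_wine(claimed_name, face, license_, today):
    """Identification is the claimed name; the other two steps follow."""
    return authenticate(claimed_name, face, license_) and authorize(license_, today)


adult = DriversLicense("Alice", "face-of-alice", date(2000, 5, 17))
minor = DriversLicense("Tim", "face-of-tim", date(2010, 5, 17))
today = date(2024, 1, 1)
```

Note how the minor passes authentication but fails authorization, while someone presenting a license with a face that is not theirs fails at authentication already.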
Purposes and Properties of a Ledger
Figure 6-2 illustrates how the proof of ownership and transfer of ownership relate to the purpose and the properties of a ledger.
Figure 6-2. Concepts and principles of a ledger
The major lesson to be learned from Figure 6-2 is the fact that a ledger has
to fulfill two opposing roles. On the one hand, a ledger serves as a means for proving ownership, which relies on reading historic data preserved in the ledger. On the other hand, the ledger has to document any transfer of ownership, which in turn implies that new data are produced and written to the ledger.
One of the most important differences of these two purposes can be summarized in the opposing nature of transparency and privacy.
Proving ownership is easier when the ledger is open to anyone. Hence, transparency is the basis of proving ownership rights, in a similar way to witnesses giving public testimony in court. However, transferring ownership must be restricted exclusively to the lawful owner. So privacy forms the basis of transferring ownership. Since writing in the ledger means changing ownership, only highly trusted entities should be given write access to ledgers.
The conflicting forces of transparency vs. privacy, proving ownership vs. transferring ownership, and reading the ledger vs. writing the ledger can also be found in the blockchain. It turns out that the blockchain is a gigantic distributed peer-to-peer system of ledger-like data structures that can be read by everyone.
Ownership and the Blockchain
A witness in the form of a government-regulated ledger is the key in clarifying ownership of valuable goods. But what happens if such a ledger is damaged or destroyed? Or what happens if someone responsible for updating the ledger makes an error or forges it on purpose? In this case, the ledger does not reflect reality. This is disastrous because everybody believes that the ledger represents the truth, similar to a witness in court.
The problem of having only one ledger as the source for clarifying ownership can be solved in the same way as it has been solved for trials in court. Basing a verdict only on the testimony of one single witness is risky since this witness could be dishonest. Having more witnesses is better. The more independent witnesses who are interrogated, the higher the chance that those facts that are consistently mentioned among the majority of testimonies reflect the truth.
This fact can be proved by means of statistics and the law of large numbers.
Having many witnesses who independently make their own observations free of mutual influences is the key for this approach to finding the truth.
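The effect of adding witnesses can be illustrated with a short calculation. The sketch below assumes a simplified model that is not part of the text: every witness is independently correct with the same probability, and the verdict follows the majority. The function name majority_correct_probability is hypothetical.

```python
from math import comb


def majority_correct_probability(witnesses, p_correct):
    """Probability that more than half of the witnesses state the truth.

    Assumes independent witnesses with identical reliability and an odd
    number of witnesses, so that ties cannot occur.
    """
    if witnesses % 2 == 0:
        raise ValueError("use an odd number of witnesses to avoid ties")
    needed = witnesses // 2 + 1
    return sum(
        comb(witnesses, k) * p_correct**k * (1 - p_correct) ** (witnesses - k)
        for k in range(needed, witnesses + 1)
    )


one = majority_correct_probability(1, 0.7)       # a single witness: 0.7
three = majority_correct_probability(3, 0.7)     # approximately 0.784
eleven = majority_correct_probability(11, 0.7)   # higher still
```

Under these assumptions, the chance that the majority is right grows with the number of witnesses as long as each witness is right more often than wrong. If the witnesses influence one another, the independence assumption fails and the calculation no longer applies.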
Applying this finding to the use of a ledger for clarifying ownership is straightforward: Instead of maintaining only one single ledger that could be forged, one should utilize a purely distributed peer-to-peer system of ledgers and clarify requests concerning ownership on that version of the reality on which the majority of peers agrees.
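A minimal sketch of answering an ownership request from many ledgers might look as follows. It assumes that each peer's ledger is a plain dictionary and that a simple majority count decides; the function name majority_owner and the sample data are hypothetical, and the sketch says nothing about how the blockchain-algorithm actually reaches agreement.

```python
from collections import Counter


def majority_owner(peer_ledgers, object_id):
    """Return the owner named by more than half of the ledgers, else None."""
    if not peer_ledgers:
        return None
    votes = Counter(ledger.get(object_id) for ledger in peer_ledgers)
    owner, count = votes.most_common(1)[0]
    return owner if count > len(peer_ledgers) / 2 else None


honest_1 = {"house-1": "bob"}
honest_2 = {"house-1": "bob"}
forged = {"house-1": "mallory"}  # one ledger was manipulated

verdict = majority_owner([honest_1, forged, honest_2], "house-1")
```

A single forged ledger is outvoted here, whereas a system with only one ledger would have no way of noticing the forgery.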
At this point you might be wondering what all this has to do with the blockchain. The relation between managing ownership with a ledger and the blockchain is summed up as:
• An individual ledger is used for maintaining information
about ownership, which is equivalent to one blockchain-data-
structure storing ownership-related data.
• The individual ledgers are stored on the computers
(nodes) of a peer-to-peer system.
• The blockchain-algorithm is responsible for letting the
individual nodes collectively arrive at one consistent ver-
sion of the state of ownership on which the final verdict
is based.
• Integrity in this system is its ability to make true state-
ments about ownership.
• Cryptography is necessary for creating a trustworthy
means of identification, authentication, and authorization
and ensuring data security.
Outlook
This step highlighted important characteristics of ownership and how they relate to the properties of ledgers. Furthermore, this step sketched how the blockchain relates to ownership and ledgers. The next step discusses an important consequence of having ownership managed in a purely distributed peer-to-peer system of ledgers.
Summary
• A proof of ownership has three elements:
• Identification of the owner
• Identification of the object being owned
• Mapping the owner to the object
• ID cards, birth certificates, and driver’s licenses as well as serial numbers, production dates, production certificates,
or a detailed object description can be used in order to
identify owners and objects.
• The mapping between owners and objects can be main-
tained in a ledger, which plays the same role as a witness
in a trial.
• Having only one ledger is risky since it can be damaged,
destroyed, or forged. In this case, the ledger is no longer
a trustworthy source for clarifying ownership.
• Instead of using only one central ledger, one can utilize a group of independent ledgers for documenting ownership and clarify requests concerning the ownership on
that version of the reality on which the majority of led-
gers agrees.
• It is possible to create a purely distributed peer-to-peer
system of ledgers by using the blockchain-data-structure.
Each blockchain-data-structure represents one ledger
and is maintained by one node of the system. The block-
chain-algorithm is responsible for letting the individual
nodes collectively arrive at one consistent version of the
state of ownership. Cryptography is used to implement
identification, authentication, and authorization.
• Integrity of a purely distributed peer-to-peer system of
ledgers is found in its ability to make true statements
about ownership and to ensure that only the lawful
owner can transfer his or her property rights to others.
7
Spending
Money Twice
Exploiting a vulnerability of distributed
peer-to-peer systems
In the previous step, you learned about the relation between purely distributed peer-to-peer systems and the most prominent use case of the blockchain as a means to manage ownership. You also learned that the integrity of a distributed peer-to-peer system of ledgers is found in its ability to make true statements about ownership and to ensure that only the lawful owner can transfer his or her property rights to others. But what does this statement mean in real life? What happens if integrity is violated? This step considers these questions in more detail. In particular, this step introduces one of the most important examples of violated integrity in distributed peer-to-peer systems: the double spending problem.
The Metaphor
Counterfeiting bank notes is a severe crime in any country because it undermines the foundation and functioning of the economy by creating purchasing power that is not backed up by valuable resources. As a result, most bank notes are equipped with security features that make counterfeiting impossible or at least prohibitively costly. These security features, such as unique numbers, watermarks, or fluorescent fibers, work well with physical bank notes and other physical goods. But what happens if money or goods become digital and are managed in distributed peer-to-peer systems of ledgers? This step explains a specific vulnerability of distributed peer-to-peer systems used for managing ownership that is equivalent to counterfeiting bank notes. As it turns out, this vulnerability is a prominent example of violated system integrity.
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_7
Step 7 | Spending Money Twice
The Double Spending Problem
Let’s consider a peer-to-peer system for managing ownership of real estate. In such a system, the ledgers that keep track of ownership information are maintained by the individual computers of its members instead of being maintained in a central database. Hence, each peer maintains his or her own copy of the ledger. As soon as the ownership of a house is transferred from one person to another, all the ledgers of the system need to be updated in order to contain the latest version of reality. However, passing information forward among peers and updating the individual ledgers require time. Until the last member of the system receives the new information and updates his or her copy of the ledger, the system will not be consistent. Some peers already know about the latest transfer of ownership, while other peers have not yet received that information. The fact that not all ledgers have up-to-date information makes them prone to be exploited by anyone who already has the latest information.
Let’s also imagine the following situation. Person A sells his house to person B. The transfer of ownership from A to B is documented in one of the ledgers in the peer-to-peer system. The peer maintaining this ledger needs to inform other peers about this transfer, who in turn inform other peers as well, until eventually all peers learn about the transfer of ownership from A to B. However, suppose that person A quickly approaches another peer of the system and demands that it document a different transfer of ownership of the identical house: the sale from person A to person C. If this peer has not yet learned about the earlier transfer of ownership from A to B, it will approve and document the transfer of ownership from A to C for the identical house.
Hence, A was able to sell his house twice by exploiting the fact that distributing information about his first sale requires time. But B and C cannot own the house at the same time. Only one of them is supposed to be the new and lawful owner. Hence, the situation is called the double spending problem.
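The scenario can be replayed in a few lines of code. The sketch below is a deliberately simplified model: the class Peer, its method record_transfer, and the peer names are hypothetical, and the propagation of information is modeled simply by the order in which the calls happen.

```python
class Peer:
    """A member of the system that keeps its own copy of the ledger."""

    def __init__(self, name, ledger):
        self.name = name
        self.ledger = dict(ledger)  # each peer holds an independent copy

    def record_transfer(self, item, seller, buyer):
        """Approve a transfer only if the local copy names the seller as owner."""
        if self.ledger.get(item) != seller:
            return False
        self.ledger[item] = buyer
        return True


# Both peers start with the same information: A owns the house.
peer_1 = Peer("peer-1", {"house": "A"})
peer_2 = Peer("peer-2", {"house": "A"})

# A sells to B at peer-1, then quickly sells to C at peer-2,
# before news of the first sale has reached peer-2.
first_sale = peer_1.record_transfer("house", "A", "B")
second_sale = peer_2.record_transfer("house", "A", "C")

# A peer that has already received the first sale rejects the second one.
informed_peer = Peer("peer-3", peer_1.ledger)
late_attempt = informed_peer.record_transfer("house", "A", "C")
```

Both sales are approved, and the two ledgers now disagree about who owns the house. The same request fails at a peer that already knows about the sale to B, which shows that the vulnerability exists only during the time it takes the information to spread.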
The Term
Similar to the term blockchain, the term double spending is ambiguous as it is used to refer to the following concepts:
• A problem caused by copying digital goods
• A problem that may appear in distributed peer-to-peer
systems of ledgers
• An example of violated integrity in purely distributed
peer-to-peer systems
Double Spending as a Problem of Copying Digital
Goods
In the context of copying digital goods, the double spending problem refers to the fact that data on a computer can be copied without noticeable limitations.
This fact causes problems with digital money or any other data that are supposed to have only one owner at a given time. Copying makes it possible to replicate data that represent pieces of digital money and use them more than once for making payments. This is the digital equivalent to replicating bank notes with a copying machine. Besides being technically possible, the copying of digital money violates the core principle of money: an identical piece of money cannot be given to different people at the same time. The ability to copy and spend digital money multiple times renders the money useless, hence, the double spending problem.
Double Spending as a Problem of Distributed
Peer-to-Peer Systems of Ledgers
When used to describe the problem of a distributed peer-to-peer system of ledgers, double spending problem refers to the fact that forwarding information to all elements of such a system requires time, thus not all peers have the same ownership information at the same time. Because not all peers have up-to-date information, they are prone to be exploited by anyone who already has the latest information. As a result, one may be able to transfer ownership more than once, resulting in double spending.
Double Spending as an Example of Violated Integrity
in Distributed Peer-to-Peer Systems
The use of distributed peer-to-peer systems is not restricted to managing ownership. However, the problem of forwarding information among peers and updating the data maintained by the members of the system stays the same, regardless of the specific application domain. Hence, on a more abstract level, the double spending problem can be seen as a problem of maintaining data consistency in distributed peer-to-peer systems. Since data consistency is one aspect of system integrity, one could say that the double spending problem is a specific example of violated system integrity.
How to Solve the Double Spending Problem
Because double spending can have different meanings, there is no single way to prevent it. Instead, many different solutions may exist. The following sections describe some of them.
Solving Double Spending as a Problem of Copying
Digital Goods
The problem of spending digital money or any other digital assets more than once just by copying the data is actually a problem related to the nature of ownership. Any accepted means of mapping data that represents digital goods to their owners will solve that problem, regardless of its specific implementation. Even a physical central book or (more realistically) an electronic ledger, regardless of its architecture (centralized or peer-to-peer), can ensure that a digital good will only be spent once, provided the ledger works correctly all the time.
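The point that an accepted mapping from goods to owners is sufficient can be illustrated briefly. In the sketch below, a piece of digital money is just a dictionary that can be copied at will; the class CentralLedger, the coin identifier "coin-42", and the participant names are hypothetical, and the ledger is assumed to work correctly all the time.

```python
import copy


class CentralLedger:
    """A single, always-correct register of who owns which coin (assumption)."""

    def __init__(self):
        self.owner_of = {}

    def issue(self, coin_id, owner):
        self.owner_of[coin_id] = owner

    def spend(self, coin, payer, payee):
        """Accept a payment only if the ledger names the payer as owner."""
        if self.owner_of.get(coin["id"]) != payer:
            return False
        self.owner_of[coin["id"]] = payee
        return True


coin = {"id": "coin-42", "value": 10}
coin_copy = copy.deepcopy(coin)  # copying the data is trivially possible

ledger = CentralLedger()
ledger.issue("coin-42", "alice")

first_payment = ledger.spend(coin, "alice", "bob")          # accepted
second_payment = ledger.spend(coin_copy, "alice", "carol")  # rejected
```

The copy is indistinguishable from the original as data, yet it cannot be spent again because the ledger, not the data, decides who owns the coin.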
Solving Double Spending as a Problem of a
Distributed Peer-to-Peer System of Ledgers
In this context, the architecture as well as the application domain of the system are given. Distributed peer-to-peer systems of ledgers are often regarded as the classical example from which to derive the blockchain. The explanations provided in Step 6 highlighted the relation between the blockchain and distributed peer-to-peer systems of ledgers. Hence, the blockchain, as this term is used throughout this book, can be seen as a solution to the double spending problem in a distributed peer-to-peer system of ledgers.
Solving Double Spending as an Example of Violated
Integrity in Distributed Peer-to-Peer Systems
In this context, the architecture of the system is specified but the application domain is left unspecified. Hence, solutions on this level focus on achieving and maintaining integrity in distributed peer-to-peer systems, regardless of their concrete usage. However, the concrete usage of a distributed peer-to-peer system determines the meaning of integrity. For example, a simple file-sharing application may consider different aspects for defining integrity as compared to a system that manages ownership in a digital currency. As a result, the question of whether the blockchain-technology-suite is the right tool for achieving and maintaining system integrity cannot be answered without knowledge of the specific application goals. In some application areas of distributed peer-to-peer systems, other technologies, data structures, and algorithms may therefore be more suitable for achieving and maintaining integrity.
■ Note The double spending problem is a prominent example of violated integrity in distributed peer-to-peer systems of ledgers, and the blockchain-technology-suite is a tool used to solve it.
The Usage of Double Spending in This Book
In this book the term double spending is used to refer to a vulnerability that may appear in purely distributed peer-to-peer systems of ledgers.
Outlook
This step explained double spending and highlighted the importance of the blockchain to achieve integrity in purely distributed peer-to-peer systems.
The next steps focus on how the blockchain achieves and maintains integrity.
Summary
• The term double spending is ambiguous; it has different meanings.
• Double spending can refer to:
• A problem caused by copying digital goods
• A problem that may appear in a distributed peer-to-peer
system of ledgers
• An example of violating the integrity of distributed
peer-to-peer systems
• In this book the term double spending is used to refer to
a vulnerability of purely distributed peer-to-peer systems
of ledgers.
• The blockchain is a means to solve the double spending
problem.
III
How the
Blockchain
Works
This learning stage is the centerpiece of this book because it explains how the blockchain works internally. The 14 learning steps in this stage will guide you through all of the concepts of the blockchain and their underlying technologies.
By the end of this stage, you will have reached a solid understanding of all the major concepts of the blockchain, how they work in isolation, and how they interact in order to create the big machinery that is called the blockchain.
8
Planning the
Blockchain
The basic concepts of managing ownership with
the blockchain
The preceding steps uncovered the relation between trust, integrity, purely distributed peer-to-peer systems, and the blockchain. As a result, you now have a good understanding of what the blockchain is, why it is needed, and what problem it solves. However, you still do not know how the blockchain works internally. This step provides a first impression of how the blockchain works by explaining the general application scenario that will guide you through the succeeding steps. It also highlights the major tasks in designing a blockchain for managing ownership and provides an overview of its major concepts. This step serves as the starting point for the succeeding steps that will discuss in great detail the concepts and technologies that make up the blockchain.
The Goal
The goal here is to understand the concepts that make up the blockchain.
For didactical reasons, I will present the challenge of designing your own system for managing ownership. Hence, you will face the same challenges that the inventor of the blockchain once faced and successfully solved: designing a piece of software that manages ownership in a purely distributed peer-to-peer system of ledgers that operates in a completely open and untrustworthy environment.
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_8
Starting Point
As a starting point, the major facts about the system under consideration can be summarized as follows:
• The system will be a purely distributed peer-to-peer system, which is made of the computational resources contributed
by its users.
• The peer-to-peer system uses the Internet as a network
for connecting the individual nodes.
• Neither the number of nodes nor their trustworthiness
and reliability is known.
• The goal of the peer-to-peer system is the management
of ownership of a digital good (e.g., sales bonus points or
digital money).
The Path to Follow
There are seven major tasks that need to be addressed when designing and developing a software system that manages ownership by using a purely distributed peer-to-peer system of ledgers in an open and untrustworthy environment:
• Describing ownership
• Protecting ownership
• Storing transaction data
• Preparing ledgers to be distributed in an untrustworthy
environment
• Distributing the ledgers
• Adding new transactions to the ledgers
• Deciding which ledgers represent the truth
Task 1: Describing Ownership
Before you can start developing the blockchain, you need to ask yourself what you want to do with it. Since you will want to design a software system that manages ownership, you have to decide how to describe ownership first. It turns out that transactions are a good way to describe any transfer of ownership, and the complete history of transactions is the key to identifying the current owners. Hence, Step 9 will explain transactions, what they are, how you can describe them, and how you can use them to clarify ownership.
Task 2: Protecting Ownership
Describing ownership by using transactions is just the starting point. Moreover, you need a way to prevent people from accessing the property of others. In real life, you can easily prevent people from using your car or from entering your house by using doors with locks. It turns out that cryptography provides a way to protect transactions on an individual level, similar to the way doors with locks protect your individual car or house.
Protecting ownership has three major elements: identifying and authenticating owners as well as restricting access to the property to its owners. Steps 12 and
13 will explain these concepts in more detail. However, these steps rely on the concept of hash values. If you have never heard about hash values before, you do not have to worry. I devoted Steps 10 and 11 to explaining hash values in
great detail. These two steps will also offer interesting insights for those who already have a technical background or know about hash values.
Task 3: Storing Transaction Data
Describing ownership by means of transactions and having security measures that protect ownership on the level of individual transactions are important steps toward the goal of designing a software system that manages ownership.
However, you need a way to store the whole history of transactions, as this history is used to clarify ownership. Since the transaction history is the core element in clarifying ownership, it must be stored in a secure way. It turns out that the blockchain-data-structure is the digital equivalent to a ledger.
Steps 14 and 15 explain the requirements that the blockchain-data-structure has to fulfill in order to serve as a digital ledger and how it is implemented.
Task 4: Preparing Ledgers to Be Distributed in an
Untrustworthy Environment
Having one isolated ledger or blockchain-data-structure that contains transaction data is great, but your aim is to design a distributed peer-to-peer system of ledgers that operates in an untrustworthy environment. Hence, you will have copies of the ledger running on untrustworthy nodes in an untrustworthy network. Furthermore, you will hand over the control of the ledgers to the whole network without having any central point of control or coordination. How can you prevent the ledgers from being forged or manipulated (e.g., by deleting transactions from the history or adding illegal transactions to it)? It turns out that the best way to prevent the transaction history from being changed is to make it unchangeable. This means the ledgers and therefore the transaction history cannot be changed once written. As a result, you will not have to fear that the ledgers will be tampered with or forged because they cannot be changed in the first place. However, having a distributed peer-to-peer system of ledgers that can never be changed sounds like a very secure but pretty useless thing because it will not allow you to add new transactions. Hence, the challenge of the blockchain-data-structure is to be unchangeable, on the one hand, while accepting new transactions being added to it, on the other hand. This sounds like a contradiction in terms, but it turns out that this is achievable with a technical trick that is explained in Step 16.
The result is a blockchain-data-structure that is append-only: it is possible to add new transactions, but it is nearly impossible to change data that were added in the past.
Task 5: Distributing the Ledgers
Once the ledger is append-only, you can create a distributed peer-to-peer system of ledgers by making copies of it available to everyone who asks for it. However, just providing copies of append-only ledgers does not fulfill your goals. A distributed system that manages ownership involves interaction between the peers or nodes, respectively. Hence, Step 17 explains how the nodes in the system interact with one another and what information is exchanged among them.
Task 6: Adding New Transactions to the Ledgers
The distributed peer-to-peer system will consist of members whose computers maintain individual copies of an append-only blockchain-data-structure.
Since the data structure allows you to add new transaction data, you will have to ensure that only valid and authorized transactions are added. It turns out that this is possible by allowing all members of the peer-to-peer system
to add new data and additionally turning each member of the peer-to-peer system into a supervisor of its peers. As a result, all members will supervise one another and point out any mistakes made by their peers. Step 18 explains this approach in more detail as well as the incentives given to the peers for fulfilling their role.
Task 7: Deciding Which Ledgers Represent the
Truth
Once new transactions can be added to the individual ledgers in the peer-to-peer system, one runs into a problem that is typical for any distributed peer-to-peer system: different peers may have received different transactions and soon the history of transactions maintained by them differs. Hence, different versions of the transaction history can exist in the peer-to-peer system.
Since the transaction history is the basis for identifying lawful owners, having different conflicting transaction histories is a serious threat to the integrity of the system. Hence, it is important to find a way either to prevent the emergence of different transaction histories in the first place or to find a way to decide which transaction history represents the truth. Due to the nature of a purely distributed peer-to-peer system, the former approach is not possible.
As a result, you need a criterion for how to find and choose one transaction history that represents the truth. But there is another problem: there is no central authority in a purely distributed peer-to-peer system that can declare which transaction history has to be chosen. It turns out that one can solve that problem by making every node in the peer-to-peer system decide on its own which transaction history represents the truth in a way that the majority of the peers independently agree on that decision. It also turns out that the way in which the blockchain lets you add new transactions to the append-only blockchain-data-structure already contains the solution to this problem.
Step 19 explains these criteria in detail and how they are used.
Outlook
This step identified seven tasks that provide a challenging intellectual journey through the concepts that constitute the blockchain. Once you fulfill these tasks, you will arrive at the summit: an understanding of the blockchain. Step 21
is the point where you will put all of these concepts together and enjoy the results of this learning effort. Step 21 will be an overview chapter like this one, but it will draw on the technical knowledge you will have acquired in the meantime.
Summary
• In order to design a purely distributed peer-to-peer
system of ledgers for managing ownership, one has to
address the following tasks:
• Describing ownership
• Protecting ownership from unauthorized access
• Storing transaction data
• Preparing ledgers to be distributed in an
untrustworthy environment
• Forming a distributed system of ledgers
• Adding and verifying new transactions to the ledgers
• Deciding which ledgers represent the truth
• The tasks outlined above will be addressed in the following 12 steps.
9
Documenting
Ownership
Using the course of history as evidence for the
current state of ownership
This step considers the task of describing ownership in a way that is useful for a purely distributed peer-to-peer system of ledgers. This step explains how the blockchain documents ownership and handles the transfer of ownership.
Additionally, this step points out the importance of ordering when documenting the transfer of ownership. Finally, this step highlights the importance of the integrity of transaction data for the integrity of the whole system.
The Metaphor
A relay race is a race between teams of runners, where each team member covers only a part of the total distance. During the race, each runner must hand off a specific item, the so-called baton, to the next runner within a certain zone marked on the track. At any given time during the race, only one member of each competing team carries a baton. In order to determine which member of a given team is currently carrying the baton, it is sufficient to know which team member received the baton at the most recent hand off.
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_9
In order to keep track of who carried a baton at any given time, one needs to record the time of each hand off and the athletes who were involved. This step explains how the blockchain treats ownership in much the same way that relay races treat batons.
The Goal
The goal is the documentation of ownership in a transparent and comprehensible way. Anyone who reads that documentation should be able to make an unambiguous statement concerning the association of goods with their owners.
The Challenge
The challenge is to find documentation of ownership that not just claims that someone is the owner of something, but also provides evidence of ownership and hence serves as proof of ownership.
The Idea
Instead of describing the current state of ownership by inventory data (i.e., by listing the current possessions of all owners), one maintains a list of all transfers of ownership in a ledger in an ongoing fashion. Every transfer of ownership is described by transaction data that clearly point out which owner hands off ownership of what item and to whom at what time. The whole history of transaction data stored in a ledger becomes an audit trail that provides evidence of how everyone achieved his or her possession.1 This is equivalent to tracking every hand off of the baton in a relay race, which allows everyone to reconstruct the whole race later on.
A Short Detour to Inventory and Transaction
Data
There are two competing ways to describe ownership: through inventory data or through transaction data. Inventory data describe the current state of ownership. They are similar to a bank account statement that just displays the amount of money that is currently available. Transaction data describe transfers of ownership. They are similar to a bank account statement that lists every withdrawal, deposit, and transfer of money. One can derive inventory data by aggregating transaction data. Although both inventory data and transaction data describe ownership, their underlying philosophy differs dramatically. Inventory data just state or claim ownership, while transaction data explain and thereby justify ownership. However, inventory data are often considered more convenient as they immediately state the fact that is interesting to most people, that is, the current state of ownership.
1Nakamoto, Satoshi. Bitcoin: A peer-to-peer electronic cash system (2008).
How It Works
Documenting ownership with the blockchain involves the following aspects:
• Describing the transfer of ownership
• Maintaining the history of transfers
Describing the Transfer of Ownership
A transaction is the act of transferring ownership from one owner to someone else. The act of transferring ownership relies on data that describe the intended transfer. These data contain all information necessary to execute the transfer of ownership. An example of data that describe an intended transfer of ownership would be a bank transfer form that is used to request a bank to make a money transfer on behalf of a customer. The bank transfer form requires you to provide all information necessary to allow the bank to make the transfer on your behalf. In a similar fashion, the information used by the blockchain to describe a transaction is:
• An identifier of the account that is to hand off ownership
to another account
• An identifier of the account that is to receive ownership
• The amount of the goods to be transferred
• The time the transaction is to be done
• A fee to be paid to the system for executing the
transaction
• A proof that the owner of the account that hands off
ownership indeed agrees with that transfer
Most of these data are familiar to anyone who has made a money transfer with a bank. However, the analogy with a bank transfer ends when fees are considered. Due to the fact that banks are centralized institutions, they maintain a central fee schedule that is applied to all customers. In contrast to that, the blockchain is a distributed system without any central point of control.
Hence, the blockchain cannot have a central fee schedule. When using the
blockchain, each user has to tell the system in advance how much he or she is willing to pay for having the transaction executed. The account that hands off ownership also pays the transaction fee.
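The six pieces of information listed above can be pictured as a single record. The following Python sketch is only an illustration: the class name `Transaction`, the field names, the sample values, and the placeholder used for the proof are all assumptions, and real blockchains encode these data in their own formats.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the record cannot be altered after creation
class Transaction:
    sender: str      # identifier of the account that hands off ownership
    receiver: str    # identifier of the account that receives ownership
    amount: int      # amount of the goods to be transferred
    timestamp: str   # time the transaction is to be done
    fee: int         # fee the sender offers for executing the transaction
    proof: str       # proof that the sender's owner agrees (placeholder here)

tx = Transaction(
    sender="account-A",
    receiver="account-B",
    amount=50,
    timestamp="2017-01-01T12:00:00Z",
    fee=1,
    proof="<placeholder>",
)

# The account that hands off ownership also pays the fee.
total_debit = tx.amount + tx.fee
```

In this sketch the sending account is debited 51 units in total: 50 for the transfer and 1 for the fee.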
Maintaining the History of Transfers
Transaction data provide the mandatory information necessary to execute a transfer of ownership as intended. Executing a transaction means making the transfer of ownership happen as described by the transaction data. Executing a transaction means adding the transaction data to a ledger. By adding transaction data to a ledger, the transaction becomes part of the transaction history, which is used to clarify ownership. When the ledger is used the next time to clarify ownership by aggregating the transaction data it contains, the newly added transaction will be included in the aggregation and hence will impact the resulting state of ownership.
The blockchain maintains the whole history of all transactions that have ever happened by storing their transaction data in the blockchain-data-structure in the order in which they occurred. Any transaction not being part of that history is regarded as if it never happened. Hence, adding transaction data to the blockchain-data-structure means making this transaction happen and allowing it to influence the result of using the history in order to identify the current owner.
Why It Works
Since transaction data contain all the information about the account that hands off ownership, the account that receives ownership, and the item and the amount to be transferred, one can reconstruct ownership information for each account as long as the whole history of transactions is available. As a result, the whole history of all transaction data is sufficient to document ownership.
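A minimal Python sketch can make this reconstruction concrete. The account names, the amounts, the opening deposit from a hypothetical `mint` account, and the function name `aggregate` are assumptions chosen for illustration; the sketch ignores fees and is not meant to mirror any real implementation.

```python
from collections import defaultdict

# Transaction history in the order the transfers occurred:
# (account handing off, account receiving, amount). All values are made up.
history = [
    ("mint", "alice", 100),
    ("alice", "bob", 30),
    ("bob", "carol", 10),
]

def aggregate(transactions):
    """Derive the current state of ownership from the transaction history."""
    balances = defaultdict(int)
    for sender, receiver, amount in transactions:
        balances[sender] -= amount    # the sender hands off ownership
        balances[receiver] += amount  # the receiver gains ownership
    return dict(balances)

inventory = aggregate(history)
```

Here `inventory` plays the role of inventory data: it states that alice owns 70, bob 20, and carol 10, while `history` explains how they came to own it.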
Importance of Ordering
Aggregating transaction data is done for the purpose of recovering the current state of ownership and clarifying ownership. It is important to recognize that the order in which the transactions occurred must be preserved in order to arrive at the identical result every time the data are aggregated. Changing the order of transaction data will change the result of aggregating them. At first glance, the result does not seem to change very much whether I receive a payment of $50 from a friend first and transfer $50 afterward in order to pay a bill or whether these two transactions occur in the opposite order.
But what happens if my bank account does not contain any money at all and I
am not allowed to overdraw it? In this case, my ability to pay my bill depends on having received the payment from my friend first. Otherwise, the bank will refuse to transfer the money to pay the bill due to a lack of funds. Hence, the order in which transactions occur does indeed matter.
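The bank account example can be replayed in a few lines of Python. This is a sketch under stated assumptions: the function name `replay`, the labels, and the rule that an overdrawing payment is simply refused are illustrative choices, not a description of how a bank or a blockchain processes payments.

```python
def replay(history, start_balance=0):
    """Apply signed amounts in order and refuse any payment that would overdraw."""
    balance = start_balance
    refused = []
    for label, amount in history:
        if balance + amount < 0:
            refused.append(label)  # lack of funds: the transfer is not executed
            continue
        balance += amount
    return balance, refused

receive_first = [("payment from friend", +50), ("pay bill", -50)]
pay_first = [("pay bill", -50), ("payment from friend", +50)]

result_receive_first = replay(receive_first)
result_pay_first = replay(pay_first)
```

With the payment received first, both transactions succeed and the account ends at 0. With the bill first, the bill is refused and the account ends at 50 with the bill still unpaid, so the same two transactions lead to different outcomes.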
Integrity of the Transaction History
Without exaggeration, one can state that the history of transaction data is the heart of any blockchain that manages ownership because it is the basis for reconstructing the state of ownership. As a result, it is necessary to keep that history of data safe, complete, correct, and consistent in order to maintain the integrity of the whole system and, as a result, be able to make true statements regarding the current state of ownership. Hence, the blockchain needs to provide security measures to ensure that only valid transaction data are added to the blockchain-data-structure. Examining validity of transaction data involves three aspects:
• Formal correctness
• Semantic correctness
• Authorization
Formal Correctness
Formal correctness means that the description of a transaction contains all required data and that the data are provided in the correct format.
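As a rough illustration, a check for formal correctness could look like the following Python sketch. The field names and their expected types are assumptions that mirror the list of transaction data given earlier in this step; an actual blockchain defines its own required fields and formats.

```python
# Required fields and the format (type) each one must have. Hypothetical names.
REQUIRED_FIELDS = {
    "sender": str,
    "receiver": str,
    "amount": int,
    "timestamp": str,
    "fee": int,
    "proof": str,
}

def is_formally_correct(tx):
    """Return True if tx contains every required field in the expected format."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in tx or not isinstance(tx[name], expected_type):
            return False
    return True

complete = {"sender": "A", "receiver": "B", "amount": 50,
            "timestamp": "2017-01-01T12:00:00Z", "fee": 1, "proof": "<placeholder>"}
missing_fee = {k: v for k, v in complete.items() if k != "fee"}
wrong_format = dict(complete, amount="fifty")
```

Note that this check says nothing about whether the transfer makes sense; it only verifies that the description is complete and well formed.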
Semantic Correctness
Semantic correctness focuses on the meaning of transaction data and their intended effect. Hence, validating semantic correctness requires knowledge of the business domain. Examining semantic correctness of transaction data is often done based on business rules, such as:
• Ensuring that an account does not hand off more than it
currently owns
• Preventing double spending
• Limiting the amount of items that can be transferred in a
single transaction
• Limiting the number of transactions per user
• Limiting the total amount of items spent in a given time period
• Enforcing that an account keeps an item for a minimum
time period before it can be transferred further
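Two of the business rules above can be sketched in Python as follows. The limit of 1,000 items per transaction, the function name, and the dictionary layout are assumptions made for this example; which rules apply and how strict they are depends entirely on the business domain.

```python
MAX_AMOUNT_PER_TRANSACTION = 1000  # hypothetical business rule

def is_semantically_correct(tx, balances):
    """Check two example business rules against the current state of ownership."""
    owned = balances.get(tx["sender"], 0)
    # Rule: an account must not hand off more than it currently owns.
    # The sender also pays the fee, so the fee is counted as well.
    if tx["amount"] + tx["fee"] > owned:
        return False
    # Rule: limit the amount of items transferred in a single transaction.
    if tx["amount"] > MAX_AMOUNT_PER_TRANSACTION:
        return False
    return True

balances = {"alice": 60, "bob": 5000}
affordable = {"sender": "alice", "receiver": "bob", "amount": 50, "fee": 1}
overspending = {"sender": "alice", "receiver": "bob", "amount": 60, "fee": 1}
over_limit = {"sender": "bob", "receiver": "alice", "amount": 2000, "fee": 1}
```

The `balances` argument would itself be obtained by aggregating the transaction history, which is why semantic checks depend on the history being complete and correctly ordered.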
Authorization
Only the owner of the account that hands off ownership should be allowed to instruct the blockchain to execute a transaction on his or her behalf. As a result, the blockchain requires every transaction to carry information that proves that the owner of the account that hands off ownership indeed agrees with that transfer.
Outlook
This step explained transactions and their role for clarifying ownership. The following steps are mainly concerned with how the blockchain enforces that only valid transaction data are added to the history and how the history is protected from being manipulated or forged.
Summary
• Transaction data provide the following information for
describing a transfer of ownership:
• An identifier of the account that initiates the transaction
and is to transfer ownership to another account
• An identifier of the account that is to receive ownership
• The amount of the goods to be transferred
• The time the transaction is to be done
• A fee to be paid to the system for executing the
transaction
• A proof that the owner of the account that hands off
ownership agrees with that transfer
• The complete history of transaction data is an audit
trail that provides evidence of how people acquired and
handed off ownership.
• Any transaction not being part of that history is regarded
as if it never happened.
• A transaction is executed by adding it to the history of
transaction data and allowing it to influence the result of
aggregating them.
• The order in which transaction data are added to the history
must be preserved in order to yield identical results
when aggregating these data.
• In order to maintain integrity, only those transaction data are added to the blockchain-data-structure that fulfill the
following three criteria:
• Formal correctness
• Semantic correctness
• Authorization
10
Hashing Data
Identifying data from their digital fingerprint
This step explains one of the most important base technologies of the blockchain: hash values. It discusses important properties of cryptographic hash functions and introduces patterns of applying hash functions to data.
The Metaphor
Fingerprints are impressions of the friction ridges of all or any part of the fingers of the human hand. They are considered to be able to identify humans uniquely. They have been used to investigate crimes, identify offenders, and to exonerate the innocent. This step introduces a concept for identifying data, which can be seen as the digital equivalent to fingerprints. The concept is called cryptographic hash value, and the blockchain makes extensive use of it.
Hence, understanding cryptographic hashing is mandatory for understanding the blockchain.
The Goal
In the distributed peer-to-peer system, you will deal with a huge amount of transaction data. As a result, you will need to identify them uniquely and compare them as quickly and as easily as possible. Hence, the goal is to identify transaction data and possibly any kind of data uniquely by their digital fingerprints.
© Daniel Drescher 2017
D. Drescher, Blockchain Basics, DOI 10.1007/978-1-4842-2604-9_10
How It Works
Hash functions are small computer programs that transform any kind of data into a number of fixed length, regardless of the size of the input data.1 Hash functions accept only one piece of data at a time as input and create a hash value based on the bits and bytes that make up the data. Hash values can have leading zeros in order to provide the required length. There are many different hash functions that differ, among other things, with respect to the length of the hash value they produce. An important group of hash functions is called cryptographic hash functions, which create digital fingerprints for any kind of data. Cryptographic hash functions have the following properties2:
• Providing hash values for any kind of data quickly
• Being deterministic
• Being pseudorandom
• Being one-way functions
• Being collision resistant
Providing Hash Values for Any Data Quickly
This property is actually a combination of two properties. First, the hash function is able to calculate hash values for all kinds of data. Second, the hash function does its calculation quickly. These properties are important, as you do not want the hash function to yield useless things like error messages or to take a large amount of time to return the results.
Deterministic
Deterministic means that the hash function yields identical hash values for identical input data. This means that any observed discrepancies of the hash values of data must be solely caused by the discrepancies of the input data and not by the internals of the hash function.
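The fixed length of hash values and the deterministic behavior can be observed with a few lines of Python. The text does not name a specific hash function, so the choice of SHA-256 from Python's standard `hashlib` module, the helper name `fingerprint`, and the sample inputs are assumptions made for this sketch.

```python
import hashlib

def fingerprint(data: str) -> str:
    """Return the SHA-256 hash value of the given text as a hexadecimal string."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

short_input = "Hello World!"
long_input = "Hello World!" * 1000   # a much larger piece of data

short_hash = fingerprint(short_input)
long_hash = fingerprint(long_input)

# Deterministic: hashing identical input again yields the identical hash value.
same_again = fingerprint(short_input) == short_hash

# Fixed length: both hash values have 64 hexadecimal characters (256 bits).
lengths = (len(short_hash), len(long_hash))
```

Both hash values are 64 hexadecimal characters long even though one input is a thousand times larger than the other, and hashing the same input twice gives the same value both times.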
1Weisstein, Eric W. Hash function. From MathWorld: http://mathworld.wolfram.com/
2Rogaway, Phillip, and Thomas Shrimpton. Cryptographic hash-function basics: definitions, implications, and separations for preimage resistance, second-preimage resistance, and collision resistance. In B. Roy and W. Meier (eds.), Fast software encryption. FSE 2004. Lecture Notes in Computer Science, vol. 3017. International Workshop on Fast Software Encryption.
Berlin Heidelberg: Springer, 2004.
Pseudorandom