data Blog = Blog { me :: Programmer, posts :: [Opinion] }

[1]Posts [2]RSS

How Blockchains Work

Chances are, you know what Bitcoin is. After all, it’s valued at over $47,000
per Bitcoin right now. This post isn’t about the business side of things,
though, or the BTC speculative bubble. I want to explain how it works.^[3]1

Foundations: Hashes and Ledgers

First, a hash algorithm is a way to convert a given string into an
unpredictable string of a fixed length, called a digest.

A diagram illustrating that a hash algorithm produces a digest from a string.

Here’s a small Python program to demonstrate:

#!/usr/bin/env python3
from argparse import ArgumentParser
from hashlib import md5


def hash_string(string):
    """Returns the hexadecimal MD5 digest of a string."""
    hasher = md5()
    hasher.update(string.encode("utf-8"))
    return hasher.hexdigest()


if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("STRING", help="The string to be hashed")
    args = parser.parse_args()
    print(hash_string(args.STRING))

Running this with different string arguments will give you digests of the
arguments:

$ ./hash ninja
3899dcbab79f92af727c2190bbd8abc5

$ ./hash samurai
99b1983cf3ee09bbaf6f43ac7b4c8748

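The properties named in the definition (fixed length, same input gives the same digest, small changes give unrelated-looking output) can be checked directly. The following is a minimal sketch; `md5_hex` and `differing_positions` are names made up for this illustration, not part of the script above.

```python
from hashlib import md5


def md5_hex(text: str) -> str:
    """Return the hexadecimal MD5 digest of a UTF-8 string."""
    return md5(text.encode("utf-8")).hexdigest()


def differing_positions(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    short = md5_hex("ninja")
    long = md5_hex("ninja" * 1000)

    # Both digests are 32 hex characters, regardless of input length.
    print(len(short), len(long))

    # Hashing the same input again gives the same digest.
    print(short == md5_hex("ninja"))

    # A one-character change typically alters most positions of the digest.
    print(differing_positions(short, md5_hex("ninjb")))
```

The digest is deterministic, so "unpredictable" here means only that you cannot guess it without running the algorithm.
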
Hashes of this type are used to check passwords: you can check whether a
password matches without storing the password itself.^[4]2

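As a rough sketch of that password check: store a random salt and a derived digest, then re-derive and compare at login. This uses the standard library's PBKDF2 rather than md5, in line with the footnote's warning. The function names and the iteration count are assumptions chosen for illustration, not a recommendation.

```python
import hashlib
import hmac
import secrets

# Illustrative value only; real systems should follow current guidance.
ITERATIONS = 100_000


def make_record(password: str):
    """Return (salt, derived) to store instead of the password itself."""
    salt = secrets.token_bytes(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def check_password(password: str, salt: bytes, derived: bytes) -> bool:
    """Re-derive the digest from the attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, derived)


if __name__ == "__main__":
    salt, derived = make_record("correct horse")
    print(check_password("correct horse", salt, derived))  # True
    print(check_password("wrong horse", salt, derived))    # False
```

Only the salt and the derived bytes are kept, so the stored record cannot be read back as the password.
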
Blockchains are a kind of ledger: they have entries added to them over time.
Hashes can help with that by protecting the ordering and contents of messages.

A diagram illustrating that blockchains capture the previous digest and the
current message to produce a digest.

Here’s a brief implementation:

def hash_ledger_entry(string, previous_digest=None):
    """Hashes a string with the hash of previous entries in the ledger, if any."""
    hasher = md5(string.encode("utf-8"))

    # Mixing in the previous digest makes this entry depend on all earlier ones.
    if previous_digest:
        hasher.update(previous_digest.encode("utf-8"))

    return hasher.hexdigest()


def generate_ledger(*strings):
    """Generates the entries in a ledger consisting of a sequence of strings."""
    digest = None

    for string in strings:
        digest = hash_ledger_entry(string, digest)
        yield digest, string


if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("STRINGS", help="The ledger entries", nargs="+")
    args = parser.parse_args()

    for digest, string in generate_ledger(*args.STRINGS):
        print(f"{digest}\t{string}")

With this script, providing a sequence of strings will generate a unique,
ordered ledger:

$ ./hash ninja samurai
3899dcbab79f92af727c2190bbd8abc5 ninja
6bf8d2cadde40af53d7f0fef95d4ec2c samurai

These hash ledgers are tamper-resistant because the digests of later entries
depend on the earlier entries. Modifying or adding entries will change the
digest of later entries.

$ ./hash ninja pirate samurai
3899dcbab79f92af727c2190bbd8abc5 ninja
7ec21dcf528e12036b04774754ecc4e0 pirate
636730d86709d03fed9ba64f84fc9be6 samurai

We can also add a known ending entry to the ledger to protect the last entry
from tampering:

$ ./hash ninja pirate samurai END
3899dcbab79f92af727c2190bbd8abc5 ninja
7ec21dcf528e12036b04774754ecc4e0 pirate
636730d86709d03fed9ba64f84fc9be6 samurai
b233d566fe677d394aafb5eaf149e453 END

Validation

To validate a ledger, you can replay the transactions and make sure that you
get the same hashes at each step:

# Add these to your imports.
import fileinput
import sys

our_digest = None

# Each input line is "<digest>\t<entry>", as printed by the ledger script.
for line in fileinput.input():
    file_digest, word = line.strip().split("\t")
    our_digest = hash_ledger_entry(word, our_digest)

    if our_digest != file_digest:
        sys.exit(f"The digest for {word} does not match.")

print("All entries match.")

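The same replay can be written as a function over in-memory entries, which makes it easy to see what happens when one entry is altered. This is only a sketch: `first_mismatch` is a hypothetical helper, and `hash_ledger_entry` is repeated here so the example runs on its own.

```python
from hashlib import md5


def hash_ledger_entry(string, previous_digest=None):
    """Same chaining rule as the post: hash the string, then the previous digest."""
    hasher = md5(string.encode("utf-8"))
    if previous_digest:
        hasher.update(previous_digest.encode("utf-8"))
    return hasher.hexdigest()


def first_mismatch(entries):
    """Return the index of the first entry that fails to replay, or None.

    `entries` is a list of (digest, string) pairs in ledger order.
    """
    our_digest = None
    for index, (file_digest, string) in enumerate(entries):
        our_digest = hash_ledger_entry(string, our_digest)
        if our_digest != file_digest:
            return index
    return None


if __name__ == "__main__":
    ledger = []
    digest = None
    for word in ["ninja", "pirate", "samurai"]:
        digest = hash_ledger_entry(word, digest)
        ledger.append((digest, word))

    print(first_mismatch(ledger))  # None: the untouched ledger replays cleanly

    # Swap the second entry's text but keep its recorded digest.
    tampered = list(ledger)
    tampered[1] = (ledger[1][0], "cowboy")
    print(first_mismatch(tampered))  # 1
```

Note that someone who rewrites an entry and also recomputes every later digest would pass this check, which is the gap proof-of-work is meant to make expensive.
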
With a tamper-resistant ledger where each entry depends on the previous
entries, we’ve effectively implemented a very simple blockchain. This is not
the same as the blockchain, though; for that we need…

Proofs without Authority

The novelty of Bitcoin is that it is a distributed system with no owner. This
is what enthusiasts mean when they say that the blockchain is trustless:
instead of a central authority, like a bank, many “miners” compete to
successfully write a new message to the blockchain. They do this by means of a
proof-of-work algorithm, which we can implement in our ledger as well.

# Add this to your imports.
from secrets import token_bytes


def hash_ledger_entry_with_salt(salt, string, previous_digest=None):
    """Hashes a string with a salt and the hash of previous entries, if any."""
    hasher = md5(string.encode("utf-8"))
    hasher.update(salt)

    if previous_digest:
        hasher.update(previous_digest.encode("utf-8"))

    return hasher.hexdigest()


def generate_ledger(difficulty, *strings):
    # Difficulty determines how many zeroes we require at the beginning of a digest.
    prefix = "0" * difficulty

    digest = None
    previous_digest = None

    for string in strings:
        # We re-hash a string over and over, with random salts, until it matches the
        # prefix determined by our difficulty.
        while digest is None or not digest.startswith(prefix):
            salt = token_bytes(16)
            digest = hash_ledger_entry_with_salt(salt, string, previous_digest)

        # We yield back the digest and entry, as before, but we need the salt, too.
        # Without that, we can't replay the entries and verify them.
        yield digest, salt.hex(), string
        previous_digest = digest
        digest = None

    # The closing END entry reuses the last salt and is not mined, so its digest
    # does not have to match the prefix.
    yield hash_ledger_entry_with_salt(salt, "END", previous_digest), salt.hex(), "END"


if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument(
        "DIFFICULTY", help="The difficulty of confirming a ledger entry.", type=int
    )
    parser.add_argument("STRINGS", help="The ledger entries", nargs="+")
    args = parser.parse_args()

    for digest, salt, string in generate_ledger(args.DIFFICULTY, *args.STRINGS):
        print(f"{digest}\t{salt}\t{string}")

The new utility accepts an additional argument, difficulty, and searches for a
salt value that produces a digest beginning with the required number of
zeroes:

$ ./hash 5 ninja pirate samurai
00000ad72553509e6c197e45ab7fa436 af0dce7ac4c87c2b9d9eafb6561c09f4 ninja
000000f556426cfa894ba2ce57383b1d b9d51e0e8ea977ba004e7c30be757144 pirate
000006373b2b336d6dac403a5fa90a73 dd9c6ad89f5014a0901bcb142e04e28b samurai
fa35b5a39bc0318015620684d60a27f0 dd9c6ad89f5014a0901bcb142e04e28b END

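Because each row carries its salt, a reader can replay the salted ledger and also confirm that every mined digest meets the difficulty. The sketch below assumes rows of (digest, hex salt, string) as printed above, and assumes a final END row is exempt from the prefix rule, as in the sample output. `mine_rows` and `verify_rows` are hypothetical helpers written for this example.

```python
from hashlib import md5
from secrets import token_bytes


def hash_ledger_entry_with_salt(salt, string, previous_digest=None):
    """Same rule as the post: hash the string, then the salt, then the previous digest."""
    hasher = md5(string.encode("utf-8"))
    hasher.update(salt)
    if previous_digest:
        hasher.update(previous_digest.encode("utf-8"))
    return hasher.hexdigest()


def mine_rows(difficulty, strings):
    """Mine each string in turn and return (digest, salt_hex, string) rows."""
    prefix = "0" * difficulty
    rows, previous_digest = [], None
    for string in strings:
        while True:
            salt = token_bytes(16)
            digest = hash_ledger_entry_with_salt(salt, string, previous_digest)
            if digest.startswith(prefix):
                break
        rows.append((digest, salt.hex(), string))
        previous_digest = digest
    return rows


def verify_rows(difficulty, rows, end_marker="END"):
    """Replay the rows; return True only if digests and difficulty both hold."""
    prefix = "0" * difficulty
    previous_digest = None
    last_index = len(rows) - 1
    for index, (digest, salt_hex, string) in enumerate(rows):
        salt = bytes.fromhex(salt_hex)
        if hash_ledger_entry_with_salt(salt, string, previous_digest) != digest:
            return False
        # Assumption: only a closing END row may skip the zero prefix.
        unmined_end = index == last_index and string == end_marker
        if not unmined_end and not digest.startswith(prefix):
            return False
        previous_digest = digest
    return True


if __name__ == "__main__":
    rows = mine_rows(2, ["ninja", "pirate"])
    print(verify_rows(2, rows))  # True

    tampered = [rows[0], (rows[1][0], rows[1][1], "cowboy")]
    print(verify_rows(2, tampered))  # False
```

Verification costs one hash per row, while producing a row costs many. That asymmetry is the point of proof-of-work.
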
The “mining” process can require a lot of calculations. The example required,
on average, around 2.5 million attempts. That’s why Bitcoin mining consumes
[5]more electricity than many countries: on the “real” blockchain, miners are
calculating and recalculating quadrillions of hashes per bitcoin mined.

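To get a feel for how the cost grows, here is a back-of-the-envelope sketch. It assumes digests behave like uniformly random hex strings, so a digest starts with d zeroes about once in 16^d tries. Individual runs vary widely around that mean, so observed counts such as the one quoted above need not match it. Both function names are made up for this illustration.

```python
from hashlib import md5
from secrets import token_bytes


def expected_attempts(difficulty: int) -> int:
    """Mean number of tries per entry, assuming uniformly random hex digests."""
    return 16 ** difficulty


def count_attempts(string: str, difficulty: int) -> int:
    """Try random salts until the digest has the zero prefix; return the try count."""
    prefix = "0" * difficulty
    attempts = 0
    while True:
        attempts += 1
        hasher = md5(string.encode("utf-8"))
        hasher.update(token_bytes(16))
        if hasher.hexdigest().startswith(prefix):
            return attempts


if __name__ == "__main__":
    for difficulty in range(1, 6):
        print(difficulty, expected_attempts(difficulty))
    # The last line printed is: 5 1048576

    # One sample at a low difficulty; the count differs from run to run.
    print(count_attempts("ninja", 3))
```

Each extra zero multiplies the expected work by 16, which is how a difficulty setting can be tuned to keep mining slow even as hardware gets faster.
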
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. If you want to read about the bubble, I recommend [6]David Gerard. [7]↩︎

2. Note that md5 should not be used for this purpose in real applications. I
chose it here for the brevity of its digests, but it isn’t secure. [8]↩︎

References:

[1] https://asthasr.github.io/
[2] https://asthasr.github.io/index.xml
[3] https://asthasr.github.io/posts/how-blockchains-work/#fn:1
[4] https://asthasr.github.io/posts/how-blockchains-work/#fn:2
[5] https://www.bbc.com/news/technology-56012952
[6] https://davidgerard.co.uk/blockchain/
[7] https://asthasr.github.io/posts/how-blockchains-work/#fnref:1
[8] https://asthasr.github.io/posts/how-blockchains-work/#fnref:2