A Technical Dive into Bitcoin (The Chain Files pt. 1)

I haven't written anything in a long while, but I have been thinking about this subject lately, so I figured I would do a little research and get my thoughts down on digital paper. You can read more about it in this old post here.

Bitcoin has been around for almost a decade now and, since its arrival, has had an incredible impact within the technology and business communities. When the white paper was released in late 2008 (the network itself went live in January 2009), the world was in the middle of the 2008 financial collapse and would be dealing with the aftereffects for years. Most essays these days focus solely on the supposed business and economic changes that the Bitcoin blockchain will introduce to the world. Those changes have not fully materialized yet, although there are many companies, governments, and individuals around the world doing their best to make that happen. However, in its entire run, the core security of Bitcoin has not been compromised. There have been hacks, but those are mainly due to third parties such as exchanges making mistakes. The mathematics and technology underpinning Bitcoin itself still run securely. That made me more curious about it and ultimately led me to decide to learn how it works. I'm hoping that this essay will explain some, but not all, of the basic technical underpinnings of Bitcoin. So let's get started.

Hash Functions

One of the building blocks used most frequently in Bitcoin is the cryptographic hash function. A hash function takes an input of arbitrary size and turns it into a fixed-length string or byte array called a hash (or digest). No matter how large the input data is, the output will always be the same length. Hashing the same data with the same function always produces the same hash, but changing the input even slightly produces a completely different one. Sometimes you will see cryptographic hashes referred to simply as "hashes"; keep in mind that there are also non-cryptographic hashes that don't have the same security properties.
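To make these properties concrete, here is a minimal Python sketch using the standard library's hashlib module. The helper name sha256_hex and the sample inputs are my own choices for illustration; they are not part of Bitcoin.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


# A one-byte input and a one-megabyte input give digests of the same length.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)
print(len(short_digest), len(long_digest))

# Hashing the same input again gives the same digest.
print(short_digest == sha256_hex(b"a"))

# Changing the input, even by one character, gives a different digest.
print(short_digest != sha256_hex(b"b"))
```

The digest is 256 bits long, which is why it prints as 64 hexadecimal characters regardless of the input size.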

The examples below show how hashing can be used to verify a document. The same approach works for photos or other forms of multimedia; I am using text only to keep the example easy to understand.

## Run the following commands in a linux terminal
1) echo "this is our secret message. 'the rain in spain falls mainly in seattle'" | sha256sum
   ab9c98ed0d99f6cf8b7f2a8d2e96a1526aae7bf084fcca87edd41029e3adce9e  -
2) echo "this is our secret message. 'the rain in spain falls mainly in seattle'" > hash.txt
3) cat hash.txt | sha256sum
   ab9c98ed0d99f6cf8b7f2a8d2e96a1526aae7bf084fcca87edd41029e3adce9e  -

Running the echo command with the given text through the SHA-256 algorithm, we see that the output is a string of 64 hexadecimal characters that looks completely random. However, hashing the saved file in step 3 produces exactly the same 64 characters as hashing the text directly in step 1. This is an extremely important part of hashing, as it can be used for verification: if two hashes match, the underlying data matches. In this case we have used it for a simple line of text, but it can be used to verify other forms of documentation and identification as well. There are a fair number of companies in the first stages of building an identification layer using blockchains, and hashing is an important part of how they will function.
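The same verification can be sketched in Python. This is only an illustration: the function names file_sha256 and matches_digest are made up for this post, and the file is written to a temporary directory so nothing is left behind. Note that echo appends a newline to the text, so the sketch includes one too.

```python
import hashlib
import hmac
import os
import tempfile

# echo appends a trailing newline, so it is part of the hashed data.
MESSAGE = b"this is our secret message. 'the rain in spain falls mainly in seattle'\n"


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Check whether the file at path hashes to the published digest."""
    return hmac.compare_digest(file_sha256(path), expected_hex.lower())


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "hash.txt")
    with open(path, "wb") as handle:
        handle.write(MESSAGE)

    published = hashlib.sha256(MESSAGE).hexdigest()
    unchanged = matches_digest(path, published)

    # Append a single character and check again.
    with open(path, "ab") as handle:
        handle.write(b"!")
    tampered = matches_digest(path, published)

print(unchanged, tampered)  # True False
```

Anyone who holds the published digest can confirm they have the same document without the author having to send the document again.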

Proof-of-Work

So how do hash functions work within Bitcoin? To understand this better, we have to travel back to 1997, when cryptographer Adam Back created a technology called Hashcash, which popularized an idea known as Proof-of-Work (PoW). Hashcash used PoW to limit email spam and denial-of-service attacks. It did so by requiring the sender to perform a chosen amount of computational work and to attach a proof of that work, which the receiver could verify cheaply. The algorithm was later adapted by Hal Finney for his reusable proof-of-work (RPOW) project, a Bitcoin precursor, as a way to mint tokens. But how does it work within the Bitcoin network? To understand this, we need to understand the technology that is causing the biggest hype these days, the blockchain.
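The "hard to produce, easy to verify" idea can be sketched in a few lines of Python. This is a simplified illustration, not Hashcash or Bitcoin itself: Hashcash used SHA-1 with its own stamp format, and Bitcoin hashes a block header twice with SHA-256 and compares the result to a target. Here I simply search for a nonce that gives a SHA-256 digest with a chosen number of leading zero bits. The function names and the message format are my own assumptions.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def pow_digest(message: bytes, nonce: int) -> bytes:
    """Hash the message together with a candidate nonce."""
    return hashlib.sha256(message + b":" + str(nonce).encode()).digest()


def mine(message: bytes, difficulty_bits: int) -> int:
    """Try nonces one by one until the digest has enough leading zero bits."""
    for nonce in count():
        if leading_zero_bits(pow_digest(message, nonce)) >= difficulty_bits:
            return nonce


def verify(message: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification costs a single hash, no matter how long mining took."""
    return leading_zero_bits(pow_digest(message, nonce)) >= difficulty_bits


# 16 bits of difficulty takes about 65,000 attempts on average.
nonce = mine(b"hello", 16)
print(nonce, verify(b"hello", nonce, 16))
```

Each extra bit of difficulty doubles the expected number of attempts for the miner, while the verifier still computes only one hash.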

The Bitcoin blockchain is a ledger that contains the entire record of bitcoin transactions that have taken place since the original genesis block. All of these records are arranged in a sequence of "blocks" in such a way that no user can spend their holdings twice. This record is also public, and anyone who has a specific wallet address or transaction hash can verify that the corresponding transactions have taken place. Each block references the hash of the previous block, which creates a "chain" linking all blocks back to the genesis block. These blocks are also computationally impractical to modify, because changing one block changes its hash, so each succeeding block would have to be regenerated as well, producing completely different hashes. This helped solve the double-spend problem of previous digital currencies and has applications beyond money as well, such as identity and intellectual property.
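The way each block points at the hash of the one before it can be shown with a toy chain in Python. This is a sketch under heavy assumptions: real Bitcoin blocks have a binary header, a Merkle root, and Proof-of-Work, none of which appear here. The block layout, the JSON encoding, and the function names are hypothetical and exist only to show the linking.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block: dict) -> str:
    """Hash a block's contents, including its pointer to the previous block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches: list) -> list:
    """Create one block per batch of transactions, each linked to the last."""
    chain = []
    prev = GENESIS_PREV
    for transactions in batches:
        block = {"prev_hash": prev, "transactions": transactions}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain: list):
    """Return the index of the first block whose prev_hash is wrong, or None."""
    prev = GENESIS_PREV
    for index, block in enumerate(chain):
        if block["prev_hash"] != prev:
            return index
        prev = block_hash(block)
    return None


chain = build_chain([["alice->bob 1"], ["bob->carol 1"], ["carol->dave 1"]])
print(first_broken_link(chain))  # None: every link checks out

# Rewrite history in the first block without regenerating the later blocks.
chain[0]["transactions"][0] = "alice->mallory 1"
print(first_broken_link(chain))  # 1: block 1 no longer points at block 0's hash
```

Editing the first block changes its hash, so the second block's stored reference stops matching. To hide the edit, every later block would have to be rebuilt.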

Hash functions and the Proof-of-Work algorithm together make it extremely difficult to alter the blockchain: changing a past block would require re-mining that block and every block after it. Doing so would mean out-computing the rest of the network, and any attempt to monopolize the network's computing power would be expensive because of the machinery needed to compute the hashes. This resistance to tampering has brought us to where we are today and helps explain why the idea behind a blockchain has turned into such a buzzworthy business focus.

OP_RETURN

One fascinating use of hashes within the Bitcoin protocol involves OP_RETURN, a script opcode that marks a transaction output as invalid, which makes that output provably unspendable. Because such an output can never be spent, it can carry a small amount of arbitrary data, such as the hash of a document. The proposal caused a bit of an uproar when first announced, since many in the community believed it was irresponsible: Bitcoin's use case was financial applications, and this turned the blockchain into a record for arbitrary data. It was eventually accepted anyway. It enabled one of the first use cases for proof-of-ownership of a digital asset, and quite a few apps have been created since its release, most notably proofofexistence.
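As a rough illustration, here is how the script for such an output could be assembled in Python. The sketch relies on two facts about Bitcoin Script: OP_RETURN is opcode 0x6a, and opcodes 0x01 through 0x4b push that many bytes of data. Everything else, including the function name op_return_script and the sample document, is an assumption made for this post. It builds only the script bytes, not a full transaction, and it does not reproduce the exact format any particular service uses.

```python
import hashlib

OP_RETURN = 0x6A      # script opcode that makes the output unspendable
MAX_DIRECT_PUSH = 75  # opcodes 0x01..0x4b push that many bytes directly


def op_return_script(payload: bytes) -> bytes:
    """Build the script bytes: OP_RETURN followed by a single data push."""
    if not 1 <= len(payload) <= MAX_DIRECT_PUSH:
        raise ValueError("payload must be between 1 and 75 bytes for a direct push")
    return bytes([OP_RETURN, len(payload)]) + payload


# Hypothetical document whose existence we want to timestamp.
document = b"my unpublished manuscript"
digest = hashlib.sha256(document).digest()  # 32 raw bytes

script = op_return_script(digest)
print(len(script), script[:2].hex())  # 34 6a20
```

Only the 32-byte hash goes on chain, so the document itself stays private. Anyone who later receives the document can hash it and compare the result with the data in the transaction.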

Although the hash function has long been a critical component of cryptography and security software in general, more people are starting to learn about it because of the many newcomers to the cryptocurrency industry. This is a good thing in my opinion, for two reasons. First, more people with a better understanding of the security underpinnings of technology will hopefully lead to demand for more robust software in the future, whether it is in the cryptocurrency space or not. Second, aspiring developers who apply what they learn are building those security standards into their own applications. I know that many people could argue that this idea is more hopeful than anything, especially with so many hacks taking place among large companies and governments, but only time will tell. We are still in the early stages of attempts by many companies and teams to bring this technology to the mainstream, and it is exciting to see many of the results.


Want proof I wrote this post? Take this key and decrypt it using Keybase.