I have put this series of articles on GitHub: blockchain-tutorial. Updates will be made on GitHub and may not be synced here. If you want to run the code directly, you can clone the tutorial repository from GitHub and run make in the src directory.
In the previous article, we built a very simple data structure, which is the core of the whole blockchain database. The blockchain prototype we have so far links blocks in a chain: each block is connected to the previous one.
However, our blockchain has one significant flaw: adding blocks to the chain is too easy and too cheap. One of the keystones of blockchain and Bitcoin is that adding a new block requires some very hard work first. In this article, we will fix this flaw.
Proof-of-Work
A key idea of blockchain is that one has to perform some hard work in order to put data into it. It is this hard work that makes the blockchain secure and consistent. In addition, whoever completes the work is rewarded (this is how people earn coins by mining).
This mechanism is very similar to real life: one has to work hard to earn a reward and sustain one's life. In a blockchain, the participants of the network (miners) work to sustain it: they keep adding new blocks and receive a reward for doing so. As a result of their work, each new block is incorporated into the blockchain securely, which maintains the stability of the whole blockchain database. It is worth noting that whoever finishes the work must also prove that they did it.
This whole "do hard work and prove it" mechanism is called proof-of-work. The work is hard because it requires a lot of computing power: even high-performance computers cannot do it quickly. Moreover, the difficulty of the work is adjusted over time to keep the rate at about 6 new blocks per hour. In Bitcoin, the goal of the work is to find a hash for a block that meets certain requirements. That hash serves as the proof. Thus, finding a proof (a valid hash) is the actual work.
Hash calculation
In this section, we'll discuss hash calculations. If you are already familiar with this concept, you can skip this section.
The process of obtaining a hash value for given data is called hashing. A hash is a unique representation of the data it was computed from. A hash function takes data of any size and produces a fixed-size hash. Here are a few key features of hashing:
- The original data cannot be recovered from a hash. In other words, hashing is not encryption.
- A given piece of data has exactly one hash, and in practice that hash is unique to it.
- Changing even one byte of the input data results in a completely different hash.
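The properties listed above can be observed directly with Go's standard crypto/sha256 package. This is only a small illustration; the helper name hashHex and the sample strings are chosen here for the example and are not part of the tutorial code.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashHex returns the SHA-256 hash of s as a hexadecimal string.
// (Hypothetical helper, used only in this illustration.)
func hashHex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return fmt.Sprintf("%x", sum[:])
}

func main() {
	a := hashHex("I like donuts")
	b := hashHex("I like donuts") // the same input again
	c := hashHex("I like donutz") // one byte changed
	d := hashHex("")              // empty input

	fmt.Println(a)
	fmt.Println(c)
	fmt.Println("same input, same hash:", a == b)
	fmt.Println("one byte changed, different hash:", a != c)
	fmt.Println("fixed output size:", len(a) == len(d))
}
```

Every SHA-256 hash is 32 bytes (64 hexadecimal digits) long, no matter how large or small the input is.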
Hash functions are widely used to check the consistency of data. Some software providers publish checksums alongside their packages. After downloading a file, you can feed it to a hash function and compare the result with the hash published by the author to verify that the download is intact.
In a blockchain, hashing is used to guarantee the consistency of a block. The input to the hashing algorithm includes the hash of the previous block, which makes it impossible (or at least very difficult) to modify a block in the chain: anyone who modifies a block must recalculate its hash and the hashes of all the blocks after it.
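To make the chaining argument concrete, here is a minimal sketch. It uses a stripped-down, hypothetical miniBlock type (not the tutorial's Block) in which each hash covers the previous hash plus the data, and shows that tampering with one block breaks the chain from that point on.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// miniBlock is a hypothetical, stripped-down block used only for this illustration.
type miniBlock struct {
	Data     []byte
	PrevHash []byte
	Hash     []byte
}

// hashBlock hashes the previous block's hash together with the block's data.
func hashBlock(prevHash, data []byte) []byte {
	sum := sha256.Sum256(bytes.Join([][]byte{prevHash, data}, []byte{}))
	return sum[:]
}

// buildChain links the given payloads into a chain of blocks.
func buildChain(payloads ...string) []miniBlock {
	var chain []miniBlock
	prev := []byte{}
	for _, p := range payloads {
		b := miniBlock{Data: []byte(p), PrevHash: prev}
		b.Hash = hashBlock(b.PrevHash, b.Data)
		chain = append(chain, b)
		prev = b.Hash
	}
	return chain
}

// firstBroken returns the index of the first block whose stored hash or link
// no longer matches its contents, or -1 if the chain is consistent.
func firstBroken(chain []miniBlock) int {
	prev := []byte{}
	for i, b := range chain {
		if !bytes.Equal(b.PrevHash, prev) || !bytes.Equal(b.Hash, hashBlock(b.PrevHash, b.Data)) {
			return i
		}
		prev = b.Hash
	}
	return -1
}

func main() {
	chain := buildChain("Genesis Block", "Send 1 BTC to Ivan", "Send 2 more BTC to Ivan")
	fmt.Println("first broken block:", firstBroken(chain)) // -1: chain is consistent

	chain[1].Data = []byte("Send 100 BTC to Ivan") // tamper with the middle block
	fmt.Println("first broken block:", firstBroken(chain)) // 1

	// Recomputing only the tampered block's hash just moves the break forward.
	chain[1].Hash = hashBlock(chain[1].PrevHash, chain[1].Data)
	fmt.Println("first broken block:", firstBroken(chain)) // 2
}
```

Without proof-of-work, recomputing all the later hashes is cheap; requiring hard work per block is what makes such rewriting expensive.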
Hashcash
Bitcoin uses Hashcash, a proof-of-work algorithm originally used to prevent spam. It can be decomposed into the following steps:
1. Take some publicly known data (for an email, it can be the recipient's email address; in Bitcoin, it is the block header).
2. Add a counter to this data. The counter starts at 0.
3. Hash the combination of the data and the counter.
4. Check whether the hash meets certain requirements:
   - If it does, you're done.
   - If it doesn't, increase the counter and repeat steps 3 and 4.
So this is a brute force algorithm: change the counter, calculate a new hash, check it, increment the counter, calculate a hash, check it, and so on. This is why it is computationally expensive: the calculate-and-check step has to be repeated a huge number of times.
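The loop described above can be sketched in a few lines of Go. This is a simplified illustration, not the original Hashcash: it uses SHA-256, and appending the counter to the data as lowercase hexadecimal is an assumption modelled on the "I like donutsca07ca" example later in the article. The names leadingZeroBits and findCounter are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
)

// leadingZeroBits counts how many bits at the start of hash are zero.
func leadingZeroBits(hash []byte) int {
	n := 0
	for _, b := range hash {
		if b != 0 {
			return n + bits.LeadingZeros8(b)
		}
		n += 8
	}
	return n
}

// findCounter is a hypothetical helper: it tries counters 0, 1, 2, ... until
// SHA-256(data + counter in hex) starts with at least zeroBits zero bits.
func findCounter(data string, zeroBits int) (int, [32]byte) {
	for counter := 0; ; counter++ {
		hash := sha256.Sum256([]byte(fmt.Sprintf("%s%x", data, counter)))
		if leadingZeroBits(hash[:]) >= zeroBits {
			return counter, hash
		}
	}
}

func main() {
	// 16 bits keeps the search short; each extra bit doubles the expected work.
	counter, hash := findCounter("I like donuts", 16)
	fmt.Printf("counter: %x\nhash:    %x\n", counter, hash)
}
```

Checking a result takes a single hash, while finding it takes many attempts. That asymmetry is what makes the hash usable as a proof.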
Now, let's take a closer look at the requirements a hash has to meet. In the original Hashcash implementation, the requirement was that "the first 20 bits of a hash must be zero". In Bitcoin, the requirement is adjusted from time to time: by design, a block should be generated about every 10 minutes, no matter how much the computing power grows or how many miners join the network, so the requirement has to be adjusted dynamically.
To illustrate this algorithm, I took the data from the previous example ("I like donuts") and found a hash whose first 3 bytes are all zero.
ca07ca is the hexadecimal value of the counter, which is 13240266 in decimal.
Implementation
OK, we are done with the theory, so let's write some code! First, define the mining difficulty:
const targetBits = 24
In Bitcoin, "target bits" is the block header field that stores the difficulty at which the block was mined. Here, 24 means that the first 24 bits of the computed hash must be 0; in hexadecimal, that means the first 6 digits must be 0, which can be seen in the final output. We won't implement an algorithm that adjusts the target dynamically for now, so the difficulty is simply defined as a global constant.
24 is an arbitrary number; the goal is to have a target that takes up less than 256 bits in memory. We want the difference to be significant enough, but not too big, because the bigger the difference, the harder it is to find a valid hash.
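As a rough guide to how the difficulty scales, assume every hash value is equally likely. Then the chance that a single hash starts with targetBits zero bits is 1 in 2^targetBits, so about 2^targetBits attempts are needed on average. The sketch below prints that estimate; expectedAttempts is a hypothetical helper introduced here, not part of the tutorial code.

```go
package main

import (
	"fmt"
	"math/big"
)

// expectedAttempts returns 2^targetBits: the average number of hashes to try,
// assuming each hash is equally likely to start with targetBits zero bits.
func expectedAttempts(targetBits uint) *big.Int {
	return new(big.Int).Lsh(big.NewInt(1), targetBits)
}

func main() {
	for _, n := range []uint{8, 16, 24, 32} {
		fmt.Printf("targetBits=%2d -> about %s attempts\n", n, expectedAttempts(n))
	}
}
```

For targetBits = 24 this gives 16777216, so each block takes on the order of tens of millions of hash computations, which fits the noticeable delay seen when the program is run.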
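// ProofOfWork holds a block and the target its hash must stay below.
type ProofOfWork struct {
	block  *Block
	target *big.Int
}

// NewProofOfWork builds a ProofOfWork for b with target = 1 << (256 - targetBits).
func NewProofOfWork(b *Block) *ProofOfWork {
	target := big.NewInt(1)
	target.Lsh(target, uint(256-targetBits))

	pow := &ProofOfWork{b, target}

	return pow
}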
Here, we create the ProofOfWork structure, which holds a pointer to a block and a pointer to a target. The "target" is the requirement described in the previous section. We use a big integer because of the way we compare a hash to the target: we convert the hash to a big integer and check whether it is smaller than the target.
In the NewProofOfWork function, we initialize a big.Int with the value 1 and shift it left by 256 - targetBits bits. 256 is the length of a SHA-256 hash in bits, and SHA-256 is the hashing algorithm we are going to use. The hexadecimal representation of target is:
0x10000000000000000000000000000000000000000000000000000000000
It occupies 30 bytes in memory (a 0x01 byte followed by 29 zero bytes). Here is a visual comparison with the hashes from the previous example:
0fac49161af82ed938add1d8725835cc123a1a87b1b196488360e58d4bfb51e3
0000010000000000000000000000000000000000000000000000000000000000
0000008b0f41ec78bab747864db66bcb9fb89920ee75f43fdaaeb5544f7f76ca
The first hash (calculated from "I like donuts") is bigger than the target, so it is not a valid proof of work. The second hash (calculated from "I like donutsca07ca") is smaller than the target, so it is a valid proof.
Translator's note: Some readers have pointed out that the comparison above does not match reality, since the first hash is not actually the hash of "I like donuts". The author's point still stands, though, and this is probably just an oversight. Here is a small experiment I did:
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

func main() {
	data1 := []byte("I like donuts")
	data2 := []byte("I like donutsca07ca")
	targetBits := 24
	target := big.NewInt(1)
	target.Lsh(target, uint(256-targetBits))

	fmt.Printf("%x\n", sha256.Sum256(data1))
	fmt.Printf("%64x\n", target)
	fmt.Printf("%x\n", sha256.Sum256(data2))
}
Output:
You can think of the target as the upper bound of a range: if a number (converted from a hash) is smaller than the bound, it is valid; otherwise it is invalid. Lowering the bound leaves fewer valid numbers, so more hard work (a long series of repeated calculations) is needed to find one.
Now we need the data to hash. Let's prepare it:
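The upper-bound check can be isolated into a small sketch. The helper names newTarget and isValidHash are hypothetical; the comparison itself is the same big.Int technique the article uses. The two hand-built hashes sit right at the boundary to show that the comparison is strict.

```go
package main

import (
	"fmt"
	"math/big"
)

// newTarget returns 1 << (256 - targetBits), the exclusive upper bound for valid hashes.
func newTarget(targetBits uint) *big.Int {
	return new(big.Int).Lsh(big.NewInt(1), 256-targetBits)
}

// isValidHash reports whether hash, read as a big-endian integer, is below target.
func isValidHash(hash [32]byte, target *big.Int) bool {
	var hashInt big.Int
	hashInt.SetBytes(hash[:])
	return hashInt.Cmp(target) == -1
}

func main() {
	target := newTarget(24)

	// 0x000000ffff...ff: the first 24 bits are zero, so it is just below the target.
	var justBelow [32]byte
	for i := 3; i < 32; i++ {
		justBelow[i] = 0xff
	}

	// 0x00000100...00: exactly equal to the target, so not strictly smaller.
	var exactly [32]byte
	exactly[2] = 0x01

	fmt.Println(isValidHash(justBelow, target)) // true
	fmt.Println(isValidHash(exactly, target))   // false
}
```

With this target, "smaller than the target" is the same condition as "the first 24 bits are zero".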
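// prepareData concatenates the block fields, the difficulty, and the nonce
// into the byte slice that will be hashed.
func (pow *ProofOfWork) prepareData(nonce int) []byte {
	data := bytes.Join(
		[][]byte{
			pow.block.PrevBlockHash,
			pow.block.Data,
			IntToHex(pow.block.Timestamp),
			IntToHex(int64(targetBits)),
			IntToHex(int64(nonce)),
		},
		[]byte{},
	)

	return data
}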
This part is straightforward: we simply merge the block fields with the target bits and the nonce. Here, nonce is the counter from the Hashcash description above; it is a cryptographic term.
OK, all the preparations are done. Here is the core of the PoW algorithm:
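prepareData relies on an IntToHex helper that is not shown in this article. A plausible sketch, assuming it simply converts an int64 into its 8-byte big-endian representation using encoding/binary, could look like this (the helper in the tutorial repository may differ):

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"log"
)

// IntToHex converts an int64 to an 8-byte big-endian byte slice.
// Assumption: this mirrors what the article's helper does; it is not shown in the text.
func IntToHex(num int64) []byte {
	buff := new(bytes.Buffer)
	if err := binary.Write(buff, binary.BigEndian, num); err != nil {
		log.Panic(err)
	}
	return buff.Bytes()
}

func main() {
	fmt.Printf("%x\n", IntToHex(24))       // 0000000000000018
	fmt.Printf("%x\n", IntToHex(13240266)) // 0000000000ca07ca
}
```

A fixed-width encoding keeps the hashed data unambiguous: every integer field always contributes exactly 8 bytes.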
// Run searches for a nonce that makes the block's hash smaller than the target.
func (pow *ProofOfWork) Run() (int, []byte) {
	var hashInt big.Int
	var hash [32]byte
	nonce := 0

	fmt.Printf("Mining the block containing \"%s\"\n", pow.block.Data)
	for nonce < maxNonce {
		data := pow.prepareData(nonce)
		hash = sha256.Sum256(data)
		hashInt.SetBytes(hash[:])

		// A hash below the target is a valid proof of work.
		if hashInt.Cmp(pow.target) == -1 {
			fmt.Printf("\r%x", hash)
			break
		} else {
			nonce++
		}
	}
	fmt.Print("\n\n")

	return nonce, hash[:]
}
First, we initialize the variables:
- hashInt is the integer representation of hash;
- nonce is the counter.
Then we start an "infinite" loop. It is limited by maxNonce, which equals math.MaxInt64; this avoids a possible overflow of nonce. Although the difficulty of our PoW implementation is too low for the counter to overflow, it is still better to have this check, just in case.
In the loop, we:
- Prepare the data
- Hash it with SHA-256
- Convert the hash to a big integer
- Compare the integer with the target
It is as simple as explained earlier. Now we can remove the SetHash method of Block and modify the NewBlock function:
func NewBlock(data string, prevBlockHash []byte) *Block {
	block := &Block{time.Now().Unix(), []byte(data), prevBlockHash, []byte{}, 0}
	pow := NewProofOfWork(block)
	nonce, hash := pow.Run()

	block.Hash = hash[:]
	block.Nonce = nonce

	return block
}
Here you can see that nonce is saved as a property of Block. This is necessary because the nonce is required later to verify the proof. The Block structure now looks like this:
type Block struct {
	Timestamp     int64
	Data          []byte
	PrevBlockHash []byte
	Hash          []byte
	Nonce         int
}
All right! Let's run the program and see if everything works:
Mining the block containing "Genesis Block"
00000041662c5fc2883535dc19ba8a33ac993b535da9899e593ff98e1eda56a1

Mining the block containing "Send 1 BTC to Ivan"
00000077a856e697c69833d9effb6bdad54c730a98d674f73c0b30020cc82804

Mining the block containing "Send 2 more BTC to Ivan"
000000b33185e927c9a989cc7d5aaaed739c56dad9fd9361dea558b9bfaf5fbe

Prev. hash:
Data: Genesis Block
Hash: 00000041662c5fc2883535dc19ba8a33ac993b535da9899e593ff98e1eda56a1

Prev. hash: 00000041662c5fc2883535dc19ba8a33ac993b535da9899e593ff98e1eda56a1
Data: Send 1 BTC to Ivan
Hash: 00000077a856e697c69833d9effb6bdad54c730a98d674f73c0b30020cc82804

Prev. hash: 00000077a856e697c69833d9effb6bdad54c730a98d674f73c0b30020cc82804
Data: Send 2 more BTC to Ivan
Hash: 000000b33185e927c9a989cc7d5aaaed739c56dad9fd9361dea558b9bfaf5fbe
It worked! You can see that every hash now starts with three zero bytes, and that it takes some time to find these hashes.
There is one more thing left to do: make it possible to validate a proof of work:
// Validate recomputes the hash from the stored nonce and checks it against the target.
func (pow *ProofOfWork) Validate() bool {
	var hashInt big.Int

	data := pow.prepareData(pow.block.Nonce)
	hash := sha256.Sum256(data)
	hashInt.SetBytes(hash[:])

	isValid := hashInt.Cmp(pow.target) == -1

	return isValid
}
This is where the nonce we saved earlier is needed.
Check again to see if it works correctly:
func main() {
	...

	for _, block := range bc.blocks {
		...
		pow := NewProofOfWork(block)
		fmt.Printf("PoW: %s\n", strconv.FormatBool(pow.Validate()))
		fmt.Println()
	}
}
Output:
...

Prev. hash:
Data: Genesis Block
Hash: 00000093253acb814afb942e652a84a8f245069a67b5eaa709df8ac612075038
PoW: true

Prev. hash: 00000093253acb814afb942e652a84a8f245069a67b5eaa709df8ac612075038
Data: Send 1 BTC to Ivan
Hash: 0000003eeb3743ee42020e4a15262fd110a72823d804ce8e49643b5fd9d1062b
PoW: true

Prev. hash: 0000003eeb3743ee42020e4a15262fd110a72823d804ce8e49643b5fd9d1062b
Data: Send 2 more BTC to Ivan
Hash: 000000e42afddf57a3daa11b43b2e0923f23e894f96d1f24bfd9b8d2d494c57a
PoW: true
As can be seen, mining the three blocks took more than a minute this time, much slower than without proof of work (that is, the cost is much higher).
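The asymmetry between mining and validating can be tried out in isolation. The following is a self-contained sketch, not the tutorial's code: demoBlock, mine, and validate are hypothetical simplified stand-ins for Block, Run, and Validate, and the difficulty is lowered to 16 bits (an assumption made here) so that it finishes almost instantly.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/big"
)

// A lower difficulty than the article's 24 bits, so the demo finishes quickly.
const demoTargetBits = 16

// demoBlock is a simplified stand-in for the article's Block.
type demoBlock struct {
	Data  []byte
	Nonce int
}

var demoTarget = new(big.Int).Lsh(big.NewInt(1), 256-demoTargetBits)

// hashWithNonce hashes the block data together with an 8-byte big-endian nonce.
func hashWithNonce(b *demoBlock, nonce int) [32]byte {
	n := make([]byte, 8)
	binary.BigEndian.PutUint64(n, uint64(nonce))
	return sha256.Sum256(bytes.Join([][]byte{b.Data, n}, []byte{}))
}

// below reports whether hash, read as an integer, is smaller than the target.
func below(hash [32]byte) bool {
	var hashInt big.Int
	hashInt.SetBytes(hash[:])
	return hashInt.Cmp(demoTarget) == -1
}

// mine searches for a nonce that makes the block's hash valid and stores it.
func mine(b *demoBlock) {
	for nonce := 0; ; nonce++ {
		if below(hashWithNonce(b, nonce)) {
			b.Nonce = nonce
			return
		}
	}
}

// validate re-hashes the block with its stored nonce: one hash, no search.
func validate(b *demoBlock) bool {
	return below(hashWithNonce(b, b.Nonce))
}

func main() {
	b := &demoBlock{Data: []byte("Send 1 BTC to Ivan")}
	mine(b)
	fmt.Println("PoW valid:", validate(b))

	b.Data = []byte("Send 9 BTC to Ivan") // tamper with the data after mining
	fmt.Println("PoW valid after tampering:", validate(b))
}
```

After the data is changed, the stored nonce almost certainly no longer yields a hash below the target, so the block would have to be mined again.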
Conclusion
Our blockchain is one step closer to a real blockchain: adding new blocks now requires hard work, so mining is possible. However, it still lacks some crucial features: the blockchain database is not persistent, and there are no wallets, addresses, transactions, or consensus mechanism. We will implement all of these in future articles. For now, happy mining!
Links:
- Full Source Codes
- Blockchain hashing algorithm
- Proof of work
- Hashcash
Source code for this article: Part_2
Original:
Building Blockchain in Go. Part 2: Proof-of-Work