On the path towards scaling Bitcoin to billions of transactions. Part I.

In this new series, I'll go over the recent progress towards making Bitcoin a trusted financial systems vulnerable to monopolies.

Posted by louishenrifranc on July 15, 2017

On the path towards scaling Bitcoin to billions of transactions. Part I.

I’ve recently attended a Machine Learning meetup in London, where i am trask was presenting his research project, at the intersection of machine learning, homomorphic encryption, and blockchain. His project is to create a platform where a user can share their private data, and let companies train machine learning model on them; and at the end of the journey, the user gets back some money for it, and their data is never shared with anyone. If you have the time to read the slides, it is worth it (here is the link).

At the same time, there have been some amazing features that have just been added to the Bitcoin blockchain such as the lightning network. I decided it was the time to write on BlockChain and crypto currency, as they used to be my area of research two years ago.

I’m going to present briefly what a Blockchain is, what Bitcoin is, and how both of them works? If you feel the need to deepen your understanding of Bitcoin/blockchain, I still think this book from Andreas M. Antonopoulos is a reference. I loved the book, the writer is amazing, and I love his state of mind.

So let’s go. :)

Just for the sake of simplicity, when I’ll refer to Blockchain in the future, I’ll be talking about the Bitcoin’s Blockchain!

What is a Blockchain?

Figure 1. A representation of a blockchain with multiple forks

  • It is an underlying protocol of the current implementation of the famous crypto currency Bitcoin. There is no Bitcoin without Blockchain. There could be Blockchain without Bitcoin.
  • A blockchain is a distributed data base, which handles a list of records protected against falsification and modification by the nodes that stored it. Multiple entities can hold the Blockchain, but no entry can be faked. Of course, anyone can try to tamper their local version of the Blockchain, but it will instantly become desynchronized with the rest of the network.

What is a block?

Figure 2: An example of a block Block - Chain. In Blockchain, you add blocks. But what are these blocks? Well for the Bitcoin Blockchain, a block is composed of:

  • A list of transaction that the block verify. When the block is added to the blockchain, this transaction will be considered made, just as when you pay in the supermarket with your credit card.
  • A header identifying the block uniquely. This header contains:
    • The version of the Blockchain
    • A unique block hash. The block hash identifies a block uniquely and unambiguously and can be independently derived from any node by simply hashing the block header.
    • A unique hash identifying the previous block. Same principle. Because blocks are stacked on top of each other, when miners create a new block, they mine on top of another extracted block.
    • A Merkle Tree
    • A timestamp
    • A nonce
    • A level of difficulty.

Merkle Tree, Nonce, Level of difficulty, let me explain them a little bit more :)

What is a Merkle Tree?

Figure 3. A simple Merkle Tree with four transactions A Merkle tree, also known as a binary hash tree, is a data structure used for efficiently summarizing and verifying the integrity of large sets of data. In the blockchain, transactions are saved in the Merkle Tree. The tree is constructed in a bottom-up manner when leaves contain a unique representation of a single Bitcoin transaction, a hash. The top node is used to compute the single block hash.

The Merkle Tree builds on top of this leaves, and the root of the tree becomes a unique identifier of all these transactions. The use of the data structure, is to allow certain nodes that do not hold a version of the blockchain to be able to verify the correctness of any past transaction in a fraction of exchanges. For example, given a transaction, and a unique block hash, anyone can re-compute the path locally from the leave to the top nodes by asking for adjacent nodes (blue node in the next figure).

How Merkle Tree can be used for efficiently verifying any transaction

Proof of Concept for building a Merkle Tree in C++

void Block::BuildMerkleRoot()
{
    // To get a even number of transactions
    if (transactions.size() & 1) {
        transactions.push_back(transactions.at(transactions.size() - 1)); nombreTransaction++;
    }
    int N = transactions.size();

    vector<string> hashTree;
    hashTree.resize(2 * N - 1);
    for (int i = 0; i < N; i++)
        hashTree.at(2 * N - 2 - i) = transactions.at(i);
    for (int i = N - 2; i > -1; i--)
    {
        // Double SHA256 of the nodes
        hashTree.at(i) = SHA25::sha256(SHA25::sha256(hashTree.at(2 * i + 1) + hashTree.at(2 * i + 2)));
    }
    
    // Set the Hash Merkle Root in the header
    header.setHashMerkleRoot(hashTree.at(0));
    // Set the block Hash
    header.setBlockHash(SHA25::sha256(SHA25::sha256(header.get_HashMerkleRoot())));
}

What mining the Blockchain is?

First of all, to make things clear, only a part of the bitcoin network is mining! They do the hard job of keeping the blockchain secure, while as bitcoin user we can safely make the transaction between peers. Without mining, there is no safe blockchain.

Think about mining as a way to assess that some work has been made (i.e., a proof of work). Because mining is computationally expensive, if you can solve the mining problem, you’ll get a reward for it, here bitcoin. That’s is one way for the miner to get paid (not the only one).

Here is the mining problem: Find a nonce \(n\) for a given block \(B\), and a difficulty \(D\), such that:

SHA256(SHA256(headerHash(\(B\) + \(n\)))) < 0.(\(D\) zeros)1000000

SHA256 is a hash function, it is a noninvertible function. It means that if \(f(x) = y\), and \(y\) is known, there is no finite form to compute back what was \(x\). Then, if you want to find out what this \(x\) is, you’ll better start off now because every possible value in the input space can be the correct one. The solution of this problem is completely random.

Proof of Concept for mining in C++

paire Block::solveProofofWork()
{
    unsigned long long nonce = 0, incr = 0;
    unsigned long long const limit = std::numeric_limits<unsigned long long>::max();

    // Here we set the mining difficulty to finding a hash where the first Constant::DIFFICULTY_MINING are zeros.
    string sol(Constant::DIFFICULTY_MINING, '0');
    // Until no solution has been found
    while (true) {
        // Compute a new hash
        string hash = SHA25::sha256(string(header + std::to_string(incr) + std::to_string(nonce)));
        if (hash.substr(0, Constante::DIFFICULTY_MINING) == sol)
            break;
        else
            ++nonce;
        if (limit - 1 == nonce) // Increment the incr when nonce reach unsigned long long int max value
        {    
            // To overcome machine precision limit, if we reach the larget int, we add an increment, which would 
            // modify the hash
            incr++;
            nonce = 0;
        }
    }
    header.setNonce(paire(incr, nonce));

    return paire(incr, nonce);
}

Why are we calling it a proof of work?

  • The number of trailing zeros is currently 14, which corresponds to 10 minutes of computation for a supercomputer, but an eternity for any modern laptop.
  • Every 2000 blocks, the mining difficulty is automatically recomputed.
  • There are approximately 1.000.000.000.000.000.000.000.000 hashes to compute per block.
  • A miner with a hash rate of 1 trillion hashes per second would take 59 days to solve the proof of work

Why this computation problem make the Blockchain secure?

Forging fake blocks is impossible! Remember, a block is correct if it has a reference to a previously mined block. When you search for the correct nonce, you are hashing a mixture of your new block’s header, containing a reference to the previous block hash, and this nonce. If you succeed to find the correct nonce, then you can share it with the community and anyone can checks if the new block is valid. Hence it is unlikely that any miner will find the nonce multiple times in a row before everyone.

What are Bitcoin forks?

  • A fork appears when two miners solve the proof of work at the same time in different part of the network.
  • They propagate their new block to their neighbors. For simplicity, imagine that half of the network receives the first on (red block), the other half, the southern hemisphere receives the other one, the green block.
  • Every node verify the incoming block. If the block is valid, they add it to the blockchain and pass it to their neighbors.
  • The network is now probably cut in two half, where the last block added differs.
  • Half of the miner will decide to mine on top of the green block, the other part, on the red block.
  • If a new block (yellow) block is actually mined first on top of the green one, it will spread again across the network. During that time, both green and red will have been shared to all peers of the network, so “red miner” will realise they received a new block that has a higher height. At this moment they can stop mining on top of the red block, and restart again with the yellow one as predecessor, or keep trying to solve the POF with the red block as the previous one. If they do so, it might appear that this path will be privileged, because some miners were able to mine block quicker on top of this fork.
  • Every mining pool has its optimal way of splitting the mining work, but often, they tried to build different block on top of different branch, to potentially maximize their chance of being the first to mine a block that will get accepted as in the main Blockchain.
  • Given that the miners are always mining on top of the longest chain, a common consensus is established beyond a certain depth.

On the recent improvement of the Bitcoin

This common consensus can take up to one hour before they are irreversible. A new network build on top of the bitcoin has recently been merged to the core code. The Lightning Network is a decentralized system for instant, high-volume micropayments that removes the risk of delegating custody of funds to trusted third parties. In a next article, I’ll explain in depth what are the benefit of this new implementation.