Tuesday, July 5, 2011

How Bitcoin Works: A Technical Discussion

There's a lot of layman speak around for how Bitcoin (aka BTC) and Bitcoin "mining" works, but little that covers it without glossing over the details. So here's a guided tour of some of the gory details.

(The Bitcoin Wiki has most of this information, but it doesn't offer anything like a guided tour, which is what this attempts to be.)

Basics.

1. Public Key Cryptography. If you use the same password to encrypt something as you used to decrypt something, this is called "Symmetric-Key Encryption". With "Public Key Encryption", two different passwords are created; anything encrypted with one can be decrypted by the other. In practice, one is selected to be your "private" key, and kept secret; the other is designated a "public" key and broadcasted to the world.

2. Hash. Hashing functions are computer algorithms designed to take any arbitrary input of any length and produce a numeric derivative of that input. Hashes are repeatable; for the same input, you will always get the same hash. Cryptographically strong hash functions (or "one-way hashes") go a step further: A change of just one bit on the input (no matter the length) is expected to change around 50% of the output bits; also, it's expected that it's near-impossible to take any given hash and produce a string that would have created it. (That last property isn't important for Bitcoin, but is important for things like digital signatures.) Bitcoin uses SHA-256, which produces a 256bit hash of anything input to it.

3. Digital Signatures. If you take a hash of a document, then encrypt that hash with your private key, you've created a digital signature. If you attach that digital signature to the document, then anyone with your public key can create their own hash of that document, decrypt your hash using your public key, and compare. If the hashes match, then (a) that document must have come from you, and (b) must not have been altered. (Otherwise the decrypted hash wouldn't match the generated hash.)

4. Wallets and Accounts. These are constructs created by the Bitcoin clients that really have nothing to do with how Bitcoin actually works, except insofar as wallet accounts have one or more Addresses.

5. Addresses. Bitcoin addresses are really public keys. [Strictly speaking, they're hashes of public keys.  Thanks to niko for that clarification.] You can send BTC to someone (in a Transaction) at their Address, and everyone on the network can see that you did so, but only the holder of the address's private key (which is stored in their Wallet) can actually spend the BTC awarded to you by that transaction.

6. Transactions. BTC doesn't exist at addresses; BTC only exists in transactions. Any time you spend BTC, you are (completely) draining one or more "input transactions" and sending that BTC to one or more "output addresses." Note that the inputs are transactions, and the outputs are addresses. You can't partially spend ("redeem") an input transaction; if there's BTC left over, you send the "change" back to yourself at an address you own (typically created just for that transaction.) Transactions are finalized when they're included in Blocks. Including a Transaction Fee helps ensure a miner will include you in their Block. You prove that you're allowed to spend the input transaction(s) by digitally signing the transaction(s) you're spending.  (And by "you", I mean your Bitcoin client software.  This all happens without you having to worry about it.)

Mining and Blocks.

1. Miners are Bitcoin's professional auditors.  The point of Bitcoin Mining is to create Blocks of transactions (that haven't yet been included in a block.) At the time of this writing, the creator of a Block is awarded 25BTC, but this will decrease over time. The block creator is also awarded any Transaction Fees attached to transactions included in the block. Eventually, new blocks will stop creating BTC altogether, and miners will be rewarded solely through transaction fees.  The blocks are in a sequence known as the "block chain."  Sometimes two miners will create a block at the same time, which is known as a "block chain split" or "fork."  Whichever side of the split grows faster quickly becomes the "winner" and becomes the official transaction ledger for Bitcoin going forward.

[If you've wondered what a "51% attack" is, it's a mostly-theoretical attack where someone controlling more than half of the hashing power of the network could spend Bitcoin, then create a fork that spends the same transactions a different way.  They force the network to accept their reality by growing their chain faster than the "real" one, so their original transaction would be unspent.  Presumably, after the actor has received whatever product or service they'd bought with the original transaction that they've just unspent/reallocated.]

2. Difficulty. Mining consists of attempting to find a hash of: the previous block, the current set of un-blocked transactions, a transaction indicating where the block reward+fees should be paid to, and a nonce ("nonce" = "number used once"). Since any change to the nonce will very randomly change the resulting hash, Mining involves trying (literally!) trillions of different nonce values in order to find a hash that's numerically below the current difficulty number, thereby "solving" the block. In theory, the first hash you try could solve the current block; but in practice it generally takes many, many, many, many tries. Note that every hash has an equal chance of solving a block; so while there's an average time to solve a block, there's theoretically no limit to the amount of time it might take to solve a block. Fortunately we have the Law of Large Numbers working on our side.

3. Invalid Blocks. The block chain that represents the entire history of all transactions in the Bitcoin network is linear; there can be no [permanent] branches. If two miners solve a block at about the same time, a branch occurs, and miners may begin building on either of the two. Whichever block chain grows faster becomes the "valid" block chain for the network, and the blocks in the "losing" chain become invalid. This is why there's a 100-block maturation time before block rewards and transaction fees may be spent; it's insurance against those BTC becoming invalid in the meantime.

4. Pools. Because solving blocks is so ridiculously hard, you can share the work with others by joining a Pool. When you join a pool, you work alongside many other miners to attempt to solve the pool's current block. The block you're attempting to create includes the transaction that distributes both the block reward  and the block's accumulated transaction fees. "Shares" are hashes that are numerically below a much lesser difficulty level than the block difficulty; the pool can easily verify that your "share" hash is valid for the block (though otherwise useless); this proves that you've been working on that pool's block, and (through share quantity) gives a reasonably accurate depiction of how much work you've been contributing. (Note: because the pool's block-in-progress includes the transaction that pays the pool, someone can't take their "winning" hash and steal the block reward from the pool unfairly.)

Discussions about the viability of Bitcoin for any particular purpose are political, not technical, and are outside the scope of this document.

 Other Resources.

The Bitcoin Wiki: https://en.bitcoin.it/wiki/Main_Page
A more conversational description of mining: http://monedial.com/index.php/en/2012-12-21-20-51-25/general-information-sites-about-bitcoin/how-are-bitcoins-created-what-is-bitcoin-mining