Thanks for the links guys, will read now.
Here is my current understanding, by the way
The block chain is the main innovation of Bitcoin. Every transaction you make with a Bitcoin, adds a block – a copy of this ledger, to every BTC in existence.
It is clear that you have a LOT to learn about bitcoin, and not just about cryptography.
A block is a collection of one or more transactions. Creating a transaction does not add a "block" to the blockchain. Miners add blocks to the blockchain by grouping together some transactions, creating a header, and performing a proof-of-work on the header.
A block chain is a transaction ledger. A full copy of a currency's block chain contains every transaction ever executed in every Bitcoin in the currency.
Ok, now you're on the right track. This is basically correct. It would be more correct to say that a full copy of the blockchain contains every confirmed transaction ever executed in that cryptocurrency. Unconfirmed transactions are not yet in the blockchain.
When you want to spend a BTC, the system uses asymmetric cryptography – where you have both a public key (like a PO Box) and a private key – a signature to say you authorise spending of that BTC (like a key to the PO Box).
I've heard the PO BOX analogy before. It isn't horrible for a very high level description, but it can cause confusion when someone doesn't let go of the analogy as they try to understand the protocol at a detailed level.
When you spend a BTC, the BTC network generates a mathematical puzzle to solve, in order to approve & add that transaction to the Block chain. This puzzle is so difficult, it takes the worldwide network an average of 10 minutes to solve – so it is mathematically impossible for one person (or group) to fraudulently create transactions.
The large data puzzle might be “find the integer square root of 2.16 to power of 37”.
Ok, now you are straying off a bit.
When you spend some BTC, the transaction is relayed peer to peer until eventually nearly the whole network is aware of the transaction. When miners (which are working on a proof of work for the block of transactions that they are currently attempting to add to the blockchain) hear about your transaction they add your transaction to the pool of transactions that they can choose from for their blocks. Eventually some miners include your transaction in a block that they are working on. If any of those miners are the first to solve the proof of work, then their block (with your transaction) is added to the blockchain.
The "mathematical puzzle" is not difficult, it is just time consuming. As an analogy, imagine that I give you two hundred 6-sided dice, and ask you to continually roll them all together until you roll at least fifty sixes in a single roll. This isn't a "difficult" puzzle, but it will take you a while to accomplish. Most of your rolls will have about thirty-three sixes. Some will have fewer, and some will have more. Now imagine a group of people all doing the same thing as you. You are all racing to see who can roll fifty sixes first. Obviously the faster an individual rolls their dice, the more likely they will be to get lucky (since they'll have more attempts than others). This is similar to an individual miner having more hash power than others. Also, as more people participate in the contest, it becomes more likely that someone, somewhere, will succeed in rolling fifty sixes. This is similar to more people getting involved in mining in the world.
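To put rough numbers on the dice analogy, here is a small sketch that computes the exact chance of a winning roll from the binomial distribution. The function name prob_at_least is just something I made up for illustration, and the dice are assumed to be fair.

```python
from math import comb

def prob_at_least(k, n=200, p=1/6):
    """Probability of rolling at least k sixes with n fair dice."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance that a single roll of 200 dice shows fifty or more sixes.
p_win = prob_at_least(50)

# With a fixed chance per roll, the average number of rolls needed is 1 / chance.
expected_rolls = 1 / p_win

print(f"chance per roll: {p_win:.6f}")
print(f"average rolls needed: {expected_rolls:.0f}")

# Lowering the bar (fewer sixes required) makes a win more likely.
print(f"chance with only 45 sixes required: {prob_at_least(45):.6f}")
```

The point is only that each roll has a small, fixed chance of winning, so the time to win depends on how many rolls per second the group as a whole can manage.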
In Bitcoin, instead of dice, the miners are calculating a SHA-256 hash. The results of a SHA256 hash are always the same for a given input, but unpredictable until it has been computed. Since the result is unpredictable, calculating the hashes is a bit like rolling dice. To set the "difficulty", the protocol requires that the result of the SHA256 hash be lower than some value. If it isn't, then the input is adjusted, and the hash is calculated again. This is repeated by all miners globally (each with their own inputs) until someone, somewhere, finds a hash that has a value lower than the current target.
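As a rough illustration of the "hash until the result is below the target" loop, here is a toy sketch. It is not the real protocol: as far as I know Bitcoin applies SHA-256 twice to an 80-byte block header, while this sketch hashes an arbitrary byte string once. The names hash_value, mine, and TOY_TARGET are made up for the example, and the target is set so that roughly 1 hash in 4096 qualifies.

```python
import hashlib

def hash_value(data: bytes) -> int:
    """Interpret the SHA-256 digest of data as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def mine(payload: bytes, target: int):
    """Append counters 0, 1, 2, ... to payload until the hash is below target."""
    counter = 0
    while True:
        value = hash_value(payload + str(counter).encode())
        if value < target:
            return counter, value
        counter += 1

# Toy target: the top 12 bits of the hash must be zero (about 1 in 4096 hashes).
TOY_TARGET = 2**256 >> 12

counter, value = mine(b"example block data", TOY_TARGET)
print("attempts needed:", counter + 1)
print("winning hash:", hex(value))
```

Running it twice gives the same answer (the hash is repeatable), but there is no way to know which counter wins without trying them.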
The protocol adjusts the "difficulty" (the number of sixes that must be rolled in our analogy) based on how quickly the previous 2016 solutions were found. If it took more than 20160 minutes (average of 10 minutes per block) then the blocks took more than 10 minutes each, so the "puzzle" is too difficult. The protocol reduces the "difficulty" so that the blocks will be solved faster by setting a higher value target. In our analogy, this would be like reducing the number of sixes that must be rolled to "win". If it took less than 20160 minutes then the blocks took less than 10 minutes each, so the "puzzle" isn't difficult enough. The protocol increases the "difficulty" so that the blocks will be solved slower by setting a lower value target. In our analogy, this would be like increasing the number of sixes that must be rolled to "win".
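The adjustment can be sketched as simple proportional scaling of the target. This is a simplified sketch, not the real code: the actual protocol works in seconds and stores the target in a compact format. The limit of a factor of 4 per adjustment is how I understand the reference client to behave and is an assumption here, as is the function name retarget.

```python
BLOCKS_PER_PERIOD = 2016
TARGET_MINUTES = BLOCKS_PER_PERIOD * 10  # 20160 minutes for 2016 blocks

def retarget(old_target: int, actual_minutes: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took."""
    # Assumption: a single adjustment is limited to a factor of 4 either way.
    clamped = min(max(actual_minutes, TARGET_MINUTES // 4), TARGET_MINUTES * 4)
    # Took too long -> larger target (easier). Too fast -> smaller target (harder).
    return old_target * clamped // TARGET_MINUTES

old = 1_000_000
print(retarget(old, 20160))  # exactly on schedule: target unchanged
print(retarget(old, 40320))  # twice as slow: target doubles
print(retarget(old, 10080))  # twice as fast: target halves
```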
The solution to this is called a Hash, which basically means short hand for a very long answer – usually in 10s of decimal places.
No. A hash is a particular and well defined mathematical process. A hash is essentially a "digest" or "shortened representation" of the input.
An example of a really weak hash would be "Add up all the digits, then repeatedly add up all the digits of the results until only a single digit remains" (let's call it the ADEMUP hash).
So, I could calculate the ADEMUP of 172538:
1 + 7 + 2 + 5 + 3 + 8 = 26
2 + 6 = 8
The ADEMUP of 172538 is 8.
Everyone in the world would get the exact same answer if they calculated ADEMUP on 172538.
Meanwhile, if I change the input to 172537:
1 + 7 + 2 + 5 + 3 + 7 = 25
2 + 5 = 7
The ADEMUP of 172537 is 7
This is a hash, but it isn't a very good one for cryptographic purposes. The result is too predictable. If I reduce the input by 1, then I reduce the output by 1 (wrapping around from 1 to 9). If I wanted to generate an ADEMUP of 1, it is clear that all I have to do is subtract 7 from the 172538 input (172531: 1 + 7 + 2 + 5 + 3 + 1 = 19, 1 + 9 = 10, 1 + 0 = 1). A cryptographically useful hash will have a result that, while repeatable for any given input, is unpredictable until you've actually calculated it. Bitcoin uses the SHA256 hash, which is a well-defined set of binary calculations on a value. If you are familiar with how to calculate the following on the bits of a value: OR, AND, XOR (plus NOT, bit shifts and rotations, and 32-bit addition), then you can calculate SHA256.
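Here is a quick sketch of ADEMUP next to SHA-256 for a few neighbouring inputs, to show the difference in predictability. The function name ademup is of course made up, and feeding the decimal digits of the number to SHA-256 as text is just a choice I made for the example.

```python
import hashlib

def ademup(n: int) -> int:
    """Add up the digits repeatedly until a single digit remains."""
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n

for n in (172538, 172537, 172531):
    sha = hashlib.sha256(str(n).encode()).hexdigest()
    # Print only the first 16 hex characters of the SHA-256 result to keep it short.
    print(n, "ADEMUP:", ademup(n), "SHA-256:", sha[:16] + "...")
```

The ADEMUP column moves in step with the input, so you can work backwards to any output you want. The SHA-256 column changes completely from one input to the next, with no pattern you can exploit.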
Further security is provided because each subsequent transaction in the world, is built on the Hash of the previous block.
You are getting transactions and blocks mixed up. Transactions are built on the output of a previous transaction. Blocks are built on the hash of a previous block.
Modifying the data of the previous block (EG to say I didn’t spend 1,000 BTC so I can keep them, after buying something), even by one bit, completely changes the Hash used as a solution to the transaction, and the source of all subsequent data puzzles for all other transactions.
Again you are confusing transactions and blocks. A block is a "block of transactions". In other words, a group of one or more transactions. Since the transactions of a block are used as part of the input when solving the hash of the block, modifying a transaction changes the input for that block. Changing the input of the block completely changes the hash used as the solution to the block. Since the hash solution of each block is used as part of the input to the next block, completely changing the hash used as the solution to a block completely changes the input when solving the hash of the next block. Therefore, it becomes necessary to re-solve the block that includes the transaction that you are trying to modify AND all subsequent blocks until you get caught up with the number of blocks that the rest of the world sees as the "current blockchain".
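The knock-on effect can be seen in a toy chain where each block's hash covers the previous block's hash plus its own transactions. This sketch leaves out the proof of work and the real header layout entirely; block_hash, build_chain, and the transaction strings are all invented for illustration.

```python
import hashlib

def block_hash(prev_hash: str, transactions: list) -> str:
    """Hash a block's contents together with the previous block's hash."""
    data = prev_hash + "|" + "|".join(transactions)
    return hashlib.sha256(data.encode()).hexdigest()

def build_chain(blocks: list) -> list:
    """Return the hash of every block, each one built on the one before."""
    hashes, prev = [], "0" * 64  # placeholder "previous hash" for the first block
    for txs in blocks:
        prev = block_hash(prev, txs)
        hashes.append(prev)
    return hashes

original = [["alice->bob 5"], ["bob->carol 2", "dave->erin 1"], ["erin->alice 3"]]

# Tamper with one transaction in the second block only.
tampered = [list(txs) for txs in original]
tampered[1][0] = "bob->carol 0"

honest = build_chain(original)
forged = build_chain(tampered)
changed = [a != b for a, b in zip(honest, forged)]
print(changed)  # [False, True, True]
```

The first block is untouched, but the modified block and every block after it get a different hash. In the real network each of those blocks would also need its proof of work redone.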
So basically trying to reverse a payment requires you to re-write every transaction ever made worldwide. Hence mathematically impossible.
Again, getting mixed up between blocks and transactions.
Trying to reverse a confirmed payment (a payment that is in a block) requires you to re-write every block solved after the one you are trying to modify.
This is so expensive in terms of CPU power (taking millions of computers) that using just a fraction of that CPU power to legitimately solve Hash algorithms (mine) is far more profitable (and also legal)
This is generally true, but it depends on how large the transaction is that you are trying to adjust and how many blocks have been solved since that transaction was included in a block. This is why it is recommended that users wait for multiple confirmations on larger value transactions. Each "confirmation" just means "another block added to the blockchain after your transaction". The more blocks, the more expensive it is to have any chance of catching up with the global network.
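To give a feel for why more confirmations help, here is a sketch using the simple gambler's-ruin estimate from the Bitcoin whitepaper: an attacker with a share q of the total hash power who is z blocks behind ever catches up with probability (q/p)^z, where p = 1 - q, provided q is less than p. The function name is made up, and this is the simplified bound rather than the whitepaper's fuller calculation.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Chance an attacker with hash share q ever erases a z-block deficit."""
    p = 1.0 - q
    if q >= p:
        # With half or more of the hash power, the attacker eventually catches up.
        return 1.0
    return (q / p) ** z

# An attacker with 10% of the hash power, at various numbers of confirmations.
for z in (1, 2, 6):
    print(z, "confirmations:", catch_up_probability(0.10, z))
```

Each extra confirmation multiplies the attacker's chances by q/p, so the odds fall off very quickly as blocks pile up on top of your transaction.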
BitCoin uses the SHA-256 hash algorithm to generate verifiably "random" numbers in a way that requires a predictable amount of CPU effort.
Correct. The input is the data from the header of the block, and miners adjust a value in the header called a "nonce" (a value that exists only for the purposes of being adjusted so that the input to the SHA256 can be varied). Bitcoin uses SHA-256 to generate "random" numbers and sets a target value that the result of the SHA-256 hash must be less than to qualify as a "solution". By adjusting the target value, Bitcoin can adjust the number of hashes (on average) that must be calculated before a low enough value is likely to be found. Because it is extremely fast and easy for a computer to calculate a single SHA-256 value, it is easy for all peers on the entire Bitcoin network to quickly validate that a broadcast "solution" is correct.
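The asymmetry between finding and checking a solution can be sketched like this. It is a toy: the header bytes, the 4-byte nonce appended at the end, the single SHA-256 pass, and the names expected_hashes and is_valid are all assumptions for illustration rather than the real header format.

```python
import hashlib

def expected_hashes(target: int) -> float:
    """Average number of hashes needed before one falls below target."""
    return 2**256 / target

def is_valid(header: bytes, nonce: int, target: int) -> bool:
    """Check a claimed solution with a single hash."""
    digest = hashlib.sha256(header + nonce.to_bytes(4, "little")).digest()
    return int.from_bytes(digest, "big") < target

header = b"toy header"
target = 2**244  # about 1 hash in 4096 qualifies

# The miner has to search through nonces one at a time...
nonce = next(n for n in range(2**32) if is_valid(header, n, target))

# ...but anyone else can confirm the result with one hash.
print("expected hashes:", expected_hashes(target))
print("nonce found:", nonce, "valid:", is_valid(header, nonce, target))
```

Halving the target doubles the expected number of hashes for the miner, while the cost of checking stays at one hash.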