Thanks for this, which was a useful summary. Looking at the tree structure of transactions posted on blockchain.info, many of them branch into two and sometimes more branches from the original value (what is called the transaction input value I suppose), and often in a two branch tree one branch will be spent and another is unspent. I would suspect the unspent new child node is often the "change" value that is returned to the payer / sender of funds under a newly generated identifier / address, if I understand this process correctly. I'm not sure under what circumstances a parent node can split into more than three child nodes however.
A transaction is supplied value from one or more "inputs", which are just a list of previously unspent "outputs". A transaction assigns that value to one or more outputs. Each output is an address and a value assigned to that address. The rules of the protocol (that are enforced by every peer that relays the transaction as well as every miner that attempts to add the transaction to a block) require that the total value of all inputs in a transaction must be greater than or equal to the total value of all outputs in the transaction.
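To make the input/output rule concrete, here is a minimal Python sketch. The `Output` class and the `implied_fee` function are names made up for illustration, not part of any Bitcoin library, and values are assumed to be in satoshis. Whatever input value is not assigned to an output is what a miner can claim as the fee.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Output:
    """Simplified output: an address and a value (hypothetical model)."""
    address: str
    value: int  # assumed unit: satoshis


def implied_fee(inputs: Iterable[Output], outputs: Iterable[Output]) -> int:
    """Check that total inputs >= total outputs and return the difference."""
    total_in = sum(o.value for o in inputs)
    total_out = sum(o.value for o in outputs)
    if total_in < total_out:
        raise ValueError("invalid transaction: outputs exceed inputs")
    return total_in - total_out


# Two previously unspent outputs fund a payment plus change.
spent = [Output("addr_a", 50_000), Output("addr_a", 30_000)]
created = [Output("addr_b", 60_000), Output("addr_change", 19_000)]
print(implied_fee(spent, created))  # 1000
```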
That tree view you are looking at is an abstraction that tries to help you visualize the movement of value as transactions occur. It doesn't show you exactly what is going on in each transaction. If a transaction receives value from multiple inputs that were all associated with the same address, then the "tree view" totals them up and shows them as a single "node" even though they are actually separated in the blockchain. If a transaction receives value from multiple inputs that are associated with different addresses, then the "tree view" only displays the value provided to the transaction from the address you are viewing and ignores the value from the other addresses.
So while it is a useful tool for tracking the movement of value, don't rely on that tree view to understand what is actually happening within transactions or within the blockchain.
Note that I stated earlier that a transaction assigns value to one or more outputs. In Bitcoin-Qt you'll see, in the "Send coins" window, an "Add Recipient" button. This button allows you to add additional outputs to your transaction. Doing so would cause the tree view to split into more nodes, since the transaction would assign value to each of the recipients as well as sending the change back into the wallet at a new address.
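As a rough sketch of why extra recipients produce extra branches, the function below builds the output list for a payment to several recipients plus change. The function name, the `(address, value)` tuple layout, and the explicit `fee` parameter are assumptions for illustration only; the real wallet's coin selection and fee logic are more involved.

```python
from typing import List, Tuple

Payment = Tuple[str, int]  # (address, value in satoshis), hypothetical layout


def build_outputs(input_total: int, recipients: List[Payment],
                  fee: int, change_address: str) -> List[Payment]:
    """Return one output per recipient, plus a change output if value is left over."""
    spent = sum(value for _, value in recipients) + fee
    if spent > input_total:
        raise ValueError("inputs do not cover recipients plus fee")
    outputs = list(recipients)
    change = input_total - spent
    if change > 0:
        # Change goes back to the sender's wallet at a fresh address.
        outputs.append((change_address, change))
    return outputs


outs = build_outputs(100_000, [("alice", 30_000), ("bob", 20_000)],
                     fee=1_000, change_address="new_change_addr")
print(len(outs))  # 3 outputs, so the tree view would show three branches
```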
Even what I've said so far is an abstraction. While useful for understanding how the blockchain works, my description of transaction outputs as "a value assigned to an address" is an oversimplification. In reality, a transaction output is a value and a set of instructions (known as a script) describing what criteria must be met to include the output as an input in another transaction. In the typical case, these instructions set up a requirement that a signature with a private key associated with a particular address must be provided in order to include the output as an input to another transaction. As an abstraction, we say that the output assigned the value to that address.
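A small sketch of "a value and a script" as a data structure, assuming the typical case is the standard pay-to-pubkey-hash script. The `TxOutput` class and helper are illustrative names only, the script is shown as readable opcode names rather than the raw bytes stored in the blockchain, and the hash below is a placeholder, not a real address.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TxOutput:
    """An output is a value plus the conditions for spending it."""
    value: int                # satoshis
    script: Tuple[str, ...]   # locking script, as human-readable tokens


def pay_to_pubkey_hash(value: int, pubkey_hash_hex: str) -> TxOutput:
    """Build an output that requires a signature matching the given key hash."""
    script = ("OP_DUP", "OP_HASH160", pubkey_hash_hex,
              "OP_EQUALVERIFY", "OP_CHECKSIG")
    return TxOutput(value, script)


placeholder_hash = "00" * 20  # placeholder 20-byte hash, not a real address
out = pay_to_pubkey_hash(25_000, placeholder_hash)
print(" ".join(out.script))
```

The address never appears in the output itself; it is just a friendlier encoding of the hash embedded in the script.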
Today I started up the Bitcoin-Qt client manually and it took a good 3 or 4 minutes for the client to (appear to) search through the entire downloaded block chain (which is about 10.5 GB at present). I was surprised it needed to do this, as I currently have an empty wallet, which should mean a minimum number of used keys in the client's default keypool of 100 keys.
There are a variety of things the Bitcoin-Qt application does as it starts up to verify the integrity of the blockchain, and the wallet.dat file. Bitcoin-Qt makes sure that the data haven't been corrupted or manipulated while it wasn't running. Much of the start up time was spent on that verification.
It then proceeded to update the less than one day of blockchain the client was behind on. Because of this long wait, it would seem that the Qt client isn't quite a quick-starting application in terms of user friendliness (I'm aware of other newer thin clients that don't fetch the entire blockchain database).
It is clear that the priority for the developers of Bitcoin-Qt as a reference client is maintaining the security and reliability of the protocol and not user-friendliness. There appears to be an expectation that other wallets will fill the user-friendliness gap.
Based on your description of a BTC wallet structure and the transaction format of BTC, it would appear that the WALLET.DAT file would also increase in size gradually as more user transactions occur and the number of identifiers in the wallet increases. Do I understand correctly that a wallet.dat file will eventually accumulate all keys used since the start of transactions with that particular wallet.dat file?
I'm not sure I understand your question correctly, but unless you manipulate the contents of the wallet.dat file, it will accumulate all the keys that are created by the wallet program while connected to that wallet.dat file, regardless of whether you use those keys or not. If you click the "New Address" button in the "Receive coins" window 30 times, then the wallet.dat file will accumulate an additional 30 addresses. If you send another 10 transactions that each have change, then the wallet.dat will accumulate another 10 addresses.
I'm a bit confused on new key generation for the keypool after the current keypool's new key pairs are used up. When that occurs, does the QT client generate one new key at a time as needed when new keys are required / requested by the user, or will the client immediately generate a new batch of new keys in order to maintain the available number of new key pairs in the key pool at the amount specified by -keypool parameter?
Bitcoin-Qt does not fill the key pool in batches. It does it immediately as keys are removed from the pool. So if you click the "New Address" button, the wallet program pulls the next key from the key pool queue, and at the same time it generates a new random key and adds it to the key pool queue, so that the number of keys in the pool remains constant. Change addresses are handled the same way (pull the change key from the front of the queue, and simultaneously generate a new random key to add to the back of the queue).
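The queue behaviour described above can be modelled in a few lines of Python. This is only a toy model under stated assumptions: the `KeyPool` class is a made-up name, and random hex strings stand in for real key pairs. It is not how Bitcoin-Qt stores keys in wallet.dat.

```python
import secrets
from collections import deque


class KeyPool:
    """Toy FIFO key pool that is topped up every time a key is taken."""

    def __init__(self, size: int = 100) -> None:
        self._queue = deque(self._new_key() for _ in range(size))

    @staticmethod
    def _new_key() -> str:
        # Stand-in for real key generation: 32 random bytes as hex.
        return secrets.token_hex(32)

    def take(self) -> str:
        """Pull the oldest key from the front and add a fresh one to the back."""
        key = self._queue.popleft()
        self._queue.append(self._new_key())
        return key

    def __len__(self) -> int:
        return len(self._queue)


pool = KeyPool(size=100)
receive_key = pool.take()  # e.g. clicking "New Address"
change_key = pool.take()   # e.g. sending a transaction that has change
print(len(pool))           # still 100
```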
I understand that, depending on the keypool size, the frequency of key use, and the frequency of WALLET.DAT backups by the user, the risk of losing new keys when restoring from a recent backup will be affected. I guess that will depend on how recently the backup was made and how many new keys have been generated since the most recent WALLET.DAT backup. Based on this, it would seem that the risk of losing new keys when restoring WALLET.DAT from a recent backup could be reduced by increasing the keypool size. Would this be correct?
Correct. It can be reduced by either backing up more frequently, or by increasing the keypool size. I try to keep a general idea of how many addresses have been used (the sum of how many times the "New Address" button is clicked and the number of outgoing transactions) since my last backup and then create a new backup when the number of addresses used is about one fourth of the keypool size. Then I keep the 3 most recent backups. This allows me to recover all of my currently used keys from any of my three backups if one of them becomes damaged or otherwise unusable.
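The arithmetic behind that schedule can be sketched as follows. The function names are made up for illustration, and the model assumes keys are consumed strictly in queue order, so a backup still contains every key used since it was made as long as no more than one keypool's worth of keys has been used.

```python
from typing import List


def keys_missing_from_backup(used_since_backup: int, keypool_size: int) -> int:
    """Number of used keys that a backup of the given age would NOT contain."""
    return max(0, used_since_backup - keypool_size)


def worst_case_backup_ages(backups_kept: int, interval: int) -> List[int]:
    """Ages (in keys used) of each kept backup just before the next backup is made."""
    return [interval * i for i in range(1, backups_kept + 1)]


keypool_size = 100
interval = keypool_size // 4  # back up after about a quarter of the pool is used
ages = worst_case_backup_ages(backups_kept=3, interval=interval)
print(ages)  # [25, 50, 75]
print([keys_missing_from_backup(a, keypool_size) for a in ages])  # [0, 0, 0]
```

Even the oldest of the three backups is at most 75 keys behind, which is still inside the 100-key pool, so any one of them can recover every key in use.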