One last question: if blk0001.dat has all the block data, what does blkindex.dat hold? I would guess it's just headers, but there should only be about 12 MB worth of headers, and what I downloaded is 170 MB.
OK, I looked into this: blkindex.dat stores both the index to the blocks and the index to the transactions, plus a few other things (!!!). Here's the relevant code that records a block into the index:
bool CTxDB::WriteBlockIndex(const CDiskBlockIndex& blockindex)
{
    return Write(make_pair(string("blockindex"), blockindex.GetBlockHash()), blockindex);
}
In other words, the key is whatever pair<string,uint256> serializes to, and the data is everything in the IMPLEMENT_SERIALIZE block of CDiskBlockIndex (see main.h). That seems to be an index into the blk*.dat file, a block height, a link to the next block on the main chain, and a copy of the block header. For comparison, the code that stores other things into this *same* database is:
bool CTxDB::WriteHashBestChain(uint256 hashBestChain)
{
    return Write(string("hashBestChain"), hashBestChain);
}

bool CTxDB::WriteBestInvalidWork(CBigNum bnBestInvalidWork)
{
    return Write(string("bnBestInvalidWork"), bnBestInvalidWork);
}

bool CTxDB::AddTxIndex(const CTransaction& tx, const CDiskTxPos& pos, int nHeight)
{
    assert(!fClient);

    // Add to tx index
    uint256 hash = tx.GetHash();
    CTxIndex txindex(pos, tx.vout.size());
    return Write(make_pair(string("tx"), hash), txindex);
}
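To make the "one database, several kinds of record" point concrete, here is a rough toy model of the keying scheme. It is only a sketch, not the real CTxDB code: a std::map stands in for the on-disk database, Hash256 stands in for uint256, and SerializeTag, MakeKey, CountWithTag and BuildToyDb are names I made up for illustration. I am also assuming that a short string serializes as one length byte followed by its characters, and that a pair serializes as its first member followed by its second.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for uint256: 32 raw bytes.
using Hash256 = std::array<std::uint8_t, 32>;

// Serialize a short tag as one length byte followed by its characters.
// Assumption: tags are shorter than 253 bytes, so the length fits in one byte.
std::string SerializeTag(const std::string& tag) {
    assert(tag.size() < 253);
    std::string out;
    out.push_back(static_cast<char>(tag.size()));
    out += tag;
    return out;
}

// Key for a tagged record: the serialized tag, then the 32 hash bytes.
std::string MakeKey(const std::string& tag, const Hash256& hash) {
    std::string key = SerializeTag(tag);
    key.append(reinterpret_cast<const char*>(hash.data()), hash.size());
    return key;
}

// Count the records whose key starts with the given serialized tag.
// Keys sharing a prefix are contiguous in a sorted map, so one range scan is enough.
std::size_t CountWithTag(const std::map<std::string, std::string>& db,
                         const std::string& tag) {
    const std::string prefix = SerializeTag(tag);
    std::size_t n = 0;
    for (auto it = db.lower_bound(prefix);
         it != db.end() && it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        ++n;
    }
    return n;
}

// Toy database mixing block index records, a tx index record and a singleton,
// the way the Write() calls above all land in the same file.
std::map<std::string, std::string> BuildToyDb() {
    std::map<std::string, std::string> db;
    Hash256 h1{};
    h1[0] = 1;
    Hash256 h2{};
    h2[0] = 2;
    db[MakeKey("blockindex", h1)] = "block index record 1";
    db[MakeKey("blockindex", h2)] = "block index record 2";
    db[MakeKey("tx", h1)] = "tx index record 1";
    db[SerializeTag("hashBestChain")] = "best chain hash";
    return db;
}
```

The point of the tag prefix is that block index entries, tx index entries and singletons like hashBestChain can never collide, even when a block and a transaction are keyed by the same 32 bytes. With one tx index entry per transaction in the chain, the file would naturally be much bigger than the headers alone.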