With supernode, I'm now getting no thread toggling (1.5 days to scan history). Logging for the thread toggling got removed from headless too, and yet that mode clearly multithreads the scanning workload.
That was way too verbose anyway. After a bit of profiling, it seems thread toggling is pointless. Better off setting all processes to max thread count (as returned by std::thread::hardware_concurrency()). There is already a RAM ceiling coded in, so the different parts of the scan (reading data, parsing, serializing, writing) cannot get ahead of one another. In this case it's simpler to max out the thread count for each and every one of them and let the OS sort things out.
Each part waits on the next one through mutexes and condition variables, so all these threads are sleeping until they're allowed to work again. No harm done, and it squeezes out as much CPU time as possible. On the other hand, toggling is a pain to tune properly. With the current toggler, a mainnet fullnode scan takes me ~8m30. With all thread counts maxed out it takes just short of 5m.
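To illustrate the idea, here is a minimal sketch of a bounded hand-off between two parts of a scan. This is not the actual Armory code: BoundedQueue and sumWithMaxThreads are hypothetical names, and the queue capacity merely stands in for the RAM ceiling. Workers are spawned at std::thread::hardware_concurrency() and simply sleep on a condition variable whenever there is nothing for them to do.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded queue. The capacity plays the role of the RAM ceiling:
// a producer that gets ahead blocks instead of piling up data.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Sleeps while the queue is full.
    void push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        notEmpty_.notify_one();
    }

    // Sleeps while the queue is empty. Returns false once the queue has been
    // closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        notFull_.notify_one();
        return true;
    }

    // Signals that no more items will be pushed and wakes every waiting consumer.
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        notEmpty_.notify_all();
    }

private:
    std::size_t capacity_;
    std::queue<T> items_;
    bool closed_ = false;
    std::mutex mutex_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
};

// Hypothetical two-part run: one "reader" pushes integers, and a maxed-out
// pool of "parser" threads sums them. No toggling, the OS schedules everything.
long long sumWithMaxThreads(int itemCount, std::size_t capacity) {
    // hardware_concurrency() may return 0 when the value is unknown.
    const unsigned workerCount = std::max(1u, std::thread::hardware_concurrency());
    BoundedQueue<int> queue(capacity);
    std::mutex totalMutex;
    long long total = 0;

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < workerCount; ++i) {
        workers.emplace_back([&queue, &totalMutex, &total] {
            long long local = 0;
            int item = 0;
            while (queue.pop(item)) local += item;
            std::lock_guard<std::mutex> lock(totalMutex);
            total += local;
        });
    }

    for (int i = 1; i <= itemCount; ++i) queue.push(i);
    queue.close();
    for (auto& worker : workers) worker.join();
    return total;
}
```

Calling sumWithMaxThreads(1000, 8) keeps at most 8 items in flight no matter how many workers are running, which is the property that makes maxing out the thread count harmless. The capacity must be at least 1.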
I like the "resume initialising from blockfile xxx" behaviour, serious productivity boost when testing supernode.
A lot changed there. The DB is now write-ahead only. The previous version would modify earlier entries to mark spent TxOuts. Now it always writes ahead and keeps spentness in a dedicated DB. It speeds the process up a lot (fewer rewrites) and guarantees that the DB can overwrite data by starting at the top of the last properly committed batch, with no risk of corrupting the dataset.
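A rough in-memory sketch of that layout, under heavy assumptions: WriteAheadStore, TxOutRecord and every method name below are made up for illustration, and the real DB is on disk, not in vectors. The point is only that TxOut records are never modified after being written, spentness lives in its own log, and resuming means dropping everything above the last committed batch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record type: once appended, a TxOut entry is never touched again.
struct TxOutRecord {
    std::uint64_t id;
    std::int64_t value;
};

// Hypothetical append-only store with a separate spentness log.
class WriteAheadStore {
public:
    void appendTxOut(const TxOutRecord& record) { txouts_.push_back(record); }

    // Spending a TxOut appends to the spentness log instead of rewriting
    // the original TxOut entry.
    void appendSpend(std::uint64_t txoutId) { spends_.push_back(txoutId); }

    // Marks the current top of both logs as safely written.
    void commitBatch() {
        committedTxOuts_ = txouts_.size();
        committedSpends_ = spends_.size();
    }

    // After a crash, restart from the top of the last committed batch.
    // Anything above that mark is discarded and rewritten by the next scan.
    void resumeFromLastCommit() {
        txouts_.resize(committedTxOuts_);
        spends_.resize(committedSpends_);
    }

    bool isSpent(std::uint64_t txoutId) const {
        return std::find(spends_.begin(), spends_.end(), txoutId) != spends_.end();
    }

    std::size_t txOutCount() const { return txouts_.size(); }

private:
    std::vector<TxOutRecord> txouts_;
    std::vector<std::uint64_t> spends_;
    std::size_t committedTxOuts_ = 0;
    std::size_t committedSpends_ = 0;
};
```

Because committed entries are never edited in place, an interrupted batch can only leave garbage above the commit mark, and that region gets overwritten on resume.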
The one thing that did get corrupted a lot was the balance and transaction count for each address. There's a whole new section of code to handle that now, independently of scanning history. You need context to compute balance, since you are tallying the effect of each TxOut and TxIn for each address. The previous version of supernode tallied balance while scanning history. If the DB failed to resume in the exact same state as before it crashed, there was a decent chance at least one balance got corrupted, and that meant rescanning from scratch.
This version separates the 2 processes entirely. It first scans history, then computes balances. This simplifies and speeds up a lot of code. For one, keeping track of balance at all times creates a lot of rewrites: every time an address appears in a batch, you need to pull the existing balance from the DB, update it and write it back. Before splitting the 2 processes, 0.94 scanned supernode in 4h30. After splitting it, I scanned the history in 1h30 and built balance in 5min.
The good part, besides the speed boost, is robustness. Since the 2 are now separate, I added an option in supernode to run only the balance tallying part, for quickly fixing a damaged DB. It's called "Rescan SSH". Should fix the DB in 5~20min depending on the machine.
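For illustration, the balance pass can be thought of as a single fold over the already-scanned history. The sketch below is an assumption-laden toy, not the supernode code: HistoryEntry, AddressSummary and buildBalances are hypothetical names, addresses are plain strings, and the history is an in-memory vector. Since the function only reads history, it can be rerun at any time to rebuild damaged balances without rescanning.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical history entry produced by the scan pass.
// delta is positive for a TxOut paying the address, negative for a TxIn spending from it.
struct HistoryEntry {
    std::string address;
    std::uint64_t txId;
    std::int64_t delta;
};

// Hypothetical per-address summary produced by the balance pass.
struct AddressSummary {
    std::int64_t balance = 0;
    std::uint64_t txCount = 0;
};

// Second pass: tally balances and transaction counts from history alone.
// Nothing is read back from or rewritten to the summaries while scanning.
std::map<std::string, AddressSummary> buildBalances(
        const std::vector<HistoryEntry>& history) {
    std::map<std::string, AddressSummary> summaries;
    std::map<std::string, std::set<std::uint64_t>> seenTx;

    for (const auto& entry : history) {
        AddressSummary& summary = summaries[entry.address];
        summary.balance += entry.delta;
        // Count each transaction once per address, even if it touches
        // the address through several TxOuts or TxIns.
        if (seenTx[entry.address].insert(entry.txId).second) {
            ++summary.txCount;
        }
    }
    return summaries;
}
```

With this shape, each address gets written once at the end of the pass instead of once per batch it appears in, which is where the rewrite savings come from.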
PS: There is still room for some very significant optimizations, but I've concluded they are out of the scope of this release.