Bitcoin Forum
Author Topic: Bitcoin Core 0.12 full initial sync: results and analysis  (Read 5249 times)

mmgen-py (OP)
February 24, 2016, 11:16:04 AM
 #1

I just did a full initial sync of the blockchain on a low-end laptop running Bitcoin Core 0.12.  Here are the results:

Hardware:
   CPU:   Pentium 4-core N3540 @2.16GHz
   Disk:  Seagate 5400 RPM 16MB Cache SATA 6.0Gb/s
   RAM:   4GB

Software:
   Bitcoin Core v0.12 (bitcoind) / gcc 5.2.1 / Linux 4.2

Internet connection:
   3.6 Mbyte/sec

Sync summary:
   Block height: 399811
   Sync time: 35:32
   Avg CPU load: 47%
   Max RAM usage: 640 MB (16%)
   Max net usage: 2.4 Mbyte/sec (67%)

First, I should mention that 35.5 hours to download and verify the blockchain (currently ≃67 GB) on a budget laptop is a good result, exceeding my expectations.  The speedups afforded by libsecp256k1 undoubtedly played a large role here.

There are some concerning issues, however:  As the sync process proceeded, disk and CPU usage increased and progress became slower and slower.  This slowdown is reflected in the progress summary below:

   PROGRESS  TIME   TIME(%)
   --------  ----   -------
   0%:       00:00  0.0%
   5%:       01:34  4.4%
   10%:      03:04  8.6%
   15%:      05:07  14.4%
   20%:      05:55  16.7%
   25%:      06:36  18.6%
   30%:      07:17  20.5%
   35%:      07:59  22.5%
   40%:      08:53  25.0%
   45%:      09:45  27.4%
   50%:      10:42  30.1%
   55%:      11:46  33.1%
   60%:      12:59  36.5%
   65%:      14:19  40.3%
   70%:      15:40  44.1%
   75%:      17:30  49.2%
   80%:      19:43  55.5%
   85%:      21:54  61.6%
   90%:      25:03  70.5%
   95%:      29:16  82.4%
   100%:     35:32  100.0%

As can be seen from the table, it took longer to sync the last 25% of the blockchain than it did to sync the first 75%.  Even more dramatically, almost 30% of total sync time was spent on just the last 10% of the blockchain.  DISCLAIMER: Progress is estimated by bitcoind (in the function GuessVerificationProgress) by computing the number of remaining transactions in the blockchain using a complex formula.  Inaccuracies in these progress estimates might skew the data and lead to inaccurate conclusions.
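
For anyone who wants to check the arithmetic, here's a small Python snippet (purely illustrative, not one of my benchmark scripts) that converts the table above into time spent per 5% segment:

Code:
# Progress table from above: (percent complete, elapsed hh:mm)
data = [
    (0, "00:00"), (5, "01:34"), (10, "03:04"), (15, "05:07"), (20, "05:55"),
    (25, "06:36"), (30, "07:17"), (35, "07:59"), (40, "08:53"), (45, "09:45"),
    (50, "10:42"), (55, "11:46"), (60, "12:59"), (65, "14:19"), (70, "15:40"),
    (75, "17:30"), (80, "19:43"), (85, "21:54"), (90, "25:03"), (95, "29:16"),
    (100, "35:32"),
]

def to_min(hhmm):
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)

total = to_min(data[-1][1])
for (p0, t0), (p1, t1) in zip(data, data[1:]):
    dt = to_min(t1) - to_min(t0)
    print(f"{p0:3d}-{p1:3d}%: {dt:4d} min  ({100.0 * dt / total:.1f}% of total)")

# The first 75% took 17:30 (1050 min); the last 25% took 18:02 (1082 min).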

During the last 10% of the blockchain sync, both the WAIT column of "top" and the furious clicking of disk heads revealed that disk activity had become an obvious bottleneck, leaving the CPU idle for longer and longer periods.

When not idle, CPU usage (which was low at the beginning of the sync process) reached occasional peaks of around 90% in this final stage.  Most of the CPU time during these peaks was consumed by signature-verifying threads (i.e. libsecp256k1), so there's probably little room for further optimization here.

Network bandwidth peaked at around 67% capacity and was therefore never a limiting factor.

Another curious pattern that emerged during the later stage of the sync process was long delays (4-5 min) during which data was still coming in over the network at a high rate while both disk and CPU were relatively idle and little or no chain updating was occurring.  This can be seen in the following excerpt of block polling data:

   TIME      BLOCK   NEW BLOCKS
   ----      -----   ----------
   22:36:25  321977  148
   22:37:29  322112  135
   22:38:29  322212  100
   22:39:29  322345  133
   22:40:39  322467  122
   22:41:48  322599  132
   22:42:52  322740  141
   22:43:53  322885  145
   22:44:53  322979  94
   22:45:53  322979  0
   22:46:53  322979  0
   22:47:53  322979  0
   22:48:53  322979  0
   22:50:42  323125  146
   22:51:57  323324  199
   22:52:57  323467  143
   22:54:05  323631  164
   22:55:12  323791  160
   22:56:29  323996  205
   22:57:29  324045  49
   22:58:39  324085  40
   22:59:39  324107  22
   23:00:39  324112  5
   23:02:00  324193  81
   23:03:17  324342  149
   23:04:28  324491  149
   23:05:49  324671  180
   23:06:50  324835  164
   23:08:10  325037  202
   23:09:21  325185  148
   23:10:23  325287  102
   23:11:23  325376  89
   23:12:23  325402  26
   23:13:24  325402  0
   23:14:24  325402  0
   23:15:37  325439  37
   23:16:40  325600  161
   23:17:46  325727  127
   23:18:48  325904  177
   23:20:20  326114  210
   23:21:24  326236  122
   23:22:31  326349  113
   23:23:37  326441  92
   23:25:01  326502  61
   23:26:01  326503  1
   23:27:01  326503  0
   23:28:01  326503  0
   23:29:01  326504  1
   23:30:04  326598  94
   23:31:07  326756  158
   23:32:07  326917  161
   23:33:20  327087  170
   23:34:46  327259  172
   23:35:50  327378  119
   23:36:50  327506  128

The gaps became longer as syncing progressed (and slowed):

   TIME      BLOCK   NEW BLOCKS
   ----      -----   ----------
   09:46:49  397357  59
   09:48:00  397397  40
   09:49:00  397423  26
   09:50:01  397442  19
   09:51:19  397498  56
   09:52:28  397541  43
   09:57:52  397616  75
   09:58:54  397660  44
   09:59:54  397663  3
   10:00:54  397663  0
   10:01:54  397663  0
   10:02:54  397663  0
   10:03:54  397663  0
   10:04:54  397663  0
   10:06:50  397720  57
   10:08:27  397779  59
   10:09:27  397779  0
   10:10:27  397779  0
   10:12:06  397817  38
   10:13:08  397857  40
   10:14:30  397888  31
   10:16:49  397952  64
   10:17:57  397982  30
   10:19:16  398042  60
   10:20:43  398098  56
   10:22:03  398146  48
   10:23:38  398193  47
   ...
   10:42:44  398692  23
   10:44:13  398729  37
   10:45:19  398756  27
   10:46:24  398782  26
   10:47:33  398805  23
   10:48:45  398834  29
   10:49:47  398860  26
   10:51:38  398910  50
   10:52:38  398914  4
   10:53:38  398914  0
   10:54:38  398914  0
   10:55:38  398914  0
   10:56:47  398924  10
   10:57:52  398956  32
   10:59:17  398991  35
   11:00:18  399028  37
   11:01:38  399073  45
   11:02:45  399110  37
   11:03:45  399117  7
   11:04:46  399117  0
   11:05:46  399117  0
   11:06:46  399117  0
   11:07:46  399117  0
   11:08:47  399135  18
   11:09:47  399171  36
   11:11:02  399214  43
   11:13:28  399320  106
   11:15:31  399394  74
   11:16:54  399440  46
   11:18:43  399503  63
   11:20:41  399575  72

Since these "dead spots" can be attributed neither to the hardware nor to the network, they must be the result of something sub-optimal in the software.  At a quick glance, it appears that a 20-30% speedup could be achieved by fixing this problem, so there's work for the core devs to do here.
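
The block heights above were sampled roughly once a minute; a minimal poller along these lines (just a sketch using bitcoin-cli getblockcount, not necessarily the exact script used) produces output in the same format:

Code:
#!/usr/bin/env python3
# Sketch of a block-height poller: print time, block height, and new blocks
# since the last sample.  Assumes bitcoin-cli is in PATH and bitcoind is running.
import subprocess, time

INTERVAL = 60  # seconds between samples

def block_count():
    out = subprocess.check_output(["bitcoin-cli", "getblockcount"])
    return int(out)

prev = block_count()
print("TIME      BLOCK   NEW BLOCKS")
while True:
    time.sleep(INTERVAL)
    cur = block_count()
    print(f"{time.strftime('%H:%M:%S')}  {cur}  {cur - prev}")
    prev = cur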

Conclusion

Even if we discount possible inaccuracies in progress estimates, it's clear from these results that sync time is not a linear function of block chain size but an exponential one.  This has serious implications for Bitcoin's scalability: even with no block size increase, sync times will continue to grow exponentially as the years pass.  If consumer-grade hardware is unable to keep pace with this growth (which can't be ruled out, now that Moore's Law has all but officially been declared dead), it may become impossible for ordinary users to run full nodes at some point in the not-so-distant future.

watashi-kokoto
February 24, 2016, 12:03:31 PM
 #2

Quote from: mmgen-py
it's clear from these results that sync time is not a linear function of block chain size but an exponential one.

You are wrong. The initial checking is fast because signatures up until around 2011 are not checked at all, if I remember correctly.

mmgen-py (OP)
February 24, 2016, 12:10:33 PM
 #3

Quote from: watashi-kokoto
You are wrong. The initial checking is fast because signatures up until around 2011 are not checked at all, if I remember correctly.

The progress-estimating function takes this into account, separating "expensive" transactions (after the last checkpoint, with signature verification) from "cheap" ones (before the last checkpoint, no signature verification), with the former given 5x the weight of the latter.  Perhaps this factor needs to be adjusted.
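
Roughly, the logic looks like this (a simplified Python rendering of what GuessVerificationProgress does; the real function also estimates the remaining transaction count from a hard-coded transactions-per-day figure, which this sketch just takes as an input):

Code:
SIGCHECK_FACTOR = 5.0  # weight of a post-checkpoint ("expensive") transaction

def guess_progress(chain_tx, checkpoint_tx, est_total_tx):
    """chain_tx:      transactions processed so far
       checkpoint_tx: transactions up to the last checkpoint ("cheap")
       est_total_tx:  estimated total transactions in the chain right now"""
    if chain_tx <= checkpoint_tx:
        work_done = chain_tx
        work_left = (checkpoint_tx - chain_tx) + \
                    (est_total_tx - checkpoint_tx) * SIGCHECK_FACTOR
    else:
        work_done = checkpoint_tx + (chain_tx - checkpoint_tx) * SIGCHECK_FACTOR
        work_left = (est_total_tx - chain_tx) * SIGCHECK_FACTOR
    return work_done / (work_done + work_left)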

belcher
February 24, 2016, 01:00:03 PM
 #4

Did you use the -dbcache parameter? It can speed up sync by storing parts of the UTXO set in RAM instead of on hard disk. You have 4GB so there is space to keep at least some of the UTXO set.
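
For example (dbcache is specified in megabytes, and the 0.12 default is only 100, if I recall correctly):

Code:
# on the command line:
bitcoind -dbcache=2000

# or equivalently in bitcoin.conf:
dbcache=2000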

mmgen-py (OP)
February 24, 2016, 01:20:01 PM
 #5

Quote from: belcher
Did you use the -dbcache parameter? It can speed up sync by storing parts of the UTXO set in RAM instead of on hard disk. You have 4GB so there is space to keep at least some of the UTXO set.

Thanks, a guy on Reddit just suggested the same thing.  I may give it a try and post the results.
https://www.reddit.com/r/Bitcoin/comments/47c5sc/bitcoin_core_012_full_initial_sync_results_and/

capcher
February 24, 2016, 01:21:27 PM
 #6

Did you do the same benchmark for 0.11.2? Wondering if some of the behaviors could be normal for your specs.

jl777
February 24, 2016, 01:22:34 PM
 #7

The first 200K blocks are mostly empty and represent about 10% of the total size, maybe less.
The big increase in size happens around block 250k and then I think around 350k.

So it is normal for it to zoom through the first 50% to 75% and then slow down.

With a parallel sync you can sustain full network bandwidth, but that requires a lot of custom data files and multipass processing of the data. Not sure it would be practical using a DB-based system.

The pauses are most likely due to waiting for data for a specific block. Depending on how many blocks are requested from a peer, it could be a long pipeline, and since the order of the blocks being returned is at the discretion of the other peer (if you send a request with a long list), you might be getting a lot of blocks out of order. What I ended up doing is requesting one specific block at a time, so there is no long queue from a specific peer. Still, I see bandwidth drop to 50% saturation at times, as peers disconnect or are just slow sometimes, and until there are enough pending tasks to keep things saturated, it will be a bit inconsistent.
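
In pseudocode the scheduling idea is roughly this (just a sketch, not iguana's actual code; the peer objects and the request_block() call are stand-ins):

Code:
# Sketch: one outstanding block request per peer, so no single slow peer
# ends up owning a long queue of blocks.
from collections import deque

class Downloader:
    def __init__(self, peers, heights):
        self.idle = deque(peers)       # peers with no outstanding request
        self.todo = deque(heights)     # block heights still needed
        self.inflight = {}             # peer -> height currently requested

    def schedule(self):
        # Give every idle peer exactly one block to fetch.
        while self.idle and self.todo:
            peer = self.idle.popleft()
            height = self.todo.popleft()
            self.inflight[peer] = height
            peer.request_block(height)   # send a single-block request

    def on_block(self, peer, height, block):
        # Block arrived: free the peer and keep the pipeline full.
        self.inflight.pop(peer, None)
        self.idle.append(peer)
        self.schedule()

    def on_peer_lost(self, peer):
        # Re-queue whatever the disconnected peer was fetching.
        if peer in self.inflight:
            self.todo.appendleft(self.inflight.pop(peer))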

It is very tricky to keep things streaming.

The pruning of 0.12 is a great improvement.

James

P.S. Certainly, if things get stuck on an HDD-based DB write, with all that entails, that could stop things dead in their tracks if it triggers some data movement, which DBs are prone to do.

jl777
February 24, 2016, 01:24:08 PM
 #8

Quote from: capcher
Did you do the same benchmark for 0.11.2? Wondering if some of the behaviors could be normal for your specs.

I saw similar times from 0.11; I don't think there was anything to change performance a lot. Unless you consider not using up all of RAM and VM and thrashing all over the place without pruning. The pruning is fantastic. I can finally run bitcoind on my laptop again.

James

RealBitcoin
February 24, 2016, 01:26:32 PM
 #9

Free bump from me, this topic sounds very important. I hope the core devs see this and consider it, we can make bitcoin great!

jl777
February 24, 2016, 01:32:46 PM
 #10

Quote from: RealBitcoin
Free bump from me, this topic sounds very important. I hope the core devs see this and consider it, we can make bitcoin great!

iguana already does the parallel sync while fully saturating the bandwidth, as long as you have enough CPU to keep up.

On a beefy server with a 1 Gbps connection (I know, not typical) it downloads the entire 60 GB in about 12 minutes, but takes about 30 minutes to an hour to process it. That is with 8 cores.

For a more typical home connection of 20 Mbps, it takes 6 hours, and even with a 1.4 GHz CPU it seems to mostly keep up with things, but I haven't verified that the latest iteration does, and CPU speed will probably be an issue if using just one core. Using 4 cores, the 24 hours of CPU time is plenty.

James

P.S. iguana is an open source project in the "almost beta" stage

gmaxwell
February 24, 2016, 01:38:30 PM
 #11

Those pauses aren't pauses. You're going by log entries, which only note when verification has completed. Verification can only proceed in order (block _chain_, after all). When a peer is slow sending a block, you'll continue to get blocks from other peers in parallel but the rest of the pipeline will have to wait until the slower block finishes. Once it does it will catch up, and you'll get a burst of messages. If a peer truly and persistently stalls the process it will be kicked off. So long as your network is staying active, it's doing what it can. (There are times during IBD where network activity does fall down, which can be improved... but it sounds like your IO is slow enough that this isn't a factor for you.)
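
To illustrate the in-order constraint (just a sketch, not Core's actual code):

Code:
# Blocks arrive in any order, but only consecutive heights can be connected,
# so one slow block holds back everything above it; when it finally arrives,
# a whole burst gets connected (and logged) at once.
class InOrderConnector:
    def __init__(self, start_height):
        self.next_height = start_height
        self.buffer = {}               # height -> block received but not yet connected

    def on_block_received(self, height, block):
        self.buffer[height] = block
        connected = []
        while self.next_height in self.buffer:
            connected.append(self.buffer.pop(self.next_height))
            self.next_height += 1
        return connected               # the blocks that get verified/logged now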

35 hours is a horribly long time, though I'm sure much better than prior versions.

The primary cause is that the disk in your system is very slow (a 5400 RPM laptop drive is tens of times slower than a modern SSD).  If you were to run with a larger dbcache setting you would likely see vastly improved performance -- the defaults are sized to keep the software runnable on systems with fairly low memory. (It can't really be set automatically, because, unless you had much more RAM than you do, Bitcoin would end up hogging most of your memory when you wanted to use your computer for other things -- the fact that you have more memory doesn't mean it's okay for Bitcoin to use it.)

By comparison, I normally measure under 4 hours on a fast desktop with a home internet connection (with dbcache cranked up).

mmgen-py (OP)
February 24, 2016, 01:42:53 PM
 #12

Quote from: jl777
The first 200K blocks are mostly empty and represent about 10% of the total size, maybe less.
The big increase in size happens around block 250k and then I think around 350k.

So it is normal for it to zoom through the first 50% to 75% and then slow down.

Bitcoind's progress estimation is transaction-based, not block-based.

Quote from: jl777
The pauses are most likely due to waiting for data for a specific block. Depending on how many blocks are requested from a peer, it could be a long pipeline, and since the order of the blocks being returned is at the discretion of the other peer

Thanks, that probably explains the pauses.  And thanks for the other points in your informative post.

jl777
February 24, 2016, 01:52:54 PM
 #13

Quote from: mmgen-py
Bitcoind's progress estimation is transaction-based, not block-based.
Most of the early blocks have very few transactions. Just not much crypto going on in 2010.

And the estimate for the total number of transactions is probably updated based on the blocks it has seen so far, so it won't know that the later blocks are full of transactions until it gets there.

I take the above back: I am at block 350K, which is 85%+ done by block height, but the progress bar is nowhere close to that, so it must be estimating the number of tx somehow.

mmgen-py (OP)
February 24, 2016, 02:04:18 PM
 #14

Quote from: gmaxwell
Those pauses aren't pauses. You're going by log entries, which only note when verification has completed.

Of course, that explains it.  Otherwise there would be out-of-order entries in the log, which there never are.  It doesn't explain the lack of CPU activity during the pauses, though.  Sigs could be checked in the order received, couldn't they?

Quote from: gmaxwell
If you were to run with a larger dbcache setting you would likely see vastly improved performance

I may try that -- and post the results for comparison purposes.

Quote from: gmaxwell
It can't really be set automatically, because, unless you had much more RAM than you do, Bitcoin would end up hogging most of your memory when you wanted to use your computer for other things

A dynamic dbcache setting might be OK if the limit were set conservatively. A default setting should never be seriously sub-optimal, in my opinion.

jl777
February 24, 2016, 02:09:49 PM
 #15

Quote from: mmgen-py
Of course, that explains it.  Otherwise there would be out-of-order entries in the log, which there never are.  It doesn't explain the lack of CPU activity during the pauses, though.  Sigs could be checked in the order received, couldn't they?
While you can verify sigs a lot of the time for the recent blocks, there are many times when you just don't know what tx the txid in a vin refers to, since you haven't seen it yet during a parallel sync.

So you could verify a significant subset as it comes in during the first pass, but until the blockchain is fully there you can't verify all the sigs.

As the highest block is advanced, whatever sigs are not yet verified could be verified. But with checkpointing, there isn't any sig verification needed up to the last checkpoint anyway.
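
Conceptually, the deferral looks something like this (a sketch only; the txin fields and the verify_sig callback are placeholders, not iguana's actual data structures):

Code:
# Sketch: verify an input's signature only once the output it spends has
# been seen; otherwise park the input until the referenced txid shows up.
from collections import defaultdict

known_outputs = {}              # (txid, vout) -> output (script, amount, ...)
waiting = defaultdict(list)     # missing txid -> inputs waiting on it

def on_transaction(txid, inputs, outputs, verify_sig):
    # Register this tx's outputs, then retry anything that was waiting on them.
    for vout, out in enumerate(outputs):
        known_outputs[(txid, vout)] = out
    for txin in waiting.pop(txid, []):
        verify_sig(txin, known_outputs[(txin.prev_txid, txin.prev_vout)])

    # Verify what we can now, defer the rest.
    for txin in inputs:
        prev = known_outputs.get((txin.prev_txid, txin.prev_vout))
        if prev is not None:
            verify_sig(txin, prev)
        else:
            waiting[txin.prev_txid].append(txin)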

LiteCoinGuy
February 24, 2016, 03:55:54 PM
 #16

35.5 hours to download and verify the blockchain now? Very good results.

mmgen-py (OP)
February 24, 2016, 04:09:57 PM
 #17

Quote from: jl777
While you can verify sigs a lot of the time for the recent blocks, there are many times when you just don't know what tx the txid in a vin refers to, since you haven't seen it yet during a parallel sync.

Which places rather tight constraints on parallelism.

jl777
February 24, 2016, 04:29:00 PM
 #18

Quote from: mmgen-py
Which places rather tight constraints on parallelism.

I have the blockchain validating serially all throughout the parallel sync, so all the sig validations can take 30 minutes using N cores without affecting the final completion time. Also, I think I only need to validate from the latest checkpoint, so that significantly reduces the sheer number of sigs that need to be verified.

As long as users can see balances, even if the validation isn't complete yet, you just need to prevent any spends until all the inputs for the tx are validated. My guess is that for most people, most of the time, it can be done so the time for sig validation is not noticeable. Also, for a user tx made before the chain is fully validated, I could select the oldest inputs and specifically validate what's needed to make sure those are valid.

James

mmgen-py (OP)
February 24, 2016, 04:39:00 PM
 #19

Quote from: jl777
Also, I think I only need to validate from the latest checkpoint, so that significantly reduces the sheer number of sigs that need to be verified.

Actually, I was running rc5 as it turns out, which had the checkpoint way back at block #295000.  That's been updated in the final release, which of course will make things go much faster.

EDIT: Not so, as it turns out.  Checkpoints are no longer being updated (since v0.10), so the last one remains at #295000.

jl777
February 24, 2016, 04:43:56 PM
 #20

Quote from: mmgen-py
Actually, I was running rc5 as it turns out, which had the checkpoint way back at block #295000.  That's been updated in the final release, which of course will make things go much faster.
The majority of tx are since block 295000.
It doesn't take too long to process that far, but it starts really slowing down around block 350,000. I think that is when p2sh started being used a lot more and tx just got bigger and more complicated.
