Topic: bitcoind JSON RPC performance declining over the blockchain?
ivoras (OP) - Newbie (Activity: 5, Merit: 0)
February 20, 2015, 09:11:36 AM  #1

Hi,

I'm pumping data about all transactions from the blockchain into a database for later analysis, and I've concluded that, as I progress through the blockchain, I spend more and more time waiting for bitcoind's responses for roughly the same amount of processed data. Here's a chart of database INSERTs for all transactions, their inputs, and their outputs over the last couple of days. At the start I was processing hundreds of blocks per second (understandable, since early blocks were simple or empty), but around block 200,000 performance dropped sharply, to the point where I'm now doing 1-2 blocks per second.

https://i.imgur.com/NvF2Agz.png

I see that bitcoind's CPU usage is around 150%, approximately the same as when I started, even though I run 4 RPC client threads. Increasing the parallelism doesn't help, and I'm nowhere near IO-bound (SSDs, plenty of RAM). It simply looks like bitcoind now takes longer to process the same aggregate amount of transactions.

From Googling around, I see that "bitcoind is hard to scale" is a common conclusion, but I didn't find anything about this sharp drop in performance. Is this common knowledge? Did something change around block 200,000?

I'm using getrawtransaction to inspect individual transactions from each block - is this the optimal way?
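
For reference, a minimal sketch of what such a loop can look like with batched JSON-RPC - one HTTP round trip per block instead of one per transaction. This is an illustration, not the OP's actual script; the URL, credentials, and helper names are placeholders, and it assumes the node accepts JSON-RPC batch arrays (Bitcoin Core's RPC server does).

Code:
# Batch all of a block's getrawtransaction calls into one request.
import json
import requests  # third-party: pip install requests

RPC_URL = "http://127.0.0.1:8332"
RPC_AUTH = ("rpcuser", "rpcpassword")  # placeholder credentials

def rpc(method, *params):
    payload = {"jsonrpc": "1.0", "id": 0, "method": method, "params": list(params)}
    r = requests.post(RPC_URL, auth=RPC_AUTH, data=json.dumps(payload))
    r.raise_for_status()
    return r.json()["result"]

def rpc_batch(calls):
    # bitcoind accepts a JSON array of request objects and returns an
    # array of responses; sort by id in case they come back reordered.
    payload = [{"jsonrpc": "1.0", "id": i, "method": m, "params": p}
               for i, (m, p) in enumerate(calls)]
    r = requests.post(RPC_URL, auth=RPC_AUTH, data=json.dumps(payload))
    r.raise_for_status()
    return [resp["result"] for resp in sorted(r.json(), key=lambda x: x["id"])]

def transactions_in_block(height):
    block = rpc("getblock", rpc("getblockhash", height))
    # verbose=1 makes getrawtransaction return the decoded transaction.
    return rpc_batch([("getrawtransaction", [txid, 1]) for txid in block["tx"]])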
gmaxwell (Moderator) - Legendary (Activity: 4158, Merit: 8382)
February 20, 2015, 09:24:28 AM  #2

Why are you assuming the performance has anything to do with bitcoind? JSON is just inherently a bit slow, but its performance shouldn't depend on where in the chain the transactions are located: it should put out a more or less constant number of transactions per second. Your database, on the other hand, will slow down sharply as you insert more records into it. A quick test here shows bitcoind reading the same number of transactions per second at height 150k and at height 300k.

General-purpose databases tend to perform very poorly for Bitcoin applications, especially if you're carrying specialized indexes, because the Bitcoin data implies a large number of really tiny records for most ways of splitting it out.
laurentmt - Sr. Member (Activity: 384, Merit: 258)
February 20, 2015, 11:53:32 AM (last edit: 12:18:28 PM)  #3

The slowdown around block 200k is normal if you measure your performance in blocks. The main change is an increased number of transactions per block starting around the end of 2012.

With a single thread you should be able to process hundreds or thousands of transactions per minute without any problem, but don't expect to process hundreds of blocks/s anymore, since each block now has (on average) hundreds of transactions.

My best guess is that your bottleneck (if there is one) is on the write side. Minimizing the number of round trips between your batch job and the database might be part of the solution: bulk insertion of several txs at a time should give better performance (see the sketch below), and good indexes should help too.
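
As an illustration of the bulk-insert idea - the table, columns, and driver here are assumptions, not the OP's actual schema:

Code:
# One executemany() and one commit per batch of rows, instead of one
# round trip and one commit per row.
import MySQLdb  # the MySQLdb (MySQL-python / mysqlclient) driver

conn = MySQLdb.connect(db="chain", user="user", passwd="secret")
cur = conn.cursor()

def insert_outputs(rows):
    # rows: list of (txid, output_index, value_in_satoshi) tuples
    cur.executemany(
        "INSERT INTO outputs (txid, n, value) VALUES (%s, %s, %s)",
        rows,
    )
    conn.commit()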

But to be honest, I'm not even sure you have anything to optimize if your script can process hundreds of transactions per second. Currently, the Bitcoin network's capacity is around 3.5 txs/s...

I hope it helps.


ivoras (OP) - Newbie (Activity: 5, Merit: 0)
February 23, 2015, 12:43:38 PM  #4

Quote from: laurentmt
My best guess is that your bottleneck (if there is one) is on the write side. Minimizing the number of round trips between your batch job and the database might be part of the solution: bulk insertion of several txs at a time should give better performance, and good indexes should help too.

Nah, the write side is covered extremely well: the database is on SSDs, and MySQL uses around 20% of a single core in aggregate (that includes iowait, which itself is around 1%).

The slowness really is on the bitcoind side. I can spawn an arbitrary number of forked, independent client processes (so no thread-locking issues between them) issuing JSON RPC queries to bitcoind, and beyond about 3 clients there are no significant performance improvements (i.e. the scaling is massively sub-linear), with the clients spending their time waiting for bitcoind to respond.

OTOH, bitcoind eats around 150% CPU time and cannot make use of the remaining free cores in the machine.

ivoras (OP) - Newbie (Activity: 5, Merit: 0)
February 23, 2015, 01:06:49 PM  #5

Quote from: laurentmt
But to be honest, I'm not even sure you have anything to optimize if your script can process hundreds of transactions per second. Currently, the Bitcoin network's capacity is around 3.5 txs/s...

This is a bulk import; real-time performance is well within the capabilities of the code and the system.

Right now my script processes 1-2 blocks/s, which is usually more than 1,000 transactions/s, and it works in parallel processes, so I don't have anything left to optimize except bitcoind :D
laurentmt - Sr. Member (Activity: 384, Merit: 258)
February 23, 2015, 04:42:27 PM (last edit: 05:10:03 PM)  #6

Quote from: laurentmt
But to be honest, I'm not even sure you have anything to optimize if your script can process hundreds of transactions per second. Currently, the Bitcoin network's capacity is around 3.5 txs/s...
Quote from: ivoras
Now, my script processes 1-2 blocks/s, which is usually more than 1,000 transactions/s, and it works in parallel processes, so I don't have anything left to optimize except bitcoind :D
Yep. IMHO, your performance is pretty good :)
I guess the slowdown at 200k blocks has been experienced by everyone trying to scan the blockchain. In previous releases of Bitcoin Core it was also noticeable during the first synchronization of the chain.

In your case, the last improvement I can imagine is bypassing the RPC API and reading the blk*.dat files directly (example).

EDIT: another idea: did you try increasing the number of RPC threads? (-rpcthreads=N)
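
For a picture of what that direct read involves, here is a minimal sketch of iterating the raw blocks in a blkNNNNN.dat file. Note two caveats any real importer has to handle: blocks in these files are not guaranteed to be in height order, and they can include orphaned blocks.

Code:
# Each on-disk record is: 4 magic bytes (f9 be b4 d9 on mainnet),
# a 4-byte little-endian block length, then the serialized block.
# Decoding the block itself is a separate, larger job.
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")

def iter_raw_blocks(path):
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                return  # end of file
            magic = header[:4]
            (length,) = struct.unpack("<I", header[4:])
            if magic != MAINNET_MAGIC:
                return  # zero padding at the tail of the file
            yield f.read(length)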
ivoras (OP) - Newbie (Activity: 5, Merit: 0)
February 23, 2015, 07:01:11 PM  #7

Quote from: ivoras
Now, my script processes 1-2 blocks/s, which is usually more than 1,000 transactions/s, and it works in parallel processes, so I don't have anything left to optimize except bitcoind :D
Quote from: laurentmt
Yep. IMHO, your performance is pretty good :)

Thanks! Each transaction is decomposed into one record per input and one per output, so it fans out to a lot of DB transactions/s.

I'm confident that the DB side will never be the problem in my setup.

Quote from: laurentmt
In your case, the last improvement I can imagine is bypassing the RPC API and reading the blk*.dat files directly (example).

EDIT: another idea: did you try increasing the number of RPC threads? (-rpcthreads=N)

Thanks for the pointer to the direct .dat reading idea - it may very well be required!

Yeah, I always connect with fewer clients than rpcthreads - that's why I think bitcoind is the issue here. Let me try to explain better with an example:

bitcoind's CPU usage plateaus at around 150% once there are about 3 clients. If I add more RPC client processes, bitcoind's CPU usage stays the same, the *total* number of RPC calls per second stays the same, and the CPU load of each individual client goes down (since more clients are now sharing the same number of RPC calls per second). Currently each of my 4 client processes uses around 20% CPU time, and they mostly sit waiting for responses from bitcoind.
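
The fan-out pattern being described looks roughly like this - a minimal sketch reusing the hypothetical transactions_in_block() helper from the earlier sketch, with independent worker processes so no locks are shared between clients:

Code:
from multiprocessing import Pool

def process_block(height):
    # Each worker issues its own HTTP requests to bitcoind.
    txs = transactions_in_block(height)  # batched RPC, earlier sketch
    return height, len(txs)

if __name__ == "__main__":
    pool = Pool(processes=4)  # keep at or below the node's -rpcthreads
    for height, n_txs in pool.imap_unordered(process_block, range(200000, 201000)):
        pass  # insert this block's rows into the database here
    pool.close()
    pool.join()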

As I've written before, the DB load is insignificant (mysqld is currently less than 5%, including iowait).
hhanh00 - Sr. Member (Activity: 467, Merit: 266)
February 24, 2015, 01:15:21 AM  #8

Your results are very surprising. It's hard to say where the problem is without knowing more about your app:
- what table structure
- what indexes and primary keys
- what isolation level and transaction boundaries
- what RPCs
- what queries

nicehashdev - Sr. Member (Activity: 280, Merit: 250)
February 24, 2015, 06:56:13 PM  #9

I can confirm that bitcoin-qt/bitcoind is not something you want to use in any larger project. Once it holds several thousand addresses and has been running for a few months, the official client slows down to the point where it takes a few minutes to process every block. In the meantime it blocks all RPC calls, including ones that shouldn't have to be blocked (such as getdiff or getting a new address).

This slowdown has nothing to do with its internal accounting system (that part works fine, but because of the long-blocked RPC calls it is much less useful than your own accounting system). Why the official client slows down so considerably - doing two-threaded processing for several minutes when a new block arrives - is something the devs can explain and maybe even fix in future versions. It's a shame; I would personally trust this client the most, but because of these slowdowns it is useless for us and we will soon be switching over to BitGo.
Cryddit - Legendary (Activity: 924, Merit: 1122)
February 24, 2015, 08:46:04 PM  #10

bitcoind or bitcoin-qt will have lock-contention fights over the database, and yeah, you probably can't get more than about 3 clients being productive at a time (how long the locks are held depends on the machine's I/O capabilities).

Part of the problem is that a lock for reading shouldn't prevent another lock for reading - but the clients assume they are locking for both reading and writing. If they took read locks instead, and no client held a write lock at the same time, they ought to get much better performance.



gmaxwell (Moderator) - Legendary (Activity: 4158, Merit: 8382)
February 24, 2015, 10:53:42 PM (last edit: 11:12:56 PM)  #11

Quote from: nicehashdev
it blocks all RPC calls, including ones that wouldn't have to be blocked (such as getdiff or get new address)
The random RPC blocking you're describing sounds somewhat like what happens when you have an RPC client that leaks keepalives. With the default settings, Bitcoin Core has only 4 threads for handling RPC queries (additional threads consume a fair amount of memory, and people would like to run this stuff on 500 MB VPSes), and once it has that many keepalive connections open, further ones will block. Many RPC client libraries leak connections, leaving old keepalives open for a long time with no intent to use them again. If this is your issue, you can forestall it (at some memory cost) by increasing the RPC thread count, or avoid it by disabling keepalive (-rpckeepalive=0)... or, better, fix your client to not leak keepalives.
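
For reference, those two mitigations as bitcoin.conf lines (the values are illustrative, not recommendations):

Code:
# More concurrent RPC handler threads; each one costs extra memory.
rpcthreads=16
# Close each RPC connection after its request, so leaked keepalives
# cannot pin all of the handler threads.
rpckeepalive=0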

If that isn't your issue, you could try actually reporting your problem (especially with a reproduction), since no one has done so. In particular, I keep a testing wallet running with tens of thousands of addresses and many thousands of transactions, and I haven't observed what you're describing; so I cannot tell whether your comment on block updating is a red herring or not (accepting a new block really must block most RPCs, but it should also take only a second). I'm not sure why people go on about software being "unsuitable" when they can't even be bothered to report the issues they've had. It certainly isn't going to magically adapt to your use-case if you snark in threads buried on forums instead of reporting issues (much less contributing improvements).

With respect to the thread here: I benchmarked a node processing over 5500 tx/s in aggregate during IBD, including signature validation and all. Considering that the network as a whole is limited to <10, I think this is pretty good. If there is an actual performance issue, you'll need to set up a reproduction in isolation from your application in order to figure out where it is. Just saying it's slow for you is very much not helpful; we need to know what RPCs, called in what pattern, are taking what amount of time - and ideally have a script to reproduce it, otherwise there is just an endless loop of "seems fast to me".
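
A minimal shape for such a reproduction, timing one call pattern in isolation (this reuses the hypothetical rpc() helper from the first sketch in this thread):

Code:
# Report tx/s for the getblockhash/getblock/getrawtransaction pattern
# over two height ranges, so "slow" becomes a comparable number.
import time

def bench(heights):
    start, n_tx = time.time(), 0
    for h in heights:
        block = rpc("getblock", rpc("getblockhash", h))
        for txid in block["tx"]:
            rpc("getrawtransaction", txid, 1)
            n_tx += 1
    dt = time.time() - start
    print("%d tx in %.1fs (%.1f tx/s)" % (n_tx, dt, n_tx / dt))

bench(range(150000, 150100))  # the height ranges compared above
bench(range(300000, 300100))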