1  Bitcoin / Development & Technical Discussion / parsing leveldb txindex on: May 25, 2015, 10:41:14 PM
I'm trying to parse the LevelDB txindex, i.e. fetch raw transaction data given a transaction hash.

I'm reading the key-value pair from LevelDB with key 't' + hash_bytes (the txid in internal, i.e. reversed, byte order); e.g. for transaction 444b7ecbda319e184da1a3d68968e6e0ca9346ddcf7afd0e2b887a7949128805 the value stored under that key is

80 58 8f c4 c8 66 80 9b 24

Now, if I'm reading the bitcoin source correctly, this should be three varints:

  • nFile
  • nPos
  • nTxOffset

If I'm decoding them correctly, the 3 values are nFile=216, nPos=34694374, nTxOffset=20004 -- which looks right. However, the data at offset 34694374 + 20004 in blk00216.dat is unexpected:

version: 4293214589
inCount: 93952409796607

So it looks like either I'm decoding the varints the wrong way, or I'm calculating the file offset wrong?
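(Sanity check, decoding the first varint by hand: 0x80 has the continuation bit set and a zero payload, so n becomes ((0 << 7) | 0x00) + 1 = 1; 0x58 has no continuation bit, so the result is (1 << 7) | 0x58 = 216. That matches nFile above, so at least the first field decodes consistently.)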

The varint read+decode code is mostly a direct port of ReadVarInt from the C++ source:

Code:
  def read1(s):
      # read a single byte from a binary stream, as an int (Python 3)
      return s.read(1)[0]

  def varint(s):
      # Bitcoin Core's VarInt: base-128, MSB is the continuation bit,
      # and each continuation byte adds 1 (making encodings unique)
      n = 0
      while True:
          ch = read1(s)
          n = (n << 7) | (ch & 0x7f)
          if ch & 0x80:
              n += 1
          else:
              return n
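EDIT: Re-reading GetTransaction() in main.cpp, I see it deserializes an 80-byte CBlockHeader at nPos and only then seeks nTxOffset further, so the transaction should start at nPos + 80 + nTxOffset, not nPos + nTxOffset. Here's a sketch of the whole lookup with that fix (plyvel as the LevelDB binding and the locate_tx name are my own choices; the txindex lives in the blocks/index LevelDB):

Code:
  import plyvel            # assumption: plyvel as the LevelDB binding
  from io import BytesIO

  def locate_tx(index_dir, blocks_dir, txid_hex):
      # txindex key: 't' + the txid in internal (reversed) byte order
      key = b't' + bytes.fromhex(txid_hex)[::-1]
      db = plyvel.DB(index_dir)
      s = BytesIO(db.get(key))
      n_file, n_pos, n_tx_offset = varint(s), varint(s), varint(s)
      db.close()
      # GetTransaction() opens the file at nPos, deserializes the 80-byte
      # CBlockHeader, *then* seeks nTxOffset further from there
      with open("%s/blk%05d.dat" % (blocks_dir, n_file), "rb") as f:
          f.seek(n_pos + 80 + n_tx_offset)
          return f.read(2048)   # the raw tx serialization starts here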
2  Bitcoin / Development & Technical Discussion / Re: bitcoind JSON RPC performance declining over the blockchain? on: February 23, 2015, 07:01:11 PM
But to be honest, I'm not even sure you have anything to optimize if your script can process hundreds of transactions per second.
Currently, Bitcoin network capacity is around 3.5 tx/s...
Now, my script processes 1-2 blocks/s, which is usually more than 1000 transactions/s, and it works in parallel processes, so I don't have anything left to optimize except bitcoind :D
Yep. IMHO, your performance is pretty good :)

Thanks! Each transaction is decomposed, and one record is created for each input and each output, so it fans out to a lot of DB transactions per second.

I'm confident that the db side will never be the problem in my setup.
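To illustrate the fan-out, here's a minimal sketch of the per-transaction inserts (the table names and the DB-API cursor are hypothetical; tx is the dict that getrawtransaction returns with verbose=1):

Code:
  def store_tx(cur, tx):
      # one row for the transaction itself...
      cur.execute("INSERT INTO txs (txid) VALUES (%s)", (tx["txid"],))
      # ...one row per input (prev fields are NULL for coinbase inputs)...
      cur.executemany(
          "INSERT INTO inputs (txid, prev_txid, prev_vout) VALUES (%s, %s, %s)",
          [(tx["txid"], i.get("txid"), i.get("vout")) for i in tx["vin"]])
      # ...and one row per output
      cur.executemany(
          "INSERT INTO outputs (txid, n, value) VALUES (%s, %s, %s)",
          [(tx["txid"], o["n"], o["value"]) for o in tx["vout"]])

The executemany calls at least batch rows per statement, so per-row round-trips to MySQL are already minimized.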

In your case, the last improvement I could imagine is bypassing the RPC API: basically, you read directly from the .dat files (example).

EDIT: another idea: did you try increasing the number of RPC threads? (-rpcthreads=N)

Thanks for the pointer to the direct .dat reading idea; it may very well be required!

Yeah, I always connect with fewer clients than rpcthreads, which is why I think bitcoind is the issue here. Let me try to explain better with an example:

bitcoind's CPU usage plateaus at around 150% once there are about 3 clients. If I add more RPC client processes, bitcoind's CPU usage stays the same, the *total* number of RPC calls per second stays the same, and the CPU load of each individual client goes down (since more clients are now splitting the same number of RPC calls per second). Currently, each of my 4 client processes uses around 20% CPU, mostly waiting for responses from bitcoind.

As I've written before, the DB load is insignificant (mysqld is currently less than 5%, including iowait).
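For reference, the relevant bitcoin.conf settings from this thread (values are illustrative only, not a recommendation):

Code:
  # bitcoin.conf -- illustrative values
  server=1          # accept JSON-RPC connections
  txindex=1         # required for getrawtransaction on arbitrary txs
  rpcthreads=16     # default is 4; keep above the number of clients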
3  Bitcoin / Development & Technical Discussion / Re: bitcoind JSON RPC performance declining over the blockchain? on: February 23, 2015, 01:06:49 PM
But to be honest, I'm not even sure you have anything to optimize if your script can process hundreds of transactions per second.
Currently, Bitcoin network capacity is around 3.5 tx/s...

This is a bulk import; real-time performance is well within the capabilities of the code and the system.

Now, my script processes 1-2 blocks/s, which is usually more than 1000 transactions/s, and it works in parallel processes, so I don't have anything left to optimize except bitcoind :D
4  Bitcoin / Development & Technical Discussion / Re: bitcoind JSON RPC performance declining over the blockchain? on: February 23, 2015, 12:43:38 PM
My best guess is that your bottleneck (if there is one) is on the write side.
Minimizing the number of round-trips between your batch and the database might be part of the solution.
Bulk insertion of several txs should give better performance. Good indexes should help too.

Nah, the write side is covered extremely well: the database is on SSDs, and MySQL uses around 20% of a single core in aggregate (including iowait, which itself is around 1%).

The slowness really is on the bitcoind side. I can spawn an arbitrary number of forked, independent client processes (so no thread-locking issues between them) issuing JSON-RPC queries to bitcoind, and after about 3 clients there are no significant performance improvements (i.e., massively sub-linear scaling), with clients spending their time waiting for bitcoind to respond.

OTOH, bitcoind eats around 150% CPU time and cannot use the rest of the free cores in the machine.

5  Bitcoin / Development & Technical Discussion / bitcoind JSON RPC performance declining over the blockchain? on: February 20, 2015, 09:11:36 AM
Hi,

I'm pumping data about all transactions in the blockchain into a database for future analysis, and I've arrived at the conclusion that, as I progress through the blockchain, I spend more and more time waiting for bitcoind's responses for roughly the same amount of processed data. Here's an image of database INSERT rates for all transactions and their inputs and outputs over the last couple of days. At the start I was analyzing hundreds of blocks per second (understandable, since the early blocks were simple or empty), but around block 200,000 performance took a sharp drop, to the point where I'm now doing 1-2 blocks per second.

[image: chart of database INSERT rates over time, showing the sharp drop around block 200,000]
bitcoind's CPU usage is around 150%, approximately the same as when I started, even though I use 4 RPC threads; increasing the parallelism doesn't help. I'm nowhere near IO-bound (SSDs, plenty of RAM). It simply looks like bitcoind now takes longer to process the same aggregate amount of transactions.

From Googling around, I see that "bitcoind is hard to scale" is a common conclusion, but I didn't find anything about this sharp drop in performance. Is this common knowledge? Has something changed around block 200,000?

I'm using getrawtransaction to inspect individual transactions from each block - is this the optimal way?
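For reference, a minimal sketch of what my loop does, assuming the python-bitcoinrpc package (credentials and the starting height are placeholders):

Code:
  from bitcoinrpc.authproxy import AuthServiceProxy

  # placeholder credentials; must match rpcuser/rpcpassword in bitcoin.conf
  rpc = AuthServiceProxy("http://user:password@127.0.0.1:8332")

  for height in range(1, rpc.getblockcount() + 1):
      block = rpc.getblock(rpc.getblockhash(height))  # verbose block dict
      for txid in block["tx"]:
          tx = rpc.getrawtransaction(txid, 1)         # 1 = decoded JSON
          # ... INSERT one row per tx, per input, per output ...

That's two RPC calls per block plus one getrawtransaction per transaction, so the call volume grows with transaction count even when the block rate is constant.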