Title: how to dump or parse information from all available blocks on a full-node
Post by: citb0in on October 05, 2022, 04:34:29 PM

Hi all,
please excuse me if this has already been answered, but I really searched and didn't find any helpful resource. Let's say I want to dump the field "merkleroot" of blocks 700,000 to 750,000 and save the results one per line, so that output.lst contains 50,001 lines in total. What is the most efficient and fastest way to parse such information without using public (RESTful) API servers? I'd like to search locally on my full node. Unfortunately I have no clue how to grep for this information in the existing binary files under ~/.bitcoin/blocks/blk*.dat. For performance reasons I also want to avoid running commands like this ...

Code:
bitcoin-cli getblock $(bitcoin-cli getblockhash 700000) | grep -i merkle

Thanks in advance for any hints.

PS: Which bitcoin-cli command lists the blocks that a pruned node still has available?

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: PawGo on October 05, 2022, 04:45:13 PM

Take a look at https://github.com/gcarq/rusty-blockparser
It takes some time to process the blocks, but the app is highly configurable and you can easily dump the data you need to a file.

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: citb0in on October 05, 2022, 04:56:03 PM

Thank you!
This is what I call a coincidence. Just now I switched to this thread and wanted to reply that I've found a solution, which is bitcoin-iterate (https://github.com/rustyrussell/bitcoin-iterate) (by Rusty Russell) :D But thanks for the suggested tool. Do you happen to know which one is better in terms of functionality and performance?

EDIT: I just found out that bitcoin-iterate doesn't seem to work on pruned nodes, because it starts reading all available blocks and explicitly looks for the genesis block (see here (https://github.com/rustyrussell/bitcoin-iterate/issues/22), where another user ran into this issue, too). I haven't found any switch in the tool yet that would let me bypass the genesis search. If anyone knows how to force bitcoin-iterate to also run on pruned nodes, please let me know. Meanwhile I will try the tool rusty-blockparser which you suggested...

... Well, rusty-blockparser has a lot of requirements (like cargo, and during the compile process it downloads and processes dozens of modules or libraries). It takes very long until everything is installed and finished. While typing this text I'm still waiting...

At the same time I stumbled over a very simple and neat tool called blockchain-parser (https://github.com/ragestack/blockchain-parser) (by Denis Leonov), which was last updated 7 months ago. It's very simple: you input a blk*.dat file and it outputs the content as a text file. Very quick and straightforward. One can then grep for the particular info needed.

And as a good reference I want to point to this great article, which explains in detail how anyone can manually dump and read information from bitcoind's blk*.dat files just by using standard Linux tools like od or hexdump. The article covers everything there is to know about the structure of bitcoind's blk*.dat files. I found it very helpful.
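As a rough illustration of reading blk*.dat with standard tools, here is a minimal sketch that prints the merkle root of every block record in one file. It assumes the plain mainnet layout (magic f9beb4d9, a 4-byte little-endian size, then the block starting with its 80-byte header) and a file that is not XOR-obfuscated, which newer Bitcoin Core releases may apply to block files. The function names hex_at, reverse_hex and blk_merkleroots are made up for this example. Blocks are stored in the order they were received, not in height order, so the output says nothing about heights.

```shell
#!/bin/sh
# Sketch: list the merkle root of every block stored in one blk*.dat file.
# Assumes the plain (un-obfuscated) mainnet format: 4-byte magic f9beb4d9,
# 4-byte little-endian block size, then the block, whose first 80 bytes are
# the header (version 4, prev hash 32, merkle root 32, time/bits/nonce 12).

# hex_at FILE OFFSET LEN: print LEN bytes starting at OFFSET as one hex string.
hex_at() {
    # tail -c +N is 1-based, hence OFFSET + 1
    tail -c "+$(($2 + 1))" "$1" | head -c "$3" | od -An -v -tx1 | tr -d ' \n'
}

# reverse_hex HEX: reverse the byte order (stored order <-> displayed order).
reverse_hex() {
    printf '%s\n' "$1" |
        awk '{ out = ""; for (i = length($0) - 1; i >= 1; i -= 2) out = out substr($0, i, 2); print out }'
}

# blk_merkleroots FILE [MAGIC]: one merkle root per line, in file order.
blk_merkleroots() (
    file=$1
    magic=${2:-f9beb4d9}
    total=$(($(wc -c < "$file")))
    offset=0
    while [ $((offset + 88)) -le "$total" ]; do
        # Stop at zero padding or anything that is not a block record.
        [ "$(hex_at "$file" "$offset" 4)" = "$magic" ] || break
        size=$((0x$(reverse_hex "$(hex_at "$file" $((offset + 4)) 4)")))
        # The merkle root sits 44 bytes into the record (8 + 4 + 32).
        reverse_hex "$(hex_at "$file" $((offset + 44)) 32)"
        offset=$((offset + 8 + size))
    done
)
```

Something like `blk_merkleroots ~/.bitcoin/blocks/blk00000.dat` would then print the roots for that file. Spawning tail/head/od per block is slow, so this is only meant to show the layout, not to be the fastest route.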
EDIT: After a long wait for rusty-blockparser to compile, it unfortunately doesn't seem to work as expected on the pruned node I tested. Although I even tried start and end heights beyond the defaults (which are usually fine according to the usage examples and the manual), rusty-blockparser detects only one single block on my node:

Code:
$ ./rusty-blockparser simplestats

Quote:
[20:30:14] INFO - main: Starting rusty-blockparser v0.8.1 ...
[20:30:14] INFO - index: Reading index from /home/bitcoin/.bitcoin/blocks/index ...
[20:30:27] INFO - index: Got longest chain with 757246 blocks ... <--- (this is not correct, my node is on height=757251)
[20:30:27] INFO - blkfile: Reading files from /home/bitcoin/.bitcoin/blocks ...
[20:30:27] INFO - parser: Parsing Bitcoin blockchain (range=0..) ...
[20:30:27] INFO - callback: Executing SimpleStats ...
[20:30:27] INFO - parser: Done. Processed 1 blocks in 0.00 minutes. (avg: 1 blocks/sec)
[20:30:27] INFO - simplestats:
SimpleStats:
 -> valid blocks: 1
 -> total transactions: 1
 -> total tx inputs: 1
 -> total tx outputs: 1
 -> total tx fees: 0.00000000 (0 units)
 -> total volume: 50.00000000 (5000000000 units)
 -> biggest value tx: 50.00000000 (5000000000 units) seen in block #0, txid: 4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b
 -> biggest size tx: 204 bytes seen in block #0, txid: 4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b
Averages:
 -> avg block size: 0.28 KiB
 -> avg time between blocks: 0.00 (minutes)
 -> avg txs per block: 1.00
 -> avg inputs per tx: 1.00
 -> avg outputs per tx: 1.00
 -> avg value per output: 50.00
Transaction Types:
 -> Pay2PublicKey: 1 (100.00%) first seen in block #0, txid: 4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b
[20:30:27] INFO - main: Fin.

As you can see, it processed only one single block. But there are hundreds of blocks on that host.
I even tried to specify a start height which I know for sure is valid, but it didn't help either. Any clues? It seems to me that this tool doesn't get along with pruned nodes either? Does anyone have some insight or more information on this?

But even if the tool worked on my pruned node as expected --> how would I achieve the originally intended goal with it? I don't see any possibility in the options or in the manual to filter for such things. This tool seems to be more suitable for dumping balances, addresses, etc. into a .csv file; it has only three subcommands. So I'm still at the beginning. How should I achieve the goal mentioned at the beginning of my post?

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: BlackHatCoiner on October 05, 2022, 05:11:53 PM

Quote: (see here (https://github.com/rustyrussell/bitcoin-iterate/issues/22) where another user ran into this issue, too).
Pruning support is considered a luxury in such small projects. But an issue left unanswered is a sign that you shouldn't get involved with the project, at least if there's a more active alternative, which here is rusty-blockparser.

Quote: If anyone knows how to force bitcoin-iterate to run also on pruned nodes, please let me know.
That would require altering the source code. IMO, this should be the last resort. You and I don't know how they've written their program. I'd honestly prefer starting from scratch to diving into these C files.

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: LoyceV on October 06, 2022, 08:08:36 AM

Quote: For performance reasons I also want avoid running commands like this ...
If you don't need to do this often, you could just let it loop through.
I tested it, and from a spinning disk I get 10,000 merkle roots in 6 minutes.

Code:
bitcoin-cli getblock $(bitcoin-cli getblockhash 700000) | grep -i merkle

I'm surprised this isn't included in Blockchair's data dumps (https://bitcointalk.org/index.php?topic=5307550.msg56378771#msg56378771). If you want, I can add it (https://bitcointalk.org/index.php?topic=5307550.msg61071279#msg61071279).

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: citb0in on October 06, 2022, 12:51:10 PM

Just for comparison, on my side it took 1m12s for 2,860 blocks when using RPC:
Code:
$ time for i in {754500..757359..1}; do bitcoin-cli -rpcuser=myuser -rpcpassword=mypwd getblock $(bitcoin-cli -rpcuser=myuser -rpcpassword=mypwd getblockhash $i) | grep -i merkle; done

Quote:
"merkleroot": "9d5c1910d2e75d0b3e2fb36f09341ee8043e8d18fee327fc8bbad43dec95e47d",
real 1m12.700s
user 0m13.720s
sys 0m5.540s

Same thing without RPC utilized --> 46sec for 2,860 blocks

Code:
$ time for i in {754500..757359..1}; do bitcoin-cli getblock $(bitcoin-cli getblockhash $i) | grep -i merkle; done

Quote:
"merkleroot": "9d5c1910d2e75d0b3e2fb36f09341ee8043e8d18fee327fc8bbad43dec95e47d",
real 0m45.959s
user 0m13.702s
sys 0m5.364s

Well, as you see, in my case using RPC was not faster. I'd go the way without utilizing RPC.

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: DaveF on October 06, 2022, 01:12:48 PM

Quote: well, as you see in my case using RPC was not faster. I'd go the way without utilizing RPC
A lot of the speed difference between bitcoin-cli and RPC is going to depend on the rest of the system: HDD speed, IO in general, CPU threads, RAM and so on. It seems a bit counterintuitive, and you would think a faster system would just be faster, but having run a bunch of nodes and apps on different hardware, there have been many times when A ran faster than B until you added RAM, and then B was faster. It all comes down to what resources you have available and what the software is looking for.
-Dave

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: LoyceV on October 06, 2022, 04:47:42 PM

6 minutes is very fast. It's a Xeon that's mostly idle.
Currently at block 383,000, I'll post the results tomorrow.

Quote: same thing without RPC utilized --> 46sec for 2,860 blocks
Running from SSD, I assume?

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: n0nce on October 07, 2022, 01:01:45 AM

Quote: same thing without RPC utilized --> 46sec for 2,860 blocks
Quote: Running from SSD, I assume?

Code:
real 2m30.827s

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: citb0in on October 07, 2022, 05:44:56 AM

Maybe my better performance is due to running a pruned node? I'm just guessing.
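A sketch of the loop approach timed above, under a few assumptions rather than as a tested recipe. It calls getblockheader instead of getblock, which returns the same "merkleroot" field without loading the full block and, as far as I know, still answers for heights whose block data has been pruned. It also touches on the PS from the first post: getblockchaininfo reports "blocks" (the current height) and, on pruned nodes only, "pruneheight" (the lowest height still stored). Note that bitcoin-cli always talks to bitcoind over RPC, so as far as I can tell the two timings above differ only in how the credentials are passed. The names BITCOIN_CLI, dump_merkleroots and stored_block_range are invented for this example.

```shell
#!/bin/sh
# Sketch: dump one merkle root per line for a height range via bitcoin-cli.
# BITCOIN_CLI is a hypothetical override (for example to add
# -rpcuser/-rpcpassword, or to substitute a stub); default is "bitcoin-cli".
BITCOIN_CLI=${BITCOIN_CLI:-bitcoin-cli}

# dump_merkleroots START END: print the merkle root of each height in range.
dump_merkleroots() (
    height=$1
    end=$2
    while [ "$height" -le "$end" ]; do
        hash=$($BITCOIN_CLI getblockhash "$height") || exit 1
        # Pull the quoted value out of the pretty-printed JSON line.
        $BITCOIN_CLI getblockheader "$hash" |
            grep -o '"merkleroot": *"[0-9a-f]*"' | cut -d'"' -f4
        height=$((height + 1))
    done
)

# stored_block_range: print "LOW HIGH", the lowest and highest heights whose
# block data is on disk. Without a "pruneheight" field LOW falls back to 0.
stored_block_range() (
    info=$($BITCOIN_CLI getblockchaininfo) || exit 1
    low=$(printf '%s\n' "$info" | grep -o '"pruneheight": *[0-9]*' | tr -cd '0-9')
    high=$(printf '%s\n' "$info" | grep -o '"blocks": *[0-9]*' | tr -cd '0-9')
    echo "${low:-0} $high"
)
```

Usage would be something like `dump_merkleroots 700000 750000 > output.lst`. It still spawns two bitcoin-cli processes per block, so expect timings in the same ballpark as the loops above.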
Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: LoyceV on October 07, 2022, 08:19:28 AM

Here's the result:
block_hash_version_versionHex_merkleroot_time_mediantime_nonce_bits_difficulty_chainwork_nTx_strippedsize_size_weight.tsv.gz (http://blockdata.loyce.club/block_hash_version_versionHex_merkleroot_time_mediantime_nonce_bits_difficulty_chainwork_nTx_strippedsize_size_weight.tsv.gz) (80 MB).

Sample:
Code:
block hash version versionHex merkleroot time mediantime nonce bits difficulty chainwork nTx strippedsize size weight

Daily updates take only 18 seconds. Enjoy :) Let me know if I messed anything up.

Title: Re: how to dump or parse information from all available blocks on a full-node
Post by: citb0in on October 07, 2022, 08:44:56 AM

Quote: Here's the result: block_hash_version_versionHex_merkleroot_time_mediantime_nonce_bits_difficulty_chainwork_nTx_strippedsize_size_weight.tsv.gz (http://blockdata.loyce.club/block_hash_version_versionHex_merkleroot_time_mediantime_nonce_bits_difficulty_chainwork_nTx_strippedsize_size_weight.tsv.gz) (80 MB). ...
Great job Loyce, many thanks!
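With that dump downloaded, the original goal (one merkle root per line for a height range) could be sketched as below. This assumes the file is tab-separated with the header row shown in the sample, and that the "block" column holds the block height; neither is confirmed beyond the sample line. The function name tsv_merkleroots is made up, and the columns are located by header name so that a change in column order would not silently pick the wrong field.

```shell
#!/bin/sh
# Sketch: extract merkle roots for a height range from the gzipped TSV dump.
# Assumes a header row containing "block" (taken to be the height) and
# "merkleroot", with tab-separated fields.

# tsv_merkleroots FILE FIRST LAST: one merkle root per line for FIRST..LAST.
tsv_merkleroots() {
    gzip -dc < "$1" | awk -F '\t' -v first="$2" -v last="$3" '
        NR == 1 {
            # Map header names to column numbers.
            for (i = 1; i <= NF; i++) col[$i] = i
            b = col["block"]; m = col["merkleroot"]
            next
        }
        $b + 0 >= first + 0 && $b + 0 <= last + 0 { print $m }'
}
```

For the range from the first post that would be something like `tsv_merkleroots block_hash_..._weight.tsv.gz 700000 750000 > output.lst`, with the real file name filled in.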