LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
May 15, 2022, 10:20:14 AM |
|
Just a question - for the list of funded addresses, do you take into account unconfirmed balances at the moment of the snapshot or not? And the opposite - is an address with an unconfirmed spend still included or already excluded?
The data dump comes from the blockchain only. Anything in the mempool is ignored.
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
May 18, 2022, 11:26:27 AM |
|
Would anyone be interested in block data for Ethereum? Apart from the disk space it takes (it's even larger than Bitcoin's block data), I wouldn't mind adding it to my collection. I'm currently downloading it to compile a list of Eth addresses and their balance from this data. Amazingly, I couldn't find a complete list anywhere.
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
June 21, 2022, 11:07:54 AM |
|
Update: I have downloaded "calls", "erc-20" and "transactions". The total size is 712 GB, and I don't have the server space to store it. I have it in local storage (so I can upload it in about 2 days). Unfortunately, 2 files are corrupted:
- blockchair_ethereum_calls_20200113.tsv.gz
- blockchair_ethereum_calls_20211110.tsv.gz
I contacted Blockchair, but they "haven't detected the issue".
I'm currently downloading it to compile a list of Eth addresses and their balance from this data. Amazingly, I couldn't find a complete list anywhere.
By now Blockchair has the complete Ethereum address list online: blockchair_ethereum_addresses_latest.tsv.gz. That ends my efforts to compile this list myself.
|
|
|
|
PawGo
Legendary
Offline
Activity: 952
Merit: 1372
|
|
June 22, 2022, 08:15:49 AM |
|
In the list of funded addresses there is an unnecessary "address" line (a leftover column header) between the "3.." and "bc.." addresses, at line 31808561. It looks like you had separate lists and concatenated them.
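For anyone who wants to check their own copy, a small sketch of how such a leftover header line could be found and stripped; the file name is just an example, not the actual download:
Code:
# show any stray "address" header lines with their line numbers
zgrep -n '^address$' addresses_sorted.txt.gz

# write a cleaned copy without the header lines
zcat addresses_sorted.txt.gz | grep -v '^address$' | gzip > addresses_clean.txt.gz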
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
June 22, 2022, 08:42:31 AM |
|
In the list of funded addresses there is an unnecessary "address" line (a leftover column header) between the "3.." and "bc.." addresses, at line 31808561. It looks like you had separate lists and concatenated them.
Thanks! I'll take this to my other topic.
|
|
|
|
PawGo
Legendary
Offline
Activity: 952
Merit: 1372
|
|
August 02, 2022, 01:15:59 PM Last edit: August 02, 2022, 02:33:40 PM by PawGo |
|
Hi Loyce, would it be possible to prepare a single file with:
- all block hashes
- all transaction IDs
Or maybe you have such a script that I could execute on a full node. Otherwise I would have to download packs from http://blockdata.loyce.club/ , decompress them and parse the TSV files (which seems to be the simplest solution).
Do you plan to back up the /blocks/ folder from Blockchair?
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
August 02, 2022, 03:01:06 PM Last edit: August 02, 2022, 05:34:30 PM by LoyceV |
|
Would it be possible to prepare a single file with:
- all block hashes
- all transaction IDs
Or maybe you have such a script that I could execute on a full node.
What format are you looking for? Just a long list of hashes, or do you need to know which txid belongs to which block? I think the easiest way is to get the data from /transactions/ (55 GB) and /blocks/. I can save you a 55 GB download if you tell me the format you want. To test, this is running now in /transactions/:
Code:
for file in `ls`; do echo "Now processing $file"; gunzip -c $file | grep -v 'block_id' | cut -f1-2 >> ../PawGo.tsv; done; gzip ../PawGo.tsv
See: blockdata.loyce.club/PawGo.tsv or blockdata.loyce.club/PawGo.tsv.gz. It's 27 GB now; hashes don't compress very well. It's scheduled to be deleted in 7 days.
Otherwise I would have to download packs from http://blockdata.loyce.club/ , decompress them and parse the TSV files (which seems to be the simplest solution).
That's what I would do.
Do you plan to back up the /blocks/ folder from Blockchair?
No need: these files are tiny, so downloading them from Blockchair directly shouldn't take too long anyway.
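Since the one-liner above only keeps the block number and txid, here is a hedged sketch of how the block hash could be added by joining against the /blocks/ dumps. It assumes column 1 is the block id and column 2 is the hash in both dump types (verify against the actual header lines); the directory layout and output names are just examples.
Code:
#!/bin/bash
# Build a block_id -> block_hash lookup from the daily /blocks/ dumps.
# Assumption: column 1 = block id, column 2 = block hash.
for f in blocks/*.tsv.gz; do
  gunzip -c "$f" | grep -v '^id' | cut -f1-2
done | sort -k1,1 > block_hashes.tsv

# Extract block_id and txid from the /transactions/ dumps and join on block_id.
for f in transactions/*.tsv.gz; do
  echo "Now processing $f" >&2
  gunzip -c "$f" | grep -v 'block_id' | cut -f1-2
done | sort -k1,1 | join -t $'\t' block_hashes.tsv - > block_hash_txid.tsv

gzip block_hash_txid.tsv
The output would then be "block_id <tab> block_hash <tab> txid", one line per transaction.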
|
|
|
|
PawGo
Legendary
Offline
Activity: 952
Merit: 1372
|
|
August 02, 2022, 06:27:06 PM |
|
Perfect, great, fantastic, thank you! That's exactly what I was thinking about. I owe you beer & frites.
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
August 02, 2022, 06:34:45 PM |
|
Perfect, great, fantastic, thank you! That's exactly what I was thinking about.
Is it? It doesn't have the block hashes you asked for, only txids and block numbers.
I owe you beer & frites.
No worries, I've had my fair share today already.
|
|
|
|
PawGo
Legendary
Offline
Activity: 952
Merit: 1372
|
|
August 02, 2022, 06:37:05 PM |
|
That's exactly what I was thinking about.
Is it? It doesn't have the block hashes you asked for, only txids and block numbers.
Yes, I was just stunned by the amount of data to download. I need only txids. With block hashes it is indeed simpler. (You may delete the file.)
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
August 10, 2022, 08:15:14 PM |
|
I just got a very nice Xeon-powered dedicated server (no more VPS!) from an anonymous donation, so I'm covered for now. Update: the server is currently offline for an upgrade. With more disk space, I can add Ethereum data from Blockchair (which I have locally already), and later Dogecoin data too.
I've been playing with an idea: I want to make a graph of funded (potential) ChipMixer chips over the years, with daily data. Assumptions: I'll start by looking for addresses that received 0.512 BTC. I'll exclude all addresses that received more than one transaction (ever), and I'll count the chips from the day they were funded until the day they're emptied. In Bitcoin's early years, before ChipMixer even existed, many potential chips were emptied the same day again. Those won't be counted. Problem: there's a lot of data, and I don't do databases. I was actually running out of disk space to sort the data, so this upgrade came at the right time.
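As a rough illustration of the candidate-selection step, here is a hedged bash sketch against Blockchair's daily outputs dumps. The column positions (value in satoshi in column 5, recipient address in column 7) and the file paths are assumptions, and it only covers selecting candidate chips, not the funding/emptying dates:
Code:
#!/bin/bash
# Step 1: addresses that ever received an output of exactly 0.512 BTC (51,200,000 sat).
for f in outputs/blockchair_bitcoin_outputs_*.tsv.gz; do
  gunzip -c "$f" | awk -F'\t' '$5 == 51200000 { print $7 }'
done | sort -u > chip_candidates.txt

# Step 2: addresses that received more than one output of any value
# (an approximation of "more than one incoming transaction"), to be excluded.
for f in outputs/blockchair_bitcoin_outputs_*.tsv.gz; do
  gunzip -c "$f" | grep -v 'block_id' | cut -f7
done | sort | uniq -d > multi_funded.txt

# Candidates not in the multi-funded list; funding and emptying dates would still
# need a pass over the outputs and inputs dumps, with same-day empties dropped.
comm -23 chip_candidates.txt multi_funded.txt > potential_chips.txt
wc -l potential_chips.txt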
|
|
|
|
PawGo
Legendary
Offline
Activity: 952
Merit: 1372
|
|
August 11, 2022, 12:17:38 PM |
|
Update: the server is currently offline for an upgrade. With more disk space, I can add Ethereum data from Blockchair (which I have locally already), and later Dogecoin data too.
Hi, what's the state of the ETH address dump? Is it uploaded already?
Problem: there's a lot of data, and I don't do databases. I was actually running out of disk space to sort the data, so this upgrade came at the right time.
Do you want to talk about doing it in a DB?
|
|
|
|
LoyceMobile
|
|
August 11, 2022, 12:51:36 PM |
|
Hi, what's the state of the ETH address dump? Is it uploaded already?
I have it, updated until June only, but it's not online yet. And I'm currently sailing, so I can't access it. The data I meant is the full Ethereum transaction data, about 800 GB. That will take my home internet a while to upload.
Do you want to talk about doing it in a DB?
I'm still a total noob, but it would be good to learn.
|
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
August 15, 2022, 12:54:33 PM |
|
I stumbled upon something peculiar. Take blockchair_bitcoin_outputs_20220203.tsv.gz for example:
Code:
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 11 2022-02-03 01:43:54 51200000 18970.623 bc1qt2kc82kr0wdyyyqns7qyvl377dap69ygzkpwmc witness_v0_scripthash 00145aad83aac37b9a4210138780467e3ef37a1d1488 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 11 2022-02-03 01:43:54 51200000 18970.623 bc1qt2kc82kr0wdyyyqns7qyvl377dap69ygzkpwmc witness_v0_scripthash 00145aad83aac37b9a4210138780467e3ef37a1d1488 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 11 2022-02-03 01:43:54 51200000 18970.623 bc1qt2kc82kr0wdyyyqns7qyvl377dap69ygzkpwmc witness_v0_scripthash 00145aad83aac37b9a4210138780467e3ef37a1d1488 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 12 2022-02-03 01:43:54 51200000 18970.623 bc1qx9jm85e08jasw75g0drr7y2x9xx45xck6xvxhe witness_v0_scripthash 00143165b3d32f3cbb077a887b463f1146298d5a1b16 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 13 2022-02-03 01:43:54 51200000 18970.623 bc1qmx0eraunssvj9ukel8m40cpt8ez4wxj4t2jn4q witness_v0_scripthash 0014d99f91f793841922f2d9f9f757e02b3e45571a55 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 13 2022-02-03 01:43:54 51200000 18970.623 bc1qmx0eraunssvj9ukel8m40cpt8ez4wxj4t2jn4q witness_v0_scripthash 0014d99f91f793841922f2d9f9f757e02b3e45571a55 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 13 2022-02-03 01:43:54 51200000 18970.623 bc1qmx0eraunssvj9ukel8m40cpt8ez4wxj4t2jn4q witness_v0_scripthash 0014d99f91f793841922f2d9f9f757e02b3e45571a55 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 14 2022-02-03 01:43:54 51200000 18970.623 bc1qr4jgu3t5fnjrcux646kssfmavsw5zftmj4tsc6 witness_v0_scripthash 00141d648e45744ce43c70daaead08277d641d41257b 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 15 2022-02-03 01:43:54 51200000 18970.623 bc1qw6dnjhw8qjyxtszn950l5zh57x60tm5lcsdtkr witness_v0_scripthash 0014769b395dc7048865c0532d1ffa0af4f1b4f5ee9f 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 15 2022-02-03 01:43:54 51200000 18970.623 bc1qw6dnjhw8qjyxtszn950l5zh57x60tm5lcsdtkr witness_v0_scripthash 0014769b395dc7048865c0532d1ffa0af4f1b4f5ee9f 0 -1
721577 f2ec8c7f07725959014613a5cf04dde4cf3079c8948bc011298479e751935fc3 15 2022-02-03 01:43:54 51200000 18970.623 bc1qw6dnjhw8qjyxtszn950l5zh57x60tm5lcsdtkr witness_v0_scripthash 0014769b395dc7048865c0532d1ffa0af4f1b4f5ee9f 0 -1
There are many duplicated lines! For the compressed filesize it doesn't matter much, but if I remove them, the number of lines drops by 1,046,856-879,157=167,699! I checked more archives: the older ones have only a few duplicate lines, the newer archives have tens or hundreds of thousands of duplicates. What could be the reason? And worse: it also makes me wonder if other entries could be missing.
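For anyone who wants to check their own copies, a small sketch (file names as examples) that reports the number of exact duplicate lines per daily outputs archive:
Code:
#!/bin/bash
# Count exact duplicate lines in each daily outputs dump.
for f in blockchair_bitcoin_outputs_*.tsv.gz; do
  total=$(gunzip -c "$f" | wc -l)
  unique=$(gunzip -c "$f" | sort | uniq | wc -l)
  printf '%s\t%d total\t%d unique\t%d duplicates\n' "$f" "$total" "$unique" "$((total - unique))"
done
De-duplicating with sort -u removes the extra lines easily enough, but of course it can't tell whether any entries are missing.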
|
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
October 06, 2022, 08:06:35 AM |
|
From how to dump or parse information from all available blocks on a full-node, I realized there's more block data missing from these data dumps. Example:
Code:
bitcoin-cli getblock $(./bitcoin-cli getblockhash 700000) | head -n25
{
  "hash": "0000000000000000000590fc0f3eba193a278534220b2b37e9849e1a770ca959",
  "confirmations": 57334,
  "height": 700000,
  "version": 1073733636,
  "versionHex": "3fffe004",
  "merkleroot": "1f8d213c864bfe9fb0098cecc3165cce407de88413741b0300d56ea0f4ec9c65",
  "time": 1631333672,
  "mediantime": 1631331088,
  "nonce": 2881644503,
  "bits": "170f48e4",
  "difficulty": 18415156832118.24,
  "chainwork": "0000000000000000000000000000000000000000216dd8dc61fdffabb624feeb",
  "nTx": 1276,
  "previousblockhash": "0000000000000000000aa3ce000eb559f4143be419108134e0ce71042fc636eb",
  "nextblockhash": "00000000000000000002f39baabb00ffeb47dbdb425d5077baa62c47482b7e92",
  "strippedsize": 907224,
  "size": 1276422,
  "weight": 3998094,
  "tx": [
    "1d8149eb8d8475b98113b5011cf70e0b7a4dccff71286d28b8b4b641f94f1e46",
    "ed25927576988e38e4cc8e4b19d1272c480f113fb605271b190df05aa983714e",
    "744556a5586736471d496c928ccca8fd58dadac6071394eca846c180b0dec6fe",
    "adfcbcbd4f87a725337ab0b4eb657f97123806d30ccd50fa0c107b5124692e1d",
    "afe5de49b7a84bb5d79d114601d81645264ebb4fcb8e1b45c280f6d788a8a7ba",
~
If anyone wants a dump of this data, just ask. I don't mind adding it.
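If someone does want the per-block transaction lists, a hedged sketch of how they could be pulled from a full node (requires jq; the output file name and format are only examples):
Code:
#!/bin/bash
# Emit one "height <TAB> block_hash <TAB> txid" line per transaction, up to the tip.
TIP=$(bitcoin-cli getblockcount)
for ((h = 0; h <= TIP; h++)); do
  bitcoin-cli getblock "$(bitcoin-cli getblockhash "$h")" 1 |
    jq -r '.height as $ht | .hash as $bh | .tx[] | [$ht, $bh, .] | @tsv'
done | gzip > block_txids.tsv.gz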
|
|
|
|
citb0in
|
|
October 06, 2022, 08:35:10 AM |
|
Hi Loyce, and thanks for adding "merkleroot" to your data. Would you mind also adding the rest?
Code:
"hash": "0000000000000000000590fc0f3eba193a278534220b2b37e9849e1a770ca959",
"version": 1073733636,
"versionHex": "3fffe004",
"merkleroot": "1f8d213c864bfe9fb0098cecc3165cce407de88413741b0300d56ea0f4ec9c65",
"time": 1631333672,
"mediantime": 1631331088,
"nonce": 2881644503,
"bits": "170f48e4",
"difficulty": 18415156832118.24,
"chainwork": "0000000000000000000000000000000000000000216dd8dc61fdffabb624feeb",
"nTx": 1276,
"previousblockhash": "0000000000000000000aa3ce000eb559f4143be419108134e0ce71042fc636eb",
"nextblockhash": "00000000000000000002f39baabb00ffeb47dbdb425d5077baa62c47482b7e92",
"strippedsize": 907224,
"size": 1276422,
"weight": 3998094,
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
October 06, 2022, 10:46:05 AM Last edit: October 07, 2022, 10:28:19 AM by LoyceV |
|
Hi Loyce, and thanks for adding "merkleroot" to your data. Would you mind also adding the rest?
I'll create a TSV with this: block hash, version, versionHex, merkleroot, time, mediantime, nonce, bits, difficulty, chainwork, nTx, strippedsize, size, weight. I left out the previous and next block hash to reduce the size. It should be done in about 11 hours, which means I'll share the output tomorrow. I expect the file to be around 150 MB, so I didn't bother making smaller daily files. Once done, this will get daily updates, up to the same block as Blockchair's data dumps (so not the latest blocks from my own Bitcoin Core).
Update: block_hash_version_versionHex_merkleroot_time_mediantime_nonce_bits_difficulty_chainwork_nTx_strippedsize_size_weight.tsv.gz (80 MB). I've updated the OP.
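For reference, a hedged sketch of how such a header TSV could be generated from Bitcoin Core with jq; this is a reconstruction with an assumed column order and output name, not necessarily the script behind the published file:
Code:
#!/bin/bash
# Dump one tab-separated line of block header fields per block, up to the current tip.
TIP=$(bitcoin-cli getblockcount)
for ((h = 0; h <= TIP; h++)); do
  bitcoin-cli getblock "$(bitcoin-cli getblockhash "$h")" 1 |
    jq -r '[.hash, .version, .versionHex, .merkleroot, .time, .mediantime,
            .nonce, .bits, .difficulty, .chainwork, .nTx,
            .strippedsize, .size, .weight] | @tsv'
done > block_headers.tsv
gzip block_headers.tsv
One bitcoin-cli call per block is slow (which fits the ~11-hour estimate above); batching the RPC requests would speed it up considerably.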
|
|
|
|
LoyceV (OP)
Legendary
Offline
Activity: 3402
Merit: 17183
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
December 13, 2022, 03:04:49 PM Last edit: July 16, 2023, 06:39:46 AM by LoyceV |
|
I only added this centralized shitcoin to utilize the 2 TB storage disk, but adding several gigabytes per week filled it. This gives enough space for a long time again.
|
|
|
|
|