Bitcoin Forum
May 04, 2024, 09:17:44 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1]
  Print  
Author Topic: The fastest way to download blockchain data in csv format  (Read 259 times)
kristiano92 (OP)
Jr. Member
*
Offline Offline

Activity: 68
Merit: 1


View Profile
August 11, 2020, 12:27:25 PM
 #1

Hi,

I wonder how to download bitcoin data to csv files. I would like to get all blocks, transactions in csv files. What is the fastest way to achieve this?
I heard about bitcoin-etl and blocksci. Is it possible with both of them ?
The block chain is the main innovation of Bitcoin. It is the first distributed timestamping system.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
philipma1957
Legendary
*
Online Online

Activity: 4116
Merit: 7849


'The right to privacy matters'


View Profile WWW
August 11, 2020, 01:02:01 PM
 #2

Hi,

I wonder how to download bitcoin data to csv files. I would like to get all blocks, transactions in csv files. What is the fastest way to achieve this?
I heard about bitcoin-etl and blocksci. Is it possible with both of them ?

It is a pretty big file I would think you need to spilt it into smaller ones.


this is a pdf of blocksci white paper

https://arxiv.org/pdf/1709.02489.pdf

do you read it? it may give you some insight to your question.

▄▄███████▄▄
▄██████████████▄
▄██████████████████▄
▄████▀▀▀▀███▀▀▀▀█████▄
▄█████████████▄█▀████▄
███████████▄███████████
██████████▄█▀███████████
██████████▀████████████
▀█████▄█▀█████████████▀
▀████▄▄▄▄███▄▄▄▄████▀
▀██████████████████▀
▀███████████████▀
▀▀███████▀▀
.
 MΞTAWIN  THE FIRST WEB3 CASINO   
.
.. PLAY NOW ..
kristiano92 (OP)
Jr. Member
*
Offline Offline

Activity: 68
Merit: 1


View Profile
August 11, 2020, 02:47:25 PM
 #3

Yes, I plan to split it into many smaller files. I am looking for the best solution to quickly download this.

I know blocksci but I am not sure that it could help me to do this in my way
DougM
Full Member
***
Offline Offline

Activity: 173
Merit: 120


View Profile
August 11, 2020, 06:17:46 PM
Merited by HCP (2), ABCbits (1)
 #4

Hi,

I wonder how to download bitcoin data to csv files. I would like to get all blocks, transactions in csv files. What is the fastest way to achieve this?
I heard about bitcoin-etl and blocksci. Is it possible with both of them ?
It all depends on what you want to do with the data, but you might want to check out LoyceV's thread first to see if it answers your needs:

Without further ado:
The data
Directory:
loyce.club/blockdata/

All-in-one file:
|
|
|
Text file
blockdata.txt
blockdata.lastday.txt
|
|
|
Size (MB)
362
0.107
|
|
|
GNU zip compressed file
blockdata.txt.gz
blockdata.lastday.txt.gz
|
|
|
Size (MB)
129
0.042
|
|
|

This file is in CSV (Comma-Separated Values) format. It can be imported into a spreadsheet (this might take a while considering the size).
Sample:
The file blockdata.txt has the following columns with data for each block, the first line is a header:
Code:
id,hash,time,median_time,size,stripped_size,weight,version,version_hex,version_bits,merkle_root,nonce,bits,difficulty,chainwork,coinbase_data_hex,transaction_count,witness_count,input_count,output_count,input_total,input_total_usd,output_total,output_total_usd,fee_total,fee_total_usd,fee_per_kb,fee_per_kb_usd,fee_per_kwu,fee_per_kwu_usd,cdd_total,generation,generation_usd,reward,reward_usd,guessed_miner
0,000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f,2009-01-03 18:15:05,2009-01-03 18:15:05,285,285,1140,1,1,000000000000000000000000000001,4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b,2083236893,486604799,1,0000000000000000000000000000000000000000000000000000000100010001,04ffff001d0104455468652054696d65732030332f4a616e2f32303039204368616e63656c6c6f72206f6e206272696e6b206f66207365636f6e64206261696c6f757420666f722062616e6b73,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
1,00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048,2009-01-09 02:54:25,2009-01-09 02:54:25,215,215,860,1,1,000000000000000000000000000001,0e3e2357e806b6cdb1f70b54c3a3a17b6714ee1f0e68bebb44a74b1efd512098,2573394689,486604799,1,0000000000000000000000000000000000000000000000000000000200020002,04ffff001d0104,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
2,000000006a625f06636b8bb6ac7b960a8d03705d1ace08b1a19da3fdcc99ddbd,2009-01-09 02:55:44,2009-01-09 02:54:25,215,215,860,1,1,000000000000000000000000000001,9b0fc92260312ce44e74ef369f5c66bbb85848f2eddd5a7a1cde251e54ccfdd5,1639830024,486604799,1,0000000000000000000000000000000000000000000000000000000300030003,04ffff001d010b,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
3,0000000082b5015589a3fdf2d4baff403e6f0be035a5d9742c1cae6295464449,2009-01-09 03:02:53,2009-01-09 02:55:44,215,215,860,1,1,000000000000000000000000000001,999e1c837c76a1b7fbb7e57baf87b309960f5ffefbf2a9b95dd890602272f644,1844305925,486604799,1,0000000000000000000000000000000000000000000000000000000400040004,04ffff001d010e,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
4,000000004ebadb55ee9096c9a2f8880e09da59c0d68b1c228da88e48844a1485,2009-01-09 03:16:28,2009-01-09 02:55:44,215,215,860,1,1,000000000000000000000000000001,df2b060fa2e5e9c8ed5eaf6a45c13753ec8c63282b2688322eba40cd98ea067a,2850094635,486604799,1,0000000000000000000000000000000000000000000000000000000500050005,04ffff001d011a,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
5,000000009b7262315dbf071787ad3656097b892abffd1f95a1a022f896f533fc,2009-01-09 03:23:48,2009-01-09 03:02:53,215,215,860,1,1,000000000000000000000000000001,63522845d294ee9b0188ae5cac91bf389a0c3723f084ca1025e7d9cdfe481ce1,2011431709,486604799,1,0000000000000000000000000000000000000000000000000000000600060006,04ffff001d0120,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
6,000000003031a0e73735690c5a1ff2a4be82553b2a12b776fbd3a215dc8f778d,2009-01-09 03:29:49,2009-01-09 03:02:53,215,215,860,1,1,000000000000000000000000000001,20251a76e64e920e58291a30d4b212939aae976baca40e70818ceaa596fb9d37,2538380312,486604799,1,0000000000000000000000000000000000000000000000000000000700070007,04ffff001d0123,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
7,0000000071966c2b1d065fd446b1e485b2c9d9594acd2007ccbd5441cfc89444,2009-01-09 03:39:29,2009-01-09 03:16:28,215,215,860,1,1,000000000000000000000000000001,8aa673bc752f2851fd645d6a0a92917e967083007d9c1684f9423b100540673f,2258412857,486604799,1,0000000000000000000000000000000000000000000000000000000800080008,04ffff001d012b,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown
8,00000000408c48f847aa786c2268fc3e6ec2af68e8468a34a28c61b7f1de0dc6,2009-01-09 03:45:43,2009-01-09 03:16:28,215,215,860,1,1,000000000000000000000000000001,a6f7f1c0dad0f2eb6b13c4f33de664b1b0e9f22efad5994a6d5b6086d85e85e3,1716931356,486604799,1,0000000000000000000000000000000000000000000000000000000900090009,04ffff001d012c,1,0,1,1,0,0,5000000000,0.5,0,0,0,0,0,0,0,5000000000,0.5,5000000000,0.5,Unknown

Separate columns:
Instead of killing your spreadsheet, each column can be downloaded separately (the first line is a header). The column "id" (block number) is added to each file (except for id.txt) to detect errors. Make sure the rows are aligned when inserting the columns you need:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Text file
id.txt
hash.txt
time.txt
median_time.txt
size.txt
stripped_size.txt
weight.txt
version.txt
version_hex.txt
version_bits.txt
merkle_root.txt
nonce.txt
bits.txt
difficulty.txt
chainwork.txt
coinbase_data_hex.txt
transaction_count.txt
witness_count.txt
input_count.txt
output_count.txt
input_total.txt
input_total_usd.txt
output_total.txt
output_total_usd.txt
fee_total.txt
fee_total_usd.txt
fee_per_kb.txt
fee_per_kb_usd.txt
fee_per_kwu.txt
fee_per_kwu_usd.txt
cdd_total.txt
generation.txt
generation_usd.txt
reward.txt
reward_usd.txt
guessed_miner.txt
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Size (MB)
4.1
39
12
12
3.8
3.7
4.1
2.9
2.6
19
39
6.4
6.0
9.1
39
61
2.3
1.5
2.4
2.4
6.9
5.2
7.7
5.4
4.4
4.2
3.9
3.6
4.0
3.5
8.2
6.6
4.3
6.6
5.4
5.0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
GNU zip compressed file
id.txt.gz
hash.txt.gz
time.txt.gz
median_time.txt.gz
size.txt.gz
stripped_size.txt.gz
weight.txt.gz
version.txt.gz
version_hex.txt.gz
version_bits.txt.gz
merkle_root.txt.gz
nonce.txt.gz
bits.txt.gz
difficulty.txt.gz
chainwork.txt.gz
coinbase_data_hex.txt.gz
transaction_count.txt.gz
witness_count.txt.gz
input_count.txt.gz
output_count.txt.gz
input_total.txt.gz
input_total_usd.txt.gz
output_total.txt.gz
output_total_usd.txt.gz
fee_total.txt.gz
fee_total_usd.txt.gz
fee_per_kb.txt.gz
fee_per_kb_usd.txt.gz
fee_per_kwu.txt.gz
fee_per_kwu_usd.txt.gz
cdd_total.txt.gz
generation.txt.gz
generation_usd.txt.gz
reward.txt.gz
reward_usd.txt.gz
guessed_miner.txt.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
File sizes on May 6, 2020. I won't update the size as they grow each day.

Background
In Bitcoin's Empty Blocks Analaysis, mikeywith asked if I can help scrape data for a 10 year analysis. Tranthidung posted a link to Blockchair Database Dumps, which offers compressed block summaries with daily updates.

Updates
All files will be updated daily.

Credits
Blockchair Database Dumps has a staggering amount of data, easily accessible in daily files (at 10 kB/s). All data presented in this topic comes from Blockchair.
I've converted these files to make it easier to access all data at once.

No spam please.
Self-moderated against spam. Discussion and questions are welcome.



Related topics
Bitcoin block data available in CSV format
List of all Bitcoin addresses with a balance
List of all Bitcoin addresses ever used
mikeywith
Legendary
*
Offline Offline

Activity: 2226
Merit: 6367


be constructive or S.T.F.U


View Profile
August 11, 2020, 09:24:19 PM
 #5

I wonder how to download bitcoin data to csv files. I would like to get all blocks, transactions in csv files. What is the fastest way to achieve this?

Been there, done that. Trust me unless your searching skill is way better mine then LoyceV's blockdata will be your best source, alternatively, you can download the "Blockchair Database Dumps" from blockchair, but if you don't have a script/bot to do that for you, then you will need to manually download 10 years worth of DAILY data, assuming we never had a whole day without finding a block then you are going to download about 3650 files, so I suggest you stick to the former option.

Keep in mind that LoyceV has that data in a .txt file, so for you to convert it to CSV you going to need a workaround, and the easiest one would be using MS Excel.

█▀▀▀











█▄▄▄
▀▀▀▀▀▀▀▀▀▀▀
e
▄▄▄▄▄▄▄▄▄▄▄
█████████████
████████████▄███
██▐███████▄█████▀
█████████▄████▀
███▐████▄███▀
████▐██████▀
█████▀█████
███████████▄
████████████▄
██▄█████▀█████▄
▄█████████▀█████▀
███████████▀██▀
████▀█████████
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
c.h.
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
▀▀▀█











▄▄▄█
▄██████▄▄▄
█████████████▄▄
███████████████
███████████████
███████████████
███████████████
███░░█████████
███▌▐█████████
█████████████
███████████▀
██████████▀
████████▀
▀██▀▀
HCP
Legendary
*
Offline Offline

Activity: 2086
Merit: 4316

<insert witty quote here>


View Profile
August 11, 2020, 09:54:35 PM
Merited by mikeywith (1)
 #6

Keep in mind that LoyceV has that data in a .txt file, so for you to convert it to CSV you going to need a workaround, and the easiest one would be using MS Excel.
Or you could simply rename the file from blockdata.txt to blockdata.csv Roll Eyes Roll Eyes Roll Eyes

No need to convert anything... a .csv is a plaintext file... it's just got the .csv extension to indicate that the data contained in the text file is in "comma separated value" format... ie. it is just a bunch of rows/columns where the column delimiter is "," and, optionally, the first row is column headers.

█████████████████████████
████▐██▄█████████████████
████▐██████▄▄▄███████████
████▐████▄█████▄▄████████
████▐█████▀▀▀▀▀███▄██████
████▐███▀████████████████
████▐█████████▄█████▌████
████▐██▌█████▀██████▌████
████▐██████████▀████▌████
█████▀███▄█████▄███▀█████
███████▀█████████▀███████
██████████▀███▀██████████
█████████████████████████
.
BC.GAME
▄▄░░░▄▀▀▄████████
▄▄▄
██████████████
█████░░▄▄▄▄████████
▄▄▄▄▄▄▄▄▄██▄██████▄▄▄▄████
▄███▄█▄▄██████████▄████▄████
███████████████████████████▀███
▀████▄██▄██▄░░░░▄████████████
▀▀▀█████▄▄▄███████████▀██
███████████████████▀██
███████████████████▄██
▄███████████████████▄██
█████████████████████▀██
██████████████████████▄
.
..CASINO....SPORTS....RACING..
█░░░░░░█░░░░░░█
▀███▀░░▀███▀░░▀███▀
▀░▀░░░░▀░▀░░░░▀░▀
░░░░░░░░░░░░
▀██████████
░░░░░███░░░░
░░█░░░███▄█░░░
░░██▌░░███░▀░░██▌
░█░██░░███░░░█░██
░█▀▀▀█▌░███░░█▀▀▀█▌
▄█▄░░░██▄███▄█▄░░▄██▄
▄███▄
░░░░▀██▄▀


▄▄████▄▄
▄███▀▀███▄
██████████
▀███▄░▄██▀
▄▄████▄▄░▀█▀▄██▀▄▄████▄▄
▄███▀▀▀████▄▄██▀▄███▀▀███▄
███████▄▄▀▀████▄▄▀▀███████
▀███▄▄███▀░░░▀▀████▄▄▄███▀
▀▀████▀▀████████▀▀████▀▀
mikeywith
Legendary
*
Offline Offline

Activity: 2226
Merit: 6367


be constructive or S.T.F.U


View Profile
August 11, 2020, 10:59:32 PM
 #7

No need to convert anything... a .csv is a plaintext file... it's just got the .csv extension to indicate that the data contained in the text file is in "comma separated value" format... ie. it is just a bunch of rows/columns where the column delimiter is "," and, optionally, the first row is column headers.

That's a good point but the plaintext in these files is not separated by a comma, will that still work in all csv viewers? Excel does recognize it as a file that has eliminated data in it, it also gives you the option of "My data has headers".

Here is how the .txt file looks by default:



Here is how it looks after importing it to excel and exporting it as .csv




Simply changing the extension from .txt to .csv didn't change the way the data is represented in Notepad, so I am not sure different programs deal with this stuff.



█▀▀▀











█▄▄▄
▀▀▀▀▀▀▀▀▀▀▀
e
▄▄▄▄▄▄▄▄▄▄▄
█████████████
████████████▄███
██▐███████▄█████▀
█████████▄████▀
███▐████▄███▀
████▐██████▀
█████▀█████
███████████▄
████████████▄
██▄█████▀█████▄
▄█████████▀█████▀
███████████▀██▀
████▀█████████
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
c.h.
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
▀▀▀█











▄▄▄█
▄██████▄▄▄
█████████████▄▄
███████████████
███████████████
███████████████
███████████████
███░░█████████
███▌▐█████████
█████████████
███████████▀
██████████▀
████████▀
▀██▀▀
HCP
Legendary
*
Offline Offline

Activity: 2086
Merit: 4316

<insert witty quote here>


View Profile
August 11, 2020, 11:13:54 PM
Last edit: November 15, 2023, 01:07:01 AM by HCP
Merited by ABCbits (2)
 #8

That's the individual "column" files... which aren't exactly CSV[1], as they're really just a single column of data... and that's an "End Of Line" separator issue.


It shows up fine in Notepad++... but by displaying the "hidden" whitespace, we can see it only has LF or "Line Feed" characters for "End of Line"... this is a Unix/Linux standard. However, Windows (and notepad.exe) uses "CR LF"... aka "Carriage Return, Line Feed".

Hence, the file not displaying correctly in notepad.exe


Here is how it looks after importing it to excel and exporting it as .csv
I suspect that Excel simply changed the EOL to CRLF so that it shows "correctly" in notepad.exe


[1] - Technically, they are valid according to csv specs

█████████████████████████
████▐██▄█████████████████
████▐██████▄▄▄███████████
████▐████▄█████▄▄████████
████▐█████▀▀▀▀▀███▄██████
████▐███▀████████████████
████▐█████████▄█████▌████
████▐██▌█████▀██████▌████
████▐██████████▀████▌████
█████▀███▄█████▄███▀█████
███████▀█████████▀███████
██████████▀███▀██████████
█████████████████████████
.
BC.GAME
▄▄░░░▄▀▀▄████████
▄▄▄
██████████████
█████░░▄▄▄▄████████
▄▄▄▄▄▄▄▄▄██▄██████▄▄▄▄████
▄███▄█▄▄██████████▄████▄████
███████████████████████████▀███
▀████▄██▄██▄░░░░▄████████████
▀▀▀█████▄▄▄███████████▀██
███████████████████▀██
███████████████████▄██
▄███████████████████▄██
█████████████████████▀██
██████████████████████▄
.
..CASINO....SPORTS....RACING..
█░░░░░░█░░░░░░█
▀███▀░░▀███▀░░▀███▀
▀░▀░░░░▀░▀░░░░▀░▀
░░░░░░░░░░░░
▀██████████
░░░░░███░░░░
░░█░░░███▄█░░░
░░██▌░░███░▀░░██▌
░█░██░░███░░░█░██
░█▀▀▀█▌░███░░█▀▀▀█▌
▄█▄░░░██▄███▄█▄░░▄██▄
▄███▄
░░░░▀██▄▀


▄▄████▄▄
▄███▀▀███▄
██████████
▀███▄░▄██▀
▄▄████▄▄░▀█▀▄██▀▄▄████▄▄
▄███▀▀▀████▄▄██▀▄███▀▀███▄
███████▄▄▀▀████▄▄▀▀███████
▀███▄▄███▀░░░▀▀████▄▄▄███▀
▀▀████▀▀████████▀▀████▀▀
LoyceV
Legendary
*
Offline Offline

Activity: 3304
Merit: 16596


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
August 12, 2020, 03:30:30 AM
 #9

It shows up fine in Notepad++... but by displaying the "hidden" whitespace, we can see it only has LF or "Line Feed" characters for "End of Line"... this is a Unix/Linux standard. However, Windows (and notepad.exe) uses "CR LF"... aka "Carriage Return, Line Feed".
I don't use Windows, but I'm amazed after 37 years notepad still can't process Linux new lines. Last time I checked (years ago), Wordpad could handle it.
I could easily convert the files to "Windows new lines", but that would mean adding even more files. id.txt is the only file without actual commas in it.

pooya87
Legendary
*
Offline Offline

Activity: 3444
Merit: 10546



View Profile
August 12, 2020, 03:52:57 AM
Merited by LoyceV (4), HCP (2), ABCbits (1)
 #10

I don't use Windows, but I'm amazed after 37 years notepad still can't process Linux new lines. Last time I checked (years ago), Wordpad could handle it.

the screenshot above looks like Windows 7 which only supported the default windows new line (\r\n) which is not a bad thing IMO but they added the support for Unix specific new line encoding (\n), but the newer versions of Windows (although i don't use them myself) have added the functionality 2 years ago.

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
HCP
Legendary
*
Offline Offline

Activity: 2086
Merit: 4316

<insert witty quote here>


View Profile
August 12, 2020, 06:03:40 AM
Last edit: November 15, 2023, 01:06:50 AM by HCP
 #11

...but the newer versions of Windows (although i don't use them myself) have added the functionality 2 years ago.
Wouldn't you know it... I didn't even look in notepad... hahaha



It even happily tells you that it has Unix EOL encoding! Been so long since I actually bothered using notepad Tongue Roll Eyes



█████████████████████████
████▐██▄█████████████████
████▐██████▄▄▄███████████
████▐████▄█████▄▄████████
████▐█████▀▀▀▀▀███▄██████
████▐███▀████████████████
████▐█████████▄█████▌████
████▐██▌█████▀██████▌████
████▐██████████▀████▌████
█████▀███▄█████▄███▀█████
███████▀█████████▀███████
██████████▀███▀██████████
█████████████████████████
.
BC.GAME
▄▄░░░▄▀▀▄████████
▄▄▄
██████████████
█████░░▄▄▄▄████████
▄▄▄▄▄▄▄▄▄██▄██████▄▄▄▄████
▄███▄█▄▄██████████▄████▄████
███████████████████████████▀███
▀████▄██▄██▄░░░░▄████████████
▀▀▀█████▄▄▄███████████▀██
███████████████████▀██
███████████████████▄██
▄███████████████████▄██
█████████████████████▀██
██████████████████████▄
.
..CASINO....SPORTS....RACING..
█░░░░░░█░░░░░░█
▀███▀░░▀███▀░░▀███▀
▀░▀░░░░▀░▀░░░░▀░▀
░░░░░░░░░░░░
▀██████████
░░░░░███░░░░
░░█░░░███▄█░░░
░░██▌░░███░▀░░██▌
░█░██░░███░░░█░██
░█▀▀▀█▌░███░░█▀▀▀█▌
▄█▄░░░██▄███▄█▄░░▄██▄
▄███▄
░░░░▀██▄▀


▄▄████▄▄
▄███▀▀███▄
██████████
▀███▄░▄██▀
▄▄████▄▄░▀█▀▄██▀▄▄████▄▄
▄███▀▀▀████▄▄██▀▄███▀▀███▄
███████▄▄▀▀████▄▄▀▀███████
▀███▄▄███▀░░░▀▀████▄▄▄███▀
▀▀████▀▀████████▀▀████▀▀
LoyceV
Legendary
*
Offline Offline

Activity: 3304
Merit: 16596


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
August 12, 2020, 08:34:58 AM
Last edit: August 12, 2020, 08:46:06 AM by LoyceV
 #12

newer versions of Windows (although i don't use them myself) have added the functionality 2 years ago.
Wow! Let me quote microsoft.com:
Quote
For many years, Windows Notepad only supported text documents containing Windows End of Line (EOL) characters
~snip~
Today, we’re excited to announce that we have fixed this issue!
So I was wrong, it didn't take 37 years, but only 35 years. Such progress Cheesy

kristiano92 (OP)
Jr. Member
*
Offline Offline

Activity: 68
Merit: 1


View Profile
August 12, 2020, 08:49:38 AM
 #13

I wonder how to download bitcoin data to csv files. I would like to get all blocks, transactions in csv files. What is the fastest way to achieve this?

Been there, done that. Trust me unless your searching skill is way better mine then LoyceV's blockdata will be your best source, alternatively, you can download the "Blockchair Database Dumps" from blockchair, but if you don't have a script/bot to do that for you, then you will need to manually download 10 years worth of DAILY data, assuming we never had a whole day without finding a block then you are going to download about 3650 files, so I suggest you stick to the former option.

Keep in mind that LoyceV has that data in a .txt file, so for you to convert it to CSV you going to need a workaround, and the easiest one would be using MS Excel.


It is really great work by LoyceV but it hasn't resolved my problem. I want to download all transactions with input and output addresses. I only tried bitcoin-etl (https://github.com/blockchain-etl/bitcoin-etl) but I calculated that it takes more than one month to download it and save to small csv files (e.g. 50-150 blocks in one).  Do you know something else ?
LoyceV
Legendary
*
Offline Offline

Activity: 3304
Merit: 16596


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
August 12, 2020, 09:22:09 AM
Last edit: June 07, 2023, 08:18:36 AM by LoyceV
 #14

I want to download all transactions with input and output addresses.
You can use Blockchair Database Dumps to download all transactions, inputs and outputs. There's a catch though: you'll only get 100 kB/s. The last one ("outputs") is 106 GB in size, and I've downloaded it already. I think "inputs" is 3-4 times larger, so around 350 GB. That's going to take 40 days to download at that speed.
I'd love to add this data to my "collection" too, but I don't have the server space for it. Update: See Bitcoin block data (728 GB): inputs, outputs and transactions.

I only tried bitcoin-etl (https://github.com/blockchain-etl/bitcoin-etl) but I calculated that it takes more than one month to download it and save to small csv files (e.g. 50-150 blocks in one).
A fast cloud server might help. Converting small CSV files into larger files shouldn't be much of a problem.

Can you share what you're trying to do with this data?




HCP
Legendary
*
Offline Offline

Activity: 2086
Merit: 4316

<insert witty quote here>


View Profile
August 12, 2020, 12:50:01 PM
 #15

Theoretically, it should be possible to sync up a full node (with indexing switched on)... and then write (or possibly find) a block parser that can read through the raw block files and extract the data you require.

Or instead of parsing the block files, you could try and run one of the open-source "block explorers" and then use the API to query your local block explorer to retrieve the block/transaction data and output as CSV.

Honestly not sure which way would be quickest/easiest... although have the node setup and synced along with block explorer would be "reusable" if you needed to update the data etc.

█████████████████████████
████▐██▄█████████████████
████▐██████▄▄▄███████████
████▐████▄█████▄▄████████
████▐█████▀▀▀▀▀███▄██████
████▐███▀████████████████
████▐█████████▄█████▌████
████▐██▌█████▀██████▌████
████▐██████████▀████▌████
█████▀███▄█████▄███▀█████
███████▀█████████▀███████
██████████▀███▀██████████
█████████████████████████
.
BC.GAME
▄▄░░░▄▀▀▄████████
▄▄▄
██████████████
█████░░▄▄▄▄████████
▄▄▄▄▄▄▄▄▄██▄██████▄▄▄▄████
▄███▄█▄▄██████████▄████▄████
███████████████████████████▀███
▀████▄██▄██▄░░░░▄████████████
▀▀▀█████▄▄▄███████████▀██
███████████████████▀██
███████████████████▄██
▄███████████████████▄██
█████████████████████▀██
██████████████████████▄
.
..CASINO....SPORTS....RACING..
█░░░░░░█░░░░░░█
▀███▀░░▀███▀░░▀███▀
▀░▀░░░░▀░▀░░░░▀░▀
░░░░░░░░░░░░
▀██████████
░░░░░███░░░░
░░█░░░███▄█░░░
░░██▌░░███░▀░░██▌
░█░██░░███░░░█░██
░█▀▀▀█▌░███░░█▀▀▀█▌
▄█▄░░░██▄███▄█▄░░▄██▄
▄███▄
░░░░▀██▄▀


▄▄████▄▄
▄███▀▀███▄
██████████
▀███▄░▄██▀
▄▄████▄▄░▀█▀▄██▀▄▄████▄▄
▄███▀▀▀████▄▄██▀▄███▀▀███▄
███████▄▄▀▀████▄▄▀▀███████
▀███▄▄███▀░░░▀▀████▄▄▄███▀
▀▀████▀▀████████▀▀████▀▀
kristiano92 (OP)
Jr. Member
*
Offline Offline

Activity: 68
Merit: 1


View Profile
August 13, 2020, 07:39:21 AM
 #16

It sounds like XY problem, why do you specifically need those datas in csv format? If you're willing to share what you're trying to solve, some member might able to give better solution.

I want to collect all inputs and outputs of transactions with needed parameters. My idea is to create something like walletexplorer but with different heuristics. I have aws server with btc node. This data could be inother formats like json, xls. I could always convert it.
TheArchaeologist
Sr. Member
****
Offline Offline

Activity: 310
Merit: 727


---------> 1231006505


View Profile WWW
August 13, 2020, 10:14:46 AM
Merited by LoyceV (6), ABCbits (2), pooya87 (1), Heisenberg_Hunter (1)
 #17

It sounds like XY problem, why do you specifically need those datas in csv format? If you're willing to share what you're trying to solve, some member might able to give better solution.

I want to collect all inputs and outputs of transactions with needed parameters. My idea is to create something like walletexplorer but with different heuristics. I have aws server with btc node. This data could be inother formats like json, xls. I could always convert it.
You could always interact with the full node you are running, just be sure to set the  txindex=1 value.

You can create your own program to do something along the lines like this:
Code:
#get hash of block you want to parse (600000 in this example:
bitcoin-cli getblockhash 600000
Returns value: 00000000000000000007316856900e76b4f7a9139cfbfba89842c8d196cd5f91

#get all txids in the block
bitcoin-cli getblock 00000000000000000007316856900e76b4f7a9139cfbfba89842c8d196cd5f91
Returns json object with the hashes of all transactions in that block in "tx"

#Loop through every value found, in this example I chose the 10th txid, use that to get the rawtransaction and in turn decode it:
bitcoin-cli getrawtransaction 72ff0008adcf3100a20e76333554d2fb9341820266456c4aeb089d328a001a5d
Returns the raw transaction, decode it in the next step:
bitcoin-cli decoderawtransaction 010000000001025a6d6ab8d1142df6709e04cb649a580831280f266d65ea5a052f89987e825cb30700000023220020fdab9a48cf9e636fc3a352486cb4ea2e3209eefe560b7950c18c06a41624dd75ffffffff9335593b4b4d923d0e809c07d2221670f1b4a46a8d64227b55185f889d6cb0ef00000000232200209363ea25f0c3fdf47b9c76ab0fc21bd44e910a504ff748d33d265908842b9ab9ffffffff02e09da3010000000017a914f975e09c756c209b6e9d7031a8a330a91f3fb20a8744d503000000000017a914a6c3f581c093cabe4644c948ab3ff8f966d57d588704004830450221009733ddafbe721c82aaf78aba6b9ba18d6a063e5393882e8073e93f5c22c75aef022023038d94ea67c5eb4d1a75eebda5b256cf0d934e20cc723b8ceb9a4c152cf73a014730440220789266c818bb2ecc9d0f1836ab7bb0179df4abc5f772b8cc96b7778eaeef3d3902201756a2a86b23041397eb7e3c2059156c9b0f037a066305ae0529d739049ab5620147522103607df524b5afee15ece0e87b362fa66a698488919dac93faf11b27273d2b224121030adf9b1d50dfe31367f033923458d5b2488d2ee92d7c806411b517eb021d960352ae040047304402205924875cd609cba315ba70a1edd7b17eac181a4fe5eab9e48884219ca453fcd7022001b6e89f91cfb7427f0aefb13e77ca3da84f611d07bbff7cdc4dd58fb705a435014730440220386974a306a111da6bb22256de3defe911d936a98faa0d0a47616fed948c35e202207f1459c69f4b60d821ab88df5baf3ba019ae7894faba2fd349eb70dec7a5cbb50147522102a03bb11024359133088573cf644941d90451fcd4a71375556735fb42c6ea0b2a21030adf9b1d50dfe31367f033923458d5b2488d2ee92d7c806411b517eb021d960352ae00000000

Returns this JSON:
{
  "txid": "72ff0008adcf3100a20e76333554d2fb9341820266456c4aeb089d328a001a5d",
  "hash": "5c5383609de5c12519f330fa6751326a5b99d799b96a3d9db879d835f15d948e",
  "version": 1,
  "size": 665,
  "vsize": 336,
  "weight": 1343,
  "locktime": 0,
  "vin": [
    {
      "txid": "b35c827e98892f055aea656d260f283108589a64cb049e70f62d14d1b86a6d5a",
      "vout": 7,
      "scriptSig": {
        "asm": "0020fdab9a48cf9e636fc3a352486cb4ea2e3209eefe560b7950c18c06a41624dd75",
        "hex": "220020fdab9a48cf9e636fc3a352486cb4ea2e3209eefe560b7950c18c06a41624dd75"
      },
      "txinwitness": [
        "",
        "30450221009733ddafbe721c82aaf78aba6b9ba18d6a063e5393882e8073e93f5c22c75aef022023038d94ea67c5eb4d1a75eebda5b256cf0d934e20cc723b8ceb9a4c152cf73a01",
        "30440220789266c818bb2ecc9d0f1836ab7bb0179df4abc5f772b8cc96b7778eaeef3d3902201756a2a86b23041397eb7e3c2059156c9b0f037a066305ae0529d739049ab56201",
        "522103607df524b5afee15ece0e87b362fa66a698488919dac93faf11b27273d2b224121030adf9b1d50dfe31367f033923458d5b2488d2ee92d7c806411b517eb021d960352ae"
      ],
      "sequence": 4294967295
    },
    {
      "txid": "efb06c9d885f18557b22648d6aa4b4f1701622d2079c800e3d924d4b3b593593",
      "vout": 0,
      "scriptSig": {
        "asm": "00209363ea25f0c3fdf47b9c76ab0fc21bd44e910a504ff748d33d265908842b9ab9",
        "hex": "2200209363ea25f0c3fdf47b9c76ab0fc21bd44e910a504ff748d33d265908842b9ab9"
      },
      "txinwitness": [
        "",
        "304402205924875cd609cba315ba70a1edd7b17eac181a4fe5eab9e48884219ca453fcd7022001b6e89f91cfb7427f0aefb13e77ca3da84f611d07bbff7cdc4dd58fb705a43501",
        "30440220386974a306a111da6bb22256de3defe911d936a98faa0d0a47616fed948c35e202207f1459c69f4b60d821ab88df5baf3ba019ae7894faba2fd349eb70dec7a5cbb501",
        "522102a03bb11024359133088573cf644941d90451fcd4a71375556735fb42c6ea0b2a21030adf9b1d50dfe31367f033923458d5b2488d2ee92d7c806411b517eb021d960352ae"
      ],
      "sequence": 4294967295
    }
  ],
  "vout": [
    {
      "value": 0.27500000,
      "n": 0,
      "scriptPubKey": {
        "asm": "OP_HASH160 f975e09c756c209b6e9d7031a8a330a91f3fb20a OP_EQUAL",
        "hex": "a914f975e09c756c209b6e9d7031a8a330a91f3fb20a87",
        "reqSigs": 1,
        "type": "scripthash",
        "addresses": [
          "3QS3PaF9Z9jcTaULo91bq4BeU8Ps868X2W"
        ]
      }
    },
    {
      "value": 0.00251204,
      "n": 1,
      "scriptPubKey": {
        "asm": "OP_HASH160 a6c3f581c093cabe4644c948ab3ff8f966d57d58 OP_EQUAL",
        "hex": "a914a6c3f581c093cabe4644c948ab3ff8f966d57d5887",
        "reqSigs": 1,
        "type": "scripthash",
        "addresses": [
          "3Gtnqrjt11dKZXyxAyY3qhMecbfU5bKD31"
        ]
      }
    }
  ]
}

You can then grab all the data you want from the JSON-file and store it in a way you like. BTW: Inputs are in the "vin"-block and outputs are in the "vout"-block.

Sooner or later you're going to realize, just as I did, that there's a difference between knowing the path and walking the path
Pages: [1]
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!