Bitcoin Forum
Topic: A publicly queryable database of info scraped from the blockchain.  (Read 2291 times)
fbueller (OP)
March 22, 2014, 01:42:31 PM
 #1

I want to propose something that would hopefully open some doors for researchers and analysts who may not be able to code at this time but want to process the blockchain for stats, trends, and other info unavailable through services like blockchain.info. I'm not looking for another block explorer; I want something that people can query for big data.

Does anyone want to help out with this? I plan to import the blockchain into a MySQL database, storing basic block information (the output from getblock) and the same for transactions. I'd also build up a list of the addresses related to each transaction.
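
A minimal sketch of what such a schema could look like, not a settled design. The table and column names are hypothetical, and SQLite stands in for MySQL so the example runs on its own. The block dict is assumed to carry the hash, height, time, previousblockhash and tx fields that getblock returns.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative.
SCHEMA = """
CREATE TABLE blocks (
    hash      TEXT PRIMARY KEY,  -- block hash (hex)
    height    INTEGER NOT NULL,
    time      INTEGER NOT NULL,  -- block timestamp, Unix seconds
    prev_hash TEXT               -- NULL for the genesis block
);
CREATE TABLE transactions (
    txid       TEXT NOT NULL,
    block_hash TEXT NOT NULL REFERENCES blocks(hash),
    PRIMARY KEY (txid, block_hash)
);
CREATE TABLE tx_addresses (
    txid    TEXT NOT NULL,
    address TEXT NOT NULL,
    PRIMARY KEY (txid, address)
);
"""


def create_db(path=":memory:"):
    """Open a database and create the three tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store_block(conn, block):
    """Store a dict shaped like getblock output, plus its list of txids."""
    conn.execute(
        "INSERT INTO blocks VALUES (?, ?, ?, ?)",
        (block["hash"], block["height"], block["time"],
         block.get("previousblockhash")),
    )
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [(txid, block["hash"]) for txid in block["tx"]],
    )
    conn.commit()


def store_tx_addresses(conn, txid, addresses):
    """Record which addresses a transaction touches (duplicates ignored)."""
    conn.executemany(
        "INSERT OR IGNORE INTO tx_addresses VALUES (?, ?)",
        [(txid, addr) for addr in addresses],
    )
    conn.commit()
```

Keeping addresses in their own link table means "which transactions touched this address" becomes a single indexed lookup rather than a scan over decoded transactions.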

I expect this would be an unwieldy amount of data, so input from people would be key. I'd also need help supporting it: a dedicated machine or VPS that could manage this would be required, and it would be great if people donated to keep it afloat.

I have made a start on coding some things that would make importing the blockchain a lot quicker. A function that decodes a raw transaction directly, instead of asking bitcoind, has drastically sped up my parsing times. I need to add some more rules so it recognizes unusual transactions. It isn't incredibly flexible yet, but it recognizes pay-to-pubkey-hash and pay-to-script-hash outputs. I'm tempted to write something that would just parse the blkxxxxx.dat files.
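
To illustrate the kind of rule involved, here is a rough sketch of matching the two standard output templates on raw script bytes. It is not the code described above; the function name and return shape are made up for the example.

```python
def classify_output_script(script: bytes):
    """Return (kind, hash160) for a raw output script.

    Only the two standard templates are recognised; anything else is
    reported as ("unknown", None).
    """
    # P2PKH: OP_DUP OP_HASH160 <push 20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    if (len(script) == 25
            and script[:3] == b"\x76\xa9\x14"
            and script[23:] == b"\x88\xac"):
        return "p2pkh", script[3:23]
    # P2SH: OP_HASH160 <push 20 bytes> OP_EQUAL
    if (len(script) == 23
            and script[:2] == b"\xa9\x14"
            and script[22] == 0x87):
        return "p2sh", script[2:22]
    return "unknown", None
```

Matching fixed byte patterns like this is fast, but every unusual script falls through to "unknown", which is why more rules are needed before a full import.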

I think it would be valuable to have a public query engine for questions like 'how many P2SH addresses are being used today, and how has this changed over time?'. Or a visualization showing how transaction consensus has changed over time. Or other creative questions that I wouldn't think to ask. I know there are a few blockchain forensics/analytics companies popping up, but a community-led project focused on gleaning insight from the blockchain would be great.
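
As a sketch of how the P2SH question might be asked once the data is in SQL: the outputs table and its columns below are assumptions made for illustration, and SQLite is used so the example is self-contained.

```python
import sqlite3


def make_outputs_table(conn):
    """Create a hypothetical table with one row per transaction output."""
    conn.execute(
        """CREATE TABLE outputs (
               address     TEXT NOT NULL,
               script_type TEXT NOT NULL,    -- e.g. 'p2pkh' or 'p2sh'
               block_time  INTEGER NOT NULL  -- Unix seconds
           )"""
    )


def p2sh_addresses_per_day(conn):
    """Count distinct P2SH addresses receiving outputs on each UTC day."""
    rows = conn.execute(
        """SELECT date(block_time, 'unixepoch') AS day,
                  COUNT(DISTINCT address)
           FROM outputs
           WHERE script_type = 'p2sh'
           GROUP BY day
           ORDER BY day"""
    )
    return rows.fetchall()
```

A researcher who knows SQL but not the transaction format could run or adapt a query like this directly, which is the point of exposing the data this way.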

Thoughts?

Bitwasp Developer.
jonald_fyookball
March 22, 2014, 02:13:33 PM
 #2

MySQL and big data don't mix. Try something like MongoDB.

fbueller (OP)
March 22, 2014, 02:21:50 PM
 #3

I'll bear that in mind, thanks! I'll stay away from importing everything until I can interpret most of the transaction/block formats. I don't think coding this by hand is a good idea, though, given all the 'bugs' that have to be reproduced because consensus enforces them.

Is anyone familiar enough with the Satoshi code that they could write a function to record each transaction to a different db during a reindex? I'd chip in a few bit-bucks, and hopefully others can too.

BitCoinDream
March 22, 2014, 03:47:03 PM
 #4

Please let me know when you are ready for public testing. I'll give it a shot, because I was planning a similar thing but am short of resources in terms of money and time.

alphageek
March 22, 2014, 04:26:12 PM
 #5

There are quite a few open-source blockchain parsers available on GitHub: blockparser, bitcoin-abe, block-browser. These can dump data into SQL. MySQL/PostgreSQL will handle it fine, but expect about 60 GB+ of storage and anywhere from half a day to several days of processing.
Some stats about the script types in use are available at http://webbtc.com/stats.
There is also an up-to-date SQL dump available at http://dumps.webbtc.com/bitcoin/
jgarzik
March 22, 2014, 04:39:22 PM
 #6

Sounds like you want something like the open source Insight: http://insight.bitcore.io/

Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
removebeforeflight
March 23, 2014, 10:06:51 PM
 #7

Google '256bytes that changed the world' and follow the John Radcliffe link. The work has been done. I'm considering using the Pentaho ETL tool to load into Postgres. I don't think we have hit big data yet. PM me.
Taras
April 06, 2014, 03:42:43 AM
 #8

The Insight link above (http://insight.bitcore.io/) has moved. You must now use http://live.insight.is/