Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: lucif on May 07, 2013, 09:59:53 AM



Title: Bitcoin addon: Distributed block chain storage
Post by: lucif on May 07, 2013, 09:59:53 AM
Hi. I am the brain center of the DIANNA Project (https://dianna-project.org) :) My project is currently stuck on a future scalability problem. DIANNA needs an easily accessible Bitcoin block chain indexed by block hash and transaction hash.

This is a very big problem for local computers: it's [giga|tera|peta]bytes of data being constantly read.

I know the Bitcoin community also has worries about the Bitcoin database size and its load in the future.

I think we may solve this problem together once and for all.

What I need is to design and implement a distributed, redundant, DHT-like (HASH=[block/transaction body]) storage for Bitcoin. Each network participant will store some piece of the block chain and respond to inbound queries.

The benefit for most network participants: there will be no need to store 100% of the data locally or to read 100% of it locally. Each will serve his own piece of the chain. And as the network grows, storage size and read capacity will also grow.
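
To illustrate the idea of each participant serving its own piece of the chain, here is a minimal Python sketch of Kademlia-style placement: a block or transaction is stored on the nodes whose IDs are closest, by XOR distance, to its hash. The function names, the replication factor of 3, and the way node IDs are derived are assumptions made for illustration; none of them come from existing Bitcoin or DIANNA code.

```python
import hashlib
from typing import List


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for blocks and transactions."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def xor_distance(node_id: bytes, key: bytes) -> int:
    """Kademlia-style distance between a node ID and a storage key."""
    return int.from_bytes(node_id, "big") ^ int.from_bytes(key, "big")


def responsible_nodes(key: bytes, node_ids: List[bytes], replicas: int = 3) -> List[bytes]:
    """Return the `replicas` nodes closest to `key`; they store that item."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key))[:replicas]


# Hypothetical network of 20 nodes with 256-bit IDs.
nodes = [hashlib.sha256(f"node-{i}".encode()).digest() for i in range(20)]

# The key is the hash of the stored body (HASH=[block/transaction body]).
block_key = sha256d(b"example block body")
holders = responsible_nodes(block_key, nodes)
print([h.hex()[:8] for h in holders])
```

Because the mapping depends only on the key and the known node IDs, any participant can work out whom to ask for a given hash, and adding nodes spreads both storage and read load.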

I think it is okay to put block bodies into DHT-like storage and leave only block headers locally as a trusted chain skeleton.

I have no experience with C and DHT. Could someone participate in this? I can sponsor this task. And if the community needs this feature, other sponsors are welcome.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: BenTuras on May 07, 2013, 10:16:00 AM
Sounds like a nice job for http://hadoop.apache.org/ ?


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: lucif on May 07, 2013, 10:18:45 AM
No. This should be a custom DHT implementation for an untrusted network.

Each client should store some number of blocks from its local headers, not arbitrary data.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: proff on May 07, 2013, 10:26:45 AM
Why not adapt existing solutions for this? Plenty of distributed key-value storage software is freely available. Is there some special requirement in your case that is not satisfied?


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: lucif on May 07, 2013, 10:32:41 AM
Yes, I think we will need to adapt some existing DHT solution with custom modifications for redundancy and a storage filter (it should store only the part of the chain listed in local headers plus some small percentage of arbitrary orphaned blocks/transactions).


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: Sukrim on May 07, 2013, 01:13:53 PM
Isn't Bloom filtering already enough for your project? If you need to trust third parties anyway, you could just write an SPV client that accesses the chain with Bloom filters when it needs specific transactions. This is already implemented in bitcoin-qt.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on June 11, 2013, 03:41:02 AM
It's probably possible to store blocks in a DHT with some kind of reasonable response time.  These structures are designed to store data of this capacity.

For many applications though, you need to keep a copy of the block in local storage so you can validate the chain, thus there is a good probability that the block is available at a given node.  From here you need to determine 1) how to locate the block efficiently and 2) which node is appropriate to poll; for instance, balancing the load across all nodes that have a given block is optimal.

It's certainly possible to optimize the basic DHT algorithm, which is designed for general use.  You might also need to implement a broadcast function for the transactions.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: lucif on June 11, 2013, 05:31:37 AM
For many applications though, you need to keep a copy of the block in local storage so you can validate the chain
For initial validation, yes: the client has to download each block via the DHT and build the chain of headers. But it really doesn't need to store each block body, only the headers as a trusted chain. The block bodies themselves can be distributed in untrusted DHT storage, since each client has the local chain headers and a modified block cannot be accepted (its hash will change).
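
A minimal Python sketch of why an untrusted store is enough once the headers are trusted: the 80-byte block header commits to the transactions through its Merkle root (bytes 36 to 68), so a body fetched from any peer can be checked locally. The helper names and the toy "transactions" (arbitrary byte strings rather than real serialized transactions) are assumptions for illustration only.

```python
import hashlib
import struct
from typing import List


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin block and transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids: List[bytes]) -> bytes:
    """Bitcoin-style Merkle root: hash pairs, duplicating the last odd entry."""
    if not txids:
        raise ValueError("a block has at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def body_matches_header(header: bytes, raw_txs: List[bytes]) -> bool:
    """Check a body from untrusted storage against a locally trusted header."""
    if len(header) != 80 or not raw_txs:
        return False
    txids = [sha256d(tx) for tx in raw_txs]
    return merkle_root(txids) == header[36:68]


# Toy data: version, previous hash, Merkle root, time, bits, nonce.
raw_txs = [b"toy-tx-1", b"toy-tx-2", b"toy-tx-3"]
root = merkle_root([sha256d(tx) for tx in raw_txs])
header = struct.pack("<I32s32sIII", 1, bytes(32), root, 0, 0, 0)

genuine = body_matches_header(header, raw_txs)
forged = body_matches_header(header, [b"toy-tx-1", b"forged", b"toy-tx-3"])
print(genuine, forged)
```

The sketch only checks that the body belongs to the header; whether the header itself is part of the best chain is decided separately from the locally stored header chain.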

Can anyone start implementing this? I can be the first donor for this task.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on June 11, 2013, 03:41:57 PM
I've spoken to a few DHT experts about this, and most of them are optimistic about the idea.  The problem is basically how to optimize the Distributed Hash Table algorithm for block storage.  You can certainly optimize the algorithm in this case.  BTW, is there any good documentation on how the current relay mechanism works for Bitcoin?  Several people have asked me for this and I couldn't find anything.

BTW, I notice many on this board seem to think that 'developers' are some kind of cheap resource that you can just conjure up by throwing a few dollars around.  While there might be many people able and willing to write a few lines of code, these kinds of problems are very complex and there are really very few people who are capable of solving them effectively.  I've already seen several projects release code that appears to perform a (much desired) function but fails in very important ways.  Of course these problems won't show until long after it's released and people have invested real money in the system.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: Sukrim on June 11, 2013, 04:42:31 PM
I also don't really see the point in this...
If you're after a full block chain, you will probably be much better off just downloading a torrent for a long time to come.
If you want to have a lite client instead, use ultraprune and/or the SPV+ mode that is in the works.
If you just want to have access to older transactions, use bloom filters.

Also it seems the OP does NOT want to distribute block storage, but rather individual transactions indexed by hash (even if they have been pruned away). To make this somehow secure I guess one needs: TX hash, remaining merkle branch + block hash. Then one can see that this transaction was actually in the block that it claims. Still I am not too sure if a local database or a trusted local key-value store (noSQL) would not be more suited to the task.
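
A minimal Python sketch of the check described above: given a transaction hash, its Merkle branch, and its position in the block, recompute the Merkle root and compare it with the root committed to by the trusted block header. The function name and the toy leaf values are assumptions for illustration; real transaction hashes would replace them.

```python
import hashlib
from typing import List


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used in Bitcoin's Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_merkle_branch(txid: bytes, branch: List[bytes], index: int, root: bytes) -> bool:
    """Walk from the leaf to the root, hashing with one sibling per level.

    `index` is the transaction's position in the block; its low bit tells
    whether the current node is a right child (1) or a left child (0).
    """
    current = txid
    for sibling in branch:
        if index & 1:
            current = sha256d(sibling + current)
        else:
            current = sha256d(current + sibling)
        index >>= 1
    return current == root


# Toy block with three transactions a, b, c (the odd last leaf is paired with itself).
a, b, c = (sha256d(name) for name in (b"tx-a", b"tx-b", b"tx-c"))
ab = sha256d(a + b)
cc = sha256d(c + c)
root = sha256d(ab + cc)

print(verify_merkle_branch(b, [a, cc], 1, root))   # branch for b
print(verify_merkle_branch(c, [c, ab], 2, root))   # branch for c
print(verify_merkle_branch(sha256d(b"tx-x"), [a, cc], 1, root))  # unrelated hash
```

The branch grows only logarithmically with the number of transactions in the block, so a DHT entry carrying it stays small.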


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on June 11, 2013, 05:33:53 PM
In the Confidence Chains system you have MANY block chains, and the users might download several of them for various purposes (if they want to trade an asset or participate in a market or auction).  Thus there must be a useful and robust way to retrieve this data over a P2P network.  Only the identities need to hear about transactions, and you can do this with most DHT implementations.  There is an important question for this system: how do you bootstrap a block chain over a P2P network?  Bitcoin solves this in a very specific way; I need a more generalized solution.

With Bitcoin, it's unclear how best to optimize the P2P network.  To me, the lack of docs on this subject might indicate there are some hidden exploits.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: justusranvier on June 11, 2013, 05:37:32 PM
The problem is basically how to optimize the Distributed Hash Table algorithm for block storage.
Already solved by the Freenet project:

https://freenetproject.org


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on June 11, 2013, 05:40:22 PM
excuse me?


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: justusranvier on June 11, 2013, 05:54:08 PM
excuse me?
Freenet's data store implements a distributed, redundant, content-addressed file system where information remains persistent while storage nodes randomly enter and leave the network, without requiring users to explicitly configure replication because the nodes automatically handle that.

Get rid of the unnecessary (and slow) anonymity layer and Freenet is perfect for storing large content-addressed data sets like the Bitcoin blockchain.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on June 11, 2013, 06:48:02 PM
While Freenet can do that, it's not necessarily PERFECT for the job.

Bitcoin is a special case, because the data is highly redundant (practically every node has every block, or most of the blocks).  You can optimize the DHT algo to take advantage of this.  Freenet is one of many technologies in this class.  Some DHTs concentrate on security, and in Bitcoin there is no need for transport security as data at that level is public (there's nothing in a block anyone wants to hide).

Transaction relays are a different story I think.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: lucif on June 11, 2013, 07:19:46 PM
I also don't really see the point in this...
If you're after a full block chain, you are probably still for very long much better off just downloading a torrent or so.
If you want to have a lite client instead, use ultraprune and/or the SPV+ mode that is in the works.
If you just want to have access to older transactions, use bloom filters.
Even if Bitcoin doesn't need this, alternative chains, or even contracts (trading via chains), will really need indexed access to both Bitcoin blocks and transactions.

So they will require something like an additional Bitcoin client plus a database running nearby. That is overkill.

Maybe it is possible to make an addon to Bitcoin that optionally turns the Bitcoin client into a DHT participant alongside the regular local 100% database.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: justusranvier on June 11, 2013, 07:40:58 PM
Bitcoin is a special case, because the data is highly redundant(practically every node has every block, or most of the blocks).  You can optimize the DHT algo to take advantage of this.
There's no point implementing a new distributed data store unless you're also going to eliminate the need for every node to keep a complete copy of the blockchain.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on June 11, 2013, 09:15:33 PM
The nodes still need the blocks to VALIDATE the transactions.  If we're talking about SPV, then they don't really need the blocks at all, just the headers.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: Sukrim on June 13, 2013, 09:32:54 AM
You don't need all transactions to validate new transactions, only the UTXO set, which will be the work of SPV+ clients (see the "ultimate blockchain compression" topic).
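
A minimal Python sketch of the point that validation needs only the unspent outputs: the UTXO set is modeled as a dictionary from (txid, output index) to value, and a new transaction is accepted only if every input is currently unspent and the outputs do not exceed the inputs. The function name and data layout are assumptions for illustration, and the sketch deliberately ignores scripts, signatures, and coinbase rules.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]  # (txid, output index)


def apply_transaction(utxo: Dict[OutPoint, int], txid: str,
                      inputs: List[OutPoint], output_values: List[int]) -> bool:
    """Validate a transaction against the UTXO set and apply it if valid."""
    # Every input must be distinct and currently unspent.
    if not inputs or len(set(inputs)) != len(inputs):
        return False
    if any(outpoint not in utxo for outpoint in inputs):
        return False
    # Outputs may not create value; any difference is the fee.
    if any(value < 0 for value in output_values):
        return False
    if sum(output_values) > sum(utxo[outpoint] for outpoint in inputs):
        return False
    # Spend the inputs and add the new outputs.
    for outpoint in inputs:
        del utxo[outpoint]
    for index, value in enumerate(output_values):
        utxo[(txid, index)] = value
    return True


utxo = {("aa", 0): 50}
accepted = apply_transaction(utxo, "bb", [("aa", 0)], [30, 19])
double_spend = apply_transaction(utxo, "cc", [("aa", 0)], [10])
print(accepted, double_spend, utxo)
```

Spent outputs never need to be looked up again for validation, which is why the set can stay far smaller than the full chain.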


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: RoadTrain on June 16, 2013, 04:15:20 AM
As far as I understand DIANNA, there's already a solution.
The DIANNA client must include some sort of lite Bitcoin client. It will store the block headers to verify the blockchain,
while a Bloom filter can facilitate the process of verifying DIANNA transactions.

As a DIANNA transaction is linked to a specific Bitcoin transaction, we can simply request this transaction from the Bitcoin network using a Bloom filter.
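
A minimal Python sketch of the Bloom filter idea: the client inserts the transaction hashes it cares about, and a peer tests each transaction against the filter before relaying it. This is a generic filter built on SHA-256 for illustration; it is an assumption of this sketch and not the wire format used by the Bitcoin protocol, which defines its own hash function and parameters.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Probabilistic set: no false negatives, tunable false-positive rate."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 5) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes) -> Iterator[int]:
        # Derive one bit position per seed from a seeded SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(seed.to_bytes(4, "little") + item).digest()
            yield int.from_bytes(digest[:8], "little") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# The client registers the hash of the Bitcoin transaction it is linked to.
wanted = hashlib.sha256(b"hypothetical linked transaction").digest()
bloom = BloomFilter()
bloom.add(wanted)

print(wanted in bloom)  # an inserted item always matches
```

A match only means "possibly relevant", so the client still has to verify the returned transaction against its local block headers.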


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: lucif on June 30, 2013, 10:02:31 AM
Cross post

2. Your 'ultimate storage' grows with more users, but so does the amount of spam produced. It would solve nothing. I like many others would still prefer to store the entire chain.
You answered this yourself in 4. Thousands of HDDs are better than one anyway.

4. Pruning, would remove all spent transactions that are 2(?) transactions back since they wouldnt be needed, dramatically reducing the size of the blockchain. At which point your solution is entirely moot since anyone could store whats left of it without issue.
This is a good solution, but it breaks chain integrity. And who said spent outputs will not be needed by anyone?

5. Storage devices are getting cheaper and larger every day and so is memory. im sure if it were needed at some point in the future someone could build a custom board with a crazy amount of memory on it to store the UTXO set. With the speed memory runs at im sure someone could make a slower, cheaper, larger ramdisks for this purpose.

Everybody blindly repeats this, following satoshi. But satoshi said it regarding storage capacity. HDDs also have one more very important property which nobody takes into account: IO capacity. Soon bitcoind will run out of the IO capacity of spinning HDDs, and later of solid state drives.

I am not proposing to discard the whole local chain. I am proposing not to dig through it locally without need.

I know at least one use case where my solution will bring performance benefits.

I know that DHT storage just moves load from disk IO to network IO. But consider what happens when a new block arrives with thousands of transactions.

With the regular client, EACH NETWORK NODE has to dig into its own local chain and do a key lookup there for each transaction. Thousands or millions of nodes have to do the same hard IO work on each new block.

With the regular client plus DHT enabled, only a few will do this. They will cache the lookup results in a local DHT cache and answer others from there. So in this case only a few nodes perform a local chain lookup, and the results spread through the network mostly as cached answers.
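
A minimal Python sketch of the caching argument: a node answers repeated requests for the same transaction from memory and touches its expensive local index only once. The class name, the dictionary standing in for the on-disk chain index, and the lookup counter are assumptions made for illustration.

```python
from typing import Dict, Optional


class CachingNode:
    """A DHT participant that serves repeated lookups from a local cache."""

    def __init__(self, chain_index: Dict[str, bytes]) -> None:
        self.chain_index = chain_index  # stands in for the on-disk chain index
        self.cache: Dict[str, bytes] = {}
        self.disk_lookups = 0

    def lookup(self, txid: str) -> Optional[bytes]:
        # Cached answers cost no disk IO.
        if txid in self.cache:
            return self.cache[txid]
        # Cache miss: do the expensive local chain lookup once.
        self.disk_lookups += 1
        tx = self.chain_index.get(txid)
        if tx is not None:
            self.cache[txid] = tx
        return tx


node = CachingNode({"tx1": b"raw-tx-1"})
answers = [node.lookup("tx1") for _ in range(1000)]  # 1000 peers ask for tx1
print(node.disk_lookups)
```

In this toy run a thousand requests cost one disk lookup; how well that carries over to a real network depends on how requests are routed and how long cached entries are kept.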

As a bonus, there will be Google-large, peta-scale storage for the whole chain in all its glory.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: pent on June 30, 2013, 10:14:21 AM
A DHT will probably let us go the Google way.

Google always planned never to put major load on a single node (as bitcoind is doing now). It sounds like: "Stay away from single expensive mainframes. None of them will ever handle the load of our tasks. Use many cheap, distributed nodes." This is how BigTable was born. Otherwise they would have been stuck in bottlenecks.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: pent on June 30, 2013, 10:23:54 AM
The main point of a DHT implementation is to let a client run with the full chain even on cheap computers.

As long as you put major load on a single node, the hardware requirements for that node grow along with the chain. This may lead to a situation where only the jet set can run a full chain. Everyone else will have to blindly trust their chain.

This is centralization IMO.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on June 30, 2013, 07:27:35 PM
That's inevitable.  The block chain is currently > 8 GB.  I think things are going to move to an architecture where a limited set of nodes in the network manages the currency, and most account owners are light clients of some kind.  It's simply not practical for everyone to have a block chain.  These distributed currencies are just getting started and Bitcoin is already unmanageable.

My vision is 1) no more proof of work 2) distributed rather than decentralized currencies.  This offers a lot of advantages.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: molecular on July 01, 2013, 05:19:34 AM
The problem is basically how to optimize the Distributed Hash Table algorithm for block storage.
Already solved by the Freenet project:

https://freenetproject.org

excuse me?

Maybe contact xelister and/or da2ce7. They were working on "bitcoin over freenet" 2 years ago.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: gmaxwell on July 01, 2013, 05:33:25 AM
I think things are going to move to an architecture where there a limited set of nodes in the network manage the currency, and most account owners are light clients of some kind.
If you want this, then the Bitcoin block chain and protocol are the wrong design to achieve it. Services like Visa and PayPal are far better designed for serving many transactions from small clusters. They are more secure too: once you must trust a limited set of nodes not to cheat, protocols which cannot be compromised unless those nodes do cheat offer a better security model.


Title: Re: Bitcoin addon: Distributed block chain storage
Post by: bluemeanie1 on July 01, 2013, 08:11:23 AM
A fairly pointless comment.

Visa doesn't have any nodes.  Can you participate in the authorization process of a Visa transaction?