Bitcoin Forum
November 12, 2024, 06:16:50 AM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 3 4 5 6 7 [8] 9 »  All
  Print  
Author Topic: New paper: Accelerating Bitcoin's Trasaction Processing  (Read 36344 times)
eduffield
Legendary
*
Offline Offline

Activity: 1176
Merit: 1036


Dash Developer


View Profile WWW
December 19, 2013, 01:57:23 PM
 #141

To anyone interested in helping implement ghost:

I'm going to start implementing the changes needed to traverse the tree in the reverse direction and then implement GHOST. If anyone wants to help, feel free to submit pull requests:

https://github.com/evan82/bitcoin-ghost

Dash - Digital Cash | dash.org | dashfoundation.io | dashgo.io
LurbQBurdock
Newbie
*
Offline Offline

Activity: 24
Merit: 0



View Profile
December 19, 2013, 04:47:28 PM
 #142

Aviv & Yonatan:  Could GHOST work in the situation that transaction & block propagation across the whole network happens much slower than block generation?  For instance, in inter-planetary distances?  We'd expect blocks to be made on Mars every few seconds but each would not make it to Earth for a few minutes.
halvoraarud
Newbie
*
Offline Offline

Activity: 11
Merit: 0


View Profile
December 20, 2013, 12:04:07 AM
 #143

Aviv & Yonatan:  Could GHOST work in the situation that transaction & block propagation across the whole network happens much slower than block generation?  For instance, in inter-planetary distances?  We'd expect blocks to be made on Mars every few seconds but each would not make it to Earth for a few minutes.

The only mining on Mars should be minerals Smiley
avivz78 (OP)
Newbie
*
Offline Offline

Activity: 21
Merit: 0


View Profile WWW
December 21, 2013, 07:49:15 PM
 #144

Aviv & Yonatan:  Could GHOST work in the situation that transaction & block propagation across the whole network happens much slower than block generation?  For instance, in inter-planetary distances?  We'd expect blocks to be made on Mars every few seconds but each would not make it to Earth for a few minutes.

It would work, in the sense that the 50% attack remains hard to carry out, but the number of transactions per second would be really low. Too low to be of any use.
We're working on an extension that would fix that issue as well. It's a bit too early to tell if we'll be able to close all the holes. I haven't thought of it as a solution for an interplanetary currency though  Smiley
King Lear
Newbie
*
Offline Offline

Activity: 23
Merit: 0


View Profile
January 05, 2014, 03:17:29 PM
 #145

Hi,

Let's reexamine proposition 8.1 and claim 8.2 of the paper, that say there will always eventually be a convergence ("collapse") to the next chain block and this convergence takes in expectation a finite time.

The proof is based on the fact that for any finite interval of time, there is a positive constant probability of the event where no new block is created anywhere in the network during a period of time of this interval's length. The problem is that you implicitly assume all miners immediately release any block they have found, and we know this is untrue. So you cannot simply derive that on some point all the nodes in the network will have exactly the same information without assuming that all nodes are honest!

If the average time of creating a new block in the tree is longer than the network diameter, your proof is nevertheless valid. Else, theoretically there is a positive probability that the network will never come to a point of sharing exactly the same information, since withholded blocks might be released once every d seconds where d is the network's diameter.
 
Now let me describe a possible DoS attack that tries to prevent convergence. We take the approach of the paper to have a simplified network of only two honest nodes, while noting that a similar attack is applicable to a more complex network. Say we have two honest nodes whose connection having a delay of d that is relatively big compared with the mean time of block creation.

Say we have two more nodes controlled by the same attacker, each one is adjacent to one of the two honest nodes, so that the delay between the pair is negligible compared to d. The two attacker's nodes may be connected by a special secret channel whose delay is lower than d, or may have no direct connection at all. Having such a special connection can only reduce the attacker's required fraction of total hash-power, but this is not crucial for the attack success. (Nevertheless I think it is most probable that such a faster channel may exist, because in reality the attacker can skip validation of directly transmitted data between his nodes, or even send only data about the number of created blocks instead of the block themselves. On the other hand those two honest nodes are actually two "centers" of the network and the delay between them is caused by having a path of at least several nodes between them…).

Let the two honest nodes have about the same hash-power, and the two attacker's nodes have relatively small hash-power compared with the honest nodes. Due to the delay, one honest node is informed only of his blocks and the released blocks of his adjacent attacker, as well as the blocks of the other honest node and the released blocks of the other attacker's node that were not released at the last d seconds. Therefore it is possible that the other honest node is working on a branch that is actually heavier that the branch the first honest node is working on, because first honest node wrongly thinks that his branch is the heavier. The attacker is trying to always fool the honest node that mines on the less heavy branch, so that the honest nodes will never converge to the same branch.

In order to analyze the required hash-power of the attacker needed so that with a positive probability there will never occur a "collapse" (in contradiction to proposition 8.1), we shall slightly modify our model so it will be easier to mathematically analyze it: instead of saying a honest node will continue to mine on his initial branch until a point where the competitive branch has been heavier already d seconds before, we say that he continues to mine on his initial branch unless the other branch is heavier by k blocks or more (ignoring any delays). In a sense having a delay can be shown to be equivalent of having such k > 0. Moreover, on reality there is no such fixed delay d, so I am not sure that the model of having a fixed corresponding k is less realistic than having a fixed delay d.

The attack goes as follows: each of the attacker's nodes keeps mining in secret on top of the branch of its adjacent honest node, and whenever the competitive second branch is reaching a tie from the point of view of the two nodes working on the first branch, meaning it is actually leading by k blocks) the attacker's node release a single block that breaks the tie in favor of its adjacent honest node. So the question is what is the minimal required hash-power of the attacker so that there is a positive probability that he will never run out of blocks when he needs to release a new block? (In fact the attacker might temporarily run out of blocks, so the network will temporarily collapse to one of the branches, but the attacker may later succeed mining the required block to reverse the collapse. We ignore that for simplicity).

Mathematically, we have a random walking on the 2k-1 numbers: 0, +-1, +-2, … +-(k-1). Each step is forward with probability 1/2 and backward with probability 1/2, where "forward" of (k-1) is regarded as (k-1) and backward of -(k-1) is regarded as -(k-1), as long as the attacker hasn’t run out of secret blocks. This simple Markov process can be shown to define a uniform distribution over the set of legal positions, therefore each of the attacker's nodes have to release a single block every 2(2k-1) blocks in expectation. So the minimum required hash rate of the attacker is 1/(2k-1) of the total honest hash-power, or 1/2k of the total network's hash-power.

The bigger is the delay d with respect to the average time it takes to create a new block, the bigger is k, and the closer to zero is the required hash-power of a successful attacker. Therefore I am afraid that 1 second per block-chain block (which is less than 1 second per a block in the tree!!) is just too much. Yet I am sure GHOST is a revolutionary proposal that can much improve the very low current rate of 10 minutes per block.

The last thing I would like to note is that vulnerability to the same kind of convergence attack isn’t unique to GHOST, but to any system whose delay is relatively high compared with the average time per block. If Bitcoin blocks will be *much* increased (a lot more than the current 1MB maximum) and as a result the delay will increase too, it will be possible to attack Bitcoin similarly.

Yet a major difference is that in Bitcoin the attacker cannot keep secret blocks and release them later, as this will have no effect. Therefore the attacker should always mine on top of the shorter chain, and hopes that the difference between the lengths of the competitive chains will not grow too much. Theoretically it can be shown that no matter how much hash-power the attacker has, eventually this gap will reach any natural number. As this might take a very long time, I still consider such an attack valid.


Lear
King Lear
Newbie
*
Offline Offline

Activity: 23
Merit: 0


View Profile
January 05, 2014, 03:26:15 PM
 #146

Hi,

I have post my suggestion for GHOST implementation, aiming to solve the centralization/unfairness of mining problem, and to reduce the vulnerability to convergence attacks (by making it much more difficult to effectively release old-mined blocks). Here is the link to the thread, and I will appreciate any of your comments:  https://bitcointalk.org/index.php?topic=396350.msg4278073#msg4278073.
 
Lear.
coinrevo
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
January 05, 2014, 05:05:10 PM
 #147

Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?
maaku
Legendary
*
expert
Offline Offline

Activity: 905
Merit: 1012


View Profile
January 05, 2014, 06:41:36 PM
 #148

Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

It's done based on local information.

I'm an independent developer working on bitcoin-core, making my living off community donations.
If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP
coinrevo
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
January 05, 2014, 08:39:11 PM
 #149

It's done my magic. All the magic nodes transmit information faster than speed of light. They connect to parallel universes. Each universe is a uni-node, with probability of exactly 1/n. Due to quantum emission wormholes eliminate cost of friction, so miners can get the coinbase reward. IP address assignment by the federal reserve eliminates the so called -31.23% attack. I can prove this theorem using advanced stochastical analysis, but only for lambda smaller than 0.03.
prezbo
Sr. Member
****
Offline Offline

Activity: 430
Merit: 250


View Profile
January 05, 2014, 09:16:26 PM
 #150

Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

Blocks are propagated across the network just like transactions are, so where exactly do you see a problem?
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
January 05, 2014, 10:46:43 PM
 #151

Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

Block headers could be broadcast for orphaned blocks.  However, it is possible that 2 nodes will have different headers accumulated.

With a linear blockchain, proof of work is always exactly agreed by all miners.

Since all block headers are linked.  All nodes can agree on the proof of work linked to a particular header.  If a node was missing any blocks on that chain, then it wouldn't be able to form the chain at all.

With GHOST, 2 nodes could disagree about the proof of work for a particular chain.  There is no way to be sure that you have all the orphans.

In practice, this isn't really a problem, because the 2 rules almost always give the same answer.

If you did push the the block rate to a point where network latency is a problem, you could have 2 chains where there is disagreement.

Ensuring that all headers are well distributed between nodes is critical.  Disagreements are not likely to last long anyway.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
January 05, 2014, 11:42:30 PM
 #152

In practice, this isn't really a problem, because the 2 rules almost always give the same answer.

If this was true then using GHOST with all its added complexities would be pointless.
prezbo
Sr. Member
****
Offline Offline

Activity: 430
Merit: 250


View Profile
January 05, 2014, 11:45:15 PM
 #153

In practice, this isn't really a problem, because the 2 rules almost always give the same answer.

If this was true then using GHOST with all its added complexities would be pointless.
It was probably meant as they give the same answer as things are at the moment, but they wouldn't if the number of blocks per unit of time would be increased significantly.
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
January 06, 2014, 12:48:35 AM
 #154

It was probably meant as they give the same answer as things are at the moment, but they wouldn't if the number of blocks per unit of time would be increased significantly.

Right, with a 10 minute block time, there are so few orphans that there is total agreement.

If the block time was 1 second, then there would be a huge number of orphans.

Some system for quickly finding the leaf block is important.  Each time a header is added, you have to update an entire tree.

An analysis of the optimal strategy for miners under GHOST would be important.

For bitcoin, broadcasting new blocks as fast as possible is optimal.  Is that still true for GHOST.

With GHOST, there would be lots of ties and near ties, so colluding to break ties in favour of a group may give those in the group an advantage.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
asdf
Hero Member
*****
Offline Offline

Activity: 527
Merit: 500


View Profile
January 06, 2014, 01:11:26 AM
 #155


One thing I note is that there are lots of ways to optimise block wire representations. Sending blocks as lists of hashes, for example, would use 32 byte hashes in our current code just because it's lazily reusing the Bloom filtering path. But 32 bytes is massive overkill for this. You only need to distinguish between transactions in the mempool. You could easily drop to 16, 8 or even 4 byte hashes (hash prefixes) and still be able to disambiguate what the peers block message contains very reliably. Indeed they could be varint encoded so the hash is only as long as it needs to be given current traffic loads. Malicious actors might try to create collisions to force block propagation times up, but if peers negotiate random byte subsets upon connect for relaying of block contents then such an attack becomes impossible.


I thought of this also. How do you deal with collisions, though? What if the receiver of a hash list knows a transaction whose prefix collides with one in the list, but isn't in the list itself?

Would you be including the prefixes of the merkle tree nodes too? These would serve as a checksum, so the receiver would at least know which transaction has the colliding prefix. In fact, with the tree nodes included, you could be much less conservative with the prefix length, sending 2 or event 1 byte(s).

Perhaps that makes no sense. I'm pretty noobish with the protocol specifics.

I'm very interested in reducing block propagation times because I believe slow times will eventually drive transaction costs to uncompetitive levels. This is due to the inherent cost of "orphan risk" which comes from including transactions. see here https://bitcointalk.org/index.php?topic=334284.msg3669703#msg3669703.



TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
January 06, 2014, 01:25:46 AM
 #156

Would you be including the prefixes of the merkle tree nodes too? These would serve as a checksum, so the receiver would at least know which transaction has the colliding prefix. In fact, with the tree nodes included, you could be much less conservative with the prefix length, sending 2 or event 1 byte(s).

Collisions could also be handled by having a re-send system.

Hashes were grouped together into 32 hash groups and the merkle root for those hashes.

The sender could estimate how much of a prefix is required.  If 4 bytes was required, then a group of 32 hashes would require (32 * 4 + 32) bytes = 5 bytes per hash.

The receiver could replace the 32 hashes with its best guess and then check the CRC/Merkle root.  If it doesn't match, then it could ask for those hashes in expanded form.

Assuming a block with 4000 transactions in a block.

Peer 1: Sends block with 5 bytes per hash.

This works out at 20kB

Peer 2: Checks block and finds 2 groups with a mismatch

Peer 2: sends tx request

byte[32]: Block hash
varInt: group count
int: index 1
int: index 2

This works out as 41 bytes

Peer 1: Sends full transaction hashes for those blocks

This works out at 64 * 32 = 2048 bytes

This is much less than 4000 * 32 = 128kB.  However, it is at the cost of an additional round trip delay.

If each compressed hash was prefixed by a length, then the sender could try to estimate which hashes require longer prefixes.

The birthday paradox may cause potential issues here.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
avivz78 (OP)
Newbie
*
Offline Offline

Activity: 21
Merit: 0


View Profile WWW
January 06, 2014, 06:53:53 AM
 #157

Could someone explain to me how the nodes should select the heaviest subtree, when they don't have that information? It requires information from all(!) other nodes. How can this even make sense, because one node is connected to say 8 peers and not the entire network?

Block headers could be broadcast for orphaned blocks.  However, it is possible that 2 nodes will have different headers accumulated.

With a linear blockchain, proof of work is always exactly agreed by all miners.

Since all block headers are linked.  All nodes can agree on the proof of work linked to a particular header.  If a node was missing any blocks on that chain, then it wouldn't be able to form the chain at all.

With GHOST, 2 nodes could disagree about the proof of work for a particular chain.  There is no way to be sure that you have all the orphans.

Very simple:

Our proposal includes keeping the hashes of off-chain blocks in each block (new ones not included in any of the ancestors of the block). The chain, when read sequentially therefore contains all off-chain block hashes, and thus any node can know the entire tree (if a node is missing some block it simply requests it or its header).
coinrevo
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
January 06, 2014, 08:20:15 AM
 #158

Response to my critique of GHOST.

Quote
Huh? GHOST isn't different from Bitcoin in this regard.

No, not at all. The essence of bitcoin is: each node works on PoW independently. That's also why there are discrete blocks. As you said bitcoin ignores other nodes PoW - but why? The reason is that once a node requires information from other nodes (during working one block by default) you get an unstable network, because it takes time to transmit information. For example a node could run his own implementation and try to find nodes that do more PoW and disregard other weaker nodes. As soon as you introduce as an input the work of others, one node has to know about all(!) other nodes, so that the calculations are fair. One could argue it is a random distribution, but that assumption clearly doesn't hold. Ideally all peers are equal, but hashing power is not even. The bitcoin algorithm would be much better if it could somehow force nodes to be of equal hashing power. That's whats meant by the "proof-of-work is essentially one-CPU-one-vote" comment in the paper.

To be more clear, by way of practical example. If bitcoin would use a ghost like rule today.. Most mining now is done in certain hot spots, such as the big mine of Iceland [1]. Well, now you have the effect that any other mining is more profitable in Iceland, and miners should move to Iceland because delays are shorter and the heaviest subtree is located there. Simply by being located there you are nearest to the heaviest subtree (assuming that this miner is the biggest). That would introduce the effect that mining would quickly would be located in one or two places, when in fact we want the exact opposite. I suppose the reason that this entirely non-obvious is that people intuitively disregard the latency of speed of light. The other problem is how one looks at robustness of statistics. Traditional mathematics is very bad at understanding robustness, especially in economics. The Ghost paper claims to "prove" attacks are not possible. Note that the original bitcoin paper doesn't do that.

Why is the block time chosen to be 10 minutes? The reason is that it takes time to transmit information through a network. One has to really take into account that TCP/IP packets are not point to point connections. So the GHOST paper uses averages of delay times, but doesn't consider variance. The long 10 minutes is a safeguard against such geo-distribution effects. For example in theory people with bad interconnections can mine as well. The 0.5MB/sec throughput mentioned in the paper is not even achieved in developed countries, and so this leads to some biases the original design very thoughtfully avoids (which is why it is so stable).

[1]
http://dealbook.nytimes.com/2013/12/23/morning-agenda-the-bitcoin-mines-of-iceland/?_r=0
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
January 06, 2014, 08:57:21 AM
 #159

Most mining now is done in certain hot spots, such as the big mine of Iceland [1]. Well, now you have the effect that any other mining is more profitable in Iceland, and miners should move to Iceland because delays are shorter and the heaviest subtree is located there. Simply by being located there you are nearest to the heaviest subtree (assuming that this miner is the biggest). That would introduce the effect that mining would quickly would be located in one or two places, when in fact we want the exact opposite.

Yes, if some miner nodes have better network connectivity to large portions of the total hashpower, then a shorter blocktime will benefit them more. The analysis in the GHOST paper doesn't take into account scenarios in which network connectivity among the miner nodes is unequal.
coinrevo
Member
**
Offline Offline

Activity: 70
Merit: 10


View Profile
January 06, 2014, 09:42:21 AM
 #160

It would quickly lead to effect like I described above, which would break the system. Soon all mining nodes would be at one place and all other nodes have no incentive to participate or believe the information is correct. So its more than just connectivity. At least the hashing power is distributed in the sense of geolocation, as delays are not an issue. I believe a careful analysis would show that 10 minutes is a well chosen factor, depending on variance of propagation in relation to proof of work time. I'm not sure if somebody has done that analysis, but I think it's more than likely satoshi thought about this in these terms. The 10 minute treshold takes care of all eventualities.
Pages: « 1 2 3 4 5 6 7 [8] 9 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!