Bitcoin Forum

Bitcoin => Bitcoin Discussion => Topic started by: DeathAndTaxes on March 12, 2013, 03:47:17 PM



Title: Understanding why the call to rollback to v0.7 was made.
Post by: DeathAndTaxes on March 12, 2013, 03:47:17 PM
There seems to be some confusion about why the call to roll back to v0.7 was made, so hopefully this will dispel some myths.

What happened?
The network hard forked at block 225433.  The block, produced by a v0.8 node, was rejected by at least some v0.7 nodes.  Contrary to initial claims, this was more complex than just "the block was too big".  Blocks right up to the 1MB limit have been created and relayed without issue on testnet.  The BDB used by v0.7 failed to validate block 225433 because it could not handle the number of tx inputs (not to be confused with block size or number of transactions), so v0.7 nodes rejected it.  This was an undocumented issue with v0.7 (a "bug").

At this point a hard fork existed.  The mining nodes running v0.8 built upon block 225433 and extended their chain.  The nodes running v0.7 rejected block 225433 and eventually produced an alternative and incompatible block "225433a".  Since the incompatibility was unidirectional (v0.8 accepts v0.7 blocks, but not the reverse), in theory the v0.7 chain could eventually have surpassed the v0.8 chain and reunified the network; however, the v0.8 chain had a significant majority of hashing power and that never happened.  By the time the issue was being analyzed, v0.8 was over 5 blocks ahead and pulling further ahead with time.

Why was the fork time critical?
The users on the v0.7 fork would never rejoin the network.  If all those users shut down their clients and didn't accept any transactions, there was no real risk for them.  However, in time, as hashing power fled the v0.7 fork, the remaining nodes would become more and more vulnerable to a variety of attacks.  Not just from a potential 51% attack, but also from accepting generated coins which were valid on the v0.8 blocks and even non-hashpower-related double spends.

Why not just let "most hashing power wins"?
The "most hashing power wins" concept is used to resolve splits/reorgs, not hard forks.  Applying it here would create a bad precedent.  Essentially, miners on v0.8 nodes would be leaving potential v0.7 users vulnerable to attack in order to save a few block rewards.  Since non-mining nodes (users) running v0.7 would never accept the #225433 block produced by v0.8, they would have no way to rejoin the longest chain short of upgrading.  v0.8 users (non-miners) would accept blocks produced by v0.7.  Having a few miners downgrade to v0.7 "reunified" the network in the shortest period of time with the smallest number of changes required.
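The one-way compatibility described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: chain length stands in for total work, and the `exceeds_bdb_locks` flag, the `Block` class, and both helper functions are hypothetical names made up for the illustration, not anything from the reference client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    # Hypothetical flag: True for a block like 225433 that v0.7 cannot process.
    exceeds_bdb_locks: bool = False


def valid_for(version, chain):
    """v0.8 accepts every chain here; v0.7 rejects any chain holding a flagged block."""
    if version == "0.8":
        return True
    return not any(b.exceeds_bdb_locks for b in chain)


def best_chain(version, candidates):
    """Pick the longest chain that this version considers valid."""
    valid = [c for c in candidates if valid_for(version, c)]
    return max(valid, key=len)


common = [Block("225432")]
chain_08 = common + [Block("225433", exceeds_bdb_locks=True)] + [Block(f"b{i}") for i in range(5)]
chain_07 = common + [Block("225433a")]

# While the v0.8 chain is longer, each version follows its own chain.
before = (
    best_chain("0.7", [chain_07, chain_08]) is chain_07,
    best_chain("0.8", [chain_07, chain_08]) is chain_08,
)

# Once enough miners extend the v0.7 chain, both versions agree on it.
chain_07_longer = chain_07 + [Block(f"a{i}") for i in range(10)]
after = (
    best_chain("0.7", [chain_07_longer, chain_08]) is chain_07_longer,
    best_chain("0.8", [chain_07_longer, chain_08]) is chain_07_longer,
)
```

In this model no amount of extra length on the v0.8 chain ever moves a v0.7 node, which is why extending the v0.7 chain was the only direction in which both sides could converge without an upgrade.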

What if developers/pool operators couldn't reach a consensus?  Would this have destroyed Bitcoin?
No.  The v0.8 chain would continue.  v0.7 users would be urged to immediately stop all transactions and upgrade.  Users on v0.7 clients would remain (increasingly) vulnerable to a variety of attacks if actively engaging in transactions until they upgraded.  Eventually all users would upgrade and the network would have continued with the v0.7 branch being an orphaned sidechain.  The potential threat to active v0.7 users would make this less than optimal but the network as a whole was never in any danger of failure.

Was there a real risk of 51% attack on the major chain?
Not really.  The v0.8 chain had roughly 60% of the hashing power, so to double spend on that chain would require more hashing power than 60% of the Bitcoin network.  While 60% is less than 100%, it is still a staggering amount of hashing power.  Any malicious actor building out a network that large wouldn't stop and wait for the remote chance that 60% would be enough.  Anyone with that amount of time and resources would be planning to outbuild the full network and attack.

What risks were there for users on v0.7 fork?
The largest risk would be from a 51% attack.  Initially the v0.7 fork was relatively well protected but had it been abandoned in favor of extending the v0.8 chain that hash power would have dropped over time.  The longer a user remained active on the v0.7 network the more danger they would be in.  Eventually hashing power would fall to a level where even a small pool could have easily executed double spends.  

Users on v0.7 would eventually (100 blocks after the fork) be at risk of being double spent by newly generated coins.  The coins would confirm, but once the users upgraded to v0.8 that transaction would never have existed (as the block producing it also never existed).  v0.7 users who heeded the alert key warning and stopped all transactions were never in any significant danger.

Lastly, a smart attacker with good timing could have executed a "no hashing power" double spend against vulnerable v0.7 users (who continued to engage in active transactions despite the alert key warning).  A simplified version of this attack would work like this.  As hashing power on the v0.7 fork fell, the block generation rate would slow and the number of transactions in the memory pool would increase.  An attacker could exploit this differential by creating a large number of low/no-fee transactions and waiting for some of them to be accepted into v0.8 blocks.  The Bitcoin network will eventually forget about unconfirmed transactions as they get old enough to fall out of the memory pool.  Normally a client will rebroadcast an unconfirmed transaction periodically, but in this case the attacker's client would see the transaction as confirmed.  The attacker could speed up this process by producing a large number of unrelated transactions.  When the tx in v0.8 blocks were dropped from the v0.7 memory pool, the attacker could double spend victims on the v0.7 network.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Come-from-Beyond on March 12, 2013, 06:08:04 PM
Is there any transaction from abandoned 0.8 chain that was not included into legit 0.7 chain?


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: DeathAndTaxes on March 12, 2013, 06:09:35 PM
Is there any transaction from abandoned 0.8 chain that was not included into legit 0.7 chain?

Coinbase transactions (block rewards) from the roughly 25 orphaned blocks in the v0.8 chain.  It was important to resolve this before either chain was more than 100 blocks past the fork as then those coins would be spendable which would have been "chaotic".


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Ivica on March 12, 2013, 06:49:42 PM
Good post.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: RoadStress on March 12, 2013, 07:25:55 PM
Thank you for the post. Very insightful!


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: nevafuse on March 12, 2013, 08:00:35 PM
Is there any transaction from abandoned 0.8 chain that was not included into legit 0.7 chain?

https://bitcointalk.org/index.php?topic=152348.0


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Come-from-Beyond on March 12, 2013, 08:35:12 PM
Is there any transaction from abandoned 0.8 chain that was not included into legit 0.7 chain?

https://bitcointalk.org/index.php?topic=152348.0

That's very bad.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Ivica on March 12, 2013, 08:38:19 PM
Is there any transaction from abandoned 0.8 chain that was not included into legit 0.7 chain?

https://bitcointalk.org/index.php?topic=152348.0

That's very bad.

There shouldn't be such transactions; the fault may lie at the merchant's end, in accepting a double-spend transaction during the fork.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: dserrano5 on March 12, 2013, 08:57:54 PM
It was important to resolve this before either chain was more than 100 blocks past the fork as then those coins would be spendable which would have been "chaotic".

Why? After 120 blocks, what makes those coins different from any other unspent output?


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Tesla71 on March 12, 2013, 09:12:57 PM
I think it's because you have to wait for 120 confirmations on coins that were mined before they add up in your wallet.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Sukrim on March 12, 2013, 09:45:30 PM
It was important to resolve this before either chain was more than 100 blocks past the fork as then those coins would be spendable which would have been "chaotic".

Why? After 120 blocks, what makes those coins different from any other unspent output?

The limit to spend newly mined coins is actually 100 blocks (120 in the UI), and since they are coming "out of thin air" they would be viewed as nonexistent, as in "never been there", in a different chain.
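As a rough illustration of the 100-block rule mentioned above, here is a simplified sketch. The constant uses the figure quoted in this thread; the function name and the exact boundary handling are assumptions made for the example and are not taken from the client source.

```python
COINBASE_MATURITY = 100  # blocks, the figure quoted in this thread


def coinbase_spendable(coinbase_height, spend_height):
    """Simplified check: a generated output may be spent only in a block
    at least COINBASE_MATURITY blocks after the block that created it."""
    return spend_height - coinbase_height >= COINBASE_MATURITY


# A reward generated in an orphaned fork block at height 225433:
too_early = coinbase_spendable(225433, 225532)   # 99 blocks later
mature = coinbase_spendable(225433, 225533)      # 100 blocks later
```

Until that depth is reached, the only transactions unique to the losing chain are the block rewards themselves; past it, ordinary payments could be funded by coins that exist on one chain only.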


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: proudhon on March 12, 2013, 10:22:34 PM
Fantastic post!  Thanks DandT!  Somebody else on Reddit also offered a pretty good non-technical explanation.  This was a really cool event to watch.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Ekaros on March 12, 2013, 10:47:23 PM
Since when was this bug in software? So how long did the testnet have time to discover it?


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Littleshop on March 12, 2013, 11:09:11 PM
Since when was this bug in software? So how long did the testnet have time to discover it?

While a bug is easy to spot in hindsight, this bug looks like it would not have easily emerged even on testnet unless testnet was being used quite hard.  Yes, I know complex scripts can do that, and if a series of complex transactions to simulate this is not out there, maybe it should be.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Stephen Gornick on March 12, 2013, 11:20:43 PM
There shouldn't be such transactions, there can be fault at merchant end accepting double-spend transaction during fork time.

When the fork was first being looked into, on March 12, 2013 at 00:03 AM, the main net at the time (mined by v0.8 clients) was at block 225439, yet other clients were stuck on the fork (mined by v0.7 and prior clients) at block 225431.

Quote
00:00   sipa   ;;bc,blocks
00:00   gribble   225431

00:01   sipa   but it seems blockexplorer is stuck too...
00:01   sipa   as i'm on 225439

 - http://bitcoinstats.com/irc/bitcoin-dev/logs/2013/03/12

So even before anyone knew for sure that a fork was underway, there could have been transactions with six confirmations on the v0.8 side.  If for whatever reason those transactions didn't also already get included in blocks on the v0.7 side, there was the opportunity to perform a race attack to double spend the transaction that had already been included on the v0.8 side.

We now know that exact scenario is what happened with the transfer to OKPay that is being claimed to have been successfully double spent.  Though in that instance, it was after the alert went out that OKPay still processed the deposit as being valid.  So that specific incident could have been prevented had they halted processing once the alert went out, but there still was a window between when confirmations would occur since the fork started and when the alert eventually went out.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Stephen Gornick on March 12, 2013, 11:29:44 PM
This was an undocumented issue with v0.7 ("bug").

And the term v0.7 is being misused here.  It really is a pre v0.8 bug, right?  i.e., it has existed since day one ... a configuration setting for BDB that has been that way since v0.3 at least, if I read correctly.

Since when was this bug in software? So how long did the testnet have time to discover it?

See above.  It was a scenario (a transaction requiring 10,000 locks, or about 5,000 inputs) that hadn't been tested (again, from what conversations I've seen on it).


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Ekaros on March 12, 2013, 11:38:36 PM
This was an undocumented issue with v0.7 ("bug").

And the term v0.7 is being misused here.  It really is a pre v0.8 bug, right?  i.e., it has existed since day one ... a configuration setting for BDB that has been that way since v0.3 at least, if I read correctly.

Since when was this bug in software? So how long did the testnet have time to discover it?

See above.  It was a scenario (a transaction requiring 10,000 locks, or about 5,000 inputs) that hadn't been tested (again, from what conversations I've seen on it).

Thanks.

Still, if it is that old, I wonder how well scenarios with a really large user base and volume of daily transactions are tested. Some people prophesy global use. But wouldn't such a day also mean millions to billions of users and thus a very large number of transactions per block... I have to look into this myself when I have time...


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: AndyRossy on March 12, 2013, 11:40:09 PM
Not sure if you're really trying to place any blame on the merchant.... errr


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: cypherdoc on March 12, 2013, 11:56:29 PM

See above.  It was a scenario (a transaction requiring 10,000 locks, or about 5,000 inputs) that hadn't been tested (again, from what conversations I've seen on it).

what are locks?


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Stephen Gornick on March 13, 2013, 12:12:41 AM
Not sure if you're really trying to place any blame on the merchant.... errr

No, I'm not.

It might be common sense / generally accepted protocol that when an alert is received by the bitcoin client to basically halt all payment processing until the problem is understood.

In the case of the March 12th fork, the alert didn't go out until after there had already been six or more confirmations on transactions.  There's nothing an OKPay or other merchant could have done if this transaction had occurred a few blocks earlier (after the fork started but before the alert was released).   The merchant can't know at five o'clock that at six o'clock miners will abandon your chain and instead help a fork become the longest chain.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: nebulus on March 13, 2013, 12:19:38 AM
What exactly do you mean by roll back? I can't get the chain on 0.7 to update...


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Stephen Gornick on March 13, 2013, 12:24:03 AM
what are locks?

With databases, locks generally refer to making sure certain operations are done one after another rather than at the same time.  With multiple CPUs, there can be the situation where two pieces of code want to change the same variable at the same time.  Think of an increment operation.  If you have a value of four and that gets incremented by both at the same time, the result could be five, not six like it is supposed to be.  So a lock basically gives one instance of the code the ability to make the change, and then after that lock is released the next instance of the code can do what it needs to do.  So that value goes from four to five, and then five to six.
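The increment example can be sketched in a few lines. This is a generic illustration using Python's standard `threading.Lock`, not BDB's lock subsystem; the counter and the iteration count are arbitrary choices for the demonstration.

```python
import threading

counter = 0
lock = threading.Lock()


def increment(times):
    """Add 1 to the shared counter `times` times, one thread at a time."""
    global counter
    for _ in range(times):
        with lock:  # only the lock holder may run the read-modify-write
            counter += 1


threads = [threading.Thread(target=increment, args=(10000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the lock held around each update, the two threads' increments cannot interleave, so none of them is lost. A database such as BDB tracks many such locks at once, and the configured maximum caps how many can be held at the same time.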

So in the context of the "bug" in v0.7 and prior versions, the setting used for maximum locks for BDB was below that necessary for a transaction that first made it into a block on a v0.8 node. v0.8 doesn't use BDB and therefore doesn't have that same restriction.  There was a rule in the bitcoin protocol, due to a specific BDB configuration, that wasn't expressly known previously.

[Edit: And here is the info specifically on the BDB locks configuration:

Here's the Berkeley DB tutorial for anyone who might want to do some reading on sizing your database correctly and lock limits.

http://www.stanford.edu/class/cs276a/projects/docs/berkeleydb/ref/lock/max.html

Quote
The maximum number of locks required by an application cannot be easily estimated. It is possible to calculate a maximum number of locks by multiplying the maximum number of lockers, times the maximum number of lock objects, times two (two for the two possible lock modes for each object, read and write). However, this is a pessimal value, and real applications are unlikely to actually need that many locks. Reviewing the Lock subsystem statistics is the best way to determine this value.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Stephen Gornick on March 13, 2013, 12:27:26 AM
What exactly do you mean by roll back? I can't get the chain on 0.7 to update...

By rollback I think he meant having miners downgrade to v0.7 (i.e., roll back the version of software they are using to a prior release).

If you are having an issue getting your client to sync with v0.7, that probably has nothing to do with the fork.  You probably should open a thread on the Tech Support forum board for that (if the typical suggestions of check your connections, delete the blockchain data files and re-download, etc. don't work for you.)  [Edit: Or move on over to v0.8 which doesn't use BDB for the blockchain.]


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: nebulus on March 13, 2013, 12:35:16 AM
All right, thanks, man... I can get the chain fine on the 0.8. Was not sure if the problem was related.


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: eldentyrell on March 13, 2013, 05:29:24 AM
Thank you for posting this.  However, what you write about risks on the 0.7 branch is not correct.

However in time as hashing power fled the v0.7 fork remaining nodes would become more and more vulnerable to a variety of attacks.  Not just from a potential 51% attack, but also from accepting generated coins which were valid on the v0.8 blocks and even non hashpower related double spends.  

Actually this isn't true.  The Satoshi client checks for long invalid chains (this line of code (https://github.com/bitcoin/bitcoin/blob/1a9ee5da327d8079a297ad292a1c16745b75df91/src/main.cpp#L2941)); if it finds one it goes into safe mode and stops processing transactions or responding to RPC calls.  The message three lines below that line of code is what Pieter Wuille is talking about in the original announcement:

If you're on 0.7 or older, the client will likely tell you that you need to upgrade.

The now-infamous OKPAY-BTCe double-spend was the reverse problem: it was an attack on a 0.8 client (or a 0.7 client that for some weird reason accepted the large block -- there are unconfirmed reports that the bug is platform-dependent).  Unfortunately that problem is a lot harder to solve.  The Satoshi client needs to start watching for long-and-recent-but-not-longest orphan branches (https://bitcointalk.org/index.php?topic=152348.msg1619035#msg1619035).


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: Stephen Gornick on March 13, 2013, 05:34:02 AM
I am unclear about the compatibility of older versions earlier than 0.7.

The BDB database lock limit issue discovered March 12th exists with every version of the reference client (Bitcoin-Qt, and prior to that WxBitcoin GUI, and bitcoind) prior to v0.8.

I assume that means all previous versions? At what point are older versions no longer usable?

That's the definition of a hard fork.  Old software rejects new blocks which include the incompatible change.
 - http://en.bitcoin.it/wiki/Hardfork_Wishlist

So the moment the fork starts is when any clients not supporting the new rules are no longer supported.

When that does happen how would a user know they should upgrade?

Chances are when the hard fork happens most everyone already would have upgraded.

What that means is that the client released would support both the existing rules and the new rules, with the new rules not taking effect until some future point in time (based on block number).   So essentially if block_number > N then follow the new rules else follow the old rules.
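The "if block_number > N" idea can be sketched as follows. Everything here is hypothetical: the activation height, the `cost` field, and the two rule functions are placeholders invented for the illustration and do not correspond to any actual or proposed Bitcoin rule change.

```python
ACTIVATION_HEIGHT = 300_000  # hypothetical N, chosen only for illustration


def old_rule(block):
    """Placeholder for the existing validity rule."""
    return block["cost"] <= 10


def new_rule(block):
    """Placeholder for the relaxed rule that applies after activation."""
    return block["cost"] <= 40


def is_valid(block, height, activation_height=ACTIVATION_HEIGHT):
    """Apply the new rule strictly above the activation height, else the old one."""
    rule = new_rule if height > activation_height else old_rule
    return rule(block)


block = {"cost": 20}
rejected_before = is_valid(block, 299_999)   # old rule still applies
rejected_at_n = is_valid(block, 300_000)     # height must be greater than N
accepted_after = is_valid(block, 300_001)    # new rule now applies
```

Because every upgraded client switches at the same height, nodes that upgraded before block N keep agreeing with each other on both sides of the change; only clients that never upgraded are left behind.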

And it would be released well in advance of block N occurring.  [Edit: i.e., two years, according to one core dev:

A hard fork like this would require the intentional support of a majority of merchants.
Short of an emergency, that means everyone will be given at least 2 years to upgrade.
]


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: johnyj on March 13, 2013, 09:27:24 AM
It's good that the decision to roll back to 0.7 was made; the other things are not very important right now. Bitcoin's lifeline hangs on consistency and integrity.

Fork is a headache  8)


Title: Re: Understanding why the call to rollback to v0.7 was made.
Post by: jgarzik on March 13, 2013, 09:51:26 AM
That is not the full story. There was an INTENTIONAL setting change from 0.7 to 0.8, there is no "bug" ... both versions did what they were meant to, but they had been made incompatible by this change below ...

https://bitcointalk.org/index.php?topic=140233.msg1619546#msg1619546

https://github.com/bitcoin/bitcoin/commit/ae8bfd12daa802d20791e69d3477e799d2b99f45#src/db.cpp

Code:
-    dbenv.set_lk_max_locks(10000);
-    dbenv.set_lk_max_objects(10000);
+    dbenv.set_lk_max_locks(40000);
+    dbenv.set_lk_max_objects(40000);

This analysis is incorrect, as noted here (https://bitcointalk.org/index.php?topic=140233.msg1619629#msg1619629) when this "discovery" was posted in another thread.