Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: jl2012 on May 26, 2015, 04:07:38 AM



Title: Rounding error in calculation of chainwork
Post by: jl2012 on May 26, 2015, 04:07:38 AM
At block 356831, the chainwork reported by Bitcoin Core is 06f293e337e48534b58620. However, I find that the actual expected value should be 06f293e337e48534b7d080, which is 150,112 bigger than the one reported by Bitcoin Core.

In Bitcoin Core, the work each block is underestimated by 0.5 hash on average due to rounding down, and therefore ~150,000** in total at current height.

150,112 hash is nothing, of course. But I think this would affect the way we determine the longest chain and a fix would be a hardfork?

**(The problem did not exist in the first 32,256 diff=1 blocks. The actual number of affected blocks is 324,576 and the expected difference is 162,288 vs actual difference of 150,112. The effective sample size is 161 targets and I think the discrepancy is within statistical error)


Title: Re: Rounding error in calculation of chainwork
Post by: gmaxwell on May 26, 2015, 06:11:16 AM
150,112 hash is nothing, of course. But I think this would affect the way we determine the longest chain and a fix would be a hardfork?
It has always been computed with precise integer arithmetic without rescaling. That it doesn't compute the same as an infinite precision number you came up with shouldn't be surprising or important. (The calculation is the sum of the getwork returned as the integer hash count on each block in the history).

It would be a different kind of short term consensus inconsistency ("softest fork", perhaps) if there were different implementations and  a competing blockchain ever exposed it (which would be an odd corner case to get into).

Unless in the worst case you expect work to be computed as and expressed a bit count proportional to the total number of block hashes, there will be a limit to the precision. The precision is normative, but only in a very soft sense, since either side would happily reorganize onto the other.

[I am reasonably confident the precision has never changed in any version of the software; but I haven't bothered researching it, since it's not even clear if you're alleging if something is actually wrong rather than the result not matching an infinite precision rational arithmetic value for it-- if you really think it's changed in some possibly inconsistent way, let me know and I'll go look.]


Title: Re: Rounding error in calculation of chainwork
Post by: jl2012 on May 26, 2015, 06:24:26 AM
150,112 hash is nothing, of course. But I think this would affect the way we determine the longest chain and a fix would be a hardfork?
It has always been computed with precise integer arithmetic without rescaling. That it doesn't compute the same as an infinite precision number you came up with shouldn't be surprising or important. (The calculation is the sum of the getwork returned as the integer hash count on each block in the history).

It would be a different kind of short term consensus inconsistency ("softest fork", perhaps) if there were different implementations and  a competing blockchain ever exposed it (which would be an odd corner case to get into).

Unless in the worst case you expect work to be computed as and expressed as a number with as many bits as there have been in block hashes, there will be a limit to the precision. The precision is normative, but only in a very soft sense, since either side would happily reorganize onto the other.

[I am reasonably confident the precision has never changed in any version of the software; but I haven't bothered researching it, since it's not even clear if you're alleging if something is actually wrong rather than the result not matching an infinite precision rational arithmetic value for it.]

Yes, it's a softfork for both sides, not a hard one. One side will eventually win due to variance.

In terms of consensus, I think there is nothing "wrong". Consensus is more important than correctness. It's just mathematically inaccurate.


Title: Re: Rounding error in calculation of chainwork
Post by: TierNolan on May 26, 2015, 12:19:15 PM
Yes, it's a softfork for both sides, not a hard one. One side will eventually win due to variance.

It is barely even a soft fork.

For 2015 out of every 2016 blocks, the longest chain is effectively defined as the one with the highest block height.

The fork would need cross a difficulty boundary update.  Even then, both sides would probably agree which of the 2 forks has the higher weight block.  The only exception is if the rounding error causes the finite precision maths to consider both forks to be equal.

Consider two forks that happen exactly at the diff update point.  For one fork (A), the difference in timestamps is 1209600 seconds (so 2 weeks).  On the other fork (B), the difference is 1209599 seconds, so one second less.

This means that for A, there is no diff update at all.  Both rules would agree here.  For B, the difficulty rises by 1209600 / 1209599.  Both sides will agree that fork B has higher difficulty, since difficulty is calculated to 256 bits of accuracy.

They will both break any tie in favour of fork B.


Title: Re: Rounding error in calculation of chainwork
Post by: gmaxwell on May 26, 2015, 11:20:34 PM
It's just mathematically inaccurate.
It's not clear to me that it's "mathematically incorrect".  If it isn't an infinite precision rational number (which it isn't since its an integer) there has to be a precision loss somewhere. E.g. even if it rounded instead of truncating at each block there would still eventually be a difference with the infinite precision rational version. Chainwork is the sum of the integer work estimates (with truncation) of the blocks; this seems like a reasonable thing, it's not the only reasonable thing, but any thing will have some approiximation in it somewhere, unless what it returns is a rational number.



Title: Re: Rounding error in calculation of chainwork
Post by: jl2012 on May 27, 2015, 03:44:22 AM
It's just mathematically inaccurate.
It's not clear to me that it's "mathematically incorrect".  If it isn't an infinite precision rational number (which it isn't since its an integer) there has to be a precision loss somewhere. E.g. even if it rounded instead of truncating at each block there would still eventually be a difference with the infinite precision rational version. Chainwork is the sum of the integer work estimates (with truncation) of the blocks; this seems like a reasonable thing, it's not the only reasonable thing, but any thing will have some approiximation in it somewhere, unless what it returns is a rational number.



[mathematics discussion]

It is well known that if one requires X significance digits in the final result, he has to preserve more than X digits in intermediate steps.

Although my result is still an approximation, it is an unbiased estimation and undeniably closer to true statistical expected value.

And use of rounding, instead of truncating, is also very likely to improve the precision. In statistics terms, rounding is unbiased and truncating is biased.

[/mathematics discussion]

I just wanted to point out the discrepancy and would like to know its implications in comparing competitive forks


Title: Re: Rounding error in calculation of chainwork
Post by: gmaxwell on May 27, 2015, 04:11:25 AM
Fair enough!   I will likely use this as an example implementation detail in the future, since it's softer-than-softforkness makes for an interesting discussion point around convergence behavior.

(Though, since we're on a pedantic tangent, involving various angles and angels on the heads of pins; I could question the classification of rounding as unbiased in an environment where the values might be adversarially chosen! Truncation is, I believe, the unique procedure which guarantees that an opponent could not choose values in a manner which would systematically overstate their work! :P )


Title: Re: Rounding error in calculation of chainwork
Post by: jl2012 on May 27, 2015, 07:39:23 AM
Though, since we're on a pedantic tangent, involving various angles and angels on the heads of pins; I could question the classification of rounding as unbiased in an environment where the values might be adversarially chosen! Truncation is, I believe, the unique procedure which guarantees that an opponent could not choose values in a manner which would systematically overstate their work! :P

Since "honest" miners are not optimized, attackers may exploit this error. They will manipulate the timestamp to make sure no hashing power will be wasted due to chainwork truncation. The advantage is expected to be 0.5 hash / block.