maaku


February 29, 2012, 11:09:24 PM 

Bitcoin uses a simple arithmetic mean for its retargeting algorithm. But since it's adapting to a change in rate, shouldn't it be using a harmonic mean instead? Of course it is probably way too late to change the algorithm, but wouldn't an arithmetic mean give mine pool operators who fudge timestamps a disproportionate effect?EDIT: As gmaxwell and Meni Rosenfeld point out, bitcoin is actually measuring the rate parameter of an exponential distribution, for which the arithmetic mean is the proper choice. Also, the harmonic mean could be manipulated by miners via acausal blocks or very short timestamp differences.

I'm an independent developer working on bitcoincore, making my living off community donations. If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP







Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction. Advertise here.

koin
Legendary
Offline
Activity: 874


March 01, 2012, 01:13:08 AM 

Bitcoin uses a simple arithmetic mean for its retargeting algorithm. But since it's adapting to a change in rate, shouldn't it be using a harmonic mean instead? Of course it is probably way too late to change the algorithm, but wouldn't an arithmetic mean give mine pool operators who fudge timestamps a disproportionate effect? read up on the time warp bug, it might be relevant: http://bitcointalk.org/index.php?topic=43692.msg521772#msg521772




maaku


March 01, 2012, 06:30:50 AM 

That's actually how I came to thinking about this. I was looking at the retargeting code and got to thinking about how even if the timewarp bug were fixed one might still go about causing mischief by coming up with acausal values for the block header timestamp. Because an arithmetic average is used, all that matters for the difficulty calculation are the endpoints of the retargeting intervalthe miners that happen to find the first and last block of an interval are privileged in their ability to affect the difficulty retargeting algorithm by about +/ 2 hours (0.6%) each. If the harmonic mean were used instead, such potential/risk would have been evenly distributed across all blocks. (Additionally, a pool operator with a significant percentage of the global hash rate could further move the difficulty by a few percent by mucking around with all its blocks so as to skew the network time in one direction or the other, but this remains true in either case.)
By not fixing the time warp attack the problem is exacerbated, as otherwise (with the fix in place) an inflation of the perceived difficulty of one retarget interval would have a corresponding deflationary effect on the following round, and vice versa.

I'm an independent developer working on bitcoincore, making my living off community donations. If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP



gmaxwell
Moderator
Legendary
Offline
Activity: 2072


March 02, 2012, 05:00:43 AM 

such potential/risk would have been evenly distributed across all blocks.
This actually isn't desirable. In the current system miners have no economic incentive to lie about their timestamps (no marginal gain from doing so), except for the window boundary block, whos permissible times are constrained by the surrounding blocks and the actual time, allowing at most a small one time difficulty shift (assuming the warping bug is fixed). Including other blocks in the time computation would create an incentive to lie constantly for every miner, and for the same final response time a much larger allowable skew. Thats not to say that such a change wouldn't have also been a viable approach— it probably would have been— I just don't see that it's clearly better.




Meni Rosenfeld


March 02, 2012, 06:36:09 AM 

This absolutely cannot work. The harmonic expectation of the exponential distribution is 0, so there is no way to retarget using it.
Even if you use a geometric mean or something else that does converge, it will have very high variance causing retargets to be all over the place. Not only that, but an attacker could release blocks with reported timestamp a second after the previous ones, causing a massive drop in the geometric mean and a spike in the difficulty.
Also, messing with the timestamp of the last block may have some shortterm effect, but not so much on the longterm because of the rules for validity of timestamps.
This isn't to say I think the retargeting algorithm is perfect. It is basically a P controller, where PID may have been better.




maaku


March 02, 2012, 07:25:21 AM 

This actually isn't desirable. In the current system miners have no economic incentive to lie about their timestamps (no marginal gain from doing so), except for the window boundary block, whos permissible times are constrained by the surrounding blocks and the actual time, allowing at most a small one time difficulty shift (assuming the warping bug is fixed). Including other blocks in the time computation would create an incentive to lie constantly for every miner, and for the same final response time a much larger allowable skew. That's no different than the incentive in the current system, however. The existing constraint on block header timestamps vs. network time caps the amount of michief that can be done in either case. But with the harmonic mean every miner would have to lie to achieve the same effect as just the miner of the first/last block being dishonest in the arithmetic mean scenario. This absolutely cannot work. The harmonic expectation of the exponential distribution is 0, so there is no way to retarget using it.
Even if you use a geometric mean or something else that does converge, it will have very high variance causing retargets to be all over the place. Not only that, but an attacker could release blocks with reported timestamp a second after the previous ones, causing a massive drop in the geometric mean and a spike in the difficulty. Are you talking about continuous difficulty adjustment? Otherwise I'm really not sure where you're coming from. The question in the OP is about a dropin replacement of the method for determining the average hashrate in bitcoin's onceevery2016thblock difficulty adjustment. I assure you it does work as I have live code doing it right now, and it retargets just fine; it even results in a more accurate adjustment. The harmonic mean, not the arithmetic mean, yields the actual average hashrate for the last N blocks, although the difference between the two is typically very small. Also, messing with the timestamp of the last block may have some shortterm effect, but not so much on the longterm because of the rules for validity of timestamps. The question is more theoretical than anything. Such mischief would result in at most an increase or decrease in the difficulty by a few percent, and would likely be corrected in the next round anyway. Really I'm asking the question out of a desire to understand: I'm wondering if there is a reason Satoshi decided to use the arithmetic mean (such as preventing an attack or removing incentive for bad behavior), or if it was just a mistake (a very common mistake, to be fair).

I'm an independent developer working on bitcoincore, making my living off community donations. If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP



Meni Rosenfeld


March 02, 2012, 09:34:51 AM 

This absolutely cannot work. The harmonic expectation of the exponential distribution is 0, so there is no way to retarget using it.
Even if you use a geometric mean or something else that does converge, it will have very high variance causing retargets to be all over the place. Not only that, but an attacker could release blocks with reported timestamp a second after the previous ones, causing a massive drop in the geometric mean and a spike in the difficulty. Are you talking about continuous difficulty adjustment? Otherwise I'm really not sure where you're coming from. The question in the OP is about a dropin replacement of the method for determining the average hashrate in bitcoin's onceevery2016thblock difficulty adjustment. I assure you it does work as I have live code doing it right now, and it retargets just fine; it even results in a more accurate adjustment. The harmonic mean, not the arithmetic mean, yields the actual average hashrate for the last N blocks, although the difference between the two is typically very small. Maybe I'm not getting your proposed algorithm. Letting Ti = time between block i1 and block i, the current algorithm is (disregarding the offbyone bug): A = (T_1 + T_2 + ... +T_2016) / 2016 D2 = D1 * 10 minutes / A
What I think you're suggesting: H = 2016 / (1/T_1 + 1/T_2 + ... + 1/T_2016) D2 = D1 * c / H
Is that right? If so, can you explain what c is, and how does this work? If not, what is it?




genjix


March 02, 2012, 12:14:19 PM 

PID controllers are hard to tune. I like bitcoin's simplicity




gmaxwell
Moderator
Legendary
Offline
Activity: 2072


March 02, 2012, 02:44:38 PM 

But with the harmonic mean every miner would have to lie to achieve the same effect as just the miner of the first/last block being dishonest in the arithmetic mean scenario.
_any_ miner lying can lower the difficulty slightly, so everyone gets a marginal return for doing it. This makes larger drifts (which at still limited to half a percent one time reduction in difficulty at most) harder, yes, but it makes smaller drifts much more likely. If I understand correctly you're proposing: 10./(2016./sum([1./GAPi for i in range(2016)])) = adj factor So, if GAP is all 10 you have 1. ... All 5 you have 2. Etc. What happens when every third is a false tick that goes back 10 minutes, then the next tick is correct. The gaps then look like [10, 10, 30, ...]. The arithmetic mean gives the right value. 10./(2016./sum([1./(10+(20*(x%3==1))+(20*(x%3==2))) for x in range(2016)])) = 0.1111x ! Shall we try for a geometric mean so that we can get an imaginary adjustment next?




maaku


March 02, 2012, 04:34:54 PM 

Thanks gmaxwell, that's the kind of explanation I was looking for. I think I understand Meni's original post nowthe harmonic mean is clearly a flawed choice in this case.

I'm an independent developer working on bitcoincore, making my living off community donations. If you like my work, please consider donating yourself: 13snZ4ZyCzaL7358SmgvHGC9AxskqumNxP



Meni Rosenfeld


March 03, 2012, 05:34:24 PM 

In principle, negative timespans which result from inaccurate timestamps can be solved by sorting the 2016 timestamps, taking differences between the sorted values, and setting a minimum value for the spans.
My comment (which is about timestamps coming from an exact, theoretical Poisson process) wasn't completely accurate. The fact that 1/E[1/X]=0 is only relevant if the number of timestamps is arbitrary. For the fixed number 2016, E[H] has a finite value, which is approximately 0.109*(10 minutes). For the geometric mean, E[G] = exp(E[log(X)]) = exp(gamma)*E[X], independent on the number of intervals.




