lemonginger


August 08, 2011, 08:26:37 PM 

Also, people in this thread need to show their work when they are calculating probabilities.
Hint: you at least need to know the standard deviation.








RandyFolds


August 08, 2011, 08:32:43 PM 

online poker is rigged!
I am not very good with math, but I would like to see someone run these numbers on BurningToad's Arsbitcoin SMPPS pool. We are at a positive buffer of ~800btc, which is +16 blocks found over the probability curve, and have held it for almost a month. BurningToad also came forward with three (or maybe it was just two) blocks that were found but never registered by his code due to a bug concerning a block found while the previous found block's payout calcs were running (iirc). He could easily have kept them and none would be the wiser, especially with the crazy positive buffer. How many operators ARE keeping them? Anyways, beyond the hidden block thing, what are the odds that we would end up with this crazy buffer for so long? It seems radically unlikely, just like the string of bad luck the OP is quoting. With the rate of distribution being relatively constant, one pool's up is another pool's down...




Mad7Scientist


August 08, 2011, 08:49:23 PM 

If it goes to 100% one day then +100% the next day there is manipulation going on, even though in that particular case there is no indication of stealing. If there is manipulation going on, it's very important to know that.
If there is manipulation, then most likely there is also stealing going on. Why would there be manipulation of the luck in the pool without stealing? Can anyone explain this?
I'm trying to show that there is a high probability of manipulation of the pool going on, not that there is stealing going on. Once we determine that there is manipulation, it will be easy to conclude that there is stealing, unless the long term output of the pool is above the expected amount.
So having it shoot up to +70% increases the chance that there is manipulation going on, just as 40% would, although it is hard to figure out where the pool got those extra solved blocks from if it was manipulation. Why? Maybe somebody is trying to divert attention away from the low luck days, and if so it worked fairly well, because people on the forum who started talking about the bad luck quit talking about it when luck shot way up.
And the probability of the 5 day period is 0.0036 (277:1), not 0.01.
If someone wants to find out what the expected(?) deviation for positive and negative luck is on the pool that would be nice. Those are the numbers you would get if you took all the positive and then negative values over a very long time on a normal non manipulated pool and averaged them.




eleuthria
Legendary
Offline
Activity: 1750


August 08, 2011, 08:50:47 PM 

online poker is rigged!
I am not very good with math, but I would like to see someone run these numbers on BurningToad's Arsbitcoin SMPPS pool. We are at a positive buffer of ~800btc, which is +16 blocks found over the probability curve, and have held it for almost a month. BurningToad also came forward with three (or maybe it was just two) blocks that were found but never registered by his code due to a bug concerning a block found while the previous found block's payout calcs were running (iirc). He could easily have kept them and none would be the wiser, especially with the crazy positive buffer. How many operators ARE keeping them? Anyways, beyond the hidden block thing, what are the odds that we would end up with this crazy buffer for so long? It seems radically unlikely, just like the string of bad luck the OP is quoting. With the rate of distribution being relatively constant, one pool's up is another pool's down...
Assuming the network as a whole is constant isn't entirely accurate though. Looking at the charts at bitcoinwatch can show you the network can vary quite a lot. One pool's up/down does not mean another pool is having the opposite. However, the odds of consistently solving blocks faster than expected at a difficulty are the same as consistently taking longer.
I spent the last week going over the changes I had made to pushpool, trying to find some simple change I made that would've caused it to not push valid shares upstream to bitcoind, and I've come up blank. I replaced the pools with stock pushpool [outside of dbmysql.c and a longpoll disable bit] the other day and it changed nothing. Other pools are running JoelKatz's patches so I have no reason to believe that is causing any issues either. Software side issues are ruled out.
The only thing itching in the back of my mind is the way the servers have been split up to run against their own bitcoind + wallet, but that shouldn't mean anything, especially when we've been doing the same thing for a long time and have shown ups/downs regularly in luck. Splitting miners [and making sure they come back to the same server with results] across 6 smaller pools should have no difference in variance from all using one large pool as long as each pool is running different headers [receiving addresses] for the hashes.




Mad7Scientist


August 08, 2011, 09:04:53 PM 

The arsbitcoin.com pool's 800 BTC buffer is interesting. It looks like that pool is only 1/6th the size of BTCGuild.com.
Is there a way to find out over what length of time those 800 BTC were accumulated, edit: or that they actually have 800 BTC?
edit: If you consider the possibility that Ars Bitcoin and BTCGuild are working together, that 800 BTC buffer could explain where the missing blocks from BTCGuild went. Then at some point someone will keep the buffer. If there is negative manipulation of the pool, those blocks that are solved will show up somewhere and someone may see that a mystery mining pool has appeared. Transferring them to another pool would be a great way to hide.
I see that Ars Bitcoin keeps track of who solved each block. It would be possible to give the person who is appointed to win a share that is guaranteed to win wouldn't it?




RandyFolds


August 08, 2011, 09:10:03 PM 

The arsbitcoin.com pool's 800 BTC buffer is interesting. It looks like that pool is only 1/6th the size of BTCGuild.com.
Is there a way to find out over what length of time those 800 BTC were accumulated?
We switched to SMPPS on 7/06/11.




RandyFolds


August 08, 2011, 09:16:29 PM 

online poker is rigged!
I am not very good with math, but I would like to see someone run these numbers on BurningToad's Arsbitcoin SMPPS pool. We are at a positive buffer of ~800btc, which is +16 blocks found over the probability curve, and have held it for almost a month. BurningToad also came forward with three (or maybe it was just two) blocks that were found but never registered by his code due to a bug concerning a block found while the previous found block's payout calcs were running (iirc). He could easily have kept them and none would be the wiser, especially with the crazy positive buffer. How many operators ARE keeping them? Anyways, beyond the hidden block thing, what are the odds that we would end up with this crazy buffer for so long? It seems radically unlikely, just like the string of bad luck the OP is quoting. With the rate of distribution being relatively constant, one pool's up is another pool's down...
Assuming the network as a whole is constant isn't entirely accurate though. Looking at the charts at bitcoinwatch can show you the network can vary quite a lot. One pool's up/down does not mean another pool is having the opposite. However, the odds of consistently solving blocks faster than expected at a difficulty are the same as consistently taking longer. I spent the last week going over the changes I had made to pushpool, trying to find some simple change I made that would've caused it to not push valid shares upstream to bitcoind, and I've come up blank. I replaced the pools with stock pushpool [outside of dbmysql.c and a longpoll disable bit] the other day and it changed nothing. Other pools are running JoelKatz's patches so I have no reason to believe that is causing any issues either. Software side issues are ruled out.
The only thing itching in the back of my mind is the way the servers have been split up to run against their own bitcoind + wallet, but that shouldn't mean anything, especially when we've been doing the same thing for a long time and have shown ups/downs regularly in luck. Splitting miners [and making sure they come back to the same server with results] across 6 smaller pools should have no difference in variance from all using one large pool as long as each pool is running different headers [receiving addresses] for the hashes.
I thought the charts/hashrates recorded by bitcoinwatch were extrapolated from the difficulty and rate of solution of blocks. If that is the case, it is much more constant than it appears, as probability is responsible for the wiggle vs. actual computing power entering and leaving the network. I have no opinion in either direction on this, I just wanted to bring up the Ars pool as an example of statistics not playing out as they should for a loooong stretch...and we're back to the whole random number generator thing...




boaz2020
Jr. Member
Offline
Activity: 48


August 08, 2011, 09:23:28 PM 

...So having it shoot up to +70% increases the chance that there is manipulation going on, just as 40% would, although it is hard to figure out where the pool got those extra solved blocks from if it was manipulation...
What?




Vladimir


August 08, 2011, 09:30:18 PM 

Why do you people talk about all kinds of irrelevant stuff (from a math point of view)?
Fact 1. There are only 3 variables needed to determine the probability of any 'at most N blocks found' outcome:
1. Difficulty.
2. Number of diff1 shares.
3. Number of blocks found.
That is it! Total hashrate of the network is not important, total hashrate of somebody's mining rig or pool is not important. There are not even any standard deviations involved, even though some math geniuses above imply that without it everything is lost. Basically, everything but the above 3 variables is utterly irrelevant.
Fact 2. If any of that is above your head, here is a simpler way to think of it. If, during some period of time, D is the difficulty and N1 is the number of diff1 shares submitted to a pool, then the expected number of blocks solved is B1 = N1/D. The number of blocks found, B2, is known. Push B1 and B2 into the Poisson formula and you get the probability. Again, this calculator http://www.sbrforum.com/bettingtools/poissoncalculator/ will give you all the probabilities you want based on B1 and B2.
Think about D diff1 shares accepted by the pool as two coin tosses where heads means block solved and tails means not solved. On average it should be one block solved per every <difficulty> shares. As simple as that. Now... if we have 12 mil shares without a solved block and difficulty is 2 million (or 6 mil shares with difficulty of 1 million), it is essentially the same as tossing a coin 12 times in a row and getting only tails. Not impossible, but what are the odds? (About a quarter of one percent, actually.) Large pools should really be dead on target over any given ~2 week constant difficulty period (so far at least).
Fact 3. The probability of a pool finding at most 105 blocks when 133 is expected is 0.6952%.
Having said that, the expectation of finding 133 blocks is stated by the OP. I have no idea where he got this number from, and since it is not derived by him from difficulty and number of diff1 shares but by some other method, which I do not understand, it is a highly suspect number.
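For anyone who would rather check the Poisson figures locally than trust a web calculator, here is a minimal Python sketch of the 'at most N blocks' calculation described in the post above. The function names are made up for illustration; the only inputs are the expected block count (diff1 shares divided by difficulty) and the number of blocks actually found.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k): the 'at most k blocks found' probability."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Example from the post: 12 million diff1 shares at difficulty 2 million.
expected_blocks = 12_000_000 / 2_000_000        # expected blocks = shares / difficulty = 6.0
p_no_blocks = poisson_cdf(0, expected_blocks)   # reduces to exp(-6)

# Example from the post: at most 105 blocks found when 133 are expected.
p_105_of_133 = poisson_cdf(105, 133.0)

print(f"no blocks in 6 expected:   {p_no_blocks:.4%}")
print(f"at most 105 of 133 expected: {p_105_of_133:.4%}")
```

Working in log space (via `math.lgamma`) is only there so that larger expected counts, such as the ~1493 blocks discussed later in the thread, do not overflow the factorial.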





eleuthria


August 08, 2011, 10:05:18 PM 

Since Vladimir likes his Poisson distribution, here's the hard data for the completed difficulties that have been tracked by the pool:
577,129,642 shares submitted during difficulty 1690906. Number of blocks found: 334. Expected blocks from that many shares: 341.13.
Odds of: Exactly 334: 2.0246% | More Than 334: 63.7237% | Less Than 334: 34.2517%
481,715,969 shares submitted during difficulty 1563027. Number of blocks found: 285. Expected blocks from that many shares: 308.19.
Odds of: Exactly 285: 0.9651% | More Than 285: 90.3068% | Less Than 285: 8.7281%
599,334,163 shares submitted during difficulty 1379223. Number of blocks found: 433. Expected blocks from that many shares: 434.54.
Odds of: Exactly 433: 1.9116% | More Than 433: 51.6715% | Less Than 433: 46.4169%
358,943,796 shares submitted during difficulty 876954. Number of blocks found: 410. Expected blocks from that many shares: 409.30.
Odds of: Exactly 410: 1.9687% | More Than 410: 47.3083% | Less Than 410: 50.7230%
Grand total (hopefully it's safe to do this, I'm sure Vladimir will verbally assault me though if I'm wrong): Number of blocks found: 1462. Expected blocks found: 1493.16.
Odds of: Exactly 1462: 0.7520% | More Than 1462: 78.5779% | Less Than 1462: 20.6701%
It's worth noting that there were _MANY_ restarts of pushpool instances/server moves during the 876k to 1.56m difficulties. Some of them were definitely duplicating work during that time, meaning the share counts were inflated on some rounds due to the same work being issued twice. However, I would expect the combined duplication of work was less than 1%, which is immaterial given the numbers we're working with. I'll leave interpretation up to Vladimir, I don't want to start ranting interpretations of these numbers that may not be accurate.
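The expected-block figures above are just diff1 shares divided by difficulty, so they are easy to re-derive. A small Python sketch, using only the numbers quoted in the post (the variable and function names are mine, not the pool's). One caveat: dividing the first row directly gives roughly 341.31 rather than the quoted 341.13, so small differences like that are more likely transcription slips than anything meaningful.

```python
# (difficulty, diff1 shares submitted, blocks found), copied from the post above.
periods = [
    (1690906, 577_129_642, 334),
    (1563027, 481_715_969, 285),
    (1379223, 599_334_163, 433),
    (876954, 358_943_796, 410),
]

def expected_blocks(shares: int, difficulty: int) -> float:
    """Expected blocks for a given number of diff1 shares at a fixed difficulty."""
    return shares / difficulty

# One row per difficulty period: (difficulty, blocks found, blocks expected).
rows = [(d, found, expected_blocks(s, d)) for d, s, found in periods]

for d, found, expected in rows:
    print(f"difficulty {d}: found {found}, expected {expected:.2f}, "
          f"luck {found / expected:.1%}")

total_found = sum(found for _, found, _ in rows)
total_expected = sum(expected for _, _, expected in rows)
print(f"total: found {total_found}, expected {total_expected:.2f}")
```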




Vladimir


August 08, 2011, 10:12:46 PM 

Thank You for the data.
So basically, we have a 21.4221% probability of 'at most' 1462 blocks found while 1493.16 blocks are expected. This is well in the realm of possibility. Just like tossing a coin 2986 times and getting 1462 heads.
If we take only the last two 'unlucky' periods, then we have an expected number of blocks of 649.32 and an actual 619, with the probability of 'at most' 619 blocks found being 12.0440%.
For whatever it's worth.
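Since standard deviation came up earlier in the thread: for a Poisson count the standard deviation is simply the square root of the expected value, so for large expected counts a normal approximation gives a quick sanity check on the figures above. This is a rough sketch, not a replacement for the exact Poisson sum; the function name and the continuity correction of 0.5 are my own choices.

```python
import math

def poisson_normal_approx_cdf(k: int, lam: float) -> float:
    """Approximate P(X <= k) for X ~ Poisson(lam) using a normal curve.

    Mean is lam, standard deviation is sqrt(lam); 0.5 is a continuity correction.
    """
    sd = math.sqrt(lam)
    z = (k + 0.5 - lam) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# All four periods: 1462 found vs 1493.16 expected.
approx_all = poisson_normal_approx_cdf(1462, 1493.16)
# Two 'unlucky' periods only: 619 found vs 649.32 expected.
approx_two = poisson_normal_approx_cdf(619, 649.32)

print(f"sd for 1493.16 expected blocks: {math.sqrt(1493.16):.1f}")
print(f"approx P(at most 1462): {approx_all:.2%}")
print(f"approx P(at most 619):  {approx_two:.2%}")
```

Both approximations land close to the exact 21.4% and 12.0% quoted above, which is what you would expect when the pool is less than one standard deviation (about 39 blocks) below its expected total.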





jwzguy


August 08, 2011, 10:22:32 PM 





k9quaint
Legendary
Offline
Activity: 1190


August 08, 2011, 10:26:12 PM 

Thank You for the data.
So basically, we have a 21.4221% probability of 'at most' 1462 blocks found while 1493.16 blocks are expected. This is well in the realm of possibility.
Don't feel bad for taking MadScientist75 at his word, once. From this point forward, you should double check all of his starting assumptions, all of his data, all of his calculations and all of his conclusions. Those of us that know MadScientist75 know to do this. I think the person who mentioned standard deviation earlier in the thread was doing so as a slight to MadScientist75. It is a #btcguild inside joke that stems from this little gem:
Jul 26 20:40:25 <sonicrules1234> Mad7Scientist: Are you using standard deviation anywhere in your equations?
Jul 26 20:40:43 <Mad7Scientist> sonicrules1234, what is standard deviation?
Hilarity followed. You can grep the online logs for #btcguild if you want the whole conversation. Anyway, eleuthria already put this thread to bed with actual math and actual data. So I don't get to have fun with it.

Bitcoin is backed by the full faith and credit of YouTube comments.



molecular
Donator
Legendary
Offline
Activity: 2366


August 08, 2011, 10:49:57 PM 

Isn't there a simple way to find out? How about a couple of suspecting miners decide to use a slightly modded miner that would inform the user when it finds a block.
It'd be sufficient for this group to be able to present only 1 block that was stolen by the operator.
Of course they would have to keep their identities secret, otherwise the pool operator can just always leave their blocks untouched and choose blocks of other miners for stealing.
This would at least create a lot of danger for the pool operator. The pool operator getting caught stealing would certainly destroy the pool, so even a slight danger of that happening should keep him from stealing, right?
Am I overlooking something?

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0 3F39 FC49 2362 F9B7 0769



eleuthria


August 08, 2011, 10:58:14 PM 

Isn't there a simple way to find out? How about a couple of suspecting miners decide to use a slightly modded miner that would inform the user when it finds a block.
It'd be sufficient for this group to be able to present only 1 block that was stolen by the operator.
Of course they would have to keep their identities secret, otherwise the pool operator can just always leave their blocks untouched and choose blocks of other miners for stealing.
This would at least create a lot of danger for the pool operator. The pool operator getting caught stealing would certainly destroy the pool, so even a slight danger of that happening should keep him from stealing, right?
Am I overlooking something?
This has been possible from the beginning. Miners know if they have the winning share, and the pool reports who found a block. Unfortunately it's the only way I know of to truly audit a pool you suspect of withholding blocks, and it would require a large number of users to provide reasonable assurance one way or another.
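To put a rough number on "a large number of users", here is a back-of-the-envelope Python sketch. It assumes (my assumptions, not anything stated in the thread) that the operator withholds blocks without knowing who the auditors are, and that each withheld block was found by an auditing miner with probability equal to the auditors' share of the pool's hashrate. The function names are hypothetical.

```python
import math

def detection_probability(auditor_share: float, withheld_blocks: int) -> float:
    """Chance that at least one withheld block was found by an auditing miner."""
    return 1.0 - (1.0 - auditor_share) ** withheld_blocks

def blocks_needed(auditor_share: float, target: float = 0.95) -> int:
    """Withheld blocks required before auditors catch one with probability >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - auditor_share))

# Hypothetical: auditors control 10% of the pool's hashrate.
share = 0.10
needed = blocks_needed(share)
print(f"withheld blocks before a 95% chance of catching one: {needed}")
print(f"chance of catching one after 5 withheld blocks: "
      f"{detection_probability(share, 5):.1%}")
```

Under those assumptions a small group of auditors only becomes a serious threat once many blocks have been withheld, which matches the point that a handful of miners cannot give much assurance on their own.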




k9quaint


August 08, 2011, 11:02:42 PM 

Isn't there a simple way to find out? How about a couple of suspecting miners decide to use a slightly modded miner that would inform the user when it finds a block.
It'd be sufficient for this group to be able to present only 1 block that was stolen by the operator.
Of course they would have to keep their identities secret, otherwise the pool operator can just always leave their blocks untouched and choose blocks of other miners for stealing.
This would at least create a lot of danger for the pool operator. The pool operator getting caught stealing would certainly destroy the pool, so even a slight danger of that happening should keep him from stealing, right?
Am I overlooking something?
This has been possible from the beginning. Miners know if they have the winning share, and the pool reports who found a block. Unfortunately it's the only way I know of to truly audit a pool you suspect of withholding blocks, and it would require a large number of users to provide reasonable assurance one way or another.
That is the case if you want to prove a negative. We do not have to prove that you have not been cheating. All you need do is point out the fact that no proof has yet been presented and leave it at that. The OP got his odds wrong, as people who are not familiar with math have been doing for decades in Vegas.




JoelKatz
Legendary
Offline
Activity: 1582
Democracy is vulnerable to a 51% attack.


August 09, 2011, 01:59:31 AM 

If the pool is stealing from miners, it's not like sometimes they're going to cheat and sometimes they're going to sneak in bonuses.
I think this is an excellent point. Pretend you are a pool operator out to cheat. Having 5 visibly bad days seems like a bad way to steal. It would be both easier and less visible to skim off a small amount on a constant basis rather than inserting code that robs a large amount on a set schedule. And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.
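The point about picking your own start and finish dates can be illustrated with a quick simulation. The Python sketch below generates a year of daily block counts for a completely honest hypothetical pool (about 133 blocks expected per 5 days, borrowing the OP's figure), then scans every 5-day window for the worst one. The seed, the one-year length, and the helper names are all arbitrary choices of mine.

```python
import math
import random

def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed in log space."""
    return sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
               for i in range(k + 1))

def worst_window(daily: list, window: int) -> int:
    """Smallest total over any run of `window` consecutive days."""
    return min(sum(daily[i:i + window]) for i in range(len(daily) - window + 1))

rng = random.Random(2011)             # fixed seed so the run is repeatable
expected_per_window = 133.0           # expected blocks per 5 days (hypothetical honest pool)
daily = [sample_poisson(expected_per_window / 5, rng) for _ in range(365)]

first = sum(daily[:5])                # a window chosen before looking at the data
worst = worst_window(daily, 5)        # a window chosen after looking at the data
p_first = poisson_cdf(first, expected_per_window)
p_worst = poisson_cdf(worst, expected_per_window)

print(f"first 5 days: {first} blocks, P(at most) = {p_first:.2%}")
print(f"worst 5 days: {worst} blocks, P(at most) = {p_worst:.2%}")
```

The worst window will always look at least as improbable as a window fixed in advance, even though nothing was stolen in the simulation, so its naive probability says little by itself.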

I am an employee of Ripple. Follow me on Twitter @JoelKatz 1Joe1Katzci1rFcsr9HH7SLuHVnDy2aihZ BMNBM3FRExVJSJJamV9ccgyWvQfratUHgN



Mad7Scientist


August 09, 2011, 02:30:13 AM 

Vladimir, thanks for doing that. I guess I was doing it wrong using the Binomial Distribution when I should have been using Poisson. My understanding is that the Binomial Distribution is based on a fixed set of trials whereas Poisson is based on independent events occurring over time, which is the correct thing to use for finding bitcoin blocks. So my 0.0036 (277:1) figure was incorrect; it should have been 0.00695 (144:1) as you said.
And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.
I want to know if the pool is being bounced up and down on purpose (with the down days likely being a little more down than the up days are up) to try to make it more difficult to see what is going on. So eleuthria says that during difficulty 1563027 there is a 90% chance the number of blocks found should have been higher, so a 10% chance they were this low by chance. That's because that +70% and other high luck days that followed the low days made up for a lot of it. If some more positive luck days are added to the pool it could be made to look just fine without any missing blocks problems at all. But the low days and high days will remain there as evidence that manipulation, such as stealing and then a cover up, took place. Not very many bitcoins were stolen overall so far. In fact the estimated amount of bitcoins stolen could go to 0 in the future if more positive luck days are added to make up for it.




k9quaint


August 09, 2011, 02:32:37 AM 

If the pool is stealing from miners, it's not like sometimes they're going to cheat and sometimes they're going to sneak in bonuses.
I think this is an excellent point. Pretend you are a pool operator out to cheat. Having 5 visibly bad days seems like a bad way to steal. It would be both easier and less visible to skim off a small amount on a constant basis rather than inserting code that robs a large amount on a set schedule. And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.
Also keep in mind, the original poster has been examining all the pools for statistical clusters of bad luck (as evidence of malfeasance). When the OP presented the odds of finding the cluster, he failed to mention the other 2 major pools he examined (deepbit and slush) and did not include them in his search space. The OP presented the odds of his finding occurring as if he had only examined 1 pool instead of 3. Had he found bad luck in slush or deepbit, he would be talking about them instead of btcguild in exactly the same way. It is as if he flipped a coin 1000 times, dropped 700 of the results completely and then chose a run of NINE tails over a 10 flip period. I would absolutely expect him to find a run of "NINE" tails out of 10 under those circumstances. I would also expect him to try to convince me of how "unlikely" the existence of that event was.
Vladimir, thanks for doing that. I guess I was doing it wrong using the Binomial Distribution when I should have been using Poisson. My understanding is that the Binomial Distribution is based on a fixed set of trials whereas Poisson is based on independent events occurring over time, which is the correct thing to use for finding bitcoin blocks. So my 0.0036 (277:1) figure was incorrect; it should have been 0.00695 (144:1) as you said.
And either way, the best way to catch it would be to look at the average over the longest term possible. You can always find bad days. Half the weeks will be below average. And if you get to pick the start date and the finish date specifically to bracket a run of bad luck, you can always find improbable events.
I want to know if the pool is being bounced up and down on purpose (with the down days likely being a little more down than the up days are up) to try to make it more difficult to see what is going on. So eleuthria says that during difficulty 1563027 there is a 90% chance the number of blocks found should have been higher, so a 10% chance they were this low by chance. That's because that +70% and other high luck days that followed the low days made up for a lot of it. If some more positive luck days are added to the pool it could be made to look just fine without any missing blocks problems at all. But the low days and high days will remain there as evidence that manipulation, such as stealing and then a cover up, took place. Not very many bitcoins were stolen overall so far. In fact the estimated amount of bitcoins stolen could go to 0 in the future if more positive luck days are added to make up for it.
Both Vladimir and Mad7 are conflating two calculations of odds:
1. finding this behavior in the next 5 days for btcguild
2. finding this behavior in any 5 day period for any pool
Vladimir correctly calculated the odds of finding this behavior in the next 5 days of btcguild. However, that is not relevant. We need to calculate the odds of searching every possible 5 day period for every pool (or even just the 3 biggest) and finding this sort of luck. The OP set out to find the luckiest pool to mine in and to find the unluckiest pool to accuse of cheating.
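The gap between those two calculations can be sketched in a few lines of Python. The single-window probability is the 0.00695 figure from the thread; the number of pools (3) comes from the post above, while the 12 windows per pool is purely a hypothetical stand-in for "about two months of non-overlapping 5-day periods". Treating the windows as independent is also an assumption; overlapping windows would need a simulation instead.

```python
def prob_at_least_one(p_single: float, n_windows: int) -> float:
    """Chance that at least one of n independent windows looks this unlucky."""
    return 1.0 - (1.0 - p_single) ** n_windows

p_single = 0.00695          # odds for one pre-chosen 5-day window (from the thread)
pools = 3                   # btcguild, deepbit, slush
windows_per_pool = 12       # hypothetical: ~60 days of non-overlapping 5-day windows

p_any = prob_at_least_one(p_single, pools * windows_per_pool)
print(f"one pre-chosen window:      {p_single:.2%}")
print(f"any of {pools * windows_per_pool} searched windows: {p_any:.2%}")
```

With these illustrative inputs, a 1-in-144 event for a single window becomes roughly a 1-in-5 event once the whole search space is counted.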




JoelKatz


August 09, 2011, 02:51:12 AM 

A better analogy is this one:
A certain town has 500 people. You suspect that maybe some of them are psychic. So you ask each person to try to correctly identify the suits of some cards they cannot see. Of the 500 people, you find 3 that did extremely well in the test. So you say that you suspect those 3 are psychic. (However, you would expect about 3 out of 500 to do that well by chance. So these are just the people you suspect.)
So, you'll continue your analysis. You'll test those 3 people. But wait, someone already did that. So you'll just check the previous results. Lo and behold, the data shows that those 3 people succeed way beyond what you'd expect by mere chance, perhaps the chances were only 3 in 500 that they could do that well by chance.
You cannot use the same data both to decide which pool to accuse of cheating and to validate that same accusation. Some pool has to have the worst luck, so bad that it's hard to believe looking only at that pool that its luck was that bad due to mere chance.




