Bitcoin Forum

Alternate cryptocurrencies => Altcoin Discussion => Topic started by: Vitalik Buterin on July 10, 2013, 08:12:03 AM



Title: Relationship between primes per second and expected block creation time?
Post by: Vitalik Buterin on July 10, 2013, 08:12:03 AM
One thing I can't quite figure out with Primecoin is exactly what the relationship is between the main measure of mining output, "primes per second", and how long it actually takes to find a block. Right now chains need to be 7 primes long, and each prime must (usually) be at least 256 bits long, so  naively (assuming primes are random):

P(chain of 7 primes) = P(number is a prime)^7 = 1/256^7 (~1/10^17) by the Prime Number Theorem

Since every number we're dealing with is odd, we can improve that bound to 1/128^7 (1/10^15), but the client seems to be using an advanced sieving algorithm (it looks like much more than just the basic Sieve of Erastothenes) to cut even more corners, so I would not be surprised if the number of primes needed to scan through to get a block is far less than this (it must be, as otherwise there's no way miners with even as much as 100 pps would be getting blocks so frequently). So, the question is, does anyone know what the ratio is, whether mathematically or empirically?


Title: Re: Relationship between primes per second and expected block creation time?
Post by: Vitalik Buterin on July 10, 2013, 09:55:23 AM
Okay, I think I figured out how the sieving algorithm works.

Suppose that you are looking for 2p-1 chains (eg. 1531, 3061, 6121, 12241, 24481) of length 3 whose origins are multiples of 100 and you want a sieve going up to 7. You need to work out 4 moduli: 2,3,5,7. Let w be the origin, so w+1 is the first prime, 2w+1 is the second prime, 4w+1 is the third prime.

A lot of math from here on is modulo math; if you see me making weird statements like 2*4=3 or 3/2=5, it's because they're true in modular arithmetic under whatever modulus I'm talking about.

Modulo 2:

w+1 != 0, so w != 1
2w+1 != 0, so 2w != 1, which is always true
4w+1 != 0, so 4w != 1, which is always true

Hence, w = 0 (mod 2)

Modulo 3:

w+1 != 0, so w != 2
2w+1 != 0, so 2w != 2, so w != 2/2 = 1
4w+1 != 0, so 4w != 2, so w != 2/4 = 2/1 = 2

Hence, w , ϵ [0,1,2] - [1,2] =
(mod 3)

Modulo 5:

w+1 != 0, so w != 4
2w+1 != 0, so 2w != 4, so w != 4/2 = 2
4w+1 != 0, so 4w != 4, so w != 4/4 = 1

Hence, w , ϵ [0,1,2,3,4] - [1,2,4] = [0,3] (mod 5)

Modulo 7:

w+1 != 0, so w != 6
2w+1 != 0, so 2w != 6, so w != 6/2 = 3
4w+1 != 0, so 4w != 6, so w != 6/4 = 5

Hence, w , ϵ [0,1,2,3,4,5,6] - [3,5,6] = [0,1,2,4] (mod 7)

Now, since we're looking for multiples of 100, we need to convert these modular statements into modular statements about the cofactors. That is to say, given w=100k, we want claims about k.

Base 2: w = 0 (mod 2) is true regardless of k, so this condition happens to be irrelevant here
Base 3: w = 0 (mod 3), so 100k = 0 (mod 3), so k = 0 (mod 3)
Base 5: w ϵ [0,3] (mod 5) is true regardless of k, so once again this condition can be thrown out
Base 7: w ϵ [0,1,2,4] (mod 7), so k ϵ [0/100,1/100,2/100,4/100] = [0/2,1/2,2/2,4/2] = [0,4,1,2] (mod 7)

Thus, we have k = 0 (mod 3), k ϵ (0,1,2,4) (mod 7), so altogether by Chinese Remainder Theorem k ϵ [0,9,15,18] (mod 21).

Let's try it out!

k = 9, w = 900. We have 901, 1801, 3601 . Unfortunately, 901 is divisible by 17 and 53, so fail. But, note that none of these three numbers is divisible by 2,3,5,7, so the sieve did its job.

k = 15, w = 1500, so first number is 1501 - once again not prime. In fact, the first origin that actually works is 26700, followed by 33300 and 54600.

Now, let's try a k outside the sieve.

k = 16, w = 1600, so the chain is 1601, 3201, 6401. As it turns out, 3201 is divisible by three. Once again, the sieve did its job by removing this possibility from consideration. In fact, every k removed by the sieve is guaranteed to fail on either 2,3,5 or 7 (actually just 3 or 7 since failing on 2 or 5 is impossible).

The actual algorithm is somewhat more complex to account for both kinds of Cunningham chains and bitwin chains, and the process is the same.

Now, there's another optimization: primorial numbers. 100 was a nice factor since it removed the possibility of any number in the chain from being divisible by 2 or 5 right off the bat. What if we had used 210 instead? Then, we could have done the sieve for some higher primes, like [11,13,17], instead, and gotten to the goal much faster - indeed, the first origin that works there is 2310, only the eleventh value tested even without any sieving! Primorial numbers are numbers that are divisible by all small primes up to a certain point. If we can shift around nonces in the block until the hash is divisible by at least a reasonably sized primorial, then we can achieve further speedups (although, the Primecoin paper states, not particularly large ones).

In the client, the sieve goes up to 1 million. So, the question is, how much an optimization over the randomness of the Prime Number Theorem (namely, that random numbers n bits long have a 1/n chance of being prime) do Primecoin miners actually get?