Bitcoin Forum
May 06, 2024, 04:15:24 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 [3]  All
  Print  
Author Topic: [XPM] So, why is porting primecoin mining to the GPU so difficult?  (Read 18669 times)
usahero
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
August 24, 2013, 02:54:45 AM
 #41


The algorithm is kind of complicated to explain in lay terms....


Thank you. You explained very well.
Unlike traditional banking where clients have only a few account numbers, with Bitcoin people can create an unlimited number of accounts (addresses). This can be used to easily track payments, and it improves anonymity.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1715012124
Hero Member
*
Offline Offline

Posts: 1715012124

View Profile Personal Message (Offline)

Ignore
1715012124
Reply with quote  #2

1715012124
Report to moderator
1715012124
Hero Member
*
Offline Offline

Posts: 1715012124

View Profile Personal Message (Offline)

Ignore
1715012124
Reply with quote  #2

1715012124
Report to moderator
1715012124
Hero Member
*
Offline Offline

Posts: 1715012124

View Profile Personal Message (Offline)

Ignore
1715012124
Reply with quote  #2

1715012124
Report to moderator
Sunny King
Legendary
*
Offline Offline

Activity: 1205
Merit: 1010



View Profile WWW
August 24, 2013, 03:06:06 AM
 #42

The Lucas-Lehmer test currently available on GPU is very simple compared to the fermat testing needed for Primecoin.  A mersenne number is stated as 2p-1.  In order for a mersenne number to be prime, P must also be prime.  As stated on http://Mersenne.org/various/math.php, The Lucas-Lehmer primality test is remarkably simple. It states that for P > 2, 2P-1 is prime if and only if Sp-2 is zero in this sequence: S0 = 4, SN = (SN-12 - 2) mod (2P-1).  Trust me when I say the explanation is a lot harder than the actual math involved.  To prove 27-1 is prime take 6 steps, which is 1 less that the exponent of 7.  The current record Mersenne Prime, 257,885,161-1 took 57,885,160 calculations to prove as prime, requiring 15 days on my overclocked i5-2500k (4.3GHz) compared to ~4 days on an nVidia GTX 480 or ~3.5 days on a GTX 580.

Lucas-Lehmer test itself is computationally similar to Fermat test in the sense that they both rely on modular multiplication/square. But in the case of Mersenne the modulo is of a special form 2p-1, so the modulo operation can be done through only shift and addition.
bcp19
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
August 24, 2013, 03:28:43 AM
 #43

The Lucas-Lehmer test currently available on GPU is very simple compared to the fermat testing needed for Primecoin.  A mersenne number is stated as 2p-1.  In order for a mersenne number to be prime, P must also be prime.  As stated on http://Mersenne.org/various/math.php, The Lucas-Lehmer primality test is remarkably simple. It states that for P > 2, 2P-1 is prime if and only if Sp-2 is zero in this sequence: S0 = 4, SN = (SN-12 - 2) mod (2P-1).  Trust me when I say the explanation is a lot harder than the actual math involved.  To prove 27-1 is prime take 6 steps, which is 1 less that the exponent of 7.  The current record Mersenne Prime, 257,885,161-1 took 57,885,160 calculations to prove as prime, requiring 15 days on my overclocked i5-2500k (4.3GHz) compared to ~4 days on an nVidia GTX 480 or ~3.5 days on a GTX 580.

Lucas-Lehmer test itself is computationally similar to Fermat test in the sense that they both rely on modular multiplication/square. But in the case of Mersenne the modulo is of a special form 2p-1, so the modulo operation can be done through only shift and addition.
I will admit that I don't undertand the fermat test well... my brain locks up at The triple bar, ≡, and the set membership symbol (∈).  :/

I do not suffer fools gladly... "Captain!  We're surrounded!"
I embrace my inner Kool-Aid.
Sunny King
Legendary
*
Offline Offline

Activity: 1205
Merit: 1010



View Profile WWW
August 24, 2013, 04:08:28 AM
 #44

Smiley Though I think the Lucas-Lehmer used for GIMPS is very different because Mersenne is dealing with multiplications of tens of millions of bits. I think the kernel is probably spending most of its time doing a special FFT (floating point) algorithm for big number multiplications.

For primecoin mining it only needs modular multiplication of a thousand bits.
primedigger (OP)
Member
**
Offline Offline

Activity: 75
Merit: 10


View Profile
August 24, 2013, 11:01:26 PM
 #45

Lucas–Lehmer–Riesel test is deterministic and requires that numbers are in a special form. This is both a good fit for Mersenne primes, which have that special form and deterministic means that the test is proof for primality. The class of probabilistic primality tests is usually faster, but doesn't prove primality. Fermat's test is a good fit for primecoin, because it is general purpose. Also Fermat's test should be the fastest test among all probabilistic primality tests with the price of a high false positive rate (it's still very small in practise). This is ignored in Primecoins, as all numbers passing a Fermat test in base 2 are threaded as primes.
Dumbo
Member
**
Offline Offline

Activity: 114
Merit: 10


View Profile
August 24, 2013, 11:22:06 PM
 #46

For primecoin mining it only needs modular multiplication of a thousand bits.

Unfortunately the only modular multiplication implementation on CUDA I saw involved 32-bit integers Sad 
primedigger (OP)
Member
**
Offline Offline

Activity: 75
Merit: 10


View Profile
August 26, 2013, 01:33:09 AM
 #47

Mtrlt has a new status on his miner. 2x speedup for testnet, ? for mainnet. I hope that we will see the source soon, so that I can port it over to CUDA. If we can get this to 5x-10x, it will still be a game changer! But based on my analysis, we won't see the extreme speedups which we have seen for hash based coins (atleast not without some clever tricks I'm unaware of). Sunny may have unintentionally created the best CPU-coin there currently is! Wink
cryptrol
Hero Member
*****
Offline Offline

Activity: 637
Merit: 500


View Profile
August 26, 2013, 01:54:46 PM
 #48

Mtrlt has a new status on his miner. 2x speedup for testnet, ? for mainnet. I hope that we will see the source soon, so that I can port it over to CUDA. If we can get this to 5x-10x, it will still be a game changer! But based on my analysis, we won't see the extreme speedups which we have seen for hash based coins (atleast not without some clever tricks I'm unaware of). Sunny may have unintentionally created the best CPU-coin there currently is! Wink
He reports a 2x performance gain against a phenom x6, that's not much considering that a 6990 has 2 cores .
If the final results go on these lines, it will not be better to mine on GPU than on CPU.
primedigger (OP)
Member
**
Offline Offline

Activity: 75
Merit: 10


View Profile
August 26, 2013, 04:15:43 PM
 #49

Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

Still, the sieve looks quite easy to port to the GPU and I will try to do that if my time permits it. Let's better have a definite prove that GPU sieving is ineffective right now, if that is really the case. Although, a >10 difficulty could make the sieving part more important than now, that could change the current situation.
arnuschky
Hero Member
*****
Offline Offline

Activity: 517
Merit: 501


View Profile
August 26, 2013, 08:38:06 PM
 #50

Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

hp10 didn't improve that much; the metric is wrong. We're getting pretty much the same results with hp9 than hp10. Others have confirmed this.
bcp19
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
August 26, 2013, 10:07:29 PM
 #51

Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

hp10 didn't improve that much; the metric is wrong. We're getting pretty much the same results with hp9 than hp10. Others have confirmed this.
doubling the sieve speed would not equate to a 2x speedup overall.  The deeper you end up seiving the less increased result you get from it.  Imagine the normal sieve eliminating 99% of candidates.  The added seive could then amount to 99.1% or 99.2% of all candidates. 

I do not suffer fools gladly... "Captain!  We're surrounded!"
I embrace my inner Kool-Aid.
drumaliens
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
August 29, 2013, 06:23:26 PM
Last edit: August 29, 2013, 06:50:22 PM by drumaliens
 #52

From reading all of the information out there I think I understand the basics of primecoin. Is there a worked example, using an existing block which shows how they fit together ?
patapato
Member
**
Offline Offline

Activity: 93
Merit: 10



View Profile
August 29, 2013, 10:38:21 PM
 #53

He reports a 2x performance gain against a phenom x6, that's not much considering that a 6990 has 2 cores .
If the final results go on these lines, it will not be better to mine on GPU than on CPU.

In fact the price of a Radeon HD 6990 is more than twice that of a Phenom II X6. Probably also the power consumption is 2x.
crendore
Sr. Member
****
Offline Offline

Activity: 363
Merit: 250


View Profile
September 03, 2013, 12:29:00 PM
 #54

Hey primedigger, now that mtrlt's miner is out, have you taken a look at the source? Was it what you expected?

YipYip
Hero Member
*****
Offline Offline

Activity: 574
Merit: 500



View Profile
September 03, 2013, 01:22:52 PM
 #55

I would offer 5 BTC to the first to provide me with a working GPU that the public does not know about.  This will need to be atleast 10x faster than CPU.

WOW u are cooolzz

OBJECT NOT FOUND
bcp19
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
September 07, 2013, 06:12:30 PM
 #56

Thanks, I didn't notice that he compared his speedup to a dual GPU. I wonder with what metric he compared it and if he factored in the new advances in hp-10 that brought a >2x speedup with optimisations to the sieve.

Still, the sieve looks quite easy to port to the GPU and I will try to do that if my time permits it. Let's better have a definite prove that GPU sieving is ineffective right now, if that is really the case. Although, a >10 difficulty could make the sieving part more important than now, that could change the current situation.
Since the openCL miner is in the wild now, have you had a chance too look at it and see if a CUDA miner is possible?

I do not suffer fools gladly... "Captain!  We're surrounded!"
I embrace my inner Kool-Aid.
tugvarish
Newbie
*
Offline Offline

Activity: 30
Merit: 0


View Profile
June 17, 2014, 08:06:58 PM
 #57

I know this is and old topic, but I just now read it.
I am a long time Boinc-er (using the Berkley distributed computing platform) and between the projects I participate, there is one called PrimeGrid and it is all about various prime number mathematical work, one of them is finding various kind of them.
The projects uses GPU in some of their calculation, so maybe it could interesting to look at what they are using it for and get some new ideas.
smoothie
Legendary
*
Offline Offline

Activity: 2492
Merit: 1473


LEALANA Bitcoin Grim Reaper


View Profile
June 17, 2014, 09:41:53 PM
 #58

I know this is and old topic, but I just now read it.
I am a long time Boinc-er (using the Berkley distributed computing platform) and between the projects I participate, there is one called PrimeGrid and it is all about various prime number mathematical work, one of them is finding various kind of them.
The projects uses GPU in some of their calculation, so maybe it could interesting to look at what they are using it for and get some new ideas.

As far as I am aware there are a few Primecoin GPU mining code bases to use:

Claymore and PrimeGPU

███████████████████████████████████████

            ,╓p@@███████@╗╖,           
        ,p████████████████████N,       
      d█████████████████████████b     
    d██████████████████████████████æ   
  ,████²█████████████████████████████, 
 ,█████  ╙████████████████████╨  █████y
 ██████    `████████████████`    ██████
║██████       Ñ███████████`      ███████
███████         ╩██████Ñ         ███████
███████    ▐▄     ²██╩     a▌    ███████
╢██████    ▐▓█▄          ▄█▓▌    ███████
 ██████    ▐▓▓▓▓▌,     ▄█▓▓▓▌    ██████─
           ▐▓▓▓▓▓▓█,,▄▓▓▓▓▓▓▌          
           ▐▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▌          
    ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓─  
     ²▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓╩    
        ▀▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▀       
           ²▀▀▓▓▓▓▓▓▓▓▓▓▓▓▀▀`          
                   ²²²                 
███████████████████████████████████████

. ★☆ WWW.LEALANA.COM        My PGP fingerprint is A764D833.                  History of Monero development Visualization ★☆ .
LEALANA BITCOIN GRIM REAPER SILVER COINS.
 
Pages: « 1 2 [3]  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!