





mikaelh


July 17, 2013, 08:09:31 AM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.




oxfeeefeee
Member
Offline
Activity: 73
Merit: 10


July 17, 2013, 08:47:31 AM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Thanks for the answer. It's helpful even though it's a "no", because it gives miners useful information for planning their investments.




paulthetafy


July 17, 2013, 08:52:38 AM 

I think it's a fair bet that XPM has already been implemented in both OpenCL and CUDA. The implementations just haven't been publicly released yet.
PTT




oxfeeefeee
Member
Offline
Activity: 73
Merit: 10


July 17, 2013, 09:39:53 AM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
I took a very quick look at the code, and it seems that the majority of the calculation is the BN_mod_exp operation, which is r = a^p % m. While CUMP doesn't support % yet, we can still let the GPU do the a^p part. That would still be a lot faster, right? Unless there is some fast algorithm that requires doing a^p % m all together. Again, please correct me if I'm wrong.




meta.p02


July 17, 2013, 12:39:22 PM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
I took a very quick look at the code, and it seems that the majority of the calculation is the BN_mod_exp operation, which is r = a^p % m. While CUMP doesn't support % yet, we can still let the GPU do the a^p part. That would still be a lot faster, right? Unless there is some fast algorithm that requires doing a^p % m all together. Again, please correct me if I'm wrong.
Look up fast modular exponentiation. Basically, you don't want to be storing the entire a^p, because it's going to be on the order of 10^90 digits. So, after every step of exponentiation, you have to reduce it modulo m.
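To put rough numbers on why the full a^p is never stored, here is a minimal Python sketch. The values of a, p and m are hypothetical and chosen only for illustration, and the helper names are made up for this example. It estimates the size of the unreduced power and compares it with the largest intermediate value that appears when every product is reduced modulo m.

```python
import math

def estimated_decimal_digits(a: int, p: int) -> int:
    """Estimate the number of decimal digits of a**p without computing it."""
    return math.floor(p * math.log10(a)) + 1

def max_intermediate_bits(m: int) -> int:
    """Upper bound on the bit length of a product of two residues below m."""
    return ((m - 1) * (m - 1)).bit_length()

# Hypothetical sizes, chosen only for illustration.
a, p, m = 2, 10**6, 2**255 - 19

# The unreduced power grows linearly with the exponent.
print(estimated_decimal_digits(a, p))

# With reduction after every step, no intermediate exceeds twice the bit length of m.
print(max_intermediate_bits(m), 2 * m.bit_length())
```

The point of the comparison is that the reduced version only ever needs storage for about twice the size of m, no matter how large the exponent is.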




oxfeeefeee
Member
Offline
Activity: 73
Merit: 10


July 17, 2013, 01:07:17 PM 

Look up fast modular exponentiation.
Basically, you don't want to be storing the entire a^p, because it's going to be on the order of 10^90 digits. So, after every step of exponentiation, you have to reduce it modulo m.
Thanks, so this is what we are currently doing with our CPUs now?
(Initialize) Set (x; y; f) = (1; a; e).
(Loop) While f > 0, do as follows:
  - If f % 2 = 0, then replace (x; y; f) by (x; y^2 % n; f/2),
  - otherwise replace (x; y; f) by (xy % n; y; f - 1).
(Terminate) Return x.
from http://people.reed.edu/~jerry/361/lectures/bigprimes.pdf
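The (x; y; f) loop quoted from the lecture notes can be sketched in a few lines of Python. This is only an illustration of the quoted steps, not the miner's actual code; the function name `mod_exp` is made up here, and reducing a modulo n up front is a small addition that does not change the result.

```python
def mod_exp(a: int, e: int, n: int) -> int:
    """Compute a**e % n with the (x; y; f) loop, reducing after every step."""
    if n <= 0 or e < 0:
        raise ValueError("need n > 0 and e >= 0")
    # (Initialize) Set (x; y; f) = (1; a; e).
    x, y, f = 1, a % n, e
    # (Loop) While f > 0.
    while f > 0:
        if f % 2 == 0:
            # Even exponent: square the base and halve the exponent.
            y = (y * y) % n
            f //= 2
        else:
            # Odd exponent: fold one factor of y into the result.
            x = (x * y) % n
            f -= 1
    # (Terminate) Return x; the final % n only matters when n == 1.
    return x % n

print(mod_exp(2, 340, 341))
```

Every intermediate value stays below n^2, which is what makes the loop practical for large exponents.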




meta.p02


July 17, 2013, 01:12:38 PM 

Thanks, so this is what we are currently doing with our CPUs now? (Initialize) Set (x; y; f) = (1; a; e). (Loop) While f > 0, do as follows: if f % 2 = 0, then replace (x; y; f) by (x; y^2 % n; f/2); otherwise replace (x; y; f) by (xy % n; y; f - 1). (Terminate) Return x.
from http://people.reed.edu/~jerry/361/lectures/bigprimes.pdf
That's about it. Currently we need to find a way to port it to the GPU so that the GPU can run multiple copies of (1; 2; e) in parallel.
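As a rough picture of what "multiple copies of (1; 2; e) in parallel" could mean, here is a CPU-only Python sketch that advances many independent (x; y; f) states in lockstep, one "lane" per candidate, the way one GPU thread per candidate might. It assumes, as the (1; 2; e) in the post suggests, a base-2 Fermat check with e = n - 1; the function names are hypothetical and nothing here is taken from an actual GPU miner.

```python
def batch_mod_exp_base2(exponents, moduli):
    """Run one (x; y; f) = (1; 2; e) state per modulus, all advanced in lockstep."""
    xs = [1] * len(moduli)
    ys = [2 % n for n in moduli]
    fs = list(exponents)
    while any(f > 0 for f in fs):
        # On a GPU, each index i would be its own thread; here it is a plain loop.
        for i, n in enumerate(moduli):
            if fs[i] == 0:
                continue  # this lane has finished and idles
            if fs[i] % 2 == 0:
                ys[i] = (ys[i] * ys[i]) % n
                fs[i] //= 2
            else:
                xs[i] = (xs[i] * ys[i]) % n
                fs[i] -= 1
    return [x % n for x, n in zip(xs, moduli)]

def fermat_base2_survivors(candidates):
    """Keep the odd candidates n > 2 with 2**(n-1) % n == 1."""
    residues = batch_mod_exp_base2([n - 1 for n in candidates], candidates)
    return [n for n, r in zip(candidates, residues) if r == 1]

# 341 = 11 * 31 passes the base-2 test even though it is composite.
print(fermat_base2_survivors([3, 5, 7, 9, 11, 15, 341]))
```

Because each lane has a different exponent, the lanes take different branches and finish at different times; on real GPU hardware that kind of divergence is one of the things a port would have to deal with.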




oxfeeefeee
Member
Offline
Activity: 73
Merit: 10


July 17, 2013, 02:11:53 PM 

Thanks, so this is what we are currently doing with our CPUs now? (Initialize) Set (x; y; f) = (1; a; e). (Loop) While f > 0, do as follows: if f % 2 = 0, then replace (x; y; f) by (x; y^2 % n; f/2); otherwise replace (x; y; f) by (xy % n; y; f - 1). (Terminate) Return x.
from http://people.reed.edu/~jerry/361/lectures/bigprimes.pdf
That's about it. Currently we need to find a way to port it to the GPU so that the GPU can run multiple copies of (1; 2; e) in parallel.
It seems this is a very well-studied problem. A search turned up a lot of papers about it, e.g. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104.7236&rep=rep1&type=pdf
So now again I'm confident that a GPU miner is coming soon.




oxfeeefeee
Member
Offline
Activity: 73
Merit: 10


July 17, 2013, 04:31:59 PM 

Another paper with a nice table, gives you some idea on how things will go. http://trone.di.fc.ul.pt/images/e/e2/ASAP11paper.pdf

Metric               GTS8800 [17]  GTX8800 [10]  GTX260 (This paper)  GTX580 [39]  Intel W3565 [46]  AMD Phenom II 1090T [46]
Cores                112           128           192                  512          4                 6
Frequency (MHz)      1188          1350          1294                 1544         3200              3200
Price (USD)          250           173           100                  500          300               200
TDP (W)              150           155           202                  244          130               125
GFLOPS               399           518           715                  1581         102               153
Modexp/s             6504          11074         41426                149464       32608             77002
Modexp/s (scaled)    13052         15282         41426                46973        N/A               N/A
Modexp/s/W           43            71            205                  612          250               616
Modexp/s/USD         26            64            414                  298          131               385

Table I: Comparisons of modular exponentiation performances on various CPU and GPU implementations.




solracx


July 17, 2013, 04:43:05 PM 

Another paper with a nice table, gives you some idea on how things will go. http://trone.di.fc.ul.pt/images/e/e2/ASAP11paper.pdf
Looking at the numbers, it looks like the GPU is only around 2x the CPU???
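The "around 2x" reading can be checked directly from the Modexp/s, TDP and price rows of Table I. The short Python sketch below only redoes that arithmetic with the figures copied from the table; the dictionary and function names are made up for the example.

```python
# Figures copied from Table I of the quoted paper.
modexp_per_s = {"GTX580": 149464, "Intel W3565": 32608, "AMD Phenom II 1090T": 77002}
tdp_watts = {"GTX580": 244, "Intel W3565": 130, "AMD Phenom II 1090T": 125}
price_usd = {"GTX580": 500, "Intel W3565": 300, "AMD Phenom II 1090T": 200}

def speedup(device: str, baseline: str) -> float:
    """Raw throughput of device relative to baseline."""
    return modexp_per_s[device] / modexp_per_s[baseline]

def per_watt(device: str) -> float:
    """Modular exponentiations per second per watt of TDP."""
    return modexp_per_s[device] / tdp_watts[device]

def per_usd(device: str) -> float:
    """Modular exponentiations per second per dollar of list price."""
    return modexp_per_s[device] / price_usd[device]

for cpu in ("Intel W3565", "AMD Phenom II 1090T"):
    print(cpu, round(speedup("GTX580", cpu), 2))
for device in modexp_per_s:
    print(device, int(per_watt(device)), int(per_usd(device)))
```

Against the six-core Phenom the fastest GPU in the table comes out at a little under 2x, and the per-watt figures are nearly equal, while against the four-core Intel part the gap is larger.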




mustyoshi


July 17, 2013, 04:57:30 PM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Division is just subtraction with a counter.




markm
Legendary
Offline
Activity: 2674
Merit: 1041


July 17, 2013, 05:01:21 PM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Division is just subtraction with a counter.
Unless you like to optimise it, maybe by using some shifts or whatever (albeit maybe with counters involved in that part too?).
I have often wondered whether the Trachtenberg Speed System of Basic Mathematics is any faster on machines than other approaches. Maybe it is more useful for base ten than binary, though?
MarkM
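To make the two ideas in this exchange concrete, here is a Python sketch of division as "subtraction with a counter" next to the shift-based variant (ordinary binary long division). Both function names are made up for illustration, and neither is taken from CUMP or from any miner.

```python
def divide_by_counting(n: int, d: int):
    """Division as repeated subtraction with a counter: about n/d iterations."""
    if d <= 0 or n < 0:
        raise ValueError("need n >= 0 and d > 0")
    q = 0
    while n >= d:
        n -= d
        q += 1
    return q, n  # (quotient, remainder)

def divide_by_shift_subtract(n: int, d: int):
    """Binary long division: one shift and at most one subtraction per bit of n."""
    if d <= 0 or n < 0:
        raise ValueError("need n >= 0 and d > 0")
    q, r = 0, 0
    for bit in range(n.bit_length() - 1, -1, -1):
        r = (r << 1) | ((n >> bit) & 1)  # bring down the next bit of n
        q <<= 1
        if r >= d:
            r -= d
            q |= 1
    return q, r

print(divide_by_counting(1000, 7), divide_by_shift_subtract(1000, 7))
```

The counting version needs as many iterations as the quotient is large, which is hopeless for numbers with hundreds of bits; the shift version needs only as many iterations as n has bits.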




oxfeeefeee
Member
Offline
Activity: 73
Merit: 10


July 17, 2013, 05:06:45 PM Last edit: July 17, 2013, 05:19:25 PM by oxfeeefeee 

Another paper with a nice table, gives you some idea on how things will go. http://trone.di.fc.ul.pt/images/e/e2/ASAP11paper.pdf
Looking at the numbers, it looks like the GPU is only around 2x the CPU???
Yes, it seems so, and the performance per watt is basically at the same level. But you can't just plug 6 AMD CPUs into a single rig like you would with GPUs. I'm guessing this is because CPUs have AVX, which is 256-bit, and that makes them very good at dealing with big numbers compared to GPUs, which only support 64-bit natively.




iCEBREAKER
Legendary
Offline
Activity: 2156
Merit: 1070
Crypto is the separation of Power and State.


July 17, 2013, 06:01:58 PM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Division is just subtraction with a counter.


  



AstroKev
Newbie
Offline
Activity: 23
Merit: 0


July 18, 2013, 01:21:48 AM 

Here is a few-year-old example in which an implementation of ECM was developed and compared against a standard CPU; the result was roughly 2x faster. I know we're not talking about ECM here, but again it's suggestive of what one might expect. http://eecm.cr.yp.to/gpuecm20090127.pdf
Modular arithmetic is very easy to implement with the four basic arithmetic functions, so I'm not sure what the holdup is around that?




markm
Legendary
Offline
Activity: 2674
Merit: 1041


July 18, 2013, 01:26:46 AM 

If a GPU is only twice as fast as a CPU, maybe putting more CPUs on one's motherboard might be more cost-effective than putting one or more GPUs in a machine?
Or think blade servers, with two or more CPUs per blade: would GPUs really be more cost-effective?
MarkM




horeaper
Newbie
Offline
Activity: 49
Merit: 0


July 18, 2013, 01:39:21 AM 

I think XPM is a good addition to current GPU rigs, now your CPU won't be sleeping all the time, right?




mustyoshi


July 18, 2013, 01:40:56 AM 

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Division is just subtraction with a counter.
That's beautiful. :')




AstroKev
Newbie
Offline
Activity: 23
Merit: 0


July 18, 2013, 01:54:12 AM 

Agreed that XPM mining is a good addition to BTC/LTC mining! Except for the people who bought the cruddiest processors they could find when building their rigs...
Even though only ADD/SUB/MUL are in CUMP, one might be able to use a binomial method (guess/check/converge) to implement DIV without much difficulty, though I don't know how well the operation would perform.
There are quite a few factoring algos/sieves on GitHub using CUDA, though my cursory glance showed only non-arbitrary-precision routines. I thought floating point arithmetic has no place in integer factorization?!
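One way to read the guess/check/converge idea is a bisection on the quotient: guess q, check q * d against n with a multiplication, and narrow the range until it converges. The Python sketch below is only that reading, not anything from CUMP; the function names are hypothetical, and it assumes a comparison and a halving step (a one-bit shift) are available alongside ADD/SUB/MUL.

```python
def divide_by_guessing(n: int, d: int):
    """Find q with q*d <= n < (q+1)*d by bisection, using only +, -, *, compare and halving."""
    if d <= 0 or n < 0:
        raise ValueError("need n >= 0 and d > 0")
    lo, hi = 0, n + 1  # invariant: lo*d <= n < hi*d
    while hi - lo > 1:
        mid = (lo + hi) >> 1  # the guess (halving is assumed to be available)
        if mid * d <= n:      # the check, done with one multiplication
            lo = mid
        else:
            hi = mid
    return lo, n - lo * d  # (quotient, remainder)

def mod_by_guessing(n: int, m: int) -> int:
    """The % operation falls out of the quotient as n - q*m."""
    return divide_by_guessing(n, m)[1]

print(divide_by_guessing(1000, 7), mod_by_guessing(2**100, 97))
```

This takes roughly one multiplication per bit of n, so it would work, but whether it performs well enough on a GPU is exactly the open question raised in the post.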




