Bitcoin Forum
Topic: [XPM] A GPU miner for XPM is around the corner? (via CUMP)
oxfeeefeee
July 17, 2013, 05:59:21 AM
 #1

As you know, mikaelh's build uses GMP for the BigNum calculations, which gives a huge performance boost.

I googled "GMP GPU" and found CUMP (http://www.hpcs.cs.tsukuba.ac.jp/~nakayama/cump/), which is basically a CUDA version of GMP. So it seems reasonable to assume that a CUDA build would be fairly easy to make. According to http://www.hpcs.cs.tsukuba.ac.jp/~nakayama/cump/index.php?CUMP%20Performance%20Evaluation, though, the gain wouldn't be as dramatic as the GPU revolution that happened to the other coins.

Has anyone worked with the CUMP library before? Are there any blockers for this? Or is someone already working on it? :)
mikaelh
July 17, 2013, 08:09:31 AM
 #2

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
oxfeeefeee
July 17, 2013, 08:47:31 AM
 #3

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.

Thanks for the answer. It's helpful even if it's a "no", because it gives miners useful information for planning their investments.
paulthetafy
July 17, 2013, 08:52:38 AM
 #4

I think it's a fair bet that XPM has already been implemented in both OpenCL and CUDA. The implementations just haven't been publicly released yet.

PTT
oxfeeefeee
July 17, 2013, 09:39:53 AM
 #5

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.

I took a very quick look at the code, and it seems that the majority of the calculation is the BN_mod_exp operation, which is r = a^p % m. While CUMP doesn't support % yet, we could still let the GPU do the a^p part. That would still be a lot faster, right? Unless there is some fast algorithm that requires doing a^p % m all together.

Again, please correct me if I'm wrong.
meta.p02
July 17, 2013, 12:39:22 PM
 #6

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.

Did a very quick look at the code, it seems that the majority of calculation is BN_mod_exp operation, which is r=a^p%m. while CUMP doesn't support % yet, we can still let GPU do the a^p part. that would still be a lot more faster right? unless there is some fast algorithm that requires we do a^p%m altogether.

again, please correct me if I'm wrong.

Look up fast modular exponentiation.

Basically, you don't want to store the entire a^p, because it's going to be on the order of 10^90 digits. So after every step of the exponentiation, you have to reduce the result modulo m.
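
To put rough numbers on this, here is a minimal Python sketch. It is not taken from any miner: the base 3, the modulus 2^127 - 1, and the helper name bit_growth are illustrative assumptions. It compares the size of a^p computed in full with the size of the reduced result from Python's built-in three-argument pow, which reduces as it goes.

```python
def bit_growth(a: int, p: int, m: int) -> tuple[int, int]:
    """Return (bit length of a**p computed in full, bit length of a**p mod m)."""
    full = a ** p           # grows roughly as p * log2(a) bits
    reduced = pow(a, p, m)  # built-in modular exponentiation; result stays below m
    return full.bit_length(), reduced.bit_length()


if __name__ == "__main__":
    m = (1 << 127) - 1  # illustrative modulus only, not a Primecoin parameter
    for p in (10, 1_000, 100_000):
        full_bits, reduced_bits = bit_growth(3, p, m)
        print(f"p={p}: full={full_bits} bits, reduced={reduced_bits} bits")
```

The full power keeps growing with p, while the reduced value never needs more bits than m itself. With exponents as large as the candidate numbers, the unreduced power could not be stored at all.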

oxfeeefeee
July 17, 2013, 01:07:17 PM
 #7

Look up fast modular exponentation.

Basically, you don't want to be storing the entire a^p, because it's going to be on the order of 10^90 digits. So, after every step of exponentation, you have to reduce it modulo m.

Thanks, so is this what we are currently doing on our CPUs?
Code:
 (Initialize) Set (x; y; f) = (1; a; e).
 (Loop) While f > 0, do as follows:
    { If f%2 = 0 then replace (x; y; f) by (x; y^2 %n; f/2),
    { otherwise replace (x; y; f) by (xy%n; y; f-1).
 (Terminate) Return x
from http://people.reed.edu/~jerry/361/lectures/bigprimes.pdf
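
For reference, the pseudocode above translates almost line for line into Python. This is a sketch of the textbook square-and-multiply loop only, not the code path any actual miner uses (those call BN_mod_exp or GMP); the function name modexp is an illustrative choice.

```python
def modexp(a: int, e: int, n: int) -> int:
    """Compute a**e mod n using the (x; y; f) square-and-multiply loop."""
    if n <= 0 or e < 0:
        raise ValueError("need n > 0 and e >= 0")
    x, y, f = 1 % n, a % n, e          # (Initialize)
    while f > 0:                       # (Loop)
        if f % 2 == 0:
            y, f = (y * y) % n, f // 2  # square, halve the exponent
        else:
            x, f = (x * y) % n, f - 1   # multiply into the result
    return x                           # (Terminate)


if __name__ == "__main__":
    # Cross-check against Python's built-in modular exponentiation.
    print(modexp(2, 10, 1000), pow(2, 10, 1000))
```

Every intermediate value is reduced modulo n, so nothing larger than about n^2 is ever held in memory.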
meta.p02
July 17, 2013, 01:12:38 PM
 #8


Thanks,so this is what we are currently doing with our CPUs now?
Code:
 (Initialize) Set (x; y; f) = (1; a; e).
 (Loop) While f > 0, do as follows:
    { If f%2 = 0 then replace (x; y; f) by (x; y^2 %n; f/2),
    { otherwise replace (x; y; f) by (xy%n; y; f-1).
 (Terminate) Return x
from http://people.reed.edu/~jerry/361/lectures/bigprimes.pdf

That's about it. Currently we need to find a way to port it to the GPU so that the GPU can run multiple copies of (1; 2; e) in parallel.
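
A rough sketch of what "multiple copies of (1; 2; e)" could look like, assuming the check being run is a base-2 Fermat test with e = n - 1 (that reading, and the function names, are assumptions, not taken from the miner source). Each candidate is independent of the others, so on a GPU each one could go to its own thread; the list comprehension below is only a serial stand-in for that.

```python
def fermat_base2(n: int) -> bool:
    """Base-2 Fermat test: True if 2**(n-1) mod n == 1 (n is a probable prime)."""
    return n > 2 and pow(2, n - 1, n) == 1


def batch_fermat(candidates: list[int]) -> list[bool]:
    """Test every candidate independently; no candidate depends on another."""
    return [fermat_base2(n) for n in candidates]


if __name__ == "__main__":
    print(batch_fermat([7, 9, 11, 15, 341]))
```

Note that 341 = 11 * 31 passes this test even though it is composite, so a pass only means "probable prime".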

oxfeeefeee
July 17, 2013, 02:11:53 PM
 #9



That's about it. Currently we need to find a way to port it to the GPU so that the GPU can run multiple copies of (1; 2; e) in parallel.


This seems to be a very well-studied problem. A search turned up a lot of papers about it, e.g. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104.7236&rep=rep1&type=pdf
So now I'm again confident that a GPU miner is coming soon.
oxfeeefeee
July 17, 2013, 04:31:59 PM
 #10

Here is another paper with a nice table that gives some idea of how things will go.

http://trone.di.fc.ul.pt/images/e/e2/ASAP11-paper.pdf

                   GTS8800 [17]  GTX8800 [10]  GTX260 (this paper)  GTX580 [39]  Intel W3565 [46]  AMD Phenom II 1090T [46]
Cores                       112           128                  192          512                 4                         6
Frequency (MHz)            1188          1350                 1294         1544              3200                      3200
Price (USD)                 250           173                  100          500               300                       200
TDP (W)                     150           155                  202          244               130                       125
GFLOPS                      399           518                  715         1581               102                       153
Modexp/s                   6504         11074                41426       149464             32608                     77002
Modexp/s (scaled)         13052         15282                41426        46973               N/A                       N/A
Modexp/s/W                   43            71                  205          612               250                       616
Modexp/s/USD                 26            64                  414          298               131                       385

Table I: Comparisons of modular exponentiation performances on various CPU and GPU implementations.
solracx
July 17, 2013, 04:43:05 PM
 #11

Another paper with a nice table, gives you some idea on how things will go.

http://trone.di.fc.ul.pt/images/e/e2/ASAP11-paper.pdf


Looking at the numbers, it looks like the GPU is only around 2x the CPU?
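
The rough 2x figure can be checked directly from the Modexp/s row of the table. A small Python sketch, using only the numbers quoted in the table (the dictionary and function names are just for illustration):

```python
# Modexp/s values copied from Table I of the linked paper.
MODEXP_PER_S = {
    "GTX580": 149464,
    "Intel W3565": 32608,
    "AMD Phenom II 1090T": 77002,
}


def speedup(gpu: str, cpu: str) -> float:
    """Ratio of raw Modexp/s throughput between two devices in the table."""
    return MODEXP_PER_S[gpu] / MODEXP_PER_S[cpu]


if __name__ == "__main__":
    for cpu in ("AMD Phenom II 1090T", "Intel W3565"):
        print(f"GTX580 vs {cpu}: {speedup('GTX580', cpu):.2f}x")
```

So the fastest GPU in the table is a bit under 2x the Phenom II and roughly 4.6x the W3565 in raw throughput. The "scaled" row of the same table narrows the GTX580 figure considerably.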

mustyoshi
July 17, 2013, 04:57:30 PM
 #12

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Division is just subtraction with a counter.
markm
July 17, 2013, 05:01:21 PM
 #13

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Division is just subtraction with a counter.

Unless you'd like to optimise it, maybe by using some shifts or whatever (albeit maybe with counters involved in that part too?).
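
To illustrate the difference, here is a minimal Python sketch of both ideas for non-negative integers: division as literal repeated subtraction with a counter, and the usual shift-and-subtract (binary long division) variant. The function names are illustrative, and neither is claimed to be what GMP or CUMP does internally.

```python
def div_by_subtraction(a: int, b: int) -> tuple[int, int]:
    """Quotient and remainder by repeated subtraction: about a // b iterations."""
    if a < 0 or b <= 0:
        raise ValueError("need a >= 0 and b > 0")
    q = 0
    while a >= b:
        a -= b
        q += 1
    return q, a


def div_shift_subtract(a: int, b: int) -> tuple[int, int]:
    """Binary long division: one shift and at most one subtraction per bit of a."""
    if a < 0 or b <= 0:
        raise ValueError("need a >= 0 and b > 0")
    q, r = 0, 0
    for i in range(a.bit_length() - 1, -1, -1):
        r = (r << 1) | ((a >> i) & 1)  # bring down the next bit of a
        q <<= 1
        if r >= b:                     # divisor fits: subtract and set quotient bit
            r -= b
            q |= 1
    return q, r


if __name__ == "__main__":
    print(div_by_subtraction(100, 7), div_shift_subtract(100, 7))
```

The first version needs a number of iterations proportional to the quotient itself, which is hopeless for numbers hundreds of bits long. The second needs one iteration per bit.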

I have often wondered if the Trachtenberg Speed System of basic mathematics is any faster on machines than other approaches? Maybe more useful for base ten than binary though?

-MarkM-

oxfeeefeee
July 17, 2013, 05:06:45 PM
Last edit: July 17, 2013, 05:19:25 PM by oxfeeefeee
 #14

Another paper with a nice table, gives you some idea on how things will go.

http://trone.di.fc.ul.pt/images/e/e2/ASAP11-paper.pdf


looking at the numbers, looks like GPU is only around 2x CPU???

Yes, it seems so, and the performance per watt is basically at the same level. But you can't just plug 6 AMD CPUs into a single rig the way you would with GPUs.

I'm guessing this is because CPUs have AVX, which is 256-bit, and that makes them very good at dealing with big numbers compared to GPUs, which can only support 64-bit natively.
iCEBREAKER
July 17, 2013, 06:01:58 PM
 #15

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Quote

Division is just subtraction with a counter.


AstroKev
July 18, 2013, 01:21:48 AM
 #16

Here is an example from a few years ago in which a GPU implementation of ECM was developed and compared against a standard CPU implementation; the result was roughly 2x faster. I know we're not talking about ECM here, but again it's suggestive of what one might expect.

http://eecm.cr.yp.to/gpuecm-20090127.pdf

Modular arithmetic is very easy to implement with the four basic arithmetic functions, so I'm not sure what the holdup is around that?
markm
July 18, 2013, 01:26:46 AM
 #17

If a GPU is only twice as fast as a CPU, maybe putting more CPUs on one's motherboard might be more cost-effective than putting one or more GPUs in a machine?

Or think of blade servers with two or more CPUs per blade: would GPUs really be more cost-effective?

-MarkM-

horeaper
July 18, 2013, 01:39:21 AM
 #18

I think XPM is a good addition to current GPU rigs, now your CPU won't be sleeping all the time, right?
mustyoshi
July 18, 2013, 01:40:56 AM
 #19

I guess you failed to notice that CUMP only supports addition, subtraction and multiplication. You need division and other special functions for Primecoin mining.
Quote

Division is just subtraction with a counter.

That's beautiful. :')
AstroKev
July 18, 2013, 01:54:12 AM
 #20

Agreed that XPM mining is a good addition to BTC/LTC mining! Except for the people who bought the cruddiest processors they could find when building their rigs...

Even though only ADD/SUB/MUL are in CUMP, one might be able to use a binomial method (guess/check/converge) to implement DIV without much difficulty though I don't know how well the operation would perform.
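
One way to read "guess/check/converge" is a bisection search for the quotient, which is an assumption about what was meant here. The minimal Python sketch below uses only addition, subtraction, multiplication and a one-bit right shift for the halving step; the function name is illustrative and nothing in it is CUMP's actual API.

```python
def div_by_bisection(a: int, b: int) -> tuple[int, int]:
    """Find q with q*b <= a < (q+1)*b by bisection, then r = a - q*b."""
    if a < 0 or b <= 0:
        raise ValueError("need a >= 0 and b > 0")
    lo, hi = 0, a + 1          # invariant: lo*b <= a < hi*b
    while hi - lo > 1:
        mid = (lo + hi) >> 1   # guess (halving is a shift, not a division)
        if mid * b <= a:       # check, using one multiplication
            lo = mid
        else:
            hi = mid           # converge on the quotient
    return lo, a - lo * b


if __name__ == "__main__":
    print(div_by_bisection(100, 7))
```

This costs one big multiplication per bit of the dividend, so it would likely be much slower than a dedicated division routine, which fits the doubt about performance expressed above.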

There are quite a few factoring algos/sieves on GitHub using CUDA, though at a cursory glance they appeared to use fixed-precision rather than arbitrary-precision routines. I thought floating point arithmetic has no place in integer factorization?!