cbuchner1 (OP)
|
|
April 09, 2013, 11:38:06 PM |
|
Is it the SP's available or card memory available that is a limiting factor for the CUDA miner? I ask because I have a 670 with 4GB (vs. 2GB standard) and was curious if I should pop that in and give it a try.
Memory size can be a limiting factor if you have lots of SPs (stream processors, i.e. CUDA cores) on the card. For example, on a 660 Ti with 3 GB I can go for 290x2 (which consumes some 2.5 GB of RAM), whereas on smaller cards we have to choose e.g. 148x2. The more CUDA cores a card has, the larger the grid x block value we need to throw at it to keep the SPs busy - but the memory requirement grows with it.

I have now improved my heuristics (the --no-autotune case) for such cases on compute 3.0 cards. I have no experience with the 670 or 680 cards yet, but 4 GB vs. 2 GB can make a difference in the autotune results.

Christian
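As a rough back-of-the-envelope sketch of that memory math (my own estimate, not code taken from cudaminer): assuming a launch config of BxW means B thread blocks of W warps, with 32 hashes in flight per warp, and each scrypt hash needing a 128*r*N = 128 KiB scratchpad (N=1024, r=1 for Litecoin), the footprint works out roughly as follows.

Code:
# Rough estimate of scrypt scratchpad memory for a "blocks x warps" launch
# configuration.  Assumptions (mine, not from the cudaminer source): 32 hashes
# in flight per warp, and a 128*r*N byte scratchpad per hash (N=1024, r=1).
SCRATCHPAD_BYTES = 128 * 1 * 1024  # 128 * r * N with r=1, N=1024

def launch_config_memory_gib(blocks, warps):
    """Estimated scratchpad memory in GiB for a 'blocks x warps' config."""
    concurrent_hashes = blocks * warps * 32  # 32 threads (hashes) per warp
    return concurrent_hashes * SCRATCHPAD_BYTES / 2**30

print(launch_config_memory_gib(290, 2))  # ~2.3 GiB, close to the "some 2.5 GB" quoted for 290x2
print(launch_config_memory_gib(148, 2))  # ~1.2 GiB, fits comfortably on a 2 GB card

Under those assumptions 290x2 needs roughly 2.3 GiB and 148x2 roughly 1.2 GiB, which lines up with the 3 GB 660 Ti figure above and explains why smaller cards end up at configs like 148x2.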
|
|
|
|
cbuchner1 (OP)
|
|
April 09, 2013, 11:40:00 PM Last edit: April 10, 2013, 12:07:01 AM by cbuchner1 |
|
What is the autotune using to determine the outcome? Should I try to find a suitable config and then fix that in the startup params? Or is it better if I let it autotune on each startup?
Due to measurement errors there is always a bit of randomness involved. Occasionally it will even find a surprising new configuration that suddenly beats the one you were using previously. It may also depend on the current overclocking settings (memory vs. core clock) - that's why it's called "tuning".

Thanks to everybody who is posting their launch configs here. It allowed me to improve the heuristics code for people who do not have the patience to wait through an autotune session. I am still lacking data points on compute 1.0 and 1.1 devices (8800 GTX, etc.) and also compute 1.2 - really ancient hardware. And of course, the TITAN!

Due to the good benchmark results, I just ordered myself a GTX 570 (Club 3D also, 149 Euros). Anyone got a GTX 590? That dual-GPU card could do 300 kHash/s, I guess.

Also, I posted one last binary + source code update for today, which includes the improved heuristics for --no-autotune. I found and fixed one more bug appearing on Linux on my 9600M GPU: most autotune measurements were suddenly showing 0 kHash.

Christian
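For readers wondering what the "tuning" amounts to in practice, here is a minimal sketch of the idea (not the actual cudaminer autotune code; benchmark_config() is a hypothetical stand-in for the real GPU measurement): time a short burst for every candidate launch configuration and keep the fastest, accepting that measurement noise means two runs may pick different winners.

Code:
import random

# Toy illustration of autotuning: benchmark every candidate config, keep the best.
# benchmark_config() is a made-up stand-in for the real GPU timing; the random
# factor models the measurement error that makes results vary between runs.
def benchmark_config(blocks, warps):
    true_rate = min(blocks * warps * 1.0, 300.0)    # invented saturation curve, kHash/s
    return true_rate * random.uniform(0.95, 1.05)   # +/-5% measurement noise

def autotune(candidates):
    results = {cfg: benchmark_config(*cfg) for cfg in candidates}
    best = max(results, key=results.get)
    return best, results[best]

candidates = [(b, w) for b in (32, 64, 148, 290) for w in (1, 2, 3)]
print(autotune(candidates))  # the winning config can differ from run to run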
|
|
|
|
Number6
|
|
April 10, 2013, 12:10:07 AM |
|
Memory size can be a limiting factor if you have lots of SPs (stream processors, i.e. CUDA cores) on the card.
For example, on a 660 Ti with 3 GB I can go for 290x2 (which consumes some 2.5 GB of RAM), whereas on smaller cards we have to choose e.g. 148x2.
That explains why mine didn't work on 290x2 then and picked 148x2. I thought I had a 3 GB 660 Ti, but then looked at the box and indeed it was only a 2 GB model. Would you mind explaining the math a bit more? From what you posted it seems as if I could go higher than 148x2, but not to 290x2, which you mention consumes 2.5 GB.

For others' reference:
GTX 660 Ti 2 GB
314.22 WHQL driver
GPU core OC'd to 1320 MHz
autotune launch config setting: 148x2
130-140 khash/sec with 98.8% average accepted share rate (via console)
130.8 kH/sec reported by pool (average over 10 minutes)
|
BTC: 18jdvLeM6r943eUY4DEC5B9cQZPuDyg4Zn LTC: LeBh9akQ3RwxwpUU6pJQ9YGs9PrC1Zc9BK
|
|
|
Number6
|
|
April 10, 2013, 12:15:41 AM Last edit: April 10, 2013, 12:53:42 AM by Number6 |
|
I am still lacking data points on compute 1.0 and 1.1 devices (8800 GTX, etc.) and also compute 1.2. Really ancient hardware.
I actually have an old box with an 8800 GTX in it! I will run it for a while and report my results.

Update: Autotune picked 1x4, for a hashrate of 2.4 kH/sec. I didn't bother with much more testing due to such a low rate.
|
BTC: 18jdvLeM6r943eUY4DEC5B9cQZPuDyg4Zn LTC: LeBh9akQ3RwxwpUU6pJQ9YGs9PrC1Zc9BK
|
|
|
gchil0
Newbie
Offline
Activity: 59
Merit: 0
|
|
April 10, 2013, 12:38:59 AM |
|
I couldn't get a 32-bit compile in Ubuntu 12.04 because of libcurl issues. For the 64-bit compile, I had to add -fpermissive to CXXFLAGS to get the compiler to accept a cast related to jansson. After that, I have a binary. Here's the autotune for a K20. http://pastebin.com/s9Cyb8yA
|
|
|
|
cbuchner1 (OP)
|
|
April 10, 2013, 12:46:16 AM |
|
I couldn't get a 32-bit compile in Ubuntu 12.04 because of libcurl issues. For the 64-bit compile, I had to add -fpermissive to CXXFLAGS to get the compiler to accept a cast related to jansson. After that, I have a binary. Here's the autotune for a K20. http://pastebin.com/s9Cyb8yA
Not nearly as impressive as I thought it would be. Is ECC turned off? Performance is not scaling up with core count as much as I would have hoped for. Nice to know the code does not barf on 64-bit and that shares are accepted.
|
|
|
|
Testarossa
Newbie
Offline
Activity: 11
Merit: 0
|
|
April 10, 2013, 01:29:07 AM |
|
Hello,
First of all, good work Christian. Applause!
I have a GTX 295 (2 GPUs). The first version of cudaminer performs a bit better than the latest (09/04) one - between 1 and 2 Khash/s better. Without overclocking I get about 36 Khash/s on each GPU.
I also tested bitcoin mining (not with cudaminer), and there it reached about 52 Mhash/s on each GPU. Should I be able to get more or less than I do now, litecoin-wise?
What do I have to overclock/underclock to mine efficiently?
|
|
|
|
wndrbr3d
|
|
April 10, 2013, 01:31:09 AM |
|
Hello,
First of all, good work Christian. Applause!
I have a GTX 295 (2 GPUs). The first version of cudaminer performs a bit better than the latest (09/04) one - between 1 and 2 Khash/s better. Without overclocking I get about 36 Khash/s on each GPU.
I also tested bitcoin mining (not with cudaminer), and there it reached about 52 Mhash/s on each GPU. Should I be able to get more or less than I do now, litecoin-wise?
What do I have to overclock/underclock to mine efficiently?
Bitcoin uses SHA-256, whereas Litecoin uses scrypt for proof of work. Scrypt is much more memory intensive, so it's a world of difference.
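To make the difference concrete, here is a small illustration using Python's hashlib (not miner code): Bitcoin's proof of work is a double SHA-256 of the 80-byte block header, which needs essentially no working memory, while Litecoin's is scrypt with N=1024, r=1, p=1 over the header (used as both password and salt), which walks a 128*r*N = 128 KiB scratchpad per hash. That scratchpad, multiplied by all the hashes a GPU keeps in flight, is where the memory pressure comes from.

Code:
import hashlib

header = bytes(80)  # placeholder 80-byte block header

# Bitcoin-style proof of work: double SHA-256, negligible working memory.
btc_pow = hashlib.sha256(hashlib.sha256(header).digest()).digest()

# Litecoin-style proof of work: scrypt(N=1024, r=1, p=1) with the header as
# both password and salt; each hash needs a 128*r*N = 128 KiB scratchpad.
ltc_pow = hashlib.scrypt(header, salt=header, n=1024, r=1, p=1, dklen=32)

print(btc_pow.hex())
print(ltc_pow.hex())
print("scratchpad per scrypt hash:", 128 * 1 * 1024, "bytes")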
|
|
|
|
Lacan82
|
|
April 10, 2013, 02:21:51 AM |
|
Thank god! I can finally post! No longer have to email Christian directly \o/

GTX 570 (GPU at 822 MHz): getting 165 kH/s
650M: getting 29 kH/s
On the list: a 9800 GTX to be tested.

The pool is reporting that I am mining at 190 kH/s and 41 kH/s, though.
|
|
|
|
|
gchil0
Newbie
Offline
Activity: 59
Merit: 0
|
|
April 10, 2013, 02:40:37 AM |
|
Not nearly as impressive as I thought it would be. Is ECC turned off? Performance is not scaling up with core count as much as I would have hoped for. Nice to know the code does not barf on 64bit and that shares are accepted.
Yes, ECC is disabled. The older version only got ~110 khash/sec, so the Titan version is nearly 70% faster on the K20.
|
|
|
|
omo
|
|
April 10, 2013, 02:50:56 AM Last edit: April 10, 2013, 03:02:42 AM by omo |
|
I tried the latest version on a Quadro FX 2800M card; it crashed after giving the following message:

E:\cudaminer-2013-04-09>cudaminer.exe --url http://litecoinpool.org:9332/ --user user --pass x --thread 1

*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2013-04-09 (alpha) based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 10:43:36] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 10:43:37] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 10:43:37] GPU #0: with compute capability 0.47446600
[2013-04-10 10:43:37] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 10:43:37] GPU #0: 0.00 khash/s with configuration 0x0
[2013-04-10 10:43:37] GPU #0: using launch configuration 0x0

edit: I got similar messages on a GTX 680 box (accessed via remote desktop); both machines run Win7/64:

[2013-04-10 11:01:22] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 11:01:23] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 11:01:23] GPU #0: ,跸?羨 with compute capability 10002000.10002000
[2013-04-10 11:01:23] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 11:01:23] GPU #0: 0.00 khash/s with configuration 0x0
[2013-04-10 11:01:23] GPU #0: using launch configuration 0x0
|
BTC:1Fu4TNpVPToxxhSXBNSvE9fz6X3dbYgB8q
|
|
|
Lacan82
|
|
April 10, 2013, 02:58:48 AM Last edit: April 10, 2013, 03:10:23 AM by Lacan82 |
|
I tried the latest version on a Quadro FX 2800M card; it crashed after giving the following message:

E:\cudaminer-2013-04-09>cudaminer.exe --url http://litecoinpool.org:9332/ --user user --pass x --thread 1

*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2013-04-09 (alpha) based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 10:43:36] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 10:43:37] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 10:43:37] GPU #0: with compute capability 0.47446600
[2013-04-10 10:43:37] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 10:43:37] GPU #0: 0.00 khash/s with configuration 0x0
[2013-04-10 10:43:37] GPU #0: using launch configuration 0x0

What happens if you don't specify the thread? Same thing? You can also use -D to see what autotuning does.

Never mind - it isn't even seeing your cards properly. A language issue, maybe? I see Japanese characters in one of your printouts.
|
|
|
|
omo
|
|
April 10, 2013, 03:10:11 AM |
|
I tried the latest version on a Quadro FX 2800M card; it crashed after giving the following message:

E:\cudaminer-2013-04-09>cudaminer.exe --url http://litecoinpool.org:9332/ --user user --pass x --thread 1

*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2013-04-09 (alpha) based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 10:43:36] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 10:43:37] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 10:43:37] GPU #0: with compute capability 0.47446600
[2013-04-10 10:43:37] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 10:43:37] GPU #0: 0.00 khash/s with configuration 0x0
[2013-04-10 10:43:37] GPU #0: using launch configuration 0x0

What happens if you don't specify the thread? Same thing? You can also use -D to see what autotuning does.

Thank you for your reply. If I don't specify the thread option, it won't go past the donation line. I specified -D and got slightly different messages, but it still crashed:

E:\cudaminer-2013-04-09>cudaminer.exe --url http://litecoinpool.org:9332/ --user user --pass x --thread 1 -D

*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2013-04-09 (alpha) based on pooler-cpuminer 2.2.3 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-04-10 11:07:21] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-10 11:07:22] Long-polling activated for http://litecoinpool.org:9332/LP
[2013-04-10 11:07:22] DEBUG: got new work in 1019 ms
[2013-04-10 11:07:22] GPU #0: with compute capability 532.0
[2013-04-10 11:07:22] GPU #0: Performing auto-tuning (Patience...)
[2013-04-10 11:07:22] GPU #0: 0.00 khash/s with configuration 0x0
[2013-04-10 11:07:22] GPU #0: using launch configuration 0x0
|
BTC:1Fu4TNpVPToxxhSXBNSvE9fz6X3dbYgB8q
|
|
|
Lacan82
|
|
April 10, 2013, 03:11:42 AM |
|
I saw your edit on your previous post. Is your computer set to another language? Your card name isn't even showing, so maybe it's a language issue?
Are your drivers up to date?
|
|
|
|
omo
|
|
April 10, 2013, 03:22:21 AM |
|
I saw your edit on your previous post. Is your computer set to another language? Your card name isn't even showing, so maybe it's a language issue?
Are your drivers up to date?
Yeah, my Win7 is the Chinese version. I shall check the driver. Thank you.
|
BTC:1Fu4TNpVPToxxhSXBNSvE9fz6X3dbYgB8q
|
|
|
omo
|
|
April 10, 2013, 04:32:08 AM Last edit: April 10, 2013, 06:09:06 AM by omo |
|
After upgrading to the latest driver, I got my Quadro FX 2800M running:
.......
.......
[2013-04-10 12:28:56] GPU #0: 14.12 khash/s with configuration S10x3
[2013-04-10 12:28:56] GPU #0: using launch configuration S10x3
[2013-04-10 12:28:56] GPU #0: Quadro FX 2800M, 960 hashes, 0.00 khash/s
[2013-04-10 12:28:56] GPU #0: Quadro FX 2800M, 960 hashes, 6.91 khash/s
[2013-04-10 12:28:56] LONGPOLL detected new block
[2013-04-10 12:28:56] DEBUG: got new work
[2013-04-10 12:28:56] GPU #0: Quadro FX 2800M, 7680 hashes, 12.41 khash/s
[2013-04-10 12:29:06] DEBUG: hash <= target
Hash: 00005f1b32735cc067794562b715b6dc1b42133abb3416db26e127ca7e8d59de

and the GTX 680:

[2013-04-10 14:06:19] GPU #0: 143.56 khash/s with configuration 166x2
[2013-04-10 14:06:19] GPU #0: using launch configuration 166x2
[2013-04-10 14:06:20] GPU #0: GeForce GTX 680, 10624 hashes, 0.06 khash/s
[2013-04-10 14:06:20] DEBUG: got new work in 647 ms
[2013-04-10 14:06:20] GPU #0: GeForce GTX 680, 10624 hashes, 69.53 khash/s
[2013-04-10 14:06:23] DEBUG: hash <= target
|
BTC:1Fu4TNpVPToxxhSXBNSvE9fz6X3dbYgB8q
|
|
|
datguyian
|
|
April 10, 2013, 04:41:05 AM |
|
So I was thinking: cool, I might be able to get another 50 kh/s from that workstation that's been sitting in my office for the last few months. WRONG! I'm averaging almost 400 kh/s right now from a Quadro 600... if this keeps up and doesn't burn my office down by the morning (I'm not physically at the location atm, so I'm hoping nothing is smoking right now), expect a decent donation to this project soon.
|
|
|
|
|
datguyian
|
|
April 10, 2013, 04:49:09 AM |
|
One thing I'm curious and slightly worried about (bear in mind I'm screwing around with a drafter's workstation on my company's network at this time... yeah, I know, WTF?!): the CPU is staying maxed out on the machine I'm testing this on right now. Any idea why? From what I've read, Radeons require almost no CPU usage. Why would this Quadro 600 max out a 4-core Intel Xeon CPU?
|
|
|
|
|