chapmanjw
Newbie
Offline
Activity: 53
Merit: 0
|
 |
February 04, 2014, 10:38:28 PM |
|
I posted a new release 2014-02-04 fixing two important bugs.
- autotune underreporting kHash/s values if the kernel finished in under 50ms (forgot to divide time elapsed by number of measurements, doh!)
- Multi-GPU support was not working - it is now.
I've posted the Mac OS X binaries of the 2014-02-04 release for 10.6, 10.7, 10.8, and 10.9 here: http://www.johnchapman.net/crypto-currency/cudaminer-2-4-2014-release-now-available-for-os-x-10-6-10-7-10-8-and-10-9/
With this release I am seeing about a 15% increase in performance from my GT 650M. Went from about 65 kh/s to 75 kh/s. Well done! Thanks.
|
|
|
|
lordaccess
Member

Offline
Activity: 69
Merit: 10
|
 |
February 04, 2014, 10:40:23 PM |
|
Any idea what settings to use for Vertcoin on a GTX 780? I am getting 380 total from two cards with autotune (which was the best of all the switches I tried). I have no idea about Vertcoin!
|
|
|
|
eLeMe
Newbie
Offline
Activity: 37
Merit: 0
|
 |
February 04, 2014, 11:10:28 PM |
|
On a more positive note: New build working great!
EVGA GTX660 SC w/slight oc (+71 engine, 110% power)
240Kh/s - scrypt (-l K5x32 -i 0 -H 1 -m 1 -C 1)
2.65Kh/s - scrypt-jane:YAC (-b 32768 -L 3 -l K59x3 -s 120 -i 0 -C 0 -m 0 -H 2)
Not sure why, but I tried my YAC config with an older commit_133 build, and I'm now getting Pi-Kh/s! (3.14Kh/s) I'm surprised at how much more coin I'm getting with just the little bump in hash rate. Haven't tested the 2/4/14 build yet.
And props to bathrobehero for my YAC config! Couldn't have done it without you.
Note: my 660 has 2 monitors plugged into it.
|
|
|
|
relm9
|
 |
February 04, 2014, 11:16:06 PM |
|
Any idea what settings to use for Vertcoin on a GTX 780? I am getting 380 total from two cards with autotune (which was the best of all the switches I tried). I have no idea about Vertcoin!
I get 250 kH/s per card with T12x20, but this setting only works well under Linux. In Windows I can only get up to 190 kH/s per card, about the same as you.
|
|
|
|
AizenSou
|
 |
February 04, 2014, 11:24:51 PM |
|
So I broke something. Hopefully someone can enlighten me as to where my problem lies (other than in the chair).
Please keep in mind I make no claim to knowing what the hell I'm doing, but there's only one way to learn, right?
2 GTX 670s, not in SLI, separate .bat files for each, x64 cudaminer.
On VTC, K7x32 was producing ~133kh per card. After tinkering with some different kernels I received CUDA error 30 and the display driver crashed. Rebooted and went back to K7x32, but that now produced CUDA error 30 and a crash. Reinstalled CUDA and video drivers and was able to use K7x32 with the same results (133 per card). Once again tinkered with different kernels (using the WHQL driver instead of the beta this time to see if there was any difference) and received CUDA error 30 again, but this time a CUDA/driver reinstall didn't fix the issue.
Currently running K7x20 with 155kh on one card and 135 on the other, but what did I break exactly, and how?
Maybe the x32 config is at the limits of what the WDDM graphics card driver will allow you to allocate. Sometimes it works, and sometimes it doesn't. Leave out any -m1 or -C 1/2 options or reduce the x32 to something slightly smaller, e.g. x30, x28.
Hi Christian,
Could you give me some advice on how to fine-tune Kepler and Titan cards for mining low-factor scrypt-jane coins? For example, with Vertcoin N=10 I got:
Fermi GTX580: 130khs
Kepler GTX680: 150khs
Titan GTX780: 210khs
All of those were from autotune because I don't know how to tune a card for scrypt-jane. The b x w product should be lower than the total warps, but the number of warps is very different even with cards from the same manufacturer, right? There is a lot that I still don't understand.
Thanks in advance, Christian. Good night and best regards ("Gute Nacht und besten Gruesse"). I assume that you are German.
|
|
|
|
steking
Member

Offline
Activity: 67
Merit: 10
|
 |
February 05, 2014, 12:15:59 AM |
|
gtx 650 ti i get 153kh/s
cudaminer.exe -D -o 127.0.0.1:12588 -u u -p p -i 0 -H 1 -m 1 -C 1 -s 1
|
|
|
|
AizenSou
|
 |
February 05, 2014, 12:38:54 AM |
|
Any idea what settings to use for Vertcoin on a GTX 780? I am getting 380 total from two cards with autotune (which was the best of all the switches I tried). I have no idea about Vertcoin!
I get 250 kH/s per card with T12x20, but this setting only works well under Linux. In Windows I can only get up to 190 kH/s per card, about the same as you.
Amazing. Thank you. I went from 220khs to 280khs with your kernel setting. Autotune is very clumsy.
|
|
|
|
steking
Member

Offline
Activity: 67
Merit: 10
|
 |
February 05, 2014, 01:22:23 AM |
|
How do I mine Vertcoin with the latest cudaminer?
|
|
|
|
tron666
Member

Offline
Activity: 112
Merit: 10
|
 |
February 05, 2014, 01:24:49 AM |
|
With a GTX 560 Ti I'm getting about 145 kH/s vs 160 kH/s before.
|
CCO MNVPaetsHpxr97mRDqqPuV6PQoSVbFgPVE NEM-test TBZXHE-TD6AO6-PHSFZL-SZ7MWS-JEEI7C-EFCUC2-7Y7V LTC LcAQUMNhqDYesRRYMxMAsE5rhAAseDMDp7 XPM AHDtLd993oYke4Zrm5dDG5WGtgyaUaMTCK NXT 16706883867271464458 DOGE DBGiKBD1HZ8yfTdTcX5m8T7mY4X4cUVnEz
|
|
|
|
SystmHash
Newbie
Offline
Activity: 37
Merit: 0
|
 |
February 05, 2014, 04:33:57 AM |
|
Vertcoin on GTX780 with core overclocked to 1254 MHz = ~335kh/s
cudaminer.exe --algo=scrypt:2048 -d 0 -H 2 -C 1 -l T12x20 -i 0
GTX 670 overclocked to 1200 MHz (using CPU for sha256) = ~155kh/s
cudaminer.exe --algo=scrypt:2048 -d 1 -H 1 -l K35x6 -i 0
Use these!
Donations welcome! VTC = VouDyTwF5QNvmGukvdHgsZNwtJKjcY6aRD
|
|
|
|
Sunsparc
Newbie
Offline
Activity: 1
Merit: 0
|
 |
February 05, 2014, 04:35:55 AM |
|
A brave tester with 8 Fermi cards Tesla M2090 (thanks Choseh) just figured out the performance regression between 2013-12-18 and 2014-02-02. If you change the #if 0 in the fermi_kernel.cu to #if 1 (thereby enabling the previous version of the Salsa20/8 round function) you should see the previous performance figures again. Those who can compile the code themselves and want to mine on Fermi are welcome to make this change themselves. EDIT: False alarm apparently. My tester cannot reproduce this now
also there seems to be a bug in the autotuning code in salsa_kernel.cu
hash_sec = (double)WU_PER_LAUNCH / tdelta;
should very likely be
hash_sec = (double)WU_PER_LAUNCH * repeat / tdelta;
to factor in the number of repetitions in the measurement (we want to measure for 50ms minimum for better timer accuracy). So autotune was drunk after all!
So, it seems I should release fixes (new binary release) for these problems tonight.
Christian
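The fix quoted above can be illustrated with a small Python sketch. This is not cudaminer's actual code; the function names and the sample numbers are made up for illustration. The idea is that when a launch is repeated several times so the measurement lasts at least 50 ms, the elapsed time covers all repetitions, so the work count has to be multiplied by the number of repetitions:

```python
def hash_rate_buggy(wu_per_launch, repeat, tdelta):
    """Old formula: ignores that tdelta spans `repeat` launches."""
    return wu_per_launch / tdelta


def hash_rate_fixed(wu_per_launch, repeat, tdelta):
    """Corrected formula: counts the work done by every repeated launch."""
    return wu_per_launch * repeat / tdelta


# Hypothetical measurement: 4096 hashes per launch, 5 launches, 0.05 s in total.
wu, repeat, tdelta = 4096, 5, 0.05
buggy = hash_rate_buggy(wu, repeat, tdelta)
fixed = hash_rate_fixed(wu, repeat, tdelta)
print(f"buggy: {buggy / 1000:.1f} kH/s, fixed: {fixed / 1000:.1f} kH/s")
```

With a single launch (repeat = 1) both formulas agree, which fits the release note that only kernels finishing in under 50ms were underreported.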
I've been experiencing problems with the 2-2 and 2-4 releases; both dropped my Kh/s by about 20-30 on my GTX 560 Ti (Fermi). I've been using the 12-18 release to maximize my hashing power. Here's my config:
cudaminer.exe --no-autotune -O user.worker:pass -o stratum+tcp://pool.com:3333 -C 1 -i 0 -H 1 -l F8x16
I have noticed that the newer releases report the maximum warps as 209, whereas 12-18 shows the maximum warps as 211. I did a benchmark on 2-4 with the -C 1, -H 1 and -i 0 flags included, which gave me a config of F32x4. According to all of the resources I've read, F8x16 is the maximum my card can handle before giving CPU validation errors.
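As a rough sanity check on those numbers: going by the "b x w" notation used elsewhere in this thread, a launch config is a kernel letter followed by blocks x warps, so F8x16 and F32x4 both ask for 128 warps, below the reported maximum of 209 (or 211). A minimal Python sketch of that arithmetic follows; the helper name `total_warps` is made up here, and it only multiplies the two numbers, it does not query the GPU:

```python
import re


def total_warps(launch_config):
    """Return blocks * warps for a config string such as 'F8x16'."""
    m = re.fullmatch(r"([A-Za-z])(\d+)x(\d+)", launch_config)
    if m is None:
        raise ValueError(f"unrecognised launch config: {launch_config!r}")
    return int(m.group(2)) * int(m.group(3))


MAX_WARPS = 209  # value reported by the newer releases in the post above

for cfg in ("F8x16", "F32x4"):
    warps = total_warps(cfg)
    print(cfg, warps, "fits" if warps <= MAX_WARPS else "too large")
```

Both configs use the same number of warps, so any speed difference between them would come from how the warps are grouped into blocks rather than from the total.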
|
|
|
|
tabbek
Member

Offline
Activity: 116
Merit: 10
|
 |
February 05, 2014, 04:42:52 AM Last edit: February 05, 2014, 04:58:06 AM by tabbek |
|
for what it's worth: ultracoin
cudaminer -C 2 -H 2 -l <change_me> -o xxxx -u xxxx -p xxxx --algo=scrypt-jane:UTC
card     | launch config (hash rate) | warps
gtx 570  | f30x32 (173-193 Kh/s)     |
old 8500 | f2x7 (2.5 Kh/s)           |
gtx 570  | F15x16 (193-223 Kh/s)     | 260
old 8500 | F6x4 (2.1 Kh/s)           | 105
*gtx 570 with F15x16 occasionally drops to ~114 for a few sec.
*added warps from F set.
|
|
|
|
RandomNobody
Newbie
Offline
Activity: 10
Merit: 0
|
 |
February 05, 2014, 04:50:41 AM |
|
Vertcoin on GTX780 with core overclocked to 1254 MHz = ~335kh/s
cudaminer.exe --algo=scrypt:2048 -d 0 -H 2 -C 1 -l T12x20 -i 0
GTX 670 overclocked to 1200 MHz (using CPU for sha256) = ~155kh/s
cudaminer.exe --algo=scrypt:2048 -d 1 -H 1 -l K35x6 -i 0
Use these!
Donations welcome! VTC = VouDyTwF5QNvmGukvdHgsZNwtJKjcY6aRD
I have a GTX 780 and with those arguments I get around 25kh/s. Adding -L 2 gets me to 200kh/s. What model is your 780?
|
|
|
|
ManiacMiner
|
 |
February 05, 2014, 04:53:46 AM |
|
A brave tester with 8 Fermi cards Tesla M2090 (thanks Choseh) just figured out the performance regression between 2013-12-18 and 2014-02-02. If you change the #if 0 in the fermi_kernel.cu to #if 1 (thereby enabling the previous version of the Salsa20/8 round function) you should see the previous performance figures again. Those who can compile the code themselves and want to mine on Fermi are welcome to make this change themselves. EDIT: False alarm apparently. My tester cannot reproduce this now
also there seems to be a bug in the autotuning code in salsa_kernel.cu
hash_sec = (double)WU_PER_LAUNCH / tdelta;
should very likely be
hash_sec = (double)WU_PER_LAUNCH * repeat / tdelta;
to factor in the number of repetitions in the measurement (we want to measure for 50ms minimum for better timer accuracy). So autotune was drunk after all!
So, it seems I should release fixes (new binary release) for these problems tonight.
Christian
I've been experiencing problems with the 2-2 and 2-4 releases; both dropped my Kh/s by about 20-30 on my GTX 560 Ti (Fermi). I've been using the 12-18 release to maximize my hashing power. Here's my config:
cudaminer.exe --no-autotune -O user.worker:pass -o stratum+tcp://pool.com:3333 -C 1 -i 0 -H 1 -l F8x16
I have noticed that the newer releases report the maximum warps as 209, whereas 12-18 shows the maximum warps as 211. I did a benchmark on 2-4 with the -C 1, -H 1 and -i 0 flags included, which gave me a config of F32x4. According to all of the resources I've read, F8x16 is the maximum my card can handle before giving CPU validation errors.
I have the same problem with my GTX 560Ti.
|
(つ ͡๏ ͜১ ͡๏ )つ[̲̅$̲̅(̲̅5̲̅)̲̅$̲̅]ε=ʕ ͡๏ ͜১ ͡๏ʔ=з
|
|
|
lordaccess
Member

Offline
Activity: 69
Merit: 10
|
 |
February 05, 2014, 05:25:44 AM |
|
Vertcoin on GTX780 with core overclocked to 1254 MHz = ~335kh/s
cudaminer.exe --algo=scrypt:2048 -d 0 -H 2 -C 1 -l T12x20 -i 0
GTX 670 overclocked to 1200 MHz (using CPU for sha256) = ~155kh/s
cudaminer.exe --algo=scrypt:2048 -d 1 -H 1 -l K35x6 -i 0
Use these!
Donations welcome! VTC = VouDyTwF5QNvmGukvdHgsZNwtJKjcY6aRD
I have a GTX 780 and with those arguments I get around 25kh/s. Adding -L 2 gets me to 200kh/s. What model is your 780?
Cheers bro, the -L 2 took me to 200 or so on each card.
|
|
|
|
sambiohazard
|
 |
February 05, 2014, 06:09:13 AM |
|
Example for Litecoin Mining on coinotron pool with GTX 660 Ti
cudaminer -d gtx660ti -l K28x32 -C 2 -i 0 -o stratum+tcp://coinotron.com:3334 -O workername:password
Anyone else getting infinite "result does not validate on CPU" errors with these settings? I have an Asus GTX 660Ti OC
I think you need to use k instead of K for 28x32, as far as I understood from Christian's post. K = Y in new release. I am using K7x32 for my 660Ti non-OC.
Here is how launch configs from the cudaminer 2013-12-18 release translate to the cudaminer 2014-02-02 release to get equivalent performance. This can be handy if you find some older launch configs posted by others.
L b x w -> sorry, legacy kernel was replaced by Fermi kernel. Autotune the F kernel.
F b x w -> F b x w ( no change to this one )
K b x w -> k 4*b x w (the previous K kernel is now named k and no. of blocks has to be quadrupled)
T b x w -> t 4*b x w (the previous T kernel is now named t and no. of blocks has to be quadrupled)
S b x w -> Spinlock kernel is GONE.
These following kernels are new in the cudaminer 2014-02-02 release: T (nVidia), K (nVidia), f (ported over from David Andersen's code).
Christian
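The translation table above can be written as a short Python sketch. It only illustrates the rules as posted; `translate_launch_config` is a made-up helper, not part of cudaminer:

```python
import re


def translate_launch_config(old):
    """Map a 2013-12-18 launch config to its 2014-02-02 equivalent.

    Returns None when the old kernel has no equivalent (L and S),
    in which case the post suggests autotuning instead.
    """
    m = re.fullmatch(r"([A-Za-z])(\d+)x(\d+)", old.strip())
    if m is None:
        raise ValueError(f"unrecognised launch config: {old!r}")
    kernel, blocks, warps = m.group(1), int(m.group(2)), int(m.group(3))
    if kernel == "F":
        return f"F{blocks}x{warps}"  # unchanged
    if kernel in ("K", "T"):
        # old K/T are now named k/t and need four times the blocks
        return f"{kernel.lower()}{4 * blocks}x{warps}"
    if kernel in ("L", "S"):
        return None  # legacy and spinlock kernels were removed
    raise ValueError(f"kernel {kernel!r} did not exist in the 2013-12-18 release")


for cfg in ("K7x32", "F8x16", "S4x4"):
    print(cfg, "->", translate_launch_config(cfg))
```

Under these rules an old K7x32 becomes k28x32, which matches the advice above to use k rather than K for 28x32.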
|
|
|
|
sambiohazard
|
 |
February 05, 2014, 06:17:04 AM |
|
Anyone been able to get a evga gtx780ti with acx cooling to 640khash?
Highest I can get is around 400. I'm just guessing different values now. But the old version could get very close to 650
cudaminer -a scrypt:2048 -d gtx780ti -H 1 -l t15x16 -C 2 -i 0
For vertcoin getting 181khash/sec. The gpu reaches 1.1Ghz when running cudaminer. I should be getting over 650. With the old version gpu would only get up to around 1000mhz. So now I'm getting 100 more mhz but very bad hash rates.
Have you tried dropping -C? I have generally seen that -C 1 and -C 2 drop the hashrate.
|
|
|
|
Silverwolf_Ru
Full Member
 
Offline
Activity: 120
Merit: 100
Astrophotographer and Ham Radioist!
|
 |
February 05, 2014, 06:35:45 AM |
|
I have been away for a while, but I'm glad people are using my and Christian's binaries for good. I'm still mining Microcoin with them. Gives me about 168 kilohashes on an OC'ed 560 Ti. It even found around two blocks since yesterday. On towards becoming an MRC Millionaire! If anyone needs any help, feel free to get in touch.
Christian, could you please explain why the F and X and f autotuning give much, much lower values and choose something other than F16x8 / F8x16 on my card? Those give me 168 kilohashes, anything else is 120-130, and autotune isn't showing those configurations as giving more than 70-80 kilohashes, so another one gets chosen. I'm glad I found my "sweet spot", but others may not.
|
Bitcoin: 17kz4pWKoMoVupGUYgj8kGomxXUkDHNtVe Shadowcoin: Seta8CFwP6yvbeCkgfjxXjpkokrQMQovGF ~Coin of the Future!
|
|
|
Galvin
Newbie
Offline
Activity: 21
Merit: 0
|
 |
February 05, 2014, 07:04:52 AM |
|
I ran autotune again in debug and got this:
cudaminer -d gtx780ti -H 1 -l T59x4 -C 2 -i 0
This gives me what I got with the December release, around 630 to 650.
|
|
|
|
|