vosovich
Newbie
Offline
Activity: 28
Merit: 0
|
|
December 08, 2013, 11:05:40 PM |
|
So it does pick a configuration when you use (capital) F, but you don't know which? If so, use the -D flag for debugging mode. It will give you more information about the auto-tuning process.
Seems like using just a K for me shows "Given launch config 'K' does not validate". Interesting aside, it seems the 98x2 I've been using doesn't show up in the debug for autotune, despite being the best config I've found so far. This debug panel gives a few more leads to check out, though.
I was convinced that using -l F had worked for me in the past. However, I just checked to make sure and I can confirm what you have said. It does not work for me either.
EDIT for your EDIT: Yes, that does seem to be the case. What I did is run autotune a bunch of times to pick a few candidates. I ran it until I was convinced that no new configurations would pop up. Then I let the candidates run for a while and finally I picked the one with the best hashrate.
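The "let the candidates run and pick the best" step can be sketched roughly like this. It is only an illustration: the function name, the second config label and all hashrate numbers are made up, and it assumes you have written down a few khash/s readings per candidate by hand.

```python
from statistics import mean

def best_launch_config(samples):
    """Return (config, mean hashrate) for the candidate with the highest mean.

    `samples` maps a launch config string to a list of observed hashrates.
    Candidates without any readings are ignored.
    """
    averages = {cfg: mean(rates) for cfg, rates in samples.items() if rates}
    if not averages:
        raise ValueError("no hashrate samples given")
    best = max(averages, key=averages.get)
    return best, averages[best]

# Made-up readings for illustration only, not measurements from this thread.
observed = {
    "98x2": [141.0, 142.5, 142.0],
    "49x4": [138.0, 139.5, 139.0],
}
print(best_launch_config(observed))
```

Averaging several readings per candidate matters because the hashrate reported by the miner fluctuates from one interval to the next.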
|
|
|
|
trell0z
Newbie
Offline
Activity: 43
Merit: 0
|
|
December 08, 2013, 11:28:02 PM |
|
Yeah, it's the whole -l F / -l f thing that doesn't work (and F is even the correct kernel for my card, a 580). As far as I can understand the readme, we're supposed to be able to do something like that, and it will autotune with my chosen settings for the kernel that I specify from the list of L/F/K/T. The only result I get, though, is either "Given launch config 'x' does not validate" or the program crashes. Otherwise I'm using the launch config F16x14, which I got out of a normal autotune run with -D, yes.
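For anyone keeping notes on their configs, here is a small sketch of how a launch config string like F16x14 could be taken apart. It is not cudaminer's own parser. The kernel letters come from the L/F/K/T list above; reading the two numbers as blocks and warps per block is my assumption, so check the readme before relying on it.

```python
import re

KERNEL_LETTERS = "LFKT"  # kernel prefixes from the list mentioned above

# Optional kernel letter, then two positive integers separated by "x".
_CONFIG = re.compile(r"^([A-Za-z])?(\d+)x(\d+)$")

def parse_launch_config(text):
    """Split e.g. 'F16x14' into ('F', 16, 14); the letter may be absent.

    Assumption: the numbers mean blocks x warps per block.
    """
    match = _CONFIG.match(text)
    if match is None:
        raise ValueError(f"not a launch config: {text!r}")
    kernel, blocks, warps = match.groups()
    if kernel is not None and kernel.upper() not in KERNEL_LETTERS:
        raise ValueError(f"unknown kernel letter: {kernel!r}")
    return kernel, int(blocks), int(warps)

print(parse_launch_config("F16x14"))
print(parse_launch_config("98x2"))
```

This sketch rejects a bare letter such as "F" by design, since it only handles fully specified configs; what cudaminer itself does with a bare letter is discussed in the posts here.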
|
|
|
|
cbuchner1 (OP)
|
|
December 09, 2013, 02:55:04 AM |
|
Problem is, with capital letters it's just "unknown", with non-capital the program just crashes, even with "f", which is the correct one for my card..
You'll get a warning that the launch config is unrecognized, but it does indeed work (assuming you use uppercase letters). But you need to watch out for minimum requirements regarding compute capability.
Christian
|
|
|
|
Greg121986
Newbie
Offline
Activity: 8
Merit: 0
|
|
December 09, 2013, 04:59:24 AM |
|
EDIT: Hmmm, just restarted my computer for the first time today, immediately opened up the cudaminer upon getting back to my desktop and noticed the same problem where the GPU utilization is only at 50% or so (getting 114Khash/s). By the time I opened this browser and typed the first sentence of this edit, the GPU load was already at 97% and my hashrate climbed to 209Khash/s. Anyone else having similar issues?
I am having the same sort of issue. I am using the -i 0 flag, which typically results in 99% utilization for my GTX 760. On the prior cudaminer release I was not getting more than 75%, usually 50%. The odd thing with the 12-07 release is that I get 99% utilization when I am using my PC. If I leave the PC, I see the utilization jump up and down quite often. After I leave the system running untouched, I return to see that my hash rate goes between 155-201. Also, I really do not understand the varying use of kernels. Is there a list of the kernels we can try? Does this equate to optimizations available in the CUDA architecture for each generation of silicon?
|
|
|
|
cbuchner1 (OP)
|
|
December 09, 2013, 05:12:51 AM |
|
EDIT: Hmmm, just restarted my computer for the first time today, immediately opened up the cudaminer upon getting back to my desktop and noticed the same problem where the GPU utilization is only at 50% or so (getting 114Khash/s). By the time I opened this browser and typed the first sentence of this edit, the GPU load was already at 97% and my hashrate climbed to 209Khash/s. Anyone else having similar issues?
What OS are you running on? I have experienced this kind of issue on Windows Server 2012 R2, which is somewhat similar to Windows 8.1, I suppose.
|
|
|
|
trell0z
Newbie
Offline
Activity: 43
Merit: 0
|
|
December 09, 2013, 10:17:42 AM |
|
Problem is, with capital letters it's just "unknown", with non-capital the program just crashes, even with "f", which is the correct one for my card..
you'll get a warning that the launch config is unrecognized, but it does indeed work (assuming you use uppercase letters) But you need to watch out for minimum requirements regarding compute capability. Christian
Oh OK, cool! Will try to experiment a bit then. Do you think you could change the text there to make it more obvious that it actually works? I understand this isn't exactly something everyone will do, but it should be a minor code change, I guess, with just some text?
|
|
|
|
Wizzard
Newbie
Offline
Activity: 11
Merit: 0
|
|
December 09, 2013, 03:04:12 PM Last edit: December 09, 2013, 03:30:46 PM by Wizzard |
|
Any binary for Linux x64/x32, please? I cannot compile it on my Kubuntu x64 system.
|
|
|
|
mrm0
Member
Offline
Activity: 89
Merit: 10
|
|
December 09, 2013, 05:30:52 PM |
|
Any binary for Linux x64/x32, please? Cannot compile it in my KUbuntu x64 system.
A 32-bit cudaminer, version 2013-11-20 (alpha), compiled on Ubuntu 12.04.3 LTS:
Download the 'whitecuda.jpg' image from http://postimg.org/image/ep07wmzjh/
Verify the downloaded size; it should be 4391396 bytes.
Now do this:
$ dd if=whitecuda.jpg of=cudaminer bs=1 skip=1784
and there is your cudaminer binary, size 4389612 bytes. BTW: you really shouldn't trust binaries from the Internet...
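If dd is not at hand, the same extraction can be sketched in Python. This is just an equivalent under the numbers given in this post (skip 1784 bytes, 4391396 bytes in, 4389612 bytes out); the function and constant names are made up for illustration.

```python
SKIP_BYTES = 1784          # from the dd command in this post
EXPECTED_INPUT = 4391396   # stated size of whitecuda.jpg
EXPECTED_OUTPUT = 4389612  # stated size of the resulting binary

def strip_prefix(data: bytes, skip: int = SKIP_BYTES) -> bytes:
    """Drop the first `skip` bytes, like dd with bs=1 skip=N."""
    return data[skip:]

def extract_binary(src: str, dst: str) -> int:
    """Check the input size, strip the image prefix, write the payload."""
    with open(src, "rb") as f:
        data = f.read()
    if len(data) != EXPECTED_INPUT:
        raise ValueError(f"unexpected input size: {len(data)} bytes")
    payload = strip_prefix(data)
    with open(dst, "wb") as f:
        f.write(payload)
    return len(payload)

# The sizes quoted in the post are consistent with the skip value.
assert EXPECTED_INPUT - SKIP_BYTES == EXPECTED_OUTPUT

# Usage (needs the downloaded file in the current directory):
# extract_binary("whitecuda.jpg", "cudaminer")
```

The size check only tells you the download is complete, not that the binary is trustworthy, so the warning above still applies.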
|
1BUcKJVz5n34VwuiyiLtPud1PGn3BLkcPb :-)
|
|
|
Wizzard
Newbie
Offline
Activity: 11
Merit: 0
|
|
December 09, 2013, 06:54:19 PM |
|
Thank you very much, but it does not work as I expected. After installing the necessary libcudart5.5 i386 from Debian and libgomp i386 from Ubuntu, it crashes (segmentation fault) with the same error as before: "GPU #0: with compute capability 0.0"
|
|
|
|
cbuchner1 (OP)
|
|
December 09, 2013, 07:03:51 PM Last edit: December 09, 2013, 11:11:45 PM by cbuchner1 |
|
The github repo contains something quite significant: SHA256 was moved onto the GPU! It isn't fully optimized code yet, but it works. Not only does it give an effective speed-up, it also lowers the CPU load to near zero.
Now when I upgrade the PSU on my dedicated mining rig to 1.1 kW, I should be able to power 3 GTX 780 Ti cards, hopefully. At the moment the "800W" PSU craps out when I run two cards under load. The 12V rail of that thing is just so under-dimensioned, it's not funny.
I also found the cause for validation problems on some newer cards (the code to detect the card's ability for overlapping kernel execution was broken).
The brave can try github.... The not so brave have to wait for a new release...
-H 0 : single threaded CPU SHA256 hashing
-H 1 : multi threaded CPU SHA256 hashing
-H 2 : GPU based SHA256 hashing (now the default)
I also found out that my code to overlap memory transfers and kernels was completely NOT working. Which is why moving the SHA256 part to the GPU results in an effective speed-up (there's now only memory copies from the GPU to the CPU - and it is much less data!). I will fix mentioned problem when I am in a fixing mood.
Christian
|
|
|
|
Wizzard
Newbie
Offline
Activity: 11
Merit: 0
|
|
December 09, 2013, 07:13:11 PM |
|
I am sorry, what am I doing wrong if I get this kind of output?
make[2]: Entering directory `/home/wizzard/CudaMiner'
g++ -g -O2 -o cudaminer -pthread -L/usr/local/cuda/lib64 cudaminer-cpu-miner.o cudaminer-util.o cudaminer-sha2.o cudaminer-scrypt.o salsa_kernel.o spinlock_kernel.o legacy_kernel.o fermi_kernel.o test_kernel.o titan_kernel.o -L/usr/lib/x86_64-linux-gnu -lcurl compat/jansson/libjansson.a -lpthread -lcudart -fopenmp
salsa_kernel.o: In function `find_optimal_blockcount(int, KernelInterface*&, bool&, int&)':
/home/wizzard/CudaMiner/salsa_kernel.cu:286: undefined reference to `cudaDeviceSetSharedMemConfig'
collect2: error: ld returned 1 exit status
make[2]: *** [cudaminer] Error 1
make[2]: Leaving directory `/home/wizzard/CudaMiner'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/wizzard/CudaMiner'
make: *** [all] Error 2
|
|
|
|
cbuchner1 (OP)
|
|
December 09, 2013, 07:22:43 PM |
|
Probably not using the CUDA 5.0 SDK? This function was added for Kepler-type devices and probably came with CUDA release 5...
|
|
|
|
Wizzard
Newbie
Offline
Activity: 11
Merit: 0
|
|
December 09, 2013, 07:28:11 PM Last edit: December 09, 2013, 08:33:39 PM by Wizzard |
|
I also thought so, but I have libcudart5.0 installed...
Edit: finally, it compiled successfully. I don't understand why.
|
|
|
|
cbuchner1 (OP)
|
|
December 09, 2013, 11:58:01 PM |
|
Tired of heavy CPU load? Change your -H 1 switch to -H 2 (or just remove the switch, as 2 is now the default). Your GPU will do ALL the work - it may run a little hotter though.
This release is now suitable for 1 MHash/s mining rigs or even bigger, running on cheap mainboards with low end CPUs - even an Intel Atom would do. Of course the required nVidia GPUs doing this kind of hash rates aren't cheap...
Download the 2013-12-10 release. I also cleaned up the Readme a bit, fixed a bug that negatively affected hash validation on some cards.
With -H 2 (full offloading to GPU) it may be more efficient to run the x86 binary of cudaminer as the x64 version has increased register pressure in some CUDA kernels, leading to slightly lower hash rates sometimes. Because the cudaminer binary is mostly idling now, there's almost no use running the more bloated x64 binary.
Christian
|
|
|
|
Vanderi
|
|
December 10, 2013, 12:54:01 AM |
|
Wow, great work Buchner. Again I love my twin GTX 680s a bit more, which isn't so little to begin with!
|
|
|
|
blackraven1425
Member
Offline
Activity: 98
Merit: 10
|
|
December 10, 2013, 12:54:58 AM |
|
2013-12-10 release
Using the new -H 2 option, with either x64 or x86, I'm sitting a few (3-4) khash lower than using -H 1. Obviously it's likely to have a much different effect on a lower end system like an Atom.
|
|
|
|
vosovich
Newbie
Offline
Activity: 28
Merit: 0
|
|
December 10, 2013, 01:26:01 AM |
|
Tired of heavy CPU load? Change your -H 1 switch to -H 2 (or just remove the switch, as 2 is now the default). Your GPU will do ALL the work - it may run a little hotter though. This release is now suitable for 1 MHash/s mining rigs or even bigger, running on cheap mainboards with low end CPUs - even an Intel Atom would do Of course the required nVidia GPUs doing this kind of hash rates aren't cheap... Download the 2013-12-10 release. I also cleaned up the Readme a bit, fixed a bug that negatively affected hash validation on some cards. With -H 2 (full offloading to GPU) it may be more efficient to run the x86 binary of cudaminer as the x64 version has increased register pressure in some CUDA kernels, leading to slightly lower hash rates sometimes. Because the cudaminer binary is mostly idling now, there's almost no use running the more bloated x64 binary. Christian
I just tried the newest version, both x64 and x86 with the various settings for -H. The x64 builds used to be the most efficient for my GTX560, but the 12-10 x86 build gives an all-around 4% hashrate improvement over the 12-07 x64 builds. This holds for all the -H settings. Excellent stuff!
|
|
|
|
DuckDodgers
Newbie
Offline
Activity: 20
Merit: 0
|
|
December 10, 2013, 06:20:37 AM |
|
The x86 binary of the new build is a tad faster on my GTX580. Still using the -H 1 switch, since my oldie i7-920 is way too fast to even be bothered into a full power state anyway.
|
|
|
|
FSKT
Newbie
Offline
Activity: 33
Merit: 0
|
|
December 10, 2013, 09:04:28 AM Last edit: December 10, 2013, 09:23:29 AM by FSKT |
|
So, my results for the two new versions. I am using a GTX 560, slightly overclocked, with an i5 4670K.
01-12 => 07-12 : I lost 10khash/s (140 => 130)
01-12 => 07-12 : I lost 5khash/s (140 => 135)
I am using the x64 version with these options:
cudaminer.exe -o stratum+tcp://azertyuiop.com:1234 -O ME.1:1 -H 1 -i 0
My launch config on the 01-12 version, which gives me my best results (142khash/s): -l F111x2
01-12 => 07-12 & with -H 2 : I lost 7khash/s (140 => 133)
Am I doing something wrong? Thanks!
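To compare the -H settings like for like, it may help to generate the command lines from one place so that only the -H value changes between runs. A rough sketch, using only the switches that appear in this thread (-o, -O, -H, -i, -l); the helper name and its defaults are made up, and the pool URL and login are the placeholders from this post.

```python
def cudaminer_args(url, userpass, hash_mode, launch_config=None,
                   interactive=0, exe="cudaminer.exe"):
    """Build an argument list for one test run.

    hash_mode is the -H value: 0, 1 or 2, as listed earlier in the thread.
    """
    if hash_mode not in (0, 1, 2):
        raise ValueError("hash_mode must be 0, 1 or 2")
    args = [exe, "-o", url, "-O", userpass,
            "-H", str(hash_mode), "-i", str(interactive)]
    if launch_config:
        args += ["-l", launch_config]
    return args

# One command line per -H mode, everything else held constant.
for mode in (0, 1, 2):
    print(" ".join(cudaminer_args("stratum+tcp://azertyuiop.com:1234",
                                  "ME.1:1", mode, "F111x2")))
```

Keeping the launch config fixed across runs means any hashrate difference can be attributed to the -H setting or the build rather than to a different autotune result.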
|
|
|
|
trell0z
Newbie
Offline
Activity: 43
Merit: 0
|
|
December 10, 2013, 12:25:04 PM |
|
-H 2 with the new version gives me 10kH/s less on both x86 & x64, and also x64 is still marginally faster for me than x86. On win 8.1 x64.
|
|
|
|
|