ktf
Newbie
Offline
Activity: 24
Merit: 0
|
|
January 19, 2014, 03:51:41 PM |
|
It seems that when running with -L 2, the launch config was set to K59x2, which was netting almost 3 khash/s.
However, if I try to specify -l K59x2 myself, I get errors:
[2014-01-19 17:49:25] GPU #1: cudaError 4 (unspecified launch failure) calling 'cudaStreamSynchronize(context_streams[1][thr_id])' (C:/__test/CudaMiner-master/salsa_kernel.cu line 164)
I tried with different values and I get the same error. It only works if I don't use the -l flag.
|
|
|
|
Silverwolf_Ru
Full Member
Offline
Activity: 120
Merit: 100
Astrophotographer and Ham Radioist!
|
|
January 19, 2014, 03:59:48 PM |
|
OP, how about autotune crashing on Fermi kernels? I think they need some love as well, any news on their progress?
|
Bitcoin: 17kz4pWKoMoVupGUYgj8kGomxXUkDHNtVe Shadowcoin: Seta8CFwP6yvbeCkgfjxXjpkokrQMQovGF ~Coin of the Future!
|
|
|
cbuchner1 (OP)
|
|
January 19, 2014, 04:26:57 PM |
|
The lookup gap has turned my 10 kHash/s 450 Watts Yacoin mining rig into a devilish 14 kHash/s 666 Watts mining rig. Not quite as high as I had hoped for, but the new Wattage is nice.
I run GTX 780 with -L 6 -l 12x32 at up to 3.65 kHash/s and GTX 780Ti with -L 6 -l 15x32 at up to 4.7 kHash/s.
Still quite an easy-to-remember formula with decent performance. There may be better values, but that is what I found within an hour of tinkering.
Christian
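The formula above amounts to one block per multiprocessor with 32 warps each: the GTX 780 has 12 SMX and the GTX 780 Ti has 15. A minimal sketch in Python, assuming the usual [prefix]<blocks>x<warps> reading of -l; `launch_config` is a hypothetical helper, not part of cudaminer, and 32 warps per block is simply the value that worked in the post, not a guaranteed optimum:

```python
def launch_config(smx_count, warps_per_block=32, kernel=""):
    """Build a cudaminer-style -l string such as '12x32' or 'T15x32'.

    Hypothetical helper: one block per multiprocessor (SMX),
    with an optional one-letter kernel prefix.
    """
    if smx_count < 1 or warps_per_block < 1:
        raise ValueError("counts must be positive")
    return f"{kernel}{smx_count}x{warps_per_block}"


print(launch_config(12))  # GTX 780, 12 SMX
print(launch_config(15))  # GTX 780 Ti, 15 SMX
```

Treat the result as a starting point for manual tuning; other replies in this thread report better rates with quite different shapes.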
|
|
|
|
ManIkWeet
|
|
January 19, 2014, 04:35:23 PM |
|
I am sure you can squeeze more out of your GTX 780. I get 3.87-3.90 khash/s with -l T64x2 -b 8192 -L 2 -i 0 --algo=scrypt-jane.
|
BTC donations: 18fw6ZjYkN7xNxfVWbsRmBvD6jBAChRQVn (thanks!)
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
January 19, 2014, 04:56:00 PM |
|
Here is what I got with my 780ti:
L3 29x7 => 4.78 khash/s
L4 137x2 => 5.09
L5 169x2 => 5.1
L6 60x8 => 5.22
In principle there should be somewhat better timings. In scrypt the best ones are multiples of the CUDA core count (no reason it wouldn't work this way for scrypt-jane).
I can't monitor the power usage on Linux. I use a self-modded BIOS to allow up to 150% of the TDP, but I don't think it has any impact, since I can't change the power limit.
|
djm34 facebook page. BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
|
|
|
primeomega
Member
Offline
Activity: 63
Merit: 10
|
|
January 19, 2014, 06:30:32 PM |
|
Call me stupid, but why did YAC become a thing all of a sudden? I have used cudaminer for a while to mine altcoins, and I check this thread once in a while. But it's all about YAC now. Is it the most profitable coin to mine with an Nvidia card now? I did not see it traded on Cryptsy at all, so I'm not sure what it's all about.
|
|
|
|
bathrobehero
Legendary
Offline
Activity: 2002
Merit: 1051
ICO? Not even once.
|
|
January 19, 2014, 06:35:43 PM |
|
CudaMiner at the moment is strongest around an N factor of 14 (compared to ATI/AMD GPUs and CPUs), and YaC is the only coin around that value, which makes it the most profitable. YaC has some issues though, so I'm waiting for other coins to get close to N 14.
On another note, if anyone wants to speed up the autotuning process at the cost of some accuracy, you could decrease the number of measurements in salsa_kernel.cu (line 538):
while (repeat < 3) // average up to 3 measurements for better exactness
Also, you can interrupt autotuning with CTRL+C in Windows at any time. While that will close cudaMiner, it will show you the best kernel launch config it has found up to that point (handy for skipping the last part in some cases).
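To illustrate the trade-off, here is a rough Python sketch of averaging a few timing samples per candidate config. It is not the code from salsa_kernel.cu; `average_measurement` and the sample values are made up for illustration, and unlike the real loop ("up to 3") it always takes the full number of samples:

```python
def average_measurement(measure, max_repeats=3):
    """Call measure() max_repeats times and return the mean of the results.

    Fewer repeats make autotune faster, but a single noisy sample then
    counts for more when configs are compared.
    """
    if max_repeats < 1:
        raise ValueError("need at least one measurement")
    total = 0.0
    repeat = 0
    while repeat < max_repeats:  # same shape as the quoted loop, nothing more
        total += measure()
        repeat += 1
    return total / repeat


samples = iter([3.0, 3.3, 3.6])  # made-up khash/s readings
print(average_measurement(lambda: next(samples)))
```

With `max_repeats=1` the first reading alone decides, which is where the lost accuracy comes from.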
|
Not your keys, not your coins!
|
|
|
relm9
|
|
January 19, 2014, 06:42:16 PM |
|
Really? Dude drop the entitlement...
Excuse me, but you need to drop something yourself. That being the assumption that you know my motives or what type of person I am. You don't, so knock it off.
It was a sort of tongue-in-cheek comment, but I can see how the humor doesn't come across very well without knowing the intent of the post. If it were intended as you framed it, why would I follow up the comment with a polite request for updated binaries?
Anyway, I'm getting the prerequisites together as we speak so I can compile it myself. I was not aware that a trial of VS2010 could be used to compile, but now I know.
Thanks for the snap judgment, though. Makes my day when some snooty know-it-all gets something totally wrong. Next time drop the egoistic notion that you've got everything figured out, and you'll be less likely to make the same mistake again.
Thanks cbuchner1 for your continued effort.
Ok - I just don't find posts like that constructive when you could have just asked for help instead (I compiled a version of this for a guy that asked). You're right, I shouldn't have judged what type of person you are from that post. I apologize, let's move on.
On-topic: I tried the new build today, getting up to 4.5kh/s with T68x4 and -L4 on a GTX780. It usually hovers more around 4.3.
|
|
|
|
bathrobehero
Legendary
Offline
Activity: 2002
Merit: 1051
ICO? Not even once.
|
|
January 19, 2014, 07:48:10 PM |
|
On-topic: I tried the new build today, getting up to 4.5kh/s with T68x4 and -L4 on a GTX780. It usually hovers more around 4.3.
Hovering or jittering, to me, occurs when too much memory is being used, or at least when usage is borderline.
So for example, for me N 14 with L 3 results in 181 warps. Autotune comes up with K59x3 (= 177), which results in a very stable hashrate, using 1931 MB of VRAM (with the default 3 measurements). K10x18 (= 180) jitters a bit but is better on average, even though the VRAM usage keeps jumping between 1942 and 1963 MB, which, if I have to guess, is causing the jittering.
Here's a screenshot (with minimum/average/maximum hashrates added in the brackets).
So in addition to my previous post: you can find these borderline kernel configs if you don't touch, or maybe even increase, the number of measurements done by autotune. But if your card is used as primary (has a monitor attached to it), you will be fine with a less accurate autotune, since VRAM usage is not static (desktop, background apps, etc).
Also, I guess most of us have our cards overclocked at this point, but as the new lookup gap puts more pressure on the cards, our pre-lookup-gap overclocks are not that stable anymore, causing crashes.
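The numbers in brackets are just blocks times warps per block. A small Python sketch of that arithmetic, assuming the [prefix]<blocks>x<warps> form used throughout this thread; `parse_launch_config` and `total_warps` are hypothetical helper names, not cudaminer functions:

```python
import re


def parse_launch_config(config):
    """Split e.g. 'K59x3' into (kernel_prefix, blocks, warps_per_block)."""
    match = re.fullmatch(r"([A-Za-z]?)(\d+)x(\d+)", config)
    if match is None:
        raise ValueError(f"not a launch config: {config!r}")
    prefix, blocks, warps = match.groups()
    return prefix, int(blocks), int(warps)


def total_warps(config):
    """Total warps launched by a config: blocks * warps per block."""
    _, blocks, warps = parse_launch_config(config)
    return blocks * warps


for cfg in ("K59x3", "K10x18"):
    print(cfg, total_warps(cfg), "of 181 warps")
```

Comparing the total against the maximum that fits (181 here) shows how close to the memory limit a given config sits.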
|
|
|
|
manofcolombia
Member
Offline
Activity: 84
Merit: 10
SizzleBits
|
|
January 19, 2014, 08:08:21 PM |
|
When I go to compile to get lookup_gap, I end up with this error:
C:\Users\Zak Lantz\Desktop\cudaminer_vc2010_prerequisites\CudaMiner-master\cudaminer.vcxproj : error : Unable to read the project file "cudaminer.vcxproj".
C:\Users\Zak Lantz\Desktop\cudaminer_vc2010_prerequisites\CudaMiner-master\cudaminer.vcxproj(50,5): The imported project "C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 5.5.props" was not found. Confirm that the path in the <Import> declaration is correct, and that the file exists on disk.
I think I understand the error: it isn't finding CUDA, because I have it installed on my H: drive (my C: drive is a 120 GB SSD). So how would I point Visual Studio to where CUDA is actually installed?
|
|
|
|
ktf
Newbie
Offline
Activity: 24
Merit: 0
|
|
January 19, 2014, 08:13:31 PM |
|
Anyone having issues with the YAC wallet? Mine crashes as soon as I start it on Windows 7 64-bit...
|
|
|
|
cbuchner1 (OP)
|
|
January 19, 2014, 10:12:58 PM Last edit: January 19, 2014, 11:03:21 PM by cbuchner1 |
|
Should we roll the Lookup-Gap into kernel launch configurations? How does T12x32/6 look to you? ;-)
No issues with the YAC wallet on Windows here, but mine does start horribly slowly on Linux (takes up to an hour). I pulled it from the official PPA repository for stable builds.
The reason for autotune crashes on Windows with lookup gap seems to be rising memory usage during the autotune process. E.g. on my 780Ti, as soon as the "Memory Used" value shown in GPU-Z hits 3072MB, the driver will crash. I could fix it by adding a configurable "backoff" parameter in percent. The default value on Windows should be higher than on Linux, probably around 10% on Windows and 2% on Linux. Alternatively, I could also allow giving the backoff in MB.
For a very quick fix in the current source code, increase the loop bound 2 in this for loop in salsa_kernel.cu to something higher, e.g. 2*LOOKUP_GAP. It should fix auto-tuning when single-memory allocation is not enabled.
for (int i=0; warp > 0 && i < 2; ++i) {
    warp--;
    checkCudaErrors(cudaFree(h_V[thr_id][warp]-h_V_extra[thr_id][warp]));
    h_V[thr_id][warp] = NULL;
    h_V_extra[thr_id][warp] = 0;
}
UPDATE: I also find that CUDA sometimes kills the autotuning process with the error message "the launch timed out and was terminated". This might be fixed by auto-tuning with smaller batchsize (-b) parameters, e.g. 1024. CUDA has a watchdog timer that will kill kernel calls that take longer than 5 seconds. This is to avoid a permanent display freeze when some computation gets stuck.
I am also considering allowing devices to be specified by name, as in the following example, because whenever I swap cards around on my mainboards, all the device IDs get shuffled by CUDA, which is annoying. The strings, however, would keep working as is, unless you remove the card with the given name.
-d "GT 640, GTX 780 Ti, GTX 660 Ti, GTX 660 Ti#2"
Christian
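One way such name-based selection could be resolved to device IDs, sketched in Python. This is only a guess at the semantics of a feature that is merely being considered here: `resolve_devices` is a hypothetical function, the "#2" suffix is assumed to mean "the second card with that name", and names are matched exactly (the names CUDA reports may carry a vendor prefix, which this sketch does not handle):

```python
def resolve_devices(spec, cuda_names):
    """Map a comma-separated list of card names to CUDA device indices.

    cuda_names is the list of device names in CUDA enumeration order.
    Assumption: "Name#k" selects the k-th (1-based) card with that name.
    """
    indices = []
    for entry in spec.split(","):
        name, _, ordinal = entry.strip().partition("#")
        name = name.strip()
        k = int(ordinal) if ordinal else 1
        matches = [i for i, n in enumerate(cuda_names) if n == name]
        if k < 1 or k > len(matches):
            raise ValueError(f"no device #{k} named {name!r}")
        indices.append(matches[k - 1])
    return indices


# Made-up enumeration order after swapping cards around:
names = ["GTX 660 Ti", "GTX 780 Ti", "GT 640", "GTX 660 Ti"]
print(resolve_devices("GT 640, GTX 780 Ti, GTX 660 Ti, GTX 660 Ti#2", names))
```

The same -d string keeps selecting the same physical cards however the IDs get shuffled, which is the point of the proposal.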
|
|
|
|
orrett3
Newbie
Offline
Activity: 33
Merit: 0
|
|
January 19, 2014, 10:22:00 PM |
|
Anyone having issues with the YAC wallet? Mine crashes as soon as I start it on Windows 7 64-bit...
What is the error you're getting, if there is one? I was getting an error about not being able to load the block index, but was able to fix it.
|
|
|
|
Magister1
Newbie
Offline
Activity: 9
Merit: 0
|
|
January 20, 2014, 12:01:01 AM |
|
Should we roll the Lookup-Gap into kernel launch configurations?
I am also considering allowing devices to be specified by name, as in the following example, because whenever I swap cards around on my mainboards, all the device IDs get shuffled by CUDA, which is annoying. The strings, however, would keep working as is, unless you remove the card with the given name.
-d "GT 640, GTX 780 Ti, GTX 660 Ti, GTX 660 Ti#2"
Christian
This is your baby, but those sound like good ideas, in addition to the idea about setting warp ranges for auto tuning.
I would suggest clarifying/cleaning up the display and help pages for new people. You are beginning to make a real dent in the struggle for viable NVidia mining and getting attention across the web. Your baby ought to look its best, right?
Maybe once you do a new release even open a new thread (with a link to this one obviously) so people aren't overwhelmed by 130+ pages of old comments pertaining mainly to old versions.
Keep up the good work!
PS. Do you take Yacoin donations?
|
|
|
|
cbuchner1 (OP)
|
|
January 20, 2014, 12:36:09 AM |
|
Keep up the good work!
PS. Do you take Yacoin donations?
Yeah, you can donate to YBQ4hrUQqEb2EDip1NFwMAgZbvK8hJx5Tn
Good idea about starting a new thread for the scrypt-jane enabled cudaminer, once it is released.
I have made some changes to autotune reliability and speed. It will not assign fewer blocks than half the multiprocessor count of your card. For example, on a GTX 780 it will now start autotuning at 6 blocks (the card has 12 SMX).
I also made changes to how memory is allocated. The backoff value on Windows is currently 12% of the largest allocation it was able to make. On Linux it is a mere 2%. If I don't back off, autotune will crash pretty badly. It can still occasionally crash with launch timeouts though.
I find that my GTX 660Ti is a better investment than my new GTX 780 card (3 GB each, but 7 vs 12 SMX). At -L 2 the 660Ti totally beats my 780. Meh.
My GTX 660 Ti uses -L 2 -l K64x2 -C 1 -b 32768 -i 0 and gets 3.7 kHash/s
Christian
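The backoff and the new autotune starting point come down to simple arithmetic. A Python sketch under stated assumptions: the function names are hypothetical, the percentages are the ones quoted above, and rounding up for odd multiprocessor counts is a guess, since the post only gives the 12 SMX case:

```python
BACKOFF_PERCENT = {"windows": 12, "linux": 2}  # values quoted in the post


def usable_allocation_mb(largest_alloc_mb, platform):
    """Memory left for autotune after backing off from the largest allocation."""
    backoff = BACKOFF_PERCENT[platform]
    return largest_alloc_mb * (100 - backoff) / 100


def autotune_start_blocks(multiprocessors):
    """Half the multiprocessor count; rounding up for odd counts is an assumption."""
    return (multiprocessors + 1) // 2


print(usable_allocation_mb(3000, "windows"))  # example figure, not a measurement
print(usable_allocation_mb(3000, "linux"))
print(autotune_start_blocks(12))  # GTX 780 with 12 SMX
```

The 3000 MB input is only an illustrative number for a 3 GB card; the real value is whatever the largest successful allocation turns out to be.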
|
|
|
|
ozie
|
|
January 20, 2014, 12:42:54 AM |
|
No issues with the YAC wallet on Windows here, but mine does start horribly slowly on Linux (takes up to an hour). I pulled it from the official PPA repository for stable builds.
There is a new stable release on github which speeds up the time it takes to open the wallet on Linux. Not sure if it is in PPA already.
|
|
|
|
Magister1
Newbie
Offline
Activity: 9
Merit: 0
|
|
January 20, 2014, 12:49:58 AM |
|
Keep up the good work!
PS. Do you take Yacoin donations?
Donation sent. In case you guys didn't know they just released an update to the Yacoin wallet 0.42.
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
January 20, 2014, 01:27:09 AM |
|
I just tried running the latest version on Windows with scrypt, using my newest config from yesterday without -L, and it seems I lost 100 khash/s (it was running at 700 (OC...) and now it barely makes 600...). Do I need to retune? Or has something changed more drastically?
|
|
|
|
muliukov
Newbie
Offline
Activity: 55
Merit: 0
|
|
January 20, 2014, 01:36:29 AM |
|
Sorry for the question, but can you help create a cudaminer for Microcoin? As far as I can see it should work like YAC, so it shouldn't be difficult, but I have never done it before and have no skills.
|
|
|
|
orrett3
Newbie
Offline
Activity: 33
Merit: 0
|
|
January 20, 2014, 01:37:21 AM |
|
I just tried running the latest version on Windows with scrypt, using my newest config from yesterday without -L, and it seems I lost 100 khash/s (it was running at 700 (OC...) and now it barely makes 600...). Do I need to retune? Or has something changed more drastically?
I would try using autotune to get another config and see what happens.
|
|
|
|
|