1coin, S3 hash, blablabla https://bitcointalk.org/index.php?topic=820302.msg9323320#msg9323320 Something to get the NV people by until someone makes a decent release. It's already practically 100% djm's code (shavite that eats 80 bytes of input) in addition to cb's original x11, so I expect djm to add it to his fork... when he wakes up, I guess. Early bird gets the instamine, djm is clearly losing his touch ![Tongue](https://bitcointalk.org/Smileys/default/tongue.gif) I haven't instamined a coin in a long time... I don't have time at the moment, I have to (re)write the OpenCL miner for Coinshield... (also I have many updates to push to my release, so it will take a bit longer... to be honest, if there is already an AMD miner, there isn't much incentive for a fast NVIDIA release... the coin is already raped... ![Grin](https://bitcointalk.org/Smileys/default/grin.gif) ) True enough, personally I moonwalked away from the coin after the initial insta. Just figured I'd put out some kind of a CUDA miner since the OpenCL one was, unsurprisingly, not running very well on NV cards.
|
|
|
Is there any ccminer for CUDA 3.0?
Well, that was a major brainfart on my part. Thought the Kepler shuffle instruction was available only from 3.5 up when it in fact was introduced in 3.0. New win32 binary with compute 3.0/3.5/5.0/5.2 at https://github.com/tsiv/ccminer/releases 900-series card owners should also find this release more to their liking than the first one. Edit: Oh yeah, somebody was asking for a compute 2.1 release. The SIMD kernel makes heavy use of shuffle, and rewriting it or trying to dig up an old version that doesn't use shuffle just... doesn't feel like something I want to spend my time on, sorry ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif)
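For reference, a single binary covering all four of those compute capabilities is built by stacking nvcc `-gencode` pairs. A sketch of what such a compile line looks like; the file name and Makefile wiring are illustrative, only the flag syntax is standard nvcc:

```shell
# Sketch only: each -gencode pair embeds code for one GPU generation in a
# single fat binary. See the fork's Makefile / VC project for the real lines.
nvcc -O3 \
  -gencode arch=compute_30,code=sm_30 \
  -gencode arch=compute_35,code=sm_35 \
  -gencode arch=compute_50,code=sm_50 \
  -gencode arch=compute_52,code=sm_52 \
  -c x11/cuda_x11_simd512.cu -o cuda_x11_simd512.o
```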
|
|
|
1coin, S3 hash, blablabla https://bitcointalk.org/index.php?topic=820302.msg9323320#msg9323320 Something to get the NV people by until someone makes a decent release. It's already practically 100% djm's code (shavite that eats 80 bytes of input) in addition to cb's original x11, so I expect djm to add it to his fork... when he wakes up, I guess. Early bird gets the instamine, djm is clearly losing his touch ![Tongue](https://bitcointalk.org/Smileys/default/tongue.gif)
|
|
|
Anyone get Nvidia cards working yet with no HW errors with the miner?
Necroed an old cbuchner version of ccminer to plug the S3 hash into. Win32 binary: https://github.com/tsiv/ccminer/releases/download/v1.123-s3/ccminer-s3.zip
Source obviously at https://github.com/tsiv/ccminer
It's a fair bit slower than I mentioned earlier. It's a bit of a mystery: it's pretty much just as fast on the GTX 750 Ti as the other fork, but on a 970 it drops from 20 to 13? Dafuq... Might just be that I compiled it for compute 3.5 and 5.0, not 5.2, which is supposed to be something the 900-series brings to the table... Meh, guess I'll try another compile.
|
|
|
Too bad there is no nVidia miner ![Roll Eyes](https://bitcointalk.org/Smileys/default/rolleyes.gif) Actually there is: sgminer3s works very well with OpenCL. You only miss the ADL functionality for temperature and fan speed. Try installing the nvidia OpenCL drivers. And around 50% of your hashrate potential. For example on my GTX 970: sgminer3s - 9 MH/s, my ccminer fork with 3s support - 20 MH/s. The source code of my fork is a bit of a clusterfuck atm but I'll see if I can figure out a way to push the 3s part out.
|
|
|
PM me or post a config and I'll mine for you for about 5 minutes to test the waters. I would like to mention I'm on Windows and need a compiled version to run it.
Thanks. But I'm on Linux and couldn't make you any win binaries for tsiv's newest CN miner or wolf's BBR miner. And I think it's no accident that tsiv made some update yesterday, so the newest version would be needed for a try. Also it would be wise to find the right launch config for the new cards to see how far they could go. The more I write, the more I want to have a new toy myself. Shit. Thanks for your offer to test it ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif) It is in fact just coincidence, somebody reported a problem with the compute 2.0 code and I unfucked it, nothing else. I do have a 970 on the way but I'm not expecting much from it for CN. We'll see, it'll probably still be severely gimped by the random memory access patterns.
|
|
|
Cryptonite != Cryptonight
|
|
|
The nvidia drivers are a bitch to get installed. I gave up, went with Kopiemtu.
|
|
|
You could try running Kopiemtu just to see if it changes things. If it still runs like crap it's hardware related, otherwise you'll probably want to keep looking at your Windows settings, drivers etc.
|
|
|
I think he figured out how to work the "pay for beer with btc" app.
|
|
|
I was slightly bored at work... ![](https://ip.bitcointalk.org/?u=https%3A%2F%2Fi.imgur.com%2FbqDHh0y.gif&t=663&c=WZ_WeNXchGv_qw)
|
|
|
The pastebin code is actually two separate files, you'll see where each file starts as a comment.
Clone djm's repo and replace doom.cu with the first part of the paste. Then create a new file in the qubit directory, called doom_luffa512.cu, and paste in the second part from pastebin. Then you'll want to change Makefile.am to include the new file in the build. I did it by changing the line
x13/cuda_haval512.cu x13/cuda_sha512.cu qubit/doom.cu \
into
x13/cuda_haval512.cu x13/cuda_sha512.cu qubit/doom.cu qubit/doom_luffa512.cu \
After that, it's all autogen, configure, make, and enjoy the summer.
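Spelled out, that last step is the usual autotools flow (no extra configure flags shown here; check ./configure --help in the fork if CUDA isn't picked up automatically):

```shell
# Standard autotools build, as the post says: autogen, configure, make.
./autogen.sh
./configure
make
```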
|
|
|
How about posting your code and we can compare to see what you did differently? Me just wants to get my grubby hands on it. ![Grin](https://bitcointalk.org/Smileys/default/grin.gif) Right, I cloned djm's latest and only modified doom.cu and created doom_luffa512.cu. And obviously you'll need to modify Makefile.am and the VC project files to include doom_luffa512.cu in the build. Built it and voila, ~80 MH/s per 750 Ti. Let me know how it turns out. Source for both files here: http://pastebin.com/3kUMumEm I reproduced the speed increase; this:
#pragma unroll 8
for (int i = 7; i >= 0; i--) {
    if (hash[i] > d_target[i]) {
        if (position < i) {
            position = i;
            rc = false;
        }
    }
    if (hash[i] < d_target[i]) {
        if (position < i) {
            position = i;
            rc = true;
        }
    }
}
gives a performance gain, while this doesn't: ![Shocked](https://bitcointalk.org/Smileys/default/shocked.gif)
#pragma unroll 8
for (int i = 7; i >= 0; i--) {
    if (Hash[i] > pTarget[i]) {
        rc = false;
        break;
    }
    if (Hash[i] < pTarget[i]) {
        rc = true;
        break;
    }
}
Now if someone has an explanation, I will be happy to hear it... (I would tend to believe that the second should be faster...) but no... It's all them capital letters, yo. Can't fit through them pipelines. You've got the Hash array defined in the kernel and not still a pointer to gmem?
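For anyone puzzling over those two loops: both walk the eight 32-bit hash words from the most significant down and let the first differing word decide. A host-side C++ sketch of both variants (function names invented; in the kernel these arrays live in registers and constant memory):

```cpp
#include <cassert>
#include <cstdint>

// Word 7 is treated as the most significant 32 bits of the 256-bit hash.

// Variant 1: the "position" form. The most significant differing word is
// visited first, so lower words can never overwrite the verdict.
bool below_target_position(const uint32_t hash[8], const uint32_t target[8]) {
    int position = -1;
    bool rc = true;                       // hash == target counts as a pass
    for (int i = 7; i >= 0; i--) {
        if (hash[i] > target[i] && position < i) { position = i; rc = false; }
        if (hash[i] < target[i] && position < i) { position = i; rc = true; }
    }
    return rc;
}

// Variant 2: the early-break form. Same answer, but on the GPU the break
// makes threads of a warp exit the loop at different points.
bool below_target_break(const uint32_t hash[8], const uint32_t target[8]) {
    for (int i = 7; i >= 0; i--) {
        if (hash[i] > target[i]) return false;
        if (hash[i] < target[i]) return true;
    }
    return true;                          // all words equal
}
```

A plausible reason the first form profiles faster on the GPU is that every thread runs the fully unrolled loop with no early exit, so the warp never diverges on a break; on the host both return identical answers.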
|
|
|
How about posting your code and we can compare to see what you did differently? Me just wants to get my grubby hands on it. ![Grin](https://bitcointalk.org/Smileys/default/grin.gif) Right, I cloned djm's latest and only modified doom.cu and created doom_luffa512.cu. And obviously you'll need to modify Makefile.am and the VC project files to include doom_luffa512.cu in the build. Built it and voila, ~80 MH/s per 750 Ti. Let me know how it turns out. Source for both files here: http://pastebin.com/3kUMumEm
|
|
|
[2014-07-29 16:44:07] GPU #3: GeForce GTX 750 Ti, 81983 khash/s
[2014-07-29 16:44:07] GPU #1: GeForce GTX 750 Ti, 83060 khash/s
[2014-07-29 16:44:07] GPU #0: GeForce GTX 750 Ti, 82335 khash/s
[2014-07-29 16:44:07] GPU #2: GeForce GTX 750 Ti, 81466 khash/s
[2014-07-29 16:44:07] GPU #5: GeForce GTX 750 Ti, 81053 khash/s
[2014-07-29 16:44:07] GPU #4: GeForce GTX 750 Ti, 80631 khash/s
Recipe: Take djm's Doomcoin code, compare hash to target and return if good at the end of the luffa kernel instead of saving hash outputs to global memory, then reading them back for comparison in another kernel. I did it in the version I use but I didn't get any speedup (reason why I didn't upload it). Need to check what I did... (anyhow I haven't seen a block in a couple of hours now...) edit: My version is still saving the hash though... Yep, the global write might just be the thing holding it back. I'm getting a hell of a lot of "submit_upstream_work stratum_send_line failed" and "reject reason: Job 'e585' not found" though, but I can't see how that would be related to the mod. Guess the suprnova stratum is starting to choke as load increases. Doesn't help that there's only one port configured for an extremely low starting diff from a GPU point of view. I tried to remove it but that doesn't change anything for me: 66 MH on the 750 Ti with a power usage of 95% (was 99% before). Weird, I'm fairly certain I didn't really change anything else or compile with the --yo-dawg-gimme-plus-40-percent-hashrates switch ![Huh](https://bitcointalk.org/Smileys/default/huh.gif) Color me confused.
|
|
|
[2014-07-29 16:44:07] GPU #3: GeForce GTX 750 Ti, 81983 khash/s
[2014-07-29 16:44:07] GPU #1: GeForce GTX 750 Ti, 83060 khash/s
[2014-07-29 16:44:07] GPU #0: GeForce GTX 750 Ti, 82335 khash/s
[2014-07-29 16:44:07] GPU #2: GeForce GTX 750 Ti, 81466 khash/s
[2014-07-29 16:44:07] GPU #5: GeForce GTX 750 Ti, 81053 khash/s
[2014-07-29 16:44:07] GPU #4: GeForce GTX 750 Ti, 80631 khash/s
Recipe: Take djm's Doomcoin code, compare hash to target and return if good at the end of the luffa kernel instead of saving hash outputs to global memory, then reading them back for comparison in another kernel. I did it in the version I use but I didn't get any speedup (reason why I didn't upload it). Need to check what I did... (anyhow I haven't seen a block in a couple of hours now...) edit: My version is still saving the hash though... Yep, the global write might just be the thing holding it back. I'm getting a hell of a lot of "submit_upstream_work stratum_send_line failed" and "reject reason: Job 'e585' not found" though, but I can't see how that would be related to the mod. Guess the suprnova stratum is starting to choke as load increases. Doesn't help that there's only one port configured for an extremely low starting diff from a GPU point of view.
|
|
|
[2014-07-29 16:44:07] GPU #3: GeForce GTX 750 Ti, 81983 khash/s
[2014-07-29 16:44:07] GPU #1: GeForce GTX 750 Ti, 83060 khash/s
[2014-07-29 16:44:07] GPU #0: GeForce GTX 750 Ti, 82335 khash/s
[2014-07-29 16:44:07] GPU #2: GeForce GTX 750 Ti, 81466 khash/s
[2014-07-29 16:44:07] GPU #5: GeForce GTX 750 Ti, 81053 khash/s
[2014-07-29 16:44:07] GPU #4: GeForce GTX 750 Ti, 80631 khash/s
Recipe: Take djm's Doomcoin code, compare hash to target and return if good at the end of the luffa kernel instead of saving hash outputs to global memory, then reading them back for comparison in another kernel.
|
|
|
Hey tsiv, thanks for your XMR miner and all your contributions!
I'm getting 230 H/s at the moment with an overclocked 750 Ti and the -l 8x60 parameter.
Any advice to get a bit more, or does it seem good enough like that?
230 isn't horribly bad, but not great for an overclocked 750 Ti either. The Asus DC2OC on my Win box is doing 244 at the default factory OC (1072 core, 900/5400 mem) with a variety of similarly sized configs: 8x60/16x30/32x15. The same configs work well on my Linux rig, again 750 Tis, but this time the OC is 1202 core and 1000/6000 mem for 280 H/s. On the other hand, we've just had some people report piss-poor performance with a 750 Ti and 8x60 while 8x30 worked a lot better. Haven't got a clue why, might have something to do with newer drivers or something ![Huh](https://bitcointalk.org/Smileys/default/huh.gif) You could try playing around with the launch config a bit, could also try the latest nvMiner by cayars from http://www.cudamining.cc/url/releases if you're on Windows.
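One note on those configs: if -l BxT reads as blocks times threads (my reading of the convention, not gospel), then 8x60, 16x30 and 32x15 all launch the same 480 threads in total, which is why they land in the same ballpark, while 8x30 is only half that. A tiny sketch:

```cpp
#include <cassert>
#include <cstdio>

// Parse a ccminer-style "-l BxT" launch config into a total thread count.
// Treating B as blocks and T as threads-per-block is an assumption here;
// check the fork's README for the authoritative meaning.
int total_threads(const char* cfg) {
    int blocks = 0, threads = 0;
    if (std::sscanf(cfg, "%dx%d", &blocks, &threads) != 2)
        return -1;                        // not in BxT form
    return blocks * threads;
}
```

So when a "smaller" config like 8x30 wins, it's a genuinely different occupancy point, not just a reshuffle of the same amount of work.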
|
|
|