tsiv
|
|
July 29, 2014, 09:21:12 AM Last edit: July 29, 2014, 09:34:38 AM by tsiv |
|
Mine. Butchered up a quick SPH port pre-launch and it just happened to work right out of the box. Hadn't been following the other algos and didn't realize there was a variation of the luffa512 kernel in djm's repo that handled 80 input bytes. Couldn't figure out how to mod the one in x11 to do 80 instead of 64 for input, so yet another ugly SPH copy&paste job it was. I expect djm's will be the faster of the two, compiling it as we speak to confirm.

Which speeds did you reach with 750TI ?

290 MH/s with a 6 card rig so about 48.3 MH/s each on average. Still stuck compiling djm's version, I did read it compiles slow on CUDA 5.5 but dddddddddddddaaaaaaaaaaaaaamn....

Edit: Compilation finished. No surprises there, djm's is indeed faster by roughly 17%
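For anyone wondering about the 64 vs 80 byte difference mentioned above, here is a rough sketch of the padding arithmetic. It assumes Luffa's 32-byte message block and its "append one 0x80 byte, then zeros" padding for byte-aligned input. The function names are made up for illustration and are not taken from either miner.

```c
#include <stddef.h>

/* Luffa absorbs the message in 32-byte (256-bit) blocks. */
#define LUFFA_BLOCK_BYTES 32

/* Padded length of a byte-aligned message: a 0x80 byte is always
 * appended, then zeros up to the next multiple of the block size.
 * (Hypothetical helper, for illustration only.) */
size_t luffa_padded_len(size_t msg_len)
{
    return (msg_len / LUFFA_BLOCK_BYTES + 1) * LUFFA_BLOCK_BYTES;
}

/* Number of message blocks that have to be absorbed after padding. */
size_t luffa_block_count(size_t msg_len)
{
    return luffa_padded_len(msg_len) / LUFFA_BLOCK_BYTES;
}
```

Under those assumptions both the 64-byte x11 input and the 80-byte block header pad out to 96 bytes, so three blocks either way. What changes is the last block: with 64 bytes it is just the padding, with 80 bytes it carries 16 bytes of header data followed by the padding.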
|
|
|
|
henningml
Newbie
Offline
Activity: 27
Merit: 0
|
|
July 29, 2014, 09:36:59 AM |
|
Is it OK to compile latest miners on Ubuntu with Cuda 5.5 or should I install new Cuda version?
I compiled djm34's github-source successfully today. Mint 17 / Ubuntu 14.04, Cuda 5.5, drivers 331.38
|
|
|
|
Amph
Legendary
Offline
Activity: 3206
Merit: 1069
|
|
July 29, 2014, 09:40:13 AM |
|
1 card gives you 48 MH/s??? Isn't it supposed to be 8 MH/s? You mean one rig gives you 48M?
|
|
|
|
tsiv
|
|
July 29, 2014, 09:57:25 AM |
|
Pretty sure you're confusing it with some other coin. DOOM is just a single round of luffa-512 and runs at about 48-49 MH/s using a poor implementation on my 750 Ti. Djm's version bumped it up to the 56 MH/s area. Per card.
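The per-card numbers in this exchange can be checked with a couple of lines of arithmetic. This is only a sketch; the helper names are invented here and the inputs are the figures quoted in the thread (290 MH/s over 6 cards, roughly 17% speedup).

```c
/* Average per-card hashrate of a rig, in MH/s. */
double per_card_rate(double rig_rate_mhs, int cards)
{
    return rig_rate_mhs / cards;
}

/* Hashrate after a relative speedup given in percent. */
double apply_speedup(double rate_mhs, double percent)
{
    return rate_mhs * (1.0 + percent / 100.0);
}
```

per_card_rate(290.0, 6) comes out at about 48.3 MH/s, and applying 17% on top of that lands in the 56 MH/s area, which matches what was reported for djm's kernel.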
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
July 29, 2014, 10:01:17 AM |
|
48MH sounds good. I am doing roughly 170MHash/s with the 780ti and the 750ti (110 for the 780ti). But the R9-290x is doing 145MHash/s all alone and I had to decrease the TDP by 20% (otherwise the temp goes up to 94°C). Regarding compilation, I compiled compute 3.0, 3.5, 5.0 all together and it takes only 1/2 hours with cuda 6.5... I am really done with cuda 5.5
|
djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
|
|
|
go6ooo1212
Legendary
Offline
Activity: 1512
Merit: 1000
quarkchain.io
|
|
July 29, 2014, 10:04:49 AM |
|
confirmed - 55-56 Mh/s average speed with a 750TI
|
|
|
|
Amph
Legendary
Offline
Activity: 3206
Merit: 1069
|
|
July 29, 2014, 10:11:44 AM |
|
Funny thing is that I can compile in 35 min with CUDA 5.5, lol, dunno why. I'm trying CUDA 6.0 but it doesn't work with Visual Studio 2010.
|
|
|
|
tsiv
|
|
July 29, 2014, 10:14:06 AM |
|
/* a0..a3 are read and written in place, a4 is a write-first scratch register,
   so all five are declared as output operands ("+r" / "=r") */
#define SUBCRUMB(a0,a1,a2,a3,a4) \
	asm( \
	"mov.b32 %4, %0;\n\t" \
	"or.b32 %0, %0, %1;\n\t" \
	"xor.b32 %2, %2, %3;\n\t" \
	"not.b32 %1, %1;\n\t" \
	"xor.b32 %0, %0, %3;\n\t" \
	"and.b32 %3, %3, %4;\n\t" \
	"xor.b32 %1, %1, %3;\n\t" \
	"xor.b32 %3, %3, %2;\n\t" \
	"and.b32 %2, %2, %0;\n\t" \
	"not.b32 %0, %0;\n\t" \
	"xor.b32 %2, %2, %1;\n\t" \
	"or.b32 %1, %1, %3;\n\t" \
	"xor.b32 %4, %4, %1;\n\t" \
	"xor.b32 %3, %3, %2;\n\t" \
	"and.b32 %2, %2, %1;\n\t" \
	"xor.b32 %1, %1, %0;\n\t" \
	"mov.b32 %0, %4;\n\t" \
	: "+r"(a0), "+r"(a1), "+r"(a2), "+r"(a3), "=r"(a4))
Massive +1 MH/s on 750 Ti. Well, at least with CUDA 5.5. No idea how that can actually be even a single bit faster than straight C; the compiler seems to do a piss poor job sometimes with simple statements like the ones in that define.
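For comparison, this is what the same instruction sequence looks like as straight C, transcribed one PTX line at a time from the macro above. It is a host-side sketch for checking the logic, not the kernel code from either repo, and the function name subcrumb_c is made up.

```c
#include <stdint.h>

/* Plain C rendering of the SUBCRUMB instruction sequence.
 * a0..a3 are updated in place; tmp plays the role of the scratch
 * operand a4 (%4 in the PTX). */
void subcrumb_c(uint32_t *a0, uint32_t *a1, uint32_t *a2, uint32_t *a3)
{
    uint32_t tmp = *a0;  /* mov.b32 %4, %0     */
    *a0 |= *a1;          /* or.b32  %0, %0, %1 */
    *a2 ^= *a3;          /* xor.b32 %2, %2, %3 */
    *a1 = ~*a1;          /* not.b32 %1, %1     */
    *a0 ^= *a3;          /* xor.b32 %0, %0, %3 */
    *a3 &= tmp;          /* and.b32 %3, %3, %4 */
    *a1 ^= *a3;          /* xor.b32 %1, %1, %3 */
    *a3 ^= *a2;          /* xor.b32 %3, %3, %2 */
    *a2 &= *a0;          /* and.b32 %2, %2, %0 */
    *a0 = ~*a0;          /* not.b32 %0, %0     */
    *a2 ^= *a1;          /* xor.b32 %2, %2, %1 */
    *a1 |= *a3;          /* or.b32  %1, %1, %3 */
    tmp ^= *a1;          /* xor.b32 %4, %4, %1 */
    *a3 ^= *a2;          /* xor.b32 %3, %3, %2 */
    *a2 &= *a1;          /* and.b32 %2, %2, %1 */
    *a1 ^= *a0;          /* xor.b32 %1, %1, %0 */
    *a0 = tmp;           /* mov.b32 %0, %4     */
}
```

Feeding it all-zero words gives a0, a2 and a3 all ones and a1 zero, which makes for a quick sanity check when comparing an asm variant against the C one.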
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
July 29, 2014, 10:20:50 AM |
|
You have to right-click on the project and click on build customization, and there you can choose the cuda version.
|
|
|
|
Amph
Legendary
Offline
Activity: 3206
Merit: 1069
|
|
July 29, 2014, 10:24:51 AM |
|
i can't even load the project
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
July 29, 2014, 10:36:48 AM |
|
did you uninstall cuda 5.5 ?
|
|
|
|
Amph
Legendary
Offline
Activity: 3206
Merit: 1069
|
|
July 29, 2014, 10:38:39 AM |
|
yeah i thought it wasn't necessary anymore, i'm re-installing now
|
|
|
|
cayars
|
|
July 29, 2014, 10:40:14 AM |
|
If not, and you're on Windows, open the project file in Notepad. Search for 5.5 and replace it with 6.0. You will find it 2 or 3 times. That will change the CUDA version from 5.5 to 6.0. You can do the same for 6.5 too. Re-open the project and see if that works.

Carlo

PS: You can have multiple versions of the CUDA toolkit installed, as they use different directories (what we just edited above).
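If you would rather script the search-and-replace than do it by hand, something along these lines works on the text of the project file. It is a sketch only: replace_all is a made-up helper, and the example line it was tried on ("BuildCustomizations\CUDA 5.5.props") is an assumption about what the references in the project file look like, so check your own file first. Matching on "CUDA 5.5" instead of a bare "5.5" lowers the chance of touching an unrelated version number.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst, replacing every occurrence of `from` with `to`.
 * Returns the number of replacements, or -1 if dst is too small or
 * `from` is empty. (Hypothetical helper, for illustration only.) */
int replace_all(const char *src, const char *from, const char *to,
                char *dst, size_t dst_cap)
{
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    size_t out = 0;
    int count = 0;

    if (from_len == 0 || dst_cap == 0)
        return -1;

    while (*src) {
        if (strncmp(src, from, from_len) == 0) {
            /* keep room for the replacement plus the final NUL */
            if (out + to_len >= dst_cap)
                return -1;
            memcpy(dst + out, to, to_len);
            out += to_len;
            src += from_len;
            count++;
        } else {
            if (out + 1 >= dst_cap)
                return -1;
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return count;
}
```

Read the project file into a buffer, run it through replace_all with "CUDA 5.5" and "CUDA 6.0", and write the result back out. Keep a copy of the original file in case the project still refuses to load.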
|
|
|
|
Amph
Legendary
Offline
Activity: 3206
Merit: 1069
|
|
July 29, 2014, 10:44:43 AM |
|
everything fine, that pesky ccminer.rc is there again
|
|
|
|
AizenSou
|
|
July 29, 2014, 10:58:29 AM |
|
tsiv, you solo-mined 150 blocks out of 600 total so far? LOL, the "fastest instaminer in the cryptoworld" title should pass from djm34 to you.
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
July 29, 2014, 10:59:20 AM |
|
Yes, I know... it doesn't want to go away (it must be referenced in some vcxproj files... but the problem is that the Visual Studio search won't look in those files... so it is difficult to catch...)
|
|
|
|
Madychoux
|
|
July 29, 2014, 11:02:45 AM |
|
Hey TSIV, thanks for your XMR miner and all your contributions !
I'm getting 230KH/s at the moment with overclocked 750 Ti and -l 8x60 parameter.
Any advice to get a bit more or it seems good enough like that ?
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
July 29, 2014, 11:02:58 AM |
|
Yep, I don't have that much, I solo-mined only 5950 (took me a while to get the R9 working, strangely...), but blocks are still coming... (actually there wasn't any instamine, or I arrived late to that coin... and blocks weren't coming that easily even when I was doing 1/2 of the net hashrate). OK, there are still 331 unaccounted blocks.
|
|
|
|
TheFridge
|
|
July 29, 2014, 11:04:44 AM |
|
I'm only getting 80% accepted shares with doomcoin. Is this normal for this algo??
|
|
|
|
AizenSou
|
|
July 29, 2014, 11:06:41 AM |
|
OK, tsiv 150 blocks and you 120 blocks out of 620 total. suprnova found only 99 blocks with a 3 GH/s hashrate. What should I call that then?
|
|
|
|
|