CCminer(SP-MOD) Modded GPU kernels.

sp_ (OP)

Legendary

Offline

Activity: 2954
Merit: 1087

Team Black developer

Re: 10MHASH CCminer modded NVIDIA Maxwell kernels by SP.

February 24, 2015, 07:35:42 AM

#1661

Quote from: Wolf0 on February 24, 2015, 07:03:31 AM

The CUDA and OpenCL code for Whirlpool consists of lookups into huge tables - which sucks for the GPU;

The lookup is done in shared memory and takes 1 cycle, but the internal RISC CPU needs 4 instructions to do the lookup (byteperm/add/shift/move).
With the BFINS instruction and aligned memory buffers this can be reduced to 2 instructions, although I failed to implement it in my first attempt (AES).

Team Black Miner (ETHB3 ETH ETC VTC KAWPOW FIROPOW EVRPROGPOW MEOWPOW + dual mining + triple mining): https://github.com/sp-hash/TeamBlackMiner

tbearhere

Legendary

Offline

Activity: 3220
Merit: 1003

February 24, 2015, 09:08:12 AM

#1662

I need help with DGB coin on the Qubit algo. I think Theblocksfactory needs a setting that neither I nor the pool owner understands; all other pools on this algorithm work fine.
On the Qubit algo, before #33 we needed -f 236, and now we don't. On this pool I haven't been able to get ccminer working properly for the past 6 months. With the older versions I needed to restart the program every 60 seconds for the pool to show my true hashrate. With #39, no -f 236 is needed and it works fine, except that the pool only accepts exactly 1/2 of my hashrate. I think it's a setting the pool owner needs to change. Again, I tried this on other pools and it works fine. Any thoughts on this, please? PS: The other pools have so little hashrate that they only hit a block once in a while.
Thx

bathrobehero

Legendary

Offline

Activity: 2002
Merit: 1051

ICO? Not even once.

February 24, 2015, 12:06:41 PM

#1663

Quote from: tbearhere on February 24, 2015, 09:08:12 AM

If a pool is showing half the hashrate, chances are you're doing twice the expected work, so doubling your difficulty divide factor (--diff or -f) is probably what's missing. The default is 1, so you should try 2. Conversely, if it only accepts half the shares, then you're sending smaller chunks of work than the pool expects, in which case halving the diff helps (-f 0.5). If there are still rejected shares, try lowering the value to something like -f 0.0078125 (1/128) or -f 0.00390625 (1/256) to offset the default 128/256 multipliers while checking the pool's reported hashrate.
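To make the "half the hashrate" reasoning concrete, here is a minimal sketch in Python. It is not ccminer or pool code: the function name, the share-crediting model, and the Bitcoin-style 2^32 hashes-per-difficulty-1 constant are all assumptions for illustration. It only shows that when the miner works at twice the difficulty the pool credits per share, the pool's estimate drops to one half.

```python
# Hypothetical model for illustration only; not ccminer or pool code.
HASHES_PER_DIFF1 = 2 ** 32  # assumed Bitcoin-style difficulty-1 convention


def pool_reported_hashrate(true_hashrate, credited_diff, mined_diff):
    """Hashrate a pool would estimate if it credits `credited_diff` per share
    while the miner only submits shares that meet `mined_diff`."""
    shares_per_second = true_hashrate / (mined_diff * HASHES_PER_DIFF1)
    return shares_per_second * credited_diff * HASHES_PER_DIFF1


true_rate = 4_000_000.0  # example figure: 4 MH/s of real work

# Miner and pool agree on the difficulty: the pool sees the full rate.
matched = pool_reported_hashrate(true_rate, credited_diff=4, mined_diff=4)

# Miner works at twice the credited difficulty: the pool sees half.
doubled = pool_reported_hashrate(true_rate, credited_diff=4, mined_diff=8)

print(matched, doubled)

# The small -f values quoted in the post are exactly 1/128 and 1/256.
print(1 / 128, 1 / 256)
```

In this model a factor-of-two mismatch in either direction shows up directly in the pool's number, which is why trying -f 2 or -f 0.5 and watching the reported hashrate is a reasonable first test.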

Not your keys, not your coins!

sp_ (OP)

February 24, 2015, 12:26:32 PM

#1664

Quote from: Wolf0 on February 24, 2015, 07:37:54 AM

Quote from: sp_ on February 24, 2015, 07:35:42 AM

Quote from: Wolf0 on February 24, 2015, 07:03:31 AM

The CUDA and OpenCL code for Whirlpool consists of lookups into huge tables - which sucks for the GPU;

I haven't done CUDA in quite a while, but here's a tip about AMD - using fucktons of LDS is bad for you. It reduces the waves in flight - more waves in flight usually mean more performance, up to a point.

The Maxwell can issue 2 instructions per clock cycle, but only one per cycle when the instruction uses shared/const memory. Normal superscalar design. That's why I normally move constants into the instruction cache. You just need to make sure that the code size fits in the cache.

tbearhere

February 24, 2015, 03:16:36 PM
Last edit: February 24, 2015, 04:48:18 PM by tbearhere

#1665

Quote from: bathrobehero on February 24, 2015, 12:06:41 PM

-f 0.5 cuts it in half again, so the total is 1/4 of my hashrate. I tried another pool and it's fine, but this AMD pool is s***. I tried 2 on theblocksfactory, but it over-shares. So I have come to the conclusion that it's the theblocksfactory pool.
I'm on http://digihash.co now: very good, no problems.

bathrobehero

February 24, 2015, 05:12:08 PM

#1666

Quote from: tbearhere on February 24, 2015, 03:16:36 PM

Theblocksfactory is weird. When their vardiff starts climbing it throws rejects, so it drops back and repeats. Anyway, you can use a fixed minimum vardiff; for a 6-card 750 Ti rig, 4 (.workername_diff4) seems to work fine with -f 256 on release 39.

tbearhere

February 24, 2015, 10:41:54 PM
Last edit: February 25, 2015, 09:14:49 PM by tbearhere

#1667

Quote from: bathrobehero on February 24, 2015, 05:12:08 PM

Thanks.
That's better, but I'm getting a lot of booos with shares above target. Funny that we have to use -f 256 for that.
Edit: With diff4 and -f 256 the hashrate went from 50% to 75%. So I'll have to adjust diff4 to 2 or 8 etc. and see what happens when I get a chance.

sp_ (OP)

February 25, 2015, 10:41:29 PM

#1668

I have rewritten the Whirlpool hash. It is 12% faster when mining Whirlcoin (750 Ti).
x15 is +20 khash (750 Ti).

I will clean up a bit and submit to GitHub.

sp_ (OP)

February 25, 2015, 10:52:30 PM

#1669

Quote from: Wolf0 on February 25, 2015, 10:49:36 PM

Quote from: sp_ on February 25, 2015, 10:41:29 PM

I have rewritten the Whirlpool hash. It is 12% faster when mining Whirlcoin (750 Ti).
x15 is +20 khash (750 Ti).
I will clean up a bit and submit to GitHub.

Sounds like you're still using tables...

Yes, but the table is 1/8 the size.

sp_ (OP)

February 25, 2015, 11:24:22 PM

#1670

Quote from: Wolf0 on February 25, 2015, 10:58:09 PM

Yeah, a few rotations and you can get the size down, still ouch.

The hashing function can probably be improved more, but 12% is OK for now.
I have also submitted a speedup in Fugue (x13): precalculated part of the hash and removed instructions.
Building release 40 now.
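A rough illustration of the "1/8 the size" idea, written in Python instead of CUDA so it can be run anywhere. In table-driven Whirlpool, the eight 256-entry 64-bit tables are byte rotations of one another, so one table plus a rotate can stand in for all eight. The sketch below is not the code from the commit: the table values are random stand-ins instead of the real Whirlpool constants, the rotation direction is an assumption, and row_full / row_small are hypothetical names for a simplified lookup-and-XOR step.

```python
import random

MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


rng = random.Random(0)

# Stand-in for the base table (NOT the real Whirlpool constants).
T0 = [rng.getrandbits(64) for _ in range(256)]

# The "huge tables" variant: eight tables, each a byte rotation of T0.
FULL = [[rotl64(v, 8 * i) for v in T0] for i in range(8)]


def row_full(state):
    """Simplified lookup step using all eight precomputed tables."""
    acc = 0
    for i, word in enumerate(state):
        byte = (word >> (8 * i)) & 0xFF
        acc ^= FULL[i][byte]
    return acc


def row_small(state):
    """Same step using only T0 plus a rotation per lookup."""
    acc = 0
    for i, word in enumerate(state):
        byte = (word >> (8 * i)) & 0xFF
        acc ^= rotl64(T0[byte], 8 * i)
    return acc


state = [rng.getrandbits(64) for _ in range(8)]
full_bytes = 8 * 256 * 8   # eight tables of 256 eight-byte entries
small_bytes = 256 * 8      # one table

print(row_full(state) == row_small(state), full_bytes, small_bytes)
```

The trade-off is one extra rotate per lookup in exchange for a table that is eight times smaller, which matters when the table has to live in shared memory.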

sp_ (OP)

February 25, 2015, 11:44:23 PM

#1671

1.5.40(sp-MOD) is available here: (27-feb-2015)

https://github.com/sp-hash/ccminer/releases/tag/1.5.40

The sourcecode is available here:

https://github.com/sp-hash/ccminer

Differences from release 39:

Whirlcoin: +12%

Faster hash in:

Whirlpool (x15, x17)
Fugue (x13, x14, x15, x17)
Shavite (x11, x13, x14, x15, x17) (tiny speedup)
Shabal (x11, x13, x14, x15, x17) (tiny speedup)

djm34

Legendary

Offline

Activity: 1400
Merit: 1050

February 26, 2015, 12:23:17 AM

#1672

Quote from: sp_ on February 25, 2015, 11:44:23 PM

I tried the rotation some time ago but I wasn't convinced; however, I don't think I have tried it with uint2 since then (I hate working on Whirlpool... it takes forever to compile).

I get +20 MH/s on whirlpoolx on the GTX 980,
+10 MH/s on the 750 Ti,
but -30 MH/s on the 780 Ti.

djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw

sp_ (OP)

February 26, 2015, 07:18:54 AM

#1673

With uint2 it uses more registers and spills to memory, so I increased the launch bound to 128 regs.
The code size is also bigger.
On the 780 Ti you should probably not unroll all the loops.

There are more speedups to come. Still some easy pickings.
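The register trade-off mentioned above can be put into rough numbers. This is a back-of-the-envelope sketch in Python, not something taken from the ccminer source: it assumes a Maxwell SM with 65536 32-bit registers and at most 2048 resident threads, ignores register allocation granularity and shared memory limits, and uses a hypothetical helper name.

```python
# Assumed per-SM limits for Maxwell; allocation granularity is ignored.
REGISTERS_PER_SM = 65536
MAX_THREADS_PER_SM = 2048


def max_resident_threads(regs_per_thread):
    """Upper bound on threads per SM when only registers limit occupancy."""
    return min(MAX_THREADS_PER_SM, REGISTERS_PER_SM // regs_per_thread)


for regs in (32, 64, 128):
    threads = max_resident_threads(regs)
    print(regs, threads, threads / MAX_THREADS_PER_SM)
```

Under these assumptions, 128 registers per thread leaves room for 512 resident threads per SM, a quarter of the maximum. Fewer threads in flight is the price paid for avoiding spills, which is one plausible reason the same kernel behaves differently from one GPU generation to another.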

rednoW

Legendary

Offline

Activity: 1510
Merit: 1003

February 26, 2015, 08:50:22 AM

#1674

The "Faster fugue" commit https://github.com/sp-hash/ccminer/commit/2ab3254cbddedc0a34020fb3b5d7917fca87dc01 was a little bit (5-7 khs) slower on my GTX 750, even with a manually fine-tuned -i.
But the "faster whirlpool" commit https://github.com/sp-hash/ccminer/commit/9715bf7eea8f1c92034e1c67891fa242a8c63d26 is faster for x15.

So I just added x15/cuda_x15_whirlpool.cu from Release 40 to my custom build and got optimal performance: +17 khs on x15 with no drop in x13 and x14.

rednoW

February 26, 2015, 09:11:12 AM

#1675

Damn, I can't understand why your R40 binary is 20 khs faster in Qubit than my custom R39-based build.
Your R40 didn't work on the GTX 750 1 GB with default settings (out of memory error), so I use it with the "-i 19.3" option.

And it is 20 khs faster than both your own R39 binary and my custom build.
As far as I can see on GitHub, there were no changes in Qubit between R39 and R40 except for a different launch config, which is overridden by the -i option.

sp_ (OP)

February 26, 2015, 09:21:29 AM

#1676

Quote from: rednoW on February 26, 2015, 08:50:22 AM

The "Faster fugue" commit https://github.com/sp-hash/ccminer/commit/2ab3254cbddedc0a34020fb3b5d7917fca87dc01 was a little bit (5-7 khs) slower on my GTX 750, even with a manually fine-tuned -i.
But the "faster whirlpool" commit https://github.com/sp-hash/ccminer/commit/9715bf7eea8f1c92034e1c67891fa242a8c63d26 is faster for x15.
So I just added x15/cuda_x15_whirlpool.cu from Release 40 to my custom build and got optimal performance: +17 khs on x15 with no drop in x13 and x14.

There are 3 commits to the Fugue hash between 39 and 40. Did you forget these?

https://github.com/sp-hash/ccminer/commit/cfb07f6488f436caae14fcd179933ae74efa6a65
https://github.com/sp-hash/ccminer/commit/c60096da2401173051c11fe3a864aee2b2f5d7ad

sp_ (OP)

February 26, 2015, 09:23:32 AM

#1677

Quote from: rednoW on February 26, 2015, 09:11:12 AM

Damn, I can't understand why your R40 binary is 20 khs faster in Qubit than my custom R39-based build.
Your R40 didn't work on the GTX 750 1 GB with default settings (out of memory error), so I use it with the "-i 19.3" option.
And it is 20 khs faster than both your own R39 binary and my custom build.
As far as I can see on GitHub, there were no changes in Qubit between R39 and R40 except for a different launch config, which is overridden by the -i option.

Shavite is used in Qubit. Did you include this change?

https://github.com/sp-hash/ccminer/commit/af409c1da57085ad942b3de05d1e33f730d6b910

Also make sure that you compile with the latest drivers.

rednoW

February 26, 2015, 09:45:25 AM

#1678

Quote from: sp_ on February 26, 2015, 09:21:29 AM

No, I didn't. These 2 commits are in my build.
And the last one, https://github.com/sp-hash/ccminer/commit/2ab3254cbddedc0a34020fb3b5d7917fca87dc01, is certainly not so good on my 750, so I got rid of it ;)

rednoW

February 26, 2015, 09:51:42 AM

#1679

Quote from: sp_ on February 26, 2015, 09:23:32 AM

Shavite is used in Qubit. Did you include this change?

https://github.com/sp-hash/ccminer/commit/af409c1da57085ad942b3de05d1e33f730d6b910

Also make sure that you compile with the latest drivers.

You were right! I missed it! Now my own build is doing well: 4378 khs in Qubit.

sp_ (OP)

February 26, 2015, 10:08:20 AM

#1680

Quote from: rednoW on February 26, 2015, 09:51:42 AM

Quote from: sp_ on February 26, 2015, 09:23:32 AM

Shavite is used in Qubit. Did you include this change?
https://github.com/sp-hash/ccminer/commit/af409c1da57085ad942b3de05d1e33f730d6b910
Also make sure that you compile with the latest drivers.

You were right! I missed it! Now my own build is doing well: 4378 khs in Qubit.

Pretty good for 512 cores. (The GTX 750 Maxwell 1 GB card retails for around $120.)

An AMD Radeon R9 280X does 5.5 MHASH with 2048 cores (optimized open-source miner, with 5 times the power usage).
Wolf, how fast can you make Qubit?
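For a quick per-core comparison of the two figures quoted in this post, a small Python calculation is enough. It uses only the numbers given above (4378 khs on 512 cores, 5.5 MHASH on 2048 cores); the variable names are made up for the example, and NVIDIA and AMD "cores" are not the same kind of unit, so the ratio is only a rough indication.

```python
# Figures taken from the posts above; variable names are illustrative.
gtx750_khs, gtx750_cores = 4378.0, 512
r9_280x_khs, r9_280x_cores = 5500.0, 2048

gtx750_per_core = gtx750_khs / gtx750_cores      # khs per core
r9_280x_per_core = r9_280x_khs / r9_280x_cores   # khs per core

ratio = gtx750_per_core / r9_280x_per_core

print(round(gtx750_per_core, 2), round(r9_280x_per_core, 2), round(ratio, 2))
```

By this crude measure the GTX 750 gets roughly three times as much Qubit throughput per core as the 280X with the numbers quoted.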
