CCminer(SP-MOD) Modded GPU kernels.

Epsylon3


ccminer/cpuminer developer

Re: 10MHASH CCminer modded NVIDIA Maxwell kernals by SP.

April 21, 2015, 06:54:59 AM

#2201

-g is useless and will make your fork buggy...

You just have to use -d0,0 to run 2 threads on the same GPU... beware of the cudaDeviceReset() call in this case...

BTC: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd - My Projects: ccminer - cpuminer-multi - yiimp - Forum threads : ccminer - cpuminer-multi - yiimp

sp_ (OP)

Team Black developer

April 21, 2015, 07:24:31 AM

#2202

With -d 0,0 you get hashrate statistics per thread and not per GPU. I also think my fork needs some thread synchronization calls and a changed GPU config (more shared memory, less L1 cache) to avoid invalid hashes.

I want the -g option like in sgminer, but the current implementation (beta) is not working 100%. Tests have shown that 2 threads can boost performance by up to 25% on the high-end cards (960/970/980) on most algorithms.

Team Black Miner (ETHB3 ETH ETC VTC KAWPOW FIROPOW MEOWPOW + dual mining + tripple mining.. https://github.com/sp-hash/TeamBlackMiner

arpika


April 21, 2015, 08:05:21 AM

#2203

Some tests...

A test machine with one 970 and one 750ti, Ubuntu 14.04, ccminer 45, quark algo, compiled for both compute capabilities 5.0 and 5.2:
default config (without -d and -g parameters): no performance increase, and validation errors
-d 0 -g 2 (only the 970 runs): 35% performance increase, no errors
-d 1 -g 2 (only the 750ti runs): no performance increase, and validation errors

sp_ (OP)

Team Black developer

April 21, 2015, 08:28:32 AM
Last edit: April 21, 2015, 08:48:52 AM by sp_

#2204

Quote from: arpika on April 21, 2015, 08:05:21 AM

Some tests...
A test machine with one 970 and one 750ti, Ubuntu 14.04, ccminer 45, quark algo, compiled with both 50/52 capability
default config (without -d and -g parameters): no performance increase and validate errors
-d 0 -g 2 (only 970 runs): 35% performance increase, no errors
-d 1 -g 2 (only 750ti runs): no performance increase and validate errors

The problem is that the 750ti is out of resources.

I will fix it so that it gives a boost and no validation errors on all the Maxwell cards. I just need to reduce the constant/shared memory usage and the threads per block for the kernels.

I think I will recode the -g parameter to support the -d parameter.

Just give me some more time. This will be the biggest boost in hashrates in months.

For a 35% boost I think it is time to include a small developer fee of 2%. What do you think, guys?

I will keep the source code open source and Linux compatible.


ol92


April 21, 2015, 09:29:32 AM

#2205

Quote from: sp_ on April 21, 2015, 08:28:32 AM

The problem is that the 750ti is out of resources.

I will fix it so that it will give a boost and no validation errors on all the maxwell cards. I just need to reduce the constmem/sharedmem usage and reduce threads per block for the kernals.

I think I will recode the -g parameter to support the -d parameter.

just give me some more time. This will be the biggest boost in hashrates in months..

For a 35% boost I think it is time to include a small developer fee of 2%. What do you think guys?

I will keep the sourcecode opensource and linux compatible.

I agree with the developer fee.
It is fairer to small miners than a fixed amount.
You could make it switchable in the source code and force it in the binary releases (I am using the binary releases and I will be happy to contribute 2% for your work).

pallas

Black Belt Developer

April 21, 2015, 09:32:23 AM

#2206

How are you going to implement the developer fee? A time-based pool switch?
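
For illustration, a time-based switch could be sketched roughly as below. This is only an assumption about how such a fee might work, not sp_'s implementation: the function name, the 100-minute cycle and the "dev"/"user" pool labels are all made up.

```python
def pool_for(elapsed_s, cycle_s=6000, fee_percent=2):
    """Pick the pool to mine on, given the miner's uptime in seconds.

    Hypothetical scheme: the first `fee_percent` of every cycle goes to
    a developer pool, the rest of the cycle goes to the user's pool.
    """
    dev_window_s = cycle_s * fee_percent // 100  # 120 s out of 6000 s
    return "dev" if (elapsed_s % cycle_s) < dev_window_s else "user"


# Share of one full cycle spent on the developer pool.
dev_seconds = sum(pool_for(t) == "dev" for t in range(6000))
print(dev_seconds / 6000)
```

With these numbers the miner would spend 2 minutes out of every 100 on the developer pool.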

Cryptonite (XCN): first mini-blockchain coin, innovative, running since 2014!

antonio8


April 21, 2015, 11:14:46 AM

#2207

The -d 0 -g 2 does not work. It is piggybacking on a second card.

Stop all your cards and then run the bat. Look at Precision or Afterburner and you will see that two cards are actually running: device 0 and some other random card. I even tried -d 0 -g 3 and I had 3 cards running. So the increase in hash rate is actually coming from other cards running along with device 0. I do not think this is how it works in cgminer/sgminer.

If you are going to leave your BTC on an exchange please send it to this address instead 1GH3ub3UUHbU5qDJW5u3E9jZ96ZEmzaXtG, I will at least use the money better than someone who steals it from the exchange. Thanks ;)

totoy


April 21, 2015, 12:16:43 PM

#2208

Quote from: antonio8 on April 21, 2015, 11:14:46 AM

The -d 0 -g 2 does not work. It is piggy backing a second card.

Stop all your cards and then run the bat. Look at Precision or Afterburner and you will see two cards are actually running. Device 0 and some other random card. I even tried -d 0 -g 3 and I had 3 cards running. So the increase in hash is actually coming from other cards running along with device 0. I do not thing this is how it works for cgminer/sgminer.

Agreed. I tried using -d 0 -g 2 and two cards are running.

sp_ (OP)

Team Black developer

April 21, 2015, 12:28:21 PM

#2209

1. Disable the 750ti cards in Device Manager (this only works on the 960, 970 and 980).
2. The git head is broken. Use the release 45 exe file and run x11 with -g 4 -i 10, or other settings (no more than 4 cards).

Measure the hashrate on a pool over time.

The -g parameter is not compatible with the -d switch, but I will fix it.

It is not very stable and produces invalid hashes (the invalid hashes will not be submitted to the pool), but you should get higher average rates.


scryptr


April 21, 2015, 01:25:27 PM
Last edit: April 21, 2015, 04:22:52 PM by scryptr

#2210

CUDAMINER--

When I began mining Scrypt with CudaMiner, I picked up the trick of mining with two instances of the miner running simultaneously. The results were a slightly improved total hash rate. Also, if one instance of the miner crashed, the other would pick up the slack until I could set things right.

The "gputhreads" option is promising. However, I tried the "-d 0,0,1,1,2,2,3,3,4,4,5,5" switch in CCminer v45 and it apparently does start two threads per gpu. Mining Quark, I receive "does not validate on CPU" errors, but also have a higher hash rate and 99%+ acceptance rate on my 6x750ti FTW rig. The hash rate is currently 36.8Mh/s for the rig.

My GTX 960 SSC gets about 9.5Mh/s with "-d 0,0". The threads run about 1/2 the total hash rate with greater variance than a single thread. You can tell the threads are running concurrently. The performance with the "-d" switch appears more stable than when using the "-g" switch currently.

If the "gputhreads" switch is de-bugged, and allows for better control of the miner, I am all for it. --scryptr

[Image: 750ti FTW Rig]

[Image: 960 SSC Card]

TIPS: BTC - 1Fs4uZ6a9ABYBTaHGUfqcwCQmeBRxkKRQT DASH - XrK81tW31SLsVvZ2WX9VhTjpT6GXJPLdbQ
SCRYPTR'S NOTEBOOK: https://bitcointalk.org/index.php?topic=5035515.msg46035530#msg46035530
GITHUB: "github.com/scryptr" MERIT is appreciated, also. Thanks!

5w00p


April 21, 2015, 03:57:26 PM

#2211

Quote from: scryptr on April 21, 2015, 01:25:27 PM

CUDAMINER--

When I began mining scrypt with CudaMiner, I picked up the trick of mining with two instances of the miner running simultaneously. The results were a slightly improved total hash rate. Also, if one instance of the miner crashed, the other would pick up the slack until I could set things right.

The "gputhreads" option is promising. However, I tried the "-d 0,0,1,1,2,2,3,3,4,4,5,5" switch and it apparently does start two threads per gpu. I receive "does not validate on CPU" errors, but also have a higher hash rate and 99%+ acceptance rate on my 6x750ti FTW rig. The hash rate is currently 36.8Mh/s for the rig.

My GTX 960 SSC gets about 9.5Mh/s with "-d 0,0". The threads run about 1/2 the total hash rate with greater variance than a single thread. You can tell the threads are running concurrently. The performance with the "-d" switch appears more stable than when using the "-g" switch currently.

If the "gputhreads" switch is de-bugged, and allows for better control of the miner, I am all for it. --scryptr

images snipped

Cool.

What algorithm? Scrypt?

scryptr


April 21, 2015, 04:14:01 PM

#2212

Quote from: 5w00p on April 21, 2015, 03:57:26 PM

Cool.

What algorithm? Scrypt?

I began mining Scrypt with CudaMiner. The images are of a single instance (per machine) of CCminer mining Quark. The algo being mined is output to the screen in blue. --scryptr (P.S. I edited for clarity)


scryptr


April 21, 2015, 04:48:32 PM

#2213

SPLIT THREADS ERROR-

I just captured this:

[Image: 750ti FTW Rig with error]

The anomalous "GPU #11" represents one of the threads, but generally the output only displays GPUs #0-5. As you can see, the acceptance rate while mining Quark with v45 is still good. --scryptr


rednoW


April 21, 2015, 04:51:13 PM

#2214

Quote from: scryptr on April 21, 2015, 04:48:32 PM

The anomalous "GPU #11" represents one of the threads, but generally the output only displays GPUs #0-5. As you can see, the acceptance rate, while mining Quark with v45, is still good. --scryptr

Did the pool confirm the numbers that the miner shows? I mean the hashrate...
I tried my single card with the -d 0,0 option and it gives (not always, but rather often) 'does not validate on CPU' for a non-existent GPU #1.
Also, the higher hashrate the miner shows is not confirmed by the pool.

scryptr


April 21, 2015, 05:01:55 PM

#2215

Quote from: rednoW on April 21, 2015, 04:51:13 PM

did the pool confirm the numbers that miner shows? I mean hashrate ...
I tried with my single card with -d 0,0 option and it gives 'does not validate on CPU' for non-existing GPU1.
Also higher hashrate the miner shows is not confirmed by the pool.

The pool rates fluctuate between 30-40Mh/s. I never know when they are accurate. It is like an "ebb and flow" phenomenon to me.

If you have ever done any CPU mining, you will know that multiple threads can be requested with the "--threads 16" option. If you use that on a quad-core i7, you'll get 2 threads per virtual CPU. --scryptr
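
The arithmetic behind that, as a small sketch: a quad-core i7 with Hyper-Threading exposes 8 virtual (logical) CPUs, so 16 miner threads come to 2 per virtual CPU. The helper name and the default of 2 hardware threads per core are assumptions made for illustration.

```python
def threads_per_logical_cpu(miner_threads, physical_cores, smt_per_core=2):
    """Miner threads per logical (virtual) CPU."""
    logical_cpus = physical_cores * smt_per_core  # 4 cores x 2 = 8
    return miner_threads / logical_cpus


# "--threads 16" on a quad-core i7 with Hyper-Threading
print(threads_per_logical_cpu(16, 4))  # 2.0
```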


rednoW


April 21, 2015, 05:16:17 PM

#2216

The only reason I see for running more threads than physical C(G)PUs is to fill the gaps between tasks. You can easily see those gaps on the GPU-Z load graph. In other cases I think it is useless. There are some memory-intensive applications where running multiple threads on a single CPU gives a boost over a single thread, but that happens when the single-threaded app is badly coded and its data does not fit in cache.

rednoW


April 21, 2015, 05:25:47 PM

#2217

Code:

[2015-04-21 20:23:02] GPU #0: GeForce GTX 750, 4110 kH/s
[2015-04-21 20:23:02] GPU #0: GeForce GTX 750, 555.39 kH/s
[2015-04-21 20:23:03] yaamp.com:4733 qubit block 485
[2015-04-21 20:23:03] GPU #0: GeForce GTX 750, 4178 kH/s
[2015-04-21 20:23:03] GPU #0: GeForce GTX 750, 357.77 kH/s
[2015-04-21 20:23:07] GPU #0: GeForce GTX 750, 4145 kH/s
[2015-04-21 20:23:07] accepted: 13/13 (100.00%), 5792 khash/s yay!!!
[2015-04-21 20:23:08] yaamp.com:4733 qubit block 486
[2015-04-21 20:23:08] GPU #0: GeForce GTX 750, 404.75 kH/s
[2015-04-21 20:23:08] GPU #0: GeForce GTX 750, 4067 kH/s
[2015-04-21 20:23:12] GPU #0: GeForce GTX 750, 4105 kH/s
[2015-04-21 20:23:12] accepted: 14/14 (100.00%), 5799 khash/s yay!!!
[2015-04-21 20:23:22] GPU #0: GeForce GTX 750, 4120 kH/s
[2015-04-21 20:23:23] accepted: 15/15 (100.00%), 5802 khash/s yay!!!
[2015-04-21 20:23:25] GPU #0: GeForce GTX 750, 4114 kH/s
[2015-04-21 20:23:25] accepted: 16/16 (100.00%), 5801 khash/s yay!!!
[2015-04-21 20:23:27] yaamp.com:4733 qubit block 486
[2015-04-21 20:23:27] GPU #0: GeForce GTX 750, 679.92 kH/s
[2015-04-21 20:23:27] GPU #0: GeForce GTX 750, 4084 kH/s
[2015-04-21 20:23:34] yaamp.com:4733 qubit block 487
[2015-04-21 20:23:34] GPU #0: GeForce GTX 750, 4114 kH/s
[2015-04-21 20:23:34] GPU #0: GeForce GTX 750, 417.11 kH/s
[2015-04-21 20:23:38] GPU #0: GeForce GTX 750, 4142 kH/s
[2015-04-21 20:23:39] accepted: 17/17 (100.00%), 5806 khash/s yay!!!

This is qubit on my single GTX 750 with release 45:
ccminer.exe --cpu-priority 5 -d 0,0 -i 16 -a qubit

But I'm not sure that these numbers reflect real performance...
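
One way to cross-check a log like the one above is to add up the per-thread readings that share a timestamp and compare the sum with the "accepted" summary figure. The sketch below assumes the line format shown in the log; pairing readings by timestamp is an assumption of this example, not something the miner documents.

```python
import re
from collections import defaultdict

# Matches per-thread lines such as
# "[2015-04-21 20:23:02] GPU #0: GeForce GTX 750, 4110 kH/s"
LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\] GPU #(?P<gpu>\d+): .*, (?P<rate>[\d.]+) kH/s"
)


def summed_rates(log_lines):
    """Sum the kH/s readings that share a timestamp and GPU index."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in log_lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # pool, block and "accepted" lines are skipped
        key = (m.group("ts"), int(m.group("gpu")))
        totals[key] += float(m.group("rate"))
        counts[key] += 1
    # Keep only the moments where at least two threads reported.
    return {k: v for k, v in totals.items() if counts[k] >= 2}


sample = [
    "[2015-04-21 20:23:02] GPU #0: GeForce GTX 750, 4110 kH/s",
    "[2015-04-21 20:23:02] GPU #0: GeForce GTX 750, 555.39 kH/s",
    "[2015-04-21 20:23:07] GPU #0: GeForce GTX 750, 4145 kH/s",
    "[2015-04-21 20:23:07] accepted: 13/13 (100.00%), 5792 khash/s yay!!!",
]
for (ts, gpu), rate in summed_rates(sample).items():
    print(ts, "GPU", gpu, round(rate, 2), "kH/s")
```

For the 20:23:02 pair this gives about 4665 kH/s, which is below the 5792 khash/s shown on the "accepted" line.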

scryptr


April 21, 2015, 05:31:05 PM

#2218

Quote from: rednoW on April 21, 2015, 05:16:17 PM

The only reason I see for running more threads then physical C(G)PUs is to fill the gaps between tasks. You can easily see that gaps on GPU-Z load graph. In other cases it is useless I think. There is a case in some memory intensive applications when running multiple threads on single CPU gives boost over single thread. But this is a case when single-threaded app is bad coded not fitting data in cache.

HASH RATE --

I always get a little more hash. It only takes a command line switch, no real effort. The real trick is tuning the code to make good use of it.

I am now mining Quark at close to 37Mh/s on my 750ti FTW rig, and that is 400kh/s better than yesterday with the same v45 code. My 960 SSC is getting up to 10Mh/s now:

[Image: 960 SSC nearing max performance mining Quark, v45 sp_hash]

The hash rate fluctuates. --scryptr


sp_ (OP)

Team Black developer

April 21, 2015, 06:04:30 PM

#2219

I have submitted fixes to the -g parameter. It now supports the -d parameter. I have also removed the cudaDeviceReset() call and corrected the stats (total sum for each GPU).

So

ccminer -d 1,2 -g 2

is the same as

ccminer -d 1,1,2,2

but the stats will give the total hashrate of each GPU.
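
As a rough sketch of that equivalence (the function names are hypothetical and this is not the ccminer source): -g repeats every index in the -d list, and the stats are then summed per GPU instead of per thread.

```python
def expand_devices(devices, gpu_threads):
    """Repeat each device index once per GPU thread, keeping the order."""
    return [dev for dev in devices for _ in range(gpu_threads)]


def per_gpu_totals(thread_devices, thread_rates):
    """Add up the per-thread hashrates that belong to the same GPU."""
    totals = {}
    for dev, rate in zip(thread_devices, thread_rates):
        totals[dev] = totals.get(dev, 0.0) + rate
    return totals


threads = expand_devices([1, 2], 2)  # like "-d 1,2 -g 2"
print(threads)                       # [1, 1, 2, 2]

# Made-up per-thread rates, only to show the aggregation.
print(per_gpu_totals(threads, [10.0, 5.0, 7.0, 3.0]))  # {1: 15.0, 2: 10.0}
```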


DougB62

Banned: For Your Protection

April 21, 2015, 06:23:31 PM

#2220

Question - I have been building right along with VS Express 2013, but now I get this?

Code:

Error	3	error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_21,compute_20\" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_35,code=\"sm_35,compute_35\" -gencode=arch=compute_50,code=\"sm_50,compute_50\" -gencode=arch=compute_52,code=\"sm_52,compute_52\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"  -I. -Icompat -I"compat\curl-for-windows\curl\include" -Icompat\jansson -Icompat\getopt -Icompat\pthreads -I"compat\curl-for-windows\openssl\openssl\include" -I"compat\curl-for-windows\zlib" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include"    --keep --keep-dir Release -maxrregcount=80 --ptxas-options=-v --machine 32 --compile -cudart static --ptxas-options="-O2"     -DWIN32 -DNDEBUG -D_CONSOLE -D_CRT_SECURE_NO_WARNINGS -DCURL_STATICLIB -DUSE_WRAPNVML -DSCRYPT_KECCAK512 -DSCRYPT_CHACHA -DSCRYPT_CHOOSE_COMPILETIME -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Ox /Zi  /MT  " -o Release\fermi_kernel.cu.obj "C:\Users\Linux\Desktop\ccminer-windows\scrypt\fermi_kernel.cu"" exited with code 1.	C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 6.5.targets	593	9	ccminer

Edit - I just noticed this, even though I'm building for Maxwell: Release\fermi_kernel.cu.obj "C:\Users\Linux\Desktop\ccminer-windows\scrypt\fermi_kernel.cu