Bitcoin Forum
Author Topic: CCminer(SP-MOD) Modded GPU kernels.  (Read 2347498 times)
sp_ (OP)
February 16, 2015, 11:04:01 PM
 #1621

The last AES change is slower on the 750ti. I am working to improve it.

Team Black Miner (ETHB3 ETH ETC VTC KAWPOW FIROPOW MEOWPOW + dual mining + tripple mining.. https://github.com/sp-hash/TeamBlackMiner
rednoW
February 17, 2015, 08:16:44 AM
 #1622

Quote from: sp_ on February 16, 2015, 11:04:01 PM
    The last AES change is slower on the 750ti. I am working to improve it.

I rolled back the change in cuda_x11_aes.cu and saw no difference. Also, the commit "Faster shabal" (https://github.com/sp-hash/ccminer/commit/c7eef5275ab77f02d3d86601092774fae8a29cd7) doesn't change the hash rates on my setup.
sp_ (OP)
February 17, 2015, 09:33:41 AM
 #1623

Quote from: rednoW on February 17, 2015, 08:16:44 AM
    Quote from: sp_ on February 16, 2015, 11:04:01 PM
        The last AES change is slower on the 750ti. I am working to improve it.
    I rolled back the change in cuda_x11_aes.cu and saw no difference. Also, the commit "Faster shabal" (https://github.com/sp-hash/ccminer/commit/c7eef5275ab77f02d3d86601092774fae8a29cd7) doesn't change the hash rates on my setup.

cuda_x11_aes.cu is excluded from the project file, so a change to it is not rebuilt unless you also save the echo or shavite source, or do a full build. To measure, you can use Fresh, because it has fewer chained hashing algos.
The differences are small, but the PTX (assembly) code shows fewer instructions with my changes. Also, different cards have different timings.
A small speedup of 1 kH/s will not be noticeable, but 100 small speedups will.

sp_ (OP)
February 17, 2015, 09:36:47 AM
 #1624

Quote from: rednoW on February 17, 2015, 08:16:44 AM
    Quote from: sp_ on February 16, 2015, 11:04:01 PM
        The last AES change is slower on the 750ti. I am working to improve it.
    I rolled back the change in cuda_x11_aes.cu and saw no difference. Also, the commit "Faster shabal" (https://github.com/sp-hash/ccminer/commit/c7eef5275ab77f02d3d86601092774fae8a29cd7) doesn't change the hash rates on my setup.

x14 has 14 chained algos. If I optimize shabal by 2%, the increase in the total hashrate is very small, something like 0.005%. This is because groestl, echo, simd etc. are much slower and take most of the time.

sp_ (OP)
February 17, 2015, 09:45:26 AM
 #1625

Another problem is 64-bit vs 32-bit, and Windows vs Linux.

I optimize for Windows and 32-bit. On Linux, builds are normally 64-bit. (Some optimizations are faster when building for 32-bit.)

sp_ (OP)
February 17, 2015, 09:49:47 AM
 #1626

Quote:
    qubit has slowed from 3609 kH/s on the previous version to 3571 kH/s on the current one - GTX 850M Linux, using "git pull" to update, build.sh has -O3 passed to configure

The fresh algo uses 2 rounds of shavite, 2 of simd and 1 echo.

shavite512
simd512
shavite512
simd512
echo512

Shavite and echo are affected by my AES change, which is faster on compute 5.2 cards. I am rewriting the AES code now, so the next commit will hopefully increase the performance of fresh.

rednoW
February 17, 2015, 11:09:57 AM
 #1627

Quote from: sp_ on February 17, 2015, 09:33:41 AM
    cuda_x11_aes.cu is excluded from the project file, so a change to it is not rebuilt unless you also save the echo or shavite source, or do a full build. To measure, you can use Fresh, because it has fewer chained hashing algos.

I checked the VS build log after the rollback. cuda_x11_aes.cu is #included in 2 other .cu files, and VS rebuilt both of them. So I think I did it right.
sp_ (OP)
February 17, 2015, 09:33:38 PM
 #1628

Any coders who can help me?

I am trying to reduce the number of instructions per shared memory access in cuda_x11_aes.cu from 4 to 2.

I am trying to force the pointer to be correct with this code:

__device__ __forceinline__
uint32_t  SPRead(uint32_t *const __restrict__ x, uint32_t low)
{
   uint32_t tmp,res;
   asm("bfi.b32 %0, %1, %2, 10 , 8;" : "=r"(tmp) : "r"(low), "r"(x));
   asm("ld.shared.u32  %0,[%1];" : "=r"(res) : "r"(tmp));
   return res;
}

The PTX looks good:

cvta.shared.u32    %r393, _Z23x11_echo512_gpu_hash_64jjPy$__cuda_local_var_204566_57_non_const_sharedMemory;
// inline asm
bfi.b32 %r232, %r18, %r393, 10 , 8;
// inline asm
// inline asm
ld.shared.u32  %r235,[%r232];


But the program crashes with an illegal memory access. The shared memory pointer is aligned to a 1024 boundary.


Without the pointer hack I get code like this:

   bfe.u32 %r243, %r4, %r440, 8;
   // inline asm
   shl.b32    %r468, %r243, 2;
   add.s32    %r469, %r468, %r451;
   ld.shared.u32    %r470, [%r469+1024];

4 instructions.
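
One thing worth checking, offered as a hedged sketch and not a confirmed diagnosis: PTX `bfi.b32 d, a, b, pos, len` inserts the low `len` bits of `a` into `b` starting at bit `pos`. For a table of 32-bit words the byte offset is index * 4, so the index would need to land in bits 2..9, not 10..17. The host-side model below (bfi32 and word_address are illustrative names, and the base 0x400 is an assumed 1024-aligned address) compares the two bit positions.

```cpp
#include <cstdint>

// Software model of PTX `bfi.b32 d, a, b, pos, len`: copy the low `len` bits
// of `a` into `b` starting at bit `pos`, leaving the other bits of `b` alone.
uint32_t bfi32(uint32_t a, uint32_t b, uint32_t pos, uint32_t len) {
    if (len == 0 || pos >= 32) return b;          // nothing to insert
    uint32_t field = (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
    uint32_t mask = field << pos;                 // bits that get replaced
    return (b & ~mask) | ((a << pos) & mask);
}

// Byte address of table[index] for a table of 32-bit words: base + index * 4.
uint32_t word_address(uint32_t base, uint32_t index) {
    return base + (index << 2);
}
```

In this model, `bfi32(index, 0x400, 2, 8)` equals `word_address(0x400, index)` for every index from 0 to 255, while `bfi32(0xff, 0x400, 10, 8)` yields 0x3fc00: the field overwrites bit 10 of the base and steps 1024 bytes per index, far outside a 256-entry table. A second possibility, also unverified here: the PTX above takes the base from `cvta.shared`, which produces a generic address, whereas `ld.shared` expects an address in the shared state space.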

djm34
February 17, 2015, 09:46:25 PM
 #1629

Quote from: sp_ on February 17, 2015, 09:33:38 PM
    Any coders who can help me?

    I am trying to reduce the number of instructions per shared memory access in cuda_x11_aes.cu from 4 to 2.

    I am trying to force the pointer to be correct with this code:

    __device__ __forceinline__
    uint32_t  SPRead(uint32_t *const __restrict__ x, uint32_t low)
    {
       uint32_t tmp,res;
       asm("bfi.b32 %0, %1, %2, 10 , 8;" : "=r"(tmp) : "r"(low), "r"(x));
       asm("ld.shared.u32  %0,[%1];" : "=r"(res) : "r"(tmp));
       return res;
    }

tmp in the second instruction should be a pointer.

djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
sp_ (OP)
February 18, 2015, 06:49:08 AM
 #1630

Didn't work.

The strange thing is that this doesn't work either:

__device__ __forceinline__
uint32_t  SPRead(uint32_t *const __restrict__ x, uint32_t low)
{
   uint32_t *tmp;
   tmp=x;     
   return tmp[low];
}
...
tmp=SPRead(sharedmemory,x0 &0xff);


but this works:

tmp= sharedmemory[x0 &0xff];


Namsbreh
February 18, 2015, 10:39:55 AM
 #1631

How is the next version of the spreadminer coming along?
sp_ (OP)
February 18, 2015, 11:25:37 AM
 #1632

Quote from: Namsbreh on February 18, 2015, 10:39:55 AM
    How is the next version of the spreadminer coming along?

I'm rewriting the SHA part. Perhaps I'll be done by the weekend.

scryptr
February 19, 2015, 07:42:44 PM
Last edit: February 19, 2015, 07:57:52 PM by scryptr
 #1633

QUBIT --

I've been mining qubit on Yaamp with no accepts.  No errors, maybe one or two accepts, but a long chain of block change messages, one after another.  My 960 is getting about 7880kh/s, but no productive work.   I am using version 39, on Windows 7.    

sp-ccminer v39: [image]

tpruvot ccminer v1.5.3: [image]

--scryptr

TIPS:  BTC - 1Fs4uZ6a9ABYBTaHGUfqcwCQmeBRxkKRQT    DASH - XrK81tW31SLsVvZ2WX9VhTjpT6GXJPLdbQ
          SCRYPTR'S NOTEBOOK: https://bitcointalk.org/index.php?topic=5035515.msg46035530#msg46035530
          GITHUB: "github.com/scryptr"  MERIT is appreciated, also.  Thanks!
sp_ (OP)
February 19, 2015, 09:04:13 PM
 #1634

Must be a linux issue, or perhaps an issue with the 960 card.

51% of the qubit hashrate is now with my latest version of the mod @yaamp

ccminer/1.5.39-git(SP-MOD)   14   51%   161.1 mh/s   1%

I will be testing on the 960 this weekend.

The latest version on GitHub has a somewhat high default intensity. I sometimes get out-of-memory errors on my test rig, but the hashrate is 30-50 kH/s higher. The GPU has plenty of memory left, but I think this is an issue with memory on the mainboard. I try to allocate a total of 1.2 GB on the graphics card and it fails; 700 MB is OK. (2 GB 750ti's)

sp_ (OP)
February 19, 2015, 09:10:44 PM
 #1635

More free hash is coming in x13/x14/x15 (fugue).
I can precalculate most of the first round of the FUGUE512_3 function, since it starts with a lot of constant inputs. It seems to give another 20-30 kH/s (750ti), but more work is needed before I know for sure.
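
The general idea of precalculating the constant part can be sketched as follows. This is a toy model under stated assumptions: toy_round is a made-up mixing step and not Fugue, all function names are hypothetical, and the post does not say which FUGUE512_3 inputs are constant.

```cpp
#include <cstdint>

// Toy mixing step standing in for one round of a real hash. NOT Fugue.
inline uint32_t toy_round(uint32_t state, uint32_t word) {
    state ^= word;
    state = (state << 7) | (state >> 25);   // rotate left by 7
    return state * 0x9E3779B1u;
}

// Straightforward version: every nonce replays the rounds over the constants.
uint32_t hash_naive(const uint32_t* constant_words, int n, uint32_t nonce) {
    uint32_t s = 0x12345678u;
    for (int i = 0; i < n; ++i) s = toy_round(s, constant_words[i]);
    return toy_round(s, nonce);
}

// Precomputed version: the constant prefix is absorbed once (for example on
// the host, before the kernel launch) ...
uint32_t absorb_constants(const uint32_t* constant_words, int n) {
    uint32_t s = 0x12345678u;
    for (int i = 0; i < n; ++i) s = toy_round(s, constant_words[i]);
    return s;
}

// ... and each nonce only pays for the final, nonce-dependent round.
uint32_t hash_precomputed(uint32_t midstate, uint32_t nonce) {
    return toy_round(midstate, nonce);
}
```

Both paths give the same result for every nonce. The saving per hash is the number of rounds that depend only on constants, which is why the gain has to be measured on the real kernel.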

scryptr
February 19, 2015, 09:14:02 PM
 #1636

Quote from: sp_ on February 19, 2015, 09:04:13 PM
    Must be a Linux issue, or perhaps an issue with the 960 card.

    51% of the qubit hashrate is now with my latest version of the mod @yaamp

    ccminer/1.5.39-git(SP-MOD)   14   51%   161.1 mh/s   1%

    I will be testing on the 960 this weekend.

    The latest version on GitHub has a somewhat high default intensity. I sometimes get out-of-memory errors on my test rig, but the hashrate is 30-50 kH/s higher. The GPU has plenty of memory left, but I think this is an issue with memory on the mainboard. I try to allocate a total of 1.2 GB on the graphics card and it fails; 700 MB is OK. (2 GB 750ti's)

Thanks for the quick reply! This card is running on Win7 x64. I also posted on tpruvot's thread. --scryptr

tbearhere
February 19, 2015, 09:25:31 PM
 #1637

Quote:
    Interesting to see bitcoin (sha256) support added. No chance of profit but fun to have most of the algos supported in one miner. Of note, this doesn't seem to be a very optimal implementation. With cgminer (opencl) I was getting about 405,000 kh/s on my 2 x 750 Ti setup. With release 34 ccminer I'm down to about 345,000 kh/s. Still, it sits at the very bottom of the profitability list with either hash rate so I'm not too concerned. It burns many times more in power than it earns.

    Klaus_t's implementation was pretty good, but the launch configuration of the kernel was way off. I submitted a 17.5% speed increase to GitHub. (750ti)

    Hah! 17.5% increase puts it right at the 405 mh/s I had with the OpenCL miner.

    If all the ASIC miners out there suddenly brick themselves we'll have a nice GPU Bitcoin miner here.

@StuffOfInterest or anyone: what does your bat file look like to GPU-mine bitcoin to your wallet? I can't get mine working. Did you add a conf file? If so, what addnode and port do you use? I have to try it just to say I mined BTC directly. Hey... I may hit a block.
flipclip
February 20, 2015, 05:56:30 PM
 #1638

Quote from: sp_ on February 19, 2015, 09:04:13 PM
    Must be a Linux issue, or perhaps an issue with the 960 card.

    51% of the qubit hashrate is now with my latest version of the mod @yaamp

    ccminer/1.5.39-git(SP-MOD)   14   51%   161.1 mh/s   1%

    I will be testing on the 960 this weekend.

    The latest version on GitHub has a somewhat high default intensity. I sometimes get out-of-memory errors on my test rig, but the hashrate is 30-50 kH/s higher. The GPU has plenty of memory left, but I think this is an issue with memory on the mainboard. I try to allocate a total of 1.2 GB on the graphics card and it fails; 700 MB is OK. (2 GB 750ti's)

Probably a 960 issue... works fine on Linux with (2) 750 Ti.
rednoW
February 20, 2015, 06:43:00 PM
Last edit: February 20, 2015, 07:28:47 PM by rednoW
 #1639

Commit "Bether default throughput qubit(+30khash 750ti)" gives the error

"Cuda error in func 'x11_simd512_cpu_init' at line 634 : out of memory."

on my poor GTX 750 (non-Ti) with only 1 GB of memory.

Will try to lower "int intensity = 256 * 256 * 14;"

I tried 256 * 256 * 12: it crashes the driver when a monitor is attached, and works, but slowly, without a monitor.
256 * 256 * 10 gives me a 2-4 kH/s benefit compared to the default 1U << 19 (256 * 256 * 8) from the previous version.

So I tried the -i command line parameter and found that for my card -i 19.3 is the maximum for qubit (681472 CUDA threads).
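
The thread counts above can be checked with a small sketch. The mapping from a fractional -i value to a thread count shown here is an assumption: it is a reconstruction that reproduces the 681472 figure quoted for -i 19.3, and the function name threads_for_intensity is hypothetical. The miner's actual option parsing may differ.

```cpp
#include <cmath>
#include <cstdint>

// Assumed mapping from a fractional -i value to a CUDA thread count:
// 2^whole for the integer part, plus a multiple of 256 threads for the
// fractional part. Reconstructed to match the numbers in the post.
uint32_t threads_for_intensity(double intensity) {
    if (intensity < 0.0 || intensity >= 32.0) return 0;   // out of range
    uint32_t whole = static_cast<uint32_t>(intensity);    // e.g. 19
    double frac = intensity - whole;                       // e.g. 0.3
    uint32_t threads = 1u << whole;                        // 2^19 = 524288
    if (whole >= 8 && frac > 0.0) {
        // Fraction of the next doubling, rounded down to whole blocks of 256.
        double blocks = std::floor(frac * (1u << (whole - 8)));
        threads += static_cast<uint32_t>(blocks) * 256u;
    }
    return threads;
}
```

Under this assumption, -i 19 gives 524288 threads (the old default, 256 * 256 * 8), -i 19.3 gives 681472, and the new default of 256 * 256 * 14 = 917504 threads corresponds to -i 19.75, which is 1.75 times the old thread count.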
scryptr
February 22, 2015, 04:32:59 AM
Last edit: February 22, 2015, 06:09:41 PM by scryptr
 #1640

Quote from: flipclip on February 20, 2015, 05:56:30 PM
    Quote from: sp_ on February 19, 2015, 09:04:13 PM
        Must be a Linux issue, or perhaps an issue with the 960 card.

        51% of the qubit hashrate is now with my latest version of the mod @yaamp

        ccminer/1.5.39-git(SP-MOD)   14   51%   161.1 mh/s   1%

        I will be testing on the 960 this weekend.

        The latest version on GitHub has a somewhat high default intensity. I sometimes get out-of-memory errors on my test rig, but the hashrate is 30-50 kH/s higher. The GPU has plenty of memory left, but I think this is an issue with memory on the mainboard. I try to allocate a total of 1.2 GB on the graphics card and it fails; 700 MB is OK. (2 GB 750ti's)

    Probably a 960 issue... works fine on Linux with (2) 750 Ti.

QUBIT ISSUE:

I have two 6x750ti rigs: one runs Linux, the other Win 8 (now). I also have a GTX 960 on Win 7. Qubit has not been behaving well recently. Both my 750ti cards and my GTX 960 card have periods where no shares are accepted or submitted, although hashing goes on. No rejects, just hashing speed and new block notifications. Essentially, they burn electricity and do no useful work.

Also, I get strange "doubled-up" behavior, where every other share is rejected. This happens with tpruvot's ccminer v1.5.3. Apparently, it happens at about the same time as the trouble with SP_'s ccminer.

Later, they may run qubit properly. I don't know why. --scryptr

EDIT:  I suppose the reason I am posting about qubit is that qubit is a top-ranking earner just now.  If I let my Windows rigs mine on the round-robin batch file, they get stuck on qubit and actually earn nothing.       --scryptr
