Bitcoin Forum
Topic: CCminer(SP-MOD) Modded GPU kernels. (Read 2347845 times)
sp_ (OP)
January 15, 2015, 07:47:07 PM
Last edit: January 16, 2015, 05:34:37 AM by sp_
 #1101

Checked in groestl speedup.

Faster groestl part 1. quark+150KHASH, x11 +60KHASH (750ti)

I managed to shrink the method "to_bitslice_quad" from around 800 asm instructions to around 80, and "from_bitslice_quad" from around 400 instructions to around 200. With some more work, I will shrink the latter to 80 as well.

Instead of calculating one bit at a time, I use the whole register in the cpu. Similar to a chunky2planar conversion.
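
To illustrate the general idea (this is not the ccminer kernel, just a minimal sketch of a chunky-to-planar style bit transpose), the Python below treats a 64-bit integer as an 8x8 bit matrix and transposes it two ways: one bit at a time, and with three masked shift-and-swap steps that operate on the whole word. The function names and the 8x8 layout are assumptions made for this example only.

```python
# Illustrative sketch only: NOT the to_bitslice_quad code from ccminer.
# A 64-bit integer is treated as an 8x8 bit matrix, bit index = 8*row + col.

def transpose8x8_bitwise(x):
    """Reference version: move one bit at a time (64 extract/insert steps)."""
    out = 0
    for row in range(8):
        for col in range(8):
            bit = (x >> (8 * row + col)) & 1
            out |= bit << (8 * col + row)
    return out


def transpose8x8_wordwise(x):
    """Same result with three masked swaps applied to the whole word."""
    # Swap the off-diagonal 4x4 blocks, then 2x2 blocks, then single bits.
    for mask, shift in ((0x0F0F0F0F00000000, 28),
                        (0x3333000033330000, 14),
                        (0x5500550055005500, 7)):
        t = mask & (x ^ (x << shift))   # bits that differ between swap partners
        x ^= t ^ (t >> shift)           # flip both partners, which swaps them
    return x


# Row 0 fully set becomes column 0 fully set.
print(hex(transpose8x8_wordwise(0xFF)))  # 0x101010101010101
```

The word-wise version does a fixed handful of shifts, ANDs and XORs regardless of the data, which is the kind of saving you get from working on whole registers instead of single bits.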



Team Black Miner (ETHB3 ETH ETC VTC KAWPOW FIROPOW EVRPROGPOW MEOWPOW + dual mining + tripple mining.. https://github.com/sp-hash/TeamBlackMiner
sp_ (OP)
January 15, 2015, 08:28:33 PM
 #1102

Added my first binary to github

1.5.30(sp-MOD) is available here: (15-jan-2015)

https://github.com/sp-hash/ccminer/releases/tag/1.5.30

The sourcecode is available here:

https://github.com/sp-hash/ccminer

sp_ (OP)
January 15, 2015, 08:55:30 PM
 #1103

I will be taking a break now to work on the private spreadcoin miner. If you want more hash, please donate some BTC :) Thanks.

tbearhere
January 15, 2015, 09:54:49 PM
 #1104

Checked in groestl speedup.

Faster groestl part 1. quark+250KHASH, x11 +50KHASH (750ti)

I managed to shrink the method "to_bitslice_quad" from around 800 asm instructions to around 80.

and "from_bitslice_quad" from around 400 instructions to around 200 instructions. With some more work, I will shrink this to 80 as well.

Instead of calculating one bit at a time I use the whole register in the cpu. Similar to a chunky2planar convertion.



Good one sp. Getting about 100+ to 150+ on quark. Don't have time yet to check other algos. :)
flipclip
January 16, 2015, 01:04:17 AM
Last edit: January 16, 2015, 03:45:55 AM by flipclip
 #1105

Quark:
v28= ~11,160 kh/s
v30= ~11,441 kh/s

x11:
v28= ~5,730 kh/s
v30= ~5,850 kh/s

lyra2:
v28= ~1,350 kh/s
v30= ~1,350 kh/s

(2 750Ti's, no overclock)
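
For anyone who wants the relative gain, here is a small sketch that turns the figures above into percentages. Only the hash rates come from the post; the helper name `percent_gain` is made up for the example.

```python
# Hash rates in kH/s as posted above (2x 750 Ti, no overclock): (v28, v30).
rates = {
    "quark": (11160, 11441),
    "x11": (5730, 5850),
    "lyra2": (1350, 1350),
}


def percent_gain(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for algo, (v28, v30) in rates.items():
    print(f"{algo}: {percent_gain(v28, v30):+.2f}%")
```

That works out to roughly +2.5% on quark and +2.1% on x11, with lyra2 unchanged. Split across the two cards, the absolute gains are about 140 kH/s (quark) and 60 kH/s (x11) per 750 Ti.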
chrysophylax
January 16, 2015, 02:03:39 AM
 #1106

I will be taking a break now and work on the private spreadcoin miner. If you want more hash, please donate some BTC Smiley thanks.

tanx sp ...

im pulling the latest from git - compiling - and then as of tomorrow ( adelaide australia time ) ill mine your address as donation for 48 hours with the upgraded miners on ccminer ...

any algo you want to be mining with? on yaamp? ...

btw - how can i be included in the private project for spreadcoin - even if its just for testing? or at all? ...

i would really like to see how it runs so far ...

tanx ...

#crysx

flipclip
January 16, 2015, 02:26:34 AM
 #1107

... ill mine your address as donation for 48 hours with the upgraded miners on ccminer ...

any algo you want to be mining with? on yaamp? ...

#crysx

I'd hazard a guess, the most profitable ;)
tbearhere
January 16, 2015, 02:38:29 AM
Last edit: January 16, 2015, 10:31:19 AM by tbearhere
 #1108

Looks like pool rejects are higher on quark, making #30 less efficient than #29.

Edit: wrong, it's good :)
jpouza
January 16, 2015, 03:40:10 AM
 #1109

I will be taking a break now and work on the private spreadcoin miner. If you want more hash, please donate some BTC Smiley thanks.

Nice, I've sent you a PM.

Cheers
flipclip
January 16, 2015, 03:40:48 AM
 #1110

Looks like pool rejects are higher making #30 less efficient then #29. quark

how many rejects and which pool?  I was on yaamp for 1.5 hours, with one reject ("reject reason: Job not found" right after a block change), which seemed fine to me.
sp_ (OP)
January 16, 2015, 05:55:46 AM
Last edit: January 16, 2015, 06:18:50 AM by sp_
 #1111

Quark:
v28= ~11,160 kh/s
v30= ~11,441 kh/s
x11:
v28= ~5,730 kh/s
v30= ~5,850 kh/s
lyra2:
v28= ~1,350 kh/s
v30= ~1,350 kh/s
(2 750Ti's, no overclock)

Try groestl or diamondgroestl. :)

You can see from the commit on github that this is groestl speedup part 1. I have part 2 almost ready for checkin, but it is currently mixing up the bits and produces wrong results. Lyra is not using the bitslice groestl (killer groestl), so my improvements will not have an effect there. But I guess if I swap the implementation, Lyra2 will get a boost as well. Did you try that, DJM34?

About the spreadcoin miner:

I will integrate and optimize TSIV's spreadcoin implementation into the latest fork of ccminer. I will send out beta versions for a fee of 0.1 BTC. The beta will be a Windows executable.
I might send out more than one exe in the testing phase, if I manage to optimize more.
I will publish the source code after one month of beta testing. (When I publish the source code, spreadcoin will spread better, which will help secure the coin.)

The current speed will be announced when the exe file is released, hopefully next weekend, 25-26 January.

The estimated speed increase is 30-40% on the 980 cards.

If you want an early seat, you can start donating to the BTC or DRK address in my signature.

tbearhere
January 16, 2015, 10:32:14 AM
 #1112

Looks like pool rejects are higher making #30 less efficient then #29. quark

how many rejects and which pool?  I was on yaamp for 1.5 hours, with one reject ("reject reason: Job not found" right after a block change), which seemed fine to me.

Wrong, it's good... great :) Sorry.
chrysophylax
January 16, 2015, 11:10:07 AM
 #1113

Quark:
v28= ~11,160 kh/s
v30= ~11,441 kh/s
x11:
v28= ~5,730 kh/s
v30= ~5,850 kh/s
lyra2:
v28= ~1,350 kh/s
v30= ~1,350 kh/s
(2 750Ti's, no overclock)

Try groestl or diamondgroestl.  Smiley

You can see on the commit on github that this is groestl speedup part 1. I have part 2 soon ready for checkin, but it is currently mixing the bits, and produce wrong results. Lyra is not using the bitslice groestl (killer groestl)
so my improvements will not have an effect. But I guess if I swap the implementation, Lyra2 will get a boost as well. Did you try that DJM34?

About the spreadcoinminer:

I will integrate and optimize TSIV's spreadcoin implementation into the latest fork of ccminer. I will send out beta versions for a fee of 0.1 BTC. The beta will be a windows executable.
I might send out more than one exe in the testing phase, if I manage to optimize more.
I will publish the sourcecode after one month of betatesting. (When I publish the sourcecode, the spreadcoin will spread bether, and secure the coin)
 
The current speed will be announced when the exefile is released. Hopefully next weekend 25-26 january.

The estimated speed increase is 30-40% on the 980 cards

If you want a early seat, you can start donating to my BTC  or DRK adress in my signature.The current speed will be announced when the exefile is released. Hopefully next weekend 25-26 january.


any possibility of a linux x64 version for beta as well sp? ...

ill donate ( and i assume most will ) for the cause of improving the spreadcoin miner ... but a windows version is useless to me ...

any way around that - or are you just keeping it for the windows based systems? ...

it just simply means that 30 days of no testing on my end ...

#crysx

sp_ (OP)
January 16, 2015, 12:30:56 PM
 #1114

The new GTX 960 might fail on the default settings because the intensity is set too high for compute 5.2 devices. How much memory are they planning to ship with the new cards?

The fix will be to set the intensity manually with the -i parameter, e.g. -i 19
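
As a rough sketch of why intensity and card memory are linked: assuming the convention described in ccminer's help text, where intensity N means about 2^N GPU threads per kernel call, the work size doubles with every step of -i. The per-thread buffer size below (64 bytes, one 512-bit hash state) is an assumption for illustration, not a measured figure for any particular algo.

```python
def threads_for_intensity(intensity):
    """Threads per kernel call, assuming the 2^N convention."""
    return 1 << intensity


def scratch_bytes(intensity, bytes_per_thread):
    """Device memory needed if every thread keeps bytes_per_thread of state."""
    return threads_for_intensity(intensity) * bytes_per_thread


# Hypothetical per-thread state of 64 bytes (assumption for this example).
for i in (19, 20, 21):
    mib = scratch_bytes(i, 64) / (1024 * 1024)
    print(f"-i {i}: {threads_for_intensity(i)} threads, {mib:.0f} MiB")
```

Under those assumptions -i 19 is 524288 threads and 32 MiB of per-thread state, and each extra step doubles both.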

djm34
January 16, 2015, 01:28:40 PM
 #1115

Quark:
v28= ~11,160 kh/s
v30= ~11,441 kh/s
x11:
v28= ~5,730 kh/s
v30= ~5,850 kh/s
lyra2:
v28= ~1,350 kh/s
v30= ~1,350 kh/s
(2 750Ti's, no overclock)

Try groestl or diamondgroestl.  Smiley

You can see on the commit on github that this is groestl speedup part 1. I have part 2 soon ready for checkin, but it is currently mixing the bits, and produce wrong results. Lyra is not using the bitslice groestl (killer groestl)
so my improvements will not have an effect. But I guess if I swap the implementation, Lyra2 will get a boost as well. Did you try that DJM34?
I tried a bit (I was on a rather tight schedule) but without success. The problem is that the quad implementation was written for groestl512, and using it for groestl256 isn't really straightforward...


djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
flipclip
January 16, 2015, 02:40:29 PM
 #1116

Quark:
v28= ~11,160 kh/s
v30= ~11,441 kh/s
x11:
v28= ~5,730 kh/s
v30= ~5,850 kh/s
lyra2:
v28= ~1,350 kh/s
v30= ~1,350 kh/s
(2 750Ti's, no overclock)

Try groestl or diamondgroestl.  Smiley

You can see on the commit on github that this is groestl speedup part 1. I have part 2 soon ready for checkin, but it is currently mixing the bits, and produce wrong results. Lyra is not using the bitslice groestl (killer groestl)
so my improvements will not have an effect. But I guess if I swap the implementation, Lyra2 will get a boost as well. Did you try that DJM34?


I realized Lyra2 wasn't part of the speedup. It just happened that the algo was profitable for about five minutes on yaamp, and I happened to be in front of my computer at the time. Since I hadn't posted any Lyra2 rates in a while (ever?), I thought I would, just in case someone was interested.
flipclip
January 16, 2015, 03:08:24 PM
 #1117

The new gtx 960 might fail on the default settings because the intensity is set to high for compute 5.2 devices. How much memory are they planning to ship with the new cards?

The fix will be to set the intensity manually with the -i parameter. f.eks -i 19

For people with mixed cards in their rigs (or thinking about mixing in a 960): the -i parameter is not a per-card setting, so they'll either need to run separate ccminer instances or hope the same -i value works across cards :). Just something to think about.
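
A minimal sketch of the separate-instances workaround: one command line per card, each pinned to a device with -d and given its own -i. The pool URL, user name, device numbers and intensity values are placeholders, and the helper `build_commands` is hypothetical. Check the flags against the help output of your own build before using them.

```python
import shlex


def build_commands(algo, pool_url, user, password, intensity_by_device):
    """Return one ccminer command line per GPU, each with its own intensity."""
    commands = []
    for device in sorted(intensity_by_device):
        argv = [
            "ccminer", "-a", algo,
            "-o", pool_url, "-u", user, "-p", password,
            "-d", str(device),                       # pin this instance to one card
            "-i", str(intensity_by_device[device]),  # intensity for that card only
        ]
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


# Placeholder values (assumptions): device 0 is a 750 Ti, device 1 is a 960.
commands = build_commands(
    "quark", "stratum+tcp://pool.example.com:3333", "WALLET_ADDRESS", "x",
    {0: 20, 1: 19},
)
for line in commands:
    print(line)
```

Each printed line can then be started in its own terminal or batch file.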
flipclip
January 16, 2015, 03:20:26 PM
 #1118

Quark:
v28= ~11,160 kH/s
v30= ~11,441 kH/s

x11:
v28= ~5,730 kH/s
v30= ~5,850 kH/s

lyra2:
v28= ~1,350 kH/s
v30= ~1,350 kH/s

Mjollnir:
v30=~21,000 MH/s

(2 750Ti's, no overclock)
djm34
January 16, 2015, 03:21:02 PM
 #1119

Does anyone know how, with Visual Studio, to get the "release" directory with less crap in it (I still need the ptx though...)?

Bombadil
January 16, 2015, 03:28:47 PM
 #1120

anyone knows how, with visual studio, to get the "release" directory with less crap in it (I still need the ptx though...) ?

Shift-delete the crap :P