Bitcoin Forum
Author Topic: [ANN][GRS][DMD][DGB] Pallas optimized groestl opencl kernels  (Read 53494 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
utahjohn
Hero Member
Activity: 616
October 05, 2014, 07:11:56 AM
 #101

Help set up R9 290 Tri-x! Thank you!
Read page 1, then ask questions if you still need help :)  (Sorry for being ill-tempered; I just had to block a credit card due to fraudulent charges being made on it, so it's cash only until the new card arrives.)

DMD: dUTjohnrXHGYkh7jELWrZkGJbMnE6mdsuh (Staking)
BTC: 1HANJQygp3jHuzutceBgMT7wfCgEug6h4L (Donation)
ETH: 0xba90d7c1ab2bb9d5c07d843476153d1722637250 Mine ETH for 0.5% http://donkeypool.com
qaz6767
Full Member
Activity: 138
October 05, 2014, 09:02:58 AM
 #102

I cannot replace the bin file. After restarting the miner, the old bin file shows up again. How do I replace it? Thank you.
pallas
Legendary
Activity: 1330
Black Belt Developer
October 05, 2014, 11:05:43 AM
 #103

if a bin file with the same name already exists, the miner shouldn't replace it.
so it's best to replace the bin file the miner creates with mine, using the same filename.

qaz6767
Full Member
Activity: 138
October 05, 2014, 01:58:08 PM
 #104

Thanks!!!
pallas
Legendary
Activity: 1330
Black Belt Developer
October 07, 2014, 08:02:36 AM
 #105

It looks like the compiler included in the 14.9 drivers produces binaries which run considerably slower than those from older releases.
The same happens for other algorithms as well, for example X11.
I've tweaked the code a bit but I still can't reach full speed, so I will keep on trying or, if need be, wait for a new driver release.
Meanwhile, if you are on 14.9, use the provided bin file instead of the kernel source in .cl format.

utahjohn
Hero Member
Activity: 616
October 09, 2014, 10:01:33 PM
 #106

Any chance of getting a worksize 128 super optimized kernel to try on HD5450? (256 too large)

pallas
Legendary
Activity: 1330
Black Belt Developer
October 10, 2014, 07:40:39 AM
 #107

The changes needed to make it work at 128 are easy, but it probably won't be tuned well for such a card: I've only tested on an R9 290 and a 7950 while developing. It might not even work at all.
If you want to try, I can send you a file or the changes, and if it works well we can post it here.

utahjohn
Hero Member
Activity: 616
October 11, 2014, 01:24:53 AM
 #108

been busy working with miningfield setting up my USA mirror of pools.
yah if you can post a modified .cl for ws 128 I'll test it on the 5450

Made a lot of progress on the USA mirror setup today :) might have it online by Monday :)

pallas
Legendary
Activity: 1330
Black Belt Developer
October 11, 2014, 08:58:16 AM
 #109

just change the initial part of the main function to:

for (u = get_local_id(0); u < 256; u += get_local_size(0)) {
  /* each work-item fills entries u, u + worksize, ... of the rotated tables */
  T2[u] = ROTL64(T0[u], 16UL);
  T3[u] = ROTL64(T0[u], 24UL);
  T4[u] = ROTL64(T0[u], 32UL);
  T5[u] = ROTL64(T0[u], 40UL);
  T6[u] = ROTL64(T0[u], 48UL);
  T7[u] = ROTL64(T0[u], 56UL);
}

this part was blocking worksize < 256.
as I said previously, it still might not work, or it might be very slow, for tuning reasons.
let me know if it works.
thanks!
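
To see why a loop that starts at get_local_id(0) and steps by get_local_size(0) no longer needs exactly 256 work-items, here is a minimal Python sketch that emulates one work-group on the CPU. It is only an illustration, not the kernel itself: indexing the tables by u, the helper names (rotl64, work_item, covers_table_once) and the table contents are all assumptions made for the example.

```python
MASK64 = (1 << 64) - 1   # keep values inside 64 bits, like an OpenCL ulong
TABLE_SIZE = 256         # number of entries per lookup table


def rotl64(x, n):
    """Rotate a 64-bit word left by n bits (0 < n < 64)."""
    return ((x << n) | (x >> (64 - n))) & MASK64


def work_item(local_id, local_size, t0, t2, writes):
    """Emulate one work-item running the strided loop.

    local_id and local_size stand in for get_local_id(0) and
    get_local_size(0); indexing the tables by u is an assumption.
    """
    for u in range(local_id, TABLE_SIZE, local_size):
        t2[u] = rotl64(t0[u], 16)
        writes[u] += 1


def covers_table_once(local_size):
    """Run all work-items of one group in turn (a GPU runs them in
    parallel) and check that every entry was written exactly once."""
    # Made-up table contents; the real T0 constants are not reproduced here.
    t0 = [(0x0123456789ABCDEF * (i + 1)) & MASK64 for i in range(TABLE_SIZE)]
    t2 = [0] * TABLE_SIZE
    writes = [0] * TABLE_SIZE
    for local_id in range(local_size):
        work_item(local_id, local_size, t0, t2, writes)
    values_ok = all(t2[i] == rotl64(t0[i], 16) for i in range(TABLE_SIZE))
    return values_ok and all(w == 1 for w in writes)


if __name__ == "__main__":
    for size in (64, 128, 256):
        print(size, covers_table_once(size))
```

With a worksize of 128 each emulated work-item handles two entries, and with 256 it handles one; in both cases the whole table is filled. This says nothing about speed on a given card, which is the tuning question raised in the post.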

utahjohn
Hero Member
Activity: 616
October 11, 2014, 09:05:31 AM
 #110

thanks, will do as soon as I get time :)  The VM server box has a 5450 in it, so I might as well let the host make use of it. I can get ~0.25 MH/s with the normal groestlcoin kernel with ws 128 on it ... it's running 24/7/365 anyway :)

It only has 80 shaders LOL, it's a dwarf but it is air cooled hehe
about on par with the Intel HD GPU (10 shaders) in a G3220 CPU as far as hashrate goes


qaz6767
Full Member
Activity: 138
October 20, 2014, 12:26:52 PM
 #111

Help! What should the bat file to start Diamond mining look like? I cannot get it running on a 280X card. Thanks.
pallas
Legendary
Activity: 1330
Black Belt Developer
October 20, 2014, 12:29:45 PM
 #112

@utahjohn: just curious... did you manage to make the worksize 128 change work on the 5450? if yes, what hashrate?

EDIT: it has about half the shaders of a nexus 9 :-D

Wolf0
Legendary
Activity: 1610
Miner Developer
November 04, 2014, 12:43:07 PM
 #113

Quote from: pallas
A BIT OF HISTORY

The first GPU miner for groestlcoin and similar coins was sph-sgminer by phm. Optimizing the original implementation was trivial (almost 3x the speed could be achieved!), so there are probably tens of optimized versions around, many of which have been kept private: mining groestlcoin and similar coins was always unfair for most people, at least for non-devs.
Hopefully this kernel will end this, and it should also level the field between AMD and NVIDIA.
I believe my version is faster than many of the other kernels because of the time I dedicated to it and the thousands of tests I did.

FINAL ADVICE

I suggest keeping "good binaries": make a backup of the fastest .bin files you have, so you can recover them in case of driver problems.
This will also let you get 1 or 2 percent more hashrate because of compiler variance (try removing the bin and running 3 or 4 times to see the variance in action).
Or use the provided bin file (see the OP), which should be a good one.

I've experienced lower power usage with Catalyst 14.9 compared to 14.6 beta (a bit less compared to 13 but still better). Speaking of optimization, this should be kept in mind: buy a power meter for your miner(s)!
But it looks like the compiler included in the 14.9 drivers produces binaries which run considerably slower than those from older releases. If you are on 14.9, use the provided bin file instead of the kernel source in .cl format.

If you want to fix it for 14.9, remove the naive implementation of the B64_# macros and use swizzle. Worked for me.
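
The kernel's B64_# macros are not shown in this thread, so the following Python sketch only illustrates the general idea behind the suggestion. It assumes a B64_n-style macro extracts byte n of a 64-bit word with a shift and a mask, and it compares that with reading the same byte from a byte view of the word, which is roughly what a vector reinterpretation plus component access such as as_uchar8(x).s0 gives in OpenCL C on a little-endian device. The function names b64_shift and b64_bytes are hypothetical.

```python
def b64_shift(x, n):
    """Byte n (0 = least significant) of a 64-bit word via shift and mask.

    This is the assumed 'naive' form of a B64_n-style macro.
    """
    return (x >> (8 * n)) & 0xFF


def b64_bytes(x, n):
    """Byte n read from a little-endian byte view of the word.

    Stands in for a reinterpret-and-select such as as_uchar8(x).sN;
    x must fit in 64 bits (0 <= x < 2**64).
    """
    return x.to_bytes(8, "little")[n]


if __name__ == "__main__":
    word = 0x0123456789ABCDEF
    for n in range(8):
        print(n, hex(b64_shift(word, n)), hex(b64_bytes(word, n)))
```

Both functions return the same value for every byte position, so the choice between them changes only what the compiler is given to work with, not the result. Whether one form is faster depends on the driver's compiler, as the following posts show.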

Code:
Donations: BTC: 1WoLFdwcfNEg64fTYsX1P25KUzzSjtEZC -- XMR: 45SLUTzk7UXYHmzJ7bFN6FPfzTusdUVAZjPRgmEDw7G3SeimWM2kCdnDQXwDBYGUWaBtZNgjYtEYA22aMQT4t8KfU3vHLHG
pallas
Legendary
Activity: 1330
Black Belt Developer
November 04, 2014, 12:50:23 PM
 #114

Thanks, but I've already tried every combination of bitwise operations and vectors (as_uchar...): I could make it work, but the hashrate is about 20 MH/s vs 25 MH/s with 14.6 beta.

Wolf0
Legendary
Activity: 1610
Miner Developer
November 04, 2014, 01:09:20 PM
 #115

Ah, I see - I just saw it go from 7MH/s to... I think 20, on 14.9, so I figured it worked; never mind, then.

pallas
Legendary
Activity: 1330
Black Belt Developer
November 04, 2014, 01:27:57 PM
 #116

It's funny how some little changes lead to huge hashrate drops (depending on compiler version); but it's true for memory intensive algos only, as far as I can see.
Maybe your own version doesn't have this problem, then ;-)

Wolf0
Legendary
Activity: 1610
Miner Developer
November 04, 2014, 01:43:39 PM
 #117

Not true for only memory intensive algos - one little screwup and the idiot compiler will double the size of your code, it won't fit in the code cache, and be slow lol

cryptonit
Legendary
Activity: 1288
CVO Diamond Foundation (Visionary)
November 16, 2014, 07:31:32 AM
 #118


cryptonit
Legendary
Activity: 1288
CVO Diamond Foundation (Visionary)
November 23, 2014, 12:23:11 PM
 #119

@pallas could you find the current state-of-the-art mining software for DMD Groestl and post the links in the DMD ANN thread? we will then update the software on the website

it would be great if it already included your performance boost tricks.....

I think no one from our core team runs AMD cards any longer, so your help would be welcome

pallas
Legendary
Activity: 1330
Black Belt Developer
November 23, 2014, 02:39:19 PM
 #120

the problem with my kernel is that, no matter how hard I try, I can't get the best hashrate on the 14.9 drivers (only 20 MH/s vs 25 with 14.6), so it's not enough to just replace diamond.cl in sgminer 4.1 or 5.
that's why I still prefer that people visit this post, with all the info and troubleshooting, for best performance.
the only clean way is to create a fork of sgminer, for tahiti and hawaii cards only, with the precompiled binary; some changes are needed so that it always uses the binary and never compiles the cl sources.
not sure I like it but it might work for many... what do you think?
