Bitcoin Forum
June 26, 2017, 05:38:26 AM *
Author Topic: [ANN][GRS][DMD][DGB] Pallas optimized groestl opencl kernels  (Read 53211 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
realhet
Jr. Member
*
Offline Offline

Activity: 32


View Profile WWW
January 13, 2015, 03:57:29 AM
 #221

Well, that neoscrypt is quite complicated. I can't even get it compiled, as I think it needs more defines than just WORKSIZE. Some day I'm gonna check it from a closer view, as it is interesting...

I have Cat 14.12 Omega (whatever it is) now. The asm kernel is unchanged; the original OCL kernel is 15% faster than on Cat 14.9, but it is still way too slow.

I've compared your diamondTahiti compilation with my Capeverde one. The differences are not that complicated:
- In the ELF header, the 'archtype' field is 3FF vs. 3FD
- In the small binary info section (outer ELF), two bytes differ: 9F vs. 9C
- In the small binary info section (inner ELF), one byte differs: 1C vs. 1A
- In the text ARG section, the only difference is the strings: capeverde vs. tahiti

So if I collect all these constants/strings, I can convert from one to the other. But Capeverde and Tahiti are identical chips. It's possible that the Hawaii binary is much more different.
And even though the two binaries (capeverde and tahiti) are almost the same, clBuildProgram() checks the hardware IDs and refuses to load it.
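
For anyone who wants to repeat this kind of comparison on their own bins, here is a rough Python sketch of a byte-level diff. It is not the tool used above, just one way to list where two kernel binaries differ. The function names and the toy byte strings are made up for illustration; the real offsets inside the ELF are not given in this thread, so the layout below is invented and says nothing about which chip carries which value.

```python
from itertools import zip_longest

def diff_bytes(a: bytes, b: bytes):
    """Return (offset, byte_a, byte_b) for every position where the blobs differ.

    If one blob is shorter, its missing bytes are reported as None.
    """
    return [
        (offset, x, y)
        for offset, (x, y) in enumerate(zip_longest(a, b))
        if x != y
    ]

def group_runs(diffs):
    """Collapse consecutive differing offsets into (start, end) ranges, end inclusive."""
    runs = []
    for offset, _, _ in diffs:
        if runs and offset == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], offset)
        else:
            runs.append((offset, offset))
    return runs

# Toy stand-ins for two kernel binaries (hypothetical layout, NOT real ELF offsets).
blob_a = bytes([0x7F, 0x45, 0x4C, 0x46, 0xFD, 0x03, 0x9C, 0x9C, 0x1A])
blob_b = bytes([0x7F, 0x45, 0x4C, 0x46, 0xFF, 0x03, 0x9F, 0x9F, 0x1C])

diffs = diff_bytes(blob_a, blob_b)
runs = group_runs(diffs)
for start, end in runs:
    print(f"bytes {start:#06x}..{end:#06x} differ")
```

With real files you would read both bins with `open(path, "rb").read()` and feed them in. Note that a plain positional diff only works cleanly while the two files have the same length; the device name strings differ in length, so everything after them may show up as shifted.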
utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 13, 2015, 04:12:32 AM
 #222

ROFL u just had to try 14.12 hahahaha ... now back to 14.7RC3 LOL

DMD: dUTjohnrXHGYkh7jELWrZkGJbMnE6mdsuh (Staking)
BTC: 1HANJQygp3jHuzutceBgMT7wfCgEug6h4L (Donation)
ETH: 0xba90d7c1ab2bb9d5c07d843476153d1722637250 Mine ETH for 0.5% http://donkeypool.com
pallas
Legendary
*
Offline Offline

Activity: 1330


Black Belt Developer


View Profile
January 13, 2015, 04:19:05 AM
 #223

* I've updated the main page with benchmark data I've collected: http://realhet.wordpress.com/gcn-asm-groestl-coin-kernel/

30 MH/s is for the R9 290; the 290X does 33.

mitache365
Hero Member
*****
Online Online

Activity: 686


View Profile
January 13, 2015, 08:05:07 AM
 #224

Thanks, I wasn't disabling the Intel in UEFI, which was my problem. It's working now at 26.5 MH/s per card, which is amazing.
I'm using 14.7r3, xI 2048, 1100/150, -w 256, undervolted to 1.00, and getting 23.38 MH/s.  What's your config?

Same here: 23.4.
Which is the right miner?

BTC
utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 13, 2015, 08:49:56 AM
 #225

Thanks, I wasn't disabling the Intel in UEFI, which was my problem. It's working now at 26.5 MH/s per card, which is amazing.
I'm using 14.7r3, xI 2048, 1100/150, -w 256, undervolted to 1.00, and getting 23.38 MH/s.  What's your config?

Same here: 23.4.
Which is the right miner?
You're not gonna get 30-33 MH/s right out of the box on a 290/290X; you will have to tune intensity, GPU clock and mem clock (lowest possible).  Pallas can help with these cards if he's in the right mood.
On the 280X (1180/150) I was able to use my tuning from the previous kernel to get 26.0 MH/s, only because it was already maxed out :) Volt modded, vbios modded, etc.  Info about these techniques is in the thread if you look around ...
As far as the miner goes ... sgminer 4.1.0 (sph) is what I use ...

mitache365
Hero Member
*****
Online Online

Activity: 686


View Profile
January 13, 2015, 10:00:59 AM
 #226

I am at stock core 1070 and mem 1100. this is the difference maybe.

physixz
Newbie
*
Offline Offline

Activity: 13


View Profile
January 13, 2015, 10:03:36 AM
 #227

Can I ask what software you are using to change the values? I'm using MSI Afterburner 4.1, but it won't change the memory clock, and the core clock is always lower than what I set.

These changes are really pushing my cards now: they normally sat at 50°C but are now over 60°C (they are watercooled). I will put up power usage when I next reboot and plug the power meter in.
utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 13, 2015, 10:26:04 AM
 #228

Can I ask what software you are using to change the values? I'm using MSI Afterburner 4.1, but it won't change the memory clock, and the core clock is always lower than what I set.

These changes are really pushing my cards now: they normally sat at 50°C but are now over 60°C (they are watercooled). I will put up power usage when I next reboot and plug the power meter in.
Not sure if you can do this to a 290/290X card, because the vbios is likely to be quite different.  You will have to do some research before you attempt my method using VBE7.0.0.7b.exe; it is a video BIOS editor you can use to change voltages and clocks at board level.  If I remember correctly it was only for Tahiti cards ... do your research, then flash the vbios with atiwinflash.  There may be programs like MSI Afterburner that can do this, but I did my card at low level :) Again, check whether it will work on your card before you do it, or you can "brick" your card haha

pallas
Legendary
*
Offline Offline

Activity: 1330


Black Belt Developer


View Profile
January 13, 2015, 10:38:42 AM
 #229

Wow, I got Best share: 702K
Was it a block? :-D

Now let's get serious: I finally have a little time to write down some considerations on the OCL and asm kernels.
I believe we should pursue the asm path for a number of reasons:

- currently the OCL kernel is a little faster on Hawaii, but not on the other cards, and I don't think it can be improved in this respect
- the OCL kernel has been tweaked and optimized for months, while the asm one is new, so there is probably much more room for improvement
- just by applying the first and last round optimization, the asm kernel will probably be faster on Hawaii as well; I'm sure that Realhet will find other asm tricks to apply
- with all these Catalyst version problems, the best way to share kernels for people to mine with is as bin files, which makes the asm and OCL versions equivalent for distribution purposes; better yet would be a miner with all the bin files bundled (takes time)
- asm is cooler than OCL ;-)

What do you guys think?

utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 13, 2015, 10:41:51 AM
 #230

I am at stock core 1070 and mem 1100. this is the difference maybe.
Not to be condescending, but have you tried this on the sgminer command line?
--gpu-engine 1100 --gpu-memclock 150

physixz
Newbie
*
Offline Offline

Activity: 13


View Profile
January 13, 2015, 10:45:29 AM
 #231

Well, the best I can run at without crashing is 1040 core / 1250 memory, as it won't go lower with a -0.055 core volt drop

1 card: the rig pulls 510 W at 28.5 MH/s, so 0.056 MH/s per watt
2 cards: the rig pulls 740 W at 57 MH/s, so 0.077 MH/s per watt
3 cards: the rig pulls 990 W at 85.5 MH/s, so 0.086 MH/s per watt

which is about 230 - 250 W per card, or 0.114 MH/s per watt excluding the system use
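
The arithmetic behind these figures can be re-checked with a few lines of Python. This is only a sketch using the wall readings quoted above; the function and variable names are made up for illustration, and the per-card figure assumes every card hashes at the same 28.5 MH/s.

```python
# (cards, wall watts, total MH/s) as reported in the post
readings = [(1, 510, 28.5), (2, 740, 57.0), (3, 990, 85.5)]

def rig_efficiency(watts, mhs):
    """MH/s per watt for the whole rig, system overhead included."""
    return mhs / watts

def added_watts(readings):
    """Extra wall power drawn by each additional card."""
    return [w2 - w1 for (_, w1, _), (_, w2, _) in zip(readings, readings[1:])]

per_card_mhs = readings[0][2]   # hash rate of a single card
extra = added_watts(readings)   # [230, 250]

for cards, watts, mhs in readings:
    print(f"{cards} card(s): {rig_efficiency(watts, mhs):.3f} MH/s per watt (whole rig)")
for w in extra:
    print(f"+{w} W for one more card: {per_card_mhs / w:.3f} MH/s per watt (card only)")
```

The 0.114 figure corresponds to the 250 W step; the 230 W step works out slightly better. Taking the difference between readings is what removes the idle draw of the rest of the system.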

If anybody can get a higher hash rate per watt, let me know.

EDIT

Even at that rate, with my electricity costs I can't make a profit...
utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 13, 2015, 10:45:34 AM
 #232

Wow I got Best share: 702K
Was it a block? :-D

Now let's get serious: I finally have a little time to write some considerations on the ocl and asm kernels.
I believe we should pursue the asm path for a number of reasons:

- currently the OCL kernel is a little faster on hawaii but not on all other cards and I don't think it can be improved in this respect
- the OCL kernel has been tweaked and optimized for months, while the asm one is new so there is probably much more room for improvement
- just by applying the first and last round optimization the asm kernel will probably be faster on hawaii as well; I'm sure that Realhet will find other asm tricks to apply
- with all these catalyst version problems, the best way to share kernels for the people to mine is by bin files, making the asm version and ocl equivalent (for distribution purposes); better yet would be a miner with all the bundled bin files (takes time)
- asm is cooler than ocl ;-)

what do you guys think?
I'm all for sticking with the asm route ... you need to feed your OCL tweaks to realhet, and let's maximize the asm kernel.
As I already suggested to realhet, "cross-compiling" to generate bins for every arch we support is possible; he needs our bins created on each arch to dig out the minor diffs between them.

Wolf0
Legendary
*
Offline Offline

Activity: 1610


Miner Developer


View Profile
January 13, 2015, 10:48:19 AM
 #233

Wow I got Best share: 702K
Was it a block? :-D

Now let's get serious: I finally have a little time to write some considerations on the ocl and asm kernels.
I believe we should pursue the asm path for a number of reasons:

- currently the OCL kernel is a little faster on hawaii but not on all other cards and I don't think it can be improved in this respect
- the OCL kernel has been tweaked and optimized for months, while the asm one is new so there is probably much more room for improvement
- just by applying the first and last round optimization the asm kernel will probably be faster on hawaii as well; I'm sure that Realhet will find other asm tricks to apply
- with all these catalyst version problems, the best way to share kernels for the people to mine is by bin files, making the asm version and ocl equivalent (for distribution purposes); better yet would be a miner with all the bundled bin files (takes time)
- asm is cooler than ocl ;-)

what do you guys think?
I'm all for sticking with the asm route ... you need to feed your OCL tweaks to realhet, and let's maximize the asm kernel.
As I already suggested to realhet, "cross-compiling" to generate bins for every arch we support is possible; he needs our bins created on each arch to dig out the minor diffs between them.

ASM route seems better.

Code:
Donations: BTC: 1WoLFdwcfNEg64fTYsX1P25KUzzSjtEZC -- XMR: 45SLUTzk7UXYHmzJ7bFN6FPfzTusdUVAZjPRgmEDw7G3SeimWM2kCdnDQXwDBYGUWaBtZNgjYtEYA22aMQT4t8KfU3vHLHG
mitache365
Hero Member
*****
Online Online

Activity: 686


View Profile
January 13, 2015, 02:59:31 PM
 #234

Can I ask what software you are using to change the values? I'm using MSI Afterburner 4.1, but it won't change the memory clock, and the core clock is always lower than what I set.

These changes are really pushing my cards now: they normally sat at 50°C but are now over 60°C (they are watercooled). I will put up power usage when I next reboot and plug the power meter in.

Also Afterburner, but with the 14.4 driver for me.

mitache365
Hero Member
*****
Online Online

Activity: 686


View Profile
January 13, 2015, 03:00:28 PM
 #235

I am at stock core 1070 and mem 1100. this is the difference maybe.
Not to be condescending, but have you tried this on the sgminer command line?
--gpu-engine 1100 --gpu-memclock 150

I will stay at core 1070 (I don't like to overclock) but will set mem to 150 to see the result.

Edit: hm, again 23.4, but lower temps. That's fine enough, I think.

pallas
Legendary
*
Offline Offline

Activity: 1330


Black Belt Developer


View Profile
January 13, 2015, 03:04:19 PM
 #236

I am at stock core 1070 and mem 1100. this is the difference maybe.
Not to be condescending, but have you tried this on the sgminer command line?
--gpu-engine 1100 --gpu-memclock 150

I will stay at core 1070 (I don't like to overclock) but will set mem to 150 to see the result.

Edit: hm, again 23.4, but lower temps. That's fine enough, I think.

Lower mem clock = less power usage and bigger core overclock potential.

mitache365
Hero Member
*****
Online Online

Activity: 686


View Profile
January 13, 2015, 03:33:17 PM
 #237

Hmm, now I see the miner is still showing the stock memclock of 1550 for the 280X Vapor. Why hasn't it changed? Driver 14.4.
I changed it in the batch file and also later from within the miner.

utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 13, 2015, 06:03:39 PM
 #238

Hmm, now I see the miner is still showing the stock memclock of 1550 for the 280X Vapor. Why hasn't it changed? Driver 14.4.
I changed it in the batch file and also later from within the miner.
Why can't ppl read the thread ... you will have the best performance with driver 14.7RC3 ...

Many 280X cards are locked to a small range of clock adjustment (PowerColor 280X being one of them, which is why I had to low-level vbios mod mine).  Also, many 280X cards will throttle the GPU clock at temps above 72C.

mitache365
Hero Member
*****
Online Online

Activity: 686


View Profile
January 13, 2015, 06:19:24 PM
 #239

Hmm, now I see the miner is still showing the stock memclock of 1550 for the 280X Vapor. Why hasn't it changed? Driver 14.4.
I changed it in the batch file and also later from within the miner.
Why can't ppl read the thread ... you will have the best performance with driver 14.7RC3 ...

Many 280X cards are locked to a small range of clock adjustment (PowerColor 280X being one of them, which is why I had to low-level vbios mod mine).  Also, many 280X cards will throttle the GPU clock at temps above 72C.

Thread read. Tried 14.4, 14.6, 14.7 and 14.9.
I can't set the cards to a lower memclock than the stock 1550 :( Maybe they really are locked!

utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 13, 2015, 06:22:54 PM
 #240

Trust me, 14.7RC3 is best.
Then you are unlucky enough to have a "locked" card.
Your only recourse is vbios modding the card if you want to lower the memclock to 150 and be able to run a higher overclock on the GPU.
Do the research on vbios modding ... there are pointers from me in this thread. I hate repeating myself a hundred times; that's why the info is in the thread.
https://bitcointalk.org/index.php?topic=779598.msg9043545#msg9043545

BTW I can still clock mem at 1625 via the sgminer setting when I mine X11 or Neoscrypt with it ... I just have to set it manually for them ...

Welcome to Extreme Diamond Mining LOL
