Bitcoin Forum
Author Topic: [ANN][GRS][DMD][DGB] Pallas optimized groestl opencl kernels  (Read 61214 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
January 15, 2015, 09:26:14 AM
 #241

This one is going to take a lot of learning for me (I'm 53yo LOL, learning takes more time for me hehehe) ... I'm going to copy the asm src to a flash drive and print it at the apt complex office so it's a bit easier for me to follow through.  Can u send me your latest greatest fastest OCL and I'll get that printed too ...

Funny u can only get 1 card to run, I have a 280x as card 0 and a 7950 as card 1, and they both run the kernel fine ...
Just an afterthought: I have not tried a single instance of sgminer controlling both cards. I run an instance of sgminer for each card individually, so I do not know if this problem affects me ...
utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
January 15, 2015, 10:28:31 AM
 #242

@realhet
can u make the right hand pane detachable in hetpas?  I like to use multiple monitors and have a full screen for IDE ...

Also would really appreciate if you could do the finishing touches on first/last pass as neither of us are up to speed on asm yet and could be quite a while ...
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
January 15, 2015, 12:25:43 PM
 #243

Oh, that's a pity you moved away...
Not sure I like the idea of learning another asm, even if it's very cool!
I understand the first and last round optimizations are boring to do, but could you please, before leaving us, fix the problem with multiple cards? Where card 0 doesn't provide any work unit while card 1 works fine? Thanks!

I built my bin as posted using Hetpass.
Two machines.
Two 7950s reference cards in one.
And a Dualx  7950 in the other.
All Sapphires. I use Sgminer 4.1 the original if you will.
Not sgminer 5.1. Too many bells and whistles.
Hetpass said these cards should do so many hashes, and it is correct.
I run two cards in one machine without any problems.
.......

On my machine, card 0 hashed fine but no work submitted (WU=0).
Card 1 had normal WU.
I'm using 4.1 as well.
Never had this problem with any kernel before.

Toninho
Hero Member
*****
Offline Offline

Activity: 597
Merit: 500


View Profile
January 17, 2015, 06:32:28 AM
 #244

Hi, I need a miner for an Nvidia 560 Ti DS. It works at about 3600 kh/s, but in Groestl with --algo=dmd-gr it is 100% Bommm ... Bommm ... reject. I do not understand??
M1ST3R
Member
**
Offline Offline

Activity: 89
Merit: 10


View Profile
January 17, 2015, 10:56:11 PM
 #245

Hi, I need a miner for an Nvidia 560 Ti DS. It works at about 3600 kh/s, but in Groestl with --algo=dmd-gr it is 100% Bommm ... Bommm ... reject. I do not understand??

Hello, anyone home??? This is an OpenCL kernel, which is for AMD GPUs, not Nvidia.
qaz6767
Full Member
***
Offline Offline

Activity: 151
Merit: 100


View Profile
January 20, 2015, 10:27:50 AM
 #246

Good day! Could you tell me, is the new .cl file in the first post? The one that gives 30 Mh on a 290 card? Thanks.
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
February 19, 2015, 03:01:59 PM
 #247

Is there still interest in this? Utahjohn and.... :-D
I could dedicate some time to finishing the open-source kernel v2 if it's worth it.

berbip
Member
**
Offline Offline

Activity: 143
Merit: 10


View Profile
February 19, 2015, 04:58:51 PM
 #248

Is there still interest in this? Utahjohn and.... :-D
I could dedicate some time to finishing the open-source kernel v2 if it's worth it.

Oh, I'm totally interested :)
Thanks for your great work btw
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
February 25, 2015, 01:02:39 PM
Last edit: February 25, 2015, 04:03:54 PM by pallas
 #249

experimental new bin for Hawaii (r9 290/290X) only:

https://dl.dropboxusercontent.com/u/40353042/Diamond/diamondHawaiiw128l8.bin

use worksize 128.

this is my opencl kernel, tweaked for speed and compatibility.
please report hashrates and show your support!

Star65
Member
**
Offline Offline

Activity: 109
Merit: 13


View Profile
February 25, 2015, 02:27:00 PM
 #250

Thx pallas! But im out of the game cos i have 280s only. Utahjohn too (i think so).
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
February 25, 2015, 02:43:16 PM
 #251

Thx pallas! But im out of the game cos i have 280s only. Utahjohn too (i think so).

The new kernel should make no difference on Tahiti cards, but we may run some tests later anyway; the reason is compatibility with newer drivers.

utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
February 25, 2015, 07:26:18 PM
 #252

Thx pallas! But im out of the game cos i have 280s only. Utahjohn too (i think so).
Indeed 280x only here, I am in talks with Pallas to work on this further :)
M1ST3R
Member
**
Offline Offline

Activity: 89
Merit: 10


View Profile
February 27, 2015, 03:44:29 AM
 #253

experimental new bin for Hawaii (r9 290/290X) only:

https://dl.dropboxusercontent.com/u/40353042/Diamond/diamondHawaiiw128l8.bin

use worksize 128.

this is my opencl kernel, tweaked for speed and compatibility.
please report hashrates and show your support!

Hi Pallas,

The bin file is not working on either sgminer 4.1.0 from the Diamond website or sgminer 5 from Wolf0.
After the "...kernel is experimental..." message is displayed, both sgminer versions either hang or show a black screen.
Maybe sgminer needs the specific v2 diamond.cl file to function properly.

BTW, unlike v1, changing the name of your .bin file to match the one sgminer generated does not work either.
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
February 27, 2015, 09:30:03 AM
 #254

experimental new bin for Hawaii (r9 290/290X) only:

https://dl.dropboxusercontent.com/u/40353042/Diamond/diamondHawaiiw128l8.bin

use worksize 128.

this is my opencl kernel, tweaked for speed and compatibility.
please report hashrates and show your support!

Hi Pallas,

The bin file is not working on either sgminer 4.1.0 from the Diamond website or sgminer 5 from Wolf0.
After the "...kernel is experimental..." message is displayed, both sgminer versions either hang or show a black screen.
Maybe sgminer needs the specific v2 diamond.cl file to function properly.

BTW, unlike v1, changing the name of your .bin file to match the one sgminer generated does not work either.

The binary can work without the sources.
Check that you are running a 64-bit miner (the official Diamond miner is 32-bit), that you are using worksize 128, and that you are setting the correct bin file name.
And of course that you have a Hawaii card! :-)
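
For the 64-bit check, here is a minimal sketch in Python, assuming the miner is a Windows executable (the thread does not say which OS is in use). It reads the standard PE header fields and reports the machine type. The function name `pe_bitness` is hypothetical and has nothing to do with sgminer itself.

```python
import struct

# Machine field values defined by the PE/COFF format.
MACHINE_NAMES = {0x014C: "32-bit (x86)", 0x8664: "64-bit (x86-64)"}

def pe_bitness(data: bytes) -> str:
    """Report the target machine of a Windows PE image given its raw bytes."""
    # A PE file starts with the DOS stub, marked by the "MZ" magic.
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE executable")
    # Offset 0x3C holds the little-endian file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6 or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 2-byte Machine field follows the signature directly.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown machine 0x{machine:04x}")
```

To use it, read the miner executable in binary mode and pass the bytes in, for example `pe_bitness(open(path, "rb").read())`, where `path` is wherever your miner lives.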

utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
February 27, 2015, 09:33:34 AM
 #255

experimental new bin for Hawaii (r9 290/290X) only:

https://dl.dropboxusercontent.com/u/40353042/Diamond/diamondHawaiiw128l8.bin

use worksize 128.

this is my opencl kernel, tweaked for speed and compatibility.
please report hashrates and show your support!

Hi Pallas,

The bin file is not working on either sgminer 4.1.0 from the Diamond website or sgminer 5 from Wolf0.
After the "...kernel is experimental..." message is displayed, both sgminer versions either hang or show a black screen.
Maybe sgminer needs the specific v2 diamond.cl file to function properly.

BTW, unlike v1, changing the name of your .bin file to match the one sgminer generated does not work either.

The binary can work without the sources.
Check that you are running a 64-bit miner (the official Diamond miner is 32-bit), that you are using worksize 128, and that you are setting the correct bin file name.
And of course that you have a Hawaii card! :-)
Good reason to have the OCL source LOL, I am on a 64-bit OS but my miner is 32-bit ...
In my experience a black screen or a hung miner indicates too much O/C or memclock not set as recommended (the GPU crashes before it can be reported) ...

Without the OCL source I cannot test on Tahiti properly, so I have no definite answer ... u did not specify OS, config etc., so it's a bit hard to troubleshoot ...
sammy007
Legendary
*
Offline Offline

Activity: 1904
Merit: 1003


View Profile
February 27, 2015, 04:35:20 PM
 #256

Very nice results with 290(X), any chance for 280x gain?
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
February 27, 2015, 04:41:36 PM
 #257

Very nice results with 290(X), any chance for 280x gain?

In order to do the same optimizations on the 280(x), the code would need to be almost completely rewritten to work on 32-bit numbers instead of 64-bit (because of LDS usage), in the hope that the VGPR count (which is mostly under the compiler's control and very difficult to reduce by modifying the OpenCL code) ends up low enough to permit 2 wavefronts.
Or maybe a better compiler in the future could do it by itself.
As of now, I think the best option is to use the asm version for the 280(x) and my latest binary for the 290(x).
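
To put a number on a VGPR count "low enough to permit 2 wavefronts", here is a rough Python sketch. It assumes that "2 wavefronts" means wavefronts per SIMD and uses the commonly documented GCN limits: 256 VGPRs per lane per SIMD, allocation in blocks of 4, and at most 10 wavefronts per SIMD. The constants and the function name are assumptions for illustration, not values taken from the kernel, and LDS and SGPR limits are not modeled.

```python
# Assumed GCN limits (Tahiti and Hawaii), see the note above.
MAX_VGPRS_PER_LANE = 256   # vector registers available to one lane of a SIMD
VGPR_GRANULARITY = 4       # registers are allocated in blocks of this size
MAX_WAVES_PER_SIMD = 10    # hardware cap on resident wavefronts per SIMD

def waves_per_simd(vgprs_per_thread: int) -> int:
    """Estimate how many wavefronts fit on one SIMD, looking at VGPRs only."""
    if vgprs_per_thread <= 0:
        raise ValueError("vgprs_per_thread must be positive")
    # Round the request up to the allocation granularity.
    blocks = -(-vgprs_per_thread // VGPR_GRANULARITY)
    allocated = blocks * VGPR_GRANULARITY
    # Each resident wavefront needs its own register allocation.
    return min(MAX_WAVES_PER_SIMD, MAX_VGPRS_PER_LANE // allocated)
```

Under these assumptions a kernel has to stay at or below 128 VGPRs to fit 2 wavefronts: `waves_per_simd(128)` gives 2, while `waves_per_simd(129)` drops to 1.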

utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
February 27, 2015, 05:00:08 PM
 #258

Very nice results with 290(X), any chance for 280x gain?

In order to do the same optimizations on the 280(x), the code would need to be almost completely rewritten to work on 32-bit numbers instead of 64-bit (because of LDS usage), in the hope that the VGPR count (which is mostly under the compiler's control and very difficult to reduce by modifying the OpenCL code) ends up low enough to permit 2 wavefronts.
Or maybe a better compiler in the future could do it by itself.
As of now, I think the best option is to use the asm version for the 280(x) and my latest binary for the 290(x).
Why do you think this? The ASM version is driver independent and relies on coding directly for the GPU. There is very little difference between the 280x and the 290; the 290 has more shaders, true. But basic code optimizations such as your first/last pass should work in ASM just as well or better, considering that AMD lobotomized the OCL compiler after 14.7.
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
February 27, 2015, 09:57:54 PM
 #259

Very nice results with 290(X), any chance for 280x gain?

In order to do the same optimizations on the 280(x), the code would need to be almost completely rewritten to work on 32-bit numbers instead of 64-bit (because of LDS usage), in the hope that the VGPR count (which is mostly under the compiler's control and very difficult to reduce by modifying the OpenCL code) ends up low enough to permit 2 wavefronts.
Or maybe a better compiler in the future could do it by itself.
As of now, I think the best option is to use the asm version for the 280(x) and my latest binary for the 290(x).
Why do you think this? The ASM version is driver independent and relies on coding directly for the GPU. There is very little difference between the 280x and the 290; the 290 has more shaders, true. But basic code optimizations such as your first/last pass should work in ASM just as well or better, considering that AMD lobotomized the OCL compiler after 14.7.

14.12 is the first version that generates Hawaii-specific code, which in some cases may bring noticeable improvements.
On Tahiti, the compiler simply can't produce code capable of running 2 wavefronts. Or maybe it can, but I'm not able to make it do so; on Hawaii I can.
Hawaii is not just Tahiti with more shaders...

utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
February 27, 2015, 10:11:06 PM
Last edit: February 27, 2015, 10:40:33 PM by utahjohn
 #260

Very nice results with 290(X), any chance for 280x gain?

In order to do the same optimizations on the 280(x), the code would need to be almost completely rewritten to work on 32-bit numbers instead of 64-bit (because of LDS usage), in the hope that the VGPR count (which is mostly under the compiler's control and very difficult to reduce by modifying the OpenCL code) ends up low enough to permit 2 wavefronts.
Or maybe a better compiler in the future could do it by itself.
As of now, I think the best option is to use the asm version for the 280(x) and my latest binary for the 290(x).
Why do you think this? The ASM version is driver independent and relies on coding directly for the GPU. There is very little difference between the 280x and the 290; the 290 has more shaders, true. But basic code optimizations such as your first/last pass should work in ASM just as well or better, considering that AMD lobotomized the OCL compiler after 14.7.

14.12 is the first version that generates Hawaii-specific code, which in some cases may bring noticeable improvements.
On Tahiti, the compiler simply can't produce code capable of running 2 wavefronts. Or maybe it can, but I'm not able to make it do so; on Hawaii I can.
Hawaii is not just Tahiti with more shaders...
What are the differences? The 280x (Tahiti) can run multiple gpu-threads on many other coins (up to 4 gpu-threads on x11) with great efficiency, so I do not understand why Groestl cannot.
Forgive me for asking such questions, but like my question about neoscrypt (which performs best with only 1 gpu-thread)  WS being totally tuned by amount of shaders ...