Bitcoin Forum
March 26, 2017, 09:29:54 AM *
Author Topic: [ANN][GRS][DMD] Pallas optimized groestlcoin / diamond etc. opencl kernel  (Read 48900 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
realhet
Jr. Member
*
Offline Offline

Activity: 32


View Profile WWW
January 12, 2015, 01:36:18 AM
 #181

I tested my kernel only in Cat 14.9
I have no info on how it works on 14.7

When you compile in HetPas, it generates a skeleton kernel binary with the help of the OpenCL compiler, and the new assembly code is then PATCHED into that. So I don't build the binary from scratch; maybe the 14.7 binary is a bit different from the 14.9 binary and I just don't know about it. (Life would be so much easier if AMD were kind enough to give us an interface for uploading binary program code... but that's not going to happen :D)


"Any tweaks you can do with..."

Please let's do the test inside the IDE first. Let's compare the original and the new kernel there, as it is perfect for timing. In sgminer we need to play with Intensity and other factors and wait for minutes to get a correct time anyway.

So please paste here what you see in the right pane of HetPas after you run the program.
This is the information I'm interested in; please also tell me which card and engine MHz you used:

Using new GCN ASM code
Kernel binary saved: C:\Work\Groestl\kernel_dump\kernel.elf

elapsed: 190.645 ms  13.750 MH/s   gain:   3.44x
elapsed: 188.281 ms  13.923 MH/s   gain:   3.48x
elapsed: 188.233 ms  13.927 MH/s   gain:   3.48x
elapsed: 188.316 ms  13.920 MH/s   gain:   3.48x

Functional test: RESULT IS OK
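
The benchmark lines above have a regular shape, so they are easy to compare across cards with a small script. The following is a minimal Python sketch, not part of HetPas; the regular expression and the name `parse_bench` are assumptions based only on the four lines pasted here, and other HetPas versions may print something different.

```python
import re

# Pattern for lines such as:
#   elapsed: 190.645 ms  13.750 MH/s   gain:   3.44x
# The layout is assumed from the output pasted in this post.
BENCH_RE = re.compile(
    r"elapsed:\s*([\d.]+)\s*ms\s+([\d.]+)\s*MH/s\s+gain:\s*([\d.]+)x"
)

def parse_bench(text):
    """Return a list of (elapsed_ms, mhs, gain) tuples found in text."""
    return [tuple(float(g) for g in m.groups()) for m in BENCH_RE.finditer(text)]

sample = """\
elapsed: 190.645 ms  13.750 MH/s   gain:   3.44x
elapsed: 188.281 ms  13.923 MH/s   gain:   3.48x
elapsed: 188.233 ms  13.927 MH/s   gain:   3.48x
elapsed: 188.316 ms  13.920 MH/s   gain:   3.48x
"""

runs = parse_bench(sample)
# Leave the first launch out of the average; it can be slower than the rest.
steady = runs[1:]
avg_mhs = sum(r[1] for r in steady) / len(steady)
print(f"{len(runs)} runs, steady-state average {avg_mhs:.3f} MH/s")
```

Pasting your own right-pane output into `sample` gives numbers that can be compared line by line with the ones above.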

realhet
Jr. Member
*
Offline Offline

Activity: 32


View Profile WWW
January 12, 2015, 02:04:01 AM
 #182

Thanks!

Well this is kinda bad for a Tahiti :/

Also the times of the 4 kernel launches are weird:
On my card it is 3.44x, 3.48x, 3.48x, 3.48x
But on your card this is 3.88x, 3.10x, 3.10x, 3.10x

On my card the first launch is a bit slow because the card was at low MHz when the test started; after the warmup it became a steady 3.48x.

On your card the speeds are so random. Your card (at 1150) is 3.68x faster than mine, so if everything were ok, you should have seen 12.8x gains.

Maybe it is a 14.7 issue, I don't know. Everything can change from driver to driver...

What is on my mind is:

1. What if you change WorkCount from the original
    WorkCount := 256*10*512
to WorkCount := 256*10*512*10;  ?
Do the elapsed times become 10x longer?  (The functional test will fail; that's ok, just reset WorkCount to the default value after this test.)

2. Let's see how the original kernel works in HetPas:
  just comment out the  "#define USE_NEW_ASM_KERNEL" and let me see the times please. If the original kernel works well, then gain must be 3.68.
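
The WorkCount check in point 1 rests on simple arithmetic: at a fixed hashrate, elapsed time scales linearly with the number of work items. Below is a minimal Python sketch of that check, not HetPas code. The factor of two hashes per work item is an assumption, inferred from the output posted earlier in this thread (190.645 ms and 13.750 MH/s with the default WorkCount); the function names are illustrative.

```python
HASHES_PER_ITEM = 2  # assumption: inferred from the posted figures

def mhs(work_count, elapsed_ms):
    """Hashrate in MH/s for work_count items finished in elapsed_ms."""
    return HASHES_PER_ITEM * work_count / (elapsed_ms / 1000.0) / 1e6

def expected_elapsed_ms(work_count, rate_mhs):
    """Elapsed time in ms expected for work_count items at rate_mhs."""
    return HASHES_PER_ITEM * work_count / (rate_mhs * 1e6) * 1000.0

default_wc = 256 * 10 * 512      # the default WorkCount
big_wc = default_wc * 10         # the enlarged WorkCount from point 1

rate = mhs(default_wc, 190.645)  # elapsed time of the first launch posted earlier
print(f"default WorkCount: {rate:.3f} MH/s")
print(f"10x WorkCount should take about {expected_elapsed_ms(big_wc, rate):.1f} ms")
```

If the measured time grows by much less than 10x, that would suggest the launch is dominated by something other than hashing, such as overhead or clock ramp-up.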


(Thank you for testing so far)

--------------------------------------------------------------------
"elapsed: 50.686 ms  51.719 MH/s   gain:  12.93x"
WOW! THIS IS IT! :D :D :D
Exactly what I expected! Your card is 3.71x faster. What was the error? You accidentally mined while testing, right?
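
The expected-gain reasoning in this post is a product of two ratios: the new kernel's gain on realhet's card and the speed ratio between the two cards. A small Python sketch using only the figures quoted in the post; the variable names are illustrative.

```python
my_gain = 3.48      # steady-state gain on realhet's card
card_ratio = 3.68   # tester's card relative to realhet's, as estimated in the post

expected_gain = my_gain * card_ratio
print(f"expected gain: {expected_gain:.1f}x")

# Working backwards from the measured 12.93x gives the observed card ratio.
measured_gain = 12.93
observed_ratio = measured_gain / my_gain
print(f"observed card ratio: {observed_ratio:.1f}x")
```

The observed ratio comes out at roughly 3.7x, in line with the estimate, which is why the 12.93x result reads as the expected one.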
utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 12, 2015, 02:15:04 AM
 #183

The last test run I did grabbed 2 cards so divide in half for an average on Tahiti (280x+7950).

Not the gains I was expecting based on your blog ... 3.4x times 18.5 MHs should net me around 62 MHs vs the 26 MHs I'm getting now ... so not such great gains on Tahiti, but better :)

Short of physically pulling a card, I don't know how to stop hetpas from running on all of them ...

DMD: dUTjohnrXHGYkh7jELWrZkGJbMnE6mdsuh (Staking)
BTC: 1HANJQygp3jHuzutceBgMT7wfCgEug6h4L (Donation)
ETH: 0xba90d7c1ab2bb9d5c07d843476153d1722637250 Mine ETH for 0.5% http://donkeypool.com
Star65
Jr. Member
*
Offline Offline

Activity: 47


View Profile
January 12, 2015, 02:34:42 AM
 #184

I would also test on 7970 & 280x.
realhet
Jr. Member
*
Offline Offline

Activity: 32


View Profile WWW
January 12, 2015, 02:47:15 AM
 #185

There must be some misunderstandings about the MH/s values, so we have to be careful!

On this topic (first post) when Pallas says that R9 280x is 18MH/s he counts it in Groestl hashes.

When my program says "elapsed: 50.686 ms  51.719 MH/s" it counts it also in Groestl hashes. Just as Pallas.

But when you see MH/s inside sgminer then it must be multiplied by 2 because in SG 1 MH/s = 2 MGroestlH/s.

--------------------------
So when you see "51.719 MH/s" in my program
then you must see 26MH/s in SG.

And when you see 18MH/s on the first post on this topic
You must see 9MH/s in SG.

Also when I see 4MH/s in my program
Then I saw 2MH/s in SG.
---------------------------

So the equation is: 2*sgminer Mh/s = Pallas's Mh/s

This is because sgminer counts 2 Groestl hash calculations as 1. But Pallas counts them as 2 hashes, and I just copied Pallas, then later found out how sgminer calculates.
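
The conversion described in this post can be written down as a pair of helpers. This is a minimal Python sketch of realhet's stated convention only (1 sgminer MH/s = 2 Groestl MH/s); the function names are illustrative, and pallas replies later in the thread that his own figures are taken from sgminer.

```python
GROESTL_PER_SGMINER_HASH = 2  # per the description in this post

def program_to_sgminer(mhs):
    """Convert the test program's MH/s (single Groestl hashes) to sgminer MH/s."""
    return mhs / GROESTL_PER_SGMINER_HASH

def sgminer_to_program(mhs):
    """Convert sgminer MH/s to the test program's MH/s."""
    return mhs * GROESTL_PER_SGMINER_HASH

for value in (51.719, 4.0):
    print(f"{value} MH/s in the test program ~ "
          f"{program_to_sgminer(value):.1f} MH/s in sgminer")
```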

---------------------------
So the Tahiti 26MH/s in sgminer is correct. Please remove the kernel and let sgminer compile it from OpenCL! If I'm calculating correctly, you should see 7-8MH/s with the original kernel. Can you check it please?


utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 12, 2015, 02:52:13 AM
 #186

When I run Pallas OCL I see 18.5MHs in sgminer.
When I run Realhet asm I see 26.0MHs in sgminer.

realhet
Jr. Member
*
Offline Offline

Activity: 32


View Profile WWW
January 12, 2015, 03:11:14 AM
 #187

When I run Pallas OCL I see 18.5MHs in sgminer.
When I run Realhet asm I see 26.0MHs in sgminer.

Please send me that .cl file and the binary that is compiled by the sgminer, I gotta check it.

For today, Thank You for testing, I gotta sleep now, see you!
Star65
Jr. Member
*
Offline Offline

Activity: 47


View Profile
January 12, 2015, 04:33:54 AM
 #188

TVM Pallas and realhet for nice work!

7970/280x 1130/300 W7

Pallas kernel in Cat 14.6  - 17.8MH/s
Pallas kernel in Cat 14.9  - 7.8MH/s   - so 14.9 very bad drivers?!
Realhet kernel in Cat 14.9 - 24.8MH/s - 24.8/7.8=3.18x !!!

We need realhet kernel (bin) with Cat 14.6 or 14.7 (best drivers perhaps). But I do not know how to do it.

utahjohn
Hero Member
*****
Offline Offline

Activity: 616


View Profile WWW
January 12, 2015, 05:03:14 AM
 #189

14.9 has a piss poor OCL compiler, we've known this for a long time ... Stick with 14.7RC3 for best overall performance over many different algos.

I guess we are stuck with compiling realhet asm on 14.9 but 14.7 does better compiles for OCL.

I am running realhet asm kernel generated with 14.9 on 14.7 catalyst, just a pain in the ass reverting to 14.7 after using 14.9.

My Pallas OCL compile was done with 14.7RC3 and works better than OCL compiled on 14.9.
Pallas ocl compiled with 14.7RC3 will run normal on 14.9, just don't re-compile it with 14.9 ...

Confused yet? hehe

@Realhet
So the gain of Realhet = 1.40x Pallas stands when comparing to properly working Pallas OCL kernel on 14.7
(Same clocks and Intensity running under 14.7 so a fair compare).
Your Pallas reference speed is incorrect in hetpas because 14.9 mangled the OCL badly performance wise.
Take a look at performance hit 14.7 vs 14.9 in Star65 post above.
Unfortunately some of the "gains" you made may have been just repairing 14.9 OCL bugs LOL but obviously improvement was made somewhere in asm kernel.
You need to establish a baseline for your GPU using 14.7 Pallas OCL and see what really made improvements ...
I suggest starting over and using this first round as a learning experience :)  You started with code broken by the 14.9 compiler as a base ...
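
How large the gain looks depends entirely on which baseline it is divided by. The Python sketch below only redoes the division for the sgminer figures reported in this thread by Star65 and utahjohn; the labels and the helper name `speedup` are illustrative.

```python
def speedup(new_mhs, baseline_mhs):
    """Ratio of the new kernel's hashrate to a baseline hashrate."""
    return new_mhs / baseline_mhs

# (asm kernel MH/s, baseline MH/s) as reported in sgminer in this thread
reports = {
    "Star65, Realhet asm vs Pallas OCL on Cat 14.9": (24.8, 7.8),
    "Star65, Realhet asm vs Pallas OCL on Cat 14.6": (24.8, 17.8),
    "utahjohn, Realhet asm vs Pallas OCL from 14.7": (26.0, 18.5),
}

for label, (new, base) in reports.items():
    print(f"{label}: {speedup(new, base):.2f}x")
```

Against a Pallas kernel compiled by an older driver the ratio is around 1.4x, while against the 14.9 compile it is above 3x, which is the point being made about the baseline.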

Pallas 14.7 OCL Bin for 280x 18.5 MHs
https://mega.co.nz/#!kAEnDATC!HeelwXTHDsQNx8WJhTDcwqS-slOmikoBiMqTEK9-DV0
Realhet 14.9 ASM bin for 280x 26.0 MHs
https://mega.co.nz/#!1NlRhYLC!7oLFfr2umL7T2Lc0fX3HY1ddthbpNqt6I_tYdG9OI9g

Another random thought :) Can you set hetpas up to "cross-compile" for different GCN architectures, so all we have to do is download bin files from you to test them?  I really dislike uninst-inst-uninst-inst to try a new asm version on 14.7 ... For example, have it compile Tahiti.elf, hawaii.elf etc.  I understand you can only test on your own card, but having us out here test the other elf files would speed up the testing of new versions ...

DMD Donations : dJrhv4Pp1FXPrQiEp5njx42QrZiuZrbjQ1

Block found and accepted while solo mining, so your asm kernel appears to be valid :)

I'd like you to have a look see what you can do to further improve wolf0's neoscrypt kernel with asm when you get time.
7950 currently doing 278KHs mining FTC.  PM me for OCL and BIN.

pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 09:18:29 AM
 #190

"when Pallas says that R9 280x is 18MH/s he counts it in Groestl hashes."

No, my hashrates are taken from sgminer.

pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 09:20:38 AM
 #191

@pallas: Thanks for fiddling with Win7! :D What does it mean by 32 bit code? That has no meaning regarding the GCN hardware o.O
But I'm 100% sure that you can't use my Capeverde binary unless you have that chip in the device you selected. ( var dev:=cl.devices[CLdeviceIndex]; )

Bins generated by sgminer on a 32 bit system will not work on a 64 bit one and vice versa, so I suppose the same is true for your kernels.

sp_
Hero Member
*****
Offline Offline

Activity: 980

Ccminer developer


View Profile
January 12, 2015, 09:25:04 AM
 #192

@pallas: Thanks for fiddling with Win7! :D What does it mean by 32 bit code? That has no meaning regarding the GCN hardware o.O
But I'm 100% sure that you can't use my Capeverde binary unless you have that chip in the device you selected. ( var dev:=cl.devices[CLdeviceIndex]; )
Bins generated by sgminer on a 32 bit system will not work on a 64 bit one and vice versa, so I suppose the same is true for your kernels.

On linux yes, but on windows they work. You need to run the x86 build of sgminer.

BTC: 1CTiNJyoUmbdMRACtteRWXhGqtSETYd6Vd
pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 09:26:25 AM
 #193

@pallas: Thanks for fiddling with Win7! :D What does it mean by 32 bit code? That has no meaning regarding the GCN hardware o.O
But I'm 100% sure that you can't use my Capeverde binary unless you have that chip in the device you selected. ( var dev:=cl.devices[CLdeviceIndex]; )

Bins generated by sgminer on a 32 bit system will not work on a 64 bit one and vice versa, so I suppose the same is true for your kernels.

In fact:

[10:25:27] Internal error: Input OpenCL binary is not for the target!

pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 09:30:29 AM
 #194

@pallas: Thanks for fiddling with Win7! :D What does it mean by 32 bit code? That has no meaning regarding the GCN hardware o.O
But I'm 100% sure that you can't use my Capeverde binary unless you have that chip in the device you selected. ( var dev:=cl.devices[CLdeviceIndex]; )

Bins generated by sgminer on a 32 bit system will not work on a 64 bit one and vice versa, so I suppose the same is true for your kernels.
Mine end in l4.bin ... am I 32 or 64 ... (win 7 x64)

4 * 8 (bits) = 32

it's the size of a long integer.
probably the sgminer build you are using is 32 bit.

pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 10:04:38 AM
 #195

@pallas: Thanks for fiddling with Win7! :D What does it mean by 32 bit code? That has no meaning regarding the GCN hardware o.O
But I'm 100% sure that you can't use my Capeverde binary unless you have that chip in the device you selected. ( var dev:=cl.devices[CLdeviceIndex]; )

Bins generated by sgminer on a 32 bit system will not work on a 64 bit one and vice versa, so I suppose the same is true for your kernels.
Mine end in l4.bin ... am I 32 or 64 ... (win 7 x64)

4 * 8 (bits) = 32

it's the size of a long integer.
probably the sgminer build you are using is 32 bit.
question is does hetpas use 32 or 64 bit ... I'd assume 32 bit since it runs ok on my sgminer ...
my sgminer is old 4.1.0 ...

so your main prob is needing hetpas src to run on linux ...

Probably realhet coded it for 32 bit; I don't know what changes, maybe the parameter passing part.
I hope realhet has time to look into this.
I also use version 4.1.
Hetpas can't run on linux: I'll try again with the new version when I can access my workstation and make it boot on windows.

JuanHungLo
Hero Member
*****
Offline Offline

Activity: 686


I don't always drink...


View Profile
January 12, 2015, 12:30:42 PM
 #196

I built my bins with Wolf0's x64 miner.  Works perfectly.

I'm glad I'm not judgmental like all you smug, superficial idiots
pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 12:37:57 PM
 #197

I built my bins with Wolf0's x64 miner.  Works perfectly.

could you share your bin files please?

JuanHungLo
Hero Member
*****
Offline Offline

Activity: 686


I don't always drink...


View Profile
January 12, 2015, 01:37:52 PM
 #198

I built my bins with Wolf0's x64 miner.  Works perfectly.

could you share your bin files please?

Personally, I wouldn't download this.  I'd generate my own.  But here it is.  Use at your own risk!
http://ge.tt/2uga0R82/v/0?c

pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 01:41:34 PM
 #199

I built my bins with Wolf0's x64 miner.  Works perfectly.

could you share your bin files please?

Personally, I wouldn't download this.  I'd generate my own.  But here it is.  Use at your own risk!
http://ge.tt/2uga0R82/v/0?c

Thanks, but it's 32 bit, I need 64 bit.

pallas
Legendary
*
Offline Offline

Activity: 1232


Black Belt Developer


View Profile
January 12, 2015, 01:51:21 PM
 #200

HOW TO TELL IF AN SGMINER BIN FILE IS 32 OR 64 BIT

If the filename, generated by sgminer, ends in l4.bin it is 32 bit (8 x 4 = 32)
If the filename, generated by sgminer, ends in l8.bin it is 64 bit (8 x 8 = 64)

They are incompatible.
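
The naming rule above is easy to check in a script before copying bin files between machines. A minimal Python sketch, assuming only the l4.bin / l8.bin suffix rule stated in this post; the function name and the example file names are hypothetical and are not real sgminer output.

```python
def sgminer_bin_bitness(filename):
    """Guess 32 or 64 bit from the l4.bin / l8.bin suffix rule in this post.

    Returns None when the name matches neither suffix.
    """
    name = filename.lower()
    if name.endswith("l4.bin"):
        return 32   # 4 bytes * 8 bits
    if name.endswith("l8.bin"):
        return 64   # 8 bytes * 8 bits
    return None

# Hypothetical file names, for illustration only.
for name in ("example_kernel_l4.bin", "example_kernel_l8.bin", "kernel.elf"):
    print(name, "->", sgminer_bin_bitness(name))
```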
