Bitcoin Forum
October 31, 2024, 01:46:54 PM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 [3] 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 »  All
  Print  
Author Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13  (Read 106895 times)
Keninishna
Hero Member
*****
Offline Offline

Activity: 556
Merit: 500



View Profile
July 06, 2011, 02:53:22 PM
 #41

no change for me and my 6950s 11.6 drivers sdk 2.4
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 04:43:35 PM
 #42

Another improvement but still not enough to beat SDK 2.1 for me.

The two phatk kernels of interest to me are:
- Kernel A = standard phatk kernel with the MA tweak applied.
- Kernel B = the latest kernel from this thread.

I'm using: Linux, Catalyst 11.6, a Sapphire HD5850 Xtreme:

At 900 MHz things look promising...

[900 MHz - 360 MHz RAM]
SDK 2.1, kernel A: 364.4 MH/s
SDK 2.1, kernel B: Fatal error
SDK 2.4, kernel A: 360.8 MH/s
SDK 2.4, kernel B: 365.6 MH/s

...but at higher core clock rates SDK 2.1 takes the lead once more.

[980 MHz - 360 MHz RAM]
SDK 2.1, kernel A: 404.7 MH/s
SDK 2.4, kernel B: 404.3 MH/s

[1020 MHz - 360 MHz RAM]
SDK 2.1, kernel A: 421.5 MH/s
SDK 2.4, kernel B: 420.9 MH/s

I would give some higher clocks but I can't go much past 1020 MHz without overvolting my card and I don't want to do that.

I tried playing with the RAM frequency but everything dropped off slowly as I lowered it and quickly as I raised it no matter which kernel or version of SDK I choose.

It's a shame that SDK 2.1 cannot drive your latest kernel but then I guess you are specifically designing it for SDK 2.4 and are now using features which are not available in SDK 2.1.  It would be great to finally put SDK 2.1 to bed but another MH/s sounds like a tall order at this point.

I have no data on accepts and rejects (I mine solo).
CYPER
Hero Member
*****
Offline Offline

Activity: 812
Merit: 502



View Profile
July 06, 2011, 04:50:18 PM
 #43

I'm using Autominer and the newest kernel does not work for me at all, but I don't have time to troubleshoot it as I'll miss on my mining and any expected rewards.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 05:13:20 PM
 #44

I'm using Autominer and the newest kernel does not work for me at all, but I don't have time to troubleshoot it as I'll miss on my mining and any expected rewards.

Possibly you're using an old version of SDK.  I get a fatal error when trying this with SDK 2.1 but am fine with SDK 2.4.
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 06, 2011, 07:07:04 PM
 #45

I made an interesting discovery during my own tests with the new kernel version. I had to up the memory clock of my 5870 from 200 to 350 MHz in order to achieve the highest hashing values. Another thing to mention is, that I drive a Phenom II X6 1090T with only 800 MHz for every core, due to power saving, while mining. If I let the CPU use full speed, MHash/s goes even higher, let's say 3-4 MH/s.

Conclusion: Perhaps you guys should try to raise your mem speeds + experiment with CPU clocks, too. I know it has to be a good balance, so that higher MH/s values are not eaten by higher energy costs.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 07:15:24 PM
 #46

I made an interesting discovery during my own tests with the new kernel version. I had to up the memory clock of my 5870 from 200 to 350 MHz in order to achieve the highest hashing values. Another thing to mention is, that I drive a Phenom II X6 1090T with only 800 MHz for every core, due to power saving, while mining. If I let the CPU use full speed, MHash/s goes even higher, let's say 3-4 MH/s.

Conclusion: Perhaps you guys should try to raise your mem speeds + experiment with CPU clocks, too. I know it has to be a good balance, so that higher MH/s values are not eaten by higher energy costs.

Dia

My card RAM is already at 360 MHz and I've tested but I can't find a better frequency for the RAM at my core speeds if I'm only interested in MH/s.

As for CPU usage  I've not touched my CPU settings at all and the miners only use about 0.4% each.  I even removed the fan from the CPU and placed it to cool the back of my hot card (the heatsink on the CPU is not even warm).  I'm assuming significant CPU loads is a Windows thing.

What interests me is how SDK 2.1 seems to be better at higher clock speeds whereas SDK 2.4 with your kernel is better at moderate speeds (940 MHz or below).  I admit I have little data on this but if anyone else gets the same results it would be interesting to know why.
CYPER
Hero Member
*****
Offline Offline

Activity: 812
Merit: 502



View Profile
July 06, 2011, 07:18:31 PM
 #47

Possibly you're using an old version of SDK.  I get a fatal error when trying this with SDK 2.1 but am fine with SDK 2.4.


Well it worked with the previous version of the modified kernel.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 07:22:37 PM
 #48

Possibly you're using an old version of SDK.  I get a fatal error when trying this with SDK 2.1 but am fine with SDK 2.4.


Well it worked with the previous version of the modified kernel.

Yes, I found the previous version worked with SDK 2.1.  But the version released today doesn't.  I had to change to SDK 2.4 for this most recent version and this change actually lost me 0.4-0.6 MH/s.
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 06, 2011, 07:34:13 PM
 #49

Possibly you're using an old version of SDK.  I get a fatal error when trying this with SDK 2.1 but am fine with SDK 2.4.


Well it worked with the previous version of the modified kernel.

Yes, I found the previous version worked with SDK 2.1.  But the version released today doesn't.  I had to change to SDK 2.4 for this most recent version and this change actually lost me 0.4-0.6 MH/s.


I'm sorry to hear that ... what are the error messages you get with 2.1? I only tried with 2.4 and will test only on 2.4 and later SDKs by myself. Buf If you give me a hint I can try to fix it.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 07:49:09 PM
 #50

I'm sorry to hear that ... what are the error messages you get with 2.1? I only tried with 2.4 and will test only on 2.4 and later SDKs by myself. Buf If you give me a hint I can try to fix it.

Dia

Don't worry about it.  The improvements for the SDK 2.4 users are clear and I'm impressed that you've managed to close the gap between 2.4 and 2.1 as much as you have.

I don't know how to get detailed error messages from phatk.  When I use SDK 2.1 and your latest kernel I run the command
python phoenix.py -u http://<user>:<pass>@<host>:<port>/ -a 1 -q 1 -k phatk VECTORS BFI_INT FASTLOOP=false AGGRESSION=14 WORKSIZE=256 DEVICE=1
and get
[<date> <time>] FATAL kernel error: Failed to load OpenCL kernel!

If I try the same with the previous version of your kernel everything works happily.  I wish I had more details for you but I just don't know how to get them.
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 06, 2011, 08:07:43 PM
 #51

I'm sorry to hear that ... what are the error messages you get with 2.1? I only tried with 2.4 and will test only on 2.4 and later SDKs by myself. Buf If you give me a hint I can try to fix it.

Dia

Don't worry about it.  The improvements for the SDK 2.4 users are clear and I'm impressed that you've managed to close the gap between 2.4 and 2.1 as much as you have.

I don't know how to get detailed error messages from phatk.  When I use SDK 2.1 and your latest kernel I run the command
python phoenix.py -u http://<user>:<pass>@<host>:<port>/ -a 1 -q 1 -k phatk VECTORS BFI_INT FASTLOOP=false AGGRESSION=14 WORKSIZE=256 DEVICE=1
and get
[<date> <time>] FATAL kernel error: Failed to load OpenCL kernel!

If I try the same with the previous version of your kernel everything works happily.  I wish I had more details for you but I just don't know how to get them.


If Phoenix would allow to output the OpenCL compiler build log we could get an idea what's wrong. Perhaps jedi95 reads here and takes this as a suggestion Cheesy.
Perhaps I can take the lead with 2.4 and newer versions of my kernel, but for now I have no huge optimization ideas ... (but I'm thinking about it right now ^^).

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 08:19:16 PM
 #52

If Phoenix would allow to output the OpenCL compiler build log we could get an idea what's wrong. Perhaps jedi95 reads here and takes this as a suggestion Cheesy.
Perhaps I can take the lead with 2.4 and newer versions of my kernel, but for now I have no huge optimization ideas ... (but I'm thinking about it right now ^^).

Dia

Would I get more detailed feedback from another front-end to phatk?  I haven't really 'shopped around' with the front ends.
jedi95
Full Member
***
Offline Offline

Activity: 219
Merit: 120


View Profile
July 06, 2011, 08:53:03 PM
 #53

I'm sorry to hear that ... what are the error messages you get with 2.1? I only tried with 2.4 and will test only on 2.4 and later SDKs by myself. Buf If you give me a hint I can try to fix it.

Dia

Don't worry about it.  The improvements for the SDK 2.4 users are clear and I'm impressed that you've managed to close the gap between 2.4 and 2.1 as much as you have.

I don't know how to get detailed error messages from phatk.  When I use SDK 2.1 and your latest kernel I run the command
python phoenix.py -u http://<user>:<pass>@<host>:<port>/ -a 1 -q 1 -k phatk VECTORS BFI_INT FASTLOOP=false AGGRESSION=14 WORKSIZE=256 DEVICE=1
and get
[<date> <time>] FATAL kernel error: Failed to load OpenCL kernel!

If I try the same with the previous version of your kernel everything works happily.  I wish I had more details for you but I just don't know how to get them.


If Phoenix would allow to output the OpenCL compiler build log we could get an idea what's wrong. Perhaps jedi95 reads here and takes this as a suggestion Cheesy.
Perhaps I can take the lead with 2.4 and newer versions of my kernel, but for now I have no huge optimization ideas ... (but I'm thinking about it right now ^^).

Dia

There is no point trying to run phatk on pre-2.4 SDK versions. It will just end up being slower than the poclbm kernel.

For mining I see only 2 real options:
SDK 2.1 with poclbm
SDK 2.4 with phatk

2.2 is slower than 2.1 on poclbm and doesn't work well with phatk either.
2.3 is even slower than 2.2 on poclbm, but all I know with phatk is that it's slower than with 2.4

Anyway, getting the output from the compiler is very simple. You just need to comment out the try/except block surrounding self.loadKernel().

Phoenix Miner developer

Donations appreciated at:
1PHoenix9j9J3M6v3VQYWeXrHPPjf7y3rU
dishwara
Legendary
*
Offline Offline

Activity: 1855
Merit: 1016



View Profile
July 06, 2011, 09:02:09 PM
 #54

I am sure, it increases.
424-447 Mhash/s & 413 -430 Mhash/s
Sapphire 5870 & MSI 5870.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 09:07:25 PM
 #55

There is no point trying to run phatk on pre-2.4 SDK versions. It will just end up being slower than the poclbm kernel.

I read elsewhere that this is the theory but in practice phatk is faster than poclbm on SDK 2.1 for me.  Maybe this has something to do with the fact that I've applied the MA tweak (one less operation) to both kernels.

E.g. Sapphire HD5850 Xtreme 1000MHz core, 350MHz RAM, Catalyst 11.6 (Linux x86_64), VECTORS BFI_INT FASTLOOP=false AGGRESSION=13 WORKSIZE=256:
phatk: 413.3 MH/s (+/- 0.2 MH/s)
poclbm: 411.4 MH/s (+/- 0.2 MH/s)

I've tried lower core speeds and higher RAM speeds but always phatk outperforms poclbm on SDK 2.1 for me.

For mining I see only 2 real options:
SDK 2.1 with poclbm
SDK 2.4 with phatk

2.2 is slower than 2.1 on poclbm and doesn't work well with phatk either.
2.3 is even slower than 2.2 on poclbm, but all I know with phatk is that it's slower than with 2.4

Anyway, getting the output from the compiler is very simple. You just need to comment out the try/except block surrounding self.loadKernel().

I'll try that.
Wildvest
Newbie
*
Offline Offline

Activity: 41
Merit: 0


View Profile WWW
July 06, 2011, 09:18:42 PM
 #56

THANKS for your efforts  ! just reporting back  Cool

6990 version 2011-07-06 with Catalyst 11.4, SDK 2.4 now equal with the latest poclbm (phatk) - maybe 0.5 MH/s slower  Cry
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 06, 2011, 09:20:17 PM
 #57

Ok, here are the errors for the latest kernel on SDK 2.1.
{
Build on <pyopencl.Device 'Cypress' at 0x34a3680>:

/tmp/OCLthVTDN.cl(126): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[19] = P4(19) + 0x11002000 + P1(19);
                         ^

/tmp/OCLthVTDN.cl(138): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[30] = P3(30) + 0xA00055 + P1(30);
                         ^

/tmp/OCLthVTDN.cl(261): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        Vals[3] = L + W[64];
                      ^

/tmp/OCLthVTDN.cl(286): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[81] = P4(81) + P2(81) + 0xA00000;
                                  ^

/tmp/OCLthVTDN.cl(299): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[87] = P4(87) + P3(87) + 0x11002000 + P1(87);
                                  ^

/tmp/OCLthVTDN.cl(316): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[94] = P3(94) + 0x400022 + P1(94);
                         ^

6 errors detected in the compilation of "/tmp/OCLthVTDN.cl".
}

Hopefully this is just some implicit casting of a kind which SDK 2.1 wants to be babied through.  If I were versed in OpenCL I'd have a go at fixing this myself but I'm sure you would be much more efficient.
Alan Lupton
Newbie
*
Offline Offline

Activity: 42
Merit: 0


View Profile
July 06, 2011, 10:51:07 PM
 #58

2001-06-07: Wow, nice work! Now I'm getting not 5-15% rejections and working like a charm. No speed increase though from last update.
c_k
Donator
Full Member
*
Offline Offline

Activity: 242
Merit: 100



View Profile
July 07, 2011, 02:38:11 AM
 #59

New release gives me 2-3MH/s more

I've given a small donation Smiley

Thanks for the hard work!

Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 07, 2011, 06:02:05 AM
 #60

Ok, here are the errors for the latest kernel on SDK 2.1.
{
Build on <pyopencl.Device 'Cypress' at 0x34a3680>:

/tmp/OCLthVTDN.cl(126): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[19] = P4(19) + 0x11002000 + P1(19);
                         ^

/tmp/OCLthVTDN.cl(138): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[30] = P3(30) + 0xA00055 + P1(30);
                         ^

/tmp/OCLthVTDN.cl(261): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        Vals[3] = L + W[64];
                      ^

/tmp/OCLthVTDN.cl(286): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[81] = P4(81) + P2(81) + 0xA00000;
                                  ^

/tmp/OCLthVTDN.cl(299): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[87] = P4(87) + P3(87) + 0x11002000 + P1(87);
                                  ^

/tmp/OCLthVTDN.cl(316): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[94] = P3(94) + 0x400022 + P1(94);
                         ^

6 errors detected in the compilation of "/tmp/OCLthVTDN.cl".
}

Hopefully this is just some implicit casting of a kind which SDK 2.1 wants to be babied through.  If I were versed in OpenCL I'd have a go at fixing this myself but I'm sure you would be much more efficient.


You could try to add (u) in front of the raw hex values like (u)0x400022 and report back then.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Pages: « 1 2 [3] 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!