Bitcoin Forum
Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13  (Read 106800 times)
dsky
July 07, 2011, 07:22:21 AM
 #61

All miners are Windows 7 x32 - SDK 2.4 - Catalyst 11.6

Latest changes:
HD5770 - from 219 up to 220 MH/s
HD6950 (unlockable) - from 367 up to 370 MH/s
HD6970 (6950 with 6970 BIOS) - from 405 up to 408 MH/s

A small speed increase on all three kinds of cards, and the reject rate seems better, too.

Well done again, Sir!

hugolp
July 07, 2011, 07:40:36 AM
Last edit: July 07, 2011, 09:07:07 AM by hugolp
 #62

5870, Ubuntu 11.04, Catalyst 11.6, SDK 2.4, poclbm: went up 1 MH/s (latest modification compared with the previous one).

The good news is that the card that was randomly crashing the miner every 20 minutes with the previous patch had been running for more than an hour without problems, so it seemed stable. Edit: it just crashed. I don't know what happens with this card and the modified kernel. Also, power consumption has gone down by about 5 W. I'm very puzzled by these changes in consumption between the different kernels.

Very good job. A small donation is going your way.


teukon
July 07, 2011, 07:48:45 AM
 #63

Ok, here are the errors for the latest kernel on SDK 2.1.
{
Build on <pyopencl.Device 'Cypress' at 0x34a3680>:

/tmp/OCLthVTDN.cl(126): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[19] = P4(19) + 0x11002000 + P1(19);
                         ^

/tmp/OCLthVTDN.cl(138): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[30] = P3(30) + 0xA00055 + P1(30);
                         ^

/tmp/OCLthVTDN.cl(261): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        Vals[3] = L + W[64];
                      ^

/tmp/OCLthVTDN.cl(286): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[81] = P4(81) + P2(81) + 0xA00000;
                                  ^

/tmp/OCLthVTDN.cl(299): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[87] = P4(87) + P3(87) + 0x11002000 + P1(87);
                                  ^

/tmp/OCLthVTDN.cl(316): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[94] = P3(94) + 0x400022 + P1(94);
                         ^

6 errors detected in the compilation of "/tmp/OCLthVTDN.cl".
}

Hopefully this is just some implicit casting of a kind which SDK 2.1 wants to be babied through.  If I were versed in OpenCL I'd have a go at fixing this myself but I'm sure you would be much more efficient.


Quote from: Diapolo
You could try to add (u) in front of the raw hex values like (u)0x400022 and report back then.

Dia

No good.  Now I get a ton of "expression must have a constant value" errors.  The end of the log looks like:
{
/tmp/OCLgV3our.cl(25): error: expression must have a constant value
        (u)0x6a09e667, (u)0xbb67ae85, (u)0x3c6ef372, (u)0x510e527f, (u)0x9b05688c, (u)0x1f83d9ab, (u)0xfc08884d, (u)0x5be0cd19
                                                                                                                    ^

/tmp/OCLgV3our.cl(29): error: expression must have a constant value
  __constant ulong L = (u)0x198c7e2a2;
                          ^

/tmp/OCLgV3our.cl(261): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        Vals[3] = L + W[64];
                      ^

74 errors detected in the compilation of "/tmp/OCLgV3our.cl".
}

Only one of the "mixed vector-scalar operation" errors remains but I'm guessing the others are still there but just buried by the even more urgent "constant value" errors.
OCedHrt
July 07, 2011, 08:32:42 AM
 #64

7/6 kernel seems to have the following effects for me:

4850 - The ELF kernel is much larger, but there is a slight speed increase of ~3-4%, from 56-57 MH/s to 58-59 MH/s at 460 core. It also seems to run cooler. Since my card is overheating I have to underclock, so a cooler kernel means a higher clock: at 480 core I am getting 60 MH/s. Using -v -w 128. Worksize 128 is still optimal; 64 is slightly slower and 256 is much slower (< 50 MH/s).

6450 - Again the ELF kernel is much larger, but there is a slight speed decrease at worksize 128. The previous kernel was optimal at 128, but this one is optimal at 64: worksize 64 with the new kernel is equivalent to worksize 128 with the old kernel. It potentially runs cooler, but I did not check. This is a fanless card and only gets 32 MH/s.

teukon
July 07, 2011, 08:50:22 AM
 #65

Sorry about that last post.  I'm not usually that dumb I assure you.

I've modified your kernel code by adding (u) before each of the 5 raw hex values corresponding to the error messages.  I also added (u) directly before L from the other error message.  After this everything starts working in SDK 2.1.

For my stock voltage 5850:
423.7 (+/- 0.1) MH/s -> 425.9 (+/- 0.05) MH/s

This does of course mean that SDK 2.1 has increased its lead against SDK 2.4 for me.  So many people are convinced that SDK 2.4 is faster so perhaps this is a Windows/Linux thing.

If this runs for 24 hours without freezing then I have a new personal best!  I will want to test what proportion of these hashes are inaccurate but things are looking good.  Another donation is coming your way.
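
For anyone repeating this on another SDK, the line numbers to touch can be pulled out of the build log instead of being read off by eye. Below is a minimal Python sketch, not part of the kernel or of poclbm. The function name `flagged_lines`, the regular expression, and the shortened sample log are illustrative assumptions based on the log format quoted earlier in the thread.

```python
import re

# Matches the first line of an error record in the quoted log format, e.g.
#   /tmp/OCLthVTDN.cl(126): error: mixed vector-scalar operation not allowed
ERROR_RE = re.compile(r"^(?P<path>\S+\.cl)\((?P<line>\d+)\): error: (?P<msg>.+)$")


def flagged_lines(log_text, needle="mixed vector-scalar"):
    """Return the sorted kernel line numbers whose error message contains `needle`."""
    lines = set()
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match and needle in match.group("msg"):
            lines.add(int(match.group("line")))
    return sorted(lines)


# Shortened sample (assumption: trimmed from the logs posted in this thread).
SAMPLE_LOG = """\
/tmp/OCLthVTDN.cl(126): error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
        W[19] = P4(19) + 0x11002000 + P1(19);
/tmp/OCLthVTDN.cl(261): error: mixed vector-scalar operation not allowed
        Vals[3] = L + W[64];
/tmp/OCLgV3our.cl(29): error: expression must have a constant value
"""

print(flagged_lines(SAMPLE_LOG))  # [126, 261]
```

Filtering on the message text keeps the "expression must have a constant value" errors out of the result, since those came from casting values that did not need it.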
Diapolo (OP)
July 07, 2011, 09:46:52 AM
 #66

Ah sorry, I was not clear enough. You must not add (u) in front of every hex value in the kernel, but ONLY in front of the hex values that generated an error.

Code:
W[19] = P4(19) + (u)0x11002000 + P1(19);

W[30] = P3(30) + (u)0xA00055 + P1(30);

Vals[3] = (u)L + W[64];

W[81] = P4(81) + P2(81) + (u)0xA00000;

W[87] = P4(87) + P3(87) + (u)0x11002000 + P1(87);

W[94] = P3(94) + (u)0x400022 + P1(94);

Would you be so kind as to test this out and report back? I would say restore the latest kernel and then modify the 6 places.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
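
The manual edit Dia describes can also be scripted. The following is a rough Python sketch under stated assumptions: `cast_flagged` and the three-line sample source are made up for illustration, the line numbers would come from the compiler log, and `u` is taken to be the kernel's own type alias as used in Dia's snippet. It only handles bare hex literals, so the `L` in `Vals[3] = L + W[64];` would still need its `(u)` added by hand.

```python
import re

# A hex literal that is not already preceded by an identifier character
# or by the closing parenthesis of an existing cast.
HEX_RE = re.compile(r"(?<![\w)])0x[0-9A-Fa-f]+")


def cast_flagged(source, line_numbers, cast="(u)"):
    """Prefix bare hex literals with `cast`, but only on the given 1-based lines."""
    out = []
    for number, text in enumerate(source.split("\n"), start=1):
        if number in line_numbers:
            text = HEX_RE.sub(lambda m: cast + m.group(0), text)
        out.append(text)
    return "\n".join(out)


# Illustrative source: line 1 is a made-up constant that must stay untouched,
# lines 2 and 3 are two of the statements flagged in the SDK 2.1 log.
SAMPLE_KERNEL = "\n".join([
    "__constant uint K0 = 0x6a09e667;",
    "W[19] = P4(19) + 0x11002000 + P1(19);",
    "W[30] = P3(30) + 0xA00055 + P1(30);",
])

PATCHED = cast_flagged(SAMPLE_KERNEL, {2, 3})
print(PATCHED)
```

Because the pattern skips literals that already follow a `)`, running it twice on the same lines does not stack a second cast.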
teukon
July 07, 2011, 09:58:45 AM
 #67

You were perfectly clear, I was just being dumb.  In case you missed my second post, this fix works for SDK 2.1.  Thank you very much.
Diapolo (OP)
July 07, 2011, 10:06:00 AM
 #68

Great, so we have a fix and a version that works with 2.1. Will release a fix later today!

Dia

n4l3hp
July 07, 2011, 10:44:05 AM
 #69

My 4850 @ 675 core / 250 mem gets 85 MH/s, with a 0.32% stale rate at DeepBit. (Bought from eBay; I don't know what brand, but it came with a Zalman cooler. Anything higher than 680 core causes it to stop hashing, even if overvolted.) Temps are at 71 degrees Celsius in a closed case. It had been running Milkyway@Home for more than a year at the same settings before I switched it to bitcoin mining.

For the ATI 4000 series, use SDK 2.1 and poclbm (April 28 version). Using phatk and a newer OpenCL SDK version on these cards will only lower the hash rate.
OCedHrt
July 07, 2011, 12:11:48 PM
 #70

You misread my post. I am running at 460 core because the card is at 105C at that speed; I cannot run it any faster. I can run 480 core with this new kernel. Btw, the days of SDK 2.1 and poclbm are nearly over. I get 84 MH/s at 675 core and 494 mem. I can't do 250 mem because the card doesn't downclock that far with Afterburner. At 250 mem it would be even higher. However, I can actually clock 700+, though only for a few seconds.

Diapolo (OP)
July 07, 2011, 03:20:41 PM
 #71

New version 2011-07-07 is ready: http://www.mediafire.com/?7j70gnmllgi9b73

This is mainly a bugfix release for SDK 2.1, with some code restructuring to save a few writes and additions. I cannot guarantee that this really works with 2.1, because I didn't test it. If you are unsure, wait for other users to test it and consider applying this patch later!

By the way, I want to thank all of those who donated a few Bitcents to me, feels great!

Thanks,
Dia

PS: If it works, please post here and consider a small donation @ 1B6LEGEUu1USreFNaUfvPWLu6JZb7TLivM :).

Saturn7
July 07, 2011, 03:29:29 PM
 #72

Went from 433 MH/s to 440 MH/s on a 5870 overclocked to 970 MHz.
Thanks :) Donation sent.

First there was Fire, then Electricity, and now Bitcoins ;)
BOARBEAR
July 07, 2011, 03:30:10 PM
 #73

This kernel will cause poclbm to exit after running for a while.
I have tested several times: after a few hours poclbm has exited and my machine is left doing nothing.

AMD5870 with SDK 2.5
Diapolo (OP)
July 07, 2011, 03:37:03 PM
 #74

That's the first report I've had of that problem. Have any other poclbm users observed it?

Dia

hugolp
July 07, 2011, 03:45:03 PM
 #75

I reported it twice in this same thread...

It only happens on one of my cards (I have four 5870s, including another of the identical model that works fine). In Phoenix, it reports a kernel error and then continues mining (I suspect Phoenix reloads the kernel).


phorensic
July 07, 2011, 03:56:54 PM
 #76

Excellent work on this kernel.  It seems like the original author got to a point where he thought he had improved performance to almost the max, but you are progressing very nicely!
CYPER
July 07, 2011, 07:16:33 PM
 #77

Well finally I can see some improvement.

With today's version I went from 1748 to 1758, so a ~10 MH/s increase.

This is for 4x 5870 @ 960 MHz core & 300 MHz memory
SDK 2.1
Diapolo (OP)
July 07, 2011, 08:01:50 PM
 #78

Seems like good news for SDK 2.1 users, right ;)?

CYPER
July 07, 2011, 08:11:12 PM
 #79

Well, to be perfectly honest and objective, a 0.5% increase won't make any difference even with my setup. At least not in terms of financial benefits :)

But I don't mean to belittle your work - well done :)
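
To put the figures reported in this thread on a common footing, the relative gains can be worked out in a few lines. This is a small Python sketch: the helper name `percent_gain` is an assumption, and the before/after numbers are the ones posted above (each poster compared against their own previous kernel, so the baselines differ).

```python
def percent_gain(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (poster, setup, before MH/s, after MH/s) as reported in the thread
REPORTS = [
    ("dsky", "HD5770", 219.0, 220.0),
    ("dsky", "HD6950", 367.0, 370.0),
    ("teukon", "5850, SDK 2.1", 423.7, 425.9),
    ("Saturn7", "5870", 433.0, 440.0),
    ("CYPER", "4x 5870, SDK 2.1", 1748.0, 1758.0),
]

for poster, setup, before, after in REPORTS:
    print(f"{poster:8s} {setup:18s} {percent_gain(before, after):5.2f}%")
```

CYPER's 1748 to 1758 works out to a little under 0.6%, in line with the "half a percent" estimate above.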
erek
July 07, 2011, 09:40:20 PM
 #80

808.6 MH/s max now (up at least 1-2%) going from the 7-6-11 kernel to the 7-7-11 kernel.