Bitcoin Forum
Author Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13  (Read 106673 times)

BOARBEAR (Member)
July 23, 2011, 10:43:17 PM  #221

69xx version would be wonderful ;-P

To all 69XX card owners who want 1 ALU OP less, down to 1697 :). Just edit the kernel.cl file and replace line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works :). Remember, this WILL be slower for 58XX owners, so don't try this if you are on 58XX cards or even (s)lower!

Dia

What's the rationale behind this? It seems very weird to me that the compiler interprets the two statements differently.
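
For what it's worth, the two statements do group the additions differently, even though the result is the same. The += form evaluates as Vals[7] + (Vals[3] + P4 + ...), while the expanded form evaluates left to right as ((Vals[7] + Vals[3]) + P4) + .... With wrapping 32-bit unsigned arithmetic both give the same value, so any difference in ALU op count can only come from how the compiler schedules the additions. Whether this regrouping is what actually saves the op is a guess and is not confirmed anywhere in this thread. A minimal sketch in plain C, where uint32_t stands in for the kernel's uint type and the function names and parameters are made up for illustration:

```c
#include <stdint.h>

/* Compound-assignment form, as in the original line 385:
   parsed as v7 + (v3 + rest). */
uint32_t add_compound(uint32_t v7, uint32_t v3, uint32_t rest) {
    v7 += v3 + rest;
    return v7;
}

/* Expanded form, as in the suggested replacement:
   parsed left to right as (v7 + v3) + rest. */
uint32_t add_expanded(uint32_t v7, uint32_t v3, uint32_t rest) {
    v7 = v7 + v3 + rest;
    return v7;
}
```

Both functions return the same value for every input, including inputs that wrap past 2^32, so the change cannot affect the hash result; only the generated instruction schedule can differ.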

indio007 (Full Member)
July 23, 2011, 11:49:00 PM  #222

69xx version would be wonderful ;-P

To all 69XX card owners who want 1 ALU OP less, down to 1697 :). Just edit the kernel.cl file and replace line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works :). Remember, this WILL be slower for 58XX owners, so don't try this if you are on 58XX cards or even (s)lower!

Dia

Will it be slower on a 6850?

MiningBuddy (Hero Member)
July 24, 2011, 02:32:56 AM  #223

69xx version would be wonderful ;-P

To all 69XX card owners who want 1 ALU OP less, down to 1697 :). Just edit the kernel.cl file and replace line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works :). Remember, this WILL be slower for 58XX owners, so don't try this if you are on 58XX cards or even (s)lower!

Dia
Thanks, this gave me a 0.31 Mh/s increase per core on my 6990s 8)

Vince (Newbie)
July 24, 2011, 03:29:33 AM  #224

That's really weird. I can't see why the compiler generates better code with this statement, but it does. I verified it with SDK 2.4 myself.

Has anyone noticed that during the last few rounds a lot of variables are calculated but never used again?

I took the time to optimize all unused statements out, but got no speed improvement at all. It seems the compiler had already optimized them away.

Here it is - from round 124 to the #ifdef VECTORS preprocessor command

Code:
        sharound(121);

        // Optimized out all unused calculations
        W[122 - O] = P4(122) + P3(122) + P2(122) + P1(122);
        Vals[1] += K[58] + Vals[5] + W[122 - O] + s1(122) + ch(122);
        Vals[0] += K[59] + Vals[4] + s1(123) + ch(123) + P4(123) + P3(123) + P2(123) + P1(123);
        Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

#ifdef VECTORS

Maybe someone can trick the compiler into making better code ;-)
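
One likely reason the tail of the kernel needs so few calculations: in a SHA-256 round the working variables simply shift along (h = g, g = f, f = e), so the final h is the e value computed three rounds earlier. If only that last word is checked, the remaining rounds and the a-side updates are never needed, and a compiler's dead-code elimination can drop them without help. The sketch below illustrates the shift property in plain C. It assumes a SHA-256-style round with placeholder round constants and message words (not the real K[] table or the kernel's W[] schedule), and all function names are hypothetical:

```c
#include <stdint.h>

static uint32_t rotr(uint32_t x, unsigned n) {
    return (x >> n) | (x << (32 - n));
}

/* One SHA-256-style round on s[0..7] = a..h. The values of k and w
   (round constant, message word) do not matter for this demonstration. */
static void sha_round(uint32_t s[8], uint32_t k, uint32_t w) {
    uint32_t a = s[0], b = s[1], c = s[2], e = s[4], f = s[5], g = s[6];
    uint32_t S1  = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25);
    uint32_t ch  = (e & f) ^ (~e & g);
    uint32_t t1  = s[7] + S1 + ch + k + w;
    uint32_t S0  = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22);
    uint32_t maj = (a & b) ^ (a & c) ^ (b & c);
    uint32_t t2  = S0 + maj;
    s[7] = g; s[6] = f; s[5] = e; s[4] = s[3] + t1;  /* e-side: d + t1 */
    s[3] = c; s[2] = b; s[1] = a; s[0] = t1 + t2;    /* a-side: t1 + t2 */
}

/* Runs the given number of rounds with placeholder constants and words. */
static void run_rounds(uint32_t s[8], int rounds) {
    for (int i = 0; i < rounds; i++)
        sha_round(s, 0x9e3779b9u * (uint32_t)(i + 1), 0x01234567u + (uint32_t)i);
}

/* Returns 1 if h after all 64 rounds equals e after only 61 rounds. */
int last_rounds_skippable(void) {
    uint32_t full[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint32_t cut[8]  = {1, 2, 3, 4, 5, 6, 7, 8};
    run_rounds(full, 64);
    run_rounds(cut, 61);
    return full[7] == cut[4];
}
```

Calling last_rounds_skippable() returns 1 for any starting state, because nothing in the last three rounds feeds back into the value that ends up in h. That matches the shape of the snippet above, where rounds 122 to 124 each compute only a single Vals[] entry.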

Diapolo (OP, Hero Member)
July 24, 2011, 08:41:18 AM  #225

69xx version would be wonderful ;-P

To all 69XX card owners who want 1 ALU OP less, down to 1697 :). Just edit the kernel.cl file and replace line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works :). Remember, this WILL be slower for 58XX owners, so don't try this if you are on 58XX cards or even (s)lower!

Dia

What's the rationale behind this? It seems very weird to me that the compiler interprets the two statements differently.

You are not the only one :D, but the compiler says it's one ALU OP less!

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo

Diapolo (OP, Hero Member)
July 24, 2011, 08:46:06 AM  #226

69xx version would be wonderful ;-P

To all 69XX card owners who want 1 ALU OP less, down to 1697 :). Just edit the kernel.cl file and replace line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works :). Remember, this WILL be slower for 58XX owners, so don't try this if you are on 58XX cards or even (s)lower!

Dia

Will it be slower on a 6850?

The 6850 is a VLIW5 design and will be slower ... this is really only for 69XX cards!

Diapolo (OP, Hero Member)
July 24, 2011, 08:48:26 AM  #227

That's really weird. I can't see why the compiler generates better code with this statement, but it does. I verified it with SDK 2.4 myself.

Has anyone noticed that during the last few rounds a lot of variables are calculated but never used again?

I took the time to optimize all unused statements out, but got no speed improvement at all. It seems the compiler had already optimized them away.

Here it is - from round 124 to the #ifdef VECTORS preprocessor command

Code:
        sharound(121);

        // Optimized out all unused calculations
        W[122 - O] = P4(122) + P3(122) + P2(122) + P1(122);
        Vals[1] += K[58] + Vals[5] + W[122 - O] + s1(122) + ch(122);
        Vals[0] += K[59] + Vals[4] + s1(123) + ch(123) + P4(123) + P3(123) + P2(123) + P1(123);
        Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

#ifdef VECTORS

Maybe someone can trick the compiler into making better code ;-)

Hey Vince, I tried this myself a few days ago and didn't get better efficiency either ... perhaps I will have to throw it into the mixer again :D.
And thanks again for your work!

Dia

Diapolo (OP, Hero Member)
July 25, 2011, 02:03:38 PM  #228

That's really weird. I can't see why the compiler generates better code with this statement, but it does. I verified it with SDK 2.4 myself.

Has anyone noticed that during the last few rounds a lot of variables are calculated but never used again?

I took the time to optimize all unused statements out, but got no speed improvement at all. It seems the compiler had already optimized them away.

Here it is - from round 124 to the #ifdef VECTORS preprocessor command

Code:
        sharound(121);

        // Optimized out all unused calculations
        W[122 - O] = P4(122) + P3(122) + P2(122) + P1(122);
        Vals[1] += K[58] + Vals[5] + W[122 - O] + s1(122) + ch(122);
        Vals[0] += K[59] + Vals[4] + s1(123) + ch(123) + P4(123) + P3(123) + P2(123) + P1(123);
        Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

#ifdef VECTORS

Maybe someone can trick the compiler into making better code ;-)

Hey Vince, I tried this myself a few days ago and didn't get better efficiency either ... perhaps I will have to throw it into the mixer again :D.
And thanks again for your work!

Dia

No chance. I tried different things and combinations, but the OpenCL compiler does it better than I do (again ^^).

Dia

iopq (Hero Member)
July 29, 2011, 02:07:11 PM  #229

Phateus posted some improvements in his own kernel, check it out:
http://forum.bitcoin.org/index.php?topic=7964.0

Unfortunately it doesn't run for me, so I can't check whether it's faster on my card.

Diapolo (OP, Hero Member)
July 29, 2011, 06:48:07 PM  #230

Phateus posted some improvements in his own kernel, check it out:
http://forum.bitcoin.org/index.php?topic=7964.0

Unfortunately it doesn't run for me, so I can't check whether it's faster on my card.

Thanks for pointing me to that thread!

Dia

Diapolo (OP, Hero Member)
July 30, 2011, 06:14:41 PM  #231

I'm working on a new version! The input came from the original author of phatk, who released version 2.0 of phatk (THANKS Phateus).
Currently my version IS slower, but I see this as a fair and cool competition, from which all of us will benefit in the end.

Dia

Tx2000 (Full Member)
July 30, 2011, 06:27:01 PM  #232

I'm working on a new version! The input came from the original author of phatk, who released version 2.0 of phatk (THANKS Phateus).
Currently my version IS slower, but I see this as a fair and cool competition, from which all of us will benefit in the end.

Dia

Hopefully you can rework his version, because at the moment it doesn't work for me on Win7 / SDK 2.4 / latest GUIMiner. I replaced your phatk kernel with his and it just stays at "connecting", spamming "idle worker" in the console.

pennytrader (Sr. Member)
July 31, 2011, 11:12:27 AM  #233

I'm working on a new version! The input came from the original author of phatk, who released version 2.0 of phatk (THANKS Phateus).
Currently my version IS slower, but I see this as a fair and cool competition, from which all of us will benefit in the end.

Dia

Actually your kernel is fast for me.

Catalyst 11.6 + SDK 2.1, 975/300 gives me 313 Mh/s (your kernel)

Catalyst 11.6 + SDK 2.4, 975/300 gives me 311 Mh/s (latest from the original author; SDK 2.1 doesn't work)

please donate to 1P3m2resGCP2o2sFX324DP1mfqHgGPA8BL

deepceleron (Legendary)
July 31, 2011, 07:14:16 PM  #234

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to report 'miner idle' five times a second, or 5 Ghash/s with no shares solved.

ssateneth (Legendary)
July 31, 2011, 11:26:38 PM  #235

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to report 'miner idle' five times a second, or 5 Ghash/s with no shares solved.
This is the exact same problem I have, but I use GUIMiner.

Diapolo (OP, Hero Member)
August 01, 2011, 01:57:33 PM  #236

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to report 'miner idle' five times a second, or 5 Ghash/s with no shares solved.

Are you talking about my kernel mod or phatk 2.0? What are your command line options? Try deleting the *.elf files in the phatk directory!

Dia

deepceleron (Legendary)
August 01, 2011, 05:17:10 PM  #237

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to report 'miner idle' five times a second, or 5 Ghash/s with no shares solved.

Are you talking about my kernel mod or phatk 2.0? What are your command line options? Try deleting the *.elf files in the phatk directory!

Dia
Sorry for the confusion, that's regarding the 2.0 phatk that came out this weekend.

Diapolo (OP, Hero Member)
August 01, 2011, 05:59:37 PM  #238

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to report 'miner idle' five times a second, or 5 Ghash/s with no shares solved.

Are you talking about my kernel mod or phatk 2.0? What are your command line options? Try deleting the *.elf files in the phatk directory!

Dia
Sorry for the confusion, that's regarding the 2.0 phatk that came out this weekend.

Okay, but then please don't use this thread for phatk 2.0 support :). I think you understand that.

Regards,
Dia

ssateneth (Legendary)
August 02, 2011, 09:10:31 AM  #239

Alright Diapolo, you've got your work cut out for you. Phateus fixed the GUIMiner bug (it was in his kernel: one too many #'s in __init__). You do great work; you might want to join forces at this point though ;)

Tx2000 (Full Member)
August 02, 2011, 08:21:31 PM  #240

Alright Diapolo, you've got your work cut out for you. Phateus fixed the GUIMiner bug (it was in his kernel: one too many #'s in __init__). You do great work; you might want to join forces at this point though ;)

No way! Competition breeds innovation.