Bitcoin Forum
December 03, 2016, 09:55:46 PM *
News: To be able to use the next phase of the beta forum software, please ensure that your email address is correct/functional.
 
   Home   Help Search Donate Login Register  
Pages: « 1 2 3 4 5 6 7 8 9 10 11 [12] 13 14 15 16 17 18 19 20 21 »  All
  Print  
Author Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13  (Read 101386 times)
BOARBEAR
Member
**
Offline Offline

Activity: 77


View Profile
July 23, 2011, 10:43:17 PM
 #221

69xx version would be wonderful ;-P

To all 69XX card owners, that want 1 ALU OP less, down to 1697 Smiley. Just edit the kernel.cl file and replace Line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works Smiley. Remember, this WILL be slower for 58XX owners, so don't try this, if you are on 58XX cards or even (s)lower!

Dia

What's the rationale behind this?  It seems very weird to me that the compiler interpret the two statement differently.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction. Advertise here.
indio007
Full Member
***
Offline Offline

Activity: 210


View Profile
July 23, 2011, 11:49:00 PM
 #222

69xx version would be wonderful ;-P

To all 69XX card owners, that want 1 ALU OP less, down to 1697 Smiley. Just edit the kernel.cl file and replace Line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works Smiley. Remember, this WILL be slower for 58XX owners, so don't try this, if you are on 58XX cards or even (s)lower!

Dia

Will it be slower on a 6850?
MiningBuddy
Moderator
Legendary
*
Offline Offline

Activity: 1058


฿itcoin ฿itcoin ฿itcoin


View Profile
July 24, 2011, 02:32:56 AM
 #223

69xx version would be wonderful ;-P

To all 69XX card owners, that want 1 ALU OP less, down to 1697 Smiley. Just edit the kernel.cl file and replace Line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works Smiley. Remember, this WILL be slower for 58XX owners, so don't try this, if you are on 58XX cards or even (s)lower!

Dia
Thanks, gave me a 0.31 Mh/s increase per core on my 6990's  Cool

Vince
Jr. Member
*
Offline Offline

Activity: 38


View Profile
July 24, 2011, 03:29:33 AM
 #224

thats really weired. I cant see why the compiler generates better code with this statement - but it does. Verified it with SDK 2.4 myself ..

Anyone noticed that during the last few rounds a lot of variables are calculated, but never used again?

I took the time to optimize all unused statements out, but got no speed improvements at all. Seems like the compiler optimized it anyway.

Here it is - from round 124 to the #ifdef VECTORS preprocessor command

Code:
        sharound(121);

        // Optimized out all unused calculations
        W[122 - O] = P4(122) + P3(122) + P2(122) + P1(122);
        Vals[1] += K[58] + Vals[5] + W[122 - O] + s1(122) + ch(122);
        Vals[0] += K[59] + Vals[4] + s1(123) + ch(123) + P4(123) + P3(123) + P2(123) + P1(123);
        Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

#ifdef VECTORS

Maybe someone can trick the compiler into making better code ;-)

Like what I'm posting? -> 1DtHZgVufX1tc5jRCoNiUnygenwRWj939C
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
July 24, 2011, 08:41:18 AM
 #225

69xx version would be wonderful ;-P

To all 69XX card owners, that want 1 ALU OP less, down to 1697 Smiley. Just edit the kernel.cl file and replace Line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works Smiley. Remember, this WILL be slower for 58XX owners, so don't try this, if you are on 58XX cards or even (s)lower!

Dia

What's the rationale behind this?  It seems very weird to me that the compiler interpret the two statement differently.

You are not the only one Cheesy, but the compiler sais it's one ALU OP less!

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
July 24, 2011, 08:46:06 AM
 #226

69xx version would be wonderful ;-P

To all 69XX card owners, that want 1 ALU OP less, down to 1697 Smiley. Just edit the kernel.cl file and replace Line 385 (DL the latest 2011-07-17 version):

Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

with

Vals[7] = Vals[7] + Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

Please report if it works Smiley. Remember, this WILL be slower for 58XX owners, so don't try this, if you are on 58XX cards or even (s)lower!

Dia

Will it be slower on a 6850?

6850 is a VLIW5 design and will be slower ... really only 69XX cards!

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
July 24, 2011, 08:48:26 AM
 #227

thats really weired. I cant see why the compiler generates better code with this statement - but it does. Verified it with SDK 2.4 myself ..

Anyone noticed that during the last few rounds a lot of variables are calculated, but never used again?

I took the time to optimize all unused statements out, but got no speed improvements at all. Seems like the compiler optimized it anyway.

Here it is - from round 124 to the #ifdef VECTORS preprocessor command

Code:
        sharound(121);

        // Optimized out all unused calculations
        W[122 - O] = P4(122) + P3(122) + P2(122) + P1(122);
        Vals[1] += K[58] + Vals[5] + W[122 - O] + s1(122) + ch(122);
        Vals[0] += K[59] + Vals[4] + s1(123) + ch(123) + P4(123) + P3(123) + P2(123) + P1(123);
        Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

#ifdef VECTORS

Maybe someone can trick the compiler into making better code ;-)

Hey Vince, tried this by myself a few days ago and didn't get a better efficiency either ... perhaps I will have to throw it into the mixer again Cheesy.
And thanks again for your work!

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
July 25, 2011, 02:03:38 PM
 #228

thats really weired. I cant see why the compiler generates better code with this statement - but it does. Verified it with SDK 2.4 myself ..

Anyone noticed that during the last few rounds a lot of variables are calculated, but never used again?

I took the time to optimize all unused statements out, but got no speed improvements at all. Seems like the compiler optimized it anyway.

Here it is - from round 124 to the #ifdef VECTORS preprocessor command

Code:
        sharound(121);

        // Optimized out all unused calculations
        W[122 - O] = P4(122) + P3(122) + P2(122) + P1(122);
        Vals[1] += K[58] + Vals[5] + W[122 - O] + s1(122) + ch(122);
        Vals[0] += K[59] + Vals[4] + s1(123) + ch(123) + P4(123) + P3(123) + P2(123) + P1(123);
        Vals[7] += Vals[3] + P4(124) + P3(124) + P2(124) + P1(124) + s1(124) + ch(124);

#ifdef VECTORS

Maybe someone can trick the compiler into making better code ;-)

Hey Vince, tried this by myself a few days ago and didn't get a better efficiency either ... perhaps I will have to throw it into the mixer again Cheesy.
And thanks again for your work!

Dia

No chance, tried different things and combinations, but the OpenCL compiler does it better, than I do (again ^^).

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
iopq
Hero Member
*****
Offline Offline

Activity: 644


View Profile
July 29, 2011, 02:07:11 PM
 #229

Phateus posted some improvements in his own kernel, check it out:
http://forum.bitcoin.org/index.php?topic=7964.0

unfortunately it doesn't run for me, so i can't check whether it's faster on my card

Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
July 29, 2011, 06:48:07 PM
 #230

Phateus posted some improvements in his own kernel, check it out:
http://forum.bitcoin.org/index.php?topic=7964.0

unfortunately it doesn't run for me, so i can't check whether it's faster on my card

Thanks for pointing me to that thread!

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
July 30, 2011, 06:14:41 PM
 #231

I'm working on a new version! The inputs came from the original Author of phatk, who released a version 2.0 of phatk (THANKS Phateus).
Currently my version IS slower, but I see this as a fair and cool competition, from which all of us will benefit in the end.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Tx2000
Full Member
***
Offline Offline

Activity: 182



View Profile
July 30, 2011, 06:27:01 PM
 #232

I'm working on a new version! The inputs came from the original Author of phatk, who released a version 2.0 of phatk (THANKS Phateus).
Currently my version IS slower, but I see this as a fair and cool competition, from which all of us will benefit in the end.

Dia

Hopefully you can re-work his version because atm, it doesn't work for me on Win7/2.4SDK/GUIMiner lastest.  I replaced your phatk kernel with his and it just stays at connecting spamming idle worker in console.
pennytrader
Sr. Member
****
Offline Offline

Activity: 253


View Profile
July 31, 2011, 11:12:27 AM
 #233

I'm working on a new version! The inputs came from the original Author of phatk, who released a version 2.0 of phatk (THANKS Phateus).
Currently my version IS slower, but I see this as a fair and cool competition, from which all of us will benefit in the end.

Dia

Actually your kernel is fast for me.

Catalyst 11.6 + SDK 2.1, 975/300 gives me 313 mhs (your kernel)

Catalyst 11.6 + SDK 2.4, 975/300 gives me 311 mhs (latest from the original author. SDK 2.1 doesn't work)

please donate to 1P3m2resGCP2o2sFX324DP1mfqHgGPA8BL
deepceleron
Legendary
*
Offline Offline

Activity: 1470



View Profile WWW
July 31, 2011, 07:14:16 PM
 #234

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to be 'miner idle' five times a second, or 5Ghash/s with no share solves.

ssateneth
Legendary
*
Offline Offline

Activity: 1288



View Profile
July 31, 2011, 11:26:38 PM
 #235

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to be 'miner idle' five times a second, or 5Ghash/s with no share solves.
this is the exact same problem I have too, but i use guiminer

Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
August 01, 2011, 01:57:33 PM
 #236

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to be 'miner idle' five times a second, or 5Ghash/s with no share solves.

Are you talking about my kernel mod or phatk 2.0? What are your command line options. Try to delete the *.elf files in the phatk directory!

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
deepceleron
Legendary
*
Offline Offline

Activity: 1470



View Profile WWW
August 01, 2011, 05:17:10 PM
 #237

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to be 'miner idle' five times a second, or 5Ghash/s with no share solves.

Are you talking about my kernel mod or phatk 2.0? What are your command line options. Try to delete the *.elf files in the phatk directory!

Dia
Sorry for the confusion, that's regarding the 2.0 phatk that came this weekend.

Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
August 01, 2011, 05:59:37 PM
 #238

I tried the new kernel with phoenix/11.6/2.4/Win7/5830, and gave it lots of different command line options. I could only get it to be 'miner idle' five times a second, or 5Ghash/s with no share solves.

Are you talking about my kernel mod or phatk 2.0? What are your command line options. Try to delete the *.elf files in the phatk directory!

Dia
Sorry for the confusion, that's regarding the 2.0 phatk that came this weekend.

Okay, but then please don't use this thread for phatk 2.0 support Smiley. I think you understand that.

Regards,
Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
ssateneth
Legendary
*
Offline Offline

Activity: 1288



View Profile
August 02, 2011, 09:10:31 AM
 #239

Alright diapolo, you got your work cut out for you. Phateus fixed the guiminer bug (It was in his kernel. 1 too many #'s in __init__). You do great work; you might want to join forces at this point though Wink

Tx2000
Full Member
***
Offline Offline

Activity: 182



View Profile
August 02, 2011, 08:21:31 PM
 #240

Alright diapolo, you got your work cut out for you. Phateus fixed the guiminer bug (It was in his kernel. 1 too many #'s in __init__). You do great work; you might want to join forces at this point though Wink

No way! Competition breeds innovation.
Pages: « 1 2 3 4 5 6 7 8 9 10 11 [12] 13 14 15 16 17 18 19 20 21 »  All
  Print  
 
Jump to:  

Sponsored by , a Bitcoin-accepting VPN.
Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!