Bitcoin Forum
December 09, 2024, 01:01:26 PM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 3 4 [5] 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 »  All
  Print  
Author Topic: further improved phatk_dia kernel for Phoenix + SDK 2.6 - 2012-01-13  (Read 106967 times)
zimpixa
Member
**
Offline Offline

Activity: 98
Merit: 10


View Profile
July 07, 2011, 10:16:10 PM
 #81

SDK 2.1 working for 5h now.

Newest version seems to behave more stable (less delta max-min). Speed is about 0.5MHash higher, but it can be just impression.

YinCoin YangCoin ☯☯First Ever POS/POW Alternator! Multipool! ☯ ☯ http://yinyangpool.com/ 
Free Distribution! https://bitcointalk.org/index.php?topic=623937
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
July 07, 2011, 10:36:13 PM
 #82

Excellent.

It looks like this SDK 2.1 bugfix is popular!  Perhaps now people will stop telling me to use SDK 2.4.

Thanks Diapolo for the fix and I'm glad I was able to help.
bmgjet
Member
**
Offline Offline

Activity: 98
Merit: 10


View Profile
July 07, 2011, 11:23:51 PM
 #83

version 2011-07-06 gives best over all speed. 0.500 faster then my modded one but opening firefox drops speed from 277 down to 233 where my one only drops 2mh/s. Probably just the way phatk works since iv never used the stock one.

version 2011-07-07 is all over the place. jumps between 255-280 without firefox open so don't know if its faster or slower. With firefox its more stable then older version and drops to 252-258

Im using it with poclbm.exe -v -w 256 (128 gave same result) on 6850 overclocked/underclocked and 2.1 SDK.

Donations to: 1BMGjetfht9XLkGBYR4TSsuXjrYEKACcow
1stbits: 1bmgjet
300MHash/s 6850 http://www.techpowerup.com/gpuz/5u6wr/
Overclocked for 6 years and still strong http://valid.canardpc.com/show_oc.php?id=1931458 & http://valid.canardpc.com/show_oc.php?id=285337
BOARBEAR
Member
**
Offline Offline

Activity: 77
Merit: 10


View Profile
July 08, 2011, 12:55:40 AM
 #84

I have a feature request

Can you please make it work with intel openCL?

thank you
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 08, 2011, 05:24:10 AM
 #85

Excellent.

It looks like this SDK 2.1 bugfix is popular!  Perhaps now people will stop telling me to use SDK 2.4.

Thanks Diapolo for the fix and I'm glad I was able to help.


It was only possible to fix it that fast, because you showed me the log files and error output! So we helped eachother, thanks too!

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 08, 2011, 05:25:46 AM
 #86

version 2011-07-06 gives best over all speed. 0.500 faster then my modded one but opening firefox drops speed from 277 down to 233 where my one only drops 2mh/s. Probably just the way phatk works since iv never used the stock one.

version 2011-07-07 is all over the place. jumps between 255-280 without firefox open so don't know if its faster or slower. With firefox its more stable then older version and drops to 252-258

Im using it with poclbm.exe -v -w 256 (128 gave same result) on 6850 overclocked/underclocked and 2.1 SDK.

I discovered a MH/sec drop while, using Firefox, too. It has to be related with the new GPU acceleration that FF implemented in 4.0 and up.

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 08, 2011, 05:27:00 AM
 #87

I have a feature request

Can you please make it work with intel openCL?

thank you

I know that Intel recently released and OpenCL gold SDK ... the kernel uses standard OpenCL commands and an AMD extension only for BFI_INT / BITALIGN. I see no reason why it should not work. Have you got some error logs for me? You know that for Intel it will only use the CPUs!?

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
kwaaak
Full Member
***
Offline Offline

Activity: 139
Merit: 100


View Profile
July 08, 2011, 09:55:22 AM
 #88

Thanks a lot
r3v3rs3
Newbie
*
Offline Offline

Activity: 22
Merit: 0


View Profile
July 08, 2011, 10:12:33 AM
 #89

2011-07-03 -> 2011-07-07

Wheezy x64, 11.4, SDK 2.4:

Box #1:

- HD5750, 875/300, AGGRESSION=11, 175 MH/s -> 176 MH/s

Box #2:

- HD5750, 900/300, AGGRESSION=11, 181 MH/s -> 183 MH/s
- HD5770, 950/300, AGGRESSION=11, 215 MH/s -> 217 MH/s

XP 32, 11.7 preview, SDK 2.5:

- HD5770, 1000/300, AGGRESSION=12, 222 MH/s -> 223 MH/s
- HD5830, 1050/300, AGGRESSION=9, 337 MH/s -> 337 MH/s

phatk w/ Ma patch -> 2011-07-07

Natty x32, 11.6, SDK 2.4:

- HD5750, 900/300, AGGRESSION=9, 176 MH/s -> 176 MH/s
- HD5770, 950/1200 (going to be RBE'ed to 300 Wink), AGGRESSION=9, 204 MH/s -> 204 MH/s
- HD5830, 1000/300, AGGRESSION=9, 312 MH/s -> 313 MH/s

phoenix 1.50 w/ common flags: VECTORS BFI_INT FASTLOOP=false WORKSIZE=256

Nice work, sent some bitcents to 1B6LEGEUu1USreFNaUfvPWLu6JZb7TLivM. Smiley
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 08, 2011, 11:57:07 AM
 #90

Guys, I introduced a small glitch, which produces an OpenCL compiler warning in version 07-07. For stability reasons please change line 77:

old:
u W[123];

new:
u W[124];

I missed sharound(123), which writes to W[123], which is undefined, because it's out of range. Sorry for that!
Will upload a fixed version shortly (only includes the change above and stays 07-07).

Edit:
Download 07-07 fixed: http://www.mediafire.com/?o7jfp60s7xefrg4

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Tx2000
Full Member
***
Offline Offline

Activity: 182
Merit: 100



View Profile
July 08, 2011, 02:55:09 PM
 #91

Tried it out for the first time and I do see some noticable gains.



5850 @ 970/350
GUIMiner 2011-7-1

poclbm opencl - 410-413Mh
phoenix phatk - 408-413Mh

with your modified kern - 415-416.4Mh



Send a little something your way
BOARBEAR
Member
**
Offline Offline

Activity: 77
Merit: 10


View Profile
July 08, 2011, 06:28:38 PM
 #92

I have a feature request

Can you please make it work with intel openCL?

thank you

I know that Intel recently released and OpenCL gold SDK ... the kernel uses standard OpenCL commands and an AMD extension only for BFI_INT / BITALIGN. I see no reason why it should not work. Have you got some error logs for me? You know that for Intel it will only use the CPUs!?

Dia
I have tried.  Poclum will not run at all.  It crashed upon starting.

I took a look at the code.  I think the comments are messy and some not really helpful.  Do you think its an good idea to 'fix' the comment?  btw comment why you type cast the hex value so other developer wont think its unnecessary and remove it.
error
Hero Member
*****
Offline Offline

Activity: 588
Merit: 500



View Profile
July 08, 2011, 09:58:24 PM
 #93

These two changes have taken my 5850s from 345MHash/sec to 360MHash/sec. Very nice.

3KzNGwzRZ6SimWuFAgh4TnXzHpruHMZmV8
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 08, 2011, 11:57:40 PM
 #94

I have a feature request

Can you please make it work with intel openCL?

thank you

I know that Intel recently released and OpenCL gold SDK ... the kernel uses standard OpenCL commands and an AMD extension only for BFI_INT / BITALIGN. I see no reason why it should not work. Have you got some error logs for me? You know that for Intel it will only use the CPUs!?

Dia
I have tried.  Poclum will not run at all.  It crashed upon starting.

I took a look at the code.  I think the comments are messy and some not really helpful.  Do you think its an good idea to 'fix' the comment?  btw comment why you type cast the hex value so other developer wont think its unnecessary and remove it.

Hex-values are type-casted so that the kernel works with AMD 2.1 SDK, which throws an error, if NOT type-casted.
I don't understand what you want to tell me with the "comments are messy" part.

If you get an error log with Intel SDK please post it here, so I can have a look at it.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 08, 2011, 11:58:43 PM
 #95

These two changes have taken my 5850s from 345MHash/sec to 360MHash/sec. Very nice.

5830s seem like THE card for my modded kernel. Great to hear Smiley.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Vince
Newbie
*
Offline Offline

Activity: 38
Merit: 0


View Profile
July 09, 2011, 02:04:58 PM
 #96

Hi Diapolo!

Great to see you're making progress!
There's one thing that pops into my eye:

you already do:
if(Vals[7].x == -H[7])

why not add the K[60] right into it and remove from upper instruction? Saves a whole instruction and will work 100% ;-)

if(Vals[7].x == -H[7]-K[60])


Lets see if I can find more ..
Diapolo (OP)
Hero Member
*****
Offline Offline

Activity: 772
Merit: 500



View Profile WWW
July 09, 2011, 02:45:37 PM
 #97

Hi Diapolo!

Great to see you're making progress!
There's one thing that pops into my eye:

you already do:
if(Vals[7].x == -H[7])

why not add the K[60] right into it and remove from upper instruction? Saves a whole instruction and will work 100% ;-)

if(Vals[7].x == -H[7]-K[60])


Lets see if I can find more ..


Good idea and works, can't verify via KernelAnalyzer, but seems like a vector addition less.
Will be included in the next version!

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
Vince
Newbie
*
Offline Offline

Activity: 38
Merit: 0


View Profile
July 09, 2011, 02:59:02 PM
 #98

Another addition waiting to be removed:

 Vals[7] = (Vals[3] = (u)0xb956c25b + D1 + s1(4) + ch(4)) + H1;

-> D1 is only used here, so why not add (u)0xb956c25b during precalculation?  Wink

Add

self.state2[3] = np.uint32(self.state2[3] + 0xb956c25b);

to __init__.py, line 77 for me, right behind:

self.calculateF(data)

And remove (u)0xb956c25b from kernel.cl

This also works 100%, no logic change involved here.

bcforum
Full Member
***
Offline Offline

Activity: 140
Merit: 100


View Profile
July 09, 2011, 05:35:47 PM
 #99


What parameters are you using with phatk?

On my normally aspirated R6970 Lightning I get 419.xMH/s (Ma fix in poclbm):

Code:
-k poclbm DEVICE=0 AGGRESSION=13 BFI_INT WORKSIZE=64 VECTORS FASTLOOP=false

The fastest I've ever gotten with phatk is 403.xMH/s

Code:
-k phatk DEVICE=0 AGGRESSION=13 BFI_INT WORKSIZE=256 VECTORS FASTLOOP=false

Any suggestions?

If you found this post useful, feel free to share the wealth: 1E35gTBmJzPNJ3v72DX4wu4YtvHTWqNRbM
huayra.agera
Full Member
***
Offline Offline

Activity: 154
Merit: 100



View Profile
July 09, 2011, 06:08:03 PM
 #100

I hope you can make an optimization for the next OCL version (v2.5 (684.212)) available in beta form already. =)

BTC: 1JMPScxohom4MXy9X1Vgj8AGwcHjT8XTuy
Pages: « 1 2 3 4 [5] 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!