Bitcoin Forum
Author Topic: *Catalyst 12.1 Preview* Decreased performance, anyone else confirm?  (Read 19330 times)
Fiyasko
Legendary
*
Offline Offline

Activity: 1428


Okey Dokey Lokey


View Profile
December 22, 2011, 12:02:11 AM
 #41

AKA your new kernel mixed with these drivers is totally fucking pointless for miners

... so just don't use it.

Dia
+1

http://bitcoin-otc.com/viewratingdetail.php?nick=DingoRabiit&sign=ANY&type=RECV <-My Ratings
https://bitcointalk.org/index.php?topic=857670.0 GAWminers and associated things are not to be trusted, Especially the "mineral" exchange
-ck
Moderator
Legendary
*
Offline Offline

Activity: 1988


Ruu \o/


View Profile WWW
December 22, 2011, 04:18:51 AM
 #42

Well, I think the 100% CPU usage is not a fault in AMD's drivers; it comes from how I process the OpenCL buffer, which holds nonces.
To speed up kernel execution I removed control-flow (if-statements) from the kernel, but now check the whole buffer for valid nonces, even if there are none ... this leads (my guess) to an endless processing loop in Phoenix, which is the drawback of the kernel changes.
Just as a data point, cgminer does not do this. It does not check the whole buffer.
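To make the difference concrete, here is a minimal host-side sketch of the two approaches being compared: scanning every slot of the output buffer on each pass versus reading a single "found" slot first. The buffer size, the extra flag slot, and treating 0 as "empty" are assumptions made for illustration only; this is not the actual layout used by Phoenix, phatk, or cgminer.

```python
from array import array

BUFFER_SLOTS = 256        # hypothetical number of nonce slots in the output buffer
FLAG_SLOT = BUFFER_SLOTS  # hypothetical extra slot the kernel sets when it stores a nonce


def make_buffer():
    """Return a zeroed output buffer: nonce slots plus one flag slot."""
    return array("L", [0] * (BUFFER_SLOTS + 1))


def scan_whole_buffer(buf):
    """Host inspects every slot on every pass, even when nothing was found.

    Assumes 0 means "empty slot" (a simplification for this sketch).
    """
    return [n for n in buf[:BUFFER_SLOTS] if n != 0]


def scan_if_flagged(buf):
    """Host reads one slot and only walks the buffer when it is set."""
    if buf[FLAG_SLOT] == 0:
        return []
    return [n for n in buf[:BUFFER_SLOTS] if n != 0]


buf = make_buffer()
print(scan_whole_buffer(buf), scan_if_flagged(buf))  # both empty: nothing found yet

buf[5] = 0xDEADBEEF   # pretend the kernel stored a candidate nonce
buf[FLAG_SLOT] = 1    # and marked the buffer as non-empty
print(scan_whole_buffer(buf), scan_if_flagged(buf))
```

In the common case where no nonce was found, the flagged version does one read per pass while the full scan does `BUFFER_SLOTS` reads, which is the kind of extra host work the post is pointing at.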

Primary developer/maintainer for cgminer and ckpool/ckproxy.
Pooled mine at kano.is, solo mine at solo.ckpool.org
-ck
Transisto
Donator
Legendary
*
Offline Offline

Activity: 1624



View Profile WWW
December 22, 2011, 06:22:33 AM
 #43

I get 280 MH/s instead of 334 MH/s with 12.1 (5850 at 825 MHz core, 300 MHz RAM, poclbm -v -w256)

That is not a small drop ... this is 20%. I was expecting ~5%, which would have been worth it to get rid of the CPU bug.
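As a quick check on the two hash rates quoted in this post, the relative drop can be worked out directly. The helper name `percent_drop` is just for illustration. The result is a bit under the 20% stated, but still far above the ~5% that was expected.

```python
def percent_drop(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# 334 MH/s on the old driver vs 280 MH/s on 12.1, as quoted above
print(round(percent_drop(334, 280), 1))  # 16.2
```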
bronan
Hero Member
*****
Offline Offline

Activity: 765


Lazy Lurker Reads Alot


View Profile WWW
December 22, 2011, 10:52:13 AM
 #44

Try different settings, because when I use -v -w256 I get a lot less performance.
Setting it to -v2 -w 128 should fit better on a 5850.
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
December 22, 2011, 11:14:47 AM
 #45

Well, I think the 100% CPU usage is not a fault in AMD's drivers; it comes from how I process the OpenCL buffer, which holds nonces.
To speed up kernel execution I removed control-flow (if-statements) from the kernel, but now check the whole buffer for valid nonces, even if there are none ... this leads (my guess) to an endless processing loop in Phoenix, which is the drawback of the kernel changes.
Just as a data point, cgminer does not do this. It does not check the whole buffer.

Hi Con,

That is correct, but CGMINER uses Phatk2, which seems currently to not work that well with SDK / runtime 2.6.
So it's again a trial and error to get the best tradeoff.

Dia

Edit: I found out that a Worksize of 64 with Phoenix works great now, this was much slower before 2.6.

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
BOARBEAR
Member
**
Offline Offline

Activity: 77


View Profile
December 22, 2011, 03:43:09 PM
 #46

For those who got less hash with 12.1 with cgminer

Try worksize 64 with vectors 4
-ck
Moderator
Legendary
*
Offline Offline

Activity: 1988


Ruu \o/


View Profile WWW
December 22, 2011, 09:03:56 PM
 #47

For those who got less hash with 12.1 with cgminer

Try worksize 64 with vectors 4
That's interesting because each GPU will report what its "preferred vector size" is, and often it comes out to 4, yet despite that, virtually all GPUs had much better performance with 2 vectors (on the older SDK). Perhaps they live up to their promise now?

gat3way
Sr. Member
****
Offline Offline

Activity: 256


View Profile
December 22, 2011, 10:58:56 PM
 #48

Nope, preferred vector size does not always mean "best performance". Wider vectors mean more GPRs used, and the more GPRs you use, the fewer wavefronts you can schedule on a CU, so occupancy goes down. Also, on VLIW5 hardware, uint4 vectors are not optimal: it can happen that there are not 5 non-dependent instructions to fill the whole VLIW bundle. That depends on your code.

For example, with hash cracking you might end up with uint8 being much better than uint4 for kernels like the MD5 or NTLM one, as ALUPacking goes up and the number of used GPRs is 10-20 at most. On the other hand, more complex algorithms like SHA512 are much better with uint2 or even a scalar implementation, as the number of used GPRs greatly hampers the occupancy. With memory-intensive kernels like DES ones (thank god bitcoin is not one), occupancy becomes even more important, as more concurrency means memory access latencies are more easily "hidden".

Back to uint2, problem with it is that it's even worse at utilizing all the slots in the VLIW bundle and your ALUPacking just always sucks. However, generally speaking with most bitcoin kernels it's a tradeoff worth having as bad occupancy in that particular case is worse than bad ALUPacking.

uint3 should provide a better balance between those, but it was broken in pre-2.6 APP SDK releases. Now they fixed it and I am curious about results...


PS: as for why uint4 suddenly started performing better, it could be either that they improved scheduling or that they improved the backend compiler to pack instructions better with uint4 / worse with uint2. It could actually be both.
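The GPR-versus-occupancy tradeoff described in this post can be put into rough numbers. The sketch below assumes a fixed budget of 256 registers per work-item lane; that figure and the function name are assumptions for illustration, not a spec for any particular card, and real hardware has extra limits (reserved temporaries, a cap on wavefronts) that are ignored here.

```python
GPR_BUDGET_PER_LANE = 256  # ASSUMPTION: illustrative register budget, not a hardware spec


def max_wavefronts(gprs_per_work_item, gpr_budget=GPR_BUDGET_PER_LANE):
    """How many wavefronts fit at once if each work-item needs this many GPRs."""
    if gprs_per_work_item <= 0:
        raise ValueError("a kernel uses at least one GPR")
    return gpr_budget // gprs_per_work_item


# Light kernel (10-20 GPRs, as in the MD5/NTLM example) vs heavier ones
for gprs in (10, 20, 64, 128):
    print(gprs, "GPRs ->", max_wavefronts(gprs), "wavefronts in flight")
```

Under these assumptions, doubling the register use halves the number of wavefronts that can be resident, which is why a wider vector that packs the VLIW bundle better can still lose overall.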
BOARBEAR
Member
**
Offline Offline

Activity: 77


View Profile
December 22, 2011, 11:24:36 PM
 #49

Oh btw there are two different versions of 12.1 preview

The one I tried is this:
http://developer.amd.com/Downloads/OpenCL1.2-Static-Cplus-preview-drivers-Windows.exe

It has a newer openCL than the other 12.1 preview
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
December 23, 2011, 02:07:11 AM
 #50

Nope, preferred vector size does not always mean "best performance". Wider vectors mean more GPRs used, and the more GPRs you use, the fewer wavefronts you can schedule on a CU, so occupancy goes down. Also, on VLIW5 hardware, uint4 vectors are not optimal: it can happen that there are not 5 non-dependent instructions to fill the whole VLIW bundle. That depends on your code.

For example, with hash cracking you might end up with uint8 being much better than uint4 for kernels like the MD5 or NTLM one, as ALUPacking goes up and the number of used GPRs is 10-20 at most. On the other hand, more complex algorithms like SHA512 are much better with uint2 or even a scalar implementation, as the number of used GPRs greatly hampers the occupancy. With memory-intensive kernels like DES ones (thank god bitcoin is not one), occupancy becomes even more important, as more concurrency means memory access latencies are more easily "hidden".

Back to uint2, problem with it is that it's even worse at utilizing all the slots in the VLIW bundle and your ALUPacking just always sucks. However, generally speaking with most bitcoin kernels it's a tradeoff worth having as bad occupancy in that particular case is worse than bad ALUPacking.

uint3 should provide a better balance between those, but it was broken in pre-2.6 APP SDK releases. Now they fixed it and I am curious about results...


PS: as for why uint4 suddenly started performing better, it could be either that they improved scheduling or that they improved the backend compiler to pack instructions better with uint4 / worse with uint2. It could actually be both.


I implemented uint3 quite a few months ago, but it was bugged (like you said) ... since 2.6 I'm able to compile my kernel via KernelAnalyzer and get no errors (the uint3 kernel is much longer, by the way). But I guess I did something wrong in the init.py from Phoenix, which I can't solve by myself ... Phoenix crashes when the kernel is started. Are you skilled enough to take a look at it?

Dia

gat3way
Sr. Member
****
Offline Offline

Activity: 256


View Profile
December 23, 2011, 04:44:36 PM
 #51

Not quite, Python is not one of my strong suits. I may rewrite my miner though, just for the experiment. Anyway, I have more important projects right now.
HolodeckJizzmopper
Member
**
Offline Offline

Activity: 106


View Profile
December 29, 2011, 04:31:35 AM
 #52

Confirming a 20% drop in hashing performance across a variety of cards using 11.12.

Fuck.

Time to roll back to older drivers.

I hate ATI so damned much sometimes.

Edit: Using phoenix/phatk
Diapolo
Hero Member
*****
Offline Offline

Activity: 769



View Profile WWW
December 29, 2011, 12:03:47 PM
 #53

Confirming a 20% drop in hashing performance across a variety of cards using 11.12.

Fuck.

Time to roll back to older drivers.

I hate ATI so damned much sometimes.

Edit: Using phoenix/phatk

Could you try this one (https://bitcointalk.org/index.php?topic=25860) with Phoenix 1.7.1 and report your results?

Dia

Fiyasko
Legendary
*
Offline Offline

Activity: 1428


Okey Dokey Lokey


View Profile
December 29, 2011, 08:17:42 PM
 #54

Confirming a 20% drop in hashing performance across a variety of cards using 11.12.

Fuck.

Time to roll back to older drivers.

I hate ATI so damned much sometimes.

Edit: Using phoenix/phatk

10% drop across all my cards. But no more CPU bug. Can't tell which is worse.

Fiyasko
Legendary
*
Offline Offline

Activity: 1428


Okey Dokey Lokey


View Profile
December 31, 2011, 07:02:37 AM
 #55

Confirming a 20% drop in hashing performance across a variety of cards using 11.12.

Fuck.

Time to roll back to older drivers.

I hate ATI so damned much sometimes.

Edit: Using phoenix/phatk

10% drop across all my cards. But no more CPU bug. Can't tell which is worse.

So can anyone bring their hash rate back to normal with Cat 12.1? I love its gaming performance, but I'm losing 10% of my speed so that I can save 17% of my CPU usage... I'd rather not burn out one of my cores; on top of that, it's a waste of power.


conspirosphere.tk
Legendary
*
Offline Offline

Activity: 1862


Revolution will be decentralized


View Profile WWW
January 02, 2012, 09:03:19 PM
 #56

I confirm a 30+ MH/s drop on both a 5870 and a 5830 with both CataCLYSM 11.12 and 12.1 compared with 11.6.
What is MUCH WORSE is that you cannot revert back effectively: uninstalling and driver sweeping before rebooting and reinstalling 11.6 will NOT return your previous performance.
Luckily, I had a fresh system backup made just yesterday that I could restore no prob.
Now I am back to mining happily with my 5870@970/300 hashing at 440+ MH/s and a 5830@940/300 hashing at 300+ MH/s with Phoenix 1.7 on XP.

Fiyasko
Legendary
*
Offline Offline

Activity: 1428


Okey Dokey Lokey


View Profile
January 02, 2012, 09:58:54 PM
 #57

We really REALLY need to optimise these new drivers...
Some of us still play videogames and mine coins, y'know... It's not all just dedicated miners running.

bal3wolf
Sr. Member
****
Offline Offline

Activity: 426



View Profile
January 02, 2012, 10:05:38 PM
 #58

Mining works well for me: no CPU usage and still good perf, but I've been crashing in games, so I'm going back to try 11.12.

my btc address 1LRWTJS3rf8ubG2oMjcm7CmGGDJQSomdRP
Mining tools and Drivers
Fiyasko
Legendary
*
Offline Offline

Activity: 1428


Okey Dokey Lokey


View Profile
January 03, 2012, 06:52:26 AM
 #59

Mining works well for me: no CPU usage and still good perf, but I've been crashing in games, so I'm going back to try 11.12.
...Wtf...
On 12.1 I've never crashed yet...
I'm LOVING these drivers, but losing 10% mining speed really hurts.

bal3wolf
Sr. Member
****
Offline Offline

Activity: 426



View Profile
January 06, 2012, 06:42:50 PM
 #60

It would crash in Dead Island after 1-4 hrs at random. 11.12 does not, and it still has 0% CPU usage, but the lower mining speed too.
