Hey guys, figured I'd make my first post useful.
If you have a 58xx-series card (other series may be affected too) and upgraded your Catalyst drivers to the latest version, you may have noticed that you are getting nothing but this:
GPU0: invalid nonce - HW error
Of course, this is no good! Any result flagged as invalid is never sent, so you are contributing nothing to whatever pool you might be on!
The reason seems to be that amd_bytealign either works differently or is no longer usable in the 13.4 drivers (and probably a few versions before them). The phatk kernel that cgminer and bfgminer use relies on amd_bytealign as a stand-in for the BFI_INT optimization; with the newer drivers, however, "bitselect" is automatically optimized to BFI_INT without us having to do anything!
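To make the difference concrete, here is a minimal Python sketch (Python only so it runs without a GPU) that models both instructions on 32-bit values and compares them with the plain-logic Ch from the kernel. The helper names are my own, and the amd_bytealign model is an assumption based on my reading of the cl_amd_media_ops extension, so treat this as an illustration rather than what the driver actually does.

```python
import random

MASK = 0xFFFFFFFF  # keep every value in 32 bits, like an OpenCL uint


def ch_reference(x, y, z):
    """SHA-256 Ch built from plain logic, as in the kernel's non-BITALIGN path."""
    return (z ^ (x & (y ^ z))) & MASK


def bitselect(a, b, c):
    """OpenCL bitselect: take each bit from b where c is 1, from a where c is 0."""
    return ((a & ~c) | (b & c)) & MASK


def amd_bytealign(src0, src1, src2):
    """Assumed model: shift the 64-bit pair src0:src1 right by (src2 & 3) bytes."""
    return (((src0 << 32) | src1) >> ((src2 & 3) * 8)) & MASK


def count_mismatches(candidate, trials=1000, seed=1):
    """Count random inputs where candidate(x, y, z) differs from ch_reference."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x, y, z = (rng.getrandbits(32) for _ in range(3))
        if candidate(x, y, z) != ch_reference(x, y, z):
            bad += 1
    return bad


# Ch(x, y, z) written as bitselect(z, y, x) never disagrees with the reference.
print(count_mismatches(lambda x, y, z: bitselect(z, y, x)))  # prints 0
# The amd_bytealign placeholder, if it is never patched, disagrees on random input.
print(count_mismatches(amd_bytealign))
```

If the placeholder really is being left unpatched on the new drivers, wrong hashes like these would be consistent with the "invalid nonce" errors, though I have not confirmed that this is exactly what happens inside the driver.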
So, how do you fix it? Simple!
1. To start, navigate to your miner's folder and look for a file that begins with "phatk" and ends with ".cl". This is the OpenCL source code for the kernel, and since it's plain text it can be opened in any text editor (I highly recommend Notepad++!).
2. The inside of this file may look a bit daunting, but don't worry, only a few small changes are needed! Around line 61 (depending on your miner version) you'll see the following, or something very similar:
#ifdef BITALIGN
#pragma OPENCL EXTENSION cl_amd_media_ops : enable
#define rot(x, y) amd_bitalign(x, x, (uint)(32 - y))
// This part is not from the stock poclbm kernel. It's part of an optimization
// added in the Phoenix Miner.
// Some AMD devices have Vals[0] BFI_INT opcode, which behaves exactly like the
// SHA-256 Ch function, but provides it in exactly one instruction. If
// detected, use it for Ch. Otherwise, construct Ch out of simpler logical
// primitives.
#ifdef BFI_INT
// Well, slight problem... It turns out BFI_INT isn't actually exposed to
// OpenCL (or CAL IL for that matter) in any way. However, there is
// a similar instruction, BYTE_ALIGN_INT, which is exposed to OpenCL via
// amd_bytealign, takes the same inputs, and provides the same output.
// We can use that as a placeholder for BFI_INT and have the application
// patch it after compilation.
// This is the BFI_INT function
#define Ch(x, y, z) amd_bytealign(x,y,z)
// Ma can also be implemented in terms of BFI_INT...
#define Ma(z, x, y) amd_bytealign(z^x,y,x)
#else // BFI_INT
// Later SDKs optimise this to BFI INT without patching and GCN
// actually fails if manually patched with BFI_INT
#define Ch(x, y, z) bitselect((u)z, (u)y, (u)x)
#define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
#define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)
#endif
#else // BITALIGN
#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Ma(x, y, z) ((x & z) | (y & (x | z)))
#define rot(x, y) rotate((u)x, (u)y)
#define rotr(x, y) rotate((u)x, (u)(32-y))
#endif
The part that breaks under Catalyst 13.4 is the "#ifdef BFI_INT" branch, which uses amd_bytealign, so we need to change it!
3. Delete the above block of code (approx. lines 49 to 86 inclusive) and replace it with the following:
#ifdef BITALIGN
#pragma OPENCL EXTENSION cl_amd_media_ops : enable
#define rot(x, y) amd_bitalign(x, x, (uint)(32 - y))
// This part is not from the stock poclbm kernel. It's part of an optimization
// added in the Phoenix Miner.
// Some AMD devices have Vals[0] BFI_INT opcode, which behaves exactly like the
// SHA-256 Ch function, but provides it in exactly one instruction. If
// detected, use it for Ch. Otherwise, construct Ch out of simpler logical
// primitives.
// We have an SDK which automatically optimizes to BFI_INT, so let's use bitselect directly
#define Ch(x, y, z) bitselect((u)z, (u)y, (u)x)
#define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
#define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)
#else // BITALIGN
#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Ma(x, y, z) ((x & z) | (y & (x | z)))
#define rot(x, y) rotate((u)x, (u)y)
#define rotr(x, y) rotate((u)x, (u)(32-y))
#endif
This removes the amd_bytealign path entirely, which is the logic that was causing the problems.
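If you want some reassurance that the replacement macros still compute the right thing, the following Python sketch checks the bitselect form of Ma and the amd_bitalign forms of rot and rotr against plain-logic versions on random 32-bit inputs. The function names are hypothetical, and the amd_bitalign model is my assumption of how the cl_amd_media_ops instruction behaves, so this is a sanity check of the maths, not a test of your driver.

```python
import random

MASK = 0xFFFFFFFF  # keep every value in 32 bits, like an OpenCL uint


def bitselect(a, b, c):
    """OpenCL bitselect: take each bit from b where c is 1, from a where c is 0."""
    return ((a & ~c) | (b & c)) & MASK


def amd_bitalign(src0, src1, src2):
    """Assumed model: shift the 64-bit pair src0:src1 right by (src2 & 31) bits."""
    return (((src0 << 32) | src1) >> (src2 & 31)) & MASK


def rotl(x, n):
    """Plain 32-bit rotate left, equivalent to OpenCL rotate(x, n)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def macros_agree(trials=1000, seed=7):
    """Return True if the BITALIGN macros match the fallback macros on random input."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (rng.getrandbits(32) for _ in range(3))
        n = rng.randrange(1, 32)
        # Ma(x, y, z): bitselect form versus the majority function
        if bitselect(x, y, z ^ x) != ((x & z) | (y & (x | z))):
            return False
        # rot(x, n): amd_bitalign(x, x, 32 - n) versus rotate left by n
        if amd_bitalign(x, x, 32 - n) != rotl(x, n):
            return False
        # rotr(x, n): amd_bitalign(x, x, n) versus rotate left by 32 - n
        if amd_bitalign(x, x, n) != rotl(x, 32 - n):
            return False
    return True


print(macros_agree())  # prints True
```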
4. If your miner is running, shut it down. If you have any leftover files starting with "phatk" and ending in ".bin", it's probably best to delete them: they are cached compiled kernels, and the miner will rebuild them from the edited ".cl" file.
5. Start up your miner. It should now start getting shares accepted!
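If you would rather not hunt for the files by hand, steps 1 and 4 can be sketched in a few lines of Python. The function names here are hypothetical, and I am assuming the kernel files sit directly in the miner's folder as described above. By default it only lists the cached binaries; nothing is deleted unless you pass dry_run=False.

```python
from pathlib import Path


def find_kernel_sources(miner_dir):
    """Return the phatk*.cl files in miner_dir (the files to edit in step 1)."""
    return sorted(Path(miner_dir).glob("phatk*.cl"))


def remove_cached_binaries(miner_dir, dry_run=True):
    """Return the phatk*.bin files in miner_dir, deleting them if dry_run is False."""
    cached = sorted(Path(miner_dir).glob("phatk*.bin"))
    if not dry_run:
        for path in cached:
            path.unlink()
    return cached


if __name__ == "__main__":
    miner_dir = "."  # assumption: the script is run from inside the miner's folder
    print("kernel sources:", [p.name for p in find_kernel_sources(miner_dir)])
    print("cached binaries:", [p.name for p in remove_cached_binaries(miner_dir)])
```

Run it once as-is to see what it would remove, then call remove_cached_binaries(miner_dir, dry_run=False) with the miner shut down.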
Hope this has helped. Note that this is modified from the following post regarding a similar problem on poclbm:
https://bitcointalk.org/index.php?topic=221041.0
Happy mining!