I just want to point out that the poster did not find this; rethaw did, and he detailed it in his post here:
http://bitcointalk.org/index.php?topic=23067.0
So if you are going to donate to anyone, I'd say let it go to the person that originally discovered this.
I might be wrong about this, but I'm pretty sure a lot of updated miners have already incorporated this as well. Check for yourself, though; rethaw's instructions are pretty detailed.
I have been running with the modification myself for about a month: no difference in stales, but a slight bump in MH/s.
--------------EDIT-----------------
So I might be wrong about who originally came up with this, because the post referenced in the one I linked is by someone else again. Anyway, this has been documented before.
If you actually look at what I did, it's different from that other modification. I'm not even sure if it really increases speed, or if I'm just having a lucky day with my miner.
I changed
#define Ma2(x, y, z) ((y & z) | (x & (y | z)))
to
#define Ma2(x, y, z) amd_bytealign((z^y), (x), (y))
The other modification is
#define Ma(x, y, z) amd_bytealign((y), (x | z), (z & x))
to
#define Ma(x, y, z) amd_bytealign( (z^x), (y), (x) )
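For anyone who wants to sanity-check these rewrites, here is a minimal C sketch. It assumes that amd_bytealign in these kernels is only a placeholder that the miner patches to BFI_INT (a bitwise select), not the real byte-align operation; if that assumption is wrong for your kernel, the check does not apply. Under that assumption, all three amd_bytealign forms above should give the same bitwise majority as the plain Ma2 expression. The helper names bfi, maj and count_mismatches are made up for this example.

```c
#include <stdint.h>

/* ASSUMPTION: amd_bytealign(a, b, c) is patched to BFI_INT, i.e. a bitwise
   select: take the bits of b where mask is 1 and the bits of c where it is 0. */
uint32_t bfi(uint32_t mask, uint32_t b, uint32_t c) {
    return (mask & b) | (~mask & c);
}

/* Plain majority, written exactly like the original Ma2 macro. */
uint32_t maj(uint32_t x, uint32_t y, uint32_t z) {
    return (y & z) | (x & (y | z));
}

/* Every operation here is bitwise, so checking the 8 combinations of
   all-zero / all-one words covers every possible input bit pattern.
   Returns the number of disagreements with the plain majority. */
int count_mismatches(void) {
    int bad = 0;
    for (int i = 0; i < 8; i++) {
        uint32_t x = (i & 4) ? 0xFFFFFFFFu : 0u;
        uint32_t y = (i & 2) ? 0xFFFFFFFFu : 0u;
        uint32_t z = (i & 1) ? 0xFFFFFFFFu : 0u;
        uint32_t ref = maj(x, y, z);

        if (bfi(z ^ y, x, y) != ref) bad++;     /* modified Ma2 */
        if (bfi(y, x | z, z & x) != ref) bad++; /* Ma as (y), (x | z), (z & x) */
        if (bfi(z ^ x, y, x) != ref) bad++;     /* Ma as (z^x), (y), (x) */
    }
    return bad;
}
```

Call count_mismatches() from a small main; a result of 0 means the forms agree. This says nothing about speed, only that the results match.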
EDIT: Leaving this up so nobody asks the same thing again