Okay, here's a free tip for anybody using the Cypress chip (5870, 5970): the new phoenix miner 1.3 seems to go a bit faster with worksize=64 than with worksize=128. I think it's the same on the new poclbm too, i.e., -w 64 is now better than -w 128. It must be something to do with the BFI_INT flag being enabled. Check it out. (I waited until the new difficulty was set to release this info, but I've been using it for a few days already.)
You can thank me later with tips.
EDIT: After more testing, as per the discussion below, leaving the worksize unset (the default) is better than both -w 64 and -w 128.
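
If you want to check this on your own card rather than take my word for it, here's a rough sketch of how you could compare the settings. It does not run the miner for you: you note down a few hash-rate readings per setting yourself and it picks the one with the best median. The function name best_worksize and the labels "default", "64" and "128" are just names made up for this example, not anything from phoenix or poclbm.

```python
from statistics import median


def best_worksize(samples):
    """Pick the worksize setting with the highest median hash rate.

    samples maps a label you choose (e.g. "default", "64", "128") to a list
    of hash-rate readings that you measured yourself, all in the same unit.
    Returns (best_label, medians) where medians maps each label to its median.
    """
    # Ignore settings for which no readings were recorded.
    medians = {label: median(rates) for label, rates in samples.items() if rates}
    if not medians:
        raise ValueError("no measurements given")
    # The median is less sensitive than the mean to a single odd reading.
    best = max(medians, key=medians.get)
    return best, medians
```

Take several readings per setting after the miner has settled, and keep everything else (clocks, other flags) the same between runs, otherwise the comparison doesn't mean much.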