I'll try to keep this short. I need some help - anyone got any ideas?
1. I bought 10 nicely priced XFX new-style 58xx cards. 5x 5850 and 5x 5830. Both use the same, new board design, with a single heatsink on the GPU only, a round copper and aluminium job with a single fan. The card isn't entirely shrouded like most - it's a bare card with a small plastic shroud on top of the fan, only there so XFX can put their branding on it. It's cosmetic and not functional. The card actually ends up bowing in the middle if it's not secured into a PC... they don't look too clever in my open frame rigs:
2. Here are some pics of what they look like (mine aren't perfectly flat like the first image):
Apologies to bit-tech.net for ripping off their images, but they've got good all-round photos of the cards, and it saves bandwidth on my net connection, since I'd otherwise just be hosting identical photos.
The 5850 and 5830 versions of the card are identical except that, oddly, the 5830 has MORE components on the right-hand side of the PCB (the power conversion stages).
3. Some cards, *I think*, may be functional were it not for the dry joint / loose fan connection plug - I'll try soldering the pins back into place, but it doesn't bode well for QC...
4. Booting a test rig with any of the 'broken' cards simply gives a black screen. Only two cards (of 10) are proper-failed like this.
5. Booting a test rig with any of the SIX unusable cards works fine. I don't run Windows anywhere except in a locked-down virtual machine under VMware on a souped-up MacBook Air, but I have a bootable DOS flash drive for flashing the BIOSes of logic boards and GPUs. In all cases, these cards work fine in DOS and in the logic board BIOS. Even the EFI BIOSes becoming popular these days work fine - i.e. the flashy 'visual eye candy' in the BIOS screens works *without any artefacts*. VGA mode, effectively, works perfectly.
6. Installing Linux, of three separate flavours, works perfectly. In other words, the radeon open-source drivers make the card appear to work fine.
7. Installing the proprietary Catalyst drivers (11.6 is my favourite, but I've tried the latest 11.11 too) results in an immediate display crash and system lock-up as soon as the driver modules are loaded into the kernel.
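Since the behaviour in points 6 and 7 depends entirely on which kernel module has claimed the card, it may help to confirm that before each test. Below is a minimal Python sketch that reads the standard Linux sysfs PCI tree and reports the driver bound to each display-class device. The function name `gpu_kernel_drivers` is made up for illustration; the expectation that the open-source driver shows up as `radeon` and Catalyst as `fglrx` is based on the usual module names, not on anything checked on these cards.

```python
import os


def gpu_kernel_drivers(pci_root="/sys/bus/pci/devices"):
    """Map each display-class PCI device address to its bound kernel driver.

    Returns a dict such as {"0000:01:00.0": "radeon"}; the value is None
    when no driver has claimed the device.
    """
    result = {}
    for addr in sorted(os.listdir(pci_root)):
        dev = os.path.join(pci_root, addr)
        try:
            with open(os.path.join(dev, "class")) as f:
                pci_class = f.read().strip()
        except OSError:
            continue  # not a PCI device directory, or unreadable
        # PCI base class 0x03 means "display controller".
        if not pci_class.startswith("0x03"):
            continue
        link = os.path.join(dev, "driver")
        # The 'driver' symlink only exists once a module has bound the device.
        if os.path.islink(link):
            result[addr] = os.path.basename(os.readlink(link))
        else:
            result[addr] = None
    return result
```

Calling `gpu_kernel_drivers()` with no arguments on the test rig and printing the result shows at a glance whether a given boot is running on the open-source or the proprietary module.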
8. The failure mode is a black screen with a checkerboard pattern of red squares. I can't find an easy image to copy off the internet, but this pattern is clearly the same everywhere: here is a random Google artefact picture, which shows a snooker game with black squares in a distinctive pattern superimposed on top. On my systems, imagine a black screen with the distinctive pattern of *red* squares on top. It's the same every time - whether 5850 or 5830:
9. The correct functioning under DOS and VGA conditions suggests the card itself isn't broken, no? But the failure every time the proprietary drivers are loaded strongly suggests that the failure is down to power management in the card. I've googled this problem to death and it appears that the open-source and VGA drivers don't put the card into 'idle power-saving' mode, but run at the GPU's BIOS power level 0, which is full 'boot' power. On loading the BIOSes into RBE, each one has 4 or 5 different clock and voltage settings...
10. Additional evidence for this is that the rigs consume a constant, high amount of power even doing nothing when in the BIOS or in DOS, or in Linux before the proprietary drivers are loaded. Installing the proprietary drivers brings a large drop in the power consumption of the box (measured with a kill-a-watt type device). This suggests that the alternate clock / voltage levels of the card are only accessed by the proprietary drivers, and switching these clocks / voltages causes the crash.
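One possible way to test the power-state theory without Catalyst in the picture: the open-source radeon driver has a profile-based power management interface in sysfs, which should let the card be pushed into a lower power state by hand. The sketch below assumes the driver exposes `power_method` and `power_profile` files under `/sys/class/drm/card0/device`, that the card is `card0`, and that the script runs as root; none of this has been verified on these particular cards, and the helper names are made up for illustration.

```python
from pathlib import Path

# Profile names the radeon profile-based power management is assumed to accept.
VALID_PROFILES = {"default", "auto", "low", "mid", "high"}


def read_power_state(device_dir="/sys/class/drm/card0/device"):
    """Return the current power method and profile, for whichever files exist."""
    dev = Path(device_dir)
    state = {}
    for name in ("power_method", "power_profile"):
        path = dev / name
        if path.exists():
            state[name] = path.read_text().strip()
    return state


def set_power_profile(profile, device_dir="/sys/class/drm/card0/device"):
    """Switch to profile-based power management and select a profile."""
    if profile not in VALID_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    dev = Path(device_dir)
    # Select the 'profile' method first, then the profile itself.
    (dev / "power_method").write_text("profile\n")
    (dev / "power_profile").write_text(profile + "\n")
```

If `set_power_profile("low")` produces the same red checkerboard under the open-source driver, that would point at the card's low-power states rather than at Catalyst itself; if the card stays stable and the wall-socket reading drops, the problem is more likely specific to how the proprietary driver switches states.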
11. I've tried writing a hacked BIOS to the card which sets the same 725 MHz core / lower 300 MHz memory / 1.088 V values for ALL power states - hoping that one of the 'low power' states was unstable and that locking ALL voltages and clocks to the Boot level (which works) would solve the problem... but it doesn't. I'm reasonably sure that the BIOS flash worked correctly, because after a load of tests, I gave up and tried slinging a 5870 BIOS onto my test 5850, and the card won't boot any more.
So the flash process works - but there's still something in the drivers that causes the crash.
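To make the point 11 experiment concrete, here is a small Python model of what the hacked BIOS does to the power state table. It is only an illustration of the idea, not a BIOS editor: the `PowerState` type and `lock_to_boot` helper are invented for this sketch, and the non-boot entries in the example table are placeholder numbers, not values read from these cards. Only the 725 MHz / 300 MHz / 1.088 V entry comes from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    core_mhz: int
    mem_mhz: int
    vddc: float  # core voltage in volts


def lock_to_boot(states, boot_index=0):
    """Return a table of the same length where every state copies the boot state."""
    boot = states[boot_index]
    return [boot for _ in states]


# Entry 0 is the boot level used in the hacked BIOS (values from the post).
# The remaining entries are PLACEHOLDERS standing in for the lower power states.
EXAMPLE_STATES = [
    PowerState(core_mhz=725, mem_mhz=300, vddc=1.088),
    PowerState(core_mhz=400, mem_mhz=300, vddc=1.000),  # placeholder
    PowerState(core_mhz=150, mem_mhz=300, vddc=0.950),  # placeholder
]

LOCKED_STATES = lock_to_boot(EXAMPLE_STATES)
```

With every entry identical, any state switch the driver performs should land on the same clocks and voltage, which is why a crash that survives this change suggests the trigger is something other than the clock and voltage values themselves.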
I'm all out of ideas now - anyone seen this and successfully fixed it, or does everyone simply play the XFX lose-all-price-savings-on-paying-postage warranty return merry-go-round game? The 'new' cards looked 'used' to me - as in refurbished - so I have no doubt that any return (if sanctioned) would result in me being sent another used card - with just as much potential for needing *another* return. I don't have time to repackage and send off 10 cards repeatedly, and in the UK, the postage costs are paid by *me*. Multiple returns would be economically pointless...