OK - I know I'm resuscitating an old thread, but it was mine. So I think I'm allowed to.
A month on, the original 4 cards that died have been joined by two more.
Guess what? Both were XFX brand cards.
So far, out of the batch of three XFX 5850 'new style' cards I bought from videocardshop.co.uk (with the spiral heatpipe fansinks - pretty much identical to the new XFX 5830, but faster due to the better GPU), only one is still operating.
Not to be outdone, of their slower brothers, the XFX 5830 'new style' cards (of which I foolishly bought four), only two are still working. I may be able to cut the number of failures here down to one, since one of these cards was spiking up to 100˚C randomly, and on close inspection it clearly had a dry joint on the connector for the fan on the GPU board. So assuming the GPU isn't irrevocably toasted, re-soldering the fan connector *may* fix it.
However, out of 7 new-model XFX cards, all using the same basic card design and identical spiral heatpipe fansinks, I've had a 66.7% failure rate on the 5850s and a 50% failure rate on the 5830s.
The other 'fast' XFX cards I own are a pair of XFX 5850 Black Edition cards, which I was hoping to be good mining cards. Failure rate on these is 50%, with one card preventing two of my test logic boards from even making a POST (no beeps, lights, nothing)... I'm not putting that card anywhere *near* my production miners just in case it trashes the logic board.
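For anyone who wants to check the arithmetic, here is a quick Python sketch of those failure rates. The counts come straight from the figures above (failed = bought minus still working, with the dry-joint 5830 counted as failed for now); the dictionary and function names are purely illustrative.

```python
# (cards bought, cards failed so far), as stated in the post.
xfx_cards = {
    "XFX 5850 'new style'": (3, 2),
    "XFX 5830 'new style'": (4, 2),
    "XFX 5850 Black Edition": (2, 1),
}


def failure_rate(bought, failed):
    """Return the failure rate as a percentage of cards bought."""
    return 100.0 * failed / bought


for model, (bought, failed) in xfx_cards.items():
    rate = failure_rate(bought, failed)
    print(f"{model}: {failed}/{bought} failed = {rate:.1f}%")

# Combined figure for the higher-power XFX cards listed above.
total_bought = sum(bought for bought, _ in xfx_cards.values())
total_failed = sum(failed for _, failed in xfx_cards.values())
print(f"Combined: {total_failed}/{total_bought} failed = "
      f"{failure_rate(total_bought, total_failed):.1f}%")
```

With samples this small the percentages swing a lot per card, so treat them as a rough indication rather than a proper statistic.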
Oddly, I've got four XFX 5770 single-slot cards - which are amazingly decent (cross fingers... don't want instant failures...) - I've loaded a tower-case PC with all four cards and it's quiet, reliable and low-power. So it's not as if I should have learned my lesson about XFX back when I bought the first Black Edition 5850s and found the quality terrible. The 5770 single-slot cards may not appear or feel 'military grade' but they're very reliable, so far (non-stop mining, overclocked, undervolted).
Anything higher power from XFX - 5830s, 5850s, of all types - has had absolutely unacceptable failure rates here at Catfish Towers. I have to speak to videocardshop.co.uk but having done some research on the Internet, the Americans have found XFX quality to be similarly appalling and the general experience appears to be endless RMAs.
I'm mainly a Mac OS X guy (and was before Bitcoin), so virtually all of my kit is made by Apple. Hence I haven't had to do 'RMAs' at all - there are 11 Mac computers in this office (my cellar...) and only one ever had to be complained about to Apple (at an Apple Store). They gave me a deal-you-can't-refuse on a new Mac Pro as a result (my Quad G5's liquid cooling system started losing efficiency). So I'm not used to (a) DOA or poor quality kit that fails after a week or less; or (b) arguing with telephone retailers about whether I'm entitled to a refund or not.
The cards were bought purely to run Bitcoin mining OpenCL kernels. Unless it says that the cards aren't suited for this on the box, I think it's OK to assume that the cards should work - albeit at a slightly reduced lifespan. Remember that I have open-frame shelf rigs with plenty of airflow, so I'm not cooking my cards. However if some retail telephonist decided to accuse *me* of being the cause of the cards failing, then I'd either lose my rag beyond all belief (getting me nowhere), or have no real answer other than 'why are all the dead cards your XFX models, and the rest still work?'...
I've scanned the American NewEgg website for my particular XFX cards, and checked the feedback. Ridiculous numbers of people got duff cards and had to send them back, at their expense, to XFX. But even though this cost each customer $50 or whatever, they still gave good feedback. I guess customer service in general must be pretty lousy in the USA, but I'm not playing this game.
I noticed that my XFX cards appeared to be 'used' before I installed them - some had the cling-film on the card front removed already, some had the accessories (DVI-VGA adapter, etc.) removed, and some honest chap had put *three* current PC games back in the box (if anyone wants these games, give me a PM. I'm an old bloke with loads of Mac machines, and although I loved PC first-person shooters AGES ago - my favourites ever were Heretic, Quake 2 and Half-Life - I don't play games any more).
So perhaps every XFX card I bought was a returned item, which could have been returned dishonestly after an overclocking experiment gone wrong, or worse, a BIOS flash gone wrong. These items were marked 'new' by the retailer, so maybe I've got a leg to stand on, but being honest, I bought them to mine BTC - i.e. tuned to run at their maximum. Mine haven't failed after gross overclocking - two were DOA, the others failed within a few days of *non* overclocked (but memory underclocked) work. Underclocking the memory *reduces* temperatures...
The problem is that I have two 6950s from XFX... neither has failed yet, but I'm concerned. A large part of my hash power would evaporate if all XFX cards died. Half of them already have, and I can't handle many more failing.
This isn't a pop at videocardshop.co.uk - I'll have a public pop at them if they tell me to eff off when it comes to my complaints, but since I haven't spoken to them about these XFX cards yet, they may be very understanding. However, it's a big warning to any of you thinking you can save a bit of money by buying these XFX cards in the UK from this retailer. The prices are very attractive... but my sample size is big enough now to make a statement about *why* the prices are so attractive. Stay away from XFX graphics cards. Even with open frames, loads of cool high-flow air, and plenty of PSU headroom, just check out the percentage failure rates above.
(Unless you go for the single-slot 5770s, which seem to all handle 200 MH/s each happily at 62-70˚C... and seem perfectly reliable. Odd)...
Anyone else found identical issues with XFX brand graphics cards? My best cards are the two Asus DirectCU 6950s, but apart from two expensive cards (too small a sample size), the brand I've had sustained success with is Sapphire... the original 5850 has been running 990 MHz for months, the 5850 'extreme' cards are still hammering away (well they will be, I'm about to replace the XFX cards with them), and the dual-slot 5770s clock to the moon.
And if Sapphire are known to be 'bad' - what would be recommended if you were considering building another 12-GPU shelf and wanted to build it and forget it... i.e. not the daily messing around I'm currently experiencing, not knowing whether my logic boards are failing or whether the GPU has failed / is rescuable / needs killing with fire.
I'm getting pissed off now, back to 'this is why I started using Macs' - whereas the initial PC-building was *entertaining*. The XFX boards have trashed it. Am I just unlucky - or is it well-known that bitcoin miners should avoid XFX boards?
Lastly... when graphics boards do this, is there any hope of recovering them, maybe by refinishing the GPU / heatsink interface, cleaning the whole thing up, etc. (none of the boards show burnt-out components or anything obviously knackered) - or is the damage likely internal and unfixable? Looks like around £1,000 total loss right now. Grrr.