Bitcoin Forum
Topic: first GPU out of 4 is hotter and causes hardware errors
Pompobit (OP)
January 25, 2014, 10:06:05 PM
 #1

Hi,

I have a mining rig with 4x 280x on an Asrock 970 ex4, connected with powered 16x to 1x riser cables. Two cards are connected to the two PCI-E 16x slots, the remaining two to PCI-E 1x slots.

Each card averages about 730 Kh/s, but GPU 0 behaves very differently from the others:

- GPU 0 runs at about 80 °C, the rest at 72-73 °C.
- GPU 0 gets hardware errors. Only a few, and usually right after cgminer starts. The other GPUs NEVER generate HW errors with the same settings.
- No matter which physical card ends up as GPU 0, swapping cables and slots among the cards doesn't solve anything. GPU 0, the one attached to the PCI-E 16x slot nearest the CPU, always behaves the same way.


Any ideas or suggestions?
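
For anyone who wants to check the same symptoms on their own rig, here is a minimal sketch of the comparison described above: flag a card that runs clearly hotter than the rest, or that is the only one reporting hardware errors. It assumes the per-GPU readings are already available as dictionaries; the key names mirror fields cgminer's API reports for GPUs, but the function `flag_outliers`, the 5-degree margin, and the sample numbers (including the HW error count of 3) are illustrative assumptions, not measurements from this rig.

```python
from statistics import median


def flag_outliers(devs, temp_margin=5.0):
    """Return (gpu_index, reasons) for each GPU that stands out from the others.

    devs is a list of dicts with the keys "GPU", "Temperature" and
    "Hardware Errors". temp_margin is an arbitrary illustrative threshold.
    """
    flagged = []
    for dev in devs:
        others = [d for d in devs if d["GPU"] != dev["GPU"]]
        if not others:
            continue  # nothing to compare a single card against
        reasons = []
        # Compare this card against the median temperature of the other cards.
        baseline = median(d["Temperature"] for d in others)
        delta = dev["Temperature"] - baseline
        if delta > temp_margin:
            reasons.append(f"{delta:.1f} degrees above the median of the other cards")
        # A card is suspicious if it alone reports hardware errors.
        if dev["Hardware Errors"] > 0 and all(d["Hardware Errors"] == 0 for d in others):
            reasons.append("only card reporting hardware errors")
        if reasons:
            flagged.append((dev["GPU"], reasons))
    return flagged


# Hypothetical readings shaped like the situation in the post above.
sample = [
    {"GPU": 0, "Temperature": 80.0, "Hardware Errors": 3},
    {"GPU": 1, "Temperature": 72.0, "Hardware Errors": 0},
    {"GPU": 2, "Temperature": 73.0, "Hardware Errors": 0},
    {"GPU": 3, "Temperature": 73.0, "Hardware Errors": 0},
]

for gpu, reasons in flag_outliers(sample):
    print(f"GPU {gpu}: " + "; ".join(reasons))
```

Comparing against the median of the other cards, rather than the average of all four, keeps one hot card from dragging the baseline up.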
pixl8tr
January 25, 2014, 10:18:44 PM
 #2

It could be a number of things. My first suggestion is to see if it's the card or its position in your rig. EDIT: oh, I see that you have tried that...
See what happens when you switch positions with one of the other cards. If the issue moves with the card, you may just have a "weak" GPU. I have run across a few of those myself that I had to replace.

Otherwise, make sure the intake fan is not sucking hot air from a nearby component and that the airflow is not blocked in any way. Take a small case fan and try to blow directly on the card to make sure it's getting enough fresh air.

who | grep -i blonde | date; cd ~; unzip; touch; finger; bjobs; uptime; strip;. grab; mount; yes; umount; sleep; brun;
Donations: 18ByQvDUmaMKkQbYvUWmnPSu9BWeNxVMoc
Pompobit (OP)
January 25, 2014, 10:26:44 PM
 #3

Quote from: pixl8tr on January 25, 2014, 10:18:44 PM
It could be a number of things. My first suggestion is to see if it's the card or its position in your rig. EDIT: oh, I see that you have tried that...
See what happens when you switch positions with one of the other cards. If the issue moves with the card, you may just have a "weak" GPU. I have run across a few of those myself that I had to replace.

Otherwise, make sure the intake fan is not sucking hot air from a nearby component and that the airflow is not blocked in any way. Take a small case fan and try to blow directly on the card to make sure it's getting enough fresh air.

Thank you for your help.

By the way, the position of the card is not the problem: the same card in the same position runs much cooler if it is not detected as GPU0, and it stops generating HW errors :\
pixl8tr
January 25, 2014, 10:39:19 PM
 #4

You mentioned it has powered risers. Could there be an issue with the power supply on the riser? I.e., did you also switch the risers when you switched the cards' positions?
The other thought I had: is GPU0 near the CPU? Could the heat from the CPU be overheating the card?

Pompobit (OP)
January 25, 2014, 10:44:53 PM
 #5

Quote from: pixl8tr on January 25, 2014, 10:39:19 PM
You mentioned it has powered risers. Could there be an issue with the power supply on the riser? I.e., did you also switch the risers when you switched the cards' positions?
The other thought I had: is GPU0 near the CPU? Could the heat from the CPU be overheating the card?


Unfortunately, I also swapped the riser cables, with no effect.

No: the GPU0 slot is near the CPU, but the card itself is far from it.
pixl8tr
January 26, 2014, 12:47:52 AM
 #6

Can you confirm that the fan spins on the card that is in the GPU0 slot? Maybe there is something wrong with the signal lines or pins in the socket? Try having a fan blow directly on the card. If that helps, you know there is something wrong with the cooling in that spot. Other than that, if you can post a picture, maybe we can see something.

Wipeout2097
January 26, 2014, 01:03:27 AM
 #7

Check in GPU-Z which PCI-e speed and version GPU0 is running at. PCI-e v3.0 is recent.
Also, the motherboard may be overvolting the PCI-e slot, given how much competition there is among OC-friendly motherboards.

Then see if you can change the PCI-e frequency and latency in the BIOS or with tweaking software. Disable sound, FireWire and whatever else is not required for mining.

What are your thread-concurrency settings? Some values will give HW errors at the very beginning, as you mention.
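
To make the thread-concurrency question concrete: cgminer accepts comma-separated per-GPU values for options such as thread-concurrency and intensity, so each card's setting can be listed and changed individually. Below is a minimal sketch that builds such a config fragment. The helper names `per_gpu` and `build_config` are made up for illustration, and the values 8192 and 13 are placeholders, not settings recommended for this rig.

```python
import json


def per_gpu(values):
    """Join one value per GPU into the comma-separated form cgminer accepts."""
    return ",".join(str(v) for v in values)


def build_config(thread_concurrency, intensity):
    """Build a cgminer.conf-style fragment with one entry per GPU.

    Both arguments are lists ordered by GPU index (GPU 0 first).
    """
    if len(thread_concurrency) != len(intensity):
        raise ValueError("need exactly one value per GPU for every option")
    return {
        "thread-concurrency": per_gpu(thread_concurrency),
        "intensity": per_gpu(intensity),
    }


# Placeholder values for a 4-card rig; GPU 0 could be given its own
# value here to test whether the startup HW errors follow the setting.
config = build_config([8192, 8192, 8192, 8192], [13, 13, 13, 13])
print(json.dumps(config, indent=2))
```

Changing only the first entry of a list is a cheap way to see whether the errors on GPU0 depend on the setting or on the slot.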



Treggar
January 26, 2014, 03:19:10 AM
 #8

I get the same issue, minus the HW errors: my GPU0 runs hotter than the others, which makes the hashrate drop as the clock speed drops due to heat.

Is your GPU0 attached to a display? I found that mine, attached to a 1080p display, was hashing slower than the others, but when I dropped the display resolution down to 640x480 it helped a lot.