Bitcoin Forum
Author Topic: Is my R9 280X faulty?  (Read 10803 times)
flash72 (OP)
December 14, 2013, 09:53:57 PM
 #1

Hi

I recently set up a rig to mine Litecoin and various other altcoins. I started with one Sapphire Vapor X R9 280X, which worked fine: I easily got it to around 730 kh/s, and undervolting brought the whole system down to about 315 watts. All was good, so I decided to buy another of the same card. I finally tracked one down and installed it, but the same settings gave different results.

http://s25.postimg.org/5re2ddc6n/cgminer.gif

The new card seems to run hotter and gets fewer kh/s; it's really letting the team down. So I started looking at what might be different between the two cards. They have two different BIOS versions, and the only other difference I could find was the part number on the card itself: the good one is 299-5E210-004SA and the bad one is 299-6E210-004SA.

http://s25.postimg.org/ffrigi6zz/Specs.gif

So I thought it must be the BIOS version. After much googling and not finding much, I pushed the button to enable the UEFI BIOS and wrote over it with the BIOS of the good card, but after that it would not boot, so I had to revert to the original BIOS.

So I looked into it further and noticed some interesting things in GPU-Z

http://s25.postimg.org/5txy07fu7/Sensors.gif

The GPU core clock keeps jumping up and down, the VDDC is all over the place, and the VRM temperature is at 108 degrees C!
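
One way to check whether the clock dips line up with the VRM heat is to have GPU-Z log its sensors to a file and scan the log. The sketch below is a rough illustration, not a tested tool: it assumes a comma-separated log with a header row, and the column-name prefixes, the 95% clock tolerance, and the sample rows are all assumptions to adjust against your own log.

```python
import csv
import io

# Assumed column-name prefixes; check the header line of your own log file.
CLOCK_PREFIX = "GPU Core Clock"
VRM_PREFIX = "VRM Temperature"


def column_index(header, prefix):
    """Return the index of the first column whose name starts with prefix."""
    for i, name in enumerate(header):
        if name.startswith(prefix):
            return i
    raise KeyError(prefix)


def find_throttle_samples(log_text, target_mhz, vrm_limit_c, clock_tolerance=0.95):
    """Return (timestamp, clock, vrm) for rows where the core clock is
    below target while the VRM temperature is at or above the limit."""
    reader = csv.reader(io.StringIO(log_text))
    header = [name.strip() for name in next(reader)]
    clock_i = column_index(header, CLOCK_PREFIX)
    vrm_i = column_index(header, VRM_PREFIX)
    flagged = []
    for row in reader:
        if len(row) <= max(clock_i, vrm_i):
            continue  # skip blank or short lines
        try:
            clock = float(row[clock_i])
            vrm = float(row[vrm_i])
        except ValueError:
            continue  # skip empty or non-numeric cells
        if clock < target_mhz * clock_tolerance and vrm >= vrm_limit_c:
            flagged.append((row[0].strip(), clock, vrm))
    return flagged


# Made-up sample rows in the assumed layout, not taken from the screenshots.
sample = """Date , GPU Core Clock [MHz] , VRM Temperature [C] ,
2013-12-14 21:00:00 , 1060.0 , 95.0 ,
2013-12-14 21:00:01 , 501.0 , 108.0 ,
2013-12-14 21:00:02 , 1060.0 , 107.0 ,
"""

for stamp, clock, vrm in find_throttle_samples(sample, target_mhz=1060, vrm_limit_c=100):
    print(stamp, clock, vrm)
```

If most of the flagged rows sit at the highest VRM readings, that points towards thermal throttling rather than a config problem.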

I was hoping someone with more knowledge of this stuff could help me out. Is this card faulty? Should I take it back to the shop? I only got it a couple of days ago, but I don't like my chances of getting a replacement.
ilaurens
December 14, 2013, 09:57:37 PM
 #2

As far as I know, it's not unusual for a second card to run slower. You are not the only one.

About the BIOS: it's a different version. I think the new card has an older BIOS version.

You can use ATIFlash to change versions: http://www.overclock.net/t/1353325/tutorial-atiwinflash-how-to-flash-the-bios-of-your-ati-cards
wiredmine
December 14, 2013, 10:06:27 PM
 #3

"not finding much I pushed the button to enable the UEFI bios and wrote over it with the bios of the good card"

From what I understood, he wrote over the UEFI BIOS on the BAD card with the BIOS from the GOOD card, and then the card failed to boot. So he must already know how to use ATIFlash.

BTC: 19Q8QuDUj39LmAJEyUimdd7aiFb6QxxZ4g
flash72 (OP)
December 14, 2013, 10:13:57 PM
 #4

Yeah, I already flashed the BIOS, and that was the guide I followed. Looking at the BIOS versions, the newer one (015.041.000.000.000000) is on the bad card, so I flashed it with the older one (015.039....). As mentioned, it wouldn't boot, so I had to revert to the original.
ilaurens
December 14, 2013, 10:19:48 PM
 #5

Yes, sorry, I did not notice that. My bad :P

The VDDC current is rather high, at least compared to the other card. The card is drawing more power and has to dissipate the extra heat, which might explain the temperature rise. So the question is: why does it draw that much?

Check your config and make sure you are not accidentally overclocking that card.

flash72 (OP)
December 14, 2013, 10:27:53 PM
 #6

Config is the same for both cards

setx GPU_MAX_ALLOC_PERCENT 100
setx GPU_USE_SYNC_OBJECTS 1
cgminer --scrypt -I 13 -g 2 -v 1 -w 256 --lookup-gap 2 --shaders 2048 --thread-concurrency 8192 --temp-cutoff 90 --temp-overheat 85 --temp-target 70 --gpu-memclock 1500 --gpu-engine 1060 --expiry 1 --scan-time 1 --queue 0 --no-submit-stale -o http://ltc.give-me-coins.com:3334 -u -p --failover-only -o [Suspicious link removed]:8888 -u -p
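
If the new card needs gentler settings than the good one, my understanding of the cgminer README is that options such as --gpu-engine, --gpu-memclock and -I accept a comma-separated list with one value per card. A minimal Python sketch for building such a command line follows. The values for the second GPU are placeholders, not recommended settings, and the helper names are made up for illustration.

```python
def per_gpu(values):
    """Join one value per GPU into a comma-separated list."""
    return ",".join(str(v) for v in values)


def build_args(engine_mhz, mem_mhz, intensity):
    """Build a cgminer argument list with separate settings per GPU."""
    if not (len(engine_mhz) == len(mem_mhz) == len(intensity)):
        raise ValueError("need one value per GPU for every setting")
    return [
        "cgminer", "--scrypt",
        "-I", per_gpu(intensity),
        "--gpu-engine", per_gpu(engine_mhz),
        "--gpu-memclock", per_gpu(mem_mhz),
    ]


# GPU 0 keeps the clocks from the post; the GPU 1 engine clock is a placeholder.
args = build_args(engine_mhz=[1060, 1000], mem_mhz=[1500, 1500], intensity=[13, 13])
print(" ".join(args))
```

Pool, user and the remaining options from the command above would be appended unchanged.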
pumpclub
December 14, 2013, 10:39:36 PM
 #7

I was having the same problem with an R9 280, but I sent it back and the new one is working just great!
flash72 (OP)
December 14, 2013, 10:42:21 PM
 #8

Was yours the same card as mine? Sapphire Vapor X?  And was the power jumping around all over the place as well?
pumpclub
December 14, 2013, 10:58:50 PM
 #9

Well, mine was a Gigabyte.
Oden
December 22, 2013, 03:11:32 PM
 #10

Same problem for me too.

I have 3 cards: two of them have BIOS ver. 015.041 and the other has ver. 015.039.

The 015.039 card runs at ~740 kh/s (72 C) and is very stable. Even at higher temperatures it's OK!
I also undervolted the card to 1.137 V with Trixx and dropped the temperature by 5 C without losing speed.

The other two cards never passed 700 kh/s on Windows, whatever settings I tried.
Now I am using Linux (SMOS distro) to run those two cards (015.041) and get ~710 kh/s, but only if the temperature stays below 70 C.

You are right about the BIOS. I followed the same steps you did, and after searching I ended up here.
I can't downgrade the BIOS, or at least I haven't found a ROM that fits yet.


Here is my config. I hope someone finds a better solution in the future.

Code:
{
"pools" : [
        {
                "url" : "x",
                "user" : "x",
                "pass" : "x"
        }
],
"api-listen" : true,
"api-allow" : "W:127.0.0.1,192.168.0/24",
"intensity" : "20",
"vectors" : "1",
"worksize" : "256",
"kernel" : "scrypt",
"auto-fan" : true,
"temp-cutoff" : "85",
"temp-overheat" : "80",
"temp-target" : "71",
"expiry" : "30",
"gpu-dyninterval" : "7",
"log" : "5",
"queue" : "1",
"retry-pause" : "5",
"scan-time" : "30",
"scrypt" : true,
"temp-hysteresis" : "3",
"shares" : "0",
"shaders" : "2048",
"thread-concurrency" : "24768",
"gpu-threads" : "1",
"gpu-engine" : "1080",
"gpu-vddc": "1.130",
"sharethreads" : "32",
"lookup-gap" : "2",
"gpu-powertune" : "-20",
"gpu-memclock" : "1500",
"no-submit-stale": true
}
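
A quick sanity check on the temperature settings in a config like this can be sketched in Python. It assumes the usual reading of the three options (fan control aims at temp-target, clocks are cut at temp-overheat, the GPU is disabled at temp-cutoff) and a default hysteresis of 3. check_temps is a hypothetical helper, not part of cgminer. As far as I know these thresholds act on the GPU core temperature, so they would not react to a hot VRM.

```python
import json


def check_temps(config):
    """Return a list of warnings about the temperature settings."""
    target = int(config["temp-target"])
    overheat = int(config["temp-overheat"])
    cutoff = int(config["temp-cutoff"])
    # Assumed default of 3 when the option is absent.
    hysteresis = int(config.get("temp-hysteresis", 3))
    warnings = []
    if not target < overheat < cutoff:
        warnings.append("expected temp-target < temp-overheat < temp-cutoff")
    if target + hysteresis >= overheat:
        warnings.append("temp-target plus hysteresis reaches temp-overheat")
    return warnings


# Only the temperature keys from the config above are repeated here.
config = json.loads("""
{
  "temp-cutoff": "85",
  "temp-overheat": "80",
  "temp-target": "71",
  "temp-hysteresis": "3"
}
""")
print(check_temps(config))
```

An empty list means the three thresholds are in a sensible order for these values.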
flash72 (OP)
December 22, 2013, 09:23:02 PM
 #11

Interesting, so what are your VRM temps like?

I noticed that my fans were spinning quite slowly by default, around 45%, so I upped them to 70%, and that got the VRM temps back down to about 85 degrees C.
Oden
December 22, 2013, 11:52:30 PM
 #12

The VRM on my bad cards is over 100 C on Windows (last time I checked), and I get spikes on the GPU clock, like you.

As I said, I now run those cards on Linux, and I don't know how to check the VRM temp there.
The rig is outside the house, where the ambient temperature is low (5 C), so I am running at 66 C with 40% fan speed.

It's definitely a BIOS problem.
miztaziggy
January 07, 2014, 11:04:39 PM
 #13

I just posted this:
https://bitcointalk.org/index.php?topic=404139.msg4374559#msg4374559

Then I found your post.

Seems I have exactly the same issue.

Something has changed on the cards. The VRM temperature on my old cards runs at 70 deg max.

On the new ones it's over 100 on 3 of the 5 and about 90 on one and 75 on the other.

The cards throttle and mining performance suffers.

Has no one found a solution to this? I am surprised no one has taken off the cooler and inspected the VRM cooler. Maybe extra thermal paste?

 *Image Removed*
hyphenated
January 07, 2014, 11:35:49 PM
 #14

The spiking is presumably thermal throttling, which is not surprising, as those temps are getting high.  

You are performing some mild overclocking; I would be tempted to start again at stock for the so-called bad card only and see what the numbers look like. If it is running hot and you can RMA it, do so. Check that the card's physical position isn't driving the problem; maybe there is a dead spot in airflow. Are you running an open chassis or a case? Two hot GPUs in a case need very good airflow.

Are these reference designs (single fan) or partner designs (dual fan for sapphires)?  Paradoxically the ref design would be better for a case, as it blows hotter air out, rather than circulates it.  However, the ref design is generally accepted as very poor when viewed against the Gigabyte 3-fan Windforce, for example.

If the 'bad' card is a reference design *and* RMA is not an option I would redo the TIMs with some good-quality thermal replacement - the reference designs seem to have had some assembly issues (scratches, poor TIM application).  The partners basically sold what they were given - zero input or QC.

However, the current draw is a concern - could have been cooked.

miztaziggy
January 08, 2014, 03:18:43 PM
 #15

I have received some MSI R9 280x 1050/1500 edition cards today.

With these cards, GPU-Z shows no VRM temperatures.

GPU-Z (and MSI Afterburner) show only:
GPU Core
GPU Mem
GPU temp
Fan
Load
VDDC
Mem usage

Nothing else shown.

Why would that be? I am using same AMD drivers as all my other rigs. Do these cards lack all those extra sensors?

I can't tell how hot the VRM is, but the card is throttling. I run the fan at 85% and the card sits at 60 deg, but every few minutes GPU activity drops from 100% to 34% and the hash rate from 700 to 200 kh/s.
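
To put a number on what those dips cost, the average hash rate is just a time-weighted mix of the normal and throttled rates. The Python sketch below uses the 700 and 200 figures from above; the 25% throttled fraction is a made-up value for illustration, since the post does not say how long each dip lasts.

```python
def effective_hashrate(normal_khs, throttled_khs, throttled_fraction):
    """Time-weighted average hash rate for a card that throttles part of the time."""
    if not 0.0 <= throttled_fraction <= 1.0:
        raise ValueError("throttled_fraction must be between 0 and 1")
    return normal_khs * (1 - throttled_fraction) + throttled_khs * throttled_fraction


# Hypothetical: the card spends a quarter of the time throttled.
print(effective_hashrate(700, 200, 0.25))
```

Timing a few dips against the interval between them would give a real fraction to plug in.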

 *Image Removed*
coinye
January 08, 2014, 03:45:04 PM
 #16

The bad card is simply overheating. 108 degrees for the VRM is not good, and the card is dropping its frequency because of that.
grendel25
March 09, 2014, 08:47:57 AM
 #17

Just wanted to add my 2-cents.  My opinion on these 280Xs is that there is some "burn-in" to consider.  I remember my XFX 280x giving me similar problems  (cycling between 99% and 64% GPU usage to keep temps lower) but eventually it settled down.  Now, my VisionTek 280X is doing the same thing.

I'm going to let it run and see how it goes... test my "burn-in" theory.

edit: For those not familiar with "burn-in": it's a common maintenance practice that allows various manufacturing issues to work themselves out. Sometimes there are thin layers of crud from insulation materials that need to burn off, and of course that generates additional heat in the process. In other situations, usually radio-frequency applications, burn-in is even required for components to harmonize their frequencies.
