Bitcoin Forum
June 27, 2024, 12:56:50 AM *
Author Topic: 3 Cards burned after 14hrs?  (Read 1212 times)
ruepa (OP)
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
December 01, 2017, 01:44:52 PM
 #1

I've built a mining rig with 3 Nvidia 1070 Ti AERO cards on an Asus Prime Z270-P motherboard.
I'm using nvOC, and I've already enabled Above 4G Decoding and reset the security keys in the BIOS.

Initially the system booted fine. In 1bash I set the cards to CC 175 and MC 1200, with a power limit of 115 W and fan at 80.
I opened the miner terminal and it was mining at around 480 Sol/s per card, with temperatures around 70 °C. I left it running for about 5 hrs before turning off the monitor and going to work; at that time temperatures were stable at 68 °C with occasional spikes to 70 °C.

According to the pool I was connected to, the rig ran for around 14 hrs, and just before it went off there was a spike in the Sol/s on the graph (nothing crazy). When I got home the rig was sitting in the BIOS. Now when I boot into Linux, if I leave the terminal open it crashes and reboots. If I close the terminal I can use the system: I can go on YouTube and watch videos in 1080p, and I can see the GPUs detected in the Nvidia app with no artifacts.
If I try to run the mining terminal it crashes straight away; opening just the terminal without running the mining bash also crashes, but it takes longer.

First of all, do you think it's possible to have burned the GPUs with those settings after a total of around 20 hrs of work?

What could be causing this? I've already tried changing risers, formatting the USB and running it again, and connecting the GPUs one by one, and all of them show the same behaviour.
crocozino
Sr. Member
****
Offline Offline

Activity: 420
Merit: 250



View Profile
December 01, 2017, 01:53:29 PM
 #2

Well, I doubt it's possible to burn a GPU like that, because all Nvidia GPUs have some kind of overheating protection: the card will try to reduce its clocks or just reboot the system.
I think it's related to your PSU. Check the cables coming out of the PSU and going into the GPUs. What state are they in? Are they OK?
What kind of PSU do you have?
ruepa (OP)
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
December 01, 2017, 01:59:27 PM
 #3

The PSU is an EVGA750CQ Gold.
I find it hard to believe that all 3 GPUs have burned. I tried the GPUs one by one (disconnecting the other 2 from the motherboard) and got the exact same problem, but I guess it's possible.
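
As a rough sanity check on the PSU question, here is a minimal sketch of the power budget using only the numbers posted so far (3 cards, a 115 W limit each, a 750 W supply). The 150 W allowance for the CPU, motherboard and risers is an assumption for illustration, not a measurement from this rig.

```python
# Rough power budget for the rig described in this thread.
GPU_COUNT = 3
GPU_POWER_LIMIT_W = 115      # per-card power limit from post #1
PSU_RATED_W = 750            # PSU rating from post #3
REST_OF_SYSTEM_W = 150       # ASSUMPTION: CPU, motherboard, risers, fans


def estimated_load_w(gpu_count: int, gpu_limit_w: float, rest_w: float) -> float:
    """Estimate total load as the GPU power limits plus a fixed system allowance."""
    return gpu_count * gpu_limit_w + rest_w


def load_fraction(load_w: float, psu_rated_w: float) -> float:
    """Return the load as a fraction of the PSU's rated output."""
    return load_w / psu_rated_w


load = estimated_load_w(GPU_COUNT, GPU_POWER_LIMIT_W, REST_OF_SYSTEM_W)
print(f"Estimated load: {load:.0f} W ({load_fraction(load, PSU_RATED_W):.0%} of rating)")
```

The three cards alone account for 345 W at their power limit, so under this assumption the supply would be running well below its rating. Short spikes above the power limit and cable or connector faults are not modeled here.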
Metroid
Sr. Member
****
Offline Offline

Activity: 2142
Merit: 353


Xtreme Monster


View Profile
December 01, 2017, 02:02:10 PM
 #4

well, I doubt that it is possible to burn GPU like that, cause all nvidia gpus have some kind of protection for overheating,

philipma1957
Legendary
*
Online Online

Activity: 4172
Merit: 8061


'The right to privacy matters'


View Profile WWW
December 01, 2017, 03:20:01 PM
 #5

your clocks are too high

set them at 145 for core and at 1000 for mc

480 sols meant you stretched to the max with 175 and 1200

once they work with 145 and 1000, bump to 150 and 1000

you will be closer to 450 sols but it won't crash
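
The procedure in this post (start from a conservative setting, confirm it holds, then bump a little) can be sketched as a simple schedule. This is only an illustration: the helper name `offset_schedule` and the default step of 5 are assumptions, and the stability check itself (running the miner for a while without a crash) still has to be done by hand.

```python
from typing import List


def offset_schedule(start: int, ceiling: int, step: int = 5) -> List[int]:
    """List the clock settings to try, from `start` up to `ceiling` inclusive."""
    if step <= 0:
        raise ValueError("step must be positive")
    if start > ceiling:
        raise ValueError("start must not exceed ceiling")
    return list(range(start, ceiling + 1, step))


# Core settings from this post: begin at 145, then bump to 150.
# Memory stays at 1000 throughout, so it needs no schedule.
for core in offset_schedule(start=145, ceiling=150):
    print(f"try core {core}, mc 1000; move on only if the miner stays up")
```

Stepping one value at a time makes it clear which change caused a crash if one shows up.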

cryptbro
Newbie
*
Offline Offline

Activity: 82
Merit: 0


View Profile
December 01, 2017, 04:23:57 PM
 #6

mc 1200 jesus christ chillll man haha just lower them bad boys and try again
GeePeeU
Sr. Member
****
Offline Offline

Activity: 540
Merit: 251


ASK


View Profile
December 01, 2017, 04:31:18 PM
 #7

Quote from: philipma1957 on December 01, 2017, 03:20:01 PM
your clocks are too high. set them at 145 for core and at 1000 for mc. 480 sols meant you stretched to the max with 175 and 1200. once they work with 145 and 1000, bump to 150 and 1000. you will be closer to 450 sols but it won't crash

Shoutout to this man. Take his advice. He is a honeypot of information around here.

Always doubt.
QuintLeo
Legendary
*
Offline Offline

Activity: 1498
Merit: 1030


View Profile
December 01, 2017, 11:59:20 PM
 #8

Quote from: philipma1957 on December 01, 2017, 03:20:01 PM
your clocks are too high. set them at 145 for core and at 1000 for mc. 480 sols meant you stretched to the max with 175 and 1200. once they work with 145 and 1000, bump to 150 and 1000. you will be closer to 450 sols but it won't crash

 1070 ti pulling 480 sol/s isn't pushing hard at all.

 70C though is high on temp for 115 watts, especially with 80% fan - SOMETHING is blocking airflow to the cards or the room is VERY VERY hot.
 Those Aero blower cards shouldn't have THAT bad of cooling, though they would tend to run a LITTLE warmer than good fan-cooled cards unless they're in a case.

 My EVGA and Zotac 1070 ti cards in my "new shelf/rack" first rig only run 62-63C at 50% fan in a room that is 86 F measured right above the "room fan" that blows air into that rig.


 It's POSSIBLE that you got a batch of flaky cards - but if they're working for video playing it sounds more like you're pushing them too hard, probably on the memory clock.





Ultegra134
Hero Member
*****
Offline Offline

Activity: 1610
Merit: 786



View Profile
December 02, 2017, 01:20:49 AM
 #9

It's highly unlikely that you got a faulty batch, though it's possible. However, I'd try resetting or lowering the clocks as already suggested. Be extra careful with such expensive GPUs: it's not that hard to roast all of them if a power cut or sudden power surge happens, so I'd suggest buying an external UPS to protect them.

QuintLeo
Legendary
*
Offline Offline

Activity: 1498
Merit: 1030


View Profile
December 02, 2017, 08:06:08 AM
 #10

While I'm thinking about it - do your MSI 1070 ti Aero cards have a ball-bearing blower?
My research to date has been inconclusive, MSI hasn't replied to my direct question yet, and the ASUS variation on the same theme that IS ball bearing is out of stock....




bigjee
Full Member
***
Offline Offline

Activity: 434
Merit: 107



View Profile
December 02, 2017, 08:28:26 AM
 #11

@OP. Linux is more for advanced users.
You might want to start off with default settings in Windows and tweak from there with Afterburner.

That's more straightforward than making errors keying in numbers when you don't necessarily know how far off they are from the default values.
I used nvOC and gave up soon after. Not much benefit in terms of hashrate, and definitely not user friendly.
Ledipaa
Newbie
*
Offline Offline

Activity: 62
Merit: 0


View Profile
December 02, 2017, 10:18:50 AM
 #12

Yeah, it's pretty obvious that your cards can't handle +1200 MHz mem clocks; hell, there ain't a single card that can do that! Some Samsung chips can't even do +800. The best usually manage +1000 MHz. I'm surprised your rig didn't crash right after the first miner launch with those insane clocks.
ruepa (OP)
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
December 02, 2017, 12:12:25 PM
 #13

Quote from: philipma1957 on December 01, 2017, 03:20:01 PM
your clocks are too high. set them at 145 for core and at 1000 for mc. 480 sols meant you stretched to the max with 175 and 1200. once they work with 145 and 1000, bump to 150 and 1000. you will be closer to 450 sols but it won't crash

Thanks for the input. Yesterday I managed to get the rig back up and running (I believe the problem is the USB drive, since it has now happened again and the rig was back up and running after formatting it).
As soon as I managed to boot the system I went down to MC 600 and CC 150, and was getting around 460 Sol/s per card.
Now I'm starting to think the problem is the USB drive where I have the OS. The system again ran great for about 24 hrs and then stopped working, and now it only boots nvOC to the login screen; when I try to log in with the password it reboots. I'm now formatting the USB to install nvOC again to make sure that's where the problem is. I'm waiting for a Lexar drive at the moment, but I think I will end up going with an SSD.
STSMiner
Full Member
***
Offline Offline

Activity: 270
Merit: 115



View Profile
December 02, 2017, 12:20:35 PM
Last edit: December 02, 2017, 12:32:01 PM by STSMiner
 #14

You are pushing those cards too hard with the overclocks, most cards don't run with clocks too high.

I get over 510 sols on my EVGA 1070 Ti with these settings below with the temp at 70c using the zm miner.

Set the memory to -303 and the core to +140 and power to 85 - 90.

Lower your overclocks and keep the fans above 90% to keep them cool.
ruepa (OP)
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
December 02, 2017, 01:18:13 PM
 #15

Quote from: STSMiner on December 02, 2017, 12:20:35 PM
You are pushing those cards too hard with the overclocks, most cards don't run with clocks too high.

I get over 510 sols on my EVGA 1070 Ti with these settings below with the temp at 70c using the zm miner.

Set the memory to -303 and the core to +140 and power to 85 - 90.

Lower your overclocks and keep the fans above 90% to keep them cool.

I was thinking about turning the memory clock negative, since it shouldn't make a big difference on ZenCash. As I said above, the system went down again, but I believe it's an issue with the USB drive.
Formatting the USB and reinstalling nvOC is the only thing that gets the system back up and running. I will try a different USB drive to see if I get better stability.
At the moment I'm running with memory 600 and clock 140, power limit 110.

I will try these settings for the next 24 hrs, and go up by 5 on the clock if I don't get any reboots.
Thanks for all the advice ;)
ruepa (OP)
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
December 02, 2017, 01:33:16 PM
 #16

Quote from: QuintLeo on December 01, 2017, 11:59:20 PM
1070 ti pulling 480 sol/s isn't pushing hard at all.

 70C though is high on temp for 115 watts, especially with 80% fan - SOMETHING is blocking airflow to the cards or the room is VERY VERY hot.
 Those Aero blower cards shouldn't have THAT bad of cooling, though they would tend to run a LITTLE warmer than good fan-cooled cards unless they're in a case.

 My EVGA and Zotac 1070 ti cards in my "new shelf/rack" first rig only run 62-63C at 50% fan in a room that is 86 F measured right above the "room fan" that blows air into that rig.


 It's POSSIBLE that you got a batch of flaky cards - but if they're working for video playing it sounds more like you're pushing them too hard, probably on the memory clock.


What are your settings, if you don't mind me asking? I think the main problem is the USB stick, and that it was the reason the system crashed, not the OC speeds, even if I was for sure pushing the MC way beyond what was needed.

Do you recommend anything for remotely monitoring a rig running nvOC? I think I will end up using TeamViewer, but I was looking for other alternatives.
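
One low-tech alternative for the monitoring question is to poll `nvidia-smi` (for example over SSH) and parse its CSV output. The sketch below assumes the Nvidia driver's `nvidia-smi` tool is on the PATH; the helper names and the 75 °C alert threshold are made up for illustration and are not part of nvOC.

```python
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi, in the order they are parsed below.
QUERY = "index,temperature.gpu,power.draw,fan.speed"


def parse_gpu_csv(text: str) -> List[Dict[str, float]]:
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, temp, power, fan = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "temp_c": float(temp),
            "power_w": float(power),
            "fan_pct": float(fan),
        })
    return rows


def read_gpus() -> List[Dict[str, float]]:
    """Run nvidia-smi once and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)


def too_hot(rows: List[Dict[str, float]], limit_c: float = 75.0) -> List[int]:
    """Return the indices of GPUs above the (assumed) temperature limit."""
    return [int(row["index"]) for row in rows if row["temp_c"] > limit_c]
```

On the rig itself, `too_hot(read_gpus())` would return the indices of any cards above the threshold, which a scheduled job could log or mail out. Some cards report a field as `[N/A]`, which this sketch does not handle.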
faanigee
Member
**
Offline Offline

Activity: 95
Merit: 10


View Profile
December 02, 2017, 01:53:42 PM
 #17

try windows and check it. if you have same problem then send me your configuration file...
philipma1957
Legendary
*
Online Online

Activity: 4172
Merit: 8061


'The right to privacy matters'


View Profile WWW
December 02, 2017, 04:08:01 PM
 #18

Quote from: ruepa on December 02, 2017, 01:33:16 PM
What are your settings, if you don't mind me asking? I think the main problem is the USB stick, and that it was the reason the system crashed, not the OC speeds, even if I was for sure pushing the MC way beyond what was needed.

Do you recommend anything for remotely monitoring a rig running nvOC? I think I will end up using TeamViewer, but I was looking for other alternatives.

simplemining is 2 usd per rig per month, uses linux, and allows clocking and monitoring. If you have bigger rigs, 2 usd a month is cheap enough.

I like nvoc, but it is not as good as simplemining for maintaining multiple rigs in multiple locations.

Still doing some setup work for winter mining. But I have 3 simplemining accounts with 13 rigs in all, so I pay 26 usd a month, while the rigs earn 3000 usd a month.

So at under 1% it is worth it to me.

I will be setting up a 4th, much larger location; it is 1 hour and 10 minutes from my house.
simplemining will earn its keep, as that spot will have a lot of rigs and I do not feel like driving out to change shit more than 1 time a week.
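
The "under 1%" figure can be checked with a few lines of arithmetic. This is just a sketch using the numbers quoted in this post (13 rigs, 2 usd per rig per month, 3000 usd earned per month); the function name is made up for illustration.

```python
RIGS = 13
FEE_PER_RIG_USD = 2           # monthly fee per rig, as stated in this post
MONTHLY_EARNINGS_USD = 3000   # monthly earnings, as stated in this post


def fee_share(rigs: int, fee_per_rig: float, earnings: float) -> float:
    """Return the management fee as a fraction of monthly earnings."""
    return rigs * fee_per_rig / earnings


share = fee_share(RIGS, FEE_PER_RIG_USD, MONTHLY_EARNINGS_USD)
print(f"{RIGS * FEE_PER_RIG_USD} usd a month is {share:.2%} of earnings")
```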

ruepa (OP)
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
December 02, 2017, 05:01:32 PM
 #19

I looked into simplemining, but since I'm just starting out I feel nvOC will let me make more small mistakes (which I need in order to learn) and better understand the principles.
As I expand into a mining farm I will probably look at simplemining again. For the time being I'm mostly learning; I will complete the rig in the next couple of weeks (adding 3 more graphics cards).
sindikat
Full Member
***
Offline Offline

Activity: 364
Merit: 106


View Profile
December 02, 2017, 06:13:28 PM
 #20

I never let my GPU temperature go above 58 degrees. 70 degrees is the limit for a GPU. But I don't think that is the issue here. Did you have any problems with the power supply? If the power supply was also working at its limit, it could have caused a power surge. That is enough to make a large part of your equipment fail.