Bitcoin Forum
December 16, 2017, 04:05:29 AM *
Topic: Rig with 6x 1080 Ti resetting - Need help
zimmix
Newbie, Activity: 25
November 23, 2017, 06:56:14 PM
 #1

I'm having a problem with my rig and would appreciate any help you can provide. I will try to be as clear as possible.

I have one rig with 6x 1080 Ti running SMOS, as below.

3x 750W PSU linked together (2 GPUs per PSU) - 80 Plus Bronze (yeah I know...)
1x Biostar TB250-BTC PRO (12 slots)
1x 8GB RAM Corsair Vengeance
1x CPU G3930 (7th gen)
1x 32GB USB 3.0 pendrive with SMOS
6x risers
6x EVGA 1080 Ti (same model)
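
A rough per-PSU power budget for this layout can be sketched as below. This is only an estimate under stated assumptions: 250 W is the reference board power of a GTX 1080 Ti (factory-overclocked cards can draw more, and riser power is counted as part of the card), and the 100 W figure for the motherboard, CPU, RAM and pendrive is a guess, not a measurement. The names `GPU_WATTS`, `SYSTEM_WATTS`, `psu_loads` and `headroom` are made up for the example.

```python
# Rough per-PSU power budget for the layout in the post.
# All wattages are assumptions for illustration, not measurements.
GPU_WATTS = 250        # reference GTX 1080 Ti board power; overclocked cards can draw more
SYSTEM_WATTS = 100     # assumed: motherboard, CPU, RAM, USB drive, fans
PSU_RATED_WATTS = 750  # rated output per PSU, from the post


def psu_loads(gpus_per_psu, system_psu_index=0):
    """Return the estimated DC load in watts for each PSU.

    gpus_per_psu: number of GPUs fed by each PSU, e.g. [2, 2, 2].
    system_psu_index: which PSU also feeds the motherboard and CPU.
    """
    loads = []
    for i, gpu_count in enumerate(gpus_per_psu):
        load = gpu_count * GPU_WATTS
        if i == system_psu_index:
            load += SYSTEM_WATTS
        loads.append(load)
    return loads


def headroom(loads, rated=PSU_RATED_WATTS):
    """Return the spare watts on each PSU (negative means overloaded)."""
    return [rated - load for load in loads]


loads = psu_loads([2, 2, 2])
for i, (load, spare) in enumerate(zip(loads, headroom(loads))):
    share = load / PSU_RATED_WATTS
    print(f"PSU {i}: ~{load} W of {PSU_RATED_WATTS} W ({share:.0%}), {spare} W spare")
```

Under these assumptions the PSU that also feeds the motherboard runs noticeably closer to its rating than the two GPU-only units, so it is the first one to check.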

The rig worked fine for more than a month, then it started to restart, sometimes detecting only 3, 4 or 5 GPUs. If I put the mining program in a non-mining state (by passing it wrong parameters), the rig would start and "work as expected": error messages reported that the parameters were wrong, but it no longer reset.

With this in mind, I did a lot of checks, such as changing the risers, swapping GPUs between PSUs, underclocking (heavily) and removing the overclock. None of these tests changed anything. That suggests the GPUs are not faulty, because with only 2 to 4 GPUs powered on it worked like a charm.

To bridge the PSUs I had 2x add2psu-like adapters (I later found out they were a generic version, not the original, fml). I saw that one was failing (or it looked like it was), so I removed both and did a "direct link" with a jumper until a cable similar to the Lian Li one arrived, which can link all 3 PSUs together.

In the meantime I left 4 GPUs mining with the PSUs jumpered, without the reset problem. I even swapped the cables, GPUs, risers and PSUs around, and it worked perfectly with 4 GPUs regardless of which PSU they were connected to, as long as only 2 PSUs (any two) were on.

The cable arrived and I turned everything on. It worked! For the first time in a week the rig was mining with 6 GPUs. It lasted one day, then the problem returned, but differently this time: it reset a few times (5~10) and then became stable again. This happened twice in 2 days, and every time all 6 GPUs were recognized by the system. Before the cable change, 3 to 5 GPUs was normal during the reset loop and 6 almost never.

Right now I'm thinking of buying the original add2psu to test. I'm aware that chaining PSUs isn't the best idea, but I had the PSUs on hand and didn't want to buy new ones, and I want to be sure the problem is the PSU chain before buying new parts.

Important information:
- The resets occurred right after the mining process started, and I tried different miners / algos.
- I think I have ruled out many components, and the only "faulty" one was the "add2psu" adapter. What I wasn't able to rule out: pendrive, RAM, GPUs and the mobo.
- Resets happened roughly once every 30 to 45 seconds, with "no screen session found" as the system message.
- The mobo is configured as required by the manufacturer / SMOS install guide.
- It worked perfectly for more than a month. I was even able to restart it many times in a row during tests (manually or from the SMOS dashboard) and it worked without flaws.
- The mobo has 2 molex inputs for extra power, which are connected to the PSUs. I have no further info on whether this can be a problem or not.

What advice / suggestions would you give me?

*Sorry for the wall of text!
wacko
Hero Member, Activity: 742
November 23, 2017, 07:50:31 PM
 #2

I wouldn't suspect those PSU sync devices; afaik their only job is to start all PSUs at the same time. Once the PSUs are on, they don't do anything. If you had problems starting your rig then yeah, those sync devices could be the reason, but that doesn't seem likely in your case. You don't even need them at all: shorting pin 16 (PS_ON) of the 24-pin connector to a ground pin with a paper clip achieves pretty much the same.

Check the board, or just replace it temporarily. You've only got 6 GPUs in total, and there are plenty of cheap boards out there that can run 6 cards. And it's a good idea to have some kind of spare or test system anyway, at least if you're planning on expanding. I mean, what else could it be? If you're sure that the GPUs and the risers are fine, then it's either the platform or the power. Then again, "750W Bronze" doesn't say much about your power: there are good units matching this description, and there are units so bad that they'll burn down your house and then a few houses next to you.
zimmix
Newbie, Activity: 25
November 23, 2017, 08:51:00 PM
 #3

The PSUs are from Thermaltake. I have a mobo that can run up to 5 cards, so I guess I will try it. But I have already moved the risers to different slots on the current mobo (it takes up to 12 GPUs, so I was able to change them all), without success.

One weird thing is that I wasn't able to make it work properly with the jumpers (even though all cards got power), but with the adapter it sometimes becomes stable at the first try and sometimes after resetting a few times.

Since 2x GPUs and their risers, the mobo and the 2x mobo molex inputs are all connected to the main PSU, could a failure while the other PSUs initialize overload the main PSU, as it doesn't have much spare power? Note that the rig resets just as mining is about to start (miner software running), i.e. as soon as all cards are really powered up and ready to mine (with all cards detected). Could a late power-up of the other PSUs be causing this overload? As I said, if I set a dummy miner to start, the rig gets stable at the first try 100% of the time, because the cards aren't really under load. If it were the mobo, it wouldn't behave like this, would it?
ivakar
Sr. Member, Activity: 457
November 24, 2017, 03:09:14 AM
 #4

It looks like you have a problem with the power supply. Check the cables very carefully, both the ones going into the GPUs and the ones coming out of the PSUs, for any kind of problem.
The second option is the motherboard: swap it.

CryptoWatcher420
Sr. Member, Activity: 294 (Small Time Miner, Rig Builder, Crypto Trader)
November 24, 2017, 06:24:25 AM
 #5

3 PSUs, man, what a joke. Get a server-grade PSU with a breakout board. There are lots that are great and have high wattage.

dadesu
Sr. Member, Activity: 252
November 24, 2017, 06:50:28 AM
 #6

There is a big chance that your problem is the PSU that feeds 2 GPUs plus the system.
For a start, try mining a low-power coin at a 50% power limit. If it is still unstable at that load, then something other than the PSU is the problem.
But 90% of the time the PSU is the problem.
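
One way to apply a 50% power limit on NVIDIA cards is the driver's `nvidia-smi` tool (`-i` selects a GPU, `-pl` sets its power limit in watts, usually as root). The sketch below only builds and prints the commands for a dry run. It assumes 6 cards with a 250 W default limit; the helper names are made up for the example, and SMOS may offer its own power-limit setting that should be preferred. The driver can also reject a value below the card's minimum limit.

```python
import subprocess


def power_limit_commands(default_limits_w, fraction=0.5):
    """Build one `nvidia-smi -i <index> -pl <watts>` command per GPU.

    default_limits_w: default power limit of each GPU in watts, in index order.
    fraction: share of the default limit to apply (0.5 = 50%).
    """
    commands = []
    for index, default_w in enumerate(default_limits_w):
        target_w = int(round(default_w * fraction))
        commands.append(["nvidia-smi", "-i", str(index), "-pl", str(target_w)])
    return commands


def query_default_limits():
    """Read each GPU's default power limit from the driver (needs nvidia-smi).

    Not called in the dry run below. It fails if a card reports "[N/A]".
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.default_limit",
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    )
    return [float(line) for line in result.stdout.splitlines() if line.strip()]


# Dry run with assumed values: six cards at a 250 W default limit.
for command in power_limit_commands([250.0] * 6):
    print(" ".join(command))
```

On the rig itself, `query_default_limits()` could replace the assumed list, and each command could be passed to `subprocess.run(command, check=True)`.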

zimmix
Newbie, Activity: 25
November 24, 2017, 04:48:18 PM
 #7

I guess I will have to check the cables as well. I really think the problem is the bridge between the PSUs: the rig has been running perfectly for a few days, but if the power fails or I try to restart it remotely, the problem starts again until it becomes stable.

I will try a more powerful PSU as the main one (mobo + 2 GPUs), because if the cable heats up a lot, it may overload the PSU just a little, enough to reduce its efficiency until what it can deliver is close to or below the actual power draw, resetting the system, since the PSU is already almost at its output limit. That would explain why the system becomes stable after a few tries, as the components / cables cool down after some standby time.
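
For reference, the relation between efficiency, wall draw and heat can be worked through with the 80 PLUS Bronze minimums at 115 V (82% at 20% load, 85% at 50%, 82% at 100%). As far as I know, a PSU's rated wattage refers to its DC output, so efficiency mainly changes how much is drawn from the wall and how much heat the unit has to shed, not the rated output itself. The sketch below is arithmetic only; real units vary, and the function names are made up for the example.

```python
# 80 PLUS Bronze minimum efficiency at 115 V, keyed by load fraction.
BRONZE_115V = {0.20: 0.82, 0.50: 0.85, 1.00: 0.82}


def wall_draw(dc_load_w, efficiency):
    """AC power drawn from the wall for a given DC load."""
    return dc_load_w / efficiency


def waste_heat(dc_load_w, efficiency):
    """Power lost as heat inside the PSU."""
    return wall_draw(dc_load_w, efficiency) - dc_load_w


RATED_W = 750  # rated DC output of each PSU in the thread
for load_fraction, efficiency in BRONZE_115V.items():
    dc_load = RATED_W * load_fraction
    print(
        f"{load_fraction:.0%} load: {dc_load:.0f} W DC, "
        f"{wall_draw(dc_load, efficiency):.0f} W from the wall, "
        f"{waste_heat(dc_load, efficiency):.0f} W lost as heat"
    )
```

The closer the main PSU runs to its rating, the more heat it has to get rid of, which fits the observation that the rig settles after a cool-down, though it does not prove that this is the cause.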

But I guess I had two problems: one was the "generic add2psu" that failed and brought the whole system down; the other is the main PSU being overloaded beyond its capacity (still under test).

Thank you all for your contributions, feel free to enlighten me more if you have anything else to say!!

Ah, yeah, I know it's bad enough to use 3 PSUs in a row, but where I live server PSUs are too expensive, and so are powerful PSUs: a 1000W PSU goes for more than $400.
wacko
Hero Member, Activity: 742
November 24, 2017, 05:13:34 PM
 #8

Quote from: zimmix
"Ah, yeah, I know it's bad enough to use 3 PSUs in a row, but where I live server PSUs are too expensive, and so are powerful PSUs: a 1000W PSU goes for more than $400."

The cards you're using are not cheap either. I'd understand if it's a temporary solution until "proper" PSUs arrive, but I wouldn't suggest you keep using these 750W units for long.
fanatic26
Hero Member, Activity: 560
November 24, 2017, 05:20:52 PM
 #9

You can get a server PSU that runs that whole rig for under $200.


P.S. Those add2psu things are a total waste of money. Just make sure you turn on the GPU-only PSUs first, then the main one, and you won't have any problems. Also, you don't need to reset all 3 PSUs to restart the machine, just the one connected to the motherboard.
