Bitcoin Forum
Topic: Rig with 6x 1080Ti resetting - Need help
zimmix (OP)
Newbie
November 23, 2017, 06:56:14 PM
 #1

I'm having a problem with my rig and would appreciate any help you can provide. I will try to be as clear as possible.

I have one rig with 6x 1080 Ti running SMOS, built as follows:

3x 750W PSU linked together (2 GPUs per PSU) - 80 Plus Bronze (yeah, I know...)
1x Biostar 250 tb pro (12 slots)
1x 8 GB RAM Corsair Vengeance
1x CPU G3930 (7th gen)
1x 32 GB USB 3.0 pendrive with SMOS
6x risers
6x EVGA 1080 Ti (same model)

The rig worked fine for more than a month, then it started restarting, sometimes coming up with only 3, 4 or 5 GPUs. If I put the miner in a state where it cannot mine (by passing wrong parameters to the mining program), the rig would start and "work as expected": error messages appeared saying the parameters were wrong, but it no longer reset.

With this in mind, I did a lot of checks, such as changing the risers, swapping GPUs between PSUs, underclocking (heavily) and removing the overclock. None of these tests changed anything. That suggests the GPUs are not faulty, because with only 2 to 4 GPUs powered on it worked like a charm.

To bridge the PSUs I had 2x add2psu-like adapters (I checked later and they were a generic version, not the original, fml). I saw that one was failing (or it looked like it was), so I removed both and made a "direct link" with a jumper until a cable similar to the Lian Li one arrived, which can link all 3 PSUs together.

In the meantime I left 4 GPUs mining with the PSUs jumpered, and there was no reset problem. I even swapped the cables, GPUs, risers and PSUs around, and it worked perfectly with 4 GPUs regardless of which PSUs they were connected to, as long as only 2 PSUs were on.

The cable arrived and I turned everything on. It worked! For the first time in a week the rig was mining with 6 GPUs. It lasted one day, until the problem returned, but differently this time: it reset a few times (5~10) and then became stable again. This happened 2 times in 2 days, and every time all 6 GPUs were recognized by the system. Before the cable change, 3 to 5 GPUs was normal during the reset loop and 6 almost never.

Right now I'm thinking of buying the original add2psu and testing with it. I'm aware that chaining PSUs isn't the best idea, but I had the PSUs on hand and didn't want to buy new ones, and I want to be sure the problem is the PSU chain before buying new parts.

Important information:
- The resets occurred right after the mining process started, and I tried different miners / algos.
- I think I have ruled out many components of the system, and the only "faulty" one was the "add2psu" adapter. What I wasn't able to rule out: pendrive, RAM, GPUs and the mobo.
- Resets happened roughly once every 30 to 45 seconds, with "no screen session found" as the system message.
- The mobo is configured as required by the manufacturer / SMOS install guide.
- It worked perfectly for more than one month. I was even able to restart it many times in a row during tests (manually or from the SMOS dashboard) and it worked flawlessly.
- The mobo has 2 molex inputs for extra power that are connected to the PSUs, and I don't know whether this can be a problem or not.

What advice / suggestion would you give me?

*Sorry for the text wall!
wacko
Legendary
November 23, 2017, 07:50:31 PM
 #2

I wouldn't suspect those PSU sync devices; afaik their only job is to start all PSUs at the same time. After the PSUs have turned on, they don't do anything. If you had problems starting your rig then yeah, those sync devices could be the reason, but that doesn't seem likely in your case. You don't even need them at all: shorting pin 16 (PS_ON) of the 24-pin connector to a ground pin with a paper clip achieves pretty much the same.

Check the board, or just replace it temporarily. You've only got 6 GPUs in total, and there are plenty of cheap boards out there that can run 6 cards. And it's a good idea to have some kind of a spare or test system anyway, at least if you're planning on expanding. I mean, what else could it be? If you're sure that the GPUs and the risers are fine, then it's either the platform or the power. Then again, "750W Bronze" doesn't say much about your power: there are good units matching this description, and there are units so bad that they'll burn your house and then a few houses next to you.
zimmix (OP)
Newbie
November 23, 2017, 08:51:00 PM
 #3

The PSUs are from Thermaltake. I have one mobo that can run up to 5 cards, so I guess I will try it. But I've already moved the risers between slots on the current mobo (it takes up to 12 GPUs, so I was able to move them all), without success.

One weird thing is that I wasn't able to make it work properly using the jumpers (even though they powered all cards), but with the adapter it gets stable sometimes on the first try and sometimes after resetting a few times.

The main PSU feeds 2 GPUs and their risers, the mobo and the 2 mobo molex inputs. Could a failure while the other PSUs initialize overload the main PSU, since it doesn't have much spare power? Note that the rig resets as soon as mining is about to start (miner software running), i.e. as soon as all cards are detected and actually drawing power. Could a late power-up of the other PSUs cause this overload? As I said, if I set a dummy miner to start, the rig gets stable on the first try 100% of the time, because the cards aren't really under load. If it were the mobo, it wouldn't behave like this, would it?
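For reference, the "not much spare power" worry can be put into rough numbers. This is only a back-of-the-envelope sketch: the 250 W per card is the reference board power of a GTX 1080 Ti at stock settings (an EVGA model may be set higher), and the 100 W platform figure is an assumption, not a measurement from this rig. The function names are made up for illustration.

```python
# Rough per-PSU load estimate for the rig described in this thread.
# All wattages are assumptions for illustration, not measurements.

GPU_WATTS = 250        # reference board power of a GTX 1080 Ti at stock power limit
PLATFORM_WATTS = 100   # assumed: motherboard, CPU, RAM, USB drive, fans
PSU_RATED_WATTS = 750
GPUS_PER_PSU = 2


def psu_load(gpus: int, extra_watts: float = 0.0, gpu_watts: float = GPU_WATTS) -> float:
    """Return the estimated DC load in watts on one PSU."""
    return gpus * gpu_watts + extra_watts


def headroom_fraction(load_watts: float, rated_watts: float = PSU_RATED_WATTS) -> float:
    """Return the unused fraction of the PSU rating (negative means overloaded)."""
    return (rated_watts - load_watts) / rated_watts


main_psu = psu_load(GPUS_PER_PSU, extra_watts=PLATFORM_WATTS)  # 2 GPUs + platform
gpu_only_psu = psu_load(GPUS_PER_PSU)                          # 2 GPUs only

print(f"main PSU:     {main_psu:.0f} W, headroom {headroom_fraction(main_psu):.0%}")
print(f"GPU-only PSU: {gpu_only_psu:.0f} W, headroom {headroom_fraction(gpu_only_psu):.0%}")
```

Under these assumptions the main PSU sits at about 600 W of its 750 W rating, so a raised power limit, an overclock or a weak 12 V rail would eat the remaining margin quickly. Measured wall power would be a better input than these guesses.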
ivakar
Hero Member
November 24, 2017, 03:09:14 AM
 #4

It looks like you have a problem with the power supply. Check the cables very carefully, both the ones going into the GPUs and the ones coming out of the PSUs, and see whether any of them has a problem.
The second suspect is the MB: try replacing it.
CryptoWatcher420
Sr. Member
November 24, 2017, 06:24:25 AM
 #5

3 PSUs, man, what a joke. Get a server-grade PSU with a breakout board. There are lots that are great and have high wattage.

dadesu
Sr. Member
November 24, 2017, 06:50:28 AM
 #6

There is a big chance that your problem is the PSU that feeds 2 GPUs plus the system.
For a start, try mining a low-power-consumption coin at a 50% power limit. If it is stable at low power, the PSU is the likely culprit; if it still resets, something other than the PSU is the problem.
But 90% of the time the PSU is the problem.
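
A minimal sketch of how the 50% power-limit test might be set up with `nvidia-smi`, whose `-i` (GPU index) and `-pl` (power limit in watts) options are standard. The 250 W stock limit and the helper name `power_limit_commands` are assumptions for illustration. The script only prints the commands; running them normally needs root, and the driver rejects values below the card's minimum limit. SMOS may expose its own power-limit setting, in which case that would be the place to set it.

```python
# Sketch: build nvidia-smi commands that cap each GPU at a fraction of an
# assumed stock power limit. Commands are only printed, not executed.

STOCK_LIMIT_WATTS = 250   # assumed stock limit for a GTX 1080 Ti
GPU_COUNT = 6


def power_limit_commands(fraction: float, gpu_count: int = GPU_COUNT,
                         stock_watts: int = STOCK_LIMIT_WATTS) -> list[list[str]]:
    """Return one `nvidia-smi -i <index> -pl <watts>` command per GPU."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    watts = round(stock_watts * fraction)
    return [["nvidia-smi", "-i", str(i), "-pl", str(watts)] for i in range(gpu_count)]


commands = power_limit_commands(0.5)
for cmd in commands:
    print(" ".join(cmd))
```

If the rig survives at half power and resets again when the limit is raised step by step, the wattage at which it starts failing gives a rough idea of what the PSUs can actually sustain.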

zimmix (OP)
Newbie
November 24, 2017, 04:48:18 PM
 #7

I guess I will have to check the cables as well. I really think the problem is the bridge between the PSUs: the rig has been running perfectly for a few days, but if the power fails or I try to restart it remotely, the problem starts again until it gets stable.

I will try a more powerful PSU as the main one (mobo + 2 GPUs). The PSU is almost at its power output limit, so if the cable heats up a lot, it may overload the PSU just a little, enough to reduce its efficiency to the point where what it can deliver is close to or below what the system actually draws, resetting the system. That would explain why the system gets stable after a few tries, as the components / cables cool down after some standby period.

But I guess I had two problems: one was the "generic add2psu" that failed and brought the system down, the other is overloading the capacity of the main PSU (still being tested).

Thank you all for your contributions, feel free to enlighten me more if you have anything else to say!!

Ah, yeah, I know it's bad enough to use 3 PSUs in a row, but where I live server PSUs are too costly, as are powerful PSUs: a 1000W PSU goes for more than $400.
wacko
Legendary
November 24, 2017, 05:13:34 PM
 #8

Ah, yeah, I know it's bad enough to use 3 PSUs in a row, but where I live server PSUs are too costly, as are powerful PSUs: a 1000W PSU goes for more than $400.
The cards you're using are not cheap either. I'd understand if it's a temporary solution for you until "proper" PSUs arrive, but I wouldn't suggest that you keep using these 750W units for long.
fanatic26
Hero Member
November 24, 2017, 05:20:52 PM
 #9

You can get a server PSU that runs that whole rig for under $200.


P.S. Those add2psu things are a total waste of money. Just make sure you turn on the GPU-only PSUs first, then the main one, and you won't have any problems. Also, you don't need to reset all 3 PSUs to restart the machine, just the one connected to the motherboard.
