I'm having a problem with my rig and would appreciate any help you can provide. I will try to be as clear as possible.
I have one rig with 6x 1080 Ti running SMOS, built as listed below.
3x 750W PSU linked together (2 GPUs per PSU) - 80 Plus Bronze (yeah, I know...)
1x Biostar 250 tb pro (12 slots)
1x 8gb RAM corsair vengeance
1x CPU G3930 7 gen
1x 32gb 3.0 Pendrive with SMOS
6x risers
6x EVGA 1080ti (same model)
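For reference, here is a rough sketch of the power budget per PSU for the parts above. The 250 W per GPU is the reference TDP of a 1080 Ti (the EVGA cards may draw more or less, and an overclock changes it), the 80% sustained-load limit is only a rule of thumb, and the 100 W for the motherboard, CPU and drive is a guess rather than a measurement. All function and constant names are made up for illustration.

```python
# Rough per-PSU power budget for the rig listed above.
# Assumptions (not measured): GPU draw, the 80% rule of thumb, and the
# 100 W figure for the motherboard/CPU on the PSU that feeds them.

GPU_WATTS = 250          # assumed: reference TDP of a GTX 1080 Ti
PSU_RATED_WATTS = 750    # from the parts list
GPUS_PER_PSU = 2         # from the parts list
SUSTAINED_PERCENT = 80   # assumed rule of thumb for continuous load


def psu_load(gpus, gpu_watts=GPU_WATTS, other_watts=0):
    """Estimated load in watts on one PSU."""
    return gpus * gpu_watts + other_watts


def headroom(load_watts, rated_watts=PSU_RATED_WATTS, percent=SUSTAINED_PERCENT):
    """Watts left below the sustained-load limit (negative means over it)."""
    return rated_watts * percent // 100 - load_watts


if __name__ == "__main__":
    gpu_only = psu_load(GPUS_PER_PSU)
    with_board = psu_load(GPUS_PER_PSU, other_watts=100)  # assumed 100 W
    print("GPU-only PSU:", gpu_only, "W load,", headroom(gpu_only), "W headroom")
    print("Board PSU:   ", with_board, "W load,", headroom(with_board), "W headroom")
```

Under these assumptions the two GPU-only PSUs have about 100 W of headroom each, while the PSU that also feeds the motherboard sits right at the limit, so it may be worth checking which PSU the board and the Molex inputs are on.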
The rig worked fine for more than a month, then it started to reboot and sometimes came up with only 3, 4 or 5 GPUs. If I set the mining program up so that it could not mine (by passing it wrong parameters), the rig would start and "work as expected": error messages appeared saying the parameters were wrong, but it did not reset anymore.
With this in mind, I did a lot of checks, such as changing the risers, swapping PSUs and GPUs around, and heavily underclocking / removing the overclock. None of the tests changed anything. If I powered only 2 to 4 GPUs, it worked like a charm, which suggests the GPUs themselves are not faulty.
To bridge the PSUs I had 2x add2psu-style adapters (I later checked and they were a generic version, not the original one, fml). I saw that one was failing (or at least it looked like it was), so I removed both and did a "direct link" with a jumper until a cable similar to the Lian Li one arrived, which can link all 3 PSUs together.
In the meantime I left 4 GPUs mining with the PSUs jumpered, and there was no reset problem. I even swapped the cables, GPUs, risers and PSUs among themselves, and it worked perfectly with 4 GPUs regardless of which PSU they were connected to, as long as only 2 PSUs (any 2) were on.
The cable arrived and I turned everything on, and it worked! For the first time in a week the rig was mining with 6 GPUs. It lasted one day, until the problem returned, but differently this time: it reset a few times (5~10) and then became stable again. This happened 2 times in 2 days, and every time all 6 GPUs were recognized by the system. Before the cable change, during the reset loop, seeing 3 to 5 GPUs was normal and 6 almost never happened.
Right now I'm thinking of buying the original add2psu and testing with it. I'm aware that chaining PSUs isn't the best idea, but I had the PSUs on hand and didn't want to buy new ones, and I want to be sure that the problem is the PSU chain before buying new parts.
Important information:
- The resets occurred right after the mining process started, and I tried different miners / algos.
- I think I ruled out many components of the system, and the only "faulty" one was the "add2psu" adapter. What I wasn't able to isolate: pendrive, RAM, GPU and the mobo.
- Resets happened roughly once every 30 to 45 seconds, with "no screen session found" as the system message.
- The mobo is configured as required by the manufacturer / SMOS install guide.
- It worked perfectly for more than one month. I was even able to restart it many times in a row during tests (manually or using the SMOS dashboard) and it worked without flaws.
- The mobo has 2 Molex inputs for extra power, both connected to the PSUs. I don't know whether this can be a problem or not.
What advice / suggestions would you give me?
*Sorry for the wall of text!