Bitcoin Forum
Author Topic: S17 Pro Issues  (Read 96 times)
thebeardsman (OP)
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
January 30, 2022, 04:20:38 PM
 #1

Hello,

First post here but I've been lurking for a while.

I have one uppity S17 Pro that seems to be highly sensitive to low temperatures, much more so than the others. The main symptom is when incoming air gets cold (like 35F or below), it will essentially go into a reboot loop. It will boot up and start hashing for ~30-45 seconds, then it appears to lose all 3 boards and restart itself. Sometimes I can coax it back online but it's been gradually getting worse over the last couple months. Then, just a few days ago, I started seeing intermittent, slightly erratic chip temp readings from one board. This device's stability is now so low I'm at the point where I want to send it out for repair, but I thought I'd throw this out here in case it turns out to be something I can diag/repair myself.

For comparison, the other miners will run happily until incoming air gets into the teens (F). Then, they will typically reboot once and run happily again for 1-6 hours before they do it again, if they do it again. Yes, I'm sure you'll tell me that's bad for them, and I normally modulate the incoming cooling air temp but there's some diagnostic value in knowing that difference exists.

I'm running Braiins OS+ and just installed the new 21.12.1 release, which didn't improve anything. Changing the power settings doesn't affect anything either. If I look at the log, I see lots of "TX fifo on hashboard (n) is empty" where (n) is 1, 2, or 3. I can post more of the log if you'd like to read it. There's no mention of temp sensors or anything else. The fact that the "TX fifo" errors occur simultaneously on all three hashboards has me thinking it might be a control board issue?
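
For anyone who wants to check how evenly those "TX fifo" messages are spread across the boards, here is a minimal Python sketch that tallies them per hashboard. It assumes the message appears in the log exactly as quoted above with a digit in place of (n); the sample lines are made up for illustration and are not taken from the real log.

```python
import re
from collections import Counter

# Assumption: the log contains the message exactly as quoted in the post,
# with the hashboard number in place of "(n)".
TX_FIFO_RE = re.compile(r"TX fifo on hashboard (\d+) is empty")


def tx_fifo_counts(log_text):
    """Count "TX fifo ... is empty" messages per hashboard number."""
    return Counter(int(n) for n in TX_FIFO_RE.findall(log_text))


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = "\n".join([
    "TX fifo on hashboard 1 is empty",
    "TX fifo on hashboard 2 is empty",
    "TX fifo on hashboard 3 is empty",
    "TX fifo on hashboard 1 is empty",
])

counts = tx_fifo_counts(SAMPLE_LOG)
for board, count in sorted(counts.items()):
    print(f"hashboard {board}: {count} TX fifo messages")
```

Roughly equal counts on all three boards would be consistent with a shared cause (control board, cabling, power), while one board dominating would point at that board instead. Either way it is only a hint, not a diagnosis.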

Thank you for your time.
thebeardsman (OP)
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
January 30, 2022, 05:05:02 PM
 #2

Update:

I turned it on long enough to get a log and this time it's reporting temp sensor errors from two boards:

Sun Jan 30 09:23:28 2022 daemon.err bosminer[1632]: Jan 30 16:23:28.484 ERROR bosminer_hal::sensor: Sensor hb3.11[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 11 )

<snip>

Sun Jan 30 09:23:29 2022 daemon.err bosminer[1632]: Jan 30 16:23:29.918 ERROR bosminer_hal::sensor: Sensor hb2.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 8 )

<snip>

Sun Jan 30 09:23:30 2022 daemon.err bosminer[1632]: Jan 30 16:23:30.020 ERROR bosminer_hal::sensor: Sensor hb2.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 36 )

<snip>

Sun Jan 30 09:25:03 2022 daemon.err bosminer[1632]: Jan 30 16:25:03.696 ERROR bosminer_hal::sensor: Sensor hb3.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 8 )
Sun Jan 30 09:25:03 2022 daemon.err bosminer[1632]: Jan 30 16:25:03.797 ERROR bosminer_hal::sensor: Sensor hb3.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 36 )

Assuming I'm correct in interpreting the sensor IDs, looks like there are 2x sensors on hashboard 2, and 3x on hashboard 3 that aren't communicating.
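
The grouping above can be checked with a short Python sketch. It relies on the same assumption made in the post, namely that a sensor ID like hb3.11 means hashboard 3, chip 11. The excerpt holds the five error lines from the log, trimmed to the part that carries the sensor ID.

```python
import re
from collections import defaultdict

# Pulls the board and chip numbers out of lines such as
# "... Sensor hb3.11[ii_hwmon::tmp451::TMP451]: read failed: ..."
# Assumption: "hbN.M" means hashboard N, chip index M.
SENSOR_RE = re.compile(r"Sensor hb(\d+)\.(\d+)\[[^\]]*\]: read failed")


def failed_sensors(log_text):
    """Return {hashboard: sorted chip indices whose sensor reads failed}."""
    boards = defaultdict(set)
    for match in SENSOR_RE.finditer(log_text):
        boards[int(match.group(1))].add(int(match.group(2)))
    return {board: sorted(chips) for board, chips in sorted(boards.items())}


# The five error lines from the log, with timestamps and the tail trimmed.
LOG_EXCERPT = """\
ERROR bosminer_hal::sensor: Sensor hb3.11[ii_hwmon::tmp451::TMP451]: read failed: I2C error
ERROR bosminer_hal::sensor: Sensor hb2.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error
ERROR bosminer_hal::sensor: Sensor hb2.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error
ERROR bosminer_hal::sensor: Sensor hb3.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error
ERROR bosminer_hal::sensor: Sensor hb3.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error
"""

print(failed_sensors(LOG_EXCERPT))
# prints {2: [8, 36], 3: [8, 11, 36]}
```

That gives two failing sensors on hashboard 2 and three on hashboard 3, which agrees with the reading above.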

Further assuming there's nothing to do about this myself if I don't trust my soldering skills in this context?
tangy_t
Newbie
*
Offline Offline

Activity: 27
Merit: 0


View Profile
January 30, 2022, 05:53:35 PM
 #3

I had a similar issue before my board went down. Before I pulled the board and sent it out for repair, when I saw this error I shut the machine down for 10 minutes and moved my intake to a mix of warm room air and cold outside air. That brought the board back for another week or so. Also try the Vnish firmware; it might bring it back completely.
wndsnb
Hero Member
*****
Offline Offline

Activity: 544
Merit: 589


View Profile
January 30, 2022, 07:38:31 PM
 #4

I believe the problem is flaky solder connections; these miners are plagued with them. I have found the problems tend to show up more often when cold. When you warm the miner up, thermal expansion can close the flaky connection enough for it to run. You may be able to limp along for a while, but in my experience it is just a matter of time before it starts failing solidly and won't come up even when warm.

The same fault can cause the temperature sensor errors. Sometimes every sensor downstream of the bad connection starts having communication errors.

Have some dead Bitmain 17 series hashboards or full miners?
I'll buy them ... send me a PM with what you have and I'll make you an offer!
thebeardsman (OP)
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
January 30, 2022, 07:51:42 PM
 #5

Quote from: wndsnb on January 30, 2022, 07:38:31 PM
I believe the problem is flaky solder connections; these miners are plagued with them. I have found the problems tend to show up more often when cold. When you warm the miner up, thermal expansion can close the flaky connection enough for it to run. You may be able to limp along for a while, but in my experience it is just a matter of time before it starts failing solidly and won't come up even when warm.

The same fault can cause the temperature sensor errors. Sometimes every sensor downstream of the bad connection starts having communication errors.

That would fit with my symptoms. I think I may have reached that point of no return today. Couldn't get it to stay hashing on all 3 boards, even with the ambient temp up in the 50F range, which is a new low for this machine.

I disabled the 2 hashboards that were reporting sensor issues and so far it's been running fine on the remaining board. I won't know if it's significantly more stable until it has a chance to run overnight, but I've accepted I will need repairs.

I see you're in the market for 17-series components, I don't suppose you offer repair services?
BitMaxz
Legendary
*
Offline Offline

Activity: 3262
Merit: 2974


Block halving is coming.


View Profile WWW
January 30, 2022, 11:11:03 PM
 #6

Would you mind switching it back to the stock firmware? The logs above are from Braiins OS; if you run it again on the original firmware, the temperature issue may show up there too.
Then post the kernel logs here, because I find the Braiins OS logs a bit confusing compared to the ones from the original firmware.

Also, I think Braiins OS is still detecting those temperature sensors. You can only turn that off by configuring your miner to disable temperature sensor scanning (--no-sensor-scan).
I don't know how to apply it on the S17, but try posting in the thread below and asking how to disable temperature sensor scanning.

- https://bitcointalk.org/index.php?topic=5036844.0

If you are looking for someone who can repair this, check https://www.zeusbtc.com/Repair.asp and ask them whether they have a repair shop near you.

Artemis3
Legendary
*
Offline Offline

Activity: 2030
Merit: 1563


CLEAN non GPL infringing code made in Rust lang


View Profile WWW
February 04, 2022, 12:43:30 PM
 #7

Remember that the Target Temperature is also the pre-heat value (temperature control MUST be set to Auto).

If you are starting a miner from cold in winter, it may help to pre-heat it yourself with a hair dryer or the output from another miner first.
