Bitcoin Forum
Topic: S19JPro Faulty Server Hashboard issue at startup  (Read 111 times)
Athematica88 (OP)
Newbie

Activity: 1
Merit: 0


May 14, 2023, 01:18:24 PM
 #1

Hello!

I recently received from a Bitmain reseller the 6 S19j Pro 104 TH units I ordered a couple of weeks ago. I configured the servers one by one and added them to my farm gradually. One of them started to behave abnormally, and I then noticed that its hashboards went offline after a few reboots.

Please note that the server is still under warranty.

The other 5 S19j Pro servers are mining fine; only the one mentioned above has the problem.
Please find the log file for the faulty server below:
2023-05-14 12:34:52 ==========================capability start==========================
2023-05-14 12:34:52 board num = 3
2023-05-14 12:34:52 board id = 0, chain num = 1
2023-05-14 12:34:52    chain id = 0
2023-05-14 12:34:52 board id = 1, chain num = 1
2023-05-14 12:34:52    chain id = 1
2023-05-14 12:34:52 board id = 2, chain num = 1
2023-05-14 12:34:52    chain id = 2
2023-05-14 12:34:52 ==========================capability end============================
2023-05-14 12:34:52 chain num = 3
2023-05-14 12:34:52 skip loading levels for now
2023-05-14 12:34:53 load chain 0 eeprom data
2023-05-14 12:34:54 load chain 1 eeprom data
2023-05-14 12:34:54 load chain 2 eeprom data
2023-05-14 12:34:54 i2c_sim_init start
2023-05-14 12:34:54 init gpio477
2023-05-14 12:34:54 init gpio476
2023-05-14 12:34:54 i2c_sim_init end
2023-05-14 12:34:56 power open power_version = 0x75
2023-05-14 12:34:58 power is not Calibrated
2023-05-14 12:35:00 miner type: Antminer S19j Pro
2023-05-14 12:35:00 multi machine mode
2023-05-14 12:35:00 load machine BHB42603 conf
2023-05-14 12:35:00 machine : BHB42603
2023-05-14 12:35:00 chain_num 4, chain_domain_num 42, chain_asic_num 126, domain_asic_num 3
2023-05-14 12:35:00 fan_eft : 0  fan_pwm : 100
2023-05-14 12:35:00 create thread get_nonce_and_register_thread
2023-05-14 12:35:00 fixed working voltage = 1360
2023-05-14 12:35:00 min freq in eeprom = 545
2023-05-14 12:35:00 fixed frequency is 545
2023-05-14 12:35:00 Chain [0] PCB Version: 0x0170
2023-05-14 12:35:00 Chain [0] BOM Version: 0x0010
2023-05-14 12:35:00 Chain [1] PCB Version: 0x0170
2023-05-14 12:35:00 Chain [1] BOM Version: 0x0010
2023-05-14 12:35:00 Chain [2] PCB Version: 0x0170
2023-05-14 12:35:00 Chain [2] BOM Version: 0x0010
2023-05-14 12:35:00 bad chain id = 3
2023-05-14 12:35:00 Fan check passed.
2023-05-14 12:35:00 uart_trans addr:0xf704d000.
2023-05-14 12:35:00 max sensor num = 4
2023-05-14 12:35:00 STATUS_INITED: soc init done!
2023-05-14 12:35:00 temperature_monitor_thread start...
2023-05-14 12:35:02 start to init...
2023-05-14 12:35:04 power type version: 0x0075
2023-05-14 12:35:05 disable power watchdog: 0x0000
2023-05-14 12:35:06 Enter sleep to make sure power release finish.
2023-05-14 12:36:02 Slept 55 seconds, diff = 5.
2023-05-14 12:36:06 set_voltage_by_steps to 1500.
2023-05-14 12:36:15 start up min temp by 75a = 24
2023-05-14 12:36:18 Chain[0]: find 126 asic, times 0
2023-05-14 12:36:19 !!! reg crc error
2023-05-14 12:36:20 Chain[1]: find 124 asic, times 0
2023-05-14 12:36:21 !!! reg crc error
2023-05-14 12:36:22 Chain[1]: find 124 asic, times 1
2023-05-14 12:36:23 !!! reg crc error
2023-05-14 12:36:24 Chain[1]: find 124 asic, times 2
2023-05-14 12:36:24 Chain 1 only find 124 asic, will power off hash board 1
2023-05-14 12:36:26 Chain[2]: find 126 asic, times 0
2023-05-14 12:36:26 ERROR_SOC_INIT: soc init failed!
2023-05-14 12:36:26 stop_mining: soc init failed!
2023-05-14 12:36:26 uninit_temp_info
2023-05-14 12:36:26 do not read temp anymore...
2023-05-14 12:36:26 cancel thread
2023-05-14 12:36:26 ****power off hashboard****
2023-05-14 12:36:27 temp monitor thread exit
2023-05-14 12:37:19 Version num 65536
2023-05-14 12:37:19 Mask num 0x1fffe000
2023-05-14 12:37:19 opt_multi_version = 65536, interval timeout = 2837042
2023-05-14 12:37:19 freq = 545, percent = 90, hcn = 4809, timeout = 2837042
Could you please help with the case described?
Many thanks in advance.
philipma1957
Legendary

Activity: 4116
Merit: 7856


'The right to privacy matters'


May 14, 2023, 01:45:56 PM
 #2

One board is bad.
Return the whole machine for a replacement machine,


or detach the bad board and ask the seller for a 33% discount on that one machine.


BitMaxz
Legendary

Activity: 3248
Merit: 2965


Block halving is coming.


May 14, 2023, 02:10:10 PM
Merited by philipma1957 (2)
 #3

Chain 1, the middle board, only found 124 ASICs. Bitmain firmware has a restriction on faulty ASICs, so it turns that board off and lets the other two hashboards run normally.

Try flashing it with stock firmware first; sometimes that solves this problem, but if not, replacement is the best solution.

I think you can run it with Braiins and let Braiins auto-tune your unit so that the middle hashboard can mine even with 2 bad ASICs. Do that only if the reseller won't replace your unit.

mikeywith
Legendary

Activity: 2226
Merit: 6367


be constructive or S.T.F.U


May 19, 2023, 03:27:20 AM
 #4

Since you bought a few of those machines, it's probably about time to learn how to read the kernel log to spot the potential errors.

Code:
2023-05-14 12:36:18 Chain[0]: find 126 asic, times 0
2023-05-14 12:36:19 !!! reg crc error
2023-05-14 12:36:20 Chain[1]: find 124 asic, times 0
2023-05-14 12:36:21 !!! reg crc error
2023-05-14 12:36:22 Chain[1]: find 124 asic, times 1
2023-05-14 12:36:23 !!! reg crc error
2023-05-14 12:36:24 Chain[1]: find 124 asic, times 2
2023-05-14 12:36:24 Chain 1 only find 124 asic, will power off hash board 1
2023-05-14 12:36:26 Chain[2]: find 126 asic, times 0

The S19j Pro comes with 126 chips on each hashboard; you can find this information online or by looking at a good miner's kernel log.

At first, the miner checks chain 0, which is usually the one on your left-hand side when facing the miner (you can check the numbering printed on the control board; it doesn't always have to be labeled 0, 1, 2, it could be 3, 4, 5 or 5, 6, 7, and the smallest number represents chain 0).
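
To make the numbering rule concrete, here is a tiny Python sketch. The function name `label_to_chain` is made up for illustration and is not part of any Bitmain tool; it only encodes the rule of thumb above (the smallest label is chain 0, the next one is chain 1, and so on).

```python
def label_to_chain(labels):
    """Map control-board connector labels to chain indices.

    Assumption (taken from the rule of thumb above): the smallest
    label corresponds to chain 0, the next one to chain 1, etc.
    """
    return {label: index for index, label in enumerate(sorted(labels))}


# Connectors labeled 5, 6, 7 on the control board
print(label_to_chain([5, 6, 7]))  # {5: 0, 6: 1, 7: 2}
```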

So chain 0 sees 126 ASICs, and the miner moves to chain 1, which is the middle board and the one that usually fails. It will try to send the signal up to 3 times, until one of these scenarios happens:

A- it gets a reply signal from chip 126
B- the signal drops before reaching that last chip 3 consecutive times

Then it moves to the last chain and repeats the same process.

If it finds 0 ASICs, there are a few possibilities, but since it is finding 124, it simply means that chip 124/125 has an issue, probably solder-related, which is beyond the average person's ability to fix.
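
If you have several machines to check, the same reading of the log can be scripted. The following is a rough Python sketch, not an official tool: the function names and the fallback of 126 chips are my own assumptions, and the regular expressions are based only on the `Chain[N]: find M asic, times K` and `chain_asic_num` lines seen in the log posted above.

```python
import re
from collections import defaultdict

# Matches lines such as "Chain[1]: find 124 asic, times 0"
FIND_RE = re.compile(r"Chain\[(\d+)\]: find (\d+) asic, times (\d+)")
# Matches the expected per-board chip count, e.g. "chain_asic_num 126"
EXPECTED_RE = re.compile(r"chain_asic_num (\d+)")


def summarize_chains(log_text, default_expected=126):
    """Return (expected_count, {chain: [count found on each attempt]})."""
    match = EXPECTED_RE.search(log_text)
    # Fall back to 126 (S19j Pro) if the log does not state the count
    expected = int(match.group(1)) if match else default_expected
    attempts = defaultdict(list)
    for chain, count, _times in FIND_RE.findall(log_text):
        attempts[int(chain)].append(int(count))
    return expected, dict(attempts)


def bad_chains(log_text):
    """Return the chains whose last attempt found fewer chips than expected."""
    expected, attempts = summarize_chains(log_text)
    return {c: counts for c, counts in attempts.items() if counts[-1] < expected}


SAMPLE = """\
2023-05-14 12:35:00 chain_num 4, chain_domain_num 42, chain_asic_num 126, domain_asic_num 3
2023-05-14 12:36:18 Chain[0]: find 126 asic, times 0
2023-05-14 12:36:19 !!! reg crc error
2023-05-14 12:36:20 Chain[1]: find 124 asic, times 0
2023-05-14 12:36:21 !!! reg crc error
2023-05-14 12:36:22 Chain[1]: find 124 asic, times 1
2023-05-14 12:36:23 !!! reg crc error
2023-05-14 12:36:24 Chain[1]: find 124 asic, times 2
2023-05-14 12:36:26 Chain[2]: find 126 asic, times 0
"""

print(bad_chains(SAMPLE))  # {1: [124, 124, 124]}
```

Run on the excerpt from this thread, it flags chain 1 with 124 chips on all three attempts, which matches the manual reading above.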

So what to do now?

1- Perform a reboot and see if that helps
2- Flash ONLY stock firmware from Bitmain's website

If those two don't help, you should contact the seller and request a refund/replacement. If they offer you 20-30% back, take it; if they ask you to send only the bad hashboard for repair/replacement, go for it. Try to refuse the option of returning the whole unit, since that could take months.

Try to negotiate with the seller to get the best deal out of this mess. Tell them you can send video proof showing that it's indeed the unit they sold you; it would be even better to do it via a video call where you follow their instructions.

Do NOT attempt to flash any custom firmware unless the seller approves it, because they will use that against you and won't grant you any warranty.
