Bitcoin Forum
Author Topic: XP Error  (Read 204 times)
Tsub (OP)
Member
**
Offline Offline

Activity: 65
Merit: 10


View Profile
September 24, 2022, 03:37:39 AM
 #1

I am troubleshooting an issue with 2 of my XPs.

What is weird is that usually, if you have a hashboard issue, you can unplug the bad board and the S19 will run on the other 2. With an XP, when you unplug the board, it throws a different error.

Here is the part that throws the error.  I can post the full log if you really need it.

Any insight / best guesses / etc  would be appreciated.

2022-09-24 02:27:49 Initializing the power, please wait, this may take up to 2 minutes...
2022-09-24 02:28:19 Slept 30 seconds, diff = 2.
2022-09-24 02:28:29 set_voltage_by_steps to 1500.
2022-09-24 02:28:31 bad chain id = 3
2022-09-24 02:28:35 start up min temp by 75a = 45
2022-09-24 02:28:37 Chain[0]: find 110 asic, times 0
2022-09-24 02:28:39 Chain[1]: find 110 asic, times 0
2022-09-24 02:28:41 Chain[2]: find 3 asic, times 0
2022-09-24 02:28:43 Chain[2]: find 3 asic, times 1
2022-09-24 02:28:45 Chain[2]: find 3 asic, times 2
2022-09-24 02:28:45 Chain 2 only find 3 asic, will power off hash board 2
2022-09-24 02:28:45 ERROR_SOC_INIT: soc init failed!
2022-09-24 02:28:45 stop_mining: soc init failed!
2022-09-24 02:28:45 uninit_temp_info
2022-09-24 02:28:45 do not read temp anymore...
2022-09-24 02:28:45 cancel thread
2022-09-24 02:28:45 ****power off hashboard****
2022-09-24 02:28:45 power off
2022-09-24 02:28:46 temp monitor thread exit
2022-09-24 02:28:56 Version num 65536
2022-09-24 02:28:56 Mask num 0x1fffe000
2022-09-24 02:28:56 Note: addrInterval or corenum is not initialized.
mikeywith
Legendary
*
Offline Offline

Activity: 2212
Merit: 6366


be constructive or S.T.F.U


View Profile
September 24, 2022, 05:55:53 PM
 #2

With these 3 lines

Code:
2022-09-24 02:28:41 Chain[2]: find 3 asic, times 0
2022-09-24 02:28:43 Chain[2]: find 3 asic, times 1
2022-09-24 02:28:45 Chain[2]: find 3 asic, times 2

The software insists that it can still see Chain[2] and that chain 2 has missing chips, so until those lines go away it will be hard to troubleshoot any further. Can you try swapping the data cables between the hashboards and see if this part of the kernel log changes?

Also, please use code tags when posting a kernel log.
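
For anyone who wants to check a long kernel log for this pattern without reading it line by line, here is a minimal Python sketch. It only relies on the "Chain[N]: find M asic, times K" wording shown above. The expected count of 110 is an assumption taken from the healthy chains in this particular log, and the function names are made up for illustration.

```python
import re

# Matches lines such as "2022-09-24 02:28:41 Chain[2]: find 3 asic, times 0".
FIND_RE = re.compile(r"Chain\[(\d+)\]: find (\d+) asic, times (\d+)")

# Assumption: a healthy board reports 110 ASICs, as Chain[0] and Chain[1]
# do in the log posted in this thread. Other models will differ.
EXPECTED_ASICS = 110


def asic_counts(log_text):
    """Return {chain_id: ASIC count from the last detection attempt}."""
    counts = {}
    for line in log_text.splitlines():
        match = FIND_RE.search(line)
        if match:
            chain, found, _attempt = (int(g) for g in match.groups())
            counts[chain] = found  # later retries overwrite earlier ones
    return counts


def short_chains(log_text, expected=EXPECTED_ASICS):
    """Return only the chains that report fewer ASICs than expected."""
    return {c: n for c, n in asic_counts(log_text).items() if n < expected}


SAMPLE_LOG = """\
2022-09-24 02:28:37 Chain[0]: find 110 asic, times 0
2022-09-24 02:28:39 Chain[1]: find 110 asic, times 0
2022-09-24 02:28:41 Chain[2]: find 3 asic, times 0
2022-09-24 02:28:43 Chain[2]: find 3 asic, times 1
2022-09-24 02:28:45 Chain[2]: find 3 asic, times 2
"""

if __name__ == "__main__":
    print(short_chains(SAMPLE_LOG))  # {2: 3}
```

Running it before and after a cable swap shows whether the short chain follows the cable or stays with the board.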

Tsub (OP)
September 24, 2022, 07:03:04 PM
 #3

Thanks - I swapped cables / hashboards from another XP on Chain 2 (one of the 3 hashboards) and it cleared the error. 

I have another one with the same error on Chain 0 so I know what to do now.

I will need to send these hashboards off for testing / repair but at least I have a whole rig running.  Thanks for the suggestion!
BitMaxz
Legendary
*
Offline Offline

Activity: 3234
Merit: 2955


Block halving is coming.


View Profile WWW
September 24, 2022, 11:05:16 PM
 #4

Sometimes this kind of issue is caused by a fallen heatsink. If you can reattach it with thermal glue, the hashboard will work again.

If you inspect the hashboard and find no loose heatsink, you can try to repair it on your own with a hot air blower or hair dryer. Point it at the ASIC you suspect is busted or has cold solder joints; check the logs to find it. Yours only detects 3 ASICs, so the problem should be near ASIC #3. Use the hashboard diagram in the repair guide linked below to find where ASIC #3 is located, point your blower at the next chip, #4, then test again.

- https://www.zeusbtc.com/manuals/Antminer-S19-Hash-Board-Repair-Guide.asp

If that doesn't fix your issue, you will need to hire a technician. That site also has a list of repair centers: just click "ASIC miner repair" at the top of the page.

mikeywith
September 25, 2022, 10:50:58 AM
Merited by BitMaxz (1)
 #5


Great news! Did you actually swap in a hashboard from another miner? It was not possible to mix and match on the previous generation; hopefully the XP works differently.

@Bitmaxz, this is not possible with the new-gen miners. The heatsink on the front of the chips is one large block attached with solder/thermal and screws, so there is no way it could fall off like that.

Tsub (OP)
September 25, 2022, 02:55:27 PM
 #6


Yes, I moved the hashboard from another XP.  That other one is having a different issue.  Waiting for parts on it.

Feels like I will be sending that one off for repair with 2 broken hashboards and a broken PSU!
BitMaxz
September 25, 2022, 11:02:02 PM
Merited by mikeywith (2)
 #7

If it's different from the old S19 units, then he can just point the blower or hot air at the front of the heatsink near that chip to give it a bit of heat, because some parts there are not fully soldered and some have cold solder joints. Heating them can melt the solder and give better contact between the chips and the board.

mikeywith
September 26, 2022, 12:00:02 AM
 #8

You need to understand that solder issues and heatsink issues are completely different, and they result in different logs regardless of the miner model. If there is a solder issue where the chip is not in contact with the hashboard, it will cause a board failure such as missing ASICs or no ASICs at all.

On the other hand, a loose heatsink will not display any error at first. In fact, the miner will read all ASICs, run just fine for a few seconds, and then stop after throwing a temp error. Many people (myself included) thought the issue with the 17 series was heatsinks losing contact with the chip, but no, it turns out that it's the actual chip. You can confirm that yourself by taking off a few heatsinks: the miner will still get a full ASIC count and mine for a while, and the chips without heatsinks will without a doubt pass the max temp in a few seconds, at which point the miner displays "over-temp on chain x" and powers off.
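
The distinction above can be sketched as a small log classifier in Python. This is only an illustration under stated assumptions: the "find N asic" pattern comes from the log posted in this thread, but the over-temp pattern is a guess based on the wording quoted above and real firmware output may differ. The expected count of 110 and all names here are likewise hypothetical.

```python
import re

# Pattern taken from the kernel log in this thread.
FIND_RE = re.compile(r"Chain\[(\d+)\]: find (\d+) asic")
# Assumption: the over-temp line resembles the wording quoted above;
# adjust this pattern to whatever your firmware actually prints.
OVERTEMP_RE = re.compile(r"over-temp on chain[ \[]*(\d+)", re.IGNORECASE)

MISSING_ASICS = "missing ASICs: suspect chip, solder, or data connection"
OVER_TEMP = "over-temp: suspect heatsink contact or cooling"


def classify(log_text, expected=110):
    """Map each affected chain to the symptom family seen in the log.

    expected=110 is an assumption based on the healthy chains in this thread.
    """
    findings = {}
    for line in log_text.splitlines():
        found = FIND_RE.search(line)
        if found and int(found.group(2)) < expected:
            # A short ASIC count takes priority for that chain.
            findings[int(found.group(1))] = MISSING_ASICS
            continue
        hot = OVERTEMP_RE.search(line)
        if hot:
            # Only record over-temp if nothing was already noted for the chain.
            findings.setdefault(int(hot.group(1)), OVER_TEMP)
    return findings


if __name__ == "__main__":
    demo = "Chain[2]: find 3 asic, times 2\nover-temp on chain 1\n"
    for chain, symptom in sorted(classify(demo).items()):
        print(chain, symptom)
```

A chain that shows a full ASIC count and only later trips the temperature pattern lands in the second bucket, which matches the heatsink scenario described above.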

Tsub (OP)
September 26, 2022, 12:08:44 AM
 #9

The interesting thing about the XPs is that instead of 50 small heatsinks, they use one giant heatsink attached with screws.