Bitcoin Forum
Author Topic: Antminer D3 reports trouble reading PIC temperature's  (Read 10300 times)
tperalta82
Full Member
***
Offline Offline

Activity: 174
Merit: 100


View Profile
September 27, 2017, 05:18:56 PM
 #21

Great input,

I also tested with 7 different PSUs, with the same results. As for the controller, I switched the boards around on the controller, and in my case the same board shows errors across the controller ports.

One thing I noted is that whenever the pool stops responding, those errors come up, and when the miner starts receiving jobs again, everything goes back to normal (except the HW Errors count, of course). This has happened 100% of the time so far.
So I'd say it's a software bug mixed with hardware issues.
Also, when the pool stops responding, the fans spin up to their max (bad software design, I'd say).
I also tried underclocking to below 15GH/s, and the same issues happen.
As for the networking part, well, I have had other things running on the network for months now with no issues... so I ruled that one out.

Let's wait and see what Bitmain does about this.

Hi Guys,

Just a quick follow-up: I've had a chance to do some more testing with 5 D3s and did the following:
1) Tested with a single APW3++ PSU on a 230V mains source
2) Tested with a 3600 Watts single rail PSU (Lab testbench PSU, so not a converted server PSU or PC PSU)
3) Tested with a power conditioner (used in high end audio setups)
4) Flashed the latest available firmware from: https://s3.cn-north-1.amazonaws.com.cn/shop-bitmain/download/Antminer-D3-201709131713-0M.tar.gz
5) Hard wire connection to a Cisco 24Port managed switch (tested on different ports with different cables)
6) Set the mining pool to Antpool
7) Inspected the hashboards of a single D3 for damaged solder points, power lanes, etc., and loose/missing heatsinks -> All looked fine.

Results:
- All of the D3s showed the random red LED error warning flash.
- All of the D3s had the random "read_temp_func: can't read all sensor's temperature, close PIC and need reboot!!!" message in the kernel logs (the red LED is in sync with this message).
- All of the D3s had the occasional "all x'es" on 1 or 2 hashboards, returning to normal after a "reboot".

This seems to be what most of the contributors to this thread experienced as well.
- So it seems that it's not related to the stability of the power supply used.
- It seems unlikely that each D3 has 1 or 2 malfunctioning hashboards (at least in the case of the posters in this thread who have several D3s that exhibit the exact same behavior), and I assume that Bitmain would notice this with their Quality Assurance tests.

That kind of leaves me with:
- Software bugs
- Controller (board) bugs

What I still want to test:
- Is the behavior the same when disconnecting 1 or 2 hashboards?

I'm hoping that Bitmain is able to sort this out with a firmware update (if it's indeed a software problem). However, they might not be inclined to do so, since the rapid increase in Dash difficulty might make this an uninteresting investment.


BossmanPL
Newbie
*
Offline Offline

Activity: 37
Merit: 0


View Profile
September 30, 2017, 08:26:34 PM
 #22

I am also having this issue. Do I have to upload the new firmware, packed as tar.gz, via the web GUI? Has this new firmware helped others?
tperalta82
Full Member
***
Offline Offline

Activity: 174
Merit: 100


View Profile
October 01, 2017, 01:39:10 AM
 #23

Quote from: BossmanPL on September 30, 2017, 08:26:34 PM
I am also having this issue. Do I have to upload the new firmware, packed as tar.gz, via the web GUI? Has this new firmware helped others?

Mine are much more stable lately. And yes, you have to upload it via the web GUI, and the firmware is packed as tar.gz.
JohnBitCo
Sr. Member
****
Offline Offline

Activity: 2030
Merit: 356


View Profile
October 01, 2017, 12:30:30 PM
 #24

The software is looking at the wrong pointer for the info from the hardware. You need to supply the proper info for the software, probably in the config file, and then restart the whole thing; that will solve the problem. It is not a big deal and only takes a minute or two.


tperalta82
Full Member
***
Offline Offline

Activity: 174
Merit: 100


View Profile
October 01, 2017, 08:29:19 PM
 #25

Quote from: JohnBitCo on October 01, 2017, 12:30:30 PM
The software is looking at the wrong pointer for the info from the hardware. You need to supply the proper info for the software, probably in the config file, and then restart the whole thing; that will solve the problem. It is not a big deal and only takes a minute or two.




The data is hardcoded into the miner when it's built. You can probably look at ckolivas's cgminer and check the Antminer headers there (although the D3 is not there).
So the software is not getting wrong pointers, it's just badly made: whenever there is a connection error to one of the pools, the PIC temperature error is thrown, because the ASICs are kind of reset or something, and even the fans spin up for no reason. Bad coding.

The same thing happens if there is no work from the pool (which can happen).

Just check the kernel log and you'll find this pattern.
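
A minimal Python sketch of that kernel-log check, for anyone who wants to count how often the PIC error follows pool trouble. It assumes the kernel log has been copied out of the miner as plain text. The error string is the one quoted in this thread; POOL_HINTS and the 5-line window are made-up illustration values, not something taken from the D3 firmware, so adjust them to whatever your own log actually prints.

```python
# Error text as quoted in this thread's kernel logs.
PIC_ERROR = "read_temp_func: can't read all sensor's temperature"

# Hypothetical substrings that mark pool trouble; real log wording may differ.
POOL_HINTS = ("pool", "stratum")


def pic_errors_near_pool_events(log_text, window=5):
    """Return (total PIC errors, PIC errors preceded by a pool-related line).

    A PIC error counts as "near" a pool event when any of the `window`
    lines directly before it contains one of POOL_HINTS.
    """
    lines = log_text.splitlines()
    total = 0
    near_pool = 0
    for i, line in enumerate(lines):
        if PIC_ERROR not in line:
            continue
        total += 1
        context = lines[max(0, i - window):i]
        if any(hint in prev.lower() for prev in context for hint in POOL_HINTS):
            near_pool += 1
    return total, near_pool
```

If nearly all PIC errors come right after a pool-related line, that would fit the pattern described above; if they do not, the cause is probably somewhere else.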
seizu
Newbie
*
Offline Offline

Activity: 1
Merit: 0


View Profile
October 04, 2017, 10:28:56 PM
 #26

I think it's just an overclock issue. We had the same issues; after we reduced the clock speed to 437, all problems were gone. No xxxooxxxx and no HW errors for 3 days.
Now we are hashing at 15.8GH/s, which is the hash rate that was originally announced by Bitmain.
Conclusion: Bitmain has just overclocked the D3 and labeled it as 17GH/s without testing it. And that is really annoying.
therealfalcon
Member
**
Offline Offline

Activity: 113
Merit: 10


View Profile
October 08, 2017, 07:08:27 PM
 #27

Having the same issues. Any solution for this?

majorlee
Full Member
***
Offline Offline

Activity: 134
Merit: 100

First DJ to play gigs for Bitcoin & Crypto Guru


View Profile WWW
October 08, 2017, 07:27:26 PM
 #28

So what's the roundup on the D3?

Is it performing as it should?

Is the quality flagging on some units?

I'm asking as I'm trying to get hold of one. Anyone selling in the UK?

coldlexm
Newbie
*
Offline Offline

Activity: 2
Merit: 0


View Profile
October 09, 2017, 10:59:48 AM
Last edit: October 09, 2017, 06:05:48 PM by coldlexm
 #29

Just got two new D3s. Had the same kind of problem with them: random failures with the red LED (failure LED) blinking, fans at max, noise, and sometimes losing the network connection. Only a cold reboot helps.

Noticed that behavior when touching the device or just the network cable.

Finally managed to reproduce the failures manually on both units by messing with the network cable, but without disconnecting it. We have done some dangerous tests.

Removing the front panel solves the problem completely. It looks like the front panel sometimes makes contact with the metal RJ-45 socket on the board, causing the device to fail. Reproduced this on video.

https://youtu.be/-Kug4HwiFYY

Please donate LTC  if it helps: LKPV84ysTA3eR58oCAoLwUgLPUyuxLcgdu
Bibi187
Full Member
***
Offline Offline

Activity: 420
Merit: 106


https://steemit.com/@bibi187


View Profile WWW
October 09, 2017, 03:33:08 PM
 #30

Here is what I know so far from personal testing:

The D3 hangs at max fan speed when the pool seems to be offline or dead.

The D3 also hangs at max fan speed on a multi-coin pool when the pool takes time to submit new work, and goes back to normal once the submission is done.

These 2 issues result in the warning message about "trouble reading PIC temperature's", so in my case it is a software issue.

If you have this issue without a problem on the pool side, check the controller (unplug/replug, etc.).

I have never had a "red light issue".

The D3 lost some chips while running and had to be rebooted; this was fixed after the firmware update.

The D3 doesn't like being turned off with "shutdown -h now": it gets stuck for a long time and you have to kill it by hand.

In some cases your D3 will lose stability after this hard shutdown, and you will start to get xxxx xxxxx xxxx boards even if you reboot. Go back to factory defaults and upgrade the firmware again (even if it is the same version).

I'm running them in a low-noise setup, 43% fan speed each. You need fresh air coming in, and then it's OK.

My D3s run between 70°C and 83°C without any issue. For a CPU, 85°C as the critical point is OK; a CPU can handle more.

Sometimes one of my miners gets disconnected as if the pool were dead, but the other one keeps working. Same switch, same network. An easy fix was to SSH into this one and keep a ping to Google running. No more lost connections (the problem may come from my router port).
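
A rough Python sketch of the keepalive ping idea, for a machine where Python is available. The post above only says a ping to Google was enabled over SSH; the Linux-style ping flags, the 30-second interval and the function names here are assumptions for illustration, not what was actually run on the miner.

```python
import subprocess
import time


def ping_command(host, count=1, timeout_s=2):
    """Build a Linux-style ping command (flags may differ on other systems)."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def keepalive(host, rounds, interval_s=30, run=subprocess.call, sleep=time.sleep):
    """Ping `host` `rounds` times, pausing `interval_s` seconds between pings.

    `run` and `sleep` are injectable so the loop can be tried without
    touching the network. Returns how many pings exited with status 0.
    """
    ok = 0
    for i in range(rounds):
        if run(ping_command(host)) == 0:
            ok += 1
        if i < rounds - 1:
            sleep(interval_s)
    return ok
```

Calling keepalive("google.com", rounds=120) would send one ping roughly every 30 seconds for about an hour. Whether this really keeps the link alive depends on what was dropping it in the first place.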

blocksminer
Newbie
*
Offline Offline

Activity: 25
Merit: 0


View Profile WWW
October 10, 2017, 02:08:44 PM
 #31

bump to this
tperalta82
Full Member
***
Offline Offline

Activity: 174
Merit: 100


View Profile
October 15, 2017, 05:12:45 AM
 #32

Wrote a small script in PHP to check your D3s and reboot them automatically in case a hashboard has errors.

https://gist.github.com/tperalta82/6e11253cd4b9cf9c5c6fa15bbf046d4f

Put this on a Linux server as a cron job and change the $hosts array to add your miners.

There are 3 different methods to call reboot on the miner. Personally, the wget one works best for me, because it runs everywhere (embedded devices).
I had an issue with curl, which seems to fail to authenticate in some weird cases,
and the ssh lib is not present everywhere.

Read the code first, understand it, redistribute it, improve it, do whatever you want with it.

P.S.: it works for me
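
For readers who want the general idea without opening the link, here is a minimal Python sketch of the decision step only. It is not the PHP script from the gist. It assumes you already have each miner's per-board chip status as strings of 'o' and 'x' (the notation used in this thread); how you fetch that status and how you trigger the reboot are left to a callback, and every name below is hypothetical.

```python
def board_has_errors(chain_status):
    """True if any chip on the board reports 'x' (the failure marker)."""
    return "x" in chain_status.lower()


def hosts_needing_reboot(status_by_host):
    """Take {host: [status string per board]} and return the failing hosts, sorted."""
    return sorted(
        host
        for host, boards in status_by_host.items()
        if any(board_has_errors(board) for board in boards)
    )


def reboot_failing(status_by_host, reboot):
    """Call the supplied `reboot(host)` callback for each failing host.

    Returns the list of hosts for which the callback was invoked.
    """
    failing = hosts_needing_reboot(status_by_host)
    for host in failing:
        reboot(host)
    return failing
```

Passing the reboot method in as a callback keeps the check independent of whether wget, curl or ssh is used to reach the miner.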
npcomp
Newbie
*
Offline Offline

Activity: 7
Merit: 0


View Profile
October 26, 2017, 09:57:43 AM
 #33

I don't know what causes this to happen. I started mining on Nicehash, and this read_temp_func message kept recurring every 5-10 minutes. But when I moved to another pool, like Antpool itself, it worked fine for many hours. I'll keep monitoring and testing this situation. Hope the next firmware they release solves this.
Minerdude8
Newbie
*
Offline Offline

Activity: 33
Merit: 0


View Profile
October 26, 2017, 10:29:33 PM
 #34

I got all xxxx xxx on one of my boards too; an online reboot fixed the problem for now.
Jagrafess
Newbie
*
Offline Offline

Activity: 3
Merit: 0


View Profile
October 30, 2017, 07:42:47 PM
 #35

I got 2 D3 yesterday and they keep speeding up the fans every 5 minutes.

Every time this happens, the log says "read_temp_func: can't read all sensor's temperature, close PIC and need reboot!!!". I configured mining on nicehash and I'm wondering if this is connected to the problem?

Anyone got a grip on this issue by now to narrow it down?
Jagrafess
Newbie
*
Offline Offline

Activity: 3
Merit: 0


View Profile
October 30, 2017, 08:50:06 PM
 #36

Quote from: Jagrafess on October 30, 2017, 07:42:47 PM
I got 2 D3 yesterday and they keep speeding up the fans every 5 minutes.

Every time this happens, the log says "read_temp_func: can't read all sensor's temperature, close PIC and need reboot!!!". I configured mining on nicehash and I'm wondering if this is connected to the problem?

Anyone got a grip on this issue by now to narrow it down?

Switched to ViaBTC and the problem didn't occur again. Might be an issue with Nicehash. Since everyone keeps forgetting to send a follow-up on whether the problem was solved, please send me a PM if I forget it as well.
dzierski
Newbie
*
Offline Offline

Activity: 5
Merit: 0


View Profile
November 06, 2017, 07:51:04 PM
 #37

Same problem here :/
Shurkan
Newbie
*
Offline Offline

Activity: 27
Merit: 0


View Profile
November 06, 2017, 07:57:14 PM
 #38

I also had this problem.
The problem occurs only with the Nicehash pool.
If you use a different pool, the problem does not exist.

Shurkan
CodeSingularity (OP)
Newbie
*
Offline Offline

Activity: 9
Merit: 0


View Profile
November 08, 2017, 10:40:37 PM
 #39

Hi Guys,

There seems to be a new firmware for the D3
With the description:
1. fix the issue: some miner's chip status is 'x' after running for several hours or days.

https://s3.cn-north-1.amazonaws.com.cn/shop-bitmain/download/Antminer-D3-201711022227-0M.tar.gz

I'll test it when I can.
br1mcoin
Member
**
Offline Offline

Activity: 182
Merit: 10

br1mcoin : Savings & Wealth Creation Coin x11 Algo


View Profile
November 09, 2017, 01:40:26 AM
 #40

Quote from: CodeSingularity on November 08, 2017, 10:40:37 PM
Hi Guys,

There seems to be a new firmware for the D3
With the description:
1. fix the issue: some miner's chip status is 'x' after running for several hours or days.

https://s3.cn-north-1.amazonaws.com.cn/shop-bitmain/download/Antminer-D3-201711022227-0M.tar.gz

I'll test it when I can.

We can confirm that this is running well so far.

Do try it out.
