Bitcoin Forum
Author Topic: Antminer R4 - dead hash board? Shitty design, read before you send it back  (Read 12354 times)
RadekG (OP)
January 22, 2017, 07:12:34 PM
Last edit: July 19, 2018, 10:42:44 AM by frodocooper
 #1

Hi, I am back with some info about R4 dead boards.

I bought an R4 from batch 4 and tried to see whether it could be tweaked a bit, but I applied changes during the warm-up phase, which killed one of the hash boards with an ASIC count=0 error.

I tried many restarts and tweaks to the Config.ini script, but without success. There is a "single-board-test" program which tests ASIC count, cores and reliability, but it also checks the firmware checksum. During my resuscitation attempts both hash boards repeatedly reported incorrect firmware in the PIC (a common problem on the S7, but the R4 re-flashes it automatically) and behaved very randomly.

Some errors collected during restarts:

PIC has not written default frequency. (automatic search did not work, sometimes it found 20 ASIC, sometimes more).
PIC voltage 790 varying up to 940
chain 7 (J8) ASIC count 0
incorrect temperature reading
sometimes it just froze

After several hours of restarting and running "single-board-test", I think each ASIC has its frequency written in the PIC and each board also has a maximum voltage set, so overclocking the board is probably not possible without cracking the PIC firmware. The crash seems to happen in I2C communication with the PIC controller when more than one hash board is connected.

TLDR:

I found a solution: unplug one board and run the miner with only ONE board until it warms up and starts hashing. After that I did the same with the second board. Both boards worked, so I plugged them in together and the miner started with both boards running like new. Check the kernel log for ASIC=63 during the warm-up phase.

I hope this helps before you decide to send the whole R4 unit back to HK.
agentcash
January 23, 2017, 12:28:01 AM
 #2

I am a bit confused as to your fix.

You ran single board #1 through warmup, then ran single board #2 through warmup, then ran both together?
philipma1957
January 23, 2017, 02:54:35 AM
Last edit: July 19, 2018, 10:43:16 AM by frodocooper
 #3

🎥🎥🎥📀📀📀My r4 just dropped a board today.  I will go to the hosting site and try your fix  on monday.  thanks.🎥🎥🎥📀📀📀

NotFuzzyWarm
January 23, 2017, 03:01:17 AM
 #4

Tried it with my dead board -- by itself, even after several hard/soft boots, the bad card never reports more than 8 chips and refuses to even try hashing. Reconnected the good board and I'm back to 4.4 TH/s on the good board and only 450 GH/s on the bad board.

Looking at the log in several places it reports the asic number is stored in the PIC and then jumps over several tests both for the good board and the bad one. Wondering if resetting or (god forbid) reflashing the miner might reset that info stored in the PIC...

Really hoping that the other R4 B6 arriving on Monday does better!

- For bitcoin to succeed the community must police itself -    My info useful? Donations welcome!  3NtFuzyWREGoDHWeMczeJzxFZpiLAFJXYr
 -Sole remaining active Primary developer of cgminer, Kano's repo is here
-Support Sidehacks miner development. Donations to:   1BURGERAXHH6Yi6LRybRJK7ybEm5m5HwTr
GWhisper
January 23, 2017, 05:01:43 AM
 #5

Really hoping that the other R4 B6 arriving on Monday does better!

I'm wishing you luck - I've got 2 x B6 and both have dropped a board. One was immediate, the other 2 days later.
Finksy
January 23, 2017, 07:02:55 AM
 #6

Really hoping that the other R4 B6 arriving on Monday does better!

I can't help but think of Einstein's misattributed quote about insanity when I read this.  I wish you the best; it seems like a lot of people are seeing high failure rates with these, with threads and posts popping up daily. I for one am glad I kept away from them despite some serious separation-ASICxiety.

IBM 2880W PSU Packages: https://bitcointalk.org/index.php?topic=966135 IBM 4K PSU Breakout Boards & Packages: https://bitcointalk.org/index.php?topic=1308296 
Server PSU-powered GPU rig solutions! https://bitcointalk.org/index.php?topic=1864539  Wallet address: 1GWQYCv22cAikgTgT1zFuAmsJ9fFqq9TXf 
RadekG (OP)
January 24, 2017, 09:24:18 PM
 #7

I am a bit confused as to your fix.

You ran single board #1 through warmup, then ran single board #2 through warmup, then ran both together?

Sorry if my explanation is too complicated.

1) Turn off the miner.
2) Unplug the working board.
3) Start the miner and let it run until it hashes.
4) If you are lucky, it will recover the bad board.
5) Turn off the miner.
6) Plug the good board back in.
7) Start the miner and let it run until it hashes.
8) If you are lucky, you now have both boards working.
9) If you have bad luck, try again. You can also try starting the "single-board-test" program via PuTTY several times.

NEVER stop the miner or apply changes before it starts hashing. Doing that killed my board, but I recovered it with the steps above.
RadekG (OP)
January 24, 2017, 09:32:51 PM
Last edit: July 19, 2018, 10:44:11 AM by frodocooper
 #8

🎥🎥🎥📀📀📀My r4 just dropped a board today.  I will go to the hosting site and try your fix  on monday.  thanks.🎥🎥🎥📀📀📀

I spent many hours on this problem and was about to give up when I tried one last thing, which surprisingly recovered my bad board. Good luck!
takagari
January 24, 2017, 09:51:33 PM
Last edit: July 19, 2018, 10:44:25 AM by frodocooper
 #9

I'm wishing you luck - I've got 2 x B6 and both have dropped a board. One was immediate, the other 2 days later.

Hope you got smart and put both bad boards in one for the return
NotFuzzyWarm
January 24, 2017, 10:03:16 PM
Last edit: July 19, 2018, 10:44:45 AM by frodocooper
 #10

Hope you got smart and put both bad boards in one for the return

Um ONLY if Bitmain has said that you can do that! Breaking the seals over the screws to swap boards w/o authorization is a great way to void the warranty....

GWhisper
January 24, 2017, 10:21:57 PM
Last edit: July 19, 2018, 10:45:02 AM by frodocooper
 #11

Um ONLY if Bitmain has said that you can do that! Breaking the seals over the screws to swap boards w/o authorization is a great way to void the warranty....

I did consider this as an option but Bitmain has requested both units to be returned. For now I'm holding onto them until after CNY, I might send them off next Friday. I did ask if they could swap them for T9's but they didn't go for that.
fanatic26
January 24, 2017, 10:29:07 PM
 #12

Um ONLY if Bitmain has said that you can do that! Breaking the seals over the screws to swap boards w/o authorization is a great way to void the warranty....

Bitmain usually has an order number or serial number sticker on each individual board. They would for sure be able to tell if you did this.

Stop buying industrial miners, running them at home, and then complaining about the noise.
NotFuzzyWarm
January 25, 2017, 02:54:24 AM
 #13

Won't help with the 1st batch-6 I got last week that had 1 board fail 22hrs after starting it. Just noticed after poking around with it the past couple days that the red Vcore-OK LED is not lit meaning the DC-DC circuit is dead/offline.

Wonder what is feeding the 8-good (but producing 0 hash rate) ASIC's that the miner finds on chain-8? Possibly from the controller via the data cable? The system *does* find the PIC and talks to it....

agentcash
January 25, 2017, 04:02:35 AM
 #14

This is an interesting log so far. I wonder wtf is going on here...

Code:
Chain[J7] has 82 asic
retry Chain[J7] has 0 asic
retry Chain[J7] has 167 asic
retry Chain[J7] has 99 asic
retry Chain[J7] has 149 asic
retry Chain[J7] has 0 asic
retry Chain[J7] has 63 asic
...
OK: Chain[J7] is for this machine! [minerMAC: 08:85:07:d1:xx:xx]

are they doing board level mac addr checks?
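
For anyone who wants to check this kind of log automatically, here is a minimal Python sketch. It is not taken from Bitmain's firmware. It assumes the log lines look exactly like the excerpt above ("Chain[J7] has 82 asic", optionally prefixed with "retry") and that 63 is the healthy ASIC count per board, as mentioned earlier in the thread. The names CHAIN_LINE, asic_counts and settled_ok are made up for this example.

```python
import re
from collections import defaultdict

# Matches lines such as "Chain[J7] has 82 asic" and "retry Chain[J7] has 0 asic".
# The pattern is an assumption based only on the log excerpt in this thread.
CHAIN_LINE = re.compile(r"Chain\[(J\d+)\] has (\d+) asic")

# ASIC count quoted in this thread for a healthy R4 board (assumption).
EXPECTED_ASICS = 63


def asic_counts(log_text):
    """Return {chain: [count, ...]} in the order the log reports them."""
    counts = defaultdict(list)
    for match in CHAIN_LINE.finditer(log_text):
        counts[match.group(1)].append(int(match.group(2)))
    return dict(counts)


def settled_ok(counts, expected=EXPECTED_ASICS):
    """Return {chain: bool}: True if the last reported count equals expected."""
    return {chain: seen[-1] == expected for chain, seen in counts.items()}


sample = """Chain[J7] has 82 asic
retry Chain[J7] has 0 asic
retry Chain[J7] has 167 asic
retry Chain[J7] has 99 asic
retry Chain[J7] has 149 asic
retry Chain[J7] has 0 asic
retry Chain[J7] has 63 asic
...
OK: Chain[J7] is for this machine! [minerMAC: 08:85:07:d1:xx:xx]
"""

counts = asic_counts(sample)
print(counts)              # {'J7': [82, 0, 167, 99, 149, 0, 63]}
print(settled_ok(counts))  # {'J7': True}
```

A chain whose last reported count is not 63 (for example one that ends on 0 or 8) would show up as False, which is the case worth looking at before deciding on an RMA.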
NotFuzzyWarm
January 25, 2017, 04:09:59 AM
 #15

Good question. Poring over the logs from the R4b6 with a bad board I can say they are certainly doing a ton of device-level checks right down to how many good cores are in each chip...

GWhisper
January 25, 2017, 04:17:40 AM
 #16

Out of my 2 x B6 R4's which I received last Thursday.

1 x Hash board was DOA - never hashed
1 x Hash board failed after 2 days (Saturday)

I continued to run both on the remaining good boards (disconnected both power and IO from the bad) until this morning when I noticed I've lost yet another board.

So I'm experiencing a 75% failure rate on these boards, less than ideal. Worth noting that I keep my miners in a temp/humidity controlled data center.
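
The 75% figure works out as follows, assuming each R4 carries two hash boards (which is what the rest of the thread implies, not something stated in this post). A small Python sketch of the arithmetic, with made-up variable names:

```python
def failure_rate(failed_boards, total_boards):
    """Percentage of boards that have failed."""
    return 100.0 * failed_boards / total_boards


units = 2            # two B6 R4 miners
boards_per_unit = 2  # assumption: two hash boards per R4
failed = 3           # one DOA, one after 2 days, one more this morning

print(f"{failure_rate(failed, units * boards_per_unit):.0f}%")  # 75%
```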
agentcash
January 25, 2017, 04:23:48 AM
 #17

I'm at about a 25% failure rate on my R4's and 5-10% on S9's so far. The S9's have been far more heat tolerant, but they also require a bit of babysitting as boards will go down while still reporting full speed in the interface.
NotFuzzyWarm
January 25, 2017, 04:37:50 AM
 #18

Out of my 2 x B6 R4's which I received last Thursday.

1 x Hash board was DOA - never hashed
1 x Hash board failed after 2 days (Saturday)
<snip>
Worth noting that I keep my miners in a temp/humidity controlled data center.

Verrrry interesting.... I also got my bad R4 last Thurs the 19th... Wonder how many other flaky ones were in that lot...

re: s9's, I have 16 of them from batch-1 on up and only have had 2 boards fail. One was from the b1 (failed last Oct.) and another was from a batch-12 and in-Warranty. Had both repaired/replaced by Bitmain Warranty in CO -- yes even the in-factory warranty one (was faster, fully insured, etc.).

Oh, and again: I highly recommend Awesome Miner. Free up to 4 miners and well worth the scaled price if you have more. It will check the miners and restart CGminer (BMminer) which sometimes hangs and if you set thresholds can also fully soft-boot the miner when hashrate drops or other things arise.
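
To illustrate the kind of threshold rule described here, below is a rough Python sketch of the decision step only. It is not how Awesome Miner is implemented, and it does not talk to a miner; how the hashrate reading is obtained is left out on purpose. The function name choose_action, the 0.8 default threshold and the action labels are all assumptions made for this example.

```python
def choose_action(hashrate_ghs, expected_ghs, drop_fraction=0.8, responding=True):
    """Pick a watchdog action for one polling cycle.

    hashrate_ghs  -- current reading in GH/s (however it was collected)
    expected_ghs  -- hashrate you expect from a healthy unit, in GH/s
    drop_fraction -- reboot when the reading falls below this share of expected
    responding    -- False if the mining process did not answer the poll
    """
    if not responding:
        # The mining process hangs sometimes; restart just that process.
        return "restart_miner"
    if hashrate_ghs < expected_ghs * drop_fraction:
        # Hashrate dropped below the configured threshold.
        return "soft_reboot"
    return "ok"


# Illustrative numbers only: 4.4 TH/s + 450 GH/s measured,
# versus two boards at 4.4 TH/s each as the expectation.
print(choose_action(4400 + 450, 2 * 4400))  # soft_reboot
```

Keeping the rule as a plain function makes it easy to try different thresholds against past readings before letting anything reboot a real miner.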

RadekG (OP)
January 25, 2017, 12:00:40 PM
Last edit: July 19, 2018, 10:46:38 AM by frodocooper
 #19

Sorry if my explanation is too complicated.

1) Turn off the miner.
2) Unplug the working board.
3) Start the miner and let it run until it hashes.
4) If you are lucky, it will recover the bad board.
5) Turn off the miner.
6) Plug the good board back in.
7) Start the miner and let it run until it hashes.
8) If you are lucky, you now have both boards working.
9) If you have bad luck, try again. You can also try starting the "single-board-test" program via PuTTY several times.

NEVER stop the miner or apply changes before it starts hashing. Doing that killed my board, but I recovered it with the steps above.

Sorry, I missed an important detail: I disconnect the white data cable from the controller board.

UPDATE: Just tested this on the latest batch, when another DOA R4 arrived. After starting with only the "bad" board connected, it resurrected.
RadekG (OP)
January 25, 2017, 12:11:13 PM
 #20

Won't help with the 1st batch-6 I got last week that had 1 board fail 22hrs after starting it. Just noticed after poking around with it the past couple days that the red Vcore-OK LED is not lit meaning the DC-DC circuit is dead/offline.

Wonder what is feeding the 8-good (but producing 0 hash rate) ASIC's that the miner finds on chain-8? Possibly from the controller via the data cable? The system *does* find the PIC and talks to it....

Yes, my dead board was talking via IIC, but it read random voltages or produced random (error) messages regarding speed, ASIC count or ASIC frequency. I think this is the same PIC firmware problem carried over from the S7, but they are trying to fix it by reflashing bad firmware on the fly. Sometimes both boards were affected by the firmware reflashing or by a random number of ASICs being found. I tried my unbelievably simple solution and it worked: the miner found the correct firmware with the correct number of ASICs, and it also found the correct PIC frequency settings for each ASIC. Everything worked fine, even though it had not before with two boards connected.