Just an FYI: my job is a bit demanding and takes me away from the farm, often for weeks at a time. I'm nearby for a little while, but if this thread runs for a bit it may "die" and be "revived" later, only because I can't do anything for 20 days.
Software-ish at my disposal:
- Linux
- Remote access via SSH - Raspberry Pi (command-line), Laptop (X11 tunnel)
- Wireshark
- Handful of other network tools that I don't know how to use fully - nmap, iw...
Innosilicon A6:
I bought the A6 off eBay. Like many things in life, I assumed all was good, and it sat for 6 months before I plugged it in. The seller had unplugged all the control ribbon cables, and I'm not sure whether the order matters. Looking at the front of the machine and designating the connectors as (closest-furthest, left-right) FA1-4, FB1-4, R1-8, they are currently connected as FA1-4 to R5-8 and FB1-4 to R1-4. When it is powered on it behaves as I would expect (lights, fans). However, I have no idea what its IP is.
I tried both resets. I've never seen it appear on 192.168.1.254 (per manual). I don't see it appear in my DHCP range (per hope). I tried a direct connection with my laptop with a fixed 192.168.1.x address and ran a trace with Wireshark. I don't think I saw anything from the miner. I was hoping some attempt to communicate would be made but I'm not exactly sure what I'm looking for. I think everything came from my laptop.
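For the direct-connection test, one way to make "everything came from my laptop" easier to verify is to capture only frames that did not originate from the laptop. A minimal sketch, assuming tcpdump is installed and the wired interface is passed as the first argument; the helper name `miner_filter` and the script name `watch-miner.sh` are hypothetical:

```shell
#!/bin/bash
# Sketch: show only frames on a direct link that did NOT come from this laptop.
# miner_filter is a hypothetical helper; the interface name is an assumption.

# Build a capture filter that drops our own frames and keeps ARP and
# DHCP (UDP ports 67/68), which a booting device usually sends early on.
miner_filter() {
    printf 'not ether src %s and (arp or udp port 67 or udp port 68)' "$1"
}

# Only capture when an interface is given, e.g.: sudo ./watch-miner.sh eth0
if [ $# -ge 1 ]; then
    iface=$1
    own_mac=$(cat "/sys/class/net/$iface/address")   # this laptop's MAC
    # -n: no name lookups, -e: print MAC addresses, -c 20: stop after 20 frames
    tcpdump -i "$iface" -n -e -c 20 "$(miner_filter "$own_mac")"
fi
```

If this prints nothing while the miner boots, the control board is probably not transmitting at all on that port. In the Wireshark GUI, a display filter on `eth.src` that excludes the laptop's MAC should give a similar view.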
Then I thought I'd just use a card with some firmware I'd download... There is a card slot in the aluminum case, but no reader on the PCB. The control board is a G19 CON V2.
Baikal-B:
These worked fine... mostly. Then one day one died: I plug in the PSU and get absolutely nothing. During the last 20 days, apparently 2 more died. (A power outage seems to have occurred. All my panels have surge protectors... I wonder how much of a shock it is to go from 0 kW of demand to 40...) I have 3 BK-Bs per Bitmain APW3++ PSU. With the first machine, which died months ago, the others in its group start up, but that machine is completely lifeless. With the 2 recent ones, however, the PSU cuts off after all the LEDs blink briefly on all machines. Booting them up selectively, there is one machine in that group that runs.
There is another machine that used to work and now never seems to connect. The LEDs never stop blinking... I'm going to try reflashing the OS.
Then a handful of BK-Bs drop off and never manage to reconnect. Unlike my Bitmains, these machines disappear from nmap scans. So I uploaded a script to each one, run as a cron job (it has gone through a couple of iterations; see the comments at the bottom):
#!/bin/bash
# Skip the checks for the first 5 minutes after boot so the miner can start.
read -r up _ < /proc/uptime
if [ "${up%.*}" -gt 300 ]; then
    # Pull "MHS 5s" and the "Last ..." timestamp from the cgminer API summary.
    # printf keeps both values on one line so read can split them.
    read -r hash before <<< "$(php /root/cgminer-api.php 127.0.0.1 summary | grep -E '(MHS 5s)|(Last)' | awk '{printf "%d ", $NF}')"
    date=$(date)
    if [[ -z $hash || -z $before ]]; then
        echo "$date: Socket error. Terminating" >> reboot.log
        sudo reboot
        exit 1   # don't fall through to the arithmetic with empty values
    fi
    now=$(date "+%s")
    msg=''
    delta=$(( now - before ))
    if [ "$delta" -gt 600 ]; then
        msg="$date: Potential Hangup ($delta s)"
    elif [ "$hash" -lt 25000 ]; then
        msg="$date: Low hash rate ($hash)"
    fi
    if [[ -n $msg ]]; then
        echo "$msg" >> reboot.log
        sudo reboot
    fi
fi
# out=`ping -c 1 192.168.1.1 | grep -oF " 0% packet loss"`
# ... (Version 1)
# if [ -z "$out" ]; then
# pid=`ps -C sgminer | awk 'END{print $1}'`
# cpu=`top -p $pid -n 1 -b | awk 'END{print int($9)}'`
# ... (Version 2)
# if [ $cpu -lt 8 ]; then
If anything, I think it speeds up the problem. (I just reviewed the script again. FYI I mine LBC only)
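One way to check whether the watchdog itself is triggering the reboots is to pull its decision logic into a function that can be run by hand without rebooting anything. A minimal sketch, assuming the same 600 s and 25000 thresholds as the script above; the function name `watchdog_reason` is hypothetical and not part of cgminer or the Baikal firmware:

```shell
#!/bin/bash
# Sketch: the watchdog's decision as a pure function, so it can be dry-run.
# Arguments: hash rate, "last" timestamp (epoch seconds), current epoch seconds.
# Prints the reason a reboot would be triggered, or nothing if all looks fine.
watchdog_reason() {
    local hash=$1 before=$2 now=$3
    if [[ -z $hash || -z $before ]]; then
        echo "Socket error"
    elif (( now - before > 600 )); then
        echo "Potential Hangup ($(( now - before )) s)"
    elif (( hash < 25000 )); then
        echo "Low hash rate ($hash)"
    fi
}

# Dry run with made-up numbers: prints the reason, reboots nothing.
watchdog_reason 30000 1000 2000   # prints "Potential Hangup (1000 s)"
```

Feeding it the real values from the API for a few cron intervals, and logging the result instead of rebooting, should show which of the three conditions fires most often.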
Any ideas or help with the above would be awesome!
THANK YOU!!!!