Actually, looking at the kernel log, I see that the temperature is too high! Bummer. Any idea why this particular machine might be experiencing this? I have many other machines on the same rack, and they are all running at "normal" temperatures (e.g. 70-80 degrees).
Fatal Error: Temperature is too high!
do read_temp_func once...
do check_asic_reg 0x08
get RT hashrate from Chain[5]: (asic index start from 1-63)
get RT hashrate from Chain[6]: (asic index start from 1-63)
get RT hashrate from Chain[7]: (asic index start from 1-63)
Check Chain[J6] ASIC RT error: (asic index start from 1-63)
Check Chain[J7] ASIC RT error: (asic index start from 1-63)
Check Chain[J8] ASIC RT error: (asic index start from 1-63)
Done check_asic_reg
do read temp on Chain[5]
Chain[5] Chip[62] TempTypeID=55 middle offset=30
read failed, old value: Chain[5] Chip[62] local Temp=91
read failed on Chain[5] Chip[62] middle Temp old value:102
Done read temp on Chain[5]
do read temp on Chain[6]
Chain[6] Chip[62] TempTypeID=55 middle offset=27
read failed, old value: Chain[6] Chip[62] local Temp=87
read failed on Chain[6] Chip[62] middle Temp old value:98
Done read temp on Chain[6]
do read temp on Chain[7]
Chain[7] Chip[62] TempTypeID=55 middle offset=29
read failed, old value: Chain[7] Chip[62] local Temp=84
read failed on Chain[7] Chip[62] middle Temp old value:93
Done read temp on Chain[7]
set FAN speed according to: temp_highest=102 temp_top1[PWM_T]=102 temp_top1[TEMP_POS_LOCAL]=91 temp_change=17 fix_fan_steps=1
set full FAN speed...
FAN PWM: 100
read_temp_func Done!
CRC error counter=0
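
In case it helps with comparing this machine against the others on the rack, here is a rough Python sketch that pulls the per-chain temperatures out of a log like the one above. The function name `parse_temps` and the two regexes are my own assumptions, based only on the line shapes in this paste; other firmware versions may word these lines differently. Note that every reading above is logged as "read failed" with an "old value", so the numbers are the last cached readings rather than fresh ones, and the sketch flags them as stale.

```python
import re

# Hypothetical patterns, written against the two line shapes in the paste above.
# They only cover the "read failed ... old value" wording shown there.
LOCAL_RE = re.compile(r"Chain\[(\d+)\] Chip\[(\d+)\] local Temp=(\d+)")
MIDDLE_RE = re.compile(r"Chain\[(\d+)\] Chip\[(\d+)\] middle Temp old value:(\d+)")


def parse_temps(log_text):
    """Return {chain: {"local": int, "middle": int, "stale": bool}}."""
    temps = {}
    for line in log_text.splitlines():
        for kind, pattern in (("local", LOCAL_RE), ("middle", MIDDLE_RE)):
            match = pattern.search(line)
            if match is None:
                continue
            chain = int(match.group(1))
            entry = temps.setdefault(chain, {"stale": False})
            entry[kind] = int(match.group(3))
            # A failed read means the value is the previously cached one.
            if "read failed" in line:
                entry["stale"] = True
    return temps


# Temperature lines copied from the log above.
SAMPLE = """\
read failed, old value: Chain[5] Chip[62] local Temp=91
read failed on Chain[5] Chip[62] middle Temp old value:102
read failed, old value: Chain[6] Chip[62] local Temp=87
read failed on Chain[6] Chip[62] middle Temp old value:98
read failed, old value: Chain[7] Chip[62] local Temp=84
read failed on Chain[7] Chip[62] middle Temp old value:93
"""

for chain, reading in sorted(parse_temps(SAMPLE).items()):
    print(chain, reading)
```

Running the same function over the logs from the healthy machines would show whether they also report failed sensor reads or only this one does.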