I am about done with a "cointerra-monitor". It remotely monitors the status of the cgminer on the Cointerra. Remote = Nothing to install on your cointerra machine.
1) Every minute it checks the status of 1 cointerra.
2) If any of the ASICs aren't alive it increments an error count
3) If cgminer isn't running it increments an error count
4) If error count gets above a configurable value it will
- copy the cgminer.log file from the cointerra
- email the user
- reboot the cointerra
5) If the checks above succeed it resets the error counter
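The counter logic in steps 2 to 5 could be sketched roughly like this. It is only an illustration, not the code in the monitor: the class name, method name and max_errors parameter are made up, and it assumes one error is counted per failed poll no matter how many checks failed.

```python
class ErrorCounter:
    """Tracks failed polls (hypothetical helper, not the monitor's real code)."""

    def __init__(self, max_errors):
        # max_errors stands in for the "configurable value" from step 4.
        self.max_errors = max_errors
        self.errors = 0

    def record(self, asics_alive, cgminer_running):
        """Record one poll; return True when the recovery actions should run."""
        if asics_alive and cgminer_running:
            # Step 5: a successful check resets the counter.
            self.errors = 0
            return False
        # Steps 2 and 3: a failed check bumps the counter
        # (assumption: one increment per poll).
        self.errors += 1
        # Step 4: act only once the count is above the limit.
        return self.errors > self.max_errors
```

When record() returns True the caller would copy the log, send the email and reboot. Whether the counter should also be cleared after a reboot is left open here.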
It's Python based so you can run it on any OS you want.
I was thinking of doing fancier stuff with the log file, cgminer RPC port statuses, a Windows .exe + service, but sometimes K.I.S.S. (keep it simple, stupid) is best.
Of course it's free and open source. If you want to be notified when it is done, PM me and I will send you a link; otherwise I will just post it on this page. It should be finished by next weekend. You know, the day job slows things down.
I have pushed the initial alpha version to GitHub. It is being tested now by a few more people from this thread. I am using Ubuntu for the host OS. I will also try it in a Windows 7 VMware VM.
https://github.com/dprophet/cointerra-monitor/
Basically it:
1) Looks for sick/dead/disabled ASICs
2) Monitors temperature, hash rate, voltage, fan speed, etc... A few dozen stats from the Cointerra and Pool.
3) SCPs the cgminer.log file from the Cointerra box
4) Emails errors/warnings to the user. This email includes a gzip file of the log from #3 above and the monitor.log.gz
5) Reboots the TerraMiner OS when there are critical issues. Right now the only "critical" issues are a sick/disabled ASIC chip or cgminer not running.
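For anyone curious what step 4 might look like, here is a minimal sketch using only the Python standard library. It is not taken from the monitor itself: the function name and arguments are made up for illustration, and actually sending the message (for example with smtplib) is left out.

```python
import gzip
from email.message import EmailMessage


def build_alert_email(sender, recipient, subject, body, logs):
    """Build an alert email with gzipped logs attached.

    logs maps an attachment name (e.g. "cgminer.log") to the raw log bytes.
    All names here are hypothetical.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    for name, raw in logs.items():
        # Compress each log in memory and attach it as <name>.gz
        msg.add_attachment(
            gzip.compress(raw),
            maintype="application",
            subtype="gzip",
            filename=name + ".gz",
        )
    return msg
```

Compressing in memory keeps the sketch short. For very large logs, writing the .gz to disk first may be the better choice.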
I am in the process of:
1) Adding more pool stats to see if all pools go down.
2) Adding an array of TerraMiners so it will monitor N of them.
I need to wait for another TerraMiner before I code this up (no, my additional machines are not in yet)
3) Working support for this into the MobileMiner iOS/Android app. I am in communication with the developer now to see if it is OK. I want a push to a mobile platform if there are issues.
4) Collecting more test cases where this agent doesn't recognize the problem so I can code them up. I have a few dozen stats in my data structures from the pool/chips so it's easy to add new conditions to watch/take action on.
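To give an idea of why new conditions are easy to add once the stats are in a data structure, here is a rough sketch of a rule table. The stat names and limits below are placeholders invented for the example, not the monitor's real fields or recommended thresholds.

```python
import operator

# Hypothetical rules: (stat name, comparison, limit, severity).
# Names and numbers are placeholders, not real recommendations.
RULES = [
    ("temperature_c", operator.gt, 85.0, "warning"),
    ("fan_rpm", operator.lt, 1000, "warning"),
    ("hash_rate_ghs", operator.lt, 1200.0, "warning"),
]


def evaluate(stats, rules=RULES):
    """Return a list of (severity, stat name, value) for every rule that trips."""
    findings = []
    for name, compare, limit, severity in rules:
        value = stats.get(name)
        # Skip stats that were not collected on this poll.
        if value is not None and compare(value, limit):
            findings.append((severity, name, value))
    return findings
```

With this layout a new condition is just one more tuple in the table, with no change to the checking code.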
Alpha testers are encouraged (even desired). If you want my help making sure this works for you, best do it now while this is my personal priority, before I move on to other wishlist items/tasks.