T - can you consider the following features:
1. A counter/countdown displayed for circles/rigs that have begun the reset/power-cycle process. This way, on the dashboard we can see which processes have begun and which have taken place. This is key for troubleshooting. Without this, I don't even know if SRR is working or running, and I just assume it is. Over at the smOS web dashboard, it would have been nice to have the power/reset control available, but as you explained, smOS doesn't communicate with SRR.
2. For me, when SRR needs to be activated on a rig, it means the miner has failed to restart on its own, so the problem may lie outside the miner, e.g. memory leaks/corruption, network issues, etc. Instead of a reset, I would almost certainly prefer a power cycle, meaning power down and then power up x minutes later.
3. When running Claymore miners, I also have his monitoring utility running in the background. This way I keep tabs on what's going on at the miner app level. Claymore can also run a bat file when the miner hangs due to OpenCL errors. I'm not sure how you plan to make this utility somewhat compatible with SRR, for Claymore miners only. Maybe this is not necessary, but his utility has a lot of key info, like how many times the miner has reset, how many times the pool has failed over, etc.
4. I have re-arranged my SRR units and slot numbering again to make sure I didn't miss anything, so that I can correctly set the wd delay on all SRRs to a value greater than 0. But I think I really need the "2500 secs" watchdog delay, because my non-RX cards are STILL being triggered to reset unnecessarily (or too early). I can't figure out why this doesn't affect my RX480 rigs; they respond well to SRR even at 250 secs. I can only conclude this may be down to how my network environment is laid out. So I'm looking forward to v2 of SRR and the enhancements.
1. I'll see what I can do, it's a good feature.
As for controlling SRR, I will be getting into it. I have 2 concepts for doing that and I will implement one or even both of them:
a) Sending commands, for example to shut down rig number 1, through another working rig.
b) It is probably possible to make SRR communicate with my SimpleMiningOS dashboard. I will look into this matter and see if I can implement it, but I don't know the ETA.
2. In my case, around 80% of problems are solved by a fast restart, and only some 20% need a cold reset (shut down, wait 5 minutes, power on).
I will check if I can implement some kind of feature to let you preset this to your own needs.
3. If a GPU error occurs AND you have set (-w1 -r1 in command), then it will run reboot.sh, which contains special commands that will FORCE REBOOT your rig, and this works pretty well in most cases.
In some cases, if this freezes the rig, SRR then kicks in because it stops getting keepalive messages.
So in other words, the software on the rig side can solve most problems and does so, but if it fails and the rig doesn't send a keepalive message within the specified number of seconds, then SRR starts rebooting as the second line of defense.
I think we don't need to make it compatible, as each of those features works on a different level (first a software reboot; if that doesn't help, an SRR hardware fast reboot; and if that fails 4 times, a hardware cold reboot).
Isn't that the best idea?
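To make the levels concrete, here is a rough Python sketch of the hardware side of that escalation. It is only an illustration, not the actual SRR firmware: the class name, the method names and the exact reading of "4 failed fast reboots, then a cold reboot" are my assumptions.

```python
from dataclasses import dataclass

# Assumption: after this many fast resets without a keepalive, go cold.
FAST_RESETS_BEFORE_COLD = 4


@dataclass
class RigWatchdog:
    """Hypothetical model of one SRR slot watching one rig."""
    wd_delay: float           # seconds of silence allowed (e.g. 250 or 2500)
    timer_start: float = 0.0  # when the current waiting period began
    failed_resets: int = 0    # fast resets tried since the last keepalive

    def keepalive(self, now: float) -> None:
        # The rig agent reported in: restart the timer, clear the failures.
        self.timer_start = now
        self.failed_resets = 0

    def tick(self, now: float) -> str:
        # Decide what to do at time `now` (seconds).
        if now - self.timer_start < self.wd_delay:
            return "ok"
        # Give the rig a full wd_delay to boot before judging it again.
        self.timer_start = now
        if self.failed_resets < FAST_RESETS_BEFORE_COLD:
            self.failed_resets += 1
            return "fast_reset"
        self.failed_resets = 0
        return "cold_reset"  # shut down, wait, power on


wd = RigWatchdog(wd_delay=250)
wd.keepalive(0)
actions = [wd.tick(t) for t in (100, 250, 500, 750, 1000, 1250)]
print(actions)
```

With a 250 second delay and a rig that never answers, this gives one "ok", four "fast_reset" and then one "cold_reset". A rig whose agent starts late looks exactly like a dead rig here, which is why the delay has to be longer than the slowest boot.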
I will try to make an SRR restart counter, but I will also add a restart counter to the SMOS dashboard that looks at rig uptime: if in the next report a rig shows less uptime than in the last one, it means a reboot.
I was thinking about it and I will do this.
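The uptime check could look roughly like this in Python. This is just a sketch of the idea, not dashboard code; the function name and the sample numbers are made up for illustration.

```python
def count_reboots(uptimes):
    """Count reboots in a list of uptime reports (seconds, oldest first).

    A report that shows less uptime than the one before it is taken
    to mean the rig rebooted somewhere in between.
    """
    reboots = 0
    for previous, current in zip(uptimes, uptimes[1:]):
        if current < previous:
            reboots += 1
    return reboots


# Hypothetical reports from one rig: uptime drops twice.
reports = [100, 400, 700, 50, 350, 20]
print(count_reboots(reports))  # 2
```

One limit of this approach: several reboots between two reports are counted as one, and a reboot is missed entirely if the rig is already past its previous uptime by the time the next report arrives.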
4. 2500 seconds will be in the next release. I'm already working on that.
Also, I might know why R OS takes longer to boot.
It boots the graphical environment and only THEN starts running the SRR agent. That takes a lot more time.
In RX OS there is no graphical interface, so the booting process is much faster.
I think I can speed up this agent script under R OS so you won't have this issue. Thanks for reporting.