I've been working on many MANY improvements for BFL support in cgminer.
I think I'm about ready to submit a pull request into cgminer, but I still need to test how it operates with other FPGAs.
Essentially it should have no effect on Icarus, ztex and ModMiner, but since don't have any of those, I can't test them so I'm hoping some kind volunteers could compile my fork here, and run it for a while:
https://github.com/pshep/cgminerThe most significant change for FPGAs is the inclusion of 'SICK' and 'DEAD' processing, which was previously reserved only for OpenCL devices. For Icarus, Ztex and ModMiner, this should tell you if they are sick or dead (for BFL it'll attempt to re-init the device).
For BFL devices, my changes do the following:
- Timeout to restart work if it's taking too long
A nonce range should take just over 5s. Any longer and device is throttling.
Fixes issue where BFL appears to stall in cgminer
- Count throttling as a zero hash error
- Temp taken in watchdog thread
Now a disabled device will still return a current temperature, rather than the last value before disabling.
- Work restart on new work
This was cause very high stale rate for me...
Previously on a work restart, cgminer would allow the BFL to continue with the stale block. Now this is checked, and while and nonces found in that time will be wasted, the work will be discarded and new work will be started immediately and not after the 5s the BFL takes to return results.
- Timing adjustments
The 'wait for results' was hard-set to 4500ms before polling at 10ms intervals. With variation of systems and new firmwares of differing hash rates, a hard set timer could be either inefficient (starts polling way before necessary) or wasteful (starts polling way after result is ready). The auto-adjustment will find the correct wait time to minimize polling and and delay retrieving results.
- Device re-initialization
When a device is disabled (for whatever reason - user or by cgminer) then re-enabled, the device is re-initialized, rather then assuming communications are still working.
- Sick / Dead monitoring
As with OpenCL devices, BFL devices will be checked for sickness (60s no response) or dead (10 mins no response) and try to re-initialize them.
- Improved logging
Most logs now include the device in question, i.e.: "BFL0: took longer than 15s"
- Device start offset
Delays the start of each device by a random time between 0 and 100ms so that they don't all make calls at exactly the same time.
If you can help, I'll be much appreciative.
Thanks