One thing I've seen is LOTS of stales. When I used the latest hashkill with deepbit, I noticed I had 7% stales, which is plain crazy. I've tried tinkering with clock speeds, cooling, and other pools, but nothing really helps to resolve this.
Hmpf, that should have been resolved already. Are you sure you ran install.sh with root privileges?
* config file
That would be nice...but what exactly are we going to configure that is hard to configure via command-line?
* some type of RPC (JSON-RPC, anyone?) interface for managing operation and querying status of miners (hash speed, temp, pause/resume queue, etc.)
Right now there is such a mechanism, but it's read-only: have a look at ~/.hashkill/bitcoin.json. It exports hash speed and similar stats in JSON format. It sucks though, and it would definitely be good to be able to control it remotely.
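For anyone who wants to poll that file in the meantime, here is a rough Python sketch. Only the path ~/.hashkill/bitcoin.json comes from the post above; the file's schema is not described here, so the code assumes nothing about key names and simply flattens whatever JSON it finds. `flatten` and `read_status` are made-up helper names, not part of hashkill.

```python
import json
from pathlib import Path


def flatten(value, prefix=""):
    """Yield (dotted_key, leaf_value) pairs from nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}{key}.")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}{index}.")
    else:
        # Drop the trailing dot that the parent level appended.
        yield prefix[:-1], value


def read_status(path):
    """Load the status file; return None if it is missing or half-written."""
    try:
        return json.loads(Path(path).read_text())
    except (OSError, json.JSONDecodeError):
        return None


if __name__ == "__main__":
    # Path taken from the post; the contents are whatever hashkill exports.
    status = read_status(Path.home() / ".hashkill" / "bitcoin.json")
    if status is None:
        print("status file not readable yet")
    else:
        for key, value in flatten(status):
            print(f"{key} = {value}")
```

Returning None on a parse error is deliberate: if the miner rewrites the file while you read it, you just retry on the next poll instead of crashing.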
* ability to adjust fan speeds automatically and provide ability to set min and max speeds per card
This is a very bad idea. I tried doing that in the past. The end result was something that works well on one system and goes crazy on another (well, I guess that's what the whole GPU programming thing is about anyway). Overall I am not inclined
* either prioritize workloads per GPU or per GPU temperature... it would be good for long term operations to have all the GPUs working at the same temperature
This is an interesting idea that never occurred to me before. Worth trying.
* an aggressiveness setting
I guess -D and -G work that way. -D -G4 is rather aggressive while just -G1 is the most responsive.
I used sudo ./install.sh, so there were no permissions problems. Quick note: when you release an updated version of hashkill, please change the filename with every revision. I'm sure it will help with ambiguity and the bloody web proxies I deal with.
Config files keep things easy to manage when more and more features come into play.
At present I have 2x6990s running, and due to current cooling constraints, the 6990 up top (GPU0 and GPU1) is running 15% hotter than the bottom card. Because of this, I am using phoenix with the aggressiveness set to 5 on GPU0, 7 on GPU1 and 12 on GPU2 and GPU3. I tried playing around with -D and -G, but I wasn't able to keep the card from overheating.
With regards to the temperature threshold in the current implementation, do you poll the SDK until the temperature drops 10C below the threshold and then resume processing, or do you use a predetermined timeout?
With regards to prioritizing workloads to keep constant temperatures on the GPUs, it would help reduce wear from the cards repeatedly heating up and cooling down.
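To make the first option concrete, this is roughly what I mean by polling with a 10C margin. It is only a sketch of the idea, not how hashkill actually does it; `ThermalGate`, `threshold_c` and `margin_c` are names I made up, and the temperature readings would have to come from whatever the SDK reports.

```python
class ThermalGate:
    """Pause at `threshold_c`; resume only at or below threshold_c - margin_c."""

    def __init__(self, threshold_c, margin_c=10.0):
        self.threshold_c = threshold_c
        self.margin_c = margin_c
        self.paused = False

    def update(self, temp_c):
        """Feed one temperature reading; return True if the GPU may work."""
        if self.paused:
            # Stay paused until the card has cooled by the full margin.
            if temp_c <= self.threshold_c - self.margin_c:
                self.paused = False
        elif temp_c >= self.threshold_c:
            self.paused = True
        return not self.paused


if __name__ == "__main__":
    gate = ThermalGate(threshold_c=90)
    # Hypothetical readings, one per poll.
    for temp in [80, 91, 88, 85, 80, 79]:
        print(temp, "run" if gate.update(temp) else "paused")
```

With a 90C threshold, those sample readings give run, paused, paused, paused, run, run: the card resumes at 80C, not as soon as it dips under 90C. A fixed timeout would be simpler, but it can resume while the card is still hot.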
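Something along these lines is what I have in mind for the balancing: shift a bit of work away from cards that are hotter than average and toward cooler ones. This is a toy proportional scheme, not anything hashkill implements; `rebalance`, `gain` and `min_share` are invented for illustration, and the gain would need tuning per rig.

```python
def rebalance(shares, temps_c, gain=0.02, min_share=0.05):
    """Return new per-GPU work shares nudged toward equal temperatures.

    shares   -- current fraction of work per GPU (should sum to 1.0)
    temps_c  -- latest temperature per GPU, same order as shares
    gain     -- fractional share change per degree away from the mean
    """
    mean_temp = sum(temps_c) / len(temps_c)
    adjusted = []
    for share, temp in zip(shares, temps_c):
        # Hotter than average: shrink the share; cooler: grow it.
        scaled = share * (1.0 - gain * (temp - mean_temp))
        # Floor applied before normalisation so no GPU is starved outright.
        adjusted.append(max(min_share, scaled))
    total = sum(adjusted)
    return [value / total for value in adjusted]


if __name__ == "__main__":
    # Hypothetical rig: top card (GPU0/GPU1) hotter than the bottom card.
    print(rebalance([0.25, 0.25, 0.25, 0.25], [85, 85, 70, 70]))
```

Called once per polling interval, this moves the hot card's share down gradually instead of stopping it outright, which is the heat-up/cool-down cycling I'd like to avoid.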
BTW, have you tested hashkill with btcguild?