First problem in what you are doing:
p2pool and the farm are located remotely from each other
Bad idea. P2pool is extremely dependent upon latency due to the excessive restart rate.
Not really. The latency between the sites is up to 30ms, which causes only about 0.2% stales by itself. However, it would increase non-linearly with the addition of units, since each of them receives several kilobytes (because of the large coinbase) simultaneously at work restart.
Stratum work isn't large.
It sends a merkle tree slice of size O(log2(n)) (n = txn count) plus a coinbase txn - where the size of the coinbase txn will of course depend on what extra 'stuff' is put in there.
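To put a number on the merkle slice described above, here is a minimal Python sketch of how the sibling hashes for the coinbase leaf can be derived. The function names are made up for illustration and are not taken from cgminer or any pool software; the txids are assumed to be raw 32-byte hashes in block order.

```python
import hashlib


def dsha256(data):
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def coinbase_merkle_branch(txids):
    """Sibling hashes needed to climb from the coinbase leaf to the root.

    txids: hashes of the non-coinbase transactions, in block order.
    The coinbase itself is left as a placeholder (None), since the miner
    fills it in after choosing its extranonce.
    """
    branch = []
    level = [None] + list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd level: duplicate the last hash
        branch.append(level[1])      # sibling of the coinbase path
        level = [None] + [dsha256(level[i] + level[i + 1])
                          for i in range(2, len(level), 2)]
    return branch


def root_from_branch(coinbase_hash, branch):
    """What the miner does: fold the branch onto its own coinbase hash."""
    h = coinbase_hash
    for sibling in branch:
        h = dsha256(h + sibling)
    return h


def merkle_root(hashes):
    """Full merkle root over all leaves, used here only as a cross-check."""
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

With 1000 transactions in the block (coinbase included) the branch is 10 hashes, i.e. 320 bytes, and it grows by only one 32-byte hash each time the transaction count doubles. The rest of the message size is the coinbase txn.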
P2pool likes to put a bucket load of micro-payments in there - yep - so use something other than p2pool.
Also, if you are saying that multiplying it by a large number makes it large - well yep that's true of any number >=1
Getwork will use way more bandwidth than stratum.
The trick I am trying to do is to increase rollntime to, say, 1000, which means 1 getwork request per 20s per unit. Getwork response fits within 1 packet, while stratum response from p2pool is several kB. Please correct me if I am wrong.
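The "1 getwork request per 20s" estimate above can be checked with a rough Python sketch. It assumes each getwork covers one 32-bit nonce range per ntime value, and it assumes a unit hashing at about 180 GH/s (a nominal S1 figure, not stated in this thread). The function name is hypothetical.

```python
NONCES_PER_WORK = 2 ** 32  # one getwork = one 32-bit nonce range per ntime value


def seconds_per_getwork(hashrate_hs, ntime_rolls=1):
    """Seconds one getwork lasts on a unit of the given hashrate (hashes/s).

    ntime_rolls is how many distinct ntime values the unit may try on the
    same work (assumption: each one yields a fresh 2**32 nonce range).
    """
    return NONCES_PER_WORK * ntime_rolls / hashrate_hs


if __name__ == "__main__":
    s1 = 180e9  # assumed hashrate in H/s
    print("no rolling:    %.3f s per getwork" % seconds_per_getwork(s1))
    print("rollntime 1000: %.1f s per getwork" % seconds_per_getwork(s1, 1000))
```

Under those assumptions a single getwork lasts a few hundredths of a second without rolling, and a little over 20 seconds with 1000 rolls, which is roughly the figure quoted.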
A network packet is up to 64k - but of course they are broken up depending on the network connection settings.
But again you're looking at it the wrong way.
Indeed a single stratum response is bigger than a single getwork response, but the stratum response is usable for 30s (longer on bad pools) and with a small 32bit nonce2 will support > 6 million TH/s for that 30s.
With all of that work being rolled up to the cgminer maximum of 90s, that's of course more than 500 million TH/s
Thus a proxy for up to 500 million TH/s of miners is a few K of work sent to the proxy every 30 seconds.
The original rollntime was a bad design idea to start with.
Doing small rolls on stratum (up to 90) is a reasonable solution to reduce the workload of generating internal stratum work.
Doing large rolls is actually an issue on the bitcoin network itself. The larger the roll, the greater the (small) possibility of valid blocks being rejected.
The Getwork in my S1 binary is the Getwork in cgminer - same thing - I'm part of the cgminer team, not a fork.
I may be wrong but it seems like it has something to do with the endian issue. It was fixed at some time before your last build. In the stock binary (I mean that old one, supplied as a part of S1 ROM), getwork sends the wrong nonce, while in your last build, it sends empty responses (does not submit solutions). Maybe you just didn't test getwork?
My point being that if there is a getwork endian issue - submit a cgminer pull request fix.
No, I don't test getwork - I don't even have a reasonable source of it set up anywhere that will find results.
Your setup is the fault.
Add a proxy if you wish to reduce bandwidth.
Moving away from stratum would only be due to you misunderstanding how the protocols work.
Which proxy do you suggest? The stratum-mining-proxy by slush?
Not sure - the only good proxy I know of isn't ready yet.
Cross compiling is the same as for the avalon.
Read their info on how to do that ... or the abundant info provided by openwrt.
Thank you!