I was thinking that to get better accuracy when estimating network hashrate you could exploit the fact that the big pools publicise their hashrate with almost perfect accuracy, i.e. the exact number of shares. If you trust their numbers you could get a lower variance if you counted the pools' blocks separately.
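If N(t) is the number of blocks minted up to time t, and ΔN(t) is the number minted during a window of length Δt, then ΔN(t)/Δt is the natural estimate of the network hashrate λ (expressed in blocks per unit time) over that window. Modelling block arrivals as a Poisson process, this estimate has variance λ/Δt. But if we express the block count as a sum of unknown and known hashrates: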
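N(t) = N'(t) + N''(t), where N' has the unknown rate λ' and N'' has the known rate λ'', then there is no need to estimate λ'' from the block count N''. Using the estimator ΔN'(t)/Δt + λ'' instead, we get Var( ΔN'(t)/Δt + λ'' ) = λ'/Δt + 0 = λ'/Δt, compared with (λ' + λ'')/Δt for the plain block count.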
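To sanity-check the variance claim, here is a rough simulation sketch. It assumes block arrivals are independent Poisson processes and that the pools' rate is known exactly; the rates (3 blocks per hour each), the 24-hour window and the helper names `count_blocks` and `simulate` are all made up for illustration.

```python
import random
import statistics


def count_blocks(rate, window, rng):
    """Count Poisson arrivals with the given rate in [0, window)."""
    count = 0
    t = rng.expovariate(rate)  # time of the first block
    while t < window:
        count += 1
        t += rng.expovariate(rate)  # exponential gap to the next block
    return count


def simulate(rate_unknown, rate_known, window, trials, seed=1):
    """Return the sample variances of the plain and the split estimator."""
    rng = random.Random(seed)
    plain, split = [], []
    for _ in range(trials):
        n_unknown = count_blocks(rate_unknown, window, rng)
        n_known = count_blocks(rate_known, window, rng)
        # Plain estimator: count every block, whoever found it.
        plain.append((n_unknown + n_known) / window)
        # Split estimator: count only the non-pool blocks and add the
        # pools' published rate (assumed exact here).
        split.append(n_unknown / window + rate_known)
    return statistics.variance(plain), statistics.variance(split)


# Assumed rates: 3 blocks/hour unknown, 3 blocks/hour known, 24-hour window.
var_plain, var_split = simulate(3.0, 3.0, 24.0, trials=2000)
print(var_plain, var_split)
```

With these assumed numbers the theory predicts roughly 6/24 = 0.25 for the plain estimator and 3/24 = 0.125 for the split one, so the two printed values should land near those.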
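According to http://blockchain.info/pools
the three largest pools account for roughly half of the hashrate and the ten largest for more than 75%. Since the variance drops from λ/Δt to λ'/Δt, the reduction equals the known share λ''/λ: we could reduce the variance by half by using the three largest pools, and by more than 75% by using the ten largest pools.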
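The arithmetic behind those percentages can be sketched as follows. The share values below are hypothetical placeholders, not figures taken from the pools page, and `remaining_variance_fraction` is a name invented for this example.

```python
import math


def remaining_variance_fraction(trusted_shares):
    """Fraction of the original variance left after using trusted pools.

    The variance goes from λ/Δt to λ'/Δt, so the fraction left is
    λ'/λ = 1 - (sum of the trusted pools' shares of total hashrate).
    """
    known = sum(trusted_shares)
    if not 0.0 <= known <= 1.0:
        raise ValueError("shares must sum to a value between 0 and 1")
    return 1.0 - known


# HYPOTHETICAL shares for three trusted pools (not real data).
hypothetical_shares = [0.25, 0.15, 0.10]

remaining = remaining_variance_fraction(hypothetical_shares)
# The standard error scales with the square root of the variance.
std_factor = math.sqrt(remaining)
print(remaining, std_factor)
```

Note that halving the variance only shrinks the standard error by a factor of about 0.71, so the gain in the error bar is smaller than the variance figure suggests.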
My question is, which pools can you trust to give correct numbers?