Right. Some stales are caused by the miner. Others happen simply because it takes time for the mining pool to notify all miners, or because the miner delivers a proof-of-work at the same moment it is notified that the work is stale.
As far as I know, pools still allow people to use miners that don't support long polling (LP). I think it might not be a bad idea to at least notify those people that they're wasting effort. This is especially important for pools that pay proportionally and count stale shares.
I think if some users have more stales due to the miner they are using, it doesn't make sense to pay them for those additional stales.
I agree. In practice, the only way you can actually do this is to never pay for stale shares or to require LP.
Other than that, all users would probably average the same rate of stales over time. In that case it doesn't make a difference whether you get paid for stales or not: you would still get the same share of the 50 BTC. It might have a psychologically positive effect to get paid for stales, though?
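To make the point above concrete, here is a minimal sketch of a proportional payout, assuming every user has the same stale rate. The function name, the user names, and the share counts are made up for illustration; only the 50 BTC reward comes from the discussion.

```python
def proportional_payouts(shares, stale_rate, reward=50.0, pay_stales=True):
    """Split `reward` in proportion to each user's counted shares.

    shares     -- hypothetical mapping of user -> shares submitted
    stale_rate -- hypothetical mapping of user -> fraction of shares that were stale
    pay_stales -- if False, stale shares are not counted
    """
    counted = {
        user: n if pay_stales else n * (1.0 - stale_rate[user])
        for user, n in shares.items()
    }
    total = sum(counted.values())
    return {user: reward * c / total for user, c in counted.items()}


# Assumed example: two users, both with a 2% stale rate.
shares = {"alice": 900, "bob": 100}
uniform = {"alice": 0.02, "bob": 0.02}

paid = proportional_payouts(shares, uniform, pay_stales=True)
unpaid = proportional_payouts(shares, uniform, pay_stales=False)
print(paid)
print(unpaid)
```

With a uniform stale rate both calls give the same split. If one user's stale rate is higher than the other's, the two policies diverge, which is the case the earlier posts are worried about.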
Exactly. I've been trying to reason out whether people who mine faster or slower would have more stale shares. During the 9 minutes or so when no blocks are generated and no stales are possible, the faster miner builds up more shares. But the slower miner is less likely to generate a share during the window where stale shares are possible.
Higher network latency could mean more stale shares. But the effect should be pretty minimal. Even assuming an extra 800 ms of latency, that would be at most an extra 0.2% stales.
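A rough back-of-the-envelope check of that figure, as a sketch only. It assumes shares are submitted evenly over time and that a new block arrives about every 600 seconds on average (the usual 10 minute target); both are simplifications, and the function name is made up for illustration.

```python
def extra_stale_fraction(extra_latency_s, block_interval_s=600.0):
    """Fraction of mining time that falls inside the extra latency window.

    Assumes work done during the extra delay after each new block is stale,
    and that shares arrive uniformly over the block interval.
    """
    return extra_latency_s / block_interval_s


# 800 ms of extra latency against an assumed 600 s average block interval
print(f"{extra_stale_fraction(0.8):.2%}")  # prints 0.13%
```

Under these assumptions the extra stale rate comes out a bit below the 0.2% upper bound quoted above.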
The other side to it is that the mining pool is competing with other mining pools, so keeping the rate of stales down means a little more money for all users. Consider 50% stales versus 0% stales and a user with 10% of the proofs of work. With 50% stales it takes twice as long between every 50 bitcoins minted by the pool. Assuming all users have 50% stales it won't matter for each individual payout whether users are paid for stales or not. But it does matter whether you get a payout of 5 bitcoins every day versus every two days.
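The arithmetic in this post can be sketched as follows. This follows the poster's own assumption that stale work contributes nothing toward finding a block, which is exactly the assumption being argued over in this thread. The function names and the one-day baseline are hypothetical.

```python
def days_between_pool_blocks(base_days, stale_fraction):
    """Expected days between pool blocks if a fraction of the work is wasted.

    base_days      -- assumed interval with 0% stales
    stale_fraction -- fraction of work treated as wasted (0 <= f < 1)
    """
    if not 0.0 <= stale_fraction < 1.0:
        raise ValueError("stale_fraction must be in [0, 1)")
    return base_days / (1.0 - stale_fraction)


def user_payout_per_block(user_share, reward=50.0):
    """Payout per pool block for a user holding `user_share` of the shares."""
    return user_share * reward


# A user with 10% of the proofs of work, pool assumed to find one block per day at 0% stales
for stale in (0.0, 0.5):
    days = days_between_pool_blocks(1.0, stale)
    btc = user_payout_per_block(0.10)
    print(f"stales={stale:.0%}: {btc} BTC every {days} days")
```

The payout per block stays at 5 BTC in both cases; only the interval changes, from one day to two, under the stated assumption.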
I don't think your math is correct. A stale share simply means that the network found a new block before the share was submitted. It doesn't mean the work done to produce the stale share had no chance of producing a valid block. Losing the race doesn't mean you didn't have a chance to win the race.
If the pool and/or miners are wasting 50% of the work, then they lose 50% of the potential income. For 1% waste the loss is 1%. That said, it is impossible to get to 0% stales, because absolute synchronization is impossible in a distributed network. But getting closer to 0% means more money for everyone in the pool.
You are assuming that because a work unit lost the race, it had no chance of winning the race.