But any balance will be temporary: whatever algorithm or technique is used, if it is profitable, people will aggregate the appropriate resources.
Exactly. Also, the perceived 'bottlenecks' are usually addressed to some degree by general advances in system architectures, sometimes skewing the distribution further.
The point with my mass PW algo is how pliable it is. As a network matures and miners change, the numbers can change without having to switch algos and potentially fork. Simply changing t would adjust how much raw hashing is done in proportion to disk I/O (spacing out the bursts of disk I/O), so the algorithm could avoid favoring massive disk-laden servers while still denying ASICs profitability; minimizing t, on the other hand, would hand the advantage back to a disk-laden server.
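To illustrate what I mean by t, here's a rough Python sketch (not the real algo; pow_round, the data file path, and the chunk size are all placeholders): t rounds of raw hashing are interleaved with one pseudorandom disk read, so the read frequency scales roughly as 1/t.

import hashlib
import os

def pow_round(header: bytes, nonce: int, t: int, data_path: str, chunk_size: int = 4096) -> bytes:
    # Sketch only: t sets how many rounds of raw hashing happen
    # between disk reads, so disk I/O frequency scales roughly as 1/t.
    state = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()

    # Pure hashing phase; larger t means compute dominates.
    for _ in range(t):
        state = hashlib.sha256(state).digest()

    # Use the hash state to pick a pseudorandom offset in a large
    # pre-generated on-disk dataset, forcing an unpredictable disk read.
    file_size = os.path.getsize(data_path)
    offset = int.from_bytes(state[:8], "big") % max(file_size - chunk_size, 1)
    with open(data_path, "rb") as f:
        f.seek(offset)
        chunk = f.read(chunk_size)

    # Fold the disk data back into the hash so the read can't be skipped.
    return hashlib.sha256(state + chunk).digest()

Raising t spaces out the reads and makes the loop compute-bound; lowering it makes the loop disk-bound.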
I'd assume that a value of roughly 31 for t would put today's hashing into the CPU-favored range, given the difficulty of piping data into an OpenCL kernel. Anything up in that range would starve an ASIC of hashing input, since no disks are fast enough to keep up with hundreds-of-gigahash hashing.
That middle ground would effectively require a highly multi-faceted mining rig with fast disks, fast processors, or even specialized high-speed buses for loading data onto a GPU, instead of allowing specialization in any single component. In essence, it would level the playing field quite a bit.
As for killing off pools, that's a whole other idea I'm queasy about myself, though it wouldn't cause massive hashrate drops (it would just make them harder to achieve).
'highly multi-faceted mining rig with fast disks, fast processors'
How does that improve conditions for the 'average' user with a basic home system? What you're proposing sounds like a disadvantage to the distribution of hashrate.
Also, I suggest looking into the proposed OpenCL 2.0 specifications and HSA before going too far with that concept.