Yes, I understand what you're trying to say.
I only wanted to make clear there's a need for improvement.
At the end of the day, HW alone does nothing and SW alone does very little. NV certainly has the better ecosystem around it right now.
Is this pertaining to different OSes (Linux vs Windows)? GPU vs CPU software (i.e. jamming AMD into software meant for CPUs, whereas NVIDIA gets its own miner)? Or the language the program is coded in?
It is related to the "shape" the kernel takes.
Imagine putting a square peg in a round hole. To do this, GPUs split the peg into pieces, push each smaller piece through, then put them back together on the other side.
AMD believes this is not an efficient use of transistors, and this is one reason they won the console war again. NV, by contrast, believes the worst case should be optimized, and that is what they do.
Chained hashing is even worse: you have taped several pegs together!
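To make the "taped pegs" point concrete, here is a minimal CUDA sketch of how a chained-hash miner is typically structured: one kernel launch per algorithm, with the intermediate state making a round trip through global memory between launches. The kernel bodies below are just placeholder arithmetic (names like round_a, round_b, hash_state and NUM_HASHES are my own, not from any real miner), so treat it as an illustration of the launch pattern, not of any actual implementation.

Code:
#include <cuda_runtime.h>
#include <cstdio>

struct hash_state { unsigned long long h[8]; };   // 64-byte intermediate state per nonce

// Placeholder "hash" kernels: a real chained miner would run Blake, BMW, Groestl, ...
__global__ void round_a(hash_state* s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        for (int k = 0; k < 8; ++k) s[i].h[k] = s[i].h[k] * 0x9E3779B97F4A7C15ULL + i;
}

__global__ void round_b(hash_state* s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        for (int k = 0; k < 8; ++k) s[i].h[k] ^= (s[i].h[k] >> 13) + k;
}

int main() {
    const int NUM_HASHES = 1 << 20;
    hash_state* d_state = nullptr;
    cudaMalloc(&d_state, NUM_HASHES * sizeof(hash_state));
    cudaMemset(d_state, 0, NUM_HASHES * sizeof(hash_state));

    dim3 block(256), grid((NUM_HASHES + 255) / 256);
    // Each launch is one "peg": the state travels through global memory
    // between algorithms, which is the inefficiency described above.
    round_a<<<grid, block>>>(d_state, NUM_HASHES);
    round_b<<<grid, block>>>(d_state, NUM_HASHES);
    cudaDeviceSynchronize();

    cudaFree(d_state);
    printf("done\n");
    return 0;
}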
If this possibility exists and large gains can be made, why has nobody done it? Is this something that the coders are keeping to themselves? Or is it only possible with certain algos? I would like to learn more about the possibilities and limitations of this.
It took me almost two months to redo qubit plus the miner modifications (qubit is the tail of X11, FYI; it could be considered X5). I guess you get the idea.
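For reference, here is a trivial host-side listing (my own illustration, nothing from the miner code) of why qubit reads as the "tail" of X11: its five stages are the last five of X11's eleven, in the same order.

Code:
#include <cstdio>

static const char* X11[11] = {
    "blake", "bmw", "groestl", "jh", "keccak", "skein",
    "luffa", "cubehash", "shavite", "simd", "echo"
};
static const char* QUBIT[5] = { "luffa", "cubehash", "shavite", "simd", "echo" };

int main() {
    printf("X11 tail: ");
    for (int i = 6; i < 11; ++i) printf("%s ", X11[i]);
    printf("\nqubit   : ");
    for (int i = 0; i < 5; ++i) printf("%s ", QUBIT[i]);
    printf("\n");
    return 0;
}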
It is possible - and perhaps even likely - that many already have the improved kernels. If they have a room full of GPUs, that is a consistent advantage. If you have one, not so much.
Not all algorithms can be improved. Some are so simple they're already close to efficient... it's on a case-by-case basis. For example, Echo is nearly optimal. I honestly don't like X11/X13 much; the main problem I see is that it is inefficient for GPU users, it is inefficient for a small company doing FPGAs... and it is a pipe dream for a big investor who already wants to roll out their own ASIC (they are shelling out 7 digits, so paying four engineers instead of one isn't much of a problem). I simply don't like the idea, but it's a hash like everything else out there.