I'm not sure I fully understand your idea.
In part 4 of your whitepaper, the difficulty function f(l) must be sensitive enough to l to satisfy E2(m) < E1(m), which means that f(max) must be much larger than f(min). How is f(l) designed?
I also don't really understand the selfish mining attack in your whitepaper. How was the 1/m^2 derived?
The difference in difficulty reduces the system's fork probability, which is indeed a good solution.
Good luck to you!
[moderator's note: consecutive posts merged]