Currently miners pick whichever block arrives first, so the "secretives" have an incentive to delay public blocks, an incentive to listen in on other mining pools (to detect an early public block announcement and react to it quickly), and an incentive to inject/publish their secret block as fast as possible towards other miners that have not yet received a new block.
Your analysis is pretty incomplete. Their success at winning that "fast as possible" race, along with their total hashrate, is essential. Otherwise a delay is a pure loss.
It's indeed a bit more complex than the article seems to make out, at least judging from its algorithm. (The article's link is now broken; apparently the PDF converter crashed. Update: v1, from 1 November, still seems to be available; v2, from 4 November, seems to have crashed.) Later on, at page 9, the article examines the revenues, which is basically what my posting below is about. However, their algorithm does not include my refinement below: I don't think it includes spreading/delaying techniques, that is, sometimes sabotaging the enemy and sometimes helping the enemy when that is beneficial. Of course this needs to be formulated in code/extra lines, but it shouldn't be too hard. My posting examines the friction between calculation speed and spread speed, and could help to make clearer decisions about what to do in certain cases:
Let's denote the attackers as the secretives: S
Let's denote the public as the publics: P
Working on blocks now has a few possibilities:
PSS
PSP
PPS
PPP
Both start working on the previous public block indicated by the first P.
PS (Group S works on next block S based on block P)
PP (Group P works on next block P based on block P)
Both find their block at the same time.
S wants to keep it secret to continue work on the next S.
P wants to publish their P.
Here it's clear that S wants to delay P as long as possible, so a delay of the enemy (P) benefits S. Do you agree with this so far?
Now suppose P was late.
(the first P of the three letters is now ignored so:
SS
SP
PS
PP
are the possibilities)
S can now comfortably work on their next S.
P catches up and publishes their first P.
S now has to publish their first S, does so, and hopes to win against P. Again S has an incentive to delay P.
However, let's assume S is not too successful at delaying the spread of block P.
Now both blocks arrive at miners, and both are candidates to be included in the blockchain.
Now a situation occurs where S benefits from helping P spread their next block P as fast as possible, but only if next block P was based/calculated on block S.
Otherwise S again benefits from delaying next block P. If next block P was not based on block S, then S loses its block, and block P becomes part of the main chain, followed by next block P.
Therefore, for this selfish mining algorithm to work well, blocks have to be analyzed.
Sometimes it's beneficial for S to sabotage/delay P and sometimes it's beneficial for S to help/speed up spread of P.
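To make the "blocks have to be analyzed" part a bit more concrete, here is a minimal sketch in Python. The Block class, its field names and the function reaction_to_public_block are all made up for this posting (they are not from the article or from any Bitcoin client); the sketch only encodes the rule above: help if the new P block builds on a block of S, delay otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    # Hypothetical, simplified block: just an id, the id of its parent,
    # and which group mined it ("S" or "P").
    block_id: str
    parent_id: str
    miner: str


def reaction_to_public_block(new_block, secret_ids):
    """Decide how S reacts to a new block from P.

    Returns "help" if the new block builds on one of S's own blocks
    (spreading it embeds block S in the main chain), else "delay".
    """
    if new_block.parent_id in secret_ids:
        return "help"
    return "delay"


# Both groups found a block on top of the same public block p0.
s1 = Block("s1", "p0", "S")
p1 = Block("p1", "p0", "P")
own = {s1.block_id}

# Next block P, once based on block S and once based on block P.
next_on_s = Block("p2a", "s1", "P")
next_on_p = Block("p2b", "p1", "P")

print(reaction_to_public_block(next_on_s, own))
print(reaction_to_public_block(next_on_p, own))
```

This ignores the speed question entirely; it only looks at which block the newcomer was based on.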
However, this assumes that S cannot calculate next block S quickly enough; if they could, it would still not be wise to help P.
However, it's more likely that next block P will spread before they can calculate next block S based on previous block S, so it seems clear to take the money and run... be satisfied with what you got so far.
However it's clear that there is some friction here.
Calculation speed of next block S based on block S versus spread of next block P based on block S.
S now has a choice:
1. Delay the spread of next block P based on block S, thereby undermining its own block S, in the hope of calculating next block S based on block S itself and thus getting a double reward.
versus
2. Accept the spread of next block P based on block S, thereby at least embedding block S into the main blockchain, throw away the calculations on next block S, start over based on next block P, and thus get a single reward.
As long as calculating next block S based on block S is slower than the spread of next block P based on block S, it seems to make sense to choose option 2.
In other words, the algorithm will need to be adaptive: it has to determine whether the opponent's spreading block was based on a previous block from S or a previous block from P, and then do what it thinks is best based on calculation speed versus spreading speed. Spreading seems to take a few seconds, while calculating a block can take 10 minutes.
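A rough way to put numbers on the speed comparison, as a sketch only: I assume here that block finding behaves like a Poisson process (so the waiting time is exponentially distributed), that the whole network finds a block every 600 seconds on average, and that S owns a fixed share of the hashrate. The names prob_block_within and choose_option, the hash_share value and the threshold are my own assumptions for illustration.

```python
import math


def prob_block_within(window_s, hash_share, mean_interval_s=600.0):
    """Probability that S finds a block within window_s seconds.

    Assumes exponentially distributed block times; S's rate is its
    hashrate share divided by the network-wide mean block interval.
    """
    rate = hash_share / mean_interval_s
    return 1.0 - math.exp(-rate * window_s)


def choose_option(window_s, hash_share, threshold=0.5):
    """Return 1 (delay and gamble on a double reward) or 2 (accept).

    The threshold is an arbitrary assumption; a real algorithm would
    have to weigh the rewards and the risk of losing block S.
    """
    if prob_block_within(window_s, hash_share) > threshold:
        return 1
    return 2


# Example: next block P needs about 5 seconds to spread, S has 30% hashrate.
p = prob_block_within(window_s=5.0, hash_share=0.3)
print(p < 0.01)
print(choose_option(window_s=5.0, hash_share=0.3))
```

With a spread time of a few seconds the chance of S calculating its next block in time comes out well under 1% under these assumptions, which is why option 2 looks like the sensible default unless S can delay the spread for a very long time.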
However, other factors may influence this: denial of service attacks, network attacks, crashes, etc.
(Also, this posting only looks at S versus P; it does not include S versus S versus P.)