Where is your analysis that 128 bits of entropy is required?
128 bits isn't an absolute requirement; it's a comfortable rule of thumb. You can arrive at basically this number by making conservative estimates about the energy requirements of brute force. For example, an optimal classical computer doing nothing but incrementing a counter would need roughly the energy equivalent of 240 million tons of TNT to count from 0 to 2^128-1, which puts the search out of reach under whatever threat model or algorithmic speedups you wish to suppose.
This, plus the fact that 128 bits of security is almost always very cheap to have, has resulted in the conventional wisdom that cryptosystems with less security than that are snake oil. You can probably drop a couple of bits, wave claims of strengthening at it, and still pass the smell test, but not much more than that.
The whole Bitcoin system was designed to provide at least 128 bits of security for this reason.
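For anyone who wants to check the order of magnitude of that energy figure, here is a rough sketch. It assumes the Landauer limit (kT ln 2 per bit operation) at 300 K and exactly one such operation per counter increment. Those assumptions are mine, picked because they land close to the number quoted above; the constant and function names are just for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)
TEMPERATURE_K = 300.0             # assumption: roughly room temperature
TNT_TON_J = 4.184e9               # conventional energy of one ton of TNT


def counting_energy_tons_tnt(bits: int) -> float:
    """Energy to step a counter through 2**bits states, in tons of TNT.

    Assumes one Landauer-limit bit operation per increment, which is a
    deliberately generous lower bound for the attacker.
    """
    joules_per_step = BOLTZMANN_J_PER_K * TEMPERATURE_K * math.log(2)
    total_joules = joules_per_step * 2.0 ** bits
    return total_joules / TNT_TON_J


if __name__ == "__main__":
    for bits in (64, 80, 128):
        tons = counting_energy_tons_tnt(bits)
        print(f"{bits} bits: {tons:.3g} tons of TNT")
```

Under these assumptions the 128-bit case comes out in the low hundreds of millions of tons, and a real machine would be many orders of magnitude less efficient than this bound.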
Also, why is a deterministic wallet held to a higher standard than an encrypted wallet? Why doesn't the mainline client then reject any passphrase without 128 bits of entropy?
Because most attackers will not have the encrypted wallet. Your security is passphrase PLUS wallet, which is an enormously higher standard than just passphrase. Belt and suspenders. And what I'm describing for deterministic wallets is effectively the same: Something you have (the random seed) and something you know (the passphrase).
Moreover, you can't actually measure entropy. You can make guesses based on assumed source models, but you don't really know it unless you generated it. Rejecting passwords by some simplistic model actually reduces entropy.
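A toy count shows why rejection rules can shrink the search space. This sketch assumes 4-character passwords over lowercase letters and digits, chosen uniformly, and a hypothetical "must contain at least one digit" rule. Real users are not uniform, so treat the numbers as an upper bound on entropy, not a measurement.

```python
import math

LOWERCASE = 26
DIGITS = 10
LENGTH = 4  # toy length so the numbers stay readable


def password_space():
    """Return (all_candidates, accepted_candidates) for the toy rule."""
    all_candidates = (LOWERCASE + DIGITS) ** LENGTH
    rejected = LOWERCASE ** LENGTH  # passwords with no digit at all
    return all_candidates, all_candidates - rejected


if __name__ == "__main__":
    before, after = password_space()
    print(f"before rule: {before} candidates, {math.log2(before):.2f} bits")
    print(f"after rule:  {after} candidates, {math.log2(after):.2f} bits")
```

The rule tells an attacker which candidates to skip, so the best case after filtering is smaller than the best case before it.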
That combined with salt would make even 30 or 40 bits of entropy impossible to brute force.
Whoa whoa whoa. Full stop. Salt? Where does this 'salt' come from? If you make the 'salt' at least 128 bits and store it, you have _exactly_ what I've described. And that's a fine thing: so long as there is enough entropy from strongly random sources to make blind attacks infeasible, then it's all good. But you still have to record that salt someplace.
(and if you're strengthening you also need to store the strengthening amount, unless you always strengthen to the least common denominator)
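To make the "you still have to record it" point concrete, here is a minimal sketch of a salted, strengthened derivation. PBKDF2-HMAC-SHA256, the 128-bit salt, the iteration count, and the names `new_wallet_record` and `derive_seed` are all my assumptions for illustration. This is not the scheme of any actual wallet.

```python
import hashlib
import secrets


def new_wallet_record(iterations: int = 100_000) -> dict:
    """Create the data that must be stored: a random salt and the work factor."""
    return {
        "salt": secrets.token_bytes(16),  # 128 bits from the OS CSPRNG
        "iterations": iterations,         # the strengthening amount
    }


def derive_seed(passphrase: str, record: dict) -> bytes:
    """Derive a 32-byte seed from something you know plus something you have."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        record["salt"],
        record["iterations"],
        dklen=32,
    )


if __name__ == "__main__":
    record = new_wallet_record()
    seed = derive_seed("correct horse battery staple", record)
    print(seed.hex())
```

Lose the record and the passphrase alone no longer reproduces the seed, which is exactly the two-factor property described above.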
It's pretty hard to reason about strengthening, because you can't generally prove that there isn't a way to shortcut it. In fact, if you assume quantum computation you get a minimum speedup of sqrt(n) for any possible strengthening scheme. Strengthening has practical value and should be used whenever weak passwords might be used, but it's not a replacement for real entropy.
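Some back-of-the-envelope arithmetic on what strengthening buys, as a sketch only. It assumes the strengthening function cannot be shortcut (which, as noted, you generally can't prove) and models the quantum speedup crudely as a Grover-style search that halves the passphrase bits while still paying the full strengthening cost per evaluation. `effective_bits` is a hypothetical helper, not a standard metric.

```python
import math


def effective_bits(passphrase_bits: float,
                   strengthening_iterations: int,
                   quantum: bool = False) -> float:
    """Rough log2 of the attacker's work, counted in hash operations."""
    # Grover-style search needs about sqrt(N) evaluations instead of N.
    search_bits = passphrase_bits / 2 if quantum else passphrase_bits
    return search_bits + math.log2(strengthening_iterations)


if __name__ == "__main__":
    # 40-bit passphrase, 2**20 iterations of strengthening
    print(effective_bits(40, 2 ** 20))                # 60.0
    print(effective_bits(40, 2 ** 20, quantum=True))  # 40.0
    # 128 bits of real entropy, no strengthening at all
    print(effective_bits(128, 1))                     # 128.0
```

Even a million-fold strengthening adds only about 20 bits, so it cannot close the gap between a human-chosen passphrase and 128 bits of real entropy.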
Example of high security deterministic wallet:
You're assuming that the passphrase has 40 bits of entropy; this is a fundamental error. Multiple studies have shown that it's basically impossible to get high-entropy passphrases from humans, even if you give them excellent advice. People on this forum have frequently bragged about their oh-so-secure schemes, which actually provide fairly little entropy. This isn't because they're bad or stupid, or because they deserved to get robbed; being random is something that humans are just not good at.
(The 30-minute assumption is insane too, since almost all users would choose a less secure alternative over waiting 30 minutes. But whatever, taking off a factor of 100 isn't what breaks your argument.)
Moreover, even if the user is bad and stupid and deserves to get robbed, when they _do_ get robbed the reputation of the whole system is called into question. Responsible, security-conscious developers build systems which remain secure even in the face of user stupidity: ones which fail only in the face of unstoppable, heroic stupidity that would be obvious to even the most unsophisticated observers.
Even the most intelligent users will sometimes make boneheaded moves, so even if you're confident that you're better than the typical user, you should still strongly prefer software that isn't gratuitously vulnerable to operator error. Any developer who isn't assuming that their users will make mistakes, will choose passwords with less entropy than they think they have, will leak partial passwords to shoulder surfers, etc. just hasn't studied the problem space hard enough.