Since this is the beginners and help section, lets nitpick a bit, shall we?
Sure. You don't mind if I "nitpick" your nitpicking, do you?
No! Its not a mathematical problem and no solution to a problem is found. Its not even difficult to perform. This bad metaphor made it hard for me to understand PoW in the beginning.
The basic Idea is that the mining node (usually called "miner") is hashing certain information of the block.
And "hashing" is performing a specific set of mathematical steps, which could be described as "a mathematical problem", right?
To calculate a hash is easy,
Have you tried calculating one by hand? I don't think I'd describe it as easy.
there are USB sticks that calculate 333 Million of them per second (333 Megahash/s).
Correct. It might not be "easy", but a computer can do it VERY fast. Therefore we set a difficulty on the problem by enforcing certain conditions on the result. This makes it "difficult" to find an acceptable result, since you have to try MANY, MANY times before you find an acceptable answer.
The hash however has to meet certain criteria. In its binary form it has to have a certain amount of leading zeros.
No. The difficulty is not controlled with a number of zeros at the beginning of the hash. This bad metaphor made it hard for me to understand POW in the beginning. The basic Idea is that the resulting hash value must be less than an agreed target. This has a side effect of resulting in lots of zeros at the beginning of the hash value, but if all we did was chop off a zero each time, it would double the time between blocks for each zero. This would make it impossible to have more than 256 total difficulty adjustments, and would make it impossible to adjust the difficulty when the average block time is only 9 minutes (or 11 minutes).
Yes, the total number of coins will approach(!) 21 Million BTC, but never reach 21 Million unless the protocol is changed. Due to how the halving works the total number of Bitcoins will be 20999999.9769 BTC
Actually, due to some bugs in a mining program early on, there are 135.20897146 BTC that are completely missing. Therefore the total number of bitcoins that will ever be created if the protocol is not changed is only 20999864.76792854
Additionally, there are at least 2,610.0138685 BTC that are provably unspendable. Without a significant change to the protocol, you can safely assume that these bitcoins have been "destroyed". Therefore the maximum number of spendable bitcoins that can exist in the future won't exceed 20,997,254.75406004
Basically: the number of leading zeros the hash must have to be accepted changes. Thus you need to calculate more (or less) lottery tickets on average to win the reward (find a new block)
Basically: the value of the has must be lower than an agreed target.