Shouldn't every ECDSA key -> pubkey -> hashamahash -> base58check -> address be valid?
This is correct, every key should give a valid address... that is, if all the calculations were fully completed.
vanitygen doesn't compute the checksum part of the address while searching. These last bytes are all set to 0 to make sure the length is correct.
The first part of the generated address (~25 chars) does not depend on the checksum; the 4 checksum bytes only affect the last few base58 characters.
Skipping the checksum gains a lot of speed when searching for a longer match.
(The checksum is actually always computed in the -regex mode, where the speed difference can be observed.)
Only when a match is found is the checksum calculated and the WIF and final address stored.
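
To make the zeroed-checksum trick concrete, here is a minimal sketch in Python. It is not vanitygen's code (vanitygen is written in C); the helper names and the stand-in hash160 value are made up for illustration. It only shows that the front of the address stays the same whether the last 4 bytes are zeros or the real double-SHA256 checksum.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58encode(data: bytes) -> str:
    """Plain base58: big-endian integer conversion, one '1' per leading zero byte."""
    n = int.from_bytes(data, "big")
    digits = ""
    while n > 0:
        n, r = divmod(n, 58)
        digits = ALPHABET[r] + digits
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + digits

def address(hash160: bytes, with_checksum: bool) -> str:
    """Version byte 0x00 + 20-byte hash + 4 checksum bytes (real or zeroed)."""
    payload = b"\x00" + hash160
    if with_checksum:
        check = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    else:
        check = b"\x00" * 4  # placeholder bytes, only there to keep the length right
    return b58encode(payload + check)

def common_prefix(a: str, b: str) -> str:
    """Longest shared leading substring of a and b."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return a[:i]

# Hypothetical stand-in for a real RIPEMD160(SHA256(pubkey)) value.
hash160 = hashlib.sha256(b"example public key").digest()[:20]

partial = address(hash160, with_checksum=False)  # cheap form, enough for prefix matching
full = address(hash160, with_checksum=True)      # only needed once a match is found
common = common_prefix(partial, full)
print(partial)
print(full)
print(len(common), "leading characters are identical")
```

A prefix search only ever looks at the leading characters, so comparing against `partial` gives the same answer as comparing against `full`, and the two extra SHA256 rounds can wait until a match is found.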
When matching any valid bitcoin address, i.e. just the leading "1", every key is a match, so the checksum has to be generated every time.
This part is currently not as optimized for speed as the generator part; for what vanitygen is meant to do, that optimization is not actually needed.
On the CPU the hashing is done in batches of 256 ECDSA keys. I have to recheck the code, but I think that when a match is found
the remaining 255 might no longer be checked/validated.
The ECDSA part of the generation takes more than 70% of the CPU time.
That is probably the reason "1" and "1A" are relatively equal in speed: statistically each batch will have about 10 "1A" matches.
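
The "about 10 per batch" figure can be checked with a bit of counting. The sketch below is my own back-of-the-envelope calculation, not something taken from vanitygen; the function name is made up. It treats the 24 bytes after the 0x00 version byte as a uniformly random integer and counts how many of those values base58-encode to something starting with the given character.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
BATCH = 256  # keys per batch, as mentioned above

def second_char_fraction(ch: str) -> float:
    """Fraction of version-0 addresses whose character right after the leading '1' is ch."""
    d = ALPHABET.index(ch)
    if d == 0:
        raise ValueError("'11' needs a leading zero byte in the hash, not covered here")
    total = 1 << 192  # all 24-byte values (20-byte hash + 4 checksum bytes)
    floor = 1 << 184  # below this the hash starts with 0x00 and the address starts with '11'
    count = 0
    power = 1  # 58**k, the weight of the leading base58 digit
    while d * power < total:
        lo = max(d * power, floor)
        hi = min((d + 1) * power, total)
        if hi > lo:
            count += hi - lo  # values whose leading base58 digit is d
        power *= 58
    return count / total

expected = BATCH * second_char_fraction("A")
print(f"expected '1A' matches per batch of {BATCH}: {expected:.1f}")
```

The result should land close to the "about 10" estimate, so with a "1A" pattern nearly every batch contains a match, much like with a bare "1".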
The ECDSA engine and the hashing up to the checksum are highly optimized.
If the checksum and base58 parts were optimized as well, then skipping the match check and just storing all results to a file would make
this program a faster 'any address' generator.