... it would make more sense to have it longer, it could decrease the overhead ...
Surely a longer nonce would increase the overhead, because more bytes must be processed on every hashing attempt.
From my understanding, getwork has to be called more frequently because of the small nonce: with only 32 bits, a work unit is used up after 2^32 (about 4.3 billion) hashes.
Perhaps a modified getwork2 could send part of the merkle tree with a blank extranonce. The worker could then fill in the extranonce with a random value to generate its own work units, so it wouldn't have to hit the server every 4GH. If collisions were a problem, each worker could be given its own range of extranonce values.
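To make the getwork2 idea concrete, here is a rough sketch of what the worker side might do. Everything named here is hypothetical: getwork2 does not exist, and `coinbase_prefix`, `coinbase_suffix`, `merkle_branch` and `make_extranonce` are just illustrative names for what the server might send and how a worker might pick values from its assigned range. The byte strings are placeholders, not real transaction data. The only part taken from how Bitcoin actually works is the double SHA-256 and the fact that the coinbase is the leftmost leaf of the merkle tree.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin transaction and merkle hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def make_extranonce(worker_id: int, counter: int) -> bytes:
    """Hypothetical layout: 2 bytes of worker id + 2 bytes of local counter.

    Giving each worker a distinct worker_id is one way to hand out
    non-overlapping extranonce ranges, so no two workers repeat work.
    """
    return worker_id.to_bytes(2, "big") + counter.to_bytes(2, "big")


def merkle_root(coinbase_prefix: bytes, extranonce: bytes,
                coinbase_suffix: bytes, merkle_branch: list) -> bytes:
    # Hash the coinbase transaction with the worker's extranonce spliced in.
    node = dsha256(coinbase_prefix + extranonce + coinbase_suffix)
    # The coinbase is the leftmost leaf, so at every level the sibling
    # hash supplied by the server is concatenated on the right.
    for sibling in merkle_branch:
        node = dsha256(node + sibling)
    return node


# Placeholder data standing in for what the server would send once.
prefix = b"\x01" * 40
suffix = b"\x02" * 60
branch = [b"\x11" * 32, b"\x22" * 32]

# Each counter value gives a new merkle root, i.e. a fresh 2^32 nonce range,
# without another round trip to the server.
roots = [merkle_root(prefix, make_extranonce(7, c), suffix, branch)
         for c in range(3)]
print(len(set(roots)))  # number of distinct work units generated locally
```

The server only needs to send the branch (one hash per tree level), not every transaction, so the extra data per request stays small.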
Having the workers increment the timestamp is also an option, but each one-second step only buys another 2^32 nonces, so above 4GH/s this is not enough by itself.
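The 4GH/s limit falls straight out of the arithmetic. A quick sketch, assuming the worker only advances the timestamp in step with real time (one new value per second); the function names are made up for illustration:

```python
NONCE_SPACE = 2 ** 32  # a 32-bit nonce gives 4,294,967,296 attempts per header


def seconds_per_work_unit(hashrate: float) -> float:
    """How long one header lasts before the nonce range is exhausted."""
    return NONCE_SPACE / hashrate


def timestamp_rolling_suffices(hashrate: float) -> bool:
    # One fresh header per second, each worth NONCE_SPACE hashes,
    # so the worker can stay busy only up to NONCE_SPACE hashes per second.
    return hashrate <= NONCE_SPACE


for rate in (1e9, 5e9):
    print(rate, seconds_per_work_unit(rate), timestamp_rolling_suffices(rate))
```

At 1 GH/s a header lasts a little over 4 seconds, so bumping the timestamp keeps up. At 5 GH/s a header lasts less than a second and the worker runs dry before the next timestamp is available.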
A pool could use a higher share difficulty to reduce the rate at which shares are generated, but I don't think it would decrease the load on the pool server much, because the getwork requests occur at least once per 4GH regardless of the share difficulty.
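To put rough numbers on that, here is a small sketch comparing the two request rates. It assumes a difficulty-1 share takes about 2^32 hashes on average (a common approximation) and that the worker fetches exactly one getwork per exhausted nonce range; the function names are hypothetical.

```python
NONCE_SPACE = 2 ** 32             # hashes covered by one getwork
HASHES_PER_DIFF1_SHARE = 2 ** 32  # approximate expected hashes per difficulty-1 share


def getwork_rate(hashrate: float) -> float:
    """Getwork requests per second; note it does not depend on difficulty."""
    return hashrate / NONCE_SPACE


def share_rate(hashrate: float, share_difficulty: float) -> float:
    """Expected share submissions per second at a given share difficulty."""
    return hashrate / (share_difficulty * HASHES_PER_DIFF1_SHARE)


hashrate = 1e9  # a 1 GH/s worker
for difficulty in (1, 8, 64):
    print(difficulty, getwork_rate(hashrate), share_rate(hashrate, difficulty))
```

Raising the difficulty from 1 to 64 cuts share submissions by a factor of 64, but the getwork column stays the same, so total requests to the server drop by less than half.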
It would be nice if pool operators could tune the share difficulty to be higher with more variance and lower server load, vs lower share difficulty for less variance and higher server load.
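The variance side of that trade-off can be estimated too. This is only a sketch, assuming share finds behave like a Poisson process (so the relative spread of the share count over a period is 1/sqrt(expected shares)) and that a difficulty-1 share takes about 2^32 hashes; `relative_share_stddev` is a made-up name.

```python
import math

HASHES_PER_DIFF1_SHARE = 2 ** 32  # approximate expected hashes per difficulty-1 share


def expected_shares(hashrate: float, share_difficulty: float, seconds: float) -> float:
    """Average number of shares a worker finds in the given period."""
    return hashrate * seconds / (share_difficulty * HASHES_PER_DIFF1_SHARE)


def relative_share_stddev(hashrate: float, share_difficulty: float, seconds: float) -> float:
    # Poisson assumption: variance equals the mean, so stddev / mean = 1 / sqrt(mean).
    return 1.0 / math.sqrt(expected_shares(hashrate, share_difficulty, seconds))


hashrate = 1e9  # a 1 GH/s worker
day = 24 * 3600
for difficulty in (1, 8, 64):
    print(difficulty, relative_share_stddev(hashrate, difficulty, day))
```

Under these assumptions, each 4x increase in share difficulty doubles the relative spread in a worker's counted shares over the same period, which is the price paid for fewer submissions.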