I can believe that it would be possible.
As a simple example: a bignum library divides numbers into n-bit chunks and operates on those chunks one at a time. If the chunks were 64 bits wide instead of 32, the bignum library would (near enough) double in speed.
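To make that concrete, here is a minimal sketch (my own illustration, not any particular library's code) of bignum addition over 64-bit limbs. The names `limb_t` and `bignum_add` are made up; the point is that a 256-bit number is 4 limbs here, but would be 8 limbs of `uint32_t` on a 32-bit build, so the carry loop runs twice as many times:

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* One "chunk" (limb) of a bignum. On a 32-bit build this would be
 * uint32_t, doubling the number of loop iterations for the same value. */
typedef uint64_t limb_t;

/* Adds a and b (each n limbs, little-endian), stores the result in out,
 * and returns the final carry out of the top limb. */
static limb_t bignum_add(limb_t *out, const limb_t *a,
                         const limb_t *b, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t sum = a[i] + carry;
        carry = (sum < carry);   /* wrapped while adding the carry? */
        sum += b[i];
        carry += (sum < b[i]);   /* wrapped while adding b[i]? */
        out[i] = sum;
    }
    return carry;
}

int main(void)
{
    /* 256-bit numbers: 4 limbs at 64 bits vs 8 limbs at 32 bits. */
    limb_t a[4] = { UINT64_MAX, UINT64_MAX, 0, 0 }; /* 2^128 - 1 */
    limb_t b[4] = { 1, 0, 0, 0 };
    limb_t r[4];
    bignum_add(r, a, b, 4);
    printf("%llu %llu %llu %llu\n",
           (unsigned long long)r[0], (unsigned long long)r[1],
           (unsigned long long)r[2], (unsigned long long)r[3]);
    return 0; /* prints "0 0 1 0", i.e. 2^128 */
}
```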
There is more to it than that, of course; optimisation is about more than just chunk size.
Also, just because the existing code doesn't show a speed-up on 64-bit doesn't mean that 64-bit optimisations aren't possible with different algorithms, such as the one sketched below.
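One hedged example of a "different algorithm" that only pays off with wide registers is SWAR (SIMD within a register): a classic population count processes eight packed bytes per 64-bit word, twice what a 32-bit register can hold. The function name `popcount64` is mine; the bit-twiddling itself is the well-known parallel-sum technique:

```c
#include <stdint.h>
#include <stdio.h>

/* SWAR population count: sums bits in parallel lanes within one 64-bit
 * word, so eight bytes are counted per iteration instead of four. */
static unsigned popcount64(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);           /* 2-bit sums */
    x = (x & 0x3333333333333333ULL)
      + ((x >> 2) & 0x3333333333333333ULL);               /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;           /* 8-bit sums */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56); /* total */
}

int main(void)
{
    printf("%u\n", popcount64(0xFFFF0000FFFF0000ULL)); /* prints 32 */
    return 0;
}
```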
Edit: of course, none of this makes any difference when the GPU is doing all the calculating.