I am going to reference some password analysis tools for comparison, specifically the ones at http://golubev.com/
Ivan's blog at http://www.golubev.com/blog/?paged=2 shows some speed comparisons.
Are we not hashing, or finding the hash, as efficiently as possible? On a GTS240, single SHA-1 hashing should be able to run at a rate of 133,000 Khash a second. With the CUDA client pudding has released, that same GPU is only cranking out 20,000 Khash a second.
I know we are looking for a SHA-256 hash, but every comparison I can find, from PHP to VS to C++, says that SHA-256 takes about twice as long as SHA-1 to calculate. So the rate shouldn't be dropping by a factor of six or seven; it should only drop by half.
Any ideas from the developers here?
Yes, at least three things are missing from the equation:
- Time to move data into the GPU. I've tested my algorithm hashing the same SHA-2 input over and over (so only one 64-byte transfer in) and got ~100 MH/s. Compare that to the 6 MH/s I get with the full implementation.
- We are actually doing two SHA calculations per attempt, so halve your expected rate.
- To avoid transferring all hashes back to system memory, the comparison of the result hash against the target difficulty is done on the GPU, which takes a slice of time there too.
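To make the last two points concrete, here is a rough CPU-side sketch in Python of what one attempt involves: two chained SHA-256 passes, then a comparison against the target. The names `double_sha256`, `meets_target` and `scan_nonces` are made up for illustration, and the 4-byte nonce suffix and little-endian reading of the digest are assumptions, not taken from pudding's client.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Two chained SHA-256 passes: the second hashes the first digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def meets_target(data: bytes, target: int) -> bool:
    """Compare the digest against the target.

    Assumption: the 32-byte digest is read as a little-endian integer.
    """
    return int.from_bytes(double_sha256(data), "little") <= target

def scan_nonces(prefix: bytes, target: int, start: int, count: int):
    """Yield only the nonces whose hash meets the target.

    This mirrors doing the comparison on the GPU: only the rare hits
    are reported back, not every hash.
    """
    for nonce in range(start, start + count):
        candidate = prefix + nonce.to_bytes(4, "little")  # hypothetical layout
        if meets_target(candidate, target):
            yield nonce

if __name__ == "__main__":
    easy_target = 1 << 248  # deliberately easy, roughly 1 hit in 256 tries
    hits = list(scan_nonces(b"\x00" * 76, easy_target, 0, 5000))
    print(len(hits), "hits in 5000 attempts")
```

Every attempt pays for both hash passes and the comparison, even though almost none of them produce a hit.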
Of course implementations can be heavily optimized: all loops unrolled, memory used more efficiently. But the achievable rate should never be anything over 25% of the SHA-1 test, I'd say.
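One way to arrive at that 25% figure, as a back-of-the-envelope sketch in Python: take the SHA-1 rate quoted in the question, halve it for SHA-256 costing about twice as much, and halve it again for the two hashes per attempt. The function name and the two factors are assumptions for illustration; transfer and comparison overhead are not modelled and would only lower the result.

```python
SHA1_RATE_KHASH = 133_000      # SHA-1 rate quoted for the GTS240
OBSERVED_RATE_KHASH = 20_000   # rate reported for the CUDA client

def rate_ceiling(sha1_rate: float,
                 sha256_cost_factor: float = 2.0,
                 hashes_per_attempt: int = 2) -> float:
    """Upper bound on attempts per second, ignoring all other overhead."""
    return sha1_rate / (sha256_cost_factor * hashes_per_attempt)

if __name__ == "__main__":
    ceiling = rate_ceiling(SHA1_RATE_KHASH)
    print(f"ceiling:  {ceiling:,.0f} Khash/s")
    print(f"observed: {OBSERVED_RATE_KHASH:,} Khash/s "
          f"({OBSERVED_RATE_KHASH / ceiling:.0%} of ceiling)")
```

By this estimate the 20,000 Khash/s client is already within a factor of two of the ceiling, not a factor of six or seven.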