There is some tips to speed-up keyhunt-cuda (rotor-cuda):
Apply this then you need less grid size, like 4096x512 will be enough for 4090:
https://bitcointalk.org/index.php?topic=5244940.msg63526413#msg63526413Also change this:
__device__ __noinline__ void CheckHashSEARCH_MODE_SA(uint64_t* px, uint64_t* py, int32_t incr, uint32_t* hash160, uint32_t* out)
{
switch (mode) {
case SEARCH_COMPRESSED:
CheckHashCompSEARCH_MODE_SA(px, (uint8_t)(py[0] & 1), incr, hash160, out);
break;
case SEARCH_UNCOMPRESSED:
CheckHashUnCompSEARCH_MODE_SA(px, py, incr, hash160, out);
break;
case SEARCH_BOTH:
CheckHashCompSEARCH_MODE_SA(px, (uint8_t)(py[0] & 1), incr, hash160, out);
CheckHashUnCompSEARCH_MODE_SA(px, py, incr, hash160, out);
break;
}
}
to this because doing switch-case in kernel is very bad idea:
__device__ __noinline__ void CheckHashSEARCH_MODE_SA(uint64_t* px, uint64_t* py, int32_t incr, uint32_t* hash160, uint32_t* out)
{
CheckHashCompSEARCH_MODE_SA(px, (uint8_t)(py[0] & 1), incr, hash160, out);
}
also maxFound can be completely removed to search puzzle, because we need only one return result anyway
Rotor-cuda speed with this mods:
[00:17:10] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.453247 %] [R: 0] [T: 6,412,923,043,840 (43 bit)] [F: 0]
[00:17:11] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.500549 %] [R: 0] [T: 6,421,244,542,976 (43 bit)] [F: 0]
[00:17:12] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.547852 %] [R: 0] [T: 6,429,566,042,112 (43 bit)] [F: 0]
[00:17:13] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.595154 %] [R: 0] [T: 6,437,887,541,248 (43 bit)] [F: 0]
[00:17:15] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.642456 %] [R: 0] [T: 6,446,209,040,384 (43 bit)] [F: 0]
[00:17:16] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.689758 %] [R: 0] [T: 6,454,530,539,520 (43 bit)] [F: 0]
[00:17:17] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.737061 %] [R: 0] [T: 6,462,852,038,656 (43 bit)] [F: 0]
[00:17:18] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.784363 %] [R: 0] [T: 6,471,173,537,792 (43 bit)] [F: 0]
[00:17:20] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.831665 %] [R: 0] [T: 6,479,495,036,928 (43 bit)] [F: 0]
[00:17:21] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.878967 %] [R: 0] [T: 6,487,816,536,064 (43 bit)] [F: 0]
Thanks.