Bitcoin Forum
November 19, 2024, 04:12:30 AM *
Pages: « 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 [47] 48 49 50 51 52 53 54 55 »
Author Topic: == Bitcoin challenge transaction: ~1000 BTC total bounty to solvers! ==UPDATED==  (Read 54269 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic. (11 posts by 1+ user deleted.)
WanderingPhilospher
Full Member
Activity: 1204
Merit: 237
Shooters Shoot...
December 12, 2023, 05:54:29 AM
 #921

While running some performance tests with Rotor-Cuda I've noticed that when I assign a monstrous grid size to my GPU, I can get more speed.

If I set it like this --gpux 18000,512 I get a steady 4.62 GK/s peaking at 6.90 GK/s for a few seconds.

Can this cause any problems, like skipping keys during the search? If so, can anyone recommend a good grid-size for a 3080ti?

Thanks in advance!
The best thing to do is to test your grid size and run through a small range, something like a 2^40 range. See if the grid size finds the key or not.

I use a similar version of KeyHunt Cuda / Rotor and haven't missed a key with a large grid size. But seriously, run a simple test to know for sure with your card and setup.
CY4NiDE
Member
Activity: 63
Merit: 14
December 12, 2023, 06:38:59 AM
 #922

Hey there, thanks for your reply. Much appreciated.

So if it can pass the 2^40 test without skipping any keys can I deem it safe?

So far no problems with 2^35, 2^38, 2^39, 2^40.

--gpux 32000,512

1CY4NiDEaNXfhZ3ndgC2M2sPnrkRhAZhmS
WanderingPhilospher
December 12, 2023, 06:45:40 AM
 #923

I would run a few tests. Put some keys at the beginning of range, the middle and the end. I've used some large grid sizes with no issues.

If you are using the rekey option, you will get fluctuation in your speed no matter the grid size, as it spins up to rekey and then picks back up.
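
One way to pick the test keys described above (beginning, middle and end of the range, plus some in between) is sketched here. It only chooses the private-key positions; turning them into the public keys or addresses that the search tool takes as input is left to whatever library you already use. The function name, and the choice of 2^39 to 2^40 - 1 as the "2^40 range", are assumptions for illustration.

```python
def plant_positions(start: int, end: int, count: int) -> list[int]:
    """Return `count` private-key positions spread evenly over [start, end].

    The first position is `start` and the last is `end`, so both edges of
    the range are always covered.
    """
    if count < 2:
        raise ValueError("need at least two positions to cover both edges")
    if end <= start:
        raise ValueError("end must be greater than start")
    span = end - start
    # Integer arithmetic keeps the positions exact even for very large ranges.
    return [start + (span * i) // (count - 1) for i in range(count)]


# Assumed test range: the 40-bit keys, 2^39 .. 2^40 - 1.
positions = plant_positions(2**39, 2**40 - 1, 20)
for key in positions:
    print(hex(key))
```

Using evenly spaced positions rather than random ones makes a failed test easier to read: if only the keys near the start are found, that points at the range split rather than at bad luck.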
CY4NiDE
December 14, 2023, 08:20:06 PM
 #924

I definitely got ahead of myself.  Grin

I ran X-Point mode using --gpux 32000,512 against 20 keys spread over the 2^40 range and only 2 of those keys were being found.

Same with --gpux 18000,512 and anything in between.

In the end, only with --gpux 1024,512 was the program able to find all 20 keys without skipping any.

I'll run 2^50 next against more keys to see if this effect gets mitigated as the space increases.

Haven't checked Address mode or Hash160 mode yet.
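
A small bookkeeping sketch for this kind of test: compare the keys that were planted with the keys the program reported, and list the ones that were skipped. The function name and the example values are hypothetical; how you collect the found keys depends on the output of the tool you run.

```python
def coverage_report(planted: set[int], found: set[int]) -> tuple[float, list[int]]:
    """Return (hit rate, sorted list of planted keys that were not found)."""
    if not planted:
        raise ValueError("no planted keys to check")
    missed = sorted(planted - found)
    hit_rate = (len(planted) - len(missed)) / len(planted)
    return hit_rate, missed


# Hypothetical run: 20 planted keys, the program reported only 2 of them.
planted = set(range(1000, 1020))
found = {1000, 1019}
rate, missed = coverage_report(planted, found)
print(f"found {rate:.0%} of planted keys, missed {len(missed)}")
```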

1CY4NiDEaNXfhZ3ndgC2M2sPnrkRhAZhmS
digaran
Copper Member
Hero Member
Activity: 1330
Merit: 899
🖤😏
December 14, 2023, 08:51:53 PM
 #925

You might be interested to read and learn more about grid sizes. There are more stats on this nvidia page, including technical stats on what the acceptable grid size is for different applications. You can't simply use any arbitrary grid size.

🖤😏
WanderingPhilospher
December 14, 2023, 11:08:07 PM
Last edit: December 14, 2023, 11:24:02 PM by WanderingPhilospher
Merited by digaran (1)
 #926

Digger doesn't even own a GPU or PC, lol. He's doing all of his tests from an old Blackberry flip phone Smiley

I've had some large grid sizes, CY4NiDE, but I keep them multiples of the stock grid. If the card's stock grid is 38,128, then I keep a front grid that is a multiple of 38. I normally run a multiple of 38x256. I've used 760x512 with no issues, 1520x512 with no issues, and I think one multiple higher.

Those are tests with KeyHunterGPU; Rotor has some flaws in it, especially if using the continue option. I stopped using/testing all versions of Rotor after I discovered a bug with the continue option, because it was causing keys to be skipped.
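
The "keep the front grid a multiple of the stock value" habit can be written down as a small helper. This is only an illustration of the arithmetic: the function names are made up, and 38 is the stock value from the example card above, not a value that applies to every GPU.

```python
def is_stock_multiple(grid_x: int, stock_x: int) -> bool:
    """True if the first grid value is a whole multiple of the stock value."""
    return grid_x > 0 and grid_x % stock_x == 0


def snap_to_stock(grid_x: int, stock_x: int) -> int:
    """Round a requested grid value down to the nearest multiple of stock_x."""
    return max(stock_x, (grid_x // stock_x) * stock_x)


stock = 38  # stock first grid value of the example card
for requested in (760, 1520, 18000, 32000):
    print(requested, is_stock_multiple(requested, stock), snap_to_stock(requested, stock))
```

Passing this check does not prove a grid size is safe; it only narrows the candidates worth running through the planted-key test.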



CY4NiDE
December 15, 2023, 12:37:07 AM
Last edit: December 15, 2023, 12:54:04 AM by CY4NiDE
Merited by Halab (2), JayJuanGee (1)
 #927

Yeah, I was testing different grids for my card the other day, within the reasonable bounds, keeping it a small multiple of the original grid.

After a while I decided to play with it a bit and increased the grid by larger factors, thus arriving at numbers like 18000 and 32000. They are not arbitrary.

I thought the program wouldn't even start. To my surprise, not only did it run, but the speed increased. Then it occurred to me that it was probably jumping over a bunch of keys.  Roll Eyes

It was too good to be true. I was getting a constant 5GK/s with sudden peaks to 9GK/s every few seconds running sequential X-Point mode with a grid-size like --gpux 36000x512 against #130.

Raise it much further than that and the speed will keep dropping to 0.00 MK/s for a few seconds during the entire search.

Anyways, if going for random mode I guess this issue could be overlooked? One can have more threads thus searching faster, the trade-off being skipping some keys...

About this other flaw [in a scenario where the grid-size is not causing it to skip keys] could I avoid it by updating the lower range in my .bat file to be the last key shown in the counter before terminating the session, instead of using continue.bat?

Thanks!

1CY4NiDEaNXfhZ3ndgC2M2sPnrkRhAZhmS
WanderingPhilospher
December 15, 2023, 04:16:33 AM
Merited by JayJuanGee (1)
 #928

Yes, you could. I implemented a total key counter in mine. It prints the total number of keys to a file, so that even if the power went out, I'd have a good starting point.

If you do it this way, make sure you keep/know the start and end range. You then need to take the total keys run, divide by the number of threads (grid size), and add that number to your initial/last start AND end range.

Example:
Say you had a start range of 0 and an end range of 1000000 (keep it small for this purpose), your grid size was 10x10, and the program says you have run/checked 10,000 keys total.
Take 10,000 (total keys) and divide by 10x10=100 (grid size); 10,000 / 100 = 100. So each GPU thread checked 100 keys.
So for your next batch file, you would have a start/end range of 100:1000100.
If you only change the start range by 100, then you are overlapping/possibly missing keys checked on the other threads. If you stop and think about it, or do the math, it'll make sense.
Your first thread checked 0-100 (now on the second run it should start at 100 and be on the hook to check up to 10,100); the last thread checked 990,000-990,100. If you don't adjust the end range as well, your last thread will now be checking 999,900 instead of starting where it left off at 990,100.
Lol, again, if you do the math you'll understand. Hope it made some sense.
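
The arithmetic in the example above, as a sketch. It assumes, as the explanation does, that every GPU thread has checked the same number of keys. The function name is hypothetical and this is not code from Rotor or KeyHunt.

```python
def resume_range(start: int, end: int, total_keys: int,
                 grid_x: int, grid_y: int) -> tuple[int, int]:
    """Shift both ends of the range by the number of keys each thread checked."""
    threads = grid_x * grid_y
    per_thread = total_keys // threads  # keys checked by each GPU thread
    return start + per_thread, end + per_thread


new_start, new_end = resume_range(0, 1_000_000, 10_000, 10, 10)
print(f"{new_start}:{new_end}")  # 100:1000100
```

If the total is not an exact multiple of the thread count, the integer division rounds down, so the restart re-checks a few keys rather than skipping any.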
CY4NiDE
December 16, 2023, 08:49:17 AM
 #929

That's actually a very good explanation, really appreciate it man!

1CY4NiDEaNXfhZ3ndgC2M2sPnrkRhAZhmS
3dmlib
Jr. Member
Activity: 44
Merit: 2
December 16, 2023, 10:18:51 PM
 #930

Hello. What is the continue option? What was the bug exactly? Also, can somebody explain to me what the maxFound option is and how it is used in the code? Thanks.
WanderingPhilospher
December 17, 2023, 05:22:22 AM
 #931

The continue option was an option in rotorcuda that would save how many keys had been searched and the grid size, and readjust the range on a restart.
It had flaws, as in sometimes it would not adjust correctly, or the total keys searched line would be blank.

The maxFound option was the max keys the program could find in a single kernel call. I don't remember that being in keyhuntcuda or rotorcuda but more of the vanitysearch/forks of vanitysearch.
3dmlib
December 17, 2023, 07:08:32 AM
Merited by JayJuanGee (1)
 #932

Thanks. I also had some thoughts about optimizing this program.

1. It uses global device memory access even when searching for one key. Why can't the ripemd160 hash of the searched bitcoin address, the public key increment function and the ripemd160(sha256) functions fit in cache?
2. As I understand it, the CPU thread launches the kernel several times over the range. Why not launch it just once for the entire range supplied to the kernel?
3. Could ripemd160(sha256) be done using Tensor cores?
fecell
Jr. Member
Activity: 136
Merit: 2
January 05, 2024, 01:25:46 AM
Last edit: January 05, 2024, 06:14:29 AM by fecell
 #933

If you can do that, congratulations because you just partially broke elliptic curve.

No, I mean I can reduce the generator range to skip non-random values, so the time to brute-force is reduced too.

For example, 23 bit key to test (python 3.11 + ice_secp256k1.dll).
with secret algo:
GOT: KwDiBf89QgGbjEhKnhXJuH7LrciVrZi3qYjgd9M7rVkthFNsQ6i7
10.363348245620728 s

with usual range (2^22 ... 2^23-1)
GOT: KwDiBf89QgGbjEhKnhXJuH7LrciVrZi3qYjgd9M7rVkthFNsQ6i7
16.832353353500366 s

With big values, like 66 bit, a lot of values are just skipped as NOT random binary values, because they can't be randomly generated by the author (by wallet software).
For example, the first value for the 66-bit range is 100000100100100101010011001011000111000111001011000111000111001011; all lower values fail.
The generator gives this as the first value when the random's rules are applied.

Anyway, pure Python is not a good instrument to get results. I want to use numba cuda.jit, but I'm still learning how to.
Baboshka
Newbie
Activity: 7
Merit: 0
January 11, 2024, 08:20:01 PM
 #934

Hi fecell, can you please explain in more detail why values less than "100000100100100101010011001011000111000111001011000111000111001011" will fail? Thanks and regards.
fecell
January 12, 2024, 04:19:16 PM
 #935

can you please explain
NO. excusema
WanderingPhilospher
January 12, 2024, 10:37:04 PM
 #936


He can't lol.

Really, what do you mean?

How is 0x10492A658E39638E5 the first value for the 66 bit range?

Maybe I am misreading your statement(s).
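
For anyone who wants to check the conversion, here is a short Python sketch. It converts the binary string from post #933, prints its bit length and hex form, and prints the same value with the last bit dropped for comparison. The variable names are only for illustration.

```python
bits = "100000100100100101010011001011000111000111001011000111000111001011"

value = int(bits, 2)
print(value.bit_length())   # 66
print(hex(value))           # 0x209254cb1c72c71cb
print(hex(value >> 1))      # 0x10492a658e39638e5

# The 66-bit range runs from 2^65 to 2^66 - 1.
range_start = 2**65
print(hex(range_start))     # 0x20000000000000000
print(value > range_start)  # True
```

So the full 66-character string is a 66-bit number that sits above the bottom of the range, while the hex value with the trailing bit dropped is a 65-bit number.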
Emmanuelex
Jr. Member
Activity: 137
Merit: 2
January 13, 2024, 12:09:05 AM
 #937

Meehn... Seems like this is a game for programmers?🤦🏾‍♂️😁 I'm outta here I guess
3dmlib
January 20, 2024, 10:29:13 PM
Last edit: January 20, 2024, 10:43:07 PM by 3dmlib
 #938

Here are some tips to speed up keyhunt-cuda (rotor-cuda):

Apply this and you will need a smaller grid size; something like 4096x512 will be enough for a 4090:

https://bitcointalk.org/index.php?topic=5244940.msg63526413#msg63526413

Also change this:

__device__ __noinline__ void CheckHashSEARCH_MODE_SA(uint64_t* px, uint64_t* py, int32_t incr, uint32_t* hash160, uint32_t* out)
{
   switch (mode) {
   case SEARCH_COMPRESSED:
      CheckHashCompSEARCH_MODE_SA(px, (uint8_t)(py[0] & 1), incr, hash160, out);
      break;
   case SEARCH_UNCOMPRESSED:
      CheckHashUnCompSEARCH_MODE_SA(px, py, incr, hash160, out);
      break;
   case SEARCH_BOTH:
      CheckHashCompSEARCH_MODE_SA(px, (uint8_t)(py[0] & 1), incr, hash160, out);
      CheckHashUnCompSEARCH_MODE_SA(px, py, incr, hash160, out);
      break;
   }
}

to this, because doing a switch-case in the kernel is a very bad idea:

__device__ __noinline__ void CheckHashSEARCH_MODE_SA(uint64_t* px, uint64_t* py, int32_t incr, uint32_t* hash160, uint32_t* out)
{
   // Only the compressed-key check is kept; the mode switch and the uncompressed check are removed.
   CheckHashCompSEARCH_MODE_SA(px, (uint8_t)(py[0] & 1), incr, hash160, out);
}

Also, maxFound can be removed completely when searching the puzzle, because we only need one result returned anyway.

Rotor-cuda speed with these mods:

  [00:17:10] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.453247 %] [R: 0] [T: 6,412,923,043,840 (43 bit)] [F: 0]
  [00:17:11] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.500549 %] [R: 0] [T: 6,421,244,542,976 (43 bit)] [F: 0]
  [00:17:12] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.547852 %] [R: 0] [T: 6,429,566,042,112 (43 bit)] [F: 0]
  [00:17:13] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.595154 %] [R: 0] [T: 6,437,887,541,248 (43 bit)] [F: 0]
  [00:17:15] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.642456 %] [R: 0] [T: 6,446,209,040,384 (43 bit)] [F: 0]
  [00:17:16] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.689758 %] [R: 0] [T: 6,454,530,539,520 (43 bit)] [F: 0]
  [00:17:17] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.737061 %] [R: 0] [T: 6,462,852,038,656 (43 bit)] [F: 0]
  [00:17:18] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.784363 %] [R: 0] [T: 6,471,173,537,792 (43 bit)] [F: 0]
  [00:17:20] [CPU+GPU: 6.72 Gk/s] [GPU: 6.72 Gk/s] [C: 36.831665 %] [R: 0] [T: 6,479,495,036,928 (43 bit)] [F: 0]
  [00:17:21] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.878967 %] [R: 0] [T: 6,487,816,536,064 (43 bit)] [F: 0]


Thanks.
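
The status lines above carry enough information to sanity-check a run. The sketch below pulls the completion percentage (C) and total key count (T) out of two consecutive lines, computes how many keys were added between them, and estimates the size of the range being searched. The field meanings are my reading of the output, and the function name is made up.

```python
import math
import re

LINE_RE = re.compile(r"\[C: ([\d.]+) %\].*\[T: ([\d,]+) \(")


def parse_status(line: str) -> tuple[float, int]:
    """Return (percent complete, total keys checked) from one status line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("not a status line")
    percent = float(match.group(1))
    total = int(match.group(2).replace(",", ""))
    return percent, total


first = "[00:17:10] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.453247 %] [R: 0] [T: 6,412,923,043,840 (43 bit)] [F: 0]"
second = "[00:17:11] [CPU+GPU: 6.71 Gk/s] [GPU: 6.71 Gk/s] [C: 36.500549 %] [R: 0] [T: 6,421,244,542,976 (43 bit)] [F: 0]"

pct1, total1 = parse_status(first)
pct2, total2 = parse_status(second)

keys_between = total2 - total1
range_bits = math.log2(total1 / (pct1 / 100))

print(keys_between)          # 8321499136
print(round(range_bits, 3))  # 44.0
```

Read this way, the run covers a range of about 2^44 keys, and each status line adds the same fixed batch of keys.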
WanderingPhilospher
January 20, 2024, 10:57:36 PM
 #939

What was the speed before and which version of Rotor-cuda are you using?
One version checks symmetry/endomorphisms, etc.; the other does not.
The one that checks endos is not good for the puzzle, and its speed is misleading.
3dmlib
January 21, 2024, 09:28:53 AM
 #940

Speed before my mods was about 6.38 Gk/s.
I think I used this one:
https://github.com/Vladimir855/Rotor-Cuda

Is there a better version available?

Thanks.
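
To put the two figures side by side, here is a quick calculation of the relative gain and of how long one pass over a range would take at each rate. The 2^44 range size is just an assumed example, and the function names are only for illustration.

```python
def gain_percent(before_gks: float, after_gks: float) -> float:
    """Relative speed-up in percent, given two rates in Gk/s."""
    return (after_gks - before_gks) / before_gks * 100


def sweep_minutes(range_keys: int, rate_gks: float) -> float:
    """Minutes needed to check `range_keys` keys at a steady rate in Gk/s."""
    return range_keys / (rate_gks * 1e9) / 60


before, after = 6.38, 6.71
print(f"gain: {gain_percent(before, after):.1f} %")
print(f"2^44 keys at {before} Gk/s: {sweep_minutes(2**44, before):.1f} min")
print(f"2^44 keys at {after} Gk/s: {sweep_minutes(2**44, after):.1f} min")
```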