Author Topic: BSGS solver for cuda  (Read 3460 times)
Etar (OP)
Sr. Member
****
Offline Offline

Activity: 616
Merit: 312


View Profile
October 15, 2021, 10:43:34 AM
 #41


Ahan :( I am not good at CUDA or at programming, but if I use -i in Kangaroo, it returns the correct memory parameters.

Is it possible to borrow some code from the Kangaroo side? Or is there any way to hardcode the memory?
Usually people use the CUDA runtime API; it is a different library, incompatible with the CUDA driver API.
I tried to solve the 32-bit limitation a few years ago, as soon as the first cards with more than 4 GB of memory appeared.
But unfortunately this limit could not be overcome.
And do you need to utilize all the memory?
On my 2080 Ti, already at -w 27 the hashrate drops from 570 MKeys/s to 81, while on a 3070 everything is fine.
So first you need to check how your hashrate decreases as you increase the -w parameter.
Here is the output with cuDeviceTotalMem_v2:
Code:
APP VERSION: 1.2.1
Found 1 Cuda device.
Cuda device:GeForce RTX 2080 Ti(11264Mb)
Device have: MP:68 Cores+4352
Shared memory total:49152
Constant memory total:65536
It returns correct 64-bit values, but that is only informational; it did not help to remove all the limitations in the CUDA commands.
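
To see why a 32-bit byte count is the problem on cards like this, here is a minimal sketch. It is plain arithmetic only, with no CUDA calls, and the helper names are made up for illustration. Whether a given API wraps, clamps or fails at the limit is not shown here.

```python
MIB = 2**20
UINT32_MAX = 2**32 - 1

def max_mib_in_uint32():
    """Largest whole number of MiB that an unsigned 32-bit byte count can hold."""
    return UINT32_MAX // MIB

def fits_in_uint32(size_mib):
    """True if a size of size_mib MiB can be expressed as a 32-bit byte count."""
    return size_mib * MIB <= UINT32_MAX

print(max_mib_in_uint32())    # 4095
print(fits_in_uint32(11264))  # False: the 2080 Ti total from the output above
```

So any size that still passes through a 32-bit field tops out at 4095 MB, no matter what the 64-bit query reports.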
studyroom1
Jr. Member
*
Offline Offline

Activity: 40
Merit: 7


View Profile
October 15, 2021, 10:54:05 AM
 #42


I have some questions for my understanding:

Does memory allocation on the GPU make a difference in speed?
How do I find the optimal -t, -p and -b values for my card (3080)?
What is the role of -w and -htsz?
And what is the item size?
Can I use more of the computer's RAM to get a speed boost, since I have 128 GB of memory? If yes, how?
studyroom1
October 15, 2021, 11:06:52 AM
Merited by NotATether (1)
 #43

Please take a look at these 2 URLs; they fixed this issue.

https://github.com/BOINC/boinc/issues/1773
https://github.com/BOINC/boinc/pull/2707
Perhaps you will get some clue.
Etar (OP)
October 15, 2021, 11:06:56 AM
Merited by NotATether (2), studyroom1 (1)
 #44


-t: use 512 for your 3080.
-b: use 68; it should be a multiple of your card's SM count (the 3080 has 68 SMs).
-p: use 256; this value is how many x-points each thread computes in the kernel.
-w: the number of baby steps. -w 26 means an array of size 2^26 is created; the larger this array, the bigger the giant step. But you should check your hashrate when you increase -w: it shouldn't drop by more than 1.5 times. For example, if your hashrate with -w 26 is 1500 MKeys/s and with -w 27 it is still more than 1000 MKeys/s, then it makes sense to increase -w.

-htsz: use the default, 25; it is the size of the hash table. You can change -htsz only if your baby-step array (-w) is smaller than the hash table size.
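
The -w rule of thumb above can be checked with a few lines of Python. This is only a sketch: it assumes the range covered by one giant step doubles each time -w goes up by one, and the function names are hypothetical, not part of the app.

```python
def effective_rate(mkeys_per_s, w):
    """Relative search speed, ASSUMING one giant step covers about 2**w keys."""
    return mkeys_per_s * 2**w

def worth_increasing_w(rate_now, rate_next, max_drop=1.5):
    """Rule of thumb from the post: accept -w + 1 if the hashrate drops
    by no more than max_drop times."""
    return rate_now / rate_next <= max_drop

print(worth_increasing_w(1500, 1100))  # True: 1500 -> 1100 is a drop of about 1.36x
print(worth_increasing_w(570, 81))     # False: the 2080 Ti drop is about 7x
```

Under that assumption a drop of less than 2x is already a net gain, so the 1.5x limit just leaves some margin.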
studyroom1
October 15, 2021, 11:09:36 AM
 #45


Awesome, big thanks!
Etar (OP)
October 15, 2021, 01:51:19 PM
 #46

Seems like I fixed the app. :D I replaced most commands with the unofficial _v2 versions.
Code:
GPU #0 launched
GPU #0 TotalBuff: 5168.000Mb
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
GPU#0 Cnt:00000000000000000000000000000000000000000000000015ea000000000001 697MKey/s x536870912 2^29.45 x2^30=2^59.45
GPU#0 Cnt:0000000000000000000000000000000000000000000000002bd4000000000001 697MKey/s x536870912 2^29.45 x2^30=2^59.45
GPU#0 Cnt:00000000000000000000000000000000000000000000000041be000000000001 697MKey/s x536870912 2^29.45 x2^30=2^59.45
GPU#0 Cnt:00000000000000000000000000000000000000000000000057a8000000000001 697MKey/s x536870912 2^29.45 x2^30=2^59.45
GPU#0 Cnt:0000000000000000000000000000000000000000000000006d92000000000001 697MKey/s x536870912 2^29.45 x2^30=2^59.45
GPU#0 Cnt:000000000000000000000000000000000000000000000000835a000000000001 696MKey/s x536870912 2^29.44 x2^30=2^59.44
GPU#0 Cnt:0000000000000000000000000000000000000000000000009922000000000001 696MKey/s x536870912 2^29.44 x2^30=2^59.44
***********GPU#0************
Total solutions: 1
KEY!!>000000000000000000000000000000000000000000000001a838b13505b26867
Pub: 30210c23b1a047bc9bdbb13448e67deddc108946de6de639bcc75d47c0216b1be383c4a8ed4fac77c0d2ad737d8499a362f483f8fe39d1e86aaed578a9455dfc
****************************
Found in 17 seconds
The result above is with -w 29.
The speed has also increased a little.
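
For anyone reading the status lines: the numbers can be reproduced with a short Python sketch. It assumes that 1 MKey means 2^20 keys (that is the value that matches the printed 2^29.45) and takes the x2^30 factor straight from the log; the function name is made up.

```python
import math

def log2_rates(mkeys, giant_step_log2):
    """Return (log2 of raw keys/s, log2 of effective keys/s), rounded like the log.

    ASSUMPTION: 1 MKey = 2**20 keys, inferred from the printed values.
    """
    raw = math.log2(mkeys * 2**20)
    return round(raw, 2), round(raw + giant_step_log2, 2)

print(log2_rates(697, 30))       # (29.45, 59.45)
print(log2_rates(696, 30))       # (29.44, 59.44)
print(math.log2(536870912))      # 29.0, i.e. the x536870912 factor is 2^29 (-w 29)
```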
studyroom1
October 15, 2021, 06:33:39 PM
Last edit: October 15, 2021, 06:46:14 PM by studyroom1
 #47

Awesome bro, thanks for sharing all this. Will test it.
studyroom1
October 15, 2021, 08:51:50 PM
 #48

I am sorry bro, it's still the same: it still shows 4095 for memory and cannot utilize more than 3200+.

studyroom1
October 16, 2021, 04:35:50 AM
 #49

Do I have to remove the 04 from the beginning of the uncompressed key, or can the software recognize it with the 04 as well?

It also seems there is one more issue.

If I use a range like this
49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5eba34000000000000
it runs fine within the range, but when I use a 120-bit range it calculates the range fine but runs far below it and shows false collisions.

It started like this:
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
.....
GPU#0 Cnt:0000000000000000000000000000000000000000000000160cc3800000000001

but I set the range 0x800000000000000000000000000000 to 0xffffffffffffffffffffffffffffff

Is something wrong with the software?

WanderingPhilospher
Full Member
***
Offline Offline

Activity: 1064
Merit: 219

Shooters Shoot...


View Profile
October 16, 2021, 04:47:08 AM
 #50

You can run with 04 in front of the uncompressed key's x,y points; you just cannot use a compressed key in any format.

The Cnt values are the giant steps. The program offsets the pubkey on startup (it subtracts the start of the range), and then, after all the baby steps and sorting, the GPU starts the giant steps. Nothing is wrong with the program. If you run a smaller range, you will see the same thing, and you will see it solve for the inputted key. False collisions are normal because only 8 bytes are stored in the hash table.
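
A toy version of the same idea in Python may make this clearer. It is only a sketch and not the app's code: it uses modular exponentiation in place of secp256k1 points, and the function name and the key_bits parameter are hypothetical. It shows the offset by the range start, a counter that begins at zero whatever the range is, and a truncated table key producing false collisions that a full check filters out.

```python
def bsgs_range(g, h, p, start, width, m, key_bits=8):
    """Find k in [start, start + width) with pow(g, k, p) == h.

    m        -- number of baby steps
    key_bits -- low bits of each baby value kept as the table key
    Returns (k or None, number of false collisions seen).
    """
    mask = (1 << key_bits) - 1
    # Baby steps: keep only a truncated fingerprint of g**j.
    table = {}
    e = 1
    for j in range(m):
        table.setdefault(e & mask, []).append(j)
        e = e * g % p

    # Offset the target by the range start, so the search itself runs from 0.
    target = h * pow(g, -start, p) % p
    giant = pow(g, -m, p)  # one giant step jumps back by m baby steps
    false_hits = 0
    for i in range((width + m - 1) // m):  # i is what a counter would print
        for j in table.get(target & mask, []):
            k = start + i * m + j
            if pow(g, k, p) == h:  # full check filters out false collisions
                return k, false_hits
            false_hits += 1
        target = target * giant % p
    return None, false_hits

p, g = 2**61 - 1, 3
start, width, m = 10**9, 2**16, 2**10
secret = start + 54321
found, false_hits = bsgs_range(g, pow(g, secret, p), p, start, width, m)
print(found, false_hits)
```

With only 8 bits kept per entry almost every giant step hits a few wrong candidates; keeping more bits makes them rarer but does not remove them.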
studyroom1
October 16, 2021, 05:37:26 AM
 #51

Awesome man, got the point. I was worried that something was wrong with my setup.
Etar (OP)
October 16, 2021, 05:46:45 AM
 #52

I am sorry bro, it's still the same: it still shows 4095 for memory and cannot utilize more than 3200+.
Make sure you run v1.3.1. You can check the version at the start of the output.
sky59sky59
Jr. Member
*
Offline Offline

Activity: 38
Merit: 34


View Profile
October 16, 2021, 05:53:09 AM
 #53

Does it search the whole 256-bit space? Or is it limited to only a few LSBs?
studyroom1
October 16, 2021, 06:10:46 AM
 #54

Yes bro, this is v1.3.1, but it is still the same incorrect detection :(
studyroom1
October 16, 2021, 06:18:49 AM
 #55

I fixed it: I had not deleted the old files that were computed earlier by the old program. Once I deleted all the old .bin etc. files, everything was fine. Thanks man, will start testing now.
Etar (OP)
October 16, 2021, 06:19:59 AM
Last edit: October 16, 2021, 06:32:33 AM by Etar
 #56

Very strange, because you talked above about false collisions, but I removed the printing of false collisions in cmd in version 1.3.1 (they certainly still happen, but they are no longer visible).
By the way, BSGS is fast only in small ranges like 2^64 and less. If you try to use BSGS for puzzle #80, for example, you will be searching for the pubkey much longer than with JLP Kangaroo.
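
A rough way to see this is to compare operation counts. The sketch below is back-of-the-envelope only: it takes the 2^30 giant-step size from the -w 29 log earlier in the thread, uses the textbook figure of about 2*sqrt(N) expected operations for the kangaroo method, treats puzzle #80 as a range of width 2^79, and ignores all constant factors and per-operation speed differences. The function names are made up.

```python
def bsgs_giant_steps_log2(range_bits, giant_step_bits):
    """log2 of the worst-case number of giant steps over a 2**range_bits range."""
    return range_bits - giant_step_bits

def kangaroo_ops_log2(range_bits):
    """log2 of the expected group operations, taken as about 2 * sqrt(range)."""
    return range_bits / 2 + 1

for bits in (64, 79):
    print(bits, bsgs_giant_steps_log2(bits, 30), kangaroo_ops_log2(bits))
# 64 34 33.0
# 79 49 40.5
```

At 64 bits the two counts are close, while at 79 bits BSGS with this table size needs on the order of 2^8.5 times more steps.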
studyroom1
October 16, 2021, 06:34:44 AM
 #57

I am a beginner, so I must be doing something wrong. But what program would you recommend for ranges above 80 bits?
davidjjones
Newbie
*
Offline Offline

Activity: 25
Merit: 14


View Profile
October 16, 2021, 06:42:53 AM
 #58

Is there any script to uncompress multiple pubkeys from a file?
Something like this script (BTC addresses > HASH160):
https://github.com/sezginyildirim91/btc-address-to-hash160
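
For reference, a minimal Python sketch of such a script. It is not taken from the linked repository; the function names and the one-key-per-line file layout are assumptions. It relies only on the secp256k1 curve equation y^2 = x^3 + 7 and on the field prime being 3 mod 4.

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime

def uncompress(pub_hex):
    """Turn a compressed pubkey (66 hex chars, 02/03 prefix) into 04 + x + y."""
    pub_hex = pub_hex.strip().lower()
    if len(pub_hex) != 66 or pub_hex[:2] not in ("02", "03"):
        raise ValueError("not a compressed pubkey: " + pub_hex)
    x = int(pub_hex[2:], 16)
    y_sq = (pow(x, 3, P) + 7) % P
    y = pow(y_sq, (P + 1) // 4, P)  # square root, valid because P % 4 == 3
    if y * y % P != y_sq:
        raise ValueError("x is not on the curve")
    if (y & 1) != (pub_hex[:2] == "03"):  # choose the root with matching parity
        y = P - y
    return "04{:064x}{:064x}".format(x, y)

def uncompress_file(src, dst):
    """Read one compressed key per line from src, write uncompressed keys to dst."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            if line.strip():
                fout.write(uncompress(line) + "\n")

print(uncompress("0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798"))
```

The example key is the curve's generator point, so the printed result is easy to check against any reference.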
Etar (OP)
October 16, 2021, 06:43:34 AM
 #59

--snip--
I am a beginner, so I must be doing something wrong. But what program would you recommend for ranges above 80 bits?
JLP Kangaroo. https://github.com/JeanLucPons/Kangaroo
studyroom1
October 16, 2021, 06:52:44 AM
 #60

Total: 4294967296 bytes
Save BIN file:79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798_536870912_b.BIN
  • chunk:2147483648b
Error when saving chunk: save:2147483648b, got:-2147483648b
Press Enter to exit


Huh, a new error.
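
The "save:2147483648b, got:-2147483648b" pair looks like 2^31 wrapping around in a signed 32-bit integer. That is a guess from the numbers alone, not a diagnosis of the app. A small Python sketch of the effect, with a hypothetical helper that keeps every chunk below the limit:

```python
import ctypes

INT32_MAX = 2**31 - 1

def as_int32(n):
    """Value of n after being squeezed into a signed 32-bit integer."""
    return ctypes.c_int32(n).value

def chunk_sizes(total, max_chunk=2**30):
    """Split total bytes into chunk lengths that each fit a signed 32-bit count."""
    sizes = []
    while total > 0:
        step = min(total, max_chunk)
        sizes.append(step)
        total -= step
    return sizes

print(as_int32(2147483648))     # -2147483648
print(chunk_sizes(4294967296))  # [1073741824, 1073741824, 1073741824, 1073741824]
```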