Bitcoin Forum
May 05, 2024, 03:43:58 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 [33] 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 ... 96 »
  Print  
Author Topic: BitCrack - A tool for brute-forcing private keys  (Read 74435 times)
NotATether
Legendary
*
Offline Offline

Activity: 1596
Merit: 6728


bitcoincleanup.com / bitmixlist.org


View Profile WWW
January 20, 2021, 06:04:31 PM
 #641

Edit2:
So, to view the memory you gotta enable device debugging, but when enabled, it 'works' (slow af ofc). Great.

Yeah, that's the warning I was talking about  Sad . I'm glad you were able to at least narrow it down to Turing-to-Ampere arch changes though.

Perhaps there is a way in VS to print the contents of a raw pointer address that each variable is at? That should surely make an misaligned access error if we try to do that if it's indeed the variable that's not aligned. Alternatively we can just view the addresses in hexadecimal and see which ones are not divisible by 4,8, etc. I know gdb can do both of these things but I forgot the commands for them.

Docs say I should compile against c86,sm86, so will test what that does using legacy mode. Whatever that mode even does.

To my understanding, the difference between compute_* and sm_* is that the compute_ targets are for compiling the CUDA code into PTX, which is some kind of assembly code for GPUs. Basically there are different versions of assembly code specifications, and they're forward compatible with newer PTX versions (the so-called compute caps). In fact that's the reason brichard19 was able to leave the makefile at compute_35 all this time without bitcrack going to hell for everyone running a newer GPU family.

While sm_* is the specification of the binary, what we'd call in C/C++ land the "linking phase", and it absolutely has to match the compute cap for the specific GPU you're targeting in order to run on it. The GPU binaries are not forward-compatible, which means you can't run a CUDA binary corresponding on one GPU family on any other family (unless it's compute cap has the same major version).

For example, if you compile a CUDA program for Kepler's PTX architecture, which is what bitcrack was doing all this time, it's gonna work on Maxwell,Pascal,Volta,...etc. All families after it too.

But the binary itself won't work unless your GPU family has the same major version as inside the sm_ the program was compiled against (e.g Volta 7.0 and Turing 7.5, but not Ampere 8.6 or Pascal 6.1.

If you wanted to distribute a CUDA binary that works on all of those families, you gotta pass a low compute_ version and hen the sm_ versions for every family you want it to run on, e.g.  

Code:
-gencode=arch=compute_35,code=sm_35,sm_50,sm_52,sm_61,sm_75,sm_86

to make it run on every desktop GPU starting with Kepler (This excludes compute caps for embedded Tegra GPUs in game consoles, a few datacenter Tesla GPUs, and the Titan V Wink). Stuffing all those binaries inside a single program also turns out to make it very huge, perhaps the reason why video game installs are several GBs large.

At any rate, the blasted thing is supposed to work without passing any compute cap flags at all so maybe we're focusing on the wrong thing Smiley I'll study the docs some more because attempting to debug without understanding CUDA won't get us anywhere.

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
1714923838
Hero Member
*
Offline Offline

Posts: 1714923838

View Profile Personal Message (Offline)

Ignore
1714923838
Reply with quote  #2

1714923838
Report to moderator
1714923838
Hero Member
*
Offline Offline

Posts: 1714923838

View Profile Personal Message (Offline)

Ignore
1714923838
Reply with quote  #2

1714923838
Report to moderator
The network tries to produce one block per 10 minutes. It does this by automatically adjusting how difficult it is to produce blocks.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1714923838
Hero Member
*
Offline Offline

Posts: 1714923838

View Profile Personal Message (Offline)

Ignore
1714923838
Reply with quote  #2

1714923838
Report to moderator
1714923838
Hero Member
*
Offline Offline

Posts: 1714923838

View Profile Personal Message (Offline)

Ignore
1714923838
Reply with quote  #2

1714923838
Report to moderator
1714923838
Hero Member
*
Offline Offline

Posts: 1714923838

View Profile Personal Message (Offline)

Ignore
1714923838
Reply with quote  #2

1714923838
Report to moderator
renedx
Jr. Member
*
Offline Offline

Activity: 36
Merit: 3


View Profile
January 20, 2021, 06:59:54 PM
Last edit: January 21, 2021, 01:32:59 AM by renedx
 #642



Yeah, that's the warning I was talking about  Sad . I'm glad you were able to at least narrow it down to Turing-to-Ampere arch changes though.

Perhaps there is a way in VS to print the contents of a raw pointer address that each variable is at? That should surely make an misaligned access error if we try to do that if it's indeed the variable that's not aligned. Alternatively we can just view the addresses in hexadecimal and see which ones are not divisible by 4,8, etc. I know gdb can do both of these things but I forgot the commands for them.

Docs say I should compile against c86,sm86, so will test what that does using legacy mode. Whatever that mode even does.

To my understanding, the difference between compute_* and sm_* is that the compute_ targets are for compiling the CUDA code into PTX, which is some kind of assembly code for GPUs. Basically there are different versions of assembly code specifications, and they're forward compatible with newer PTX versions (the so-called compute caps). In fact that's the reason brichard19 was able to leave the makefile at compute_35 all this time without bitcrack going to hell for everyone running a newer GPU family.

While sm_* is the specification of the binary, what we'd call in C/C++ land the "linking phase", and it absolutely has to match the compute cap for the specific GPU you're targeting in order to run on it. The GPU binaries are not forward-compatible, which means you can't run a CUDA binary corresponding on one GPU family on any other family (unless it's compute cap has the same major version).

For example, if you compile a CUDA program for Kepler's PTX architecture, which is what bitcrack was doing all this time, it's gonna work on Maxwell,Pascal,Volta,...etc. All families after it too.

But the binary itself won't work unless your GPU family has the same major version as inside the sm_ the program was compiled against (e.g Volta 7.0 and Turing 7.5, but not Ampere 8.6 or Pascal 6.1.

If you wanted to distribute a CUDA binary that works on all of those families, you gotta pass a low compute_ version and hen the sm_ versions for every family you want it to run on, e.g.  

Code:
-gencode=arch=compute_35,code=sm_35,sm_50,sm_52,sm_61,sm_75,sm_86

to make it run on every desktop GPU starting with Kepler (This excludes compute caps for embedded Tegra GPUs in game consoles, a few datacenter Tesla GPUs, and the Titan V Wink). Stuffing all those binaries inside a single program also turns out to make it very huge, perhaps the reason why video game installs are several GBs large.

At any rate, the blasted thing is supposed to work without passing any compute cap flags at all so maybe we're focusing on the wrong thing Smiley I'll study the docs some more because attempting to debug without understanding CUDA won't get us anywhere.

You're totally right, was just reading about these. Thanks for adding, you're amazing in explaining stuff btw.
The reason for trying this was because of the legacy build thing. I noticed the docs stating the following:

Code:
1.4.3. Independent Thread Scheduling Compatibility
NVIDIA GPUs since Volta architecture have Independent Thread Scheduling among threads in a warp. If the developer made assumptions about warp-synchronicity2, this feature can alter the set of threads participating in the executed code compared to previous architectures. Please see Compute Capability 7.0 in the Programming Guide for details and corrective actions. To aid migration to the NVIDIA Ampere GPU architecture, developers can opt-in to the Pascal scheduling model with the following combination of compiler options.

nvcc -gencode=arch=compute_60,code=sm_80 ...

And while debugging, I noticed my breakpoints changes using different specific versions.

My understanding of CUDA is at minimum, but I'm fascinated about it too much. It's a shame the person with CUDA knowledge didn't share any insights yet.

Edit:
Attempts to log, gather more info or whatever, will slow down the program just enough that it *works*.  Roll Eyes
Prob. the reason it works on older hardware. Its really the speed killing it.
NotATether
Legendary
*
Offline Offline

Activity: 1596
Merit: 6728


bitcoincleanup.com / bitmixlist.org


View Profile WWW
January 21, 2021, 11:29:50 AM
 #643

Edit:
Attempts to log, gather more info or whatever, will slow down the program just enough that it *works*.  Roll Eyes
Prob. the reason it works on older hardware. Its really the speed killing it.

Maybe the SMs in the GPU have so many threads per block running that they don't have enough time to align all the pointers for Bitcrack. The GPU does a lot of its own things at runtime. I never saw bitcrack do any alignment in the cracking loop, but neither did I check the parts responsible for initialization.

Could the initialization stage be launching kernels with invalid block sizes in a grid? Maybe there's a way to make it launch a 1x1 sized grid so we can see what happens.

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
WanderingPhilospher
Full Member
***
Offline Offline

Activity: 1064
Merit: 219

Shooters Shoot...


View Profile
January 21, 2021, 03:25:44 PM
 #644

Edit:
Attempts to log, gather more info or whatever, will slow down the program just enough that it *works*.  Roll Eyes
Prob. the reason it works on older hardware. Its really the speed killing it.

Maybe the SMs in the GPU have so many threads per block running that they don't have enough time to align all the pointers for Bitcrack. The GPU does a lot of its own things at runtime. I never saw bitcrack do any alignment in the cracking loop, but neither did I check the parts responsible for initialization.

Could the initialization stage be launching kernels with invalid block sizes in a grid? Maybe there's a way to make it launch a 1x1 sized grid so we can see what happens.
Was going back to look at the issues you all are trying to fix.  This may or may not help you all. If you download the latest cuBitCrack from github, it will run with the GTX 10 series just fine, with any driver however, the only way I could get it to work on RTX 20xx cards was to roll back the driver to 452.

Not sure if you can look at what changed between the 452 driver and newer drivers that caused it to stop working on 20xx cards. unless it's a mere cuda runtime thing.

Also, I have noticed in another program that when I try and generate random keys with the 20xx or 30xx, that is when I get the misaligned or memory error, but only when I have a large input list (of addresses). Got me thinking with BitCrack, when you have a large list or the program starts generating all those starting points, that's where error is coming into play. I wonder if you choked down the starting points to say 32, 32, 32, if same error would exist.

When I use straight key to key search, no generation of random points, I have no issues with any series of cards on the other program.

Also, the new Ampere has 4 SMs, 4 x 32. But from what I have seen, it seems like random points, creating many points does something to throw the misaligned address/memory errors. Crazy stuff to say the least.
renedx
Jr. Member
*
Offline Offline

Activity: 36
Merit: 3


View Profile
January 21, 2021, 05:32:07 PM
 #645

Was going back to look at the issues you all are trying to fix.  This may or may not help you all. If you download the latest cuBitCrack from github, it will run with the GTX 10 series just fine, with any driver however, the only way I could get it to work on RTX 20xx cards was to roll back the driver to 452.

Not sure if you can look at what changed between the 452 driver and newer drivers that caused it to stop working on 20xx cards. unless it's a mere cuda runtime thing.

Also, I have noticed in another program that when I try and generate random keys with the 20xx or 30xx, that is when I get the misaligned or memory error, but only when I have a large input list (of addresses). Got me thinking with BitCrack, when you have a large list or the program starts generating all those starting points, that's where error is coming into play. I wonder if you choked down the starting points to say 32, 32, 32, if same error would exist.

When I use straight key to key search, no generation of random points, I have no issues with any series of cards on the other program.

Also, the new Ampere has 4 SMs, 4 x 32. But from what I have seen, it seems like random points, creating many points does something to throw the misaligned address/memory errors. Crazy stuff to say the least.

I'll try the 1x1 grid later suggested above. Imho we need someone with more CUDA knowledge to crack this one.
Code breaks differ from run to run (last one was readIntLSW (https://github.com/brichard19/BitCrack/blob/master/cudaMath/secp256k1.cuh#L110))..
WanderingPhilospher
Full Member
***
Offline Offline

Activity: 1064
Merit: 219

Shooters Shoot...


View Profile
January 21, 2021, 06:00:27 PM
 #646

Was going back to look at the issues you all are trying to fix.  This may or may not help you all. If you download the latest cuBitCrack from github, it will run with the GTX 10 series just fine, with any driver however, the only way I could get it to work on RTX 20xx cards was to roll back the driver to 452.

Not sure if you can look at what changed between the 452 driver and newer drivers that caused it to stop working on 20xx cards. unless it's a mere cuda runtime thing.

Also, I have noticed in another program that when I try and generate random keys with the 20xx or 30xx, that is when I get the misaligned or memory error, but only when I have a large input list (of addresses). Got me thinking with BitCrack, when you have a large list or the program starts generating all those starting points, that's where error is coming into play. I wonder if you choked down the starting points to say 32, 32, 32, if same error would exist.

When I use straight key to key search, no generation of random points, I have no issues with any series of cards on the other program.

Also, the new Ampere has 4 SMs, 4 x 32. But from what I have seen, it seems like random points, creating many points does something to throw the misaligned address/memory errors. Crazy stuff to say the least.

I'll try the 1x1 grid later suggested above. Imho we need someone with more CUDA knowledge to crack this one.
Code breaks differ from run to run (last one was readIntLSW (https://github.com/brichard19/BitCrack/blob/master/cudaMath/secp256k1.cuh#L110))..
Jean Luc is the CUDA master...I'll see if he has messed with the 30xx cards yet
WhyMe
Sr. Member
****
Offline Offline

Activity: 661
Merit: 250


View Profile
January 22, 2021, 02:38:04 PM
 #647

Last summer I was running bitcrack on some 2060 without any problem. I don't understand all the lasts messages.

EDIT
oh, it's about driver version, ok
renedx
Jr. Member
*
Offline Offline

Activity: 36
Merit: 3


View Profile
January 22, 2021, 03:50:32 PM
 #648

Last summer I was running bitcrack on some 2060 without any problem. I don't understand all the lasts messages.

EDIT
oh, it's about driver version, ok

Yeah, the Ampere cards do not run on these drivers. Sad story.
Kangoo*
Newbie
*
Offline Offline

Activity: 2
Merit: 0


View Profile
January 24, 2021, 12:30:12 AM
Last edit: January 24, 2021, 11:19:13 AM by Kangoo*
 #649

Last summer I was running bitcrack on some 2060 without any problem. I don't understand all the lasts messages.

EDIT
oh, it's about driver version, ok

Yeah, the Ampere cards do not run on these drivers. Sad story.


Not right. It's bitcrack's kernel code that need to be update. Here the solution (GPUMath), if some developer can apply it in the bitcrack's kernel code

https://github.com/JeanLucPons/VanitySearch/commit/d3c1debb12233722f6ccc09ed3317769161a4773



elvis13
Newbie
*
Offline Offline

Activity: 26
Merit: 2


View Profile
January 24, 2021, 04:29:31 PM
 #650

Friends, tell me the optimal parameters for the RTX 2060, b, t, r. Huh
Lolo54
Member
**
Offline Offline

Activity: 117
Merit: 32


View Profile
January 24, 2021, 05:16:37 PM
 #651

Friends, tell me the optimal parameters for the RTX 2060, b, t, r. Huh
-b 36 -t 512 -p 2800
-b 36 -t 258 -p 2500
NotATether
Legendary
*
Offline Offline

Activity: 1596
Merit: 6728


bitcoincleanup.com / bitmixlist.org


View Profile WWW
January 27, 2021, 07:43:02 PM
 #652

Last summer I was running bitcrack on some 2060 without any problem. I don't understand all the lasts messages.

EDIT
oh, it's about driver version, ok

Yeah, the Ampere cards do not run on these drivers. Sad story.


Not right. It's bitcrack's kernel code that need to be update. Here the solution (GPUMath), if some developer can apply it in the bitcrack's kernel code

https://github.com/JeanLucPons/VanitySearch/commit/d3c1debb12233722f6ccc09ed3317769161a4773

I just see a bunch of __device__ functions in GPUMath.h, the other files are unrelated to this problem since they are C source code.

In that commit we have:

__device__ __forceinline__ uint32_t ctz(uint64_t x) {

__device__ void _DivStep62(uint64_t u[5],uint64_t v[5],
   int32_t* pos

__device__ void MatrixVecMulHalf(uint64_t dest[5],uint64_t u[5],uint64_t v[5],int64_t _11,int64_t _12,uint64_t* carry) {

__device__ void MatrixVecMul(uint64_t u[5],uint64_t v[5],int64_t _11,int64_t _12,int64_t _21,int64_t _22) {

__device__ uint64_t AddCh(uint64_t r[5],uint64_t a[5],uint64_t carry) {

The only thing they have in common with Bitcrack code is that these are also using array parameters and are somehow succeeding.



Here's what I did find in bitcrack code.

In cudaDeviceKeys.cu, elliptic curve-related code about x and y points:

Code:
__constant__ unsigned int *_xPtr[1];

__constant__ unsigned int *_yPtr[1];


__device__ unsigned int *ec::getXPtr()
{
return _xPtr[0];
}

__device__ unsigned int *ec::getYPtr()
{
return _yPtr[0];
}

Why are these a single-element array of a pointer in the first place? It looks redundant and the size of 1 is just making more opportunities to get pointer handling wrong.

It's relevant because after xPtr is copied to device memory in x, then this code is called that uses the arrays x and chain (and also inverse but that is initialized to fixed values so we know that's already aligned).

Code:
beginBatchAdd(_INC_X, x, chain, i, i, inverse);

chain is also initialized to _CHAIN which is also a 1-array of a pointer.

I read some things about __constant__ variables in constant memory and learned that it's read-only and can only be initialized during declaration but I could be wrong and may have to check again.



But I think this part could be the real problem:

Code:
// inside for loop of doIteration()
        unsigned int x[8];

        unsigned int digest[5];
        // ... more array decelerations inside loop, such as for y

I think because it's forcing CUDA to discard and allocate memory repeatedly, it might be putting this array on some unaligned address or something.

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
dimsontt
Newbie
*
Offline Offline

Activity: 1
Merit: 0


View Profile
January 29, 2021, 08:12:37 PM
 #653

Can someone help please?
is there a way to stop/resume in cmd terminal?
thanks in advance!
studyroom1
Jr. Member
*
Offline Offline

Activity: 40
Merit: 7


View Profile
January 30, 2021, 05:01:50 AM
 #654

why the hell so low speed with max  -b 70 -t 512 -p 2078

[2021-01-30.08:55:30] [Info] Compression: compressed
[2021-01-30.08:55:30] [Info] Starting at: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Ending at:   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364140
[2021-01-30.08:55:30] [Info] Counting by: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Compiling OpenCL kernels...
[2021-01-30.08:55:30] [Info] Initializing GeForce RTX 3080
[2021-01-30.08:55:36] [Info] Generating 74,475,520 starting points (2841.0MB)
[2021-01-30.08:55:42] [Info] 10.0%
[2021-01-30.08:55:43] [Info] 20.0%
[2021-01-30.08:55:43] [Info] 30.0%
[2021-01-30.08:55:43] [Info] 40.0%
[2021-01-30.08:55:43] [Info] 50.0%
[2021-01-30.08:55:44] [Info] 60.0%
[2021-01-30.08:55:44] [Info] 70.0%
[2021-01-30.08:55:44] [Info] 80.0%
[2021-01-30.08:55:44] [Info] 90.0%
[2021-01-30.08:55:45] [Info] 100.0%
[2021-01-30.08:55:45] [Info] Done
[00:00:00] 4545/10240MB | 1 target 780.92 MKey/s
WanderingPhilospher
Full Member
***
Offline Offline

Activity: 1064
Merit: 219

Shooters Shoot...


View Profile
January 30, 2021, 05:04:00 AM
 #655

why the hell so low speed with max  -b 70 -t 512 -p 2078

[2021-01-30.08:55:30] [Info] Compression: compressed
[2021-01-30.08:55:30] [Info] Starting at: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Ending at:   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364140
[2021-01-30.08:55:30] [Info] Counting by: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Compiling OpenCL kernels...
[2021-01-30.08:55:30] [Info] Initializing GeForce RTX 3080
[2021-01-30.08:55:36] [Info] Generating 74,475,520 starting points (2841.0MB)
[2021-01-30.08:55:42] [Info] 10.0%
[2021-01-30.08:55:43] [Info] 20.0%
[2021-01-30.08:55:43] [Info] 30.0%
[2021-01-30.08:55:43] [Info] 40.0%
[2021-01-30.08:55:43] [Info] 50.0%
[2021-01-30.08:55:44] [Info] 60.0%
[2021-01-30.08:55:44] [Info] 70.0%
[2021-01-30.08:55:44] [Info] 80.0%
[2021-01-30.08:55:44] [Info] 90.0%
[2021-01-30.08:55:45] [Info] 100.0%
[2021-01-30.08:55:45] [Info] Done
[00:00:00] 4545/10240MB | 1 target 780.92 MKey/s

OpenCL versus the Cuda version. clBitCrack versus cuBitCrack; have you tried cuBitCrack?
studyroom1
Jr. Member
*
Offline Offline

Activity: 40
Merit: 7


View Profile
January 30, 2021, 05:15:08 AM
 #656

why the hell so low speed with max  -b 70 -t 512 -p 2078

[2021-01-30.08:55:30] [Info] Compression: compressed
[2021-01-30.08:55:30] [Info] Starting at: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Ending at:   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364140
[2021-01-30.08:55:30] [Info] Counting by: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Compiling OpenCL kernels...
[2021-01-30.08:55:30] [Info] Initializing GeForce RTX 3080
[2021-01-30.08:55:36] [Info] Generating 74,475,520 starting points (2841.0MB)
[2021-01-30.08:55:42] [Info] 10.0%
[2021-01-30.08:55:43] [Info] 20.0%
[2021-01-30.08:55:43] [Info] 30.0%
[2021-01-30.08:55:43] [Info] 40.0%
[2021-01-30.08:55:43] [Info] 50.0%
[2021-01-30.08:55:44] [Info] 60.0%
[2021-01-30.08:55:44] [Info] 70.0%
[2021-01-30.08:55:44] [Info] 80.0%
[2021-01-30.08:55:44] [Info] 90.0%
[2021-01-30.08:55:45] [Info] 100.0%
[2021-01-30.08:55:45] [Info] Done
[00:00:00] 4545/10240MB | 1 target 780.92 MKey/s

OpenCL versus the Cuda version. clBitCrack versus cuBitCrack; have you tried cuBitCrack?

cuda is not working yet but other gus post there gtx 3060 ti result with CL and that is even more than me , even 2070 with cl is more than mine
wtf
zahid888
Member
**
Offline Offline

Activity: 260
Merit: 19

the right steps towerds the goal


View Profile
January 30, 2021, 09:20:25 AM
Last edit: February 01, 2021, 05:17:59 PM by zahid888
 #657

why the hell so low speed with max  -b 70 -t 512 -p 2078

[2021-01-30.08:55:30] [Info] Compression: compressed
[2021-01-30.08:55:30] [Info] Starting at: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Ending at:   FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364140
[2021-01-30.08:55:30] [Info] Counting by: 0000000000000000000000000000000000000000000000000000000000000001
[2021-01-30.08:55:30] [Info] Compiling OpenCL kernels...
[2021-01-30.08:55:30] [Info] Initializing GeForce RTX 3080
[2021-01-30.08:55:36] [Info] Generating 74,475,520 starting points (2841.0MB)
[2021-01-30.08:55:42] [Info] 10.0%
[2021-01-30.08:55:43] [Info] 20.0%
[2021-01-30.08:55:43] [Info] 30.0%
[2021-01-30.08:55:43] [Info] 40.0%
[2021-01-30.08:55:43] [Info] 50.0%
[2021-01-30.08:55:44] [Info] 60.0%
[2021-01-30.08:55:44] [Info] 70.0%
[2021-01-30.08:55:44] [Info] 80.0%
[2021-01-30.08:55:44] [Info] 90.0%
[2021-01-30.08:55:45] [Info] 100.0%
[2021-01-30.08:55:45] [Info] Done
[00:00:00] 4545/10240MB | 1 target 780.92 MKey/s

OpenCL versus the Cuda version. clBitCrack versus cuBitCrack; have you tried cuBitCrack?

cuda is not working yet but other gus post there gtx 3060 ti result with CL and that is even more than me , even 2070 with cl is more than mine
wtf

-b 128 -t 256 -p 1024

[2021-01-30.14:44:54] [Info] Compression: compressed
[2021-01-30.14:44:54] [Info] Starting at: 0000000000000000000000000000000000000000000000000000000DDEEE9D0B
[2021-01-30.14:44:54] [Info] Ending at:   0000000000000000000000000000000000000000000000FFFFFFFFFFFFFFFFFF
[2021-01-30.14:44:54] [Info] Counting by: 0000000000000000000000000000000000000000000000000000001000000000
[2021-01-30.14:44:54] [Info] Compiling OpenCL kernels...
[2021-01-30.14:44:54] [Info] Initializing GeForce RTX 3060 Ti
[2021-01-30.14:44:56] [Info] Generating 33,554,432 starting points (1280.0MB)
[2021-01-30.14:44:59] [Info] 10.0%
[2021-01-30.14:44:59] [Info] 20.0%
[2021-01-30.14:45:00] [Info] 30.0%
[2021-01-30.14:45:00] [Info] 40.0%
[2021-01-30.14:45:00] [Info] 50.0%
[2021-01-30.14:45:00] [Info] 60.0%
[2021-01-30.14:45:00] [Info] 70.0%
[2021-01-30.14:45:00] [Info] 80.0%
[2021-01-30.14:45:01] [Info] 90.0%
[2021-01-30.14:45:01] [Info] 100.0%
[2021-01-30.14:45:01] [Info] Done
[2021-01-30.14:45:01] [Info] Loading addresses from 'C:/Users/Desktop/1.txt'
[2021-01-30.14:45:01] [Info] 104 addresses loaded (0.0MB)
GeForce RTX 3060 3072 / 8192MB | 104 targets 814.79 MKey/s (23,689,428,992 total) [00:00:27]

maybe you have to use that parameter.. still waiting for cubitcrack

1BGvwggxfCaHGykKrVXX7fk8GYaLQpeixA
NotATether
Legendary
*
Offline Offline

Activity: 1596
Merit: 6728


bitcoincleanup.com / bitmixlist.org


View Profile WWW
January 30, 2021, 10:32:20 AM
Last edit: January 30, 2021, 11:00:59 AM by NotATether
 #658

Can someone help please?
is there a way to stop/resume in cmd terminal?
thanks in advance!

Yeah there is, just pass --continue FILENAME to bitcrack and it will save its progress every 60 seconds to the file, and in the case of starting bitcrack with an existing file, it'll load its current keys and working data from there.

why the hell so low speed with max  -b 70 -t 512 -p 2078

Someone posted benchmarks of Bitcrack running in OpenCL mode on a bunch of NVIDIA GPUs here and the results were not good. It could be because the OpenCL version does not use the thread and block parameters you pass as these are CUDA-specific things, or maybe the OpenCL implementation of keysearch code was just not hand-tuned, it is still alpha after all. uint256_t is used everywhere inside there and I highly doubt OpenCL has an atomic 256bit int type. Even CUDA only has as large as a 128-bit atomic int so when this stuff is implemented using smaller-sized variables the runtime goes up accordingly.

EDIT: Just checked the code, yeah ClKeySearchDevice.cpp does in fact use threads and blocks but they are not fully parallelized as in CUDA and are only made use of to call the function that generates the starting points and to call the _doStepKernelWithDouble function i.e there is still a loop running on the CPU that generates an ever increasing number of points but the actual computation function is called with the threads and blocks.

Perhaps that explains why the CUDA performance is better.

.
.BLACKJACK ♠ FUN.
█████████
██████████████
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
████████████████
░██████████████
████████████
███████████████░██
██████████
CRYPTO CASINO &
SPORTS BETTING
▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
█████████
.
renedx
Jr. Member
*
Offline Offline

Activity: 36
Merit: 3


View Profile
February 01, 2021, 04:36:22 PM
 #659



cuda is not working yet but other gus post there gtx 3060 ti result with CL and that is even more than me , even 2070 with cl is more than mine
wtf

For OpenCL & max out optimized:
A 3090 RTX does ~2000 Mkey/s (overclocked)
A 3090 RTX does ~1800 Mkey/s (normal)
A 3080 RTX does ~1600 Mkey/s (overclocked)
A 3080 RTX does ~1400 Mkey/s (normal)
brainless
Member
**
Offline Offline

Activity: 316
Merit: 34


View Profile
February 02, 2021, 11:15:26 AM
 #660

one cuda coder bitcoinforktech disapear from 21 jan 2021, anyone know where he go, hope he will be fine, not infected by covid19, if anyone know write here about him for not online from 21 jan

https://bitcointalk.org/index.php?action=profile;u=2863059

13sXkWqtivcMtNGQpskD78iqsgVy9hcHLF
Pages: « 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 [33] 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 ... 96 »
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!