Bitcoin Forum
Topic: VanitySearch (Yet another address prefix finder)
Jean_Luc (OP)
March 17, 2019, 03:05:11 PM
 #141

Could you try this:
Code:
pons@linpons:~/VanitySearch$ /usr/local/cuda/bin/cuda-memcheck --tool memcheck VanitySearch -g 1 -check

On my Linux machine it does not work (the hardware is too old), but on Windows it ends like this:

Code:
C:\C++\VanitySearch\x64\ReleaseSM30>cuda-memcheck --tool memcheck VanitySearch.exe -g 1 -check
...
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 GeForce GTX 645 (3x192 cores) Grid(1x128)
Endianness: Little
Seed: 1006346800
401.220 KiloKey/sec
ComputeKeys() found 46 items , CPU check...
GPU/CPU check OK
========= ERROR SUMMARY: 0 errors
Jean_Luc (OP)
March 17, 2019, 03:47:15 PM
 #142

I committed a new Makefile with a debug option.

Code:
make clean
make gpu=1 debug=1 all

In debug mode no inlining is done.

Obviously, it is much slower.
Then launch:

Code:
pons@linpons:~/VanitySearch$ ./VanitySearch -g 1 -check
arulbero
March 17, 2019, 04:09:58 PM
Last edit: March 17, 2019, 04:20:11 PM by arulbero
 #143

Quote from: Jean_Luc on March 17, 2019, 03:47:15 PM
I committed a new Makefile with a debug option.

Code:
make clean
make gpu=1 debug=1 all

In debug mode no inlining is done.

Obviously, it is much slower.
Then launch:

Code:
pons@linpons:~/VanitySearch$ ./VanitySearch -g 1 -check


Code:
./VanitySearch -g 1 -check
GetBase10() Results OK
Add() Results OK : 108.696 MegaAdd/sec
Mult() Results OK : 10.684 MegaMult/sec
Div() Results OK : 1.656 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() Results OK : 132.041 KiloInv/sec
IntGroup.ModInv() Results OK : 2.222 MegaInv/sec
ModMulK1() Results OK : 3.661 MegaMult/sec
ModMulK1order() Results OK : 1.700 MegaMult/sec
ModSqrt() Results OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(1x128)
Seed: 888394
193.110 KiloKey/sec
ComputeKeys() found 26 items , CPU check...
GPU/CPU check OK


Code:
~/VanitySearch$ /usr/local/cuda-8.0/bin/cuda-memcheck --tool memcheck VanitySearch -g 1 -check
========= CUDA-MEMCHECK
GetBase10() Results OK
Add() Results OK : 109.890 MegaAdd/sec
Mult() Results OK : 10.695 MegaMult/sec
Div() Results OK : 1.818 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() Results OK : 130.572 KiloInv/sec
IntGroup.ModInv() Results OK : 2.182 MegaInv/sec
ModMulK1() Results OK : 3.602 MegaMult/sec
ModMulK1order() Results OK : 1.684 MegaMult/sec
ModSqrt() Results OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(1x128)
Seed: 781110
15.061 KiloKey/sec
ComputeKeys() found 26 items , CPU check...
GPU/CPU check OK
========= ERROR SUMMARY: 0 errors

Code:
~/VanitySearch$ /usr/local/cuda-8.0/bin/cuda-memcheck --tool memcheck VanitySearch -g 32 -check
========= CUDA-MEMCHECK
GetBase10() Results OK
Add() Results OK : 80.000 MegaAdd/sec
Mult() Results OK : 10.030 MegaMult/sec
Div() Results OK : 1.883 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() Results OK : 130.924 KiloInv/sec
IntGroup.ModInv() Results OK : 2.221 MegaInv/sec
ModMulK1() Results OK : 3.659 MegaMult/sec
ModMulK1order() Results OK : 1.704 MegaMult/sec
ModSqrt() Results OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(32x128)
Seed: 639838
59.308 KiloKey/sec
ComputeKeys() found 721 items , CPU check...
GPU/CPU check OK
========= ERROR SUMMARY: 0 errors
Jean_Luc (OP)
March 17, 2019, 04:55:04 PM
 #144

OK, thanks. Could you try running cuda-memcheck on the release version?
arulbero
March 17, 2019, 05:32:56 PM
 #145

Quote from: Jean_Luc on March 17, 2019, 04:55:04 PM
OK, thanks. Could you try running cuda-memcheck on the release version?



Code:
~/VanitySearch-1.8$ /usr/local/cuda-8.0/bin/cuda-memcheck --tool memcheck VanitySearch -g 1 -check
========= CUDA-MEMCHECK
GetBase10() Results OK
Add() Results OK : 123.457 MegaAdd/sec
Mult() Results OK : 23.148 MegaMult/sec
Div() Results OK : 5.208 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 341.317 KiloInv/sec
IntGroup.ModInv() : 9.130 MegaInv/sec
ModMulK1() : 12.968 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(1x128)
Seed: 223215
95.697 KiloKey/sec
ComputeKeys() found 26 items , CPU check...
Expected item not found 3412bb65 cb39a716 67dcd486 209b19df c65e364c
Expected item not found fefea644 d535267a 46308e46 c579e91b 0aad3ee2
Expected item not found 3412726b 9830f325 9c5f0d95 a99e2a9b 6c473922
Expected item not found 341292e1 b4a39d2c 59e34f3d 38725b42 dfc2e801
Expected item not found fefeba57 c1209e3d 1b79200c b9529018 de0e35e4
Expected item not found fefe4aaa 34f02402 4ed76c83 a1d60efc 8c79f7a6
Expected item not found fefe8742 63e9b7bc b13a08f1 28229fd8 30987ed3
CPU found 22 items
========= ERROR SUMMARY: 0 errors
stivensons
March 17, 2019, 06:04:15 PM
 #146

Quote from: Jean_Luc on March 17, 2019, 04:55:04 PM
OK, thanks. Could you try running cuda-memcheck on the release version?

If you post a Windows release, I can test it too :)

Jean_Luc (OP)
March 17, 2019, 06:28:46 PM
 #147

Quote from: stivensons on March 17, 2019, 06:04:15 PM
If you post a Windows release, I can test it too :)

You can test with the release you have.
You can try:
Code:
VanitySearch -gpuId 0 -check 
VanitySearch -gpuId 6 -check (On the 3GB)
Thanks ;)

Tomorrow I will try to set up CUDA SDK 10 on recent hardware (Linux) and see if I can reproduce the issue.
stortz
March 17, 2019, 10:43:52 PM
 #148

I tried your program with the parameters shown in the sample, plus my username:
Code:
-stop -gpu 1stortz

It ran, but just closed after finding it.
Did it write the private keys to a file?
I am confused.
stivensons
March 18, 2019, 05:12:04 AM
 #149

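Quote from: Jean_Luc on March 17, 2019, 06:28:46 PM
Quote from: stivensons on March 17, 2019, 06:04:15 PM
If you post a Windows release, I can test it too :)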

You can test with the release you have.
You can try:
Code:
VanitySearch -gpuId 0 -check 
VanitySearch -gpuId 6 -check (On the 3GB)
Thanks ;)

Tomorrow I will try to set up CUDA SDK 10 on recent hardware (Linux) and see if I can reproduce the issue.


Results with CUDA 10:

Code:
G:\vanitysearch>vanitysearch   -gpuId 0 -check
GetBase10() Results OK
Add() Results OK : 567.189 MegaAdd/sec
Mult() Results OK : 38.169 MegaMult/sec
Div() Results OK : 4.410 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 281.352 KiloInv/sec
IntGroup.ModInv() : 8.365 MegaInv/sec
ModMulK1() : 10.770 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #0 GeForce GTX 1060 6GB (10x128 cores) Grid(80x128)
Seed: 1853432973
296.742 MegaKey/sec
ComputeKeys() found 1947 items , CPU check...
GPU/CPU check OK

Code:
G:\vanitysearch>vanitysearch   -gpuId 6 -check
GetBase10() Results OK
Add() Results OK : 556.067 MegaAdd/sec
Mult() Results OK : 35.273 MegaMult/sec
Div() Results OK : 4.104 MegaDiv/sec
ModInv()/ModExp() Results OK
ModInv() : 260.561 KiloInv/sec
IntGroup.ModInv() : 7.773 MegaInv/sec
ModMulK1() : 9.881 MegaMult/sec
ModSqrt() OK !
Check Generator :OK
Check Double :OK
Check Add :OK
Check GenKey :OK
Adress : 15t3Nt1zyMETkHbjJTTshxLnqPzQvAtdCe OK!
Adress : 1BoatSLRHtKNngkdXEeobR76b53LETtpyT OK!
Adress : 1JeanLucgidKHxfY5gkqGmoVjo1yaU4EDt OK(comp)!
Adress : 1Test6BNjSJC5qwYXsjwKVLvz7DpfLehy OK!
Adress : 1BitcoinP7vnLpsUHWbzDALyJKnNo16Qms OK(comp)!
Check Calc PubKey (full) 1ViViGLEawN27xRzGrEhhYPQrZiTKvKLo :OK
Check Calc PubKey (even) 1Gp7rQ4GdooysEAEJAS2o4Ktjvf1tZCihp:OK
Check Calc PubKey (odd) 18aPiLmTow7Xgu96msrDYvSSWweCvB9oBA:OK
GPU: GPU #6 GeForce GTX 1060 3GB (9x128 cores) Grid(72x128)
Seed: 2205931314
260.131 MegaKey/sec
ComputeKeys() found 1752 items , CPU check...
GPU/CPU check OK
Jean_Luc (OP)
March 18, 2019, 06:53:09 AM
Merited by stortz (1)
 #150

Hello,

Quote from: stortz on March 17, 2019, 10:43:52 PM
It ran, but just closed after finding it.
Did it write the private keys to a file?
I am confused.

To write the key to a file, use the -o option:
Code:
VanitySearch -stop -gpu -o key.txt 1stortz

Many thanks, stivensons, for the report :)
Jean_Luc (OP)
March 18, 2019, 12:15:32 PM
 #151

Linux binaries are available for download here (experimental).
They are compiled with CUDA SDK 10.
Thanks for testing them ;)

http://zelda38.free.fr/VanitySearch/
Lisa Finn
March 19, 2019, 06:11:55 PM
 #152

Quote from: Jean_Luc
Hello,

I would like to present a new Bitcoin address prefix finder called VanitySearch. It is very similar to Vanitygen.
The main differences from Vanitygen are that VanitySearch does not use the heavy OpenSSL library for CPU calculation and that the kernel is written in CUDA in order to take full advantage of inline PTX assembly.
On my Intel Core i7-4770, VanitySearch runs ~4 times faster than vanitygen64 (1.32 MKey/s -> 5.27 MKey/s).
On my GeForce GTX 645, VanitySearch runs ~1.5 times faster than oclvanitygen (9.26 MKey/s -> 14.548 MKey/s).
To compare VanitySearch and Vanitygen results, use the -u option to search for uncompressed addresses.
VanitySearch may not compute a good grid size for your GPU, so try several values of the -g option to find the best performance.
Using compressed addresses is roughly 20% faster.

VanitySearch is available from https://github.com/JeanLucPons/VanitySearch

There are still lots of improvements to make.
Feel free to test it and to submit issues.

Thanks.
Sorry for my bad English.
Jean-Luc

Is this really legal in Asian countries like India???
TryNinja
March 19, 2019, 06:52:37 PM
 #153

Quote from: Lisa Finn on March 19, 2019, 06:11:55 PM
Is this really legal in Asian countries like India???

Why would a Bitcoin address generator be illegal anywhere?

Jean_Luc (OP)
March 20, 2019, 09:00:00 AM
Merited by OgNasty (1)
 #154

Hello,

A new release of VanitySearch (1.9) is out:

Code:
Added -b option (Search compressed or uncompressed addresses)
Improved performance for loading large prefix list
Fixed difficulty calculation bug for prefix containing only '1'

Windows binaries: https://github.com/JeanLucPons/VanitySearch/releases/tag/1.9
Linux binaries: http://zelda38.free.fr/VanitySearch/ (Experimental)

Thanks for testing it! :)
arulbero
March 20, 2019, 09:40:53 AM
 #155

Quote from: Jean_Luc on March 20, 2019, 09:00:00 AM
A new release of VanitySearch (1.9) is out:

Code:
Added -b option (Search compressed or uncompressed addresses)
Improved performance for loading large prefix list
Fixed difficulty calculation bug for prefix containing only '1'


The new version is slower on my PC (132 MKeys/s versus 162 MKeys/s).
Jean_Luc (OP)
March 20, 2019, 10:07:38 AM
 #156

Quote from: arulbero on March 20, 2019, 09:40:53 AM
The new version is slower on my PC (132 MKeys/s versus 162 MKeys/s).

On my Windows machine, performance is the same as with the previous release (CUDA 10).
It is slightly slower on Linux (CUDA 8.0): from 39.5 MK/s to 37.9 MK/s.

Anyway, do you compile it yourself or do you use the Linux binaries?
Did you solve your problem? I have not managed to reproduce the issue yet.
arulbero
March 20, 2019, 11:36:29 AM
Merited by Jean_Luc (1)
 #157

Quote from: Jean_Luc on March 20, 2019, 10:07:38 AM
Quote from: arulbero on March 20, 2019, 09:40:53 AM
The new version is slower on my PC (132 MKeys/s versus 162 MKeys/s).

On my Windows machine, performance is the same as with the previous release (CUDA 10).
It is slightly slower on Linux (CUDA 8.0): from 39.5 MK/s to 37.9 MK/s.

Anyway, do you compile it yourself or do you use the Linux binaries?
Did you solve your problem? I have not managed to reproduce the issue yet.


I compile the source myself. No, my problem is not solved. I only have CUDA 8.0.


Some ideas for (maybe) a little speed improvement:


1) In __device__ void ComputeKeys (GPUCompute.h), instead of doing this HSIZE times:

Code:
ModNeg256(dy,Gy[i]);  <--
ModSub256(dy, py);

you could do:

Code:
ModSub256(dy, pyn, Gy[i]);

and compute pyn only once:

Code:
ModNeg256(pyn,py);

2) instead of

Code:
ModAdd256(py, Gy[i]);

you could do:

Code:
ModSub256(py, sy);

To sum up:

Code:
ModSub256(dy, pyn, Gy[i]);

_ModMult(_s, dy, dx[i]);      //  s = (p2.y-p1.y)*inverse(p2.x-p1.x)
_ModMult(_p2, _s, _s);        // _p2 = pow2(s)

ModSub256(px, _p2, px);
ModSub256(px, Gx[i]);         // px = pow2(s) - p1.x - p2.x;

ModSub256(py, sx, px);
_ModMult(py, _s);             // py = - s*(ret.x-p2.x)
ModSub256(py, sy);            // py = - p2.y - s*(ret.x-p2.x);
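
Idea 1) relies on the identity (-Gy[i]) - py = (-py) - Gy[i] (mod p), which is why the negation of py can be computed once outside the loop. Here is a minimal host-side sketch of that equivalence. It assumes a toy prime (97) in place of the secp256k1 field, and the helpers mod_neg and mod_sub are hypothetical stand-ins that only mimic the roles of ModNeg256 and ModSub256; they are not VanitySearch functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy prime standing in for the secp256k1 field prime (assumption, for illustration only).
constexpr uint64_t P = 97;

// Hypothetical helpers that mimic the roles of ModNeg256 / ModSub256.
uint64_t mod_neg(uint64_t a) { return (P - a % P) % P; }
uint64_t mod_sub(uint64_t a, uint64_t b) { return (a % P + P - b % P) % P; }

// Original form: negate Gy[i] on every iteration, then subtract py.
std::vector<uint64_t> dy_per_iteration(const std::vector<uint64_t>& gy, uint64_t py) {
  std::vector<uint64_t> out;
  for (uint64_t g : gy) out.push_back(mod_sub(mod_neg(g), py));
  return out;
}

// Suggested form: negate py once, then a single subtraction per iteration.
std::vector<uint64_t> dy_hoisted(const std::vector<uint64_t>& gy, uint64_t py) {
  const uint64_t pyn = mod_neg(py);  // computed only once
  std::vector<uint64_t> out;
  for (uint64_t g : gy) out.push_back(mod_sub(pyn, g));
  return out;
}
```

Both functions return the same values for any input, so the hoisted form saves one negation per loop iteration without changing the result.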


3) In __device__ void ModSub256, instead of:

Code:
  if ((int64_t)t < 0) {
    UADDO1(r[0], _P[0]);
    UADDC1(r[1], _P[1]);
    UADDC1(r[2], _P[2]);
    UADD1(r[3], _P[3]);
  }

something like this would be better:

Code:
  if ((int64_t)t < 0) {
    USUBO1(r[0], 0x01000003d1);
    USUBC1(r[1], 0ULL);
    USUBC1(r[2], 0ULL);
    USUBC1(r[3], 0ULL);
  }

(I'm not sure what the C means; I suppose it means "with carry".)
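
Idea 3) works because the secp256k1 prime is P = 2^256 - 0x1000003D1, so on four 64-bit limbs (arithmetic modulo 2^256) adding P and subtracting 0x1000003D1 give the same result. Below is a small host-side C++ sketch of that equivalence. The names add_p and sub_c are hypothetical, not VanitySearch functions, and limbs are assumed to be stored least-significant first, as in the code above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using U256 = std::array<uint64_t, 4>;  // little-endian limbs: r[0] is least significant

// secp256k1 field prime P = 2^256 - 0x1000003D1.
constexpr U256 P = {0xFFFFFFFEFFFFFC2FULL, 0xFFFFFFFFFFFFFFFFULL,
                    0xFFFFFFFFFFFFFFFFULL, 0xFFFFFFFFFFFFFFFFULL};

// r + P modulo 2^256 (the final carry out is discarded).
U256 add_p(U256 r) {
  uint64_t carry = 0;
  for (int i = 0; i < 4; ++i) {
    uint64_t s = r[i] + P[i];
    uint64_t c1 = s < r[i];  // carry from r[i] + P[i]
    uint64_t t = s + carry;
    uint64_t c2 = t < s;     // carry from adding the incoming carry
    r[i] = t;
    carry = c1 | c2;
  }
  return r;
}

// r - 0x1000003D1 modulo 2^256 (the final borrow out is discarded).
U256 sub_c(U256 r) {
  uint64_t borrow = 0x1000003D1ULL;  // amount to remove from the current limb
  for (int i = 0; i < 4; ++i) {
    uint64_t d = r[i] - borrow;
    borrow = (r[i] < borrow) ? 1 : 0;  // on underflow, take 1 from the next limb
    r[i] = d;
  }
  return r;
}
```

In the common case r[0] >= 0x1000003D1, so the subtraction touches only the lowest limb and no borrow propagates, which is presumably where the speed-up would come from.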
arulbero
March 20, 2019, 11:47:42 AM
 #158

Another sub function, if you want to test it:


Code:
__device__ void ModSub256(uint64_t *rp, uint64_t *ap, uint64_t *bp) {

 
  uint64_t a0, a1, a2, a3, b0, b1, b2, b3, r0, r1, r2, r3;
  int8_t c0, c1, c2, c3;


  a0 = ap[0];
  a1 = ap[1];
  a2 = ap[2];
  a3 = ap[3];

  b0 = bp[0];
  b1 = bp[1];
  b2 = bp[2];
  b3 = bp[3];
 
  /*
  r0 = a0 - b0;
  c0 = (a0 < b0) ? 1 : -1;
  c0 = (r0 == 0) ? 0 : c0;
 
  r1 = a1 - b1;
  c1 = (a1 < b1) ? 1 : -1;
  c1 = (r1 == 0) ? c0 : c1;
  r1 = r1 - (c0 == 1);
  
  r2 = a2 - b2;
  c2 = (a2 < b2) ? 1 : -1;
  c2 = (r2 == 0) ? c1 : c2;
  r2 = r2 - (c1 == 1);

  r3 = a3 - b3;
  c3 = (a3 < b3) ? 1 : -1;
  c3 = (r3 == 0) ? c2 : c3;
  r3 = r3 - (c2 == 1);
  */


  
  c0 = a0 < b0;
  r0 = a0 - b0;
  
  c1 = a1 < b1;
  r1 = a1 - b1;
  if(r1 == 0){ c1 = c0;}
  if(c0) {r1 = r1 - 1;}
  

  c2 = a2 < b2;
  r2 = a2 - b2;
  if(r2 == 0){ c2 = c1;}
  if(c1) {r2 = r2 - 1;}

  c3 = a3 < b3;
  r3 = a3 - b3;
  if(r3 == 0){ c3 = c2;}
  if(c2) {r3 = r3 - 1;}

  
  if(c3 == 1){

    if(r0 > 0x1000003d0){  //almost always --> no borrow

      r0 = r0 - 0x1000003d1;

    }
    else{

      //c[0] = (r0 < 0x1000003d1) ? 1 : -1;
      //c0 = (r0 == 0x1000003d1) ? 0 : 1;
      //c0 = 1; // for sure r0 < 0x1000003d1

      r0 = r0 - 0x1000003d1;
      r1 = r1  - 1;  //c0 is 1

      c1 = (r1 == 0xffffffffffffffff) ? 1 : -1;
      c2 = (r2 == 0) ? c1 : -1;

      if(c1 == 1) r2 = r2 - 1;
      if(c2 == 1) r3 = r3 - 1;

    }
  }
  
  
  
  rp[0] = r0;
  rp[1] = r1;
  rp[2] = r2;
  rp[3] = r3;


  return;
 
}


Jean_Luc (OP)
March 20, 2019, 11:50:28 AM
 #159

Many thanks for the tips ;)
I will try this.

Would you like to try the binary? libcudart.so.10.0 is also available from the given link, so you do not need to set up CUDA SDK 10 (unless a driver problem appears, but this may work without installing anything).
You can just copy VanitySearch50 and libcudart.so.10.0 into a directory and set LD_LIBRARY_PATH:
Code:
export LD_LIBRARY_PATH=.
./VanitySearch50 ...

This is mainly to see if the problem is solved with CUDA 10 or if it comes from elsewhere.
Jean_Luc (OP)
March 20, 2019, 11:56:10 AM
 #160

Quote from: arulbero on March 20, 2019, 11:36:29 AM
(I'm not sure what the C means; I suppose it means "with carry".)

Yes,
ADDO is the initial add: no carry in, and it sets the carry flag
ADDC is add with carry, and it sets the carry flag
ADD is add with carry, and it does not set the carry flag
Same for SUB
Functions may have a 1 suffix for the unary form.
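
As a rough host-side illustration of that carry chain, a 256-bit addition can be written as below. This is only a sketch: the real macros are inline PTX, the carry flag is modelled here as an explicit variable, and the helper names add_o, add_c and add_last are hypothetical stand-ins for the ADDO, ADDC and ADD roles.

```cpp
#include <cassert>
#include <cstdint>

// "ADDO" role: first add, no carry in, records the carry out.
inline uint64_t add_o(uint64_t a, uint64_t b, bool& carry) {
  uint64_t s = a + b;
  carry = s < a;
  return s;
}

// "ADDC" role: add with carry in, records the carry out.
inline uint64_t add_c(uint64_t a, uint64_t b, bool& carry) {
  uint64_t s = a + b;
  bool c1 = s < a;                   // carry from a + b
  uint64_t t = s + (carry ? 1 : 0);
  carry = c1 || (t < s);             // or carry from adding the incoming carry
  return t;
}

// "ADD" role: last add, uses the carry in, does not record a carry out.
inline uint64_t add_last(uint64_t a, uint64_t b, bool carry) {
  return a + b + (carry ? 1 : 0);
}

// r = a + b on four little-endian 64-bit limbs, modulo 2^256.
void add256(uint64_t r[4], const uint64_t a[4], const uint64_t b[4]) {
  bool carry = false;
  r[0] = add_o(a[0], b[0], carry);
  r[1] = add_c(a[1], b[1], carry);
  r[2] = add_c(a[2], b[2], carry);
  r[3] = add_last(a[3], b[3], carry);
}
```

The SUB family follows the same pattern with a borrow in place of the carry.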