Bitcoin Forum
June 16, 2024, 04:34:23 AM *
  Show Posts
421  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 30, 2019, 07:11:48 AM
Thank you very much for testing :)
I'm starting to fight with VS2015. It seems that this time the problem comes from sub_borrow :D
422  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 29, 2019, 08:27:00 PM
Hi,

I compiled VanitySearch for Windows with CUDA 8 (only for compute capability 2.0).
Unfortunately, VS2015 performs wrong optimizations :( so I had to disable optimization for the CPU code.
But the GPU code seems to work as expected, at normal speed.
I will see if I can find where the compiler fails and whether I can find a workaround.
Thanks for testing it.

http://zelda38.free.fr/VanitySearch/
423  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 28, 2019, 05:21:36 PM
Is there a theoretical model that would make it possible to calculate the maximum performance for given hardware?

This would give an idea of how much more optimization you can achieve.

It is difficult to say.
Concerning VanitySearch, I think we are near the maximum if we keep the present algorithms.
There are still a few things to do, but a significant performance increase would need new algorithms, such as 'partial' SHA or RIPE reversing, in order to stop some calculations before getting the complete result, or something like that...

424  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 28, 2019, 02:01:13 PM
Hello,

I set up the GTX 1050 Ti and implemented the funnel shift for the SHA and RIPE rotations (not yet for ModInv).
I was expecting a more significant performance increase (I got only a little less than 3%).
Better than nothing.

Code:
C:\C++\VanitySearch\x64\ReleaseSM30>VanitySearch.exe -t 0 -gpu 1Testtttt
VanitySearch v1.11
Difficulty: 2988734397852221
Search: 1Testtttt [Compressed]
Start Thu Mar 28 14:48:27 2019
Base Key:3ECA27E3A98E4267E3D308CAA7E66B8972C31C4C02A7D16616BA46C32C59AFAC
Number of CPU thread: 0
GPU: GPU #0 GeForce GTX 1050 Ti (6x128 cores) Grid(48x128)
220.180 MK/s (GPU 220.180 MK/s) (2^32.76) [P 0.00%][50.00% in 109.4d][0]

Code:
C:\C++\VanitySearch\x64\ReleaseSM30>VanitySearch.exe -t 0 -gpu 1Testtttt
VanitySearch v1.11
Difficulty: 2988734397852221
Search: 1Testtttt [Compressed]
Start Thu Mar 28 14:51:10 2019
Base Key:7B8EEDDA6E7E418C9639AB5BBF0C14D2487D676ADDE6FC494F2504D3A026EF3B
Number of CPU thread: 0
GPU: GPU #0 GeForce GTX 1050 Ti (6x128 cores) Grid(48x128)
226.483 MK/s (GPU 226.483 MK/s) (2^32.85) [P 0.00%][50.00% in 106.4d][0]
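
The "50.00% in ...d" figure in these logs can be approximated from the difficulty and the key rate. The following is a minimal sketch, assuming each tested key matches independently with probability 1/difficulty; the function name daysToProbability is hypothetical and is not taken from the VanitySearch source.

```cpp
#include <cassert>
#include <cmath>

// Days needed to reach success probability p, assuming each key matches
// independently with probability 1/difficulty (an assumed model).
inline double daysToProbability(double difficulty, double keysPerSecond, double p) {
    // Solve 1 - (1 - 1/difficulty)^n = p for the number of keys n.
    double keys = std::log(1.0 - p) / std::log1p(-1.0 / difficulty);
    return keys / keysPerSecond / 86400.0;  // seconds -> days
}

// Usage with the figures of the first run above.
inline void exampleDaysToProbability() {
    double d = daysToProbability(2988734397852221.0, 220.180e6, 0.5);
    assert(d > 108.0 && d < 110.0);  // the log reports 109.4d
}
```

With the numbers of the first run this comes out near 109 days, in line with the reported estimate; the small difference may come from how the tool averages its key rate.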
425  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 27, 2019, 05:17:41 PM
Thanks for adding the version number! ;D

You're welcome. ;)

Anyway, I managed to get hold of a used GTX 1050 Ti and I should be able to implement the funnel shift (for compute cap>3.5), which should speed up hashing and the ModInv 62-bit shift (unless nvcc is smart enough to use the funnel shift on its own when it sees something like ((x>>(32-n))|(x<<n)) ).
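
For reference, the rotation pattern quoted above can be written as a small host-side C++ helper. This is only a sketch of the expression, not the VanitySearch code; the name rotl32 is hypothetical, and the shift counts are masked so that n = 0 does not shift by 32 bits (which would be undefined behaviour in C++).

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit word left by n bits: the ((x>>(32-n))|(x<<n)) pattern,
// with both shift counts masked to stay in the range 0..31.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
    return (x << (n & 31u)) | (x >> ((32u - n) & 31u));
}

// Usage: the top bit wraps around to the bottom.
inline void exampleRotl32() {
    assert(rotl32(0x80000001u, 1) == 0x00000003u);
    assert(rotl32(0x12345678u, 8) == 0x34567812u);
}
```

On the GPU, a rotation is a funnel shift whose two inputs are the same word, which is why SHA and RIPE rotations are natural candidates for that instruction.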


426  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 27, 2019, 10:36:36 AM

Do you have a _ModSqrMontgomery function?


No.

On the CPU:
The DRS62 ModInv costs ~160 ModSquareK1(); however, the DRS62 works for any odd prime.
An optimization can also be done for the SecpK1 prime, as there are 2 multiplications by P.
DRS62: 362.696 KiloI/sec
ModSquareK1: 58.717 MegaS/sec

On the GPU, the 62-bit right shift can also be optimized with the funnel shift.
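
To illustrate why a 62-bit right shift maps onto funnel shifts, here is a minimal host-side sketch on a multi-limb integer. The limb count (5) and the name shiftR62 are assumptions made for the example, and the value is treated as unsigned for simplicity; the real ModInv code may handle the top limb differently.

```cpp
#include <cassert>
#include <cstdint>

constexpr int NLIMBS = 5;  // assumed size, for illustration only

// Shift a little-endian multi-limb unsigned integer right by 62 bits.
// Each output limb takes its low 2 bits from the current limb and its
// high 62 bits from the next one: one funnel shift per limb.
inline void shiftR62(std::uint64_t r[NLIMBS]) {
    for (int i = 0; i < NLIMBS - 1; ++i)
        r[i] = (r[i] >> 62) | (r[i + 1] << 2);
    r[NLIMBS - 1] >>= 62;  // nothing above the top limb to shift in
}

// Usage: 2^64 >> 62 == 4.
inline void exampleShiftR62() {
    std::uint64_t v[NLIMBS] = {0, 1, 0, 0, 0};
    shiftR62(v);
    assert(v[0] == 4 && v[1] == 0);
}
```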
427  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 27, 2019, 09:39:59 AM
Hello,

I published a new release (1.10):

-Support for compressed private keys (tested with Electrum 3.3.4)
-Slight performance increase

Thanks for testing it
Have fun ;)
428  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 26, 2019, 07:27:29 PM
gcc version 7.0.1 20170407 (experimental) [trunk revision 246759] (Ubuntu 7-20170407-0ubuntu2)

OK. I observed the issue with gcc 6, but with my gcc 7.3.0 it worked. It seems that this optimization bug is still present in 7.0.1. Hmm... I will add a test on the minor version and keep the volatile for gcc < 7.3. I tried with gcc 8.2 and it also works.
Thanks for the report.

429  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 26, 2019, 06:58:15 PM
Hi, I've just downloaded the VanitySearch Master, it works perfectly if I add "volatile" in this piece of code:

OK, which release of gcc are you using to compile VanitySearch (not the CUDA code)?
430  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 25, 2019, 06:21:12 PM
Yes, today the default is to free only one core when the GPU is enabled; I will change this to the number of GPUs.
431  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 24, 2019, 04:57:42 PM
Do you recognize this crash error?

No, I have never experienced this crash. Thanks for the info ;)

Is this included in your roadmap?

Hi ;)
I'm not yet familiar with P2SH addresses; I have to learn them in detail. Maybe for 1-to-1 multisig P2SH.

Nice work anyway!

Thanks ;)

It is very strange with the process slower than without it.

Yes, this is because with -t 8 your CPU becomes a bottleneck and cannot handle the GPU/CPU exchange.
When you have a good GPU key rate, it is generally better to free 1 CPU core per GPU.

Jean_Luc, thank you for your hard work. If execution is interrupted, does VanitySearch keep the results?

If you are using a passphrase, and if you want to restart a search, you have to change your passphrase (1 character is enough) otherwise you will recompute exactly the same thing. If you're using the default random seed, the seed will change so you won't recompute the same thing, no need to save anything.
But I recommend using a passphrase in order to generate safe private keys.

432  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 24, 2019, 05:32:04 AM
For the moment only on Linux, but it seems to me that Jean_Luc is trying or will try to adapt it for Windows too... it is more difficult than for Linux, I think.

Yes, on Windows there is no way to set up CUDA SDK 8.0 if a recent compiler (VC2017) is installed, even if the right one (VC2013) is also installed. The SDK setup fails. So the only solution is to start from a fresh install without VC2017.

Also, any chance of OpenCL or is it going to be CUDA only?
Thanks,
Dave

The problem with OpenCL is that I don't know how to access the carry flag or how to perform a wide 64-bit multiplication (i64xi64=>i128).

For instance:

Here is the code from oclvanitygen that performs an addition with carry:

Code:
#define bn_addc_word(r, a, b, t, c) do {			\
t = a + b + c; \
c = (t < a) ? 1 : ((c & (t == a)) ? 1 : 0); \
r = t; \
} while (0)

This can be reduced to a single adc instruction with CUDA (and also with Visual C++, gcc, etc.)!
Some OpenCL driver compilers are smart enough to understand this code and reduce it to a single adc instruction, but not all!
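
For comparison, here is a minimal portable C++ sketch of the same add-with-carry idea, written as a function rather than a macro. The names addc64 and add128 are hypothetical; whether a given compiler turns this into a single adc instruction is not guaranteed.

```cpp
#include <cassert>
#include <cstdint>

// Returns a + b + carry (mod 2^64) and updates carry (0 or 1) with the carry-out.
inline std::uint64_t addc64(std::uint64_t a, std::uint64_t b, unsigned &carry) {
    std::uint64_t t = a + b;
    unsigned c1 = (t < a) ? 1u : 0u;   // overflow of a + b
    std::uint64_t r = t + carry;
    unsigned c2 = (r < t) ? 1u : 0u;   // overflow when adding the carry-in
    carry = c1 | c2;                   // both cannot overflow at the same time
    return r;
}

// 128-bit addition on two little-endian limbs, chaining the carry.
inline void add128(const std::uint64_t a[2], const std::uint64_t b[2], std::uint64_t r[2]) {
    unsigned c = 0;
    r[0] = addc64(a[0], b[0], c);
    r[1] = addc64(a[1], b[1], c);
}

// Usage: (2^64 - 1) + 1 carries into the second limb.
inline void exampleAdd128() {
    std::uint64_t a[2] = {~0ULL, 0}, b[2] = {1, 0}, r[2];
    add128(a, b, r);
    assert(r[0] == 0 && r[1] == 1);
}
```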

For the wide 64-bit multiplication (i64xi64=>i128), CUDA offers the needed instructions (mul.lo.u64 and mul.hi.u64), but with OpenCL it seems that the only way is to split the operands into 32-bit integers and use 64-bit integers to perform the multiplications (i32xi32=>i64).

If an OpenCL expert knows how to do this efficiently, it would be great.
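
The 32-bit fallback described above can be sketched as follows, in plain C++ on the host. This only illustrates the schoolbook decomposition (four i32xi32=>i64 products); the name mul64wide is hypothetical and this is not code from VanitySearch or oclvanitygen.

```cpp
#include <cassert>
#include <cstdint>

// 64x64 => 128-bit multiplication using only 32x32 => 64-bit products.
inline void mul64wide(std::uint64_t a, std::uint64_t b,
                      std::uint64_t &hi, std::uint64_t &lo) {
    const std::uint64_t M = 0xFFFFFFFFULL;
    std::uint64_t a0 = a & M, a1 = a >> 32;
    std::uint64_t b0 = b & M, b1 = b >> 32;

    std::uint64_t p00 = a0 * b0;  // weight 2^0
    std::uint64_t p01 = a0 * b1;  // weight 2^32
    std::uint64_t p10 = a1 * b0;  // weight 2^32
    std::uint64_t p11 = a1 * b1;  // weight 2^64

    // Middle column: at most 3 * (2^32 - 1), so it fits easily in 64 bits.
    std::uint64_t mid = (p00 >> 32) + (p01 & M) + (p10 & M);

    lo = (p00 & M) | (mid << 32);
    hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

// Usage: (2^64 - 1)^2 = 2^128 - 2^65 + 1.
inline void exampleMul64wide() {
    std::uint64_t hi, lo;
    mul64wide(~0ULL, ~0ULL, hi, lo);
    assert(hi == 0xFFFFFFFFFFFFFFFEULL && lo == 1);
}
```

This costs four multiplications and several additions where the hardware path needs two instructions, which gives an idea of the overhead being discussed.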

433  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 23, 2019, 05:51:01 PM
How is it possible??

Found by the CPU? Try with -t 0...
434  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 23, 2019, 04:57:09 PM
Then we have the classic security problem of using a pseudo-random seed. Alarm!
Fix it quickly by switching to /dev/urandom.

As written in the readme, for safe keys it is recommended to use a passphrase with the -s option (as for BIP38).
Concerning the default seed pbkdf2_hmac_sha512(date + uptime in us): here we search for a prefix, which means that a seed search attack might work on a very short prefix and would require very competitive and expensive hardware.

YES! Moreover, I guarantee you that Montgomery multiplication is a source of slowness, especially on GPU.
...

As written in the readme, VanitySearch now uses a 2-step folding modular multiplication optimized for the SecpK1 prime.
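
To give an idea of what a 2-step folding reduction looks like, here is a minimal sketch for the secp256k1 prime p = 2^256 - 2^32 - 977, for which 2^256 is congruent to 0x1000003D1 modulo p. It is written from that identity, not copied from VanitySearch; the name reduce512 is hypothetical and the code relies on the unsigned __int128 extension of gcc/clang.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;
using u128 = unsigned __int128;  // gcc/clang extension (assumption)

static const u64 K = 0x1000003D1ULL;  // 2^256 mod p, i.e. 2^32 + 977

// Reduce a 512-bit value t (8 little-endian limbs) modulo p into r (4 limbs).
inline void reduce512(const u64 t[8], u64 r[4]) {
    // Fold 1: s = low + high * K, at most 5 limbs (the fifth is small).
    u64 s[5];
    u128 acc = 0;
    for (int i = 0; i < 4; ++i) {
        acc += (u128)t[4 + i] * K + t[i];
        s[i] = (u64)acc;
        acc >>= 64;
    }
    s[4] = (u64)acc;

    // Fold 2: fold the fifth limb back in the same way.
    acc = (u128)s[4] * K;
    for (int i = 0; i < 4; ++i) {
        acc += s[i];
        r[i] = (u64)acc;
        acc >>= 64;
    }

    // Final correction: a leftover carry is worth K, and r >= p means
    // subtracting p, which is also "add K modulo 2^256".
    bool geP = r[3] == ~0ULL && r[2] == ~0ULL && r[1] == ~0ULL && r[0] >= (u64)0 - K;
    if (acc != 0 || geP) {
        u128 c = K;
        for (int i = 0; i < 4; ++i) { c += r[i]; r[i] = (u64)c; c >>= 64; }
    }
}

// Usage: 2^256 reduces to K.
inline void exampleReduce512() {
    u64 t[8] = {0, 0, 0, 0, 1, 0, 0, 0}, r[4];
    reduce512(t, r);
    assert(r[0] == K && r[1] == 0 && r[2] == 0 && r[3] == 0);
}
```

The point of the folding is that no division or Montgomery form is needed: two multiplications by a 33-bit constant plus a final correction are enough.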
435  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 22, 2019, 02:10:25 PM
Another report from a user using CUDA 8 and gcc 4.8 on a GeForce GTX 460: it works.
436  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 22, 2019, 01:38:45 PM
Don't worry, cuda 8 needs g++ 4.9, that's the problem.

I use g++ 4.8/CUDA 8 with my old Quadro and it works.

About the performance, I think most people use only compressed addresses.

If you do a specific ComputeKeys for only compressed keys (don't compute y at all!):

Yes, you're right. I will make a second kernel optimized for compressed addresses only.
437  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 22, 2019, 01:29:48 PM
Yes, I already did it.

It will drive me crazy.
It works on my 2 configs, and a user on GitHub just posted a report for a GeForce GTX 1080 Ti (ccap=6.1) running on Ubuntu 18.04: it works fine (he uses CUDA 10).
438  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 22, 2019, 01:18:17 PM
You may have noticed that I changed the makefile.
Now you should call it like this:

Code:
make gpu=1 ccap=50 all

And also set the right variables:
Code:
CUDA       = /usr/local/cuda-8.0
CXXCUDA    = /usr/bin/g++-4.8

The readme is up to date.
439  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 22, 2019, 01:15:53 PM
Unfortunately all wrong!!!

That's strange. Maybe I introduced another bug.
If you restore the volatile, does it work?
440  Bitcoin / Development & Technical Discussion / Re: VanitySearch (Yet another address prefix finder) on: March 22, 2019, 08:09:08 AM

You can delete:

and delete u0, u1, u2, u3, r0, r1, r2, r3


I committed your mods, removed the unused variables and changed the squaring a bit: I just replaced the reset of variables t1 and t2 with UADD(t1, 0x0ULL, 0x0ULL);. With this, it is no longer necessary to reset t1 or t2 to 0; t1 is set from the carry flag.
I also added my reduction, which uses the MADC instruction (multiply and add).

You can try both implementations by changing, at GPUEngine.cu:665,
Code:
#if 1
to
Code:
#if 0


I also ported your ModSqr to the CPU release in IntMod.cpp.

On my hardware there is no significant performance increase: the square is ~10% faster than the classic mult, so on the global process there is no measurable performance increase.

I removed the volatile again and added "memory" to the clobber list of the inline assembly. This should prevent the compiler from reordering instructions (for pipelining optimization) and losing a carry or picking up an unexpected one.

Please test the source on GitHub and tell me if you still get the errors.

This is my last idea...