NotATether
Legendary
Offline
Activity: 1596
Merit: 6726
bitcoincleanup.com / bitmixlist.org
|
|
December 27, 2022, 05:43:41 PM |
|
Alright. There's one other thing to address: in the secp256k1_fe_mul (or something like that) function, all but the last leg are multiplied by the constant R. This gives a result different from the one I got when I calculated an example in Python. So inside the fe_mul function, I need to avoid multiplying the values in the result (stack) by R, and send that multiplication to a temporary instead.
|
|
Pieter Wuille
|
|
December 28, 2022, 03:19:38 PM |
|
Quote from: NotATether on December 27, 2022, 05:43:41 PM
Alright. There's one other thing to address: in the secp256k1_fe_mul (or something like that) function, all but the last leg are multiplied by the constant R. This gives a result different from the one I got when I calculated an example in Python. So inside the fe_mul function, I need to avoid multiplying the values in the result (stack) by R, and send that multiplication to a temporary instead.

That makes no sense; fe_mul just multiplies two field elements modulo p. That R constant is an implementation detail that even differs between 32-bit and 64-bit platforms. It's not actually multiplying the result by this value. Note that field elements are internally stored in a denormalized representation where the limbs can overflow. If you want to convert it to a portable format, use fe_get_b32.
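To illustrate why a constant like R can appear inside a modular multiplication without the result being "multiplied by R", here is a minimal Python sketch. It assumes a layout of five 52-bit limbs (an assumption about the 64-bit field code, not something stated in this thread), so that R is simply 2^260 reduced modulo p. The name mul_fold is hypothetical and this is not the library's actual routine.

```python
# Sketch: folding the high part of a product back in with R = 2^260 mod p.
P = 2**256 - 2**32 - 977          # secp256k1 field prime
LIMB_BITS = 52                    # assumed limb width (64-bit layout)
TOP = 5 * LIMB_BITS               # 260 bits held by five limbs
R = pow(2, TOP, P)                # 2^260 mod p

def mul_fold(a: int, b: int) -> int:
    """Multiply, then fold everything above bit 260 back in using R."""
    t = a * b
    while t >> TOP:
        hi, lo = t >> TOP, t & ((1 << TOP) - 1)
        t = lo + hi * R           # hi * 2^260 is congruent to hi * R (mod p)
    return t % P                  # final normalization into [0, p)

a = 0x1234567890ABCDEF << 180
b = P - 12345
print(hex(R))                             # 0x1000003d10
print(mul_fold(a, b) == (a * b) % P)      # True
```

The multiplication by R only replaces the overflowing high part with an equivalent smaller value; the residue modulo p is the same as for a plain `(a * b) % P`.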
|
I do Bitcoin stuff.
|
|
|
NotATether
|
|
December 28, 2022, 04:41:36 PM |
|
Quote from: Pieter Wuille on December 28, 2022, 03:19:38 PM
Note that field elements are internally stored in a denormalized representation where the limbs can overflow. If you want to convert it to a portable format, use fe_get_b32.

That must be why I was getting different results while testing. I'll check out this function and run my C++ and Python mod-mul programs again. It will be interesting to see the results of this.
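A small Python sketch of why raw limbs can differ while the field element is the same. It assumes five 52-bit limbs; the helper names (to_limbs, limbs_to_int, to_b32) are hypothetical and only model the idea of normalizing before serializing to 32 big-endian bytes, which is the portable form to compare against a Python `(a * b) % P` result.

```python
# Sketch: different limb vectors, same field element after normalization.
P = 2**256 - 2**32 - 977
LIMB_BITS = 52                    # assumed limb width

def to_limbs(x):
    """Split an integer below 2^260 into five fully carried 52-bit limbs."""
    mask = (1 << LIMB_BITS) - 1
    return [(x >> (LIMB_BITS * i)) & mask for i in range(5)]

def limbs_to_int(limbs):
    """Recombine limbs; limbs are allowed to exceed 52 bits (overflow)."""
    return sum(l << (LIMB_BITS * i) for i, l in enumerate(limbs))

def to_b32(limbs):
    """Normalize modulo p and serialize as 32 big-endian bytes."""
    return (limbs_to_int(limbs) % P).to_bytes(32, "big")

x = 0xDEADBEEF * 2**200 + 0xCAFEBABE
tidy = to_limbs(x)

# Same value with one unit of limb 4 left un-carried in limb 3.
overflowed = list(tidy)
overflowed[4] -= 1
overflowed[3] += 1 << LIMB_BITS

# Same residue, but stored as x + p (not reduced modulo p).
plus_p = to_limbs(x + P)

print(tidy == overflowed, tidy == plus_p)                  # False False
print(to_b32(tidy) == to_b32(overflowed) == to_b32(plus_p))  # True
```

Comparing limbs directly would report a mismatch in all three cases, while the serialized 32-byte values agree.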
|
|
esa_a
Newbie
Offline
Activity: 1
Merit: 0
|
|
February 09, 2023, 08:55:12 PM |
|
Quote
And moreover, VanitySearch is only fast at "Point pub = secp256k1->ComputePublicKey(&privKey);" because it generates a "Point GTable[256*32];" to be used in "Q = Add2(Q, GTable[256 * i + (b-1)])", with Jacobian coordinates as the calculation scheme. If you use Add2 to add one million points in sequence, it is far faster than classic point addition with inversion. But to use any Jacobian point further after the calculation, you first need to call "Q.Reduce();", which costs the same as an inversion. In that situation the classic scheme using GMP becomes faster, since you just add a point and can use it immediately without any Reduce().

Hey, did you try piggypiggy's implementation? Was it any faster?
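The trade-off in the quoted post can be sketched in Python by counting modular inversions. This is only an illustration under stated assumptions: the function names (affine_add, jacobian_add_affine, to_affine) are hypothetical and are not VanitySearch's Add2 or Reduce(); to_affine merely plays the role the post attributes to Reduce(). Python timings say nothing about the C++ code, so only the inversion counts are compared.

```python
# Sketch: n*G by repeated addition, affine vs. Jacobian, counting inversions.
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
inversions = 0

def inv(x):
    """Modular inverse via Fermat's little theorem; counts each call."""
    global inversions
    inversions += 1
    return pow(x % P, P - 2, P)

def affine_add(p1, p2):
    """Classic affine addition: one inversion per call."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        if (y1 + y2) % P == 0:
            raise ValueError("point at infinity is not handled in this sketch")
        lam = 3 * x1 * x1 * inv(2 * y1) % P      # doubling
    else:
        lam = (y2 - y1) * inv(x2 - x1) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def jacobian_add_affine(q, p2):
    """Jacobian point plus affine point: no inversion."""
    X1, Y1, Z1 = q
    x2, y2 = p2
    zz = Z1 * Z1 % P
    H = (x2 * zz - X1) % P
    Rr = (y2 * zz * Z1 - Y1) % P
    if H == 0:
        if Rr != 0:
            raise ValueError("point at infinity is not handled in this sketch")
        S = 4 * X1 * Y1 * Y1 % P                 # doubling (curve a = 0)
        M = 3 * X1 * X1 % P
        X3 = (M * M - 2 * S) % P
        Y3 = (M * (S - X3) - 8 * pow(Y1, 4, P)) % P
        return X3, Y3, 2 * Y1 * Z1 % P
    H2 = H * H % P
    H3 = H * H2 % P
    X3 = (Rr * Rr - H3 - 2 * X1 * H2) % P
    Y3 = (Rr * (X1 * H2 - X3) - Y1 * H3) % P
    return X3, Y3, Z1 * H % P

def to_affine(q):
    """Convert Jacobian to affine: the single inversion is paid here."""
    X, Y, Z = q
    zi = inv(Z)
    zi2 = zi * zi % P
    return X * zi2 % P, Y * zi2 * zi % P

def sum_affine(n):
    q = G
    for _ in range(n - 1):
        q = affine_add(q, G)
    return q

def sum_jacobian(n):
    q = (G[0], G[1], 1)
    for _ in range(n - 1):
        q = jacobian_add_affine(q, G)
    return to_affine(q)

inversions = 0
a = sum_affine(100)
inv_affine = inversions
inversions = 0
j = sum_jacobian(100)
inv_jacobian = inversions
print(a == j, inv_affine, inv_jacobian)   # True 99 1
```

When every intermediate point is needed in affine form, the Jacobian route would need a to_affine call after each step, which removes the advantage, as the quoted post argues.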
|
|
|
|
alexeyneu
Member
Offline
Activity: 312
Merit: 30
|
|
February 09, 2023, 11:03:30 PM Last edit: February 10, 2023, 12:02:10 AM by alexeyneu Merited by vapourminer (1), Welsh (1) |
|
Quote
Ok, let's try to write it in parts using OpenAI.

Write in assembly a code to add two numbers of 512 bits

    ; Define the two 512-bit numbers to be added
    .data
    num1: .quad 0x1234567890ABCDEF, 0x1234567890ABCDEF, 0x1234567890ABCDEF, 0x1234567890ABCDEF
    num2: .quad 0x0987654321FEDCBA, 0x0987654321FEDCBA, 0x0987654321FEDCBA, 0x0987654321FEDCBA

    ; Define the 512-bit result
    result: .quad 0, 0, 0, 0

    ; Main program
    .text
    .global main
    main:

    ; Initialize the result to 0
    mov $0, %rax
    mov %rax, result(,%rax,8)

    ; Loop through each of the 64-bit words in the numbers
    mov $0, %rax
    loop:

    ; Add the corresponding 64-bit words from num1 and num2
    movq num1(,%rax,8), %rdx
    addq num2(,%rax,8), %rdx

    ; Store the result in the corresponding 64-bit word in the result
    mov %rdx, result(,%rax,8)

    ; Increment the loop counter and check if it has reached 4
    inc %rax
    cmp $4, %rax
    jne loop

    ; End the program
    ret

But dude, it's sandbox asm, something like this stuff: https://stackoverflow.com/questions/64726805/understanding-a-basic-assembly-code-with-lea-instruction#comment114452574_64726805
It's not an executable, nor a ring-zero OS part. It's nothing, really. Why use asm then?
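Separately from the question of why to use asm at all, the quoted listing declares only four .quad words per operand (256 bits, not 512) and adds each pair of words with addq, with no carry passed between words. A minimal Python model of eight 64-bit words shows what that omission changes. The helper names are hypothetical, and the carry variable stands in for what an add-with-carry instruction would propagate.

```python
# Sketch: multi-word addition with and without carry propagation.
MASK64 = (1 << 64) - 1
WORDS = 8                                   # 8 x 64 bits = 512 bits

def to_words(n):
    """Split an integer into little-endian 64-bit words."""
    return [(n >> (64 * i)) & MASK64 for i in range(WORDS)]

def from_words(words):
    return sum(w << (64 * i) for i, w in enumerate(words))

def add_with_carry(a, b):
    """Word-by-word addition that passes the carry to the next word."""
    out, carry = [], 0
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s & MASK64)
        carry = s >> 64
    return out, carry

def add_without_carry(a, b):
    """What independent per-word additions compute: carries are lost."""
    return [(x + y) & MASK64 for x, y in zip(a, b)]

a = to_words(2**64 - 1)
b = to_words(1)
good, carry_out = add_with_carry(a, b)
bad = add_without_carry(a, b)
print(from_words(good) == 2**64, carry_out)   # True 0
print(from_words(bad))                        # 0
```

With the operands from the quoted listing no word overflows, so the missing carry would go unnoticed there; it only shows up for inputs such as the ones above.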
|
|
|
|
NotATether
|
|
February 14, 2023, 08:15:08 AM Merited by vapourminer (1) |
|
There is really no point in writing code in assembly unless it uses instructions that are faster than the ones gcc compiles down to. For example, when you compile a C file down to assembly you get a bunch of MOV, CMP, JMP and ADD/XOR/LEA instructions. There are only two ways to make this faster:
1 - you can rearrange the assembly to remove excessive MOVs, so that it uses as few instructions as possible (this will not result in a large performance improvement)
2 - your use case can be accelerated by SIMD instructions (this will result in much faster performance).
|
|
alexeyneu
|
|
February 14, 2023, 07:13:34 PM Last edit: February 14, 2023, 07:42:59 PM by alexeyneu |
|
Quote from: NotATether on February 14, 2023, 08:15:08 AM
There is really no point in writing code in assembly unless it uses instructions that are faster than the ones gcc compiles down to. For example, when you compile a C file down to assembly you get a bunch of MOV, CMP, JMP and ADD/XOR/LEA instructions. There are only two ways to make this faster:
1 - you can rearrange the assembly to remove excessive MOVs, so that it uses as few instructions as possible (this will not result in a large performance improvement)
2 - your use case can be accelerated by SIMD instructions (this will result in much faster performance).

But you can do SIMD in C, with stuff like _mm_setzero_si128(). In a protected-mode OS your executable is just a task with little to no hardware access, so asm doesn't make much sense. If you remove some MOVs from the exe, it will not increase performance.
About this code (or something like it) posted here: it's something stylized as MS-DOS real-mode asm. It's unrelated to the binary produced after it.
|
|
|
|
|