Bitcoin Forum
Author Topic: Test vectors for address generation  (Read 145 times)
ashfame (OP)
February 02, 2021, 06:16:46 PM
Merited by pooya87 (1), ABCbits (1), NotATether (1)
 #1

Hi folks, I am working on a wallet in Go and would like to implement some test vectors.

I have implemented the BIP32 test vectors, which cover deriving private & public keys at certain paths for a specific seed.

I see some are available for BIP39 as well, but they don't seem like a proper set: just the English & Japanese ones included in other implementations, not in the BIP itself.

My main concern is that I am not able to find any test vectors that go all the way from seed to addresses derived as per BIP44 (or other address formats), or that just derive addresses from private keys.

What would get me the highest grade of testing for correctness of wallet functionality? Any guidance is appreciated.
NotATether
February 03, 2021, 03:47:29 AM
 #2

Well, given that you already found the test vectors listed in the spec, you can at least test the procedure between mnemonic generation and the generation of the master private key. Beyond that, the BIP32 spec has test vectors that take a seed and list the extended public and private keys for it.

Chances are that you will have to test the BIP39 test vectors (minus the master private keys) and the derivation paths listed in the BIP32 test vectors separately. After testing seed generation with BIP39, you'd carefully construct its master private and public keys and include those in your own test vector (as BIP32's is not formatted in JSON) that otherwise mirrors all the data contained in BIP32's vectors.

As the private key hex is embedded in the serialized extended private key (it is the last 32 bytes of the 78-byte payload, after the chain code and a 0x00 pad byte), you will have to extract it for each test vector yourself, and similarly derive the private key WIF and the public key hex/RIPEMD160 hash/base58 representation, and put all that in your test vector too. There are no comprehensive test vectors that cover all those parts at once (or any of them, even).

Of course, next you'd put your test vectors on GitHub so that other developers can use them to test their software.

pooya87
February 03, 2021, 04:38:53 AM
 #3

I suspect a code smell based on your question.

If you are testing your BIP39 or BIP32 code then you shouldn't be testing it using derived addresses. You should be testing it using the private keys. Then an entirely separate part of your program should test the private key to public key conversion, and another part should test public key to the different address types. After all, there is no difference between a private key derived from a BIP32 seed and a private key that is randomly generated.
The public key to address tests can also be broken down into smaller parts. Public key to hash is one part, the hash being handed to the correct encoder is another part, and the encoder encoding the given data correctly is another part. All of these have test vectors now.

ashfame (OP)
February 03, 2021, 05:38:00 AM
 #4

@NotATether Yes, I have implemented the BIP32 test vectors as specified in BIP32. I will be implementing the BIP39 test vectors today (at least English to begin with). Beyond that, I get that in the usual sense we write test vectors ourselves because we know how something should behave, or rather we can define & judge the output well. But in cryptography, since that's difficult, I am looking for test vectors written by people smarter than me, so that I can potentially find issues with my own implementation and/or with the libraries I am using (github.com/btcsuite).

For example, https://github.com/btcsuite/btcutil/pull/182 was fixed a few months back & it was related to child derivations. If I were calculating my own test vectors, I wouldn't be able to spot this until it's brought to light. Using well-defined test vectors from bigger projects & more experienced developers in the space is reassuring to some extent & helps build confidence in my project.

It's all on GitHub already. I am just extracting the code from two different codebases (my project from last year, never launched) into a single package that my project can use or other implementations can rely on, so I can finally release the entire project.
My current repo is here - https://github.com/ashfame/btcwallet and the test vectors implemented so far are here - https://github.com/ashfame/btcwallet/blob/master/wallet_test.go

@pooya87 Correct, what you described is exactly what I am trying to do. I'm just looking for my source of truth, I guess? Being able to check whether the different addresses generated from any private key work or are broken is good enough to start, even if that doesn't tell me where the problem lies (i.e. the hash, the encoder, etc.).
pooya87
February 03, 2021, 06:26:02 AM
 #5

Quote from: ashfame
For example, https://github.com/btcsuite/btcutil/pull/182 was fixed a few months back & it was related to child derivations.
This is a known bug in implementations that don't treat key derivation as a key derivation function (like any KDF, it should work purely with bytes instead of constantly converting between bytes and integers)! It's another example of code smell and tightly coupled code.
It was found 5 years ago and the BIP already has a test vector for it: https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki#test-vector-3
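
To illustrate the general class of bug (this is a sketch, not the btcutil code, and I have not checked that it matches exactly what the linked PR changed): when a 256-bit value passes through an integer type, its leading zero bytes can be lost on the way back to bytes. The helper names and the example value below are made up for the illustration.

```go
package main

import (
	"fmt"
	"math/big"
)

// exampleScalar returns a made-up 256-bit value whose most significant byte
// is zero. It is illustrative only and not taken from any BIP test vector.
func exampleScalar() *big.Int {
	b := make([]byte, 32)
	for i := 1; i < len(b); i++ {
		b[i] = 0x11
	}
	return new(big.Int).SetBytes(b)
}

// naiveKeyBytes shows the risky pattern: big.Int.Bytes returns the minimal
// big-endian encoding, so leading zero bytes are dropped.
func naiveKeyBytes(k *big.Int) []byte {
	return k.Bytes()
}

// paddedKeyBytes always returns exactly 32 bytes, left-padded with zeros.
// FillBytes panics if k does not fit, so callers must pass a value < 2^256.
func paddedKeyBytes(k *big.Int) []byte {
	return k.FillBytes(make([]byte, 32))
}

func main() {
	k := exampleScalar()
	fmt.Printf("naive:  %d bytes %x\n", len(naiveKeyBytes(k)), naiveKeyBytes(k))
	fmt.Printf("padded: %d bytes %x\n", len(paddedKeyBytes(k)), paddedKeyBytes(k))
}
```

If the 31-byte form is fed into the next HMAC or serialization step, every key below that point in the tree comes out different, which is the situation test vector 3 is meant to catch.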

ashfame (OP)
February 03, 2021, 07:43:02 AM
 #6

@pooya87 Right, so you understand why I am trying to look for test vectors that I can include and run against my code, yes?

I have already included the BIP32 tests and they pass, but I'm looking for more. I'm currently implementing the BIP39 tests, but what other test vectors must I test against?
pooya87
February 03, 2021, 08:10:23 AM
Last edit: February 03, 2021, 08:24:40 AM by pooya87
 #7

Quote from: ashfame
@pooya87 Right, so you understand why I am trying to look for test vectors that I can include and run against my code, yes?

I have already included the BIP32 tests and they pass, but I'm looking for more. I'm currently implementing the BIP39 tests, but what other test vectors must I test against?
You should test the parts that are YOUR code, not the libraries you are using. For example your TestBIP32SpecTestVector is basically repeating the tests that your library already has (https://github.com/btcsuite/btcutil/blob/master/hdkeychain/extendedkey_test.go). If you don't trust the library to be correct then you shouldn't be using it in the first place.
The tests you write should only test your own code. For example they should test the part where you split the path into its components and then pass the resulting uint32 values to the third-party (library) code to derive the child.

If I were you I'd pull this part out and put it in a separate method that returns a []uint32:
https://github.com/ashfame/btcwallet/blob/e24bad6fa3482c796dc1c8a1e0d588a93a489d22/wallet.go#L223-L242
Then write an extensive number of tests for that method, like this:
Code:
m (case where path refers to the master itself)
m/ (same but it has an extra separator)
m/a (has invalid number)
m/1 (path has one index)
m/1/2 (path has two indexes)
m/1H (the index is hardened)
m/1h (a lower case is used)
m/h1 (h is placed in the wrong place)
m/1' (a different indicator which is more common than 'h', you'll find a bug in your code after running this case)
m/4294967295H (out of range value which is possibly another bug in your code)
m/1/2'/3/4'/5/6h/7H (a mixture)
You can also decide how they behave: the first 2 cases can return an error or an empty array, and the last one can be rejected because it is a mixture, or you can be flexible and accept that too.
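
A minimal sketch of such a method in Go, to make the cases above concrete. The name ParsePath and the error messages are hypothetical (they are not from the linked repo), and the behavior shown is just one possible set of decisions: a bare "m" returns an empty slice, a trailing separator is an error, and ', h and H are all accepted as hardened markers, even mixed.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// hardenedOffset is 2^31, the first hardened child index in BIP32.
const hardenedOffset = uint32(0x80000000)

// ParsePath converts a path such as "m/1/2'/3h" into child indexes.
// It only parses; deriving keys is left to the library.
func ParsePath(path string) ([]uint32, error) {
	parts := strings.Split(path, "/")
	if parts[0] != "m" {
		return nil, errors.New("path must start with m")
	}
	indexes := make([]uint32, 0, len(parts)-1)
	for _, p := range parts[1:] {
		if p == "" {
			return nil, errors.New("empty path component")
		}
		hardened := false
		if last := p[len(p)-1]; last == '\'' || last == 'h' || last == 'H' {
			hardened = true
			p = p[:len(p)-1]
		}
		// bitSize 31 rejects anything >= 2^31, so adding the hardened
		// offset below can never overflow a uint32.
		n, err := strconv.ParseUint(p, 10, 31)
		if err != nil {
			return nil, fmt.Errorf("invalid index %q: %w", p, err)
		}
		idx := uint32(n)
		if hardened {
			idx += hardenedOffset
		}
		indexes = append(indexes, idx)
	}
	return indexes, nil
}

func main() {
	cases := []string{"m", "m/", "m/a", "m/1", "m/1/2", "m/1H", "m/1h",
		"m/h1", "m/1'", "m/4294967295H", "m/1/2'/3/4'/5/6h/7H"}
	for _, c := range cases {
		idx, err := ParsePath(c)
		fmt.Printf("%-22s -> %v, err=%v\n", c, idx, err)
	}
}
```

With the parsing isolated like this, each line of the list becomes one row in a table-driven test, and none of them needs any cryptography to run.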

Now your GetNodeKeys needs far fewer tests, and you trust that the node.Derive(deriveIndex) method, which calls the third-party code, behaves correctly. You can also verify that behavior by checking the source code of the third-party library, like the test file I linked above.

EDIT: added 2 more test vectors.

ashfame (OP)
March 03, 2021, 11:40:55 AM
 #8

@pooya87

Thanks for your insightful answer! It was very helpful in understanding what I needed to know. Highly appreciated!

And sorry, I opened a bunch of tabs & then life happened, so it took me a while to get back to you.

You are right, my unit tests should only test my code and not anything else.
My attempt to retest BIP32 & BIP39 would fall under integration testing, because I am testing the correctness of my package interacting with another tested package.
So I am setting aside the strict definition of unit testing there. Please bear with me on that, but I am open to hearing any comments you may have.

About your point on pulling the selected piece of code out into a function of its own, that's very welcome advice, as I need to get better on that front. I have refactored it & defined unit tests for how I arrive at derivation indexes from the path string. Please see here: https://github.com/ashfame/btcwallet/commit/f057f383a3bbc136dc48dac5ad2e239be9115a95

I will be adding more unit tests, but with functions involving cryptographic operations like deriving addresses, what do I use as my source of truth? This is something I am still struggling with.
pooya87
March 04, 2021, 05:31:32 AM
 #9

Quote from: ashfame
I will be adding more unit tests, but with functions involving cryptographic operations like deriving addresses, what do I use as my source of truth? This is something I am still struggling with.
That's a tough question. For any part, finding existing test vectors or creating them yourself using the reference implementation (Bitcoin Core) is a good idea. For example you could create your edge-case private keys, import them into Core and get their public keys and addresses. Core also has a lot of test vectors that could be used.

I'd still continue with separation of concerns to make testing easier. For example, address derivation has the following parts that can be tested separately:
1. From entropy to child keys (BIP32), which is computing a bunch of HMAC-SHA512 values and some elliptic curve cryptography. Test vectors are found in BIP32.
2. From child private key to public key. Test vectors are found in NIST standards, but they are not needed if you use a library to do the conversion.
3. From public key to byte array, which is important because the compressed encoding must always be 33 bytes no matter what the value is (it may need padding with zeros if x is smaller than 32 bytes).
4. From that byte[] to hash. You would be using a library here too, so no need for tests, but test vectors are found in the respective RFC docs, and NIST also has a lot of hash tests.
5. From hash to address, which is the encoding part. Most base58 libraries have a lot of test vectors (Bitcoin Core also has them), and bech32 tests can be found in BIP173.
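
As a rough sketch of how step 5 can be tested on its own, here is a standard-library-only Base58Check encoder for a mainnet P2PKH address. It takes the 20-byte hash as input, so no key derivation or RIPEMD160 code is involved. The function names are made up for the illustration, and a real project would more likely call its base58 library here.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

// Bitcoin's base58 alphabet (no 0, O, I or l).
const alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// base58Encode encodes data as base58, mapping each leading zero byte to '1'.
func base58Encode(data []byte) string {
	x := new(big.Int).SetBytes(data)
	radix := big.NewInt(58)
	mod := new(big.Int)
	var out []byte
	// Digits come out least significant first and are reversed at the end.
	for x.Sign() > 0 {
		x.DivMod(x, radix, mod)
		out = append(out, alphabet[mod.Int64()])
	}
	for _, b := range data {
		if b != 0 {
			break
		}
		out = append(out, '1')
	}
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return string(out)
}

// p2pkhAddress builds version byte 0x00 || hash160 || 4-byte checksum,
// where the checksum is the first 4 bytes of double SHA-256 of the payload.
func p2pkhAddress(hash160 []byte) string {
	payload := append([]byte{0x00}, hash160...)
	first := sha256.Sum256(payload)
	second := sha256.Sum256(first[:])
	return base58Encode(append(payload, second[:4]...))
}

func main() {
	zero := make([]byte, 20) // all-zero hash160, used only as an easy edge case
	fmt.Println(p2pkhAddress(zero))
}
```

The all-zero hash is a handy edge case because it exercises the leading-zero handling: it should encode to the well-known address 1111111111111111111114oLvT2.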

NotATether
March 04, 2021, 09:18:11 AM
 #10

Quote from: pooya87
4. From that byte[] to hash. You would be using a library here too, so no need for tests, but test vectors are found in the respective RFC docs, and NIST also has a lot of hash tests.

I'd be cautious about assuming that third-party libraries have been fully checked against their own test vectors, because often those test vectors do not exist. A NIST function that's in a standard library and is maintained by core developers is more likely to have test vectors written for it (and verified with them) than some library written by one guy.

To Golang's credit, they do have test vectors for their built-in hmac, sha256 and sha512 packages. Only RIPEMD160 is missing, both the tests and the entire package, which is arguably more important than everything else I mentioned, because it influences the base58 tests, and for those you'd want to pass actual hashes as input and not arbitrary text. It is also not defined by NIST, which makes finding test vectors for it harder, since NIST is where I believe most other implementations pull their tests from.

However, the OpenSSL source code does have some test inputs and outputs, and it has some other goodies too, like (language-agnostic) PBKDF and ECC inputs/outputs, which can be used by someone trying to make their own tests.

My point is that some ripemd160 Golang package out there probably won't have similar tests.
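
For anyone who wants to check a hash package themselves, a known-answer test is only a few lines. The sketch below runs the widely published SHA-256 values for the empty string and "abc" against the standard library. The vector type and the checkVectors helper are made-up names, and the hash is passed in as a function so that, under the same assumptions, the harness could be pointed at a third-party RIPEMD160 package with vectors taken from its specification.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// vector pairs an input string with the expected digest in lowercase hex.
type vector struct {
	input   string
	wantHex string
}

// Published SHA-256 digests for two standard inputs.
var sha256Vectors = []vector{
	{"", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},
	{"abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"},
}

// sha256Sum adapts sha256.Sum256 to the func([]byte) []byte shape used below.
func sha256Sum(b []byte) []byte {
	sum := sha256.Sum256(b)
	return sum[:]
}

// checkVectors returns an error describing the first mismatch, or nil.
func checkVectors(hash func([]byte) []byte, vectors []vector) error {
	for _, v := range vectors {
		got := hex.EncodeToString(hash([]byte(v.input)))
		if got != v.wantHex {
			return fmt.Errorf("input %q: got %s, want %s", v.input, got, v.wantHex)
		}
	}
	return nil
}

func main() {
	if err := checkVectors(sha256Sum, sha256Vectors); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("all vectors passed")
}
```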

pooya87
March 04, 2021, 10:09:44 AM
 #11

Quote from: NotATether
I'd be cautious about assuming that third-party libraries have been fully checked against their own test vectors, because often those test vectors do not exist.
You are correct, but such libraries should never be used in the first place.

Quote from: NotATether
It is also not defined by NIST, which makes finding test vectors for it harder,
You can always find the original author of the hash algorithm and use the test vectors/code that they have provided: https://homes.esat.kuleuven.be/~bosselae/ripemd160/#Outline
