Bitcoin Forum
November 03, 2024, 05:33:34 AM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 3 4 5 6 [7] 8 9 10 11 12 13 14 15 16 »  All
  Print  
Author Topic: Deterministic wallets  (Read 48369 times)
grau
Hero Member
*****
Offline Offline

Activity: 836
Merit: 1030


bits of proof


View Profile WWW
April 17, 2013, 05:51:40 PM
 #121

I suggest to set and serialize maximum i for each extended key. Such serialized string would define the wallet keys without the need of a regular backup until maximum number of keys not exceeded.
It would also facilitate audit since the set of possible descendants of a key would be of a practicable size. Also the bloom filter could be precomputed for SPVs including all possible addresses derivable.

I'm not convinced. Part of the point is to be able to generate more keys on the fly as needed, without all of them being known in advance, or even be known how many there are.

There are use cases where knowing all possible keys (or addresses) in advance is an advantage. I mentioned two: audit and bloom filter computation. The limit could be optional with default 2^31 as is by the definition.
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
April 21, 2013, 09:30:56 PM
Last edit: April 21, 2013, 10:28:48 PM by iddo
 #122

Hello all,

I've made some changes to BIP32 to deal with the issue of revealing a master key by exposing both public parent and private child. The change is important (the child key derivation function changes), but it's maximally compatible with the previous version. In short, it means you can now select whether you want "public derivation" (the older, type-2 style) or "secret derivation" (safer, type-1) in each step, and the default wallet structure uses secret derivation for deriving account keys from the master key. This means that a leak of a secret account key (or below) cannot result in revealing the master secret.

As I don't like encumbering the key derivation with information about wallet structures, the derivation type is not hardcoded but can be selected at each stage individually.

Any comments from iddo, or other people?

Very nice, I like your BIP32 modifications with type-1/type-2 derivation according to the highest bit of the index. it could give flexibility with scenarios that ErebusBat mentioned, something like person A giving person B that he trusts the ability to derive a branch (with privkeys) in his wallet, and then person B wishes to give an untrusted person C the ability to derive pubkeys in a sub-branch, so to enhance security he can break the homomophism link by doing type-1 derivation (person B couldn't have done the type-1 derivation if it was only possible at depth=1 of the "subaccounts", because he's not trustworthy enough to have access to that depth, and besides keeping the layout of extending the particular branch instead of starting a new subaccount could be useful for accounting and so on). Probably there would be more interesting use cases than this one, that I haven't thought of.

I suppose that with this updated BIP32, some of the complexity will be transferred to the people who should do GUI design for wallet layouts Smiley

As discussed, it's important to do the secret type-1 derivation from the master node (depth 0) to the subaccounts (depth 1), since if the master node leaks then the entire wallet leaks. So the derivation from depth 0 to depth 1 should be type-1 by default, and maybe if you wish to be extra safe (about preventing people from shooting themselves in the foot) then you should specify that type-2 derivation from depth 0 to depth 1 isn't allowed according to the BIP32 standard, and enforce that in the Satoshi client. BTW, with the updated BIP32, I guess that there's no need to save the initial entropy S in wallet.dat ?
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
April 21, 2013, 09:44:41 PM
 #123

Now a BIP32 specific question:
Why are the child chaincodes derived from both the parent pubkey and the parent chaincode rather than just from the parent chaincode alone? What about this derivation:
K_n:=H(c_par||n)*K_par
c_n:=H(H(c_par||n))
where H is some cryptographic hash. Looks easier to me and doesn't require 512 bit digests. Am I missing some point?

What do you gain by not using all available data? I consider using one 512-bit hash much more elegant than doing several 256-bit hashes.

Seeing that some interna of BIP 32 are being re-discussed, I would like to bump this comment of mine from several month ago. Sorry for repeating myself, but I thought that new applications or use cases, and therefore new opinions, might have come up in the meantime. This question/suggestion is about the "public derivation". Instead of
Code:
I = HMAC-SHA512(Key=cpar, Data=Kpar || i) (with Kpar = kpar*G, EC multiplication)
Split I into two 32-byte sequences, IL and IR.
ki is equal to IL*kpar (mod n).
ci is equal to IR.
why don't we use
Code:
IL = HMAC-SHA256(cpar,Kpar||i)
IR = HMAC-SHA256(cpar,i)
the difference being that ci depends only on cpar and i (as before) but not on Kpar anymore.

The advantage that I see with having ci independent of Kpar is that it opens up the possibility of updating all extended pubkeys in the tree (by updating the root extended pubkey) without touching the chaincodes.

Consider a hierarchy of servers where each server holds the extended pubkey corresponding to its position in the hierarchy. Note what is required when a server B receives his extended pubkey from the higher level server A. In the communication between A and B the pubkey-part of B's extended pubkey has to be authenticated (if B uses the wrong pubkey, money sent to it will be lost) whereas the chaincode-part has to be encrypted (by default chaincodes are to kept secret). In an update throughout the whole tree we could spare us the encryption part of that communication (if ci was indepent of Kpar) because the chaincodes remain unchanged.

Countering the above cited comment: what do we gain by having ci depend on Kpar?

I'm not sure that I understood your use case, could you please explain the exact scenario more clearly? if you had an ancestor chaincode and pubkey, then you can derive the descendant chaincode and there's no necessity to encrypt and send it to you, and if you don't have any ancestor chaincode then an encrypted chaincode must be sent to you, no?

Maybe we do gain some security by having ci depend on Kpar, because commonly just the hashed address of Kpar is published (until the coins get spent).
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
April 21, 2013, 10:04:28 PM
 #124

I suggest to set and serialize maximum i for each extended key. Such serialized string would define the wallet keys without the need of a regular backup until maximum number of keys not exceeded.

Huh? Why would regular backups be needed? The major advantage of deterministic wallets is that backups aren't needed (see here).
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
April 21, 2013, 10:10:09 PM
 #125

Regarding using addition instead of multiplication: we switched to multiplication at some point, but I can't remember why - the only thing it adds is removing the ability to compute I_L from a parent and a child pubkey, which doesn't seem to gain you anything.

With the additive variant, don't you get I_L*G from parent and a child pubkeys, rather than I_L itself? I'm still not sure what are the advantages of the multiplicative variant.

The additive version would be faster because the "fast exponentiation" algorithm that computes I_L*G can use pre-computed values of x*G for small x. I believe OpenSSL does something like that.

Not saying that speed should be of any concern here, though..

I'm still hoping to hear some explanation on why the multiplicative variant is advantageous.
It seems to me that it makes sense to re-consider this issue before BIP32 is finalized.
etotheipi
Legendary
*
expert
Offline Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
April 21, 2013, 10:38:21 PM
 #126

I'm still hoping to hear some explanation on why the multiplicative variant is advantageous.
It seems to me that it makes sense to re-consider this issue before BIP32 is finalized.

Can someone precisely specify the additive variant of the CKD function?  I'm fairly certain I know, but I don't want to end up missing a detail and evaluating something else.


Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here!    (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
April 21, 2013, 10:55:22 PM
 #127

I'm still hoping to hear some explanation on why the multiplicative variant is advantageous.
It seems to me that it makes sense to re-consider this issue before BIP32 is finalized.

Can someone precisely specify the additive variant of the CKD function?  I'm fairly certain I know, but I don't want to end up missing a detail and evaluating something else.


It's described by gmaxwell in the OP of this thread.
As currently in BIP32 you would still have I=HMAC-SHA512(Key=c_par, Data=K_par || i)
And then:
k_i=k_par+I_L
K_i=K_par+G*I_L
c_i=I_R
Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1181


View Profile WWW
April 21, 2013, 10:58:15 PM
 #128

With the additive variant, don't you get I_L*G from parent and a child pubkeys, rather than I_L itself? I'm still not sure what are the advantages of the multiplicative variant.

We actually just had a discussion about this, and I'm in favor of changing this. The proposal would be:
  • k_i = k_par + I_L
  • K_i = K_par + I_L*G

I've been holding back on changing this, as speed wasn't a priority really. However, it is a lot easier to do the required operations for public derivation in constant time, than for the multiplicative case, which means it's easier to secure against timing attacks (and we are, very intensively, dealing with private material here). Also, it means you only need almost the same primitives as needed for ECDSA in the first place.

It is not less secure, as you still need to solve the ECDLP to find I_L from I_L*G (=the public key, which is exposed). Assuming a weaker version of the scheme, where it's just K_i = K_par + (i*c_par)*G, and assuming there is a transaction with inputs corresponding to i values i1 and i1+1, and a transaction with values i2 and i2+1, an attacker could detect these were created by someone using the same chain code twice. This would not be possible in the multiplicative version, and also not in the real scheme where the 'tweak' I_L is computed using an HMAC.

Anyone objects to changing this?

I do Bitcoin stuff.
etotheipi
Legendary
*
expert
Offline Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
April 21, 2013, 11:02:26 PM
 #129

It's described by gmaxwell in the OP of this thread.

Thanks for that.   Just for myself, I copied it below, adding sub tags for subscripts, and bolding EC points (but not scalars)

As currently in BIP32 you would still have I=HMAC-SHA512(Key=cpar, Data=Kpar || i)
And then:

ki=kpar+IL
Ki=Kpar+G*IL
ci=IR


Pieter: what's the difference between what you just posted and what iddo posted?

Also, aren't timing attacks supposed to already be "covered" if you are using a vetted implementation?  I'm not opposed to making it easier to implement, but I thought that was part of the reason to use, say, OpenSSL, because they already diligently covered a bunch of these side-channel concerns.


Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here!    (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)
Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1181


View Profile WWW
April 21, 2013, 11:21:38 PM
 #130

Pieter: what's the difference between what you just posted and what iddo posted?
Nothing, timing coincidence.

Quote
Also, aren't timing attacks supposed to already be "covered" if you are using a vetted implementation?  I'm not opposed to making it easier to implement, but I thought that was part of the reason to use, say, OpenSSL, because they already diligently covered a bunch of these side-channel concerns.

OpenSSL's ECDSA_sign probably guarantees this (I haven't verified), but as generic EC multiplication isn't required by ECDSA_sign (only by ECDSA_verify), it's not guaranteed to have a constant-time version (and a variant-time one is certainly faster).

I do Bitcoin stuff.
grau
Hero Member
*****
Offline Offline

Activity: 836
Merit: 1030


bits of proof


View Profile WWW
April 22, 2013, 07:10:22 AM
 #131

I suggest to set and serialize maximum i for each extended key. Such serialized string would define the wallet keys without the need of a regular backup until maximum number of keys not exceeded.

Huh? Why would regular backups be needed? The major advantage of deterministic wallets is that backups aren't needed (see here).
Assume you lost your wallet storing the unspent transactions, but you have the root key for the BIP32 generator from a cold store. Theoretically nothing is lost since you can scan the block chain for all outputs that can be spent with descendants of the root key, but practically you are not able to do this if you do not know what sequences were used.

Therefore you either have to backup the sequences used or simpler set a limit to them in beforehand that is a practicable magnitude for a scan.
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
April 22, 2013, 07:59:21 AM
 #132

I suggest to set and serialize maximum i for each extended key. Such serialized string would define the wallet keys without the need of a regular backup until maximum number of keys not exceeded.

Huh? Why would regular backups be needed? The major advantage of deterministic wallets is that backups aren't needed (see here).
Assume you lost your wallet storing the unspent transactions, but you have the root key for the BIP32 generator from a cold store. Theoretically nothing is lost since you can scan the block chain for all outputs that can be spent with descendants of the root key, but practically you are not able to do this if you do not know what sequences were used.

Therefore you either have to backup the sequences used or simpler set a limit to them in beforehand that is a practicable magnitude for a scan.

As Pieter mentioned in the link that I gave you, it should be very practical to scan in mostly all cases of wallet layouts, the branching factor and depth are quite tiny.
thanke
Member
**
Offline Offline

Activity: 104
Merit: 10


View Profile
April 22, 2013, 08:09:24 AM
 #133

The proposal would be:
  • k_i = k_par + I_L
  • K_i = K_par + I_L*G
[ ... ]
It is not less secure, as you still need to solve the ECDLP to find I_L from I_L*G (=the public key, which is exposed). Assuming a weaker version of the scheme, where it's just K_i = K_par + (i*c_par)*G, and assuming there is a transaction with inputs corresponding to i values i1 and i1+1, and a transaction with values i2 and i2+1, an attacker could detect these were created by someone using the same chain code twice. This would not be possible in the multiplicative version, and also not in the real scheme where the 'tweak' I_L is computed using an HMAC.

Can you specify what information your are assuming your attacker has? Does he know c_par and is just scanning all pubkeys he gets to see for a difference of c_par*G? Or how does his "detection" work?
Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1181


View Profile WWW
April 22, 2013, 09:44:08 AM
 #134

It is not less secure, as you still need to solve the ECDLP to find I_L from I_L*G (=the public key, which is exposed). Assuming a weaker version of the scheme, where it's just K_i = K_par + (i*c_par)*G, and assuming there is a transaction with inputs corresponding to i values i1 and i1+1, and a transaction with values i2 and i2+1, an attacker could detect these were created by someone using the same chain code twice. This would not be possible in the multiplicative version, and also not in the real scheme where the 'tweak' I_L is computed using an HMAC.

Can you specify what information your are assuming your attacker has? Does he know c_par and is just scanning all pubkeys he gets to see for a difference of c_par*G? Or how does his "detection" work?
[/quote]

Sure. I assume the attacker can only observe the blockchain. For any transaction with multiple inputs, he subtracts the pubkeys used in any pair of inputs. If he finds two transactions that result in the same difference, they are very likely keys that are generated from the same extended parent (the difference would be (i-i')*c_par*G. Note that he can't find c_par or i or i' itself, just a likelyhood that c_par is the same. Also note that this doesn't work in the real scheme as not i*c_par but HMAC-SHA512(key=c_par, msg=K_par || i) is used, which guarantees no tweak will be reused.

This was just a thought experiment to see how much we'd have to weaken the construction before the add-vs-multiply makes any difference.

I do Bitcoin stuff.
thanke
Member
**
Offline Offline

Activity: 104
Merit: 10


View Profile
April 22, 2013, 10:25:39 AM
 #135

Sure. I assume the attacker can only observe the blockchain. For any transaction with multiple inputs, he subtracts the pubkeys used in any pair of inputs. If he finds two transactions that result in the same difference, they are very likely keys that are generated from the same extended parent (the difference would be (i-i')*c_par*G. Note that he can't find c_par or i or i' itself, just a likelyhood that c_par is the same. Also note that this doesn't work in the real scheme as not i*c_par but HMAC-SHA512(key=c_par, msg=K_par || i) is used, which guarantees no tweak will be reused.

This was just a thought experiment to see how much we'd have to weaken the construction before the add-vs-multiply makes any difference.

How does the corresponding weaker multiplicative scheme look like that you are comparing it with?
If K_i = (i*c_par)*K_par then you get the same phenomenon, right? The difference would be (i-i')*c_par*K_par, which is recognizable if (K_par,c_par) is reused. Seems to me that additive vs multiplicative derivation is not the point. Instead it's about how i and c_par are combined. Using c_par^i instead of i*c_par looks different at first sight.
iddo
Sr. Member
****
Offline Offline

Activity: 360
Merit: 251


View Profile
April 22, 2013, 10:36:09 AM
 #136

With the additive variant, don't you get I_L*G from parent and a child pubkeys, rather than I_L itself? I'm still not sure what are the advantages of the multiplicative variant.

We actually just had a discussion about this, and I'm in favor of changing this. The proposal would be:
  • k_i = k_par + I_L
  • K_i = K_par + I_L*G

I've been holding back on changing this, as speed wasn't a priority really. However, it is a lot easier to do the required operations for public derivation in constant time, than for the multiplicative case, which means it's easier to secure against timing attacks (and we are, very intensively, dealing with private material here). Also, it means you only need almost the same primitives as needed for ECDSA in the first place.

It is not less secure, as you still need to solve the ECDLP to find I_L from I_L*G (=the public key, which is exposed). Assuming a weaker version of the scheme, where it's just K_i = K_par + (i*c_par)*G, and assuming there is a transaction with inputs corresponding to i values i1 and i1+1, and a transaction with values i2 and i2+1, an attacker could detect these were created by someone using the same chain code twice. This would not be possible in the multiplicative version, and also not in the real scheme where the 'tweak' I_L is computed using an HMAC.

Anyone objects to changing this?

Why does constant time public derivation give security against timing attacks? Isn't it the case that public derivation doesn't use the privkeys, and therefore cannot leak privkeys? Maybe you meant security against leakage of just the chaincodes? I probably simply misunderstood what you're saying there, so it'd be helpful if you explain where exactly we need protection from timing attacks.

Yes, obviously with a variant that uses K_i=K_par+(i*c_par)*G you could detect c_par*G from the sub-branch with i1,i1+2 compared to the sub-branch with i2,i2+1, and maybe you could do much worse stuff as well. The OP also uses the safe version with hash(seed|i) to derive the next values. You can see that the formal proof in post #62 fails for K_i=K_par+(i*c_par)*G because we wouldn't claim that the r_i that's used for (sk_i,pk_i)=keygen(r_i) is pseudorandom. As you mentioned in post #103, if the attacker learns I_L then it doesn't seem like such a big deal, and as mentioned here the attacker needs to solve ECDLP to learn even just I_L, so the additive variant appears to be provably secure, as far as I can see. But of course it wouldn't hurt to show the finalized BIP32 to some elliptic curves crypto experts and ask for more peer review, before including it in the Satoshi client.
thanke
Member
**
Offline Offline

Activity: 104
Merit: 10


View Profile
April 22, 2013, 11:26:56 AM
 #137

I'm not sure that I understood your use case, could you please explain the exact scenario more clearly? if you had an ancestor chaincode and pubkey, then you can derive the descendant chaincode and there's no necessity to encrypt and send it to you, and if you don't have any ancestor chaincode then an encrypted chaincode must be sent to you, no?

Maybe we do gain some security by having ci depend on Kpar, because commonly just the hashed address of Kpar is published (until the coins get spent).

Scenario is this: Server A is holding (K_par,c_par). Server B is being set-up in a different location and is supposed to hold (K_i,c_i). All chaincodes are to be kept secret by the servers holding them, i.e. c_par shall only be known to A (not B) and c_i shall only be known to A and B (no third party). It is irrelevant here who knows K_par and K_i. So B does not derive anything itself. Instead, A does the derivation of (K_i,c_i) and sends the result to B. Of course, the communication of A to B must be authenticated. The point is that on top of that at least c_i must (in the general case) be sent in encrypted form.

I am having problems answering you other questions because I'm not sure what exactly the word "you" refers to.
grau
Hero Member
*****
Offline Offline

Activity: 836
Merit: 1030


bits of proof


View Profile WWW
April 22, 2013, 11:39:47 AM
 #138

I suggest to set and serialize maximum i for each extended key. Such serialized string would define the wallet keys without the need of a regular backup until maximum number of keys not exceeded.

Huh? Why would regular backups be needed? The major advantage of deterministic wallets is that backups aren't needed (see here).
Assume you lost your wallet storing the unspent transactions, but you have the root key for the BIP32 generator from a cold store. Theoretically nothing is lost since you can scan the block chain for all outputs that can be spent with descendants of the root key, but practically you are not able to do this if you do not know what sequences were used.

Therefore you either have to backup the sequences used or simpler set a limit to them in beforehand that is a practicable magnitude for a scan.

As Pieter mentioned in the link that I gave you, it should be very practical to scan in mostly all cases of wallet layouts, the branching factor and depth are quite tiny.

The assumption there is that if a key with a sequence number is not on the block chain, then higher sequences were not used, right?

I think that would not hold. A shop usually offers unique addresses to receive payment for individual orders. Some orders never get paid, therefore their key will not be present in the block chain, however keys with higher sequence number might be. A reconstruction of the wallet would fail at such a gap.
Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1181


View Profile WWW
April 22, 2013, 11:46:44 AM
 #139

I don't think storing the maximal branching inside the serialized format is the right way to go. That doesn't mean wallets can't keep such information separately, but the problem is that you'll always hit more cases that aren't covered. Will you have separate limits of public and private derivation? Will you limit the number of levels below a node? The specific branching factors of all of these nodes? What if someone wants to use a non-standard more complex structure? Inherently, how the tree structure is used as a wallet, and how keys are picked in it, depends on implementation anyway.

For example, say you do this, and limit only at the direct level what the largest index of a subnode is. You're giving out addresses to pay to, a unique one for each invoice. Some are never paid, so you end up with gaps. At some point, you reach the maximal index. What now? Will you switch to a new account? Create an extra tree level? Tell the user to create a new backup? All those problems occur are exactly the same for the case where you didn't know the maximum index, but did have a maximal gap. When it gets exceeded, tell the user his gap is now larger and he needs a new backup.

I say this is all implementation specific, and doesn't belong in the serialization.

I do Bitcoin stuff.
Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1181


View Profile WWW
April 22, 2013, 11:54:50 AM
 #140

Why does constant time public derivation give security against timing attacks? Isn't it the case that public derivation doesn't use the privkeys, and therefore cannot leak privkeys? Maybe you meant security against leakage of just the chaincodes? I probably simply misunderstood what you're saying there, so it'd be helpful if you explain where exactly we need protection from timing attacks.

I was wrong about this, as the problem of key exposure doesn't really exist for public key derivation, where these operations are required. It doesn't hold for chaincodes either, as these aren't tweaked.

This makes me wonder whether it's still worth changing, though the implementation complexity argument still holds.

I do Bitcoin stuff.
Pages: « 1 2 3 4 5 6 [7] 8 9 10 11 12 13 14 15 16 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!