justusranvier
Legendary
Offline
Activity: 1400
Merit: 1013
|
|
December 06, 2012, 12:23:44 PM |
|
Another use case for key hierarchies is shopping sites. When I create an account on Bitmit they should assign me an extended public key that my client saves in order to create all future payment addresses. This provides two advantages:
Security: As long as one does not go undetected during the initial account setup, a MITM attack cannot redirect payments to the attacker's addresses, since my client already knows where to send payments.
Anonymity: To the extent possible I don't want to use outputs from different addresses as inputs in a single transaction. If I need to pay Bitmit 10 BTC but all I have are two 6 BTC outputs on different addresses I can use two transactions to complete the sale by sending 6 BTC to address k and 4 BTC to address k+1. Now it will be more difficult for a third party to examine the blockchain and determine that the same individual controls both addresses.
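A rough sketch of the split described above, in Python. The function name `plan_payments`, the address labels and the dict layout are made up for illustration, and fees are ignored; it only shows the arithmetic of paying 10 BTC from two 6 BTC outputs without ever combining inputs.

```python
from decimal import Decimal

def plan_payments(utxos, amount_due, first_index):
    """Plan one single-input transaction per unspent output.

    utxos: list of (source_label, value) pairs (hypothetical layout).
    Each planned transaction pays the next receiving index that the
    client would derive from the merchant's extended public key.
    Fees are ignored in this sketch.
    """
    remaining = Decimal(amount_due)
    index = first_index
    plan = []
    for source, value in utxos:
        if remaining <= 0:
            break
        value = Decimal(value)
        pay = min(value, remaining)
        plan.append({
            "input": source,
            "pay_to_index": index,   # address k, k+1, ...
            "amount": pay,
            "change": value - pay,   # goes back to the payer
        })
        remaining -= pay
        index += 1
    if remaining > 0:
        raise ValueError("insufficient funds")
    return plan

# Two 6 BTC outputs on different addresses, 10 BTC owed.
plan = plan_payments([("addr_A", "6"), ("addr_B", "6")], "10", first_index=0)
for tx in plan:
    print(tx)
```

The first transaction sends 6 BTC to index 0 and the second sends 4 BTC to index 1, leaving 2 BTC of change.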
|
|
|
|
thanke
Member
Offline
Activity: 104
Merit: 10
|
|
December 06, 2012, 02:55:15 PM |
|
As far as key hierarchies go, do we have any significant application for it yet? I would really like to be convinced that we absolutely need them.
I listed one in an earlier post. If I'm going to buy Bitcoins on an exchange I'd like to be able to give the exchange an extended public key so that they can automatically generate new withdrawal addresses without further interaction. Similarly, I'd like the exchange to give me an extended public key so that my HD-aware client can automatically generate the next unique deposit address every time I want to add Bitcoins to my account.
We only need a key hierarchy as in BIP32 (chaincodes on >=two levels) if we are required to derive 2nd-level chaincodes from 1st-level chaincodes because, for some reason, it is infeasible to derive 2nd-level chaincodes from 1st-level privkeys. To make it clear that we do, let's add this to your example:
- your withdrawals shall all go to savings addresses for which you will not want to access the privkey (1st-level) for a certain time frame (say, years)
- within that time frame you expect to sign up with different exchanges, each of which you want to equip with its own pubkey and chaincode (2nd-level)
Now a BIP32 specific question: Why are the child chaincodes derived from both the parent pubkey and the parent chaincode rather than just from the parent chaincode alone? What about this derivation:
K_n := H(c_par||n)*K_par
c_n := H(H(c_par||n))
where H is some cryptographic hash. Looks easier to me and doesn't require 512 bit digests. Am I missing some point?
|
|
|
|
justusranvier
Legendary
Offline
Activity: 1400
Merit: 1013
|
|
December 06, 2012, 06:01:35 PM |
|
We only need a key hierarchy as in BIP32 (chaincodes on >=two levels) if we are required to derive 2nd-level chaincodes from 1st-level chaincodes because, for some reason, it is infeasible to derive 2nd-level chaincodes from 1st-level privkeys.
I would imagine this is the case for any website that needs the ability to generate new 2nd-level codes on the fly, since you wouldn't want to put the private keys on a public-facing server. One example would be a public donation page that hands out unique extended public keys for improved anonymity.
|
|
|
|
Pieter Wuille
|
|
December 06, 2012, 06:35:27 PM |
|
Now a BIP32 specific question: Why are the child chaincodes derived from both the parent pubkey and the parent chaincode rather than just from the parent chaincode alone? What about this derivation:
K_n := H(c_par||n)*K_par
c_n := H(H(c_par||n))
where H is some cryptographic hash. Looks easier to me and doesn't require 512 bit digests. Am I missing some point?
What do you gain by not using all available data? I consider using one 512-bit hash much more elegant than doing several 256-bit hashes.
|
I do Bitcoin stuff.
|
|
|
Pieter Wuille
|
|
December 06, 2012, 06:40:43 PM |
|
If BIP 0032 doesn't include an option to have a secret as one of the inputs of the hash derivation function, then I think that this kind of mode should be added to BIP 0032. This could even be the default, if we're worried about scenarios where users aren't careful and compromise a branch in their key hierarchy. We might wish to give users the option to have different secret seeds for particular branches in the key hierarchy, and store all these seeds in the wallet file (and warn the user that he needs to backup his wallet when he creates a new seed).
If you're not interested in using any of the features of the type-2 derivation, consider the extended public key the same way you'd consider a private key. It is not observable by the network, so if you don't reveal it, it is yours alone. This essentially turns the scheme into a type-1 derivation (you derive extended privkey from extended privkey).
|
|
|
|
thanke
Member
Offline
Activity: 104
Merit: 10
|
|
December 06, 2012, 07:41:56 PM Last edit: December 06, 2012, 07:58:59 PM by thanke |
|
Now a BIP32 specific question: Why are the child chaincodes derived from both the parent pubkey and the parent chaincode rather than just from the parent chaincode alone? What about this derivation:
K_n := H(c_par||n)*K_par
c_n := H(H(c_par||n))
where H is some cryptographic hash. Looks easier to me and doesn't require 512 bit digests. Am I missing some point?
What do you gain by not using all available data? I consider using one 512-bit hash much more elegant than doing several 256-bit hashes.
Ok, this is going to be rather vague and it remains to be discussed if you gain anything. But I'll try to make some points.
It's more modular to have a derivation of chaincodes alone. The derivation is basically a "hierarchical" pseudo random number generator. It takes a 256-bit seed (the master chaincode) and produces a tree of random 256-bit numbers (the derived chaincodes). This may have other uses. Such a hierarchical PRNG can be abstracted from this particular application.
Why deal with pairs of a number and a curve point if you can just deal with numbers? We would get three trees of privkeys, pubkeys and random numbers, respectively. The first two trees are obtained by multiplying their root with the third tree, from the top down. I find this easier to comprehend than one tree of privkeys and one tree of pairs.
I think of a pubkey as something that can always be made public. A chaincode you may want to keep secret under certain circumstances. Philosophically, it's counter-intuitive to combine them into a pair, if they may get separated upon export anyway.
Not sure if there are applications, but maybe someone wants to use the same keypair with more than one chaincode. Then it seems natural to separate them.
What about using the same chaincode tree with more than one master keypair? If chaincodes are to be kept secret then pushing the chaincodes to all sub-level entities becomes an issue (= non-trivial work). Suppose the root entity wants to update its keypair. With the two separated, there is no need to re-distribute chaincodes. Just publish the new master pubkey and the entities on sub-levels can recursively update their pubkeys (level by level).
SHA256 is already used. Re-using it instead of SHA512 is one less primitive and less code to rely on.
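To make the proposed derivation concrete, here is a minimal Python sketch of K_n := H(c_par||n)*K_par and c_n := H(H(c_par||n)) with H = SHA256. Several details are assumptions for illustration only: the index n is serialized as 4 big-endian bytes, the hash output is reduced modulo the curve order, the demo master values are arbitrary, and the curve arithmetic is a naive textbook version that is not constant-time and not suitable for real keys. This is not the BIP32 derivation itself.

```python
import hashlib

# secp256k1 domain parameters (standard published constants).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P   # Python 3.8+
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def point_mul(k, pt):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def sha256(data):
    return hashlib.sha256(data).digest()

def child_multiplier(c_par, n):
    # H(c_par||n) as an integer; 4-byte index and mod N are assumptions.
    return int.from_bytes(sha256(c_par + n.to_bytes(4, "big")), "big") % N

def child_chaincode(c_par, n):
    # c_n := H(H(c_par||n)): the "hierarchical PRNG" needs no key material.
    return sha256(sha256(c_par + n.to_bytes(4, "big")))

def child_privkey(k_par, c_par, n):
    return k_par * child_multiplier(c_par, n) % N

def child_pubkey(K_par, c_par, n):
    # K_n := H(c_par||n)*K_par
    return point_mul(child_multiplier(c_par, n), K_par)

# Arbitrary demo values, not real keys.
k_master = int.from_bytes(sha256(b"demo master key"), "big") % N
c_master = sha256(b"demo chain code")
K_master = point_mul(k_master, G)

# The privkey tree and the pubkey tree stay consistent.
print(point_mul(child_privkey(k_master, c_master, 7), G)
      == child_pubkey(K_master, c_master, 7))
```

Note that `child_chaincode` never touches a key, which is the modularity argument: the chaincode tree can be computed and distributed independently of any keypair.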
|
|
|
|
Pieter Wuille
|
|
December 16, 2012, 04:45:14 PM |
|
I agree it's certainly neat to see the chain code derivation as separate, but without a clear use case, I think it just complicates matters - both for implementations and for humans who need to learn how it works.
The current proposal has the advantage that keys always (assuming no collisions) come in triples: (privkey, pubkey, chaincode), so you only need a single identifier per key. Separating the chain code means that you need two identifiers, and both can be combined arbitrarily.
I don't really see what you mean by "to combine them into a pair, if they may get separated upon export anyway" - there's just an extended private key and an extended public key. If you never export an extended public key, the whole scheme works as a type-1 derivation. Sure, for actual leaf nodes the extended part isn't used, but as the chain code never gets revealed to the network, this shouldn't harm.
|
|
|
|
hackjealousy
Newbie
Offline
Activity: 53
Merit: 0
|
|
December 19, 2012, 10:01:48 PM |
|
Anonymity: To the extent possible I don't want to use outputs from different addresses as inputs in a single transaction. If I need to pay Bitmit 10 BTC but all I have are two 6 BTC outputs on different addresses I can use two transactions to complete the sale by sending 6 BTC to address k and 4 BTC to address k+1. Now it will be more difficult for a third party to examine the blockchain and determine that the same individual controls both addresses.
This is a use-case that isn't captured in BIP 0032 and I think should be. The "main" way we associate addresses with a single identity is when those addresses are used as multiple inputs in a transaction. We certainly could form separate transactions while using a single input for each but if all those transactions are being sent to the same address at roughly the same time, the assumption could still be made that those addresses were all associated with a single identity. With BIP 0032, the client could instead generate multiple transactions to unique derived public addresses. This would go a significant way towards removing this particular information leak.
|
|
|
|
grau
|
|
December 20, 2012, 07:42:33 AM |
|
Anonymity: To the extent possible I don't want to use outputs from different addresses as inputs in a single transaction. If I need to pay Bitmit 10 BTC but all I have are two 6 BTC outputs on different addresses I can use two transactions to complete the sale by sending 6 BTC to address k and 4 BTC to address k+1. Now it will be more difficult for a third party to examine the blockchain and determine that the same individual controls both addresses.
This is a use-case that isn't captured in BIP 0032 and I think should be. The "main" way we associate addresses with a single identity is when those addresses are used as multiple inputs in a transaction. We certainly could form separate transactions while using a single input for each but if all those transactions are being sent to the same address at roughly the same time, the assumption could still be made that those addresses were all associated with a single identity. With BIP 0032, the client could instead generate multiple transactions to unique derived public addresses. This would go a significant way towards removing this particular information leak.
I understand that anonymity is a concern for quite a few users or transaction types, but it comes at a cost to the system, since all solutions that try to avoid recombining inputs imply that coins will be fragmented across an exponentially growing key set. That might not be a problem for the individual, but it is already a significant and growing issue for miners (and transaction-validating clients), since lookups for unspent outputs have to work on an ever (and exponentially) increasing set.
What about balancing the individual's interest in anonymity against that of the system by requiring that the minimal transaction fee increase with a growing imbalance between the number of inputs and outputs? That is: miners would accept transactions that aggregate inputs at a lower fee and ask for a proportionally higher fee if the number of outputs exceeds that of the inputs.
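One possible shape for such a fee rule, as a Python sketch. Every number here (base fee, step, floor) and the function name `min_fee` are invented for illustration; nothing like this exists in the client, and a real policy would presumably also look at transaction size.

```python
def min_fee(n_inputs, n_outputs, base_fee=10_000, step=2_500, floor=1_000):
    """Hypothetical minimum fee in satoshis.

    A positive imbalance (more outputs than inputs) fragments coins and
    raises the fee proportionally; a negative imbalance consolidates
    coins and lowers it, down to a floor.
    """
    imbalance = n_outputs - n_inputs
    return max(floor, base_fee + step * imbalance)

print(min_fee(1, 1))   # balanced transaction
print(min_fee(1, 3))   # fragments one coin into three outputs
print(min_fee(10, 1))  # aggregates ten coins into one output
```

With these made-up parameters the three calls give 10000, 15000 and 1000 satoshis respectively.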
|
|
|
|
Mike Hearn
Legendary
Offline
Activity: 1526
Merit: 1134
|
|
December 20, 2012, 10:38:59 AM |
|
hackjealousy's use case is indeed one we were considering during the design of the payment protocol (i.e., we already thought of this and would like to support it in future, at least in some wallets).
Performance of the core node has improved dramatically lately. I think it's worth using some of that for better privacy. It doesn't mean "never combine addresses"; it just means a wallet should try to impose an upper bound on the size of the coins it holds. If I buy a Mars bar with a $100 bill, well, the cashier knows I have some money, but this is hardly a privacy leak worth worrying about. If I were to buy a Mars bar with a $100,000 bill (imagine it exists), that's a very severe privacy leak. So trying to ensure I get paid with outputs that don't exceed 10 BTC or so might make sense, so the size of change outputs can be limited.
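As a toy illustration of capping coin size, a payment request could be broken into outputs of at most 10 BTC each. The helper name `split_into_outputs` and the default cap are assumptions made for this sketch; amounts are in satoshis.

```python
COIN = 100_000_000  # satoshis per BTC

def split_into_outputs(amount_sat, cap_sat=10 * COIN):
    """Split an amount into outputs no larger than cap_sat (sketch only)."""
    if amount_sat <= 0 or cap_sat <= 0:
        raise ValueError("amount and cap must be positive")
    full, rest = divmod(amount_sat, cap_sat)
    return [cap_sat] * full + ([rest] if rest else [])

# A 25 BTC payment becomes three outputs of 10, 10 and 5 BTC.
outputs = split_into_outputs(25 * COIN)
print([o / COIN for o in outputs])
```

Each output would go to its own address, so spending one of them later reveals at most 10 BTC of the recipient's holdings.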
|
|
|
|
justusranvier
Legendary
Offline
Activity: 1400
Merit: 1013
|
|
December 20, 2012, 12:10:26 PM |
|
If I buy a Mars bar with a $100 bill, well, the cashier knows I have some money, but this is hardly a privacy leak worth worrying about.
With paper currency there isn't a permanent public record of every bill you've ever owned. The severity of an information leak here has nothing to do with the amounts being spent. Combining outputs makes connections which make it easier to identify spending.
|
|
|
|
grau
|
|
December 20, 2012, 12:26:42 PM |
|
While I understand your concern for privacy, this should be an option and not the default, since it comes at a cost to the system. Merchants may actually opt for re-using the same account frequently to have simpler audit.
|
|
|
|
justusranvier
Legendary
Offline
Activity: 1400
Merit: 1013
|
|
December 20, 2012, 01:10:39 PM |
|
Merchants may actually opt for re-using the same account frequently to have simpler audit.
Is this hypothetical merchant who is reusing a single receiving address doing all his accounting on an abacus?
|
|
|
|
Pieter Wuille
|
|
December 20, 2012, 01:14:14 PM |
|
The "main" way we associate addresses with a single identity is when those addresses are used as multiple inputs in a transaction. We certainly could form separate transactions while using a single input for each but if all those transactions are being sent to the same address at roughly the same time, the assumption could still be made that those addresses were all associated with a single identity.
With BIP 0032, the client could instead generate multiple transactions to unique derived public addresses. This would go a significant way towards removing this particular information leak.
If you want people to be able to pay using several transactions/outputs to you (a good thing to do, imho), you should give them multiple addresses/outputscripts. The payment protocol that is being developed supports this, and this is the right way to implement that. Whether you generate those addresses from a single BIP32 chain, or randomly generate them all is of no concern.
|
|
|
|
Mike Hearn
Legendary
Offline
Activity: 1526
Merit: 1134
|
|
December 20, 2012, 01:33:28 PM |
|
Well technically the payment protocol just gives outputs and allows users to submit multiple transactions in payment. It doesn't tell the wallet whether it actually should craft multiple transactions. Right now the spec leaves that open. A future extension may make sense whereby you can provide each output with a transaction index, to tell wallets which outputs to assign to which transactions.
|
|
|
|
grau
|
|
December 20, 2012, 02:46:18 PM |
|
Merchants may actually opt for re-using the same account frequently to have simpler audit.
Is this hypothetical merchant who is reusing a single receiving address doing all his accounting on an abacus?
It is how the world outside Bitcoin works. Companies use a limited number of accounts to reduce audit effort. This is another use case, probably not yours, but that does not mean it is wrong.
|
|
|
|
justusranvier
Legendary
Offline
Activity: 1400
Merit: 1013
|
|
December 20, 2012, 03:34:38 PM |
|
It is how the world outside Bitcoin works. Companies use a limited number of accounts to reduce audit effort.
What is an "account", if not an arbitrary grouping of transactions? What happens if you call an extended public key (defined in BIP32) an "account"? The extended public key is a single unique identifier so there's no reason not to consider the root address and all its children to be a single "account". That it can be subdivided into a hierarchy is just an implementation detail.
|
|
|
|
grau
|
|
December 20, 2012, 03:56:09 PM |
|
It is how the world outside Bitcoin works. Companies use a limited number of accounts to reduce audit effort.
What is an "account", if not an arbitrary grouping of transactions? What happens if you call an extended public key (defined in BIP32) an "account"? The extended public key is a single unique identifier so there's no reason not to consider the root address and all its children to be a single "account". That it can be subdivided into a hierarchy is just an implementation detail.
Yes, an account is just a grouping of transactions. The only grouping currently commonly implemented is by address. Assuming BIP32 is implemented, the audit would have to include every address whose public key is derivable from the extended public key. But how would an auditor tell whether he/she only has to check transactions up to the n-th key, rather than also checking for a key n+1 or even a key n+m?
|
|
|
|
thanke
Member
Offline
Activity: 104
Merit: 10
|
|
December 20, 2012, 06:39:47 PM Last edit: December 20, 2012, 06:50:47 PM by thanke |
|
I don't really see what you mean by "to combine them into a pair, if they may get separated upon export anyway" - there's just an extended private key and an extended public key. If you never export an extended public key, the whole scheme works as a type-1 derivation. Sure, for actual leaf nodes the extended part isn't used, but as the chain code never gets revealed to the network, this shouldn't harm.
I. Pubkey and chaincode serve different purposes and their handling requires different security measures: the pubkey is simply a pubkey and can be publicised, the chaincode protects the anonymity of all derived keys and is (usually) kept secret. I would expect that they often do get separated. Someone may want to post the pubkey on a website or hand it out to other people, but would probably not do that with the chaincode. So he will "export" only the pubkey from his bitcoin client, not the chaincode. Or, conversely, if he decides his watching-only bitcoin client is too vulnerable, he will "delete" the chaincode from the wallet. That's what I meant by "they may get separated", and why I questioned whether they should be considered a pair (extended pubkey) in the first place. This is just a remark and, as you say, "without a use case", not yet a reason to change anything.
However, the matter becomes more serious if you start out with your key triple and derive a key triple for an agent of yours. Then you have to secure your chaincode from your agent just as well as your privkey (otherwise the agent can recover your privkey from your chaincode and his privkey). So you will keep your extended privkey secure, but you certainly want to erase the chaincode from the extended pubkey that may be stored in a vulnerable place. So after all I think it depends on the use case whether the chaincode is stored along with the privkey or with the pubkey. Creating extended pubkeys first and then erasing half of them is not very elegant. Maybe the three things should be separate from the beginning?
II. The point I made about "updating pubkeys along the entire tree" can be backed up by a use case, I think. My use case is a bit biased, because I am thinking of pubkeys not only as payment addresses, but also as identities or pseudonyms. I am also assuming that every node in the key tree represents a separate entity, and that entities do not trust their children.
Let's think of the nodes as branches and sub-branches or agents of some company, where each node is establishing its identity via the pubkey. So the use case is this: Each entity (=node) wants to be in control of all derived entities, i.e. wants to be able to derive their keypairs; that's why a hierarchical scheme is used. We want anonymity, i.e. derived pubkeys cannot be linked to the entity's own pubkey; that's why chaincodes are used. When a sub-entity is created we tell the sub-entity its pubkey and chaincode (=extended pubkey). "Telling" in practice means sending it in encrypted form. Each entity will then separate its chaincode from the pubkey: the pubkey is public because it is the identity, and the chaincode is secured.
Now suppose the root decides to update its pubkey, or to establish a second pubkey besides the first one. With extended pubkeys we have to redo all the work, i.e. we need an encrypted communication from each entity to each of its children. But with pubkeys and chaincodes in independent trees it is easier: each entity publishes its new pubkey, the children see it and compute their own new pubkey. Note that this "advantage" applies only to the derived pubkeys; the derived privkeys still require the encrypted communication.
I admit this use case is quite abstract. I'm still mentioning it here, maybe someone can make more sense of it?!
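The remark in part I, that an agent who knows the parent chaincode can get from his own privkey back to the parent privkey, can be checked with plain modular arithmetic. This Python sketch assumes a multiplicative derivation k_child = m * k_par (mod n), where the multiplier m is computable from the parent chaincode and the child index; the `multiplier` function below is a SHA256 stand-in chosen for illustration, not the actual BIP32 hash, and the key values are arbitrary.

```python
import hashlib

# Order of the secp256k1 group (standard published constant).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def multiplier(chaincode, index):
    # Stand-in for the derivation hash; anyone holding the parent
    # chaincode can recompute this value.
    digest = hashlib.sha256(chaincode + index.to_bytes(4, "big")).digest()
    return int.from_bytes(digest, "big") % N

def derive_child_privkey(k_par, chaincode, index):
    return k_par * multiplier(chaincode, index) % N

def recover_parent_privkey(k_child, chaincode, index):
    # Undo the multiplication with the modular inverse (Python 3.8+).
    return k_child * pow(multiplier(chaincode, index), -1, N) % N

# Arbitrary demo values, not real keys.
k_par = int.from_bytes(hashlib.sha256(b"parent privkey").digest(), "big") % N
c_par = hashlib.sha256(b"parent chaincode").digest()

k_agent = derive_child_privkey(k_par, c_par, 3)
print(recover_parent_privkey(k_agent, c_par, 3) == k_par)
```

The same inversion works for an additive derivation (subtract instead of divide), so the chaincode has to be guarded from the agent in either variant.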
|
|
|
|
iddo
|
|
February 01, 2013, 03:53:00 PM Last edit: February 01, 2013, 05:26:32 PM by iddo |
|
I think that there's some confusion here, type-2 should always be more secure than type-1, for any kind of user.
With type-1, the secret seed derives the actual privkeys, therefore if the secret seed is compromised then all of your privkeys leak.
With type-2, if the hash derivation function includes the secret seed as one of the concatenated inputs (as in the OP of this thread), then the secret seed in itself can only derive pubkeys, and unless the attacker also gains access to the additional (master) privkey secret, all your privkeys are still provably secure.
The disadvantage of having the secret seed as one of the concatenated inputs of the derivation function is that the 3rd-party couldn't generate new pubkeys on its own, without knowing the secret seed. But the other advantages of type-2 still hold, namely that the owner of the wallet can generate new pubkeys without accessing (decrypting) his privkeys, and that for backup purposes you only need to store your master privkey and secret seed (this is an advantage over the regular random-independent wallet, not over type-1 wallet).
If BIP 0032 doesn't include an option to have a secret as one of the inputs of the hash derivation function, then I think that this kind of mode should be added to BIP 0032. This could even be the default, if we're worried about scenarios where users aren't careful and compromise a branch in their key hierarchy. We might wish to give users the option to have different secret seeds for particular branches in the key hierarchy, and store all these seeds in the wallet file (and warn the user that he needs to backup his wallet when he creates a new seed).
If you're not interested in using any of the features of the type-2 derivation, consider the extended public key the same way you'd consider a private key. It is not observable by the network, so if you don't reveal it, it is yours alone. This essentially turns the scheme into a type-1 derivation (you derive extended privkey from extended privkey).
Part of what I wrote above is nonsense, I think, regarding the supposed "disadvantage", because I don't really see a practical use case where some 3rd party should generate new pubkeys for someone else's wallet. So please disregard that. Also, I didn't understand why you consider what you said to be type-1, but never mind.
My basic concern was that if somehow one privkey leaks, then the attacker could also steal your other privkeys, unlike the case with the currently used random-independent wallets (unless the attacker doesn't know the secret seed that's used for deriving the deterministic keys). I see that you also added a remark regarding this on December 20 in the wiki (link), and now that I've read and understood BIP32 I see why that remark is true.
I see in BIP32 that the secret seed is indeed used in the derivations, i.e. by using the chaincodes, though perhaps there isn't enough emphasis there that we should regard the chaincodes as secrets. Specifically, BIP32 doesn't explicitly say which data of the wallet should be encrypted, and one straightforward way to handle the above concern is to have two (optional) passwords for the HD wallet: one major password that encrypts all the privkeys, and another minor password that encrypts all the chaincodes. When the user wishes to spend his coins, he has to provide his major password. When the user needs to create new pubkey addresses, he has to provide his minor password (if he opted to have one). It'd be reasonable for users to select a weak minor password that they can type quickly, and also, as written in the wiki, the client can keep a pool of N look-ahead keys cached, so having the minor password wouldn't be too burdensome. Do you think that allowing two passwords is a good idea?
BTW, I noticed that in BIP32 (unlike in the OP) the derived pseudorandom number multiplies the privkey, instead of being added to the privkey. Is it just to have a more efficient calculation than K_par+G*I_L for the pubkey? This method also permutes over the entire space, so it should be just as good? And why does the serialized format store the fingerprint of the parent?
|
|
|
|
|