Bitcoin Forum
May 06, 2024, 12:58:54 AM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: 1 2 3 [All]
  Print  
Author Topic: The case for moving from a 160 bit to a 256 bit Bitcoin address  (Read 6878 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 01, 2017, 04:53:00 AM
Last edit: May 01, 2017, 11:50:53 AM by BurtW
 #1

The birthday problem can be applied to estimate the probability of a collision given the size of a hash function.

The Taylor series expansion of the exponential function can be used to estimate the probability of a collision as follows:

    p(n, d) = 1 - e-n(n - 1)/(2d)

where n is the number of hashes and d is the total number of possible hashes.

Given a hash function size of h in bits and a number of Bitcoin addresses generated 2x this can be simplified as follows:

    p(n, d) = 1 - e-2x(2x - 1)/(2(2h))

               = 1 - e-(22x - 2x)/(2h + 1)

               = 1 - e-[22x/(2h + 1) - 2x/(2h + 1)]

               = 1 - e-(22x - h - 1 - 2x - h - 1)

Within the limits of my Excel spreadsheet for h = 160 the probability of a collision for various values of x is:

Quote
 x   p(2x, 2160)
83   1
82   0.999664537
81   0.864664717
80   0.39346934
79   0.117503097
78   0.030766766
77   0.007782062
76   0.001951219
75   0.000488162
74   0.000122063
73   3.05171E-05
72   7.62937E-06
71   1.90735E-06
70   4.76837E-07
69   1.19209E-07
68   2.98023E-08
67   7.45058E-09
66   1.86265E-09
65   4.65661E-10
64   1.16415E-10
63   2.91038E-11
62   7.27596E-12
61   1.81899E-12
60   4.54747E-13
59   1.13687E-13
58   2.84217E-14
57   7.10543E-15
56   1.77636E-15
55   0
This yields the interesting result that after "using up" only 283 out of our 2160 Bitcoins addresses we will almost certainly have one or more address collisions.  In fact we really only want to "use up" about 255 Bitcoin addresses in order to keep the probability of a collision very near to zero (within the limits of the estimation function, my simplification, and the resolution abilities of the Excel spreadsheet).

How fast can we use up or generate 255 addresses?  

Well as of this posting the large bitcoin collider project has already sequentially searched the first 252.19 Bitcoin addresses and is on track to search the first 254.46 addresses within a year.  Given some serious resources to attack the problem, orders of magnitude more addresses could be generated and checked per year.

This thread is not about the merits or failing of the LBC so posts related to the LBC will be deleted.  I only use the maximum key generation rate of the LBC as an example of how fast keys can be created.

So far the LBC has had a peak key generation rate of about 2.5 x 109 key pairs per second.  That is about

   (3.154 x 107)(2.5 x 109) = 7.885 x 1016

   or approximately 256 key pairs per year.

If we were to move from the current 160 bit Bitcoin address to a new system that would fully utilize all 256 bit available for Bitcoin addresses the situation is much better:

Quote
   x   p(2x, 2256)
131   1
130   0.999664537
129   0.864664717
128   0.39346934
127   0.117503097
126   0.030766766
125   0.007782062
124   0.001951219
123   0.000488162
122   0.000122063
121   3.05171E-05
120   7.62937E-06
119   1.90735E-06
118   4.76837E-07
117   1.19209E-07
116   2.98023E-08
115   7.45058E-09
114   1.86265E-09
113   4.65661E-10
112   1.16415E-10
111   2.91038E-11
110   7.27596E-12
109   1.81899E-12
108   4.54747E-13
107   1.13687E-13
106   2.84217E-14
105   7.10543E-15
104   1.77636E-15
103   0
In this case we should be able to safely use up to 2103 Bitcoin addresses before collisions would be a problem.

I think we should discuss phasing out the old 160 bit Bitcoin address version and replacing it with a new 256 bit address version.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
1714957134
Hero Member
*
Offline Offline

Posts: 1714957134

View Profile Personal Message (Offline)

Ignore
1714957134
Reply with quote  #2

1714957134
Report to moderator
1714957134
Hero Member
*
Offline Offline

Posts: 1714957134

View Profile Personal Message (Offline)

Ignore
1714957134
Reply with quote  #2

1714957134
Report to moderator
1714957134
Hero Member
*
Offline Offline

Posts: 1714957134

View Profile Personal Message (Offline)

Ignore
1714957134
Reply with quote  #2

1714957134
Report to moderator
"Governments are good at cutting off the heads of a centrally controlled networks like Napster, but pure P2P networks like Gnutella and Tor seem to be holding their own." -- Satoshi
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1714957134
Hero Member
*
Offline Offline

Posts: 1714957134

View Profile Personal Message (Offline)

Ignore
1714957134
Reply with quote  #2

1714957134
Report to moderator
1714957134
Hero Member
*
Offline Offline

Posts: 1714957134

View Profile Personal Message (Offline)

Ignore
1714957134
Reply with quote  #2

1714957134
Report to moderator
1714957134
Hero Member
*
Offline Offline

Posts: 1714957134

View Profile Personal Message (Offline)

Ignore
1714957134
Reply with quote  #2

1714957134
Report to moderator
arulbero
Legendary
*
Offline Offline

Activity: 1915
Merit: 2074


View Profile
May 01, 2017, 07:21:51 AM
 #2

The birthday problem can be applied to estimate the probability of a collision given the size of a hash function.

.........

In this case we should be able to safely use up to 2103 Bitcoin addresses before collisions would be a problem.

Another interesting thing is: how many private keys I have to check to get all addresses? There are "only" 2^160 addresses and 2^256 private keys, so there are 2^96 private keys for each address.

Quote
Expected Number of Hashes until all Slots of a Hash Table Are Occupied.  The expected
number of items needed to fill all slots of a hash table of size k is between klnk+1/2k and klnk+k.

# private keys = # items = n
# addresses =  # slots = k


To get all k = 2^160 addresses, it is enough to use k*lnk+k  = 160*2^160 + 2^160 = 161*2^160 = 2^(167.32)  keys.

If you use only 2^160 private keys, you will have about 63% of all addresses:

Quote
Expected Number of Empty Slots in Hash Table. In hashing n items into a hash table with
klocations, the expected number of empty locations is k(1−1/k)^n.

Infact:  # empty locations (addresses never generated) = 2^160 (1-1/2^160)^(2^160) = 2^160 * 1/e = 0.37*2^160.


You could find other interesting formulas like this one:
Quote
The Expected Number of Collisions in Hashing. In hashing n items into a hash table with
k locations, the expected number of collisions is n−k+k(1−1/k)^n.

here --> https://math.dartmouth.edu/archive/m19w03/public_html/Section6-5.pdf


What is the maximum key generation rate you have every recorded, even for a short period of time?
I remember about 2500 Mkeys/s.
rico666
Legendary
*
Offline Offline

Activity: 1120
Merit: 1037


฿ → ∞


View Profile WWW
May 01, 2017, 07:37:51 AM
 #3

Another interesting thing is: how many private keys I have to check to get all addresses? There are "only" 2^160 addresses and 2^256 private keys, so there are 2^96 private keys for each address.

Nope. There are 2^97 private keys for each address. Proof is left to the mathematician.


Quote
To get all k = 2^160 addresses, it is enough to use k*lnk+k  = 160*2^160 + 2^160 = 161*2^160 = 2^(167.32)  keys.

If you use only 2^160 private keys, you will have about 63% of all addresses:

I found myself often forgetting the 1 key -> 2 addresses fact (if you are computing compressed and uncompressed pubkeys from a privkey). You sure you considered that?

What is the maximum key generation rate you have every recorded, even for a short period of time?
I remember about 2500 Mkeys/s.

Yes - somewhere between 2.3 - 2.5 Gkeys/s - interesting part is, that 1.9 Gkeys/s of that done by one man.


all non self-referential signatures except mine are lame ... oh wait ...   ·  LBC Thread (News)  ·  Past BURST Activities
arulbero
Legendary
*
Offline Offline

Activity: 1915
Merit: 2074


View Profile
May 01, 2017, 07:44:35 AM
 #4

Another interesting thing is: how many private keys I have to check to get all addresses? There are "only" 2^160 addresses and 2^256 private keys, so there are 2^96 private keys for each address.

Nope. There are 2^97 private keys for each address. Proof is left to the mathematician.


Ok, for the sake of semplicity I didn't want to talk here about compressed/uncompressed keys ...
BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 01, 2017, 11:21:54 AM
Last edit: May 01, 2017, 11:36:36 AM by BurtW
 #5

this thread has a lot of hot air,  segwit p2wsh uses 256-bit hashes, and there is a 256bit address format proposed.

Is the thread over now?
gmaxwell:  Thanks Greg for your answer.  Sorry I accidentally deleted your post in my LBC delete fest.

LBC:  This thread was never intended to be about LBC.  I wish I had never mentioned it at all.  I have gone back and deleted the posts specifically about the merits/failings of LBC.

The thread is over unless anyone has comments about the application of the birthday problem to the hash space of Bitcoin.

Specifically, is the math presented in the OP correct?  Is this a real concern?

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 01, 2017, 12:15:20 PM
Last edit: May 01, 2017, 12:58:49 PM by BurtW
 #6

There are only 2,100,000,000,000,000 satoshis.

Therefore the maximum number of Bitcoins addresses that can contain value at any one time is only about 251 and that assumes only one satoshi per address.

At the time the block chain contains 255 transactions it would be somewhere between a petabyte and an exabyte in size.  So a full node would need a multi petabyte drive to store the entire block chain.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
jonald_fyookball
Legendary
*
Offline Offline

Activity: 1302
Merit: 1004


Core dev leaves me neg feedback #abuse #political


View Profile
May 01, 2017, 02:28:15 PM
 #7

this thread has a lot of hot air,  segwit p2wsh uses 256-bit hashes, and there is a 256bit address format proposed.

Is the thread over now?
gmaxwell:  Thanks Greg for your answer.  Sorry I accidentally deleted your post in my LBC delete fest.

LBC:  This thread was never intended to be about LBC.  I wish I had never mentioned it at all.  I have gone back and deleted the posts specifically about the merits/failings of LBC.

The thread is over unless anyone has comments about the application of the birthday problem to the hash space of Bitcoin.

Specifically, is the math presented in the OP correct?  Is this a real concern?

Let's say segwit shall not pass... We would need a new proposal, right?

Also, I think any new address format should be fully backwards compatible.   

dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 01, 2017, 03:01:05 PM
 #8

There are only 2,100,000,000,000,000 satoshis.

Therefore the maximum number of Bitcoins addresses that can contain value at any one time is only about 251 and that assumes only one satoshi per address.

At the time the block chain contains 255 transactions it would be somewhere between a petabyte and an exabyte in size.  So a full node would need a multi petabyte drive to store the entire block chain.

That's a smart point ! If we can count on the fact that there will never be a hard fork changing this total, of course.

I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me.  One can say that this was meant to save some room, but there are many other occasions where a lot of room has been wasted.  For instance, the fact of posting, in a transaction input, as well the signature as the public key is redundant: one can recover the public key from the signature and the message, which would save the room for the key (256 bits in compressed format).  Also, it is strange to have a 256 bit transaction hash: it is hard to think that one would need 256 bits for the transaction, and only 160 bits for the address.

To me, there were 2 possible logical values for the address size: 128 bits, or 256 bits.  128 bits, because this is the intrinsic security level of the ECDS with 256 bit keys ; and 256 bits because this is the total entropy in the key.  160 bits is a strange compromise.  The idea that there could have been a back door in SHA-256, and that RIPEMD-160, of different origin, might have another back door, but both of them together, are "safe" sounds weird, and not a good reason to change the number of bits.  One could have used 2 times RIPEMD-160 on 128 bit parts of the 256 bit hash if that were the problem, to keep 256 bits.

As the number of satoshis is 2^51, in fact, there can, as you rightly point out, never be more than 2^51 valid UTXO at a given moment, so 128 bits would in fact be enough ; but given the wasting of room with the 256 bit transaction hash, and the publishing of both signature and key, keeping the full entropy of the key with a 256 bit address would have been the maximally safe option. 

BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 01, 2017, 03:40:28 PM
 #9

Unless it is directly related to the subject of this thread, for example the fact there can currently be only 3 transactions per second so how long would it take to burn through 255 addresses at only 3 TPS, any posts that stray into "block size debate" territory will be immediately deleted.  There are enough threads about that subject already.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 01, 2017, 03:55:28 PM
 #10

That's a smart point ! If we can count on the fact that there will never be a hard fork changing this total, of course.

Assuming we stick to the 64 bit size for a value, keep the 21 million Bitcoin cap, and we expand to the maximum number of decimal points offered by a 64 bit number we could comfortably expand the resolution to

264 = 18,446,744,073,709,551,616
      =   2,100,000,000,000,000,000 "millisatoshis"

So a factor of 1000 or about 210

So that would lead to about 261 addresses containing a millisatoshi each.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
jonald_fyookball
Legendary
*
Offline Offline

Activity: 1302
Merit: 1004


Core dev leaves me neg feedback #abuse #political


View Profile
May 01, 2017, 03:58:27 PM
 #11


I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me. 


Seems like a good post here on that: https://bitcoin.stackexchange.com/questions/9202/why-does-bitcoin-use-two-hash-functions-sha-256-and-ripemd-160-to-create-an-ad

dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 01, 2017, 04:15:04 PM
 #12


I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me.  


Seems like a good post here on that: https://bitcoin.stackexchange.com/questions/9202/why-does-bitcoin-use-two-hash-functions-sha-256-and-ripemd-160-to-create-an-ad

Yes, I know these arguments, but they don't hold water (well, they are correct, but don't justify anything).  They at best answer the argument "why did Satoshi didn't use RIPEMD-160 alone".  The question is why didn't he stick to just SHA-256, and kept a 256 bit address.  And if he really wanted to decrease the number of bits from 256 to 160 (but why on earth ? - to save space ?  then why waste a lot of space in the signature + key ?), he could simply keep the first 160 bits of the hash.  

Let's go through a few arguments that are given there.
1) the "interaction between ECDS and hash function".  That's totally ridiculous, because given that the secret key is a random number, the public key is a random number too (there's a bijection between both, it is a cyclic group).  So there's no "interaction" between a random number, and the hash function.  But if ever there was a "funny interaction" between ECDS and the first hash function (which makes no sense as I just outlined: it's a random number), say, that it always gives one of the 10 same hashes, then applying a second hash function is not going to solve it, on the contrary.  Talking about interactions, by cascading things, you can only *increase* the chance of an undesirable interaction.

2) the "double hashing to avoid the extension attack" in this case is also weird, as the original key is only 512 bits (in long format) and hence, there's no need for the application of Merkle-Damgard, as one single block hash primitive has to be called, and there's no "extension" possible.  In other words, this is a protection against a kind of attack that is totally irrelevant in this case.

But the strangest of all these "protections" is the following: bitcoin allows address re-usage.  This could have been prohibited in the protocol (a second transaction with the same address in an OUTPUT would be invalid, in the same way as a double spent input is invalid).  But it isn't.   As such, this super duper protection of the *public* key with convolved hashes, is totally moot if ONE UTXO of a given address is spent, because then the public key is known.  All other UTXO of the same address (in different transactions most probably) have their public key exposed, and only the safety of ECC is protecting them, without hashes.   It is astoundingly weird to go through so much hassle to avoid the possibility to discover the public key from the address, just to allow you to publish that public key for that same address, which can still be used in other UTXO.

BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 01, 2017, 04:51:47 PM
 #13

Historically wasn't the original design to send to the public key as the destination - there was no hash in the original design?

Then, in order to save space, the whole hashing to a public address was added on top?

That was my understanding - which is often flawed when it comes to esoteric history.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
jonald_fyookball
Legendary
*
Offline Offline

Activity: 1302
Merit: 1004


Core dev leaves me neg feedback #abuse #political


View Profile
May 01, 2017, 05:19:02 PM
 #14

Historically wasn't the original design to send to the public key as the destination - there was no hash in the original design?

Then, in order to save space, the whole hashing to a public address was added on top?

That was my understanding - which is often flawed when it comes to esoteric history.

My understanding was that there's additional benefits to hashing the pubkey.  For example, if a wallet implementation was flawed and used a bad 'k value'. 

jonald_fyookball
Legendary
*
Offline Offline

Activity: 1302
Merit: 1004


Core dev leaves me neg feedback #abuse #political


View Profile
May 01, 2017, 05:38:03 PM
 #15


I always wondered why Satoshi applied Ripemd-160 after SHA-256 for the address, that seemed inconsistent to me.  


Seems like a good post here on that: https://bitcoin.stackexchange.com/questions/9202/why-does-bitcoin-use-two-hash-functions-sha-256-and-ripemd-160-to-create-an-ad

Yes, I know these arguments, but they don't hold water (well, they are correct, but don't justify anything).  They at best answer the argument "why did Satoshi didn't use RIPEMD-160 alone".  The question is why didn't he stick to just SHA-256, and kept a 256 bit address.  And if he really wanted to decrease the number of bits from 256 to 160 (but why on earth ? - to save space ?  then why waste a lot of space in the signature + key ?), he could simply keep the first 160 bits of the hash.  

Let's go through a few arguments that are given there.
1) the "interaction between ECDS and hash function".  That's totally ridiculous, because given that the secret key is a random number, the public key is a random number too (there's a bijection between both, it is a cyclic group).  So there's no "interaction" between a random number, and the hash function.  But if ever there was a "funny interaction" between ECDS and the first hash function (which makes no sense as I just outlined: it's a random number), say, that it always gives one of the 10 same hashes, then applying a second hash function is not going to solve it, on the contrary.  Talking about interactions, by cascading things, you can only *increase* the chance of an undesirable interaction.


I guess they are saying that using the unhashed public key it would be more feasible to design some collision attack, based on its contents, rather than there being a small number of possible outputs.   Makes sense on the surface as a fail-safe precaution.



Quote
2) the "double hashing to avoid the extension attack" in this case is also weird, as the original key is only 512 bits (in long format) and hence, there's no need for the application of Merkle-Damgard, as one single block hash primitive has to be called, and there's no "extension" possible.  In other words, this is a protection against a kind of attack that is totally irrelevant in this case.

I'm not following you at all.  Why would 'only 512 bits' make the extension attack not work?  I don't deeply understand how this attack works but why would the attack not be possible?



dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 01, 2017, 05:39:04 PM
 #16

Historically wasn't the original design to send to the public key as the destination - there was no hash in the original design?

Then, in order to save space, the whole hashing to a public address was added on top?

That was my understanding - which is often flawed when it comes to esoteric history.

I have to say I don't know if that was historically so.  But the argument "to save space" doesn't hold water in the actual implementation, because of these reasons:

1) if the public key were published instead of its hash, it wouldn't be necessary to publish it *a second time* during the signature: now people publish first the hash, and next, the public key when signing.  If the address were directly the public key, one would win the hash room, because there wouldn't be any need to publish the key AGAIN.

2) the first argument actually falls down, because from the signature and the text, one can DERIVE the public key.  In that way, the hash COULD win some room, because the public key never needs to be published: from the signature, one can derive it, calculate the hash, and verify whether it corresponds to the address.  But this cannot be the reason, because this feature (the fact that one can calculate the public key from the signature) is not used.

3) there's a lot of other "wasting room" on the transaction, and the most important one to me is the hash of 256 bits indicating the transaction.  This is overkill.   There is also the 32-bit number indicating the "output number of the transaction" as if transactions would contain 4 billion outputs.   This is wasted space.

In fact, the re-use of an address is what makes "indicating the transaction" necessary.  If an address occurred only once, there wouldn't be any need to specify a "transaction hash": given that an address can only correspond to one single past output, that output (and its corresponding transaction) would be uniquely determined by the signature (and hence the corresponding public key, and hence the corresponding address).  So there wouldn't be any need, nor to specify the transaction hash (256 bits saved) nor the output number (32 bits saved).

So I have a hard time imagining that the 100 or so bits saved by the 160 bit hash, instead of the 256 bit key, was a serious reason.

However, however, there is an extremely good reason to use a hash of a public key that is not published.  That reason is: if ever the ECC crypto fails, one can still use a hash-based signature scheme (even if it is just to switch to another cryptographic signing system).  The "public key" is then the "secret key" behind the hash.

==> but this scheme fails miserably if one has multiple use of the same address, because if one has used one of these in the past, then the "secret public key" is now public.

So it looks to me that the whole cryptographic design of this "hashed public key" is quite messy, even though it works.  Whenever one finds a "design rationale" in one direction, there's something else that screws this:
1) winning room on the chain --> screwed by wasting room elsewhere
2) protecting public key in case ECC fails --> screwed by address multiple usage possible

That same multiple address usage also wastes a lot of space, because of the need of indicating which transaction and which output number which have way too large bit consumption for their actual needs.
dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 01, 2017, 07:48:16 PM
 #17

I guess they are saying that using the unhashed public key it would be more feasible to design some collision attack, based on its contents, rather than there being a small number of possible outputs.   Makes sense on the surface as a fail-safe precaution.

I don't know exactly what you mean here.  "interaction between ECDS and hash function" sounds totally weird, because it would mean that *inadvertedly* the hash function would a) suffer from a reduced output space if its inputs are public ECDS keys or b) would accidentally expose the private key, or make that easier.
Both arguments don't hold water, because:
a) the private key being a random number, and the relationship private->public key being a bijection (an exponentiation in a cyclic group of prime order) the public key is just as random as any random number, so there's no "structure" in the public key.
b) if a given hash function were to solve accidentally, even partially, the discrete log problem of ECDS, that would be a monumental exploit !!

I wasn't arguing for using the direct public key (although one could consider that).  I was arguing for using the SHA-256 hash of the public key directly, without channelling it through RIPEMD-160, and to use the full 256 hash bits.

Quote
Quote
2) the "double hashing to avoid the extension attack" in this case is also weird, as the original key is only 512 bits (in long format) and hence, there's no need for the application of Merkle-Damgard, as one single block hash primitive has to be called, and there's no "extension" possible.  In other words, this is a protection against a kind of attack that is totally irrelevant in this case.

I'm not following you at all.  Why would 'only 512 bits' make the extension attack not work?  I don't deeply understand how this attack works but why would the attack not be possible?

Well, the extension attack is of no meaning here, for different reasons, but the extension attack comes about when one knows the hash of a given document.  In the Merkel-Damgard construction, the document to be hashed it sliced into blocks of 512 bits, and the hash output of the first block is fed into the hash input of the second block and so on.  So if you know the final hash, you can add a bloc with your own content, irrespective of what was the hashed content up to that point, and use the published hash and your new block, to present the hash of a document, whatever it was, appended with your block.

For instance, if there was a secret text giving a whole explanation of a battle plan, were hashed (and you don't know the content), you could take that hash, add a block of your likings, for instance saying "PS: all of the above is wrong, the attack will take place at noon", and you would be able to calculate the correct hash of the document, with this bloc appended to it.  This would totally alter the meaning of the document, and you would still be able to produce a correct hash.

But when hashing the key, there's no such meaning of appending a bloc.  Nobody is going to look at any fake message with "more stuff in it than just the original key".  Moreover, there's only one single call to the compression function, and Merkle-Damgard isn't even used for this.

This attack is an attack in a totally different setting of the use of a hash than it is with a bitcoin address, and hence doesn't make sense.  It makes sense when you can do a man-in-the-middle attack with a transmitted document of which you can't change the content, but you can append content to it, and you can provide a different hash.
jonald_fyookball
Legendary
*
Offline Offline

Activity: 1302
Merit: 1004


Core dev leaves me neg feedback #abuse #political


View Profile
May 01, 2017, 08:03:06 PM
 #18

well we can only speculate why Satoshi designed it that way.  If a bad curve was chosen for ECDSA yes it would be broken but less broken on unused addresses...sounds like a weak argument but maybe still better than nothing?  so I see what you mean.

I see what you mean about the 512 bits.  Is it possible it has something to do with the fact that the 512 version of pubkey is uncompressed, so its safer to hash it again?

gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4158
Merit: 8382



View Profile WWW
May 01, 2017, 11:45:37 PM
 #19

how long would it take to burn through 255 addresses at only 3 TPS,

I think you haven't caught the real reason that 256 bit addresses are important.  Any N-bit address has only N/2 bits of security against collision, right?

So here is the interesting attack:  You give me your pubkey, and then I create my pubkey for a 2-of-2 (or some other more elaborate contract), and then we pay to the resulting address.

Oops.  In the background I did ~2^80 work and found a colliding address which didn't have the same policy, and I use it to steal the funds.

2^80 is a lot of work, but it isn't enough to be considered secure by current standards.

Unrelated, it's also the case that many hashes of the same vintage of ripemd160 have been significantly broken or substantially weakened.

there's a lot of other "wasting room" on the transaction, and the most important one to me is the hash of 256 bits indicating the transaction.  This is overkill.   There is also the 32-bit number indicating the "output number of the transaction" as if transactions would contain 4 billion outputs.   This is wasted space.
You are free to store transactions however you like and negoiate with your peers to send them however you like, there is no need for space to be wasted as a product of serialization.  Using a larger hash does take up more resources but the difference is pretty small compared to the rest of the transaction.
dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 02, 2017, 03:09:49 AM
Last edit: May 02, 2017, 04:21:35 AM by dinofelis
 #20

how long would it take to burn through 255 addresses at only 3 TPS,

I think you haven't caught the real reason that 256 bit addresses are important.  Any N-bit address has only N/2 bits of security against collision, right?

So here is the interesting attack:  You give me your pubkey, and then I create my pubkey for a 2-of-2 (or some other more elaborate contract), and then we pay to the resulting address.

Oops.  In the background I did ~2^80 work and found a colliding address which didn't have the same policy, and I use it to steal the funds.

2^80 is a lot of work, but it isn't enough to be considered secure by current standards.


I do not entirely get that.  You need 2^80 work to find a collision, that is to say, to find 2 (new) pub keys that hash to the same address.  But how is this helping in this case, because I have no free choice of *counterparty's* pubkey ?   I could find two different pub keys that hashed to the same address, and use one of them in the contract, but what would I do with the other one ?  

I guess I don't have to tell you that in a collision attack, I cannot "fix" one of the hashes - that's a "second pre-image" attack, not a collision attack, and that doesn't suffer from the birthday paradox.

EDIT: ah, I got it: the collision is between the pub key you generate for the actual 2 - 2 contract, and another pub key you generate, that will result in the same final address but where you master all the conditions, so that in the end, you can entirely satisfy the conditions for the first contract, by using the "proofs" of the second, given that they result in the same final hash.  Right, indeed.

dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 02, 2017, 03:31:54 AM
 #21

there's a lot of other "wasting room" on the transaction, and the most important one to me is the hash of 256 bits indicating the transaction.  This is overkill.   There is also the 32-bit number indicating the "output number of the transaction" as if transactions would contain 4 billion outputs.   This is wasted space.
You are free to store transactions however you like and negoiate with your peers to send them however you like, there is no need for space to be wasted as a product of serialization.  Using a larger hash does take up more resources but the difference is pretty small compared to the rest of the transaction.

Well, the room for a full "transaction cycle" consists the room taken for the output on the transaction that has the UTXO, and the corresponding room for the input in the transaction that spends it.  When you look at it, this mainly consists of:

- output amount 8 bytes
- output script, about 25 bytes, of which 20 bytes are the output address (output)

- transaction hash (32 bytes) of the previous output transaction  (input)
- output number of this transaction (4 bytes) (input)
- scriptsig, containing the signature (~64 bytes) AND the pubkey (also, old version ~64 bytes, compact version ~32 bytes)
- 4 bytes "sequence number"

Total usage: about 169 bytes (in compact format)
Of these, 32 bytes (the pub key) are redundant, because one can restore the pub key from the signature and the message.

But:
If an address was only used once, then the transaction hash, and the output number are also redundant (one can find them), and we would save still 36 bytes more.   As such, in the most compact form, if addresses were used only once, we would save 68 bytes out of 169. 

However, the initial argument was that reducing the address from potentially 256 bits to 160 bits "to save space" cannot be correct, because of the wasteful usage of 256 bits of the transaction hash, and 32 bits of the output indication.  If bits were considered expensive, this is where one wasted many of them.
There's no fundamental reason to have more indications of transactions than of addresses for instance, and 32 bits for the output are also overkill.  8 bits would be sufficient I'd think.  Transactions with more than 256 outputs are not very common.
tomtomtom7
Jr. Member
*
Offline Offline

Activity: 38
Merit: 18


View Profile
May 02, 2017, 01:28:31 PM
 #22


So here is the interesting attack:  You give me your pubkey, and then I create my pubkey for a 2-of-2 (or some other more elaborate contract), and then we pay to the resulting address.

Oops.  In the background I did ~2^80 work and found a colliding address which didn't have the same policy, and I use it to steal the funds.

2^80 is a lot of work, but it isn't enough to be considered secure by current standards.

Forgive me, but I still don't quite get that. Where does the transaction with a different policy come from? If you only find two colliding addresses yourself, how can you use it for a contract that steals someone else's fund?

Could someone elaborate on this attack?

Besides, doesn't that require you to create 2^80 "proper" addresses. Thus 2^80 times keypair creation plus double hashing?

BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 02, 2017, 02:31:59 PM
 #23


So here is the interesting attack:  You give me your pubkey, and then I create my pubkey for a 2-of-2 (or some other more elaborate contract), and then we pay to the resulting address.

Oops.  In the background I did ~2^80 work and found a colliding address which didn't have the same policy, and I use it to steal the funds.

2^80 is a lot of work, but it isn't enough to be considered secure by current standards.

Forgive me, but I still don't quite get that. Where does the transaction with a different policy come from? If you only find two colliding addresses yourself, how can you use it for a contract that steals someone else's fund?

Could someone elaborate on this attack?

Besides, doesn't that require you to create 2^80 "proper" addresses. Thus 2^80 times keypair creation plus double hashing?


Did you see the final paragraph in this post?

https://bitcointalk.org/index.php?topic=1895455.msg18832108#msg18832108

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
tomtomtom7
Jr. Member
*
Offline Offline

Activity: 38
Merit: 18


View Profile
May 02, 2017, 03:26:12 PM
 #24


I did, but didn't get it, but maybe I do now on rereading: We're talking about a collision of two P2SH addresses. That makes sense.
dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 02, 2017, 06:03:09 PM
 #25


I did, but didn't get it, but maybe I do now on rereading: We're talking about a collision of two P2SH addresses. That makes sense.

Yes, it took me some time to understand that too.  The "lever arms" are a couple of private keys drawn from a set, that gives rise to a couple of public keys, to be combined with conditions (one is the counterparty's public key, the other is an own public key arbitrarily chosen of which one has the private key), giving rise to two P2SH hashes.  One only needs to test on average a set of 2^80 private keys to find such a couple that has identical such P2SH hashes.
Note that it is somewhat more involved than just testing 2^80 private keys ; one needs to store somehow these results to find out what couple has a collision after the fact.  
jonald_fyookball
Legendary
*
Offline Offline

Activity: 1302
Merit: 1004


Core dev leaves me neg feedback #abuse #political


View Profile
May 02, 2017, 06:16:59 PM
 #26


I did, but didn't get it, but maybe I do now on rereading: We're talking about a collision of two P2SH addresses. That makes sense.

Yes, it took me some time to understand that too.  The "lever arms" are a couple of private keys drawn from a set, that gives rise to a couple of public keys, to be combined with conditions (one is the counterparty's public key, the other is an own public key arbitrarily chosen of which one has the private key), giving rise to two P2SH hashes.  One only needs to test on average a set of 2^80 private keys to find such a couple that has identical such P2SH hashes.
Note that it is somewhat more involved than just testing 2^80 private keys ; one needs to store somehow these results to find out what couple has a collision after the fact.  


So once you do that, how does the attack work? How do you get the other party to use the compromised keys in the multisig?

DannyHamilton
Legendary
*
Offline Offline

Activity: 3388
Merit: 4616



View Profile
May 02, 2017, 06:24:23 PM
 #27

Yes, it took me some time to understand that too.  The "lever arms" are a couple of private keys drawn from a set, that gives rise to a couple of public keys, to be combined with conditions (one is the counterparty's public key, the other is an own public key arbitrarily chosen of which one has the private key), giving rise to two P2SH hashes.  One only needs to test on average a set of 2^80 private keys to find such a couple that has identical such P2SH hashes.
Note that it is somewhat more involved than just testing 2^80 private keys ; one needs to store somehow these results to find out what couple has a collision after the fact.  

I think I'm missing a piece here on how this attack only requires (on average) 280 attempts.


I can see that I can just generate a 1-of-1 P2SH script with a nonce, and then try MANY nonce values until I get an address that collides with the 2-of-2 contract address.  In that case, I think I can reduce my operations to hashing only, and not need to mess around with multiple private keys and generating public keys.

Once I find an address that collides, I no longer need the victim to sign anything.  I can sign with my private key and can broadcast my nonce based script in the input.  The 2-of-2 contract is no longer important.

What I don't fully understand is how I can expect to see a collision after only 280 attempts.  Wouldn't I need 2159 attempts just to have a 50% chance of success?

It seems likely that some other attack is being suggested by gmaxwell, but if it is then I've failed to understand how that attack would work.
BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 02, 2017, 06:34:09 PM
 #28

Danny,  did you read the OP describing the collision rate calculation?  I was about to pen the algorithm that would be needed.  Will do so soon.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
DannyHamilton
Legendary
*
Offline Offline

Activity: 3388
Merit: 4616



View Profile
May 02, 2017, 06:48:04 PM
 #29

Danny,  did you read the OP describing the collision rate calculation?

I did, but that won't work for gmaxwell's theoretical attack will it?

The "birthday problem" calculates the odds of ANY TWO addresses in a set matching EACH OTHER.

Unless I'm misunderstanding gmaxwell's post, he's talking about ONE address in a set (the work) matching a specific given address (the contract).

If I roll a 20-sided die until ANY number shows up more than once, we are looking at the "birthday problem".
(Odds improve with every roll until eventually odds are effectively 100%)

If I roll a 20-sided die once, and then roll it again until that initial number re-appears, then we are looking at gmaxwell's post.
(Odds remain at 5% on every roll, with an average of one success out of 20 rolls)

Am I missing something here?

In the "birthday problem" presented in the OP, a collision may occur after 283 attempts, but the odds of that collision being with an address that actually has bitcoins in it (or ever will) is extremely unlikely (since at any given time there won't be more than 251 addresses associated with a UTXO, and in reality significantly less).

In gmaxwell's post, I don't see how finding a collision that doesn't collide with the contract address helps?
jonald_fyookball
Legendary
*
Offline Offline

Activity: 1302
Merit: 1004


Core dev leaves me neg feedback #abuse #political


View Profile
May 02, 2017, 06:51:31 PM
 #30

Danny I was essentially asking the same thing: how do you get the address that you've already found a collision for, into the multisig agreement.

BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 02, 2017, 09:42:05 PM
Last edit: May 02, 2017, 09:55:34 PM by BurtW
 #31

Basic pseudo code implementation of algorithm to find hash collisions:

Quote
32 byte variable pri1;
32 byte variable pri2;
32 byte variable pubx;
32 byte variable puby;
20 byte variable hash;
boolean variable collided;

do {

   GenerateRandomPrivKey( &pri1 );
   CalculatePubKeyFromPriv( pri1, &pubx, &puby );
   CalculateHashOfPubKey( pubx, puby, &hash );

   collided = CheckForHitAndInsertIntoHashTable( hash, pri1 );

} while ( !collided );

pri2 = GetFromHashTable( hash );

output pri1 " and " pri2 " both hash to " hash

Everything here is easy except the storage of the previous results into the hash table.  Ideally for speed the hash table would contain space for 2160 private keys.

That is 2165 bytes!  Not possible.  The largest unit of data we even have today is the yottabyte (280 bytes) and it would take 285 of those to store that much data.

So, even though the loop would need to be run "only" about 283 (280 if you are lucky) times you would still need to store and search through all the previous results in order to detect the first collision due to the birthday problem/paradox described in the OP.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4158
Merit: 8382



View Profile WWW
May 03, 2017, 12:40:04 AM
 #32

The "birthday problem" calculates the odds of ANY TWO addresses in a set matching EACH OTHER.
Unless I'm misunderstanding gmaxwell's post, he's talking about ONE address in a set (the work) matching a specific given address (the contract).

I'm sorry if I was unclear.   You are not provided with an address. You are provided with a public key.   Now you construct many possible addresses looking for a pair where one is a plausible contract and the other lets you spend all the coins.

You show your counterparty one, then spend with the other.
DannyHamilton
Legendary
*
Offline Offline

Activity: 3388
Merit: 4616



View Profile
May 03, 2017, 01:58:38 AM
Last edit: May 03, 2017, 02:36:02 AM by DannyHamilton
 #33

I'm sorry if I was unclear.   You are not provided with an address. You are provided with a public key.   Now you construct many possible addresses looking for a pair where one is a plausible contract and the other lets you spend all the coins.

You show your counterparty one, then spend with the other.

gmaxwell,

I have a lot of respect for your knowledge in this area, so I'm pretty sure I'm the one who still isn't understanding.

Edit: Perhaps I am beginning to understand?  By the time I finished writing this post I realized that the 2-set method might not be as difficult as I thought.  My initial assumption (in my earlier post) was 1 address in one set, and multiple addresses in the other set until there was a collision with the single-address set. As I wrote this post it dawned on me that it might be more efficient to generate equal numbers of addresses in each set until one set collided with the other.

Using the public key that I'm provided with, I can generate MANY plausible contracts until two of them collide with each other.  However, this isn't useful, since I won't be able to use either to spend the coins.

I could also generate MANY addresses that let me spend all the coins until two of them collide with each other.  However, this isn't useful, since I won't be able to show my counterparty either of them.

After some thought, I suppose I'd need to generate 2 sets.  I can generate MANY addresses in one set that let me spend all the coins , and then generate MANY plausible contracts in the other set.  I'd have to continue this until an address in set 1 collides with an address in set 2.

I'm not 100% certain on the maths, but (depending on how much more effort the contract generation requires) while that seems to require less effort than my initial example (where the "plausible contracts" set only has 1 address), it would seem to require more effort than the "birthday problem".  My intuition is that on average it requires twice as many P2SH addresses to be generated as the "birthday problem" would (since each new address in one set is effectively a new potential match in the other set)?  If that's right, then I suppose that means it's just one extra "bit" of security over the birthday problem?
dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 03, 2017, 07:50:42 AM
 #34


I did, but didn't get it, but maybe I do now on rereading: We're talking about a collision of two P2SH addresses. That makes sense.

Yes, it took me some time to understand that too.  The "lever arms" are a couple of private keys drawn from a set, that gives rise to a couple of public keys, to be combined with conditions (one is the counterparty's public key, the other is an own public key arbitrarily chosen of which one has the private key), giving rise to two P2SH hashes.  One only needs to test on average a set of 2^80 private keys to find such a couple that has identical such P2SH hashes.
Note that it is somewhat more involved than just testing 2^80 private keys ; one needs to store somehow these results to find out what couple has a collision after the fact.  


So once you do that, how does the attack work? How do you get the other party to use the compromised keys in the multisig?

The whole point is: you don't need a multisig to get paid out !  I didn't immediately realize this either, but the whole principle of bitcoin is that in order to have the "spending right" of an UTXO, you have to solve a puzzle of which the "question" hashes to the output address of that UTXO.  In a simple transaction, that puzzle is "make a signature that corresponds to the public key that is this hash".  In a 2-2 contract, however, that puzzle is whatever hashes to the given hash ; one such puzzle is the intended multisig: "make a signature that corresponds to the first public key, and make another signature that corresponds to the second public key".
But if you can find *another* puzzle description that hashes to the same hash, the solution to that other puzzle ALSO satisfies the spending requirement of that UTXO.  That other puzzle has nothing to do with the first guy's public key. 
You see, the explicit requirement is not present in the block chain: only its hash is.  So whatever requirement that hashes to the same hash, can be considered as the "true requirement".

Compare it to the following situation: you buy a house, and instead of registering the whole act of sales, you only register its HASH with the notary.   The notary knows that this house goes with that hash, that's all.  So anyone that can write another act of sale, that hashes to the same hash, can act as the owner of the house, the notary will agree, and will let him sell the house while you don't even know it.

Now, the nasty thing with a double signature, is that the guy providing HIS signature has a lever on the true document, and is hence able to find a collision with a document entirely of his making.  This is what reduces the 160 bit second-pre-image security to 80 bit collision security.

DannyHamilton
Legendary
*
Offline Offline

Activity: 3388
Merit: 4616



View Profile
May 03, 2017, 08:01:10 AM
 #35

- snip -
Now, the nasty thing with a double signature, is that the guy providing HIS signature has a lever on the true document, and is hence able to find a collision with a document entirely of his making.  This is what reduces the 160 bit second-pre-image security to 80 bit collision security.

Am I right in assuming that this reduction in security is because the attacker can generate 279 reasonable looking 2-of-2 contracts (and their associated P2SH addresses), and then generate 279 single-signature P2SH addresses, and in doing so would have an extremely high probability of finding an address in the set of 2-of-2 contracts that collides with one of the single-signature P2SH addresses?

279 contracts + 279 single-signature P2SH addresses = 280 generations.

Or more specifically, that the attacker can:
  • 1. Generate one 2-of-2 contract and one single-signature P2SH addresses and see if they collide...
  • 2. Then generate an additional 2-of-2 contract and see if it collides with ANY of the single-signature P2SH addresses generated so far
  • 3. Then generate an additional single-signature P2SH addresse and see if it collides with ANY of the 2-of-2 contracts generated so far
  • 4. Repeat steps 2 and 3 until a collision is found

And that in doing so they will succeed, on average, after repeating steps two and three 280 times (although they could get lucky and collide sooner, or get unlucky and collide much later).

Is that the risk here?
gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4158
Merit: 8382



View Profile WWW
May 03, 2017, 08:24:34 AM
 #36

My intuition is that on average it requires twice as many P2SH addresses to be generated as the "birthday problem" would (since each new address in one set is effectively a new potential match in the other set)?
You've got it.

You should generally think of birthday problem numbers as approximate in any case,  because to achieve the bound you need lots of storage.  A cost optimized solver will make some time/memory tradeoff and usually end up using a couple times more computation than the birthday number in exchange for little to no storage.  (google floyds cycle finding algorithim)

And yes, thats the risk-- that your counterparties degree of freedom in choosing part of the contract will let them find an alternative contract with only collision like work, rather than with second-preimage like work.

You can mitigate by having multiple rounds of communication with commitments, but few to no one will implement this in practice:  Each communication round is a huge software engineering and UI cost, and most people don't understand this collision vulnerability (or _any_ collision vulnerability) even after having it explained.  The longer hash should also be somewhat more secure against second preimages assuming our hash functions become weakened by attacks in the future.

[Basically _any_ time I've seen a collision come up in discussions about the protocols people-- even people who should know better-- get them wrong. Someone on Reddit just the other day argued that a 64-bit collision would take 2^64 work, so I generated one based on his message.. then he argued that I got really lucky to have managed to produce one using only a 32-bit nonce and couldn't provide another, so I did... Tongue  they're highly unintuitive to people)
dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 03, 2017, 08:25:44 AM
 #37

- snip -
Now, the nasty thing with a double signature, is that the guy providing HIS signature has a lever on the true document, and is hence able to find a collision with a document entirely of his making.  This is what reduces the 160 bit second-pre-image security to 80 bit collision security.

Am I right in assuming that this reduction in security is because the attacker can generate 279 reasonable looking 2-of-2 contracts (and their associated P2SH addresses), and then generate 279 single-signature P2SH addresses, and in doing so would have an extremely high probability of finding an address in the set of 2-of-2 contracts that collides with one of the single-signature P2SH addresses?

279 contracts + 279 single-signature P2SH addresses = 280 generations.

Or more specifically, that the attacker can:
  • 1. Generate one 2-of-2 contract and one single-signature P2SH addresses and see if they collide...
  • 2. Then generate an additional 2-of-2 contract and see if it collides with ANY of the single-signature P2SH addresses generated so far
  • 3. Then generate an additional single-signature P2SH addresse and see if it collides with ANY of the 2-of-2 contracts generated so far
  • 4. Repeat steps 2 and 3 until a collision is found

And that in doing so they will succeed, on average, after repeating steps two and three 280 times (although they could get lucky and collide sooner, or get unlucky and collide much later).

Is that the risk here?

Yes. Up to a few factors of 2, we're talking orders of magnitude here, not an exact amount of trials.
DannyHamilton
Legendary
*
Offline Offline

Activity: 3388
Merit: 4616



View Profile
May 03, 2017, 09:20:26 AM
 #38

- snip -
And yes, thats the risk-- that your counterparties degree of freedom in choosing part of the contract will let them find an alternative contract with only collision like work, rather than with second-preimage like work.

You can mitigate by having multiple rounds of communication with commitments, but few to no one will implement this in practice:  Each communication round is a huge software engineering and UI cost, and most people don't understand this collision vulnerability (or _any_ collision vulnerability) even after having it explained.
- snip -

So, the solution to this (for me) is to insist that I'm the one that generates the contract address.  This removes the counterparty's ability to engage in this attack.

If the counterparty is aware of the risk and doesn't have reason to trust me, then their only recourse is to offer the "multiple rounds of communication with commitments".

Since in practice "no one will implement this" AND "most people don't understand this collision vulnerability", the odds are that I can almost always get away with being the one to generate the contract address every time (unless I'm engaged in a transaction with gmaxwell, since he understands enough to know better).  Someone may eventually fall victim to this, but I now understand enough to keep myself safe (from this particular attack).
dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
May 03, 2017, 09:31:32 AM
 #39

- snip -
And yes, thats the risk-- that your counterparties degree of freedom in choosing part of the contract will let them find an alternative contract with only collision like work, rather than with second-preimage like work.

You can mitigate by having multiple rounds of communication with commitments, but few to no one will implement this in practice:  Each communication round is a huge software engineering and UI cost, and most people don't understand this collision vulnerability (or _any_ collision vulnerability) even after having it explained.
- snip -

So, the solution to this (for me) is to insist that I'm the one that generates the contract address.  This removes the counterparty's ability to engage in this attack.

If the counterparty is aware of the risk and doesn't have reason to trust me, then their only recourse is to offer the "multiple rounds of communication with commitments".

Since in practice "no one will implement this" AND "most people don't understand this collision vulnerability", the odds are that I can almost always get away with being the one to generate the contract address every time (unless I'm engaged in a transaction with gmaxwell, since he understands enough to know better).  Someone may eventually fall victim to this, but I now understand enough to keep myself safe (from this particular attack).

... and invest in it to attack others Smiley
BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
May 03, 2017, 09:46:50 AM
 #40

Greg, you said:

this thread has a lot of hot air,  segwit p2wsh uses 256-bit hashes, and there is a 256bit address format proposed.

Is the thread over now?

So, assuming this proposal was adopted then you could choose to use a 256 bit hash on a contract and therefore increase the security to 128 bits.  It seems to me that this would be wise for a high dollar contract and would make it incredibly difficult to carry out the attach that has been brought up in this conversation.  Correct?

Also, It looks like the new 256 bit hash format is part of the current segwit proposal, correct?

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4158
Merit: 8382



View Profile WWW
May 03, 2017, 11:02:38 AM
 #41

Also, It looks like the new 256 bit hash format is part of the current segwit proposal, correct?
Yes, Segwit script v0 P2WSH (the p2sh for segwit) uses 256bit SHA2; both boosting the security to 128 bits, and removing ripemd160 from the potential locations of weakness.

There is a 160 bit supported, but only for single key checksig only. Basically so that the very common single key case that gains little from the 256bit hash isn't any longer than non-segwit.
addrstore.com
Newbie
*
Offline Offline

Activity: 45
Merit: 0


View Profile WWW
June 20, 2017, 08:15:28 AM
Last edit: June 20, 2017, 08:30:43 AM by addrstore.com
 #42

The author, there is no need to go to the 256-bit address. 160 bits is enough to ensure the security of the address.

The problem is completely contrived.

In order to get a collision with a probability of 1.77636E-15 (and this probability is negligible, there is a higher chance of winning the lottery) there must be a blockchain in which 2 ^ 56 addresses will be used, each address being 20 bytes (160 bit). In total, the blockchain with such an amount of addresses will occupy not less than 2 ^ 56 * 20 = 1441151 TB (this is the lowest estimate), by the way, at the moment, the blockchain is only 126 GB.
BurtW (OP)
Legendary
*
Offline Offline

Activity: 2646
Merit: 1131

All paid signature campaigns should be banned.


View Profile WWW
June 20, 2017, 02:39:13 PM
 #43

The author, there is no need to go to the 256-bit address. 160 bits is enough to ensure the security of the address.

The problem is completely contrived.

In order to get a collision with a probability of 1.77636E-15 (and this probability is negligible, there is a higher chance of winning the lottery) there must be a blockchain in which 2 ^ 56 addresses will be used, each address being 20 bytes (160 bit). In total, the blockchain with such an amount of addresses will occupy not less than 2 ^ 56 * 20 = 1441151 TB (this is the lowest estimate), by the way, at the moment, the blockchain is only 126 GB.
When you read the thread did you see this post:

There are only 2,100,000,000,000,000 satoshis.

Therefore the maximum number of Bitcoins addresses that can contain value at any one time is only about 251 and that assumes only one satoshi per address.

At the time the block chain contains 255 transactions it would be somewhere between a petabyte and an exabyte in size.  So a full node would need a multi petabyte drive to store the entire block chain.

Your estimate for block chain size is interesting.  I will ponder that.  Thanks.

Our family was terrorized by Homeland Security.  Read all about it here:  http://www.jmwagner.com/ and http://www.burtw.com/  Any donations to help us recover from the $300,000 in legal fees and forced donations to the Federal Asset Forfeiture slush fund are greatly appreciated!
DannyHamilton
Legendary
*
Offline Offline

Activity: 3388
Merit: 4616



View Profile
June 20, 2017, 03:06:35 PM
 #44

The author, there is no need to go to the 256-bit address. 160 bits is enough to ensure the security of the address.

The problem is completely contrived.

In order to get a collision with a probability of 1.77636E-15 (and this probability is negligible, there is a higher chance of winning the lottery) there must be a blockchain in which 2 ^ 56 addresses will be used, each address being 20 bytes (160 bit). In total, the blockchain with such an amount of addresses will occupy not less than 2 ^ 56 * 20 = 1441151 TB (this is the lowest estimate), by the way, at the moment, the blockchain is only 126 GB.

Did you even think about reading the thread before wasting your time writing your post?

There is a bigger concern with 160 bit P2SH addresses than a collision with an address in the blockchain.  The concern is that an attacker can create two separate addresses that BOTH DO NOT exist in the blockchain, but which collide with each other.  With 160 bit addresses, there is a reasonably good chance that an attacker could do so within 2^80 attempts.

While 2^80 is a big number and probably more than you're going to do with your CPU today, it isn't big enough to be considered "secure".

As for the storage requirements:

- snip -
A cost optimized solver will make some time/memory tradeoff and usually end up using a couple times more computation . . . in exchange for little to no storage.  (google floyds cycle finding algorithim)

And yes, thats the risk-- that your counterparties degree of freedom in choosing part of the contract will let them find an alternative contract with only collision like work, rather than with second-preimage like work.
- snip -
addrstore.com
Newbie
*
Offline Offline

Activity: 45
Merit: 0


View Profile WWW
June 21, 2017, 08:39:19 AM
Last edit: June 21, 2017, 02:26:43 PM by addrstore.com
 #45


With 160 bit addresses, there is a reasonably good chance that an attacker could do so within 2^80 attempts.


What good chance are you talking about?

You can take specific numbers and count.

In order to get collision with a 40% chance on a set of 2 ^ 80 addresses, you need a memory size of 2.4 * 10 ^ 25 bytes. And even applying algorithms that use a trade-off between memory and calculation time - it will still be a huge size. All the Internet for the time of its existence is estimated only in 10 ^ 24 bytes (https://www.livescience.com/54094-how-big-is-the-internet.html), i.e. You will need a memory capacity like 200 modern Internet, I'm not talking about computing power, which will also be needed. Hence, we can conclude that in practice we will not be able to find a collision at the current time and for the next 15-20 years it will be impossible.

Quantum computers will appear faster because of which it will be necessary to change all modern cryptography and hashing methods..
dinofelis
Hero Member
*****
Offline Offline

Activity: 770
Merit: 629


View Profile
June 21, 2017, 09:43:11 AM
 #46


With 160 bit addresses, there is a reasonably good chance that an attacker could do so within 2^80 attempts.


What good chance are you talking about?

You can take specific numbers and count.


Well, you are making the case for 80 bits security level.  Bitcoin set out to have a 128 bit security level.  Once that principle is adopted (for good or bad reasons), it is good to stick to it.  If you don't, you have inconsistent security requirements, and inconsistent burdens.  Some calculational and storage burden would be due to people sticking with 128 bit security level, while other parts would only give you 80 bits security level and punch holes in the burden that was paid for elsewhere.  I can agree with you that maybe one should have put bitcoin's security at only 80 bits level.  That would have shortened quite a lot of things (room on chain, processing capacity, network capacity etc....). But Satoshi put it at 128 bits level (that's why the ECDS keys are 256 bit length).  Once you go for that, you have to keep it uniform, or you have inconsistent safety and burden.
aliashraf
Legendary
*
Offline Offline

Activity: 1456
Merit: 1174

Always remember the cause!


View Profile WWW
June 21, 2017, 04:35:11 PM
Last edit: June 22, 2017, 09:38:01 AM by aliashraf
 #47


2^80 is a lot of work, but it isn't enough to be considered secure by current standards.


Actually it is more than enough:
If all the parties of the bitcoin network with the current hash power of (almost) 2^62 H/s reach a consensus and decide to just find one single 160 bit collided pair of keys, they have to devote their total power for the next 1000+ years or so  to do the job. (Note: we have to run 2 hash functions in each test).

Which 'current standards' are you mentioning in this context?

I think there is a point that has been missed out: Any security consideration should be taken in response to the extent of the possible threats which are always proportional to the incentives.

In this specific case it is mathematically possible for a hypothetical very powerful attacker to spend trillions of dollars and a significant amount of time to collide with a 160 bit random (absolutely random) address and disrupt a single contract: my lesson? never put trillions dollars worth of assets on a single contract on top of the bitcoin blockchain, or make sure there is not a single sensitive dependency to the 160 bit hashed key addresses in the policy .

gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4158
Merit: 8382



View Profile WWW
June 23, 2017, 06:13:47 PM
 #48

2^80 is a lot of work, but it isn't enough to be considered secure by current standards.

Actually it is more than enough:
If all the parties of the bitcoin network with the current hash power of (almost) 2^62 H/s reach a consensus and decide to just find one single 160 bit collided pair of keys, they have to devote their total power for the next 1000+ years or so  to do the job. (Note: we have to run 2 hash functions in each test).

Check your math. 2^62 hashes per second does 2^80 work in 2^18 seconds.   That is three days.

you need a memory size of 2.4 * 10 ^ 25 bytes. And even applying algorithms that use a trade-off between memory and calculation time - it will still be a huge size.
No it won't.  Collisions can be found an an effectively storageless manner with a small constant factor slowdown, I gave google terms upthread.

You're suffering from the same ignorance that caused the collision design flaw in "xthin" where they were claiming that collisions of a 64-bit hash were infeasible to compute due to storage requirements. (which I eventually eventually grew tired of correcting and started responding to all the messages with 64-bit sha2 collisions.)
aliashraf
Legendary
*
Offline Offline

Activity: 1456
Merit: 1174

Always remember the cause!


View Profile WWW
June 23, 2017, 08:10:00 PM
 #49

2^80 is a lot of work, but it isn't enough to be considered secure by current standards.

Actually it is more than enough:
If all the parties of the bitcoin network with the current hash power of (almost) 2^62 H/s reach a consensus and decide to just find one single 160 bit collided pair of keys, they have to devote their total power for the next 1000+ years or so  to do the job. (Note: we have to run 2 hash functions in each test).

Check your math. 2^62 hashes per second does 2^80 work in 2^18 seconds.   That is three days.


OK. My bad. But the main argument remains valid: It takes a definitely huge amount of resources and time to collide with 168 key hashed addresses and for the near future the cost of such an attack can not be justified while bitcoin block chain is a monetary infrastructure and the attacker MUST justify his/her costs.

I insist you and other guys pay more attention here: We are not discussing security as a general problem. It is not about sensitive data related to nuclear missiles or family scandals, data with infinite or undecidable values. Our context is securing contracts on top of the Blockchain.

As of 160 bit hashed address key problem, the hypothetical attack can compromise a contract on the Blockchain, but nobody puts such a huge amount of money on a single contract. He/she should not do this and Bitcoin is not and have not to be adequate for such a contract.


gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4158
Merit: 8382



View Profile WWW
June 24, 2017, 12:18:42 AM
 #50

Yes, 2^80 is still a lot of work-- which is why I didn't just do it to make an example in a response. Smiley

But it is not an infeasible amount of work: 2^80 work by Bitcoin is paid for by less than three days of mining income-- $14,580,000-ish.

But who are you to decide that the maximum value of of a Bitcoin exchange should be limited to under (say) $15M?  Personally, I'm gunning for the day where a single Bitcoin is worth that much. Tongue (otherwise, I dunno how I'm going to finance both a space elevator and the invention of human immortality)-- and what about next year?   With 2^80 work perfectly achievable though expensive now  where will that cost be next year and the year after?  We saw with P2SH that it takes _years_ to adopt a new address type in Bitcoin.  "By 2025 a children's speak and spell could crack it"  (perhaps not, but the point remains)

Being weak in this way is just fodder for conspiracy theories and arguments that Bitcoin will lose its value in the future. This sort of weakness makes life harder for application developers since there are more corner cases which they cannot dismiss as categorically infeasible. Between these, security against near future decreases in computational costs, and security for large values-- 12 bytes of additional public key size is a pretty low price to pay.

Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
June 24, 2017, 12:28:00 AM
 #51

In order to get collision with a 40% chance on a set of 2 ^ 80 addresses, you need a memory size of 2.4 * 10 ^ 25 bytes. And even applying algorithms that use a trade-off between memory and calculation time - it will still be a huge size.

That's not correct. Floyd's cycle-finding algorithm can be used to find colliding script hashes with O(1) memory, for just a factor 3 slowdown.

I do Bitcoin stuff.
DannyHamilton
Legendary
*
Offline Offline

Activity: 3388
Merit: 4616



View Profile
June 24, 2017, 05:18:46 AM
 #52

- snip -
- snip -
- snip -

Wow, what a flashback of familiar people.

Burt, Greg, and Pieter all in the same thread?  Feels like I'm back in late 2011, or early 2012.

Bitcointalk hasn't been this educational and interesting in ages.
aliashraf
Legendary
*
Offline Offline

Activity: 1456
Merit: 1174

Always remember the cause!


View Profile WWW
June 24, 2017, 05:19:46 PM
 #53

Yes, 2^80 is still a lot of work-- which is why I didn't just do it to make an example in a response. Smiley

But who are you to decide ...
I Ignore the language  Lips sealed
Quote
... that  the maximum value of of a Bitcoin exchange should be limited to under (say) $15M?  

It is  not about 'exchange' ... Roll Eyes
With all due respects, you are exaggerating too much.
I think you haven't caught the real reason that 256 bit addresses are important.  Any N-bit address has only N/2 bits of security against collision, right?
Wrong! They are n/2 secure against a birthday attack in applications sensitive to birthday attack! This is it, nothing more. No generalizations please.

I appreciate your _discovery_ about vulnerabilities in bitcoin's 2-of-2 contracts and alike.

It is definitely a good hint but it has nothing to do with 'exchange' as a general concept.

Thanks to your (and other guys') posts here, one should definitively conclude that, people can 'exchange' trillions of dollars in a single bitcoin transaction and remain safe and secure but they can _not_ put multimillion dollar assets  on 2-of-2 contracts in a 'naive' way.

By 'naive' I mean giving HIS right to an anonymous counter-party without using work around procedures like multiple communications proportional to the value of asset under contract.

Your points implying that people do not understand or do not follow hints, is not acceptable in this context, as we are not discussing simple low stake daily trades. Ignorant people never sign a multi million dollar contract without consultation and following the most secure procedures. In such a trade they will deliberately go through exhaustive communications that can escalate attack costs enough to de-incentivize it.

So I come to my final conclusion: Everything is OK with the current 160 bit hashed addresses and nothing will be compromised in the real world, never ever.

As I understand, you are bolding this vulnerability to justify SW once more, this time as a bonus for dealing with an attack that never gonna happen. Tongue Am I right?



addrstore.com
Newbie
*
Offline Offline

Activity: 45
Merit: 0


View Profile WWW
June 28, 2017, 06:03:22 AM
Last edit: June 28, 2017, 06:50:47 AM by addrstore.com
 #54


you need a memory size of 2.4 * 10 ^ 25 bytes. And even applying algorithms that use a trade-off between memory and calculation time - it will still be a huge size.
No it won't.  Collisions can be found an an effectively storageless manner with a small constant factor slowdown, I gave google terms upthread.

You're suffering from the same ignorance that caused the collision design flaw in "xthin" where they were claiming that collisions of a 64-bit hash were infeasible to compute due to storage requirements. (which I eventually eventually grew tired of correcting and started responding to all the messages with 64-bit sha2 collisions.)

You want to say that performing a hash according to the scheme:

Hash1 = RIPEMD160(SHA256(Data1))
Hash2 = RIPEMD160(SHA256(Data2))

You can find such Data1 and Data2, in which the first 64 bits in hash1 and hash2 will be the same without using the birthdays attack and bruteforce? And you will not spend much memory?

Can I see the algorithm (GitHub) and proof of work? I want to try it myself. How much time will it take to calculate this?
addrstore.com
Newbie
*
Offline Offline

Activity: 45
Merit: 0


View Profile WWW
June 28, 2017, 06:29:30 AM
 #55

In order to get collision with a 40% chance on a set of 2 ^ 80 addresses, you need a memory size of 2.4 * 10 ^ 25 bytes. And even applying algorithms that use a trade-off between memory and calculation time - it will still be a huge size.

That's not correct. Floyd's cycle-finding algorithm can be used to find colliding script hashes with O(1) memory, for just a factor 3 slowdown.

If I understand correctly on Wikipedia, this is a pointer algorithm, it uses O(1) memory, but the data structure itself, for which pointers will be created how much will it occupy?
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1083


View Profile
June 28, 2017, 10:34:04 AM
 #56

If I understand correctly on Wikipedia, this is a pointer algorithm, it uses O(1) memory, but the data structure itself, for which pointers will be created how much will it occupy?

You don't store it at all.

You start with some seed for x[0] and then compute x[1] and x[2] and so on.

x[n + 1] = Hash(x[n])

After around 280 steps, x[n] collides with one of the previous values.  This creates a loop.

Code:
*   A -> B -> C -> D -> E .
*             ^            \
*             |             v
*             J             F
*              \           /
*               I <- H <- G

So, compute H = x[280] and then start from x[0] again.  If you hit H before you get to 280, then you know you have a loop.  Continue until you hit H a second time and that gives you the length of the loop.

Once you have the length of the loop and the location(s) of H, you can compute where merge point is.  In the diagram that is the positions of J and B.  Run a third time, and store J and B, you now how 2 values which hash to the same value.

To steal money, you actually need to have 2 types of hash.  The fair and fraudulent one.

Based on LSB of x[n]

EVEN: x[n + 1] = Hash(multisig with [Alice's public key & x[n] as Eve's public key])
ODD: x[n + 1] = Hash(checksig with [Eve's public key] + DROP(x[n]))

In the multisig, Eve doesn't even need to have the private key.  She isn't planning to use that version anyway.

Assume capitals are "fair" and lower case is "fraud", the diagram would change to

Code:
*   a -> B -> c -> D -> e .
*             ^            \
*             |             v
*             J             f
*              \           /
*               I <- H <- g

In this case, B and J are both fair.  So the collision is useless.

Eve would try again with another seed.  She needs to get one fair and one fraud.

There is a 50% chance she gets a mix, so on average she will need 2 seeds.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
addrstore.com
Newbie
*
Offline Offline

Activity: 45
Merit: 0


View Profile WWW
June 28, 2017, 12:48:24 PM
 #57

Thanks. A beautiful and detailed answer. Now I agree that for p2sh addresses there really is a need to use a longer hash.
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1083


View Profile
June 28, 2017, 04:22:47 PM
Last edit: June 28, 2017, 06:23:09 PM by TierNolan
 #58

Thanks. A beautiful and detailed answer. Now I agree that for p2sh addresses there really is a need to use a longer hash.

20 bytes is enough for someone who is creating their own address from a private key and doesn't involve someone else.  You get 160 bit security in that case (which is higher than the 128 bit security from the signature algorithm).

One size fits all means that people don't accidentally use the wrong hash size in the cases where there is a weakness.  It also reduces code complexity.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
Pages: 1 2 3 [All]
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!