Bitcoin Forum
April 25, 2024, 04:55:51 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1] 2 3 »  All
  Print  
Author Topic: Chances of a collision  (Read 4706 times)
Grumpster (OP)
Full Member
***
Offline Offline

Activity: 210
Merit: 100


View Profile
March 19, 2015, 10:55:15 PM
 #1

Has anybody been able to work out the fractional chances of a collision with an existing bitcoin address?

Has anybody got any verifiable proof that it's been done in the past?

What hardware would be capable of even getting close to being able to generate enough addresses to cause a collision, and how would somebody trying to cause a collision notice that they've succeeded?
1714064151
Hero Member
*
Offline Offline

Posts: 1714064151

View Profile Personal Message (Offline)

Ignore
1714064151
Reply with quote  #2

1714064151
Report to moderator
1714064151
Hero Member
*
Offline Offline

Posts: 1714064151

View Profile Personal Message (Offline)

Ignore
1714064151
Reply with quote  #2

1714064151
Report to moderator
Even in the event that an attacker gains more than 50% of the network's computational power, only transactions sent by the attacker could be reversed or double-spent. The network would not be destroyed.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1714064151
Hero Member
*
Offline Offline

Posts: 1714064151

View Profile Personal Message (Offline)

Ignore
1714064151
Reply with quote  #2

1714064151
Report to moderator
1714064151
Hero Member
*
Offline Offline

Posts: 1714064151

View Profile Personal Message (Offline)

Ignore
1714064151
Reply with quote  #2

1714064151
Report to moderator
cr1776
Legendary
*
Offline Offline

Activity: 4018
Merit: 1299


View Profile
March 19, 2015, 11:12:25 PM
 #2

Has anybody been able to work out the fractional chances of a collision with an existing bitcoin address?

Has anybody got any verifiable proof that it's been done in the past?

What hardware would be capable of even getting close to being able to generate enough addresses to cause a collision, and how would somebody trying to cause a collision notice that they've succeeded?
1. Yes, the math has been done and discussed.
Edit: see eg
https://bitcointalk.org/index.php?topic=104461.0
https://bitcointalk.org/index.php?topic=62.0
2. No (assuming you have a working RNG and non-buggy software.
3. None. If the address had been used they would see it.
Grumpster (OP)
Full Member
***
Offline Offline

Activity: 210
Merit: 100


View Profile
March 19, 2015, 11:14:18 PM
 #3

1. Yes, the math has been done and discussed.

Just in case it wasn't quite clear, I was hoping for somebody to post what those chances actually were, or a link to a relevant and verified discussion with actual information on the topic. So far all I've come across is speculation about it.
cr1776
Legendary
*
Offline Offline

Activity: 4018
Merit: 1299


View Profile
March 19, 2015, 11:16:16 PM
 #4

1. Yes, the math has been done and discussed.

Just in case it wasn't quite clear, I was hoping for somebody to post what those chances actually were, or a link to a relevant and verified discussion with actual information on the topic. So far all I've come across is speculation about it.

I was looking them up, if you search for "bitcoin address collision" you'll see links - like the two above.
:-)
shorena
Copper Member
Legendary
*
Offline Offline

Activity: 1498
Merit: 1499


No I dont escrow anymore.


View Profile WWW
March 19, 2015, 11:17:03 PM
 #5

1. Yes, the math has been done and discussed.

Just in case it wasn't quite clear, I was hoping for somebody to post what those chances actually were, or a link to a relevant and verified discussion with actual information on the topic. So far all I've come across is speculation about it.

It boils down to the birthday paradoxon with 1 in 2160 instead of 1 in 365. The number of addresses that have a balance is probably among bc.i's stats.

1. Yes, the math has been done and discussed.

Just in case it wasn't quite clear, I was hoping for somebody to post what those chances actually were, or a link to a relevant and verified discussion with actual information on the topic. So far all I've come across is speculation about it.

I was looking them up, if you search for "bitcoin address collision" you'll see links - like the two above.
:-)


here is another one -> https://bitcointalk.org/index.php?topic=633308.0

Im not really here, its just your imagination.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1002



View Profile
March 20, 2015, 07:30:21 AM
Last edit: March 20, 2015, 07:52:23 AM by teukon
 #6

Has anybody been able to work out the fractional chances of a collision with an existing bitcoin address?

With a few basic cryptographic assumptions, the chance of a single randomly (uniformly) generated Bitcoin address colliding with an existing address is simply: [number of existent addresses] / 2160.

At the moment of writing (block #348386), there are 66 214 306 distinct addresses recorded in the blockchain (93.5% of which are empty by the way).  This yields a probability of about 4.5 * 10-41 (about 1 in 22 000 trillion trillion trillion).

Has anybody got any verifiable proof that it's been done in the past?

Not to my knowledge.

What hardware would be capable of even getting close to being able to generate enough addresses to cause a collision, and how would somebody trying to cause a collision notice that they've succeeded?

Such hardware would need to be able to generate close to 2160 / [number of existent addresses].

A decent CPU might manage to generate 500 000 addresses per second.  Working with a companion GPU well-suited to the task and you might be looking at 20 million addresses per second.

"What about ASICs?" I hear you ask.  Well, let us suppose that you could design a machine a billion times faster than this for the same mass of hardware and power consumption.  We'll assume the machine can also check addresses against the relatively tiny pre-set list of existing ones as fast as the new ones are generated.

A cluster of these machines capable of "getting close" (lets say having a fair chance of generating a collision within a year) will draw about as much power as consumed in all other human activities combined and weigh about as much as Mount Everest.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1002



View Profile
March 20, 2015, 07:48:37 AM
 #7

Just in case it wasn't quite clear, I was hoping for somebody to post what those chances actually were, or a link to a relevant and verified discussion with actual information on the topic. So far all I've come across is speculation about it.

It boils down to the birthday paradoxon with 1 in 2160 instead of 1 in 365. The number of addresses that have a balance is probably among bc.i's stats.

I understood Grumpster to be asking about colliding with one of the currently existing addresses rather generating two addresses which collide with one another.  If we're just looking for two distinct private keys which yield the same address then indeed the Birthday Problem applies.  The average number of addresses you'd need to generate to find such a beast is approximately 1.5 trillion trillion (a little bit more than square_root(2160)).

This problem is far closer to tractable.  All we'd need is about 2 billion gaming rigs running for a year, each with about 20 000 terrabytes of storage.
lophie
Hero Member
*****
Offline Offline

Activity: 924
Merit: 1001

Unlimited Free Crypto


View Profile
March 20, 2015, 09:41:22 AM
 #8

Just in case it wasn't quite clear, I was hoping for somebody to post what those chances actually were, or a link to a relevant and verified discussion with actual information on the topic. So far all I've come across is speculation about it.

It boils down to the birthday paradoxon with 1 in 2160 instead of 1 in 365. The number of addresses that have a balance is probably among bc.i's stats.

I understood Grumpster to be asking about colliding with one of the currently existing addresses rather generating two addresses which collide with one another.  If we're just looking for two distinct private keys which yield the same address then indeed the Birthday Problem applies.  The average number of addresses you'd need to generate to find such a beast is approximately 1.5 trillion trillion (a little bit more than square_root(2160)).

This problem is far closer to tractable.  All we'd need is about 2 billion gaming rigs running for a year, each with about 20 000 terrabytes of storage.

And that is not mentioning the power drain for this!

Will take me a while to climb up again, But where is a will, there is a way...
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1002



View Profile
March 20, 2015, 10:12:28 AM
Last edit: March 20, 2015, 11:08:49 AM by teukon
 #9

This problem is far closer to tractable.  All we'd need is about 2 billion gaming rigs running for a year, each with about 20 000 terrabytes of storage.

And that is not mentioning the power drain for this!

Hmm...  Maybe about 30 exajoules (roughly 8000 terrawatt hours).  This is assuming we can be clever about how matching pairs are checked for and most of the drives can be powered down for most of the year.

Switching off both China and the US and diverting the power to the cluster should be sufficient.
lophie
Hero Member
*****
Offline Offline

Activity: 924
Merit: 1001

Unlimited Free Crypto


View Profile
March 20, 2015, 10:38:28 AM
 #10

This problem is far closer to tractable.  All we'd need is about 2 billion gaming rigs running for a year, each with about 20 000 terrabytes of storage.

And that is not mentioning the power drain for this!

Hmm...  Maybe about 30 exajoules (roughly 8000 terrawatt hours).  This is assuming we can be clever about how matching pairs are checked for and most of the drives can be powered down for most of the year.

Switching off both China and the US and diverting the power to the cluster should be sufficient.

And do not forget this people all this is just t point out that Bitcoin is no impossible to break. But again there is nothing "impossible" in security system, It is just relative security to the gain. No one would do such thing to find a collision, This is alot of money for a hackathon to say haha I found a collision haha..... -_-

Will take me a while to climb up again, But where is a will, there is a way...
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1002



View Profile
March 20, 2015, 11:08:09 AM
 #11

This problem is far closer to tractable.  All we'd need is about 2 billion gaming rigs running for a year, each with about 20 000 terrabytes of storage.

And that is not mentioning the power drain for this!

Hmm...  Maybe about 30 exajoules (roughly 8000 terrawatt hours).  This is assuming we can be clever about how matching pairs are checked for and most of the drives can be powered down for most of the year.

Switching off both China and the US and diverting the power to the cluster should be sufficient.

And do not forget this people all this is just t point out that Bitcoin is no impossible to break. But again there is nothing "impossible" in security system, It is just relative security to the gain. No one would do such thing to find a collision, This is alot of money for a hackathon to say haha I found a collision haha..... -_-

Just to clarify: there's a wide margin between finding an address collision in this fashion and breaking Bitcoin*.  The latter would at bare minimum require something like the hypothetical futuristic Mount Everest machine I described above.

The cluster could probably be used to solve chess and maybe even run Crysis but it will not break Bitcoin.

*I'm assuming there is no subtle bug which would cause problems should two outputs at the same address be moved simultaneously using different keys.
lophie
Hero Member
*****
Offline Offline

Activity: 924
Merit: 1001

Unlimited Free Crypto


View Profile
March 20, 2015, 12:26:39 PM
 #12

This problem is far closer to tractable.  All we'd need is about 2 billion gaming rigs running for a year, each with about 20 000 terrabytes of storage.

And that is not mentioning the power drain for this!

Hmm...  Maybe about 30 exajoules (roughly 8000 terrawatt hours).  This is assuming we can be clever about how matching pairs are checked for and most of the drives can be powered down for most of the year.

Switching off both China and the US and diverting the power to the cluster should be sufficient.

And do not forget this people all this is just t point out that Bitcoin is no impossible to break. But again there is nothing "impossible" in security system, It is just relative security to the gain. No one would do such thing to find a collision, This is alot of money for a hackathon to say haha I found a collision haha..... -_-

Just to clarify: there's a wide margin between finding an address collision in this fashion and breaking Bitcoin*.  The latter would at bare minimum require something like the hypothetical futuristic Mount Everest machine I described above.

The cluster could probably be used to solve chess and maybe even run Crysis but it will not break Bitcoin.

*I'm assuming there is no subtle bug which would cause problems should two outputs at the same address be moved simultaneously using different keys.

well............ have you considered ripemd160? I mean it is easier (lol easier) to find collision in 160 than 256.

Will take me a while to climb up again, But where is a will, there is a way...
coinableS
Legendary
*
Offline Offline

Activity: 1442
Merit: 1179



View Profile WWW
March 20, 2015, 06:39:33 PM
 #13

Here's some super simple code I wrote for a mini bitcoin address collider. I call it mini because it only does 324 address per hour, however the RNG process being used is unclear. If the RNG is flawed there is a better chance of collision.  

Blockchain.info  provides this (https://blockchain.info/q/newkey) which is supposed to create a fresh public/private key pair whenever you call that address. Since the RNG process behind these addresses are unclear is the reason I'm using it. There's a limit of ~1 query per 10 seconds or you'll get banned/blocked which is why this script only does about 320 per hour.  It pulls the new key (assuming RNG is good) and then we use the address info and return the total amount received to that address and plugs everything into a DB. If total amount received is ever greater than 0 then you have a collision. Or you can also query your DB to see if there are any repeats.

Put it in a cron job to run every 5 minutes.

Code:
<?php
$counter 
0;
while(
$counter 27){
//gen new key pair
$url "https://blockchain.info/q/newkey";
$content file_get_contents($url);
$arrEx explode(" "$content);

$bitcoin_address $arrEx[0];
$privKey $arrEx[1];
//check the balance
$jsondata json_decode(file_get_contents("https://blockchain.info/address/$bitcoin_address?format=json"), true);
$totRec $jsondata["total_received"];

//insert the pub key, priv key, and balance into DB
$conn mysqli_connect("localhost""username""password""database_name");
if (mysqli_connect_errno()){
echo "Connection to DB failed" mysqli_connect_error();
}

mysqli_query($conn"INSERT INTO collision (pubK, privK, balance) VALUES ('$bitcoin_address', '$privKey', '$totRec')");

$counter++;
sleep(10);
}
mysqli_close($conn);
?>


teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1002



View Profile
March 20, 2015, 08:34:45 PM
 #14

well............ have you considered ripemd160? I mean it is easier (lol easier) to find collision in 160 than 256.

Yes, all this speculation takes RIPEMD-160 into account.  It is because of the RIPEMD-160 that a collision is going to look like having two different private/public keypairs which correspond to the same address.

Here's some super simple code I wrote for a mini bitcoin address collider. I call it mini because it only does 324 address per hour, however the RNG process being used is unclear. If the RNG is flawed there is a better chance of collision. 

Blockchain.info  provides this (https://blockchain.info/q/newkey) which is supposed to create a fresh public/private key pair whenever you call that address. Since the RNG process behind these addresses are unclear is the reason I'm using it. There's a limit of ~1 query per 10 seconds or you'll get banned/blocked which is why this script only does about 320 per hour.  It pulls the new key (assuming RNG is good) and then we use the address info and return the total amount received to that address and plugs everything into a DB. If total amount received is ever greater than 0 then you have a collision. Or you can also query your DB to see if there are any repeats.

Put it in a cron job to run every 5 minutes.

Cool.  Note that a collider doesn't need high-quality randomness for each address.  Just working through all the private keys in order from 1 to 115792089237316195423570985008687907852837564279074904382605163141518161494336 would be fine (yes, there are quite a lot of different private keys).

Anyway.  At 320 addresses per hour and assuming all else constant, you should be finding an address collision once every 121 billion trillion trillion years on average (similar to the way a new block is generated once every 10 minutes on average so expect some variance).  Notice that you'd actually be generating a collision on average once every 7.8 billion trillion trillion years but because your code checks collisions by checking for a balance and most addresses in the blockchain have a balance of 0 you'll be missing a lot.

I feel it's worth comparing this to Bitcoin block generation.  A script which generates 320 hashes per second will find a block on average once every 72 trillion years.  Given that many CPUs are capable of hashing 10 000 times faster and that an address with non-zero balance contains about 3.2 BTC (compared with the average block payout of about 25 BTC) we see that anyone looking to run this code for profit would be about 100 trillion trillion times better off just CPU mining Bitcoin.
coinableS
Legendary
*
Offline Offline

Activity: 1442
Merit: 1179



View Profile WWW
March 20, 2015, 08:51:18 PM
Last edit: March 20, 2015, 09:27:44 PM by coinableS
 #15

well............ have you considered ripemd160? I mean it is easier (lol easier) to find collision in 160 than 256.

Yes, all this speculation takes RIPEMD-160 into account.  It is because of the RIPEMD-160 that a collision is going to look like having two different private/public keypairs which correspond to the same address.

Here's some super simple code I wrote for a mini bitcoin address collider. I call it mini because it only does 324 address per hour, however the RNG process being used is unclear. If the RNG is flawed there is a better chance of collision.  

Blockchain.info  provides this (https://blockchain.info/q/newkey) which is supposed to create a fresh public/private key pair whenever you call that address. Since the RNG process behind these addresses are unclear is the reason I'm using it. There's a limit of ~1 query per 10 seconds or you'll get banned/blocked which is why this script only does about 320 per hour.  It pulls the new key (assuming RNG is good) and then we use the address info and return the total amount received to that address and plugs everything into a DB. If total amount received is ever greater than 0 then you have a collision. Or you can also query your DB to see if there are any repeats.

Put it in a cron job to run every 5 minutes.

Cool.  Note that a collider doesn't need high-quality randomness for each address.  Just working through all the private keys in order from 1 to 115792089237316195423570985008687907852837564279074904382605163141518161494336 would be fine (yes, there are quite a lot of different private keys).

Anyway.  At 320 addresses per hour and assuming all else constant, you should be finding an address collision once every 121 billion trillion trillion years on average (similar to the way a new block is generated once every 10 minutes on average so expect some variance).  Notice that you'd actually be generating a collision on average once every 7.8 billion trillion trillion years but because your code checks collisions by checking for a balance and most addresses in the blockchain have a balance of 0 you'll be missing a lot.

I feel it's worth comparing this to Bitcoin block generation.  A script which generates 320 hashes per second will find a block on average once every 72 trillion years.  Given that many CPUs are capable of hashing 10 000 times faster and that an address with non-zero balance contains about 3.2 BTC (compared with the average block payout of about 25 BTC) we see that anyone looking to run this code for profit would be about 100 trillion trillion times better off just CPU mining Bitcoin.

I was simply responding to the OP's question "how would somebody trying to cause a collision notice that they've succeeded?"  What I provided is perfectly valid to his question, it attempts to create a collision based on poor RNG and let's you know if you found one. What you posted everyone already knows.

from 1 to 115792089237316195423570985008687907852837564279074904382605163141518161494336 would be fine (yes, there are quite a lot of different private keys)

Yea I quite aware of the # of addresses...

your code checks collisions by checking for a balance and most addresses in the blockchain have a balance of 0 you'll be missing a lot.

No it doesn't, it checks total amount the address has ever received, in other words checks to see if the address has ever been used.

At 320 addresses per hour and assuming all else constant, you should be finding an address collision once every 121 billion trillion trillion years on average

The script is checking the RNG of blockchain's (https://blockchain.info/q/newkey) tool. It has no extra points of entropy like bitaddress.org and could very well be flawed. Do you know the RNG behind this newkey tool?

teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1002



View Profile
March 21, 2015, 05:12:09 AM
 #16

your code checks collisions by checking for a balance and most addresses in the blockchain have a balance of 0 you'll be missing a lot.

No it doesn't, it checks total amount the address has ever received, in other words checks to see if the address has ever been used.

My mistake.  A more careful look revealed that indeed you are checking for any addresses which have ever received funds.

At 320 addresses per hour and assuming all else constant, you should be finding an address collision once every 121 billion trillion trillion years on average

The script is checking the RNG of blockchain's (https://blockchain.info/q/newkey) tool. It has no extra points of entropy like bitaddress.org and could very well be flawed. Do you know the RNG behind this newkey tool?

I don't know the details of the sources of random used by blockchain.info's newkey.  While I agree that there is the possibility of a flaw in the RNG leading to unexpected long-term behaviour of the script, I maintain that the best way of resolving this potential flaw is to throw out the RNG altogether.  A collision generator has no need for anything random.
spin
Sr. Member
****
Offline Offline

Activity: 362
Merit: 261


View Profile
March 24, 2015, 01:29:38 PM
 #17

The script is checking the RNG of blockchain's (https://blockchain.info/q/newkey) tool. It has no extra points of entropy like bitaddress.org and could very well be flawed. Do you know the RNG behind this newkey tool?

I don't know the details of the sources of random used by blockchain.info's newkey.  While I agree that there is the possibility of a flaw in the RNG leading to unexpected long-term behaviour of the script, I maintain that the best way of resolving this potential flaw is to throw out the RNG altogether.  A collision generator has no need for anything random.

I guess the reasoning is that using a potentially flawed generator (that is widely in use) probably increases your probability of finding an address with a balance as opposed to working through the problem space in a pure fashion.  This is only true if the generator is non-random in some repetitive fashion (i.e. it is more likely than purely by chance to repeat some address).  If the generator is simply counting up from 1 for example the key/address would not be repeated (but would still not be random).

If you liked this post buy me a beer.  Beers are quite cheap where I live!
bc1q707guwp9pc73r08jw23lvecpywtazjjk399daa
Mikestang
Legendary
*
Offline Offline

Activity: 1274
Merit: 1000



View Profile
March 24, 2015, 04:11:07 PM
 #18

I think that it is safe to say, for all intents and purposes, that the chance of a collision happening is 0.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1002



View Profile
March 24, 2015, 09:10:59 PM
 #19

The script is checking the RNG of blockchain's (https://blockchain.info/q/newkey) tool. It has no extra points of entropy like bitaddress.org and could very well be flawed. Do you know the RNG behind this newkey tool?

I don't know the details of the sources of random used by blockchain.info's newkey.  While I agree that there is the possibility of a flaw in the RNG leading to unexpected long-term behaviour of the script, I maintain that the best way of resolving this potential flaw is to throw out the RNG altogether.  A collision generator has no need for anything random.

I guess the reasoning is that using a potentially flawed generator (that is widely in use) probably increases your probability of finding an address with a balance as opposed to working through the problem space in a pure fashion.

Ah, yes, I missed that at first too.  The masses of poorly thought-out posts on this forum has encouraged me into the bad habit of skim reading everything not written by a small white-list of known good posters.

Sure.  If the RNG is flawed then the chances of finding a collision are improved.  I agree it's most likely that this newkey function is used under the hood when people create new addresses in their blockchain.info wallets so there should be plenty of addresses containing funds to collide with.

What the probability of collision is is open to speculation.  Theoretically, anything from [number of existing addresses] / 2160 up to 1 is possible and, absent the code, we only really have the fact that no-one's found a collision so far to guide us.

If the generator is simply counting up from 1 for example the key/address would not be repeated (but would still not be random).

I agree that the keys would not be repeated and the addresses would not be random.  However, the RIPEMD-160 step (assuming RIPEMD-160 itself is not badly flawed) ensures that the addresses will be seemingly random and address collisions will not be practically predictable (precisely as Bitcoin mining gives rise to unpredictable block generation times).
Cryddit
Legendary
*
Offline Offline

Activity: 924
Merit: 1122


View Profile
March 26, 2015, 06:14:55 PM
 #20

There are 2160 addresses.  That means that by the time you've got sqrt(2160) addresses, assuming uniform distribution (which is valid on cryptographic keys) there are even odds that some two of them have collided.  That means you'd need 280 addresses before a collision occurred.  That's about 1.2 septillion addresses required before a collision between any two becomes likely. 

For some sense of scale, Earth weighs about five times that many kilograms. 
Pages: [1] 2 3 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!