Bitcoin Forum
November 10, 2024, 07:35:00 PM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1]
  Print  
Author Topic: Why would make the extra merkle commitment asicboost uneconomical?  (Read 1128 times)
johoe (OP)
Full Member
***
Offline Offline

Activity: 217
Merit: 259


View Profile
April 12, 2017, 09:57:47 AM
Merited by vapourminer (2), ABCbits (1)
 #1

I know that it requires some extra hashes but if my math is correct we are talking about 12-13 GHash/s for a 400 PHash/s pool.  And I don't think 12-13 GHash/s cost millions of dollar per year.

This is how I derived the number:

A 400 PHash/s pool grinds through 100 million mid-states per second.  With 4-way asicboost, it needs about 25 million 4-way collisions per second.

If one collects collisions on a central server (the most efficient way to find a lot of collisions in a big pool), one needs about 3 billion hashes to create 25 million 4-way collisions.  If one can live with 5 seconds delay (i.e. blocks miss some high-fee transactions from the last 5 seconds), one can  reduce this to 5 billion hashes in 5 seconds, or a billion hashes per second.

Collecting the hashes and storing them to find the collision is not trivial, but should be possible.  You have to do it for any form of covert asicboost.   

Now, the extra commitment in the UTXO makes computing the hashes more expensive.  Instead of a single hash, you need to compute 12-13 Hashes for the Merkle root (if you still use Merkle grinding to generate the commitment hashes).  But it requires no extra storage.  So the total extra cost is 12-13 GHash/s for a 400 PHash/s pool.  Everything else is unchanged.

Donations to 1CF62UFWXiKqFUmgQMUby9DpEW5LXjypU3
tromp
Legendary
*
Offline Offline

Activity: 990
Merit: 1110


View Profile
April 12, 2017, 02:26:25 PM
Last edit: April 12, 2017, 02:48:56 PM by tromp
Merited by vapourminer (2), ABCbits (1)
 #2

If one collects collisions on a central server (the most efficient way to find a lot of collisions in a big pool), one needs about 3 billion hashes to create 25 million 4-way collisions.  If one can live with 5 seconds delay (i.e. blocks miss some high-fee transactions from the last 5 seconds), one can  reduce this to 5 billion hashes in 5 seconds,

Your analysis appears correct. To elaborate on the math above:

Given x * 2^32 hashes, the probability of a k-way collision on a particular 32 bit value equals
choose(x * 2^32, k) * (2^-32)^k * (1 - 2^-32)^{x * 2^32 - k} which is about x^k/k! * e^-x.

For x=3/4 we get a fraction of 25M/4B 4-way collisions.
And for x=5/4 we get a fraction of 125M/4B 4-way collisions
(we actually get more than an extra quarter of that in useful collisions from k=5 and higher).
arubi
Jr. Member
*
Offline Offline

Activity: 31
Merit: 1


View Profile
April 12, 2017, 05:57:11 PM
 #3

...

Now, the extra commitment in the UTXO makes computing the hashes more expensive.  Instead of a single hash, you need to compute 12-13 Hashes for the Merkle root (if you still use Merkle grinding to generate the commitment hashes).  But it requires no extra storage.  So the total extra cost is 12-13 GHash/s for a 400 PHash/s pool.  Everything else is unchanged.


With segwit active, a single transaction with 13 inputs and 13 outputs all signed ANYONECANPAY|SINGLE, it's possible to permute the pairs to get 13! permutations without the need to re-sign, as ANYONECANPAY|SINGLE in segwit outpoints does not commit to the specific place of the pair.
If the 13 inputs are derived from the same transaction, and the outputs are all the same, you should really only need to swap 13 bytes around the serialized transaction to generate a different, valid transaction and txid. not true, might be true if the inputs are not signed at all.
arubi
Jr. Member
*
Offline Offline

Activity: 31
Merit: 1


View Profile
April 12, 2017, 06:08:45 PM
 #4

I guess the topic is covert AB, but the real issue is if there is any point which AB becomes uneconomical at all, even if overt form can be blocked by policy.
gmaxwell
Moderator
Legendary
*
expert
Offline Offline

Activity: 4270
Merit: 8805



View Profile WWW
April 12, 2017, 06:35:57 PM
Last edit: April 12, 2017, 07:37:23 PM by gmaxwell
Merited by vapourminer (1), ABCbits (1)
 #5

Because it's a 12x-ish additional work factor, thats all.  Is 12x a killer? Depends on how specifically that you've constructed it.

Trying to put it all on one big device and you run into IO/memory bottlenecks much easier-- your 100 million midstates is over 64gbit/sec of data, that device goes down and your farm is down. The device is also highly non-covert, so you cannot keep what you are doing secret from your facilities staff. It is much more straightforward to user local FPGAs to generate the collisions.  And from what I'm told use of local devices to generate is how the unreleased spoondooles design worked.

Consider: today miners could use centralized devices to generate midstates, but instead the S9/R3 miners have fairly expensive FPGAs with 16MB of attached dual channel DDR2133 to generate the ~2000 midstates per second they need.  (When will Trezor ship with such an incredible FPGA? Never, I assume Tongue ) They could use a centralized device to do so, but they don't. (well it looks like privately bitmain runs two S9s per controller, twice that).

If you've adopted a design with distributed generation, then you can't escape the cost-- perhaps you could deal with it, but that doesn't mean the infrastructure anyone has already built does deal with it.
johoe (OP)
Full Member
***
Offline Offline

Activity: 217
Merit: 259


View Profile
April 12, 2017, 09:38:20 PM
 #6

Consider: today miners could use centralized devices to generate midstates, but instead the S9/R3 miners have fairly expensive FPGAs with 16MB of attached dual channel DDR2133 to generate the ~2000 midstates per second they need.  

Did you mean 16 GB?   16 MB sounds a bit low and make 4-way collision finding quite expensive.

If it is 16 GB, does this indicate that they planned collision finding in their design?  Or is there another reason you need that much memory for mining?

Using the same assumption of collection collisions for 5 seconds, you need about 45 MHash/s without commitment header to compute 500 4-way collisions and 550 MHash/s when witness is required.
But you can save more by using a longer collision time, e.g. 14 Mhash/s / 180 MHash/s for 30 s.   What is the estimated hash rate of the FPGA?

Donations to 1CF62UFWXiKqFUmgQMUby9DpEW5LXjypU3
Pages: [1]
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!