Bitcoin Forum
November 04, 2024, 02:33:57 PM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1]
  Print  
Author Topic: How will the merkel hash tree be stored?  (Read 5413 times)
0x6763
Guest

February 13, 2011, 03:54:52 AM
 #1

When spent transactions are removed from old blocks, how are the merkle tree branches/hashes going to be stored?  Does the Bitcoin software have any code related to this, yet?
Hal
VIP
Sr. Member
*
expert
Offline Offline

Activity: 314
Merit: 4176



View Profile
February 13, 2011, 04:33:31 AM
 #2

The idea of a Merkle tree is you just store the root, and then when someone wants to convince you that something's in the tree, they produce the Merkle branch. The Merkle roots are in the block headers. Merkle branches are stored with transactions in wallets. So when you do a spend, in principle you can prove that your "in" transactions are legit, by providing the Merkle branches for all of them. However there are no data structures defined at this time for sending such information in the network.

Now this is not a matter of removing spent transactions; rather, it's a mode of operation that would allow nodes to forget all transactions, spent and unspent, and just keep block headers.


Hal Finney
theymos
Administrator
Legendary
*
Offline Offline

Activity: 5376
Merit: 13368


View Profile
February 13, 2011, 05:16:32 AM
 #3

You can't currently run as a full network node without full blocks, since there's no way to represent on the network that you're sending just a partial block. Would the vectors currently used for Merkle trees even work if part of the tree was removed?

I haven't seen any relevant code. I don't think transaction trimming is intended for use in the near future.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
0x6763
Guest

February 13, 2011, 05:47:52 AM
 #4

Thanks, Hal and theymos.

To provide some context, this post by Satoshi is why I'm curious about this merkle tree stuff:

seems that the miner would have to basically do "extra work". and if there's no reward from the bitdns mining from the extra work (which of course, slows down the main bitcoin work), what would be a miner's incentive to include bitdns (and whatever other side chains) ?
The incentive is to get the rewards from the extra side chains also for the same work.

While you are generating bitcoins, why not also get free domain names for the same work?

If you currently generate 50 BTC per week, now you could get 50 BTC and some domain names too.

You have one piece of work.  If you solve it, it will solve a block from both Bitcoin and BitDNS.  In concept, they're tied together by a Merkle Tree.  To hand it in to Bitcoin, you break off the BitDNS branch, and to hand it in to BitDNS, you break off the Bitcoin branch.

In practice, to retrofit it for Bitcoin, the BitDNS side would have to have maybe ~200 extra bytes, but that's not a big deal.  You've been talking about 50 domains per block, which would dwarf that little 200 bytes per block for backward compatibility.  We could potentially schedule a far in future block when Bitcoin would upgrade to a modernised arrangement with the Merkle Tree on top, if we care enough about saving a few bytes.

Note that the chains are below this new Merkle Tree.  That is, each of Bitcoin and BitDNS have their own chain links inside their blocks.  This is inverted from the common timestamp server arrangement, where the chain is on top and then the Merkle Tree, because that creates one common master chain.  This is two timestamp servers not sharing a chain.


(emphasis mine)

I think his post is in response to a different way of doing DNS than what I'm trying to accomplish, but I think it is potentially useful for me.  I'm just trying to figure out exactly how I could implement it so I could keep DNS stuff out of the Bitcoin block chain while still using Bitcoin to pay transaction fees to block generators for creating blocks for the DNS block chain.

But after reading your posts, it seems this is not possible yet.
theymos
Administrator
Legendary
*
Offline Offline

Activity: 5376
Merit: 13368


View Profile
February 13, 2011, 05:54:19 AM
 #5

As far as I know, no one ever figured out exactly what Satoshi meant in that post.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
0x6763
Guest

February 13, 2011, 06:05:58 AM
 #6

As far as I know, no one ever figured out exactly what Satoshi meant in that post.

lol, yeah, that's probably true.

What I started thinking he meant was that the miner combines the transactions from each network into one block with the same header, calculates the merkle tree root hash based on all of the transactions, then does the normal block hashing algorithm on it until it finds a hash that meets the current difficulty.  Then when it sends the block out to the Bitcoin network, it prunes the DNS-related transactions from the block, and when sending it to the BitDNS network, it prunes the Bitcoin related transactions from the block.  This way each block chain has the block with only the data relevant to it's own network, but they are still connected/associated with the merkle tree root.  I think this would work, except that Bitcoin won't know the hash of the BitDNS merkle branch since there's no place to put it in a block.  This means it would not be possible to verify that any of the transactions in the block actually belong to that block.
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
February 13, 2011, 10:51:07 AM
 #7

I think he meant a merkle tree on top of the blocks, not the one inside the blocks. Read the last paragraph again. Each quorum has its own block chain, not simply a new set of transactions inside the existing chains tree. This solves the problem of people who don't care about DNS not needing to handle those transactions and vice-versa.

So you'd have two separate blocks. Unlike today, the nonce/extraNonce would not necessarily result in a hash lower than the target. It would result in a hash that is then combined with a hash from the other parts of the super-tree to produce a new merkle root that is lower than the target. You'd need to introduce some kind of super-block data structure as well, for instance instead of

Code:
Block 1234
- version
- hashPrevBlock
- hashBlockMerkleRoot
- nTime
- nBits (difficulty)
- nNonce

like today, for each chain you'd broadcast/store

Code:
SuperBlock 1
- version
- nBits  (difficulty)
- merkleBranch
- Block 1234
   - version
   - hashPrevBlock
   - hashBlockMerkleRoot
   - nTime
   - nNonce

merkleBranch in the superblock would be like

Code:
    abcdef1234567890  (ROOT)
       /                \
  hashBitDNS       hashBitCoin

or potentially in future


Code:
        abcdef1234567890   (ROOT)
       /                \
  hashBitDNS       123456abcdef
                   /          \
          hashBitCoin     hashBitVoting

or whatever, as new quorums are introduced. To solve a tree then, you gather up block headers from each network and start adjusting the nonces in each block (or perhaps just one nonce) until the root hash is lower than the global difficulty target. On disk and on the network you can now store/broadcast just the bitcoin block, the hash of the BitVoting block and the hash of the BitDNS block. These can then be combined to prove that the hash is lower than the target.

In this way the block broadcast for a particular network only needs to contain a small number of extra hashes, namely the merkle branch linking that block to the root, thus it scales nicely to very large numbers of applications of the BitCoin technology.

Now it seems Satoshi had some kind of backwards compatibility trick in mind when he wrote that, and the scheme I just described would be the "modernized" version he envisioned a long term migration to. I'm not sure how he imagined retrofitting this onto BitCoin today, exactly.
theymos
Administrator
Legendary
*
Offline Offline

Activity: 5376
Merit: 13368


View Profile
February 13, 2011, 11:23:24 AM
 #8

That does sound like what Satoshi was thinking of. He specified that each chain would have separate difficulty, which I suppose could be achieved by leaving invalid headers in the tree when the Merkle root is below the target for any of the chains.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
ribuck
Donator
Hero Member
*
Offline Offline

Activity: 826
Merit: 1060


View Profile
February 13, 2011, 12:23:34 PM
 #9

He specified that each chain would have separate difficulty
What's the advantage of that, compared to declaring that the "new" block chain will start with the same difficulty as bitcoin (and will automatically stay in step after that)?
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
February 13, 2011, 12:37:41 PM
 #10

Well, the point of all this is to ensure all chains benefit from the same pool of cpu power. So they'd need to have related difficulty in some sense.

That doesn't mean you have to broadcast a new block every 10 minutes though. Knowing that the core system is targeting a block every ten minutes means you can just set your own chain relative to that. For instance maybe for DNS registrations there aren't so many transactions so you'd prefer to do a new block every 30 minutes instead of every 10 minutes.

So you can state that the difficulty target for your DNS chain is global target / 3. If a superblock is solved and the hash is below the global target but above your chains specific target, just don't broadcast on that chain and continue incorporating transactions into the DNS block. Eventually the superblock will hash to a value below your chains difficulty target and you can treat that block as solved.
joe
Member
**
Offline Offline

Activity: 64
Merit: 10


View Profile
February 13, 2011, 02:52:14 PM
 #11

I think he meant a merkle tree on top of the blocks, ....

Satoshi may have made a small oversight in suggesting this could be done without changing the protocol. He suggested we add BitDNS (or BitWhatever) as the root/branch/leaf/something of the merkle tree. However this will not work because the current BitCoin protocol requires clients to verify that every leaf of the bitcoin chain is a valid bitcoin transaction. A hash value is not a valid bitcoin transaction, so this move would break legacy clients confused by the merkle makeup of the newer blocks.

In any event, I have a good understanding of a handful of ways to do the multiple chains.


The question you are asking about is not related to the BitDNS discussion. To erase spent transactions would not require any type of change in structure of the merkle tree. Each client can get away with forgetting most merkle leaf transaction data. Here is what it has to remember:

- The hash of all blocks
- All prior transactions that contributed to the availability of your own wallet funds
- The block header data for all blocks to which those transactions belong
- The merkle tree non-leaf hashes for those blocks
- All merkle leaf transactions that are siblings to the above transactions in the merkle tree of the block to which they belong.

Now, we must assume all clients are forgetting things by this same algorithm, so when a client spends money he has to spell out all the supporting transaction data over the network to convince other nodes and the recipient that it's legitimate.

Over time as coins become more circulated the supporting data that needs to be sent to the network for every new transaction will become large. So this should be weighed against the status quo where all clients store all data. Other solutions have been made up for tackling the data storage problem separately from bitcoin.
Hal
VIP
Sr. Member
*
expert
Offline Offline

Activity: 314
Merit: 4176



View Profile
February 13, 2011, 07:33:12 PM
 #12

I agree that Satoshi's "200 bytes for backward compatibility" is something of a mystery. The 200 is probably a clue. It is quite a bit bigger than a Bitcoin block header. Might be about the size of a block header plus a dummy transaction.

There are still some ways to embed arbitrary data in Bitcoin transactions. AFAIK scriptSig data is not checked, and you could put anything at the front without invalidating any signatures. So a BitDNS hash could be stuck into the current Bitcoin block.

Hal Finney
theymos
Administrator
Legendary
*
Offline Offline

Activity: 5376
Merit: 13368


View Profile
February 13, 2011, 08:08:34 PM
 #13

Well, the point of all this is to ensure all chains benefit from the same pool of cpu power. So they'd need to have related difficulty in some sense.

They do share CPU power. You only have to increment/hash once to try for a whole set of chains. If the Merkle root is below one of the targets, then you claim just that one (leaving the invalid branches so the root stays the same, but not announcing them as solutions). If the Merkle root happens to be below all of the targets, then you can claim all of them.

What's the advantage of that, compared to declaring that the "new" block chain will start with the same difficulty as bitcoin (and will automatically stay in step after that)?

Not everyone will work on every chain. To maintain 10-minute (or whatever) targets, separate difficulty is necessary.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
0x6763
Guest

February 15, 2011, 03:22:24 PM
 #14

Thanks everyone.  I might have a little bit better understanding of what Satoshi meant.

One problem I see is that if each block chain has a separate difficulty, but you're binding them together with a super-merkle tree, trying to solve them simultaneously would result in more work since every time you change a nonce in a sub-block, you'd have to recalculate that branch of the merkle-tree.  This is contrary to Satoshi's claim of being able to work on separate blocks for "free".

If each block chain shares the same difficulty, then you can get away with only changing a nonce in the super-block and then only hashing that super-block while all the sub-blocks are just part of the merkle-tree.  That part would be free, but it would come at a cost of each node needing to know about the blocks/transactions in each block chain in order to verify them before placing their block hash in the super-block merkle tree.  But if every node needs to keep track of every block chain, then what's the point of having separate block chains (and the increase in bandwidth and storage use) when it could all be combined in one chain?

Am I missing something?
theymos
Administrator
Legendary
*
Offline Offline

Activity: 5376
Merit: 13368


View Profile
February 15, 2011, 10:29:42 PM
 #15

Am I missing something?

You can still have the nonce in the "master header". Each chain doesn't need its own nonce to have its own difficulty.

You don't need the full headers/blocks for the chains you don't care about. You only need the full Merkle tree for the multi-chain. It doesn't matter if the person who created the set is using valid hashes or just random numbers, since it's still just as hard to get the Merkle root below the target.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
Pages: [1]
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!