Bitcoin Forum
April 30, 2024, 11:42:02 AM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1]
  Print  
Author Topic: Reduced storage mining nodes and compressing the UTXO set  (Read 979 times)
TierNolan (OP)
Legendary
*
Offline Offline

Activity: 1232
Merit: 1083


View Profile
December 13, 2013, 11:21:18 AM
 #1

I was thinking about the "ultimate compression" system and I wonder if it is overkill for the time being.  The whole point is to record the UTXO set, but that set isn't really that large anyway.

With pruning, all transactions in the the UTXO set must be stored. There is no point in all nodes storing all this info.

Each entry in the UTXO set just needs to record block index, transactions index and output index.

The UTXO set entry needs to include a 3 byte block index (until block 16 million or so), a 2 byte transaction index and a 1 byte output index.  They could be formally variable length integer types.

This works out as 6 bytes that need to be stored for each UTXO set entry rather than needing at least 250 bytes per transaction.

When downloading the chain, the node would find the longest POW chain (headers first) and then scan that chain from genesis to the longest point.  It could snapshot the UTXO set every few blocks once it nears the end of the chain (... 100000, 10000, 1000, 100, 20 blocks back, and then all of last 20 blocks as diffs).

If a fork happens that is less than 20 blocks (almost all barring bug based forks), it could recalculate the set from the diffs.  The exact snapshot system isn't that critical.  Diffs would likely improve things, especially for the last 20 snapshots.

As it is, this node couldn't do signature verification, but it could make sure there is no double spending.

If this was combined with "expanded" transactions, then it could do signature checks.  An expanded transaction contains all of the transaction's input transactions (and merkle path from the block header).  A wallet could store the transactions that pay to any of its addresses.  This would allow it construct the expanded transaction when it was spending.

If a node received the expanded transaction, it would check

- the merkle path for all input transactions (requires block headers)
- the signatures for all inputs (requires input transactions)
- the inputs aren't double spent (requires the utxo set)

If those checks pass, then it knows the transaction is valid and it can be included in blocks.

Expanded transactions put the burden of tx storage on the owners of the outputs rather than requiring all nodes to store the info.

It also increases total network bandwidth usage in order to reduce storage.  An "expanded" block would be a block that includes expanded transactions (and so would be larger).

This could be an extra bit in the services map.  A EXPANDED_STORE_NODE would make no statement with regards to verification.  It would just store expanded blocks.

A hard fork would make this even more efficient.  Each transaction output could be simply a MAST (Merkle abstract syntax tree) hash and it could be included directly into the block's Merkle tree.

The expanded transaction would include the path to the MAST root for each input and then the sig script (which could be verified against the MAST root).  This means that if an input is from a very large transaction, it is still very small.

However, even without a hard fork, it would allow nodes to use much less disk space (but use up more bandwidth).

Nodes running on a virtual machine in a data centre probably are disk limited.  Nodes running in someone's home is likely bandwidth limited.

A full node could convert a standard transaction into an expanded transaction and act as a bridge.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
"This isn't the kind of software where we can leave so many unresolved bugs that we need a tracker for them." -- Satoshi
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1714477322
Hero Member
*
Offline Offline

Posts: 1714477322

View Profile Personal Message (Offline)

Ignore
1714477322
Reply with quote  #2

1714477322
Report to moderator
1714477322
Hero Member
*
Offline Offline

Posts: 1714477322

View Profile Personal Message (Offline)

Ignore
1714477322
Reply with quote  #2

1714477322
Report to moderator
piotr_n
Legendary
*
Offline Offline

Activity: 2053
Merit: 1354


aka tonikt


View Profile WWW
December 13, 2013, 01:05:18 PM
 #2

With pruning, all transactions in the the UTXO set must be stored. There is no point in all nodes storing all this info.
How do you see a node being able to verify a validity of a new block, if it would not have the entire UTXO database?

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB  9059 1BE7 B545 CDF3 FD0E
TierNolan (OP)
Legendary
*
Offline Offline

Activity: 1232
Merit: 1083


View Profile
December 13, 2013, 01:23:08 PM
 #3

With pruning, all transactions in the the UTXO set must be stored. There is no point in all nodes storing all this info.
How do you see a node being able to verify a validity of a new block, if it would not have the entire UTXO database?

It would need to have the extended block, rather than just a standard block.

A block at the moment is basically just a header + list of transactions.

<Header>
<transaction count>
tx[0]
tx[1]
...

Instead, the extended block would be something like

<Header>
<transaction count>
tx[0]
    <input count>
    tx for input 1; merkle path
    tx for input 2; merkle path
    ....

tx[1]
    <input count>
    tx for input 1; merkle path
    tx for input 2; merkle path
    ...

....

The average transaction is 250 bytes and most transactions have 2 inputs.  If there are 4096 transactions in a block, then there are 12 levels of the merkle tree down to the transaction.  This gives 32 bytes * 12 = 384 bytes for the merkle path.  This means that each input would be around 250 + 384 = 634 bytes.

That gives 634 * 2 + 250 = 1518 bytes for each extended transaction.  This is 6 times larger than just including the transaction.

However, once you have all the inputs into the transaction, the node can do the verification without needing a database of all the transactions.

A soft fork could be added setting the maximum extended block size to 6MB (in addition to the normal block representation being limited to 1MB).

When broadcasting transactions, spenders could include the expanded transaction.  Some miners might refuse to include transactions received as standard transactions in their memory pools.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
piotr_n
Legendary
*
Offline Offline

Activity: 2053
Merit: 1354


aka tonikt


View Profile WWW
December 13, 2013, 01:46:01 PM
 #4

However, once you have all the inputs into the transaction, the node can do the verification without needing a database of all the transactions.
But what about the volume checks?
Each UTXO has a specific BTC value associated with it and each transaction must not spend more then the sum of all its inputs.
Even if it is properly signed and even if it got mined into a block with a valid POW - it must still verify the transaction volumes which I do not quite see possible if you don't have the full database.

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB  9059 1BE7 B545 CDF3 FD0E
TierNolan (OP)
Legendary
*
Offline Offline

Activity: 1232
Merit: 1083


View Profile
December 13, 2013, 01:55:47 PM
 #5

But what about the volume checks?
Each UTXO has a specific BTC value associated with it and each transaction must not spend more then the sum of all its inputs.
Even if it is properly signed and even if it got mined into a block with a valid POW - it must still verify the transaction volumes which I do not quite see possible if you don't have the full database.

The amount is in the input transaction.  You would check the signatures match, and that the amounts all add up correctly.

The idea is to allow blocks to be verified on a stand alone basis.  You receive an extended block and it has all the info needed to verify that the block is valid.

The only extra info you need is the header chain and the UTXO set.

This reduces the load on each node. 

There would still need to be some nodes that convert transactions into extended transactions.  These could be relay nodes that have the entire database.

However, if the system was popular, then wallets would store the info so they could broadcast extended transactions directly.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
piotr_n
Legendary
*
Offline Offline

Activity: 2053
Merit: 1354


aka tonikt


View Profile WWW
December 13, 2013, 02:01:45 PM
 #6

All right.
Now I think I get it - thanks for explaining!

I personally like the "extended transactions" idea; where each tx is broadcasted along with its input-txs.
These days it is a real hassle to verify a signature of a transaction, while the inputs are god knows where.

Check out gocoin - my original project of full bitcoin node & cold wallet written in Go.
PGP fingerprint: AB9E A551 E262 A87A 13BB  9059 1BE7 B545 CDF3 FD0E
TierNolan (OP)
Legendary
*
Offline Offline

Activity: 1232
Merit: 1083


View Profile
December 13, 2013, 02:07:14 PM
 #7

These days it is a real hassle to verify a signature of a transaction, while the inputs are god knows where.

Yes, and it is easy for the owner of coins to keep the input transactions. 

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
Pages: [1]
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!