Bitcoin Forum
July 30, 2021, 02:30:53 AM *
News: Latest Bitcoin Core release: 0.21.1 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1] 2 »  All
  Print  
Author Topic: Replacing old UTXOS with new one in place: an idea/suggested complexity reductio  (Read 418 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 02, 2021, 11:59:47 PM
Last edit: May 03, 2021, 10:49:10 AM by Shymaa-Arafat
Merited by LoyceV (5), ETFbitcoin (3), Halab (2), aliashraf (2), NotATether (2)
 #1

This is a suggestion that, I think, will introduce a significant complexity reduction in the process of adding/removing UTXOS from the proofs Merkle Tree...
.
Intuition/ Rationale:
Since any transaction somehow follows the Energy Law from the point that SUM of inputs = SUM of outputs, a lot of unnecessary overhead could be avoided if deletions & insertions happened simultaneously in a replacement manner.
.
Summary:
When processing a transaction, replace the old TXOs that became input with the new ones; ie if they're the same number there will be no actual deletions/a and thus no tree swaps.. just modify the corresponding values& hashes along the proof path.
-If one old TXO led to more than one change the value of the old leaf node to be the root of a corresponding subtree of the new TXOs.

-Only if n1 inputs created  n2 outputs where n1>n2 u will have to choose (n1-n2) leaves to delete according to some criteria/heuristics
-Then deletions will this way definitely happen much less often, and a suitable heuristic could be thought of later according to some data analysis of existing blocks.
.
Examples:
I traced some inputs and outputs in 2 consecutive blocks Block 1973321,1973322 from the site:
https://blockstream.info/testnet/block/000000000000001eedf8cd0e358a0ca6144e82f5c3da263769f7ad208b8e07ba

You will find that
57.64177548 tBTC=⟩
0.01322037 tBTC
+ 57.62838604 tBTC
Then in the next block
57.62838604 tBTC=⟩
57.61015944 tBTC
+ 0.01322037 tBTC

-So, in my method a subtree evolves in place of the original leaf that represented the UTXO 57.64177548 tBTC, and no other branch of the tree get modified.
(-If the images didn't upload correctly since I'm new here, u can find them in this Facebook link
https://m.facebook.com/story.php?story_fbid=1414790592208718&id=100010333725264
Or here in this twitter link
https://mobile.twitter.com/ArafatShymaa/status/1389000211998117888)
.
-If u traced a number of blocks u'll find that the situation where the number of inputs n1 is larger than the number of outputs n2 (n1›n2) is a very rare event. Even if it is not, we still guaranteed to have less deletions  
.
That's all I have for now, thanks
.
Ps. I came from somewhat the theory not the developers community, so pls. feel free to correct me if I'm wrong & apologies for any waste of ur time
1627612253
Hero Member
*
Offline Offline

Posts: 1627612253

View Profile Personal Message (Offline)

Ignore
1627612253
Reply with quote  #2

1627612253
Report to moderator
PLAY NOW
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction. Advertise here.
NotATether
Hero Member
*****
Online Online

Activity: 602
Merit: 1877


Cryptographic Crawler


View Profile WWW
May 03, 2021, 08:53:56 AM
 #2

Reposting the images:



I think I understand what's going on here now. You want the spent inputs located on an address to be replaced by UTXOs on the same height, right?

Example: An address 1abcdef has 2BTC at input #20, also sometimes written as 1abcdef:20.

And there's a receiving address somewhere called 3abcdef that has 10 total spent and unspent inputs (so numbering would continue at 3abcdef:11).

Let's say 1abcdef:20 is used as an input to the output 3abcdef:11 in a transaction.

So technically what I think you're saying is, instead of having several different 3abcdef:1,2,3,4,... on the blockchain records, each successive input overwrites the previous one?

One problem I can think of this change is that if the historic inputs are overwritten on blockchain data (which is how you'd implement something like this since verification involves reading all the blocks from the disk), then it becomes impossible to verify previous transactions with the overwritten input numbers like 3abcdef:1,2,3 and so on.

Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 03, 2021, 10:24:50 AM
 #3

No, I meant the Merkle Tree/forest or whatever data structure used for  Stateless Clients
(like in the Utreexo project and other efforts like
https://youtu.be/HEKtDILPeaI
and even  Miller's Red-Black trees idea that started here very long ago
https://bitcointalk.org/index.php?topic=101734.0)
.
Thanks for posting the images
& Let me emphasize that the orange circle is accidentally over "0ca81e", it just means the original parent node of the 1st supposedly deleted leaf in traditional methods.
NotATether
Hero Member
*****
Online Online

Activity: 602
Merit: 1877


Cryptographic Crawler


View Profile WWW
May 04, 2021, 07:05:22 AM
Merited by ETFbitcoin (1)
 #4

OK now I think I get it, this is an optimization in the structures used for verification.

From that linked thread it says this:

Quote
This proposal calls for a single additional hash to be placed in each block: the root hash of the UTXO Merkle tree. As such, it belongs in the HardFork Wishlist. However, it could also be included in an (optionally merge-mined) alt-chain. This will require additional computation for validating a transaction - the addition will be a constant factor, since each processed txinput/txoutput already requires (even in UltraPrune), an O(log N) update to a BTree table of the UTXOs. For now, the proposal simply describes a stand-alone data structure that can be derived from existing blockchain data.

As far as I understand about that, there is already a root transaction hash inside each block header, but we also want a root hash for the UTXO merkle tree so that we can dispense with the current in-memory database of coins. This will save a lot of and maybe even dispense with the need of the dbcache all together (if all it does is store UTXOs).

I think this is similar your proposal of replacing the entries (of the coins memory database) of spent inputs with their unspent inputs. I'm not sure how the current implementation validates an input but I assume it would be something like "if this input is already in the database then it's being spent again".

I guess trying to change it to something like "if an input is NOT in the database (except when inserting UTXOs to said database) that means it has been replaced so it's being spent again" could work except the problem is the dbcache is only a limited size so if we run out of memory and have to off-load unspent inputs to disk, it might accidentally classify those as being spent again.

And in that video it's mentioned that the UTXO size is 5GB which is already greater than people's dbcache limits of 500MB, 2 and even 4 GB and I think that's a reason why Miller's UTXO merkle tree might be a better idea since it's size-agnostic.

aliashraf
Legendary
*
Offline Offline

Activity: 1344
Merit: 1050

Always remember the cause!


View Profile WWW
May 04, 2021, 12:21:49 PM
Last edit: May 04, 2021, 12:50:09 PM by aliashraf
 #5

Good idea as I get it:
Using a recursive Merkle Tree data structure to represent the state of the blockchain helps with transition from the current state once a block is received:
For transitions (caused by each txn in the block) that involve fewer inputs than outputs the spent nodes are replaced/updated with a Merkle Root while for transactions with equal number of inputs and outputs the nodes are updated to the new output and for the worst case when a transaction has more inputs than outputs the extra inputs should be arbitrarily chosen for deletion which is the most expensive operation for Merkle Trees.

That said, the problem is the applicability of this idea because with current bitcoin core implementation, we don't keep track of the blockchain state, using Merkle Trees, explicitly.
On the other hand, proposals for UTXO commitment (using a Merkle Root  somewhere in the bloc or on a side-chain, whatever) are being abandoned for a while; it is a total unfortunate, BTW, because without such a commitment there will be no light client option available for bitcoin.

Anyway, I count this proposal as one of the many efforts for helping with UTXO commitment and I appreciate it. @NotATeher's memory requirement analysis of this proposal looks somehow irrelevant to me because for a client that is supposed to represent/compare the state of the blockchain it is committing to, by a Merkle Root (how else, it could be done?), it is absolutely necessary to maintain the data structure either in memory or on the disk, what OP is proposing an algorithm for its optimization here.

             ▄██▄
   ▄██▄      ▀█▀▀     ▄██▄
   ▀██▀▄  ▄▄█████▄▄  ▐███▀
       ███████████████
      ████████▀▄▄▄▀████
 ▄▄  ▐███▀▄▀██▄▀▀▀▄█████  ▄▄
████▀█████▄███▀▀█████ ██▀████
 ▀▀  ▐███▄███ ██ ████ █▌  ▀▀
      ▀████▄██▄▄███▀▄█▀
    ▄▄ █▀██████▀▄▄▄█▀█ ▄▄
   ████▀   ▀▀▀█▀▀▀   ▐████
    ▀▀       ▄██▄      ▀▀
             ▀██▀
⟩ ⟩ ⟩             ▄▄▄
  ▄▄▄▄▄▄▄▄▄▄█   █▄
 █           ▀▀▀  █
 ▀▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▀
▄▀▀ ▄▄▄▄▄▄▄▄▄▄▄▄ ▀▀▄
█ ▄▀ ▄▄▄▄▄▄▄    ▀█ █
█ █ █       █    █ ▄
█ █ ▄▀▀▀▀▀▀▄▄    █ █
█ █ ▀▄▄▄▄▀▀▄▄▀▀▄ █ █
█ █ █   █  ██  █ █ █
█ █ ▄▀▀▀▀▄▄▀▀▄▄▀ █ █
█ ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀ █
 ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
⟩ ⟩ ⟩       ▄████▄  ▄████▄
      ████████████████
      ████████████████
       ██████████████
        ▀██████████▀
██        ▀██████▀        ██
██▌   ▄            ▄   ▐██
███  ███▄          ▄███  ███
▀███▄ ▀███▄      ▄███▀ ▄███▀
  ▀████████      ████████▀
     ▀████▀      ▀████▀
     ▄   ▄▄      ▄▄   ▄
     ▀█████      █████▀
NotATether
Hero Member
*****
Online Online

Activity: 602
Merit: 1877


Cryptographic Crawler


View Profile WWW
May 04, 2021, 05:04:49 PM
 #6

Anyway, I count this proposal as one of the many efforts for helping with UTXO commitment and I appreciate it. @NotATeher's memory requirement analysis of this proposal looks somehow irrelevant to me because for a client that is supposed to represent/compare the state of the blockchain it is committing to, by a Merkle Root (how else, it could be done?), it is absolutely necessary to maintain the data structure either in memory or on the disk, what OP is proposing an algorithm for its optimization here.

The memory requirement was not intended to be for the merkle tree, in fact I explicitly said at the end that merkle trees are independent of the UTXO size, it only applies to the traditional database which is the way the inputs are currently stored (the dbcache).

aliashraf
Legendary
*
Offline Offline

Activity: 1344
Merit: 1050

Always remember the cause!


View Profile WWW
May 04, 2021, 07:30:10 PM
 #7

Anyway, I count this proposal as one of the many efforts for helping with UTXO commitment and I appreciate it. @NotATeher's memory requirement analysis of this proposal looks somehow irrelevant to me because for a client that is supposed to represent/compare the state of the blockchain it is committing to, by a Merkle Root (how else, it could be done?), it is absolutely necessary to maintain the data structure either in memory or on the disk, what OP is proposing an algorithm for its optimization here.

The memory requirement was not intended to be for the merkle tree, in fact I explicitly said at the end that merkle trees are independent of the UTXO size, it only applies to the traditional database which is the way the inputs are currently stored (the dbcache).

So, now it looks even more irrelevant. OP is concerned about Merkle Tree specific operations like addition and deletion of outputs. Without having the tree as the data structure it'd be impossible to update a node with the root of a tree.

             ▄██▄
   ▄██▄      ▀█▀▀     ▄██▄
   ▀██▀▄  ▄▄█████▄▄  ▐███▀
       ███████████████
      ████████▀▄▄▄▀████
 ▄▄  ▐███▀▄▀██▄▀▀▀▄█████  ▄▄
████▀█████▄███▀▀█████ ██▀████
 ▀▀  ▐███▄███ ██ ████ █▌  ▀▀
      ▀████▄██▄▄███▀▄█▀
    ▄▄ █▀██████▀▄▄▄█▀█ ▄▄
   ████▀   ▀▀▀█▀▀▀   ▐████
    ▀▀       ▄██▄      ▀▀
             ▀██▀
⟩ ⟩ ⟩             ▄▄▄
  ▄▄▄▄▄▄▄▄▄▄█   █▄
 █           ▀▀▀  █
 ▀▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▀
▄▀▀ ▄▄▄▄▄▄▄▄▄▄▄▄ ▀▀▄
█ ▄▀ ▄▄▄▄▄▄▄    ▀█ █
█ █ █       █    █ ▄
█ █ ▄▀▀▀▀▀▀▄▄    █ █
█ █ ▀▄▄▄▄▀▀▄▄▀▀▄ █ █
█ █ █   █  ██  █ █ █
█ █ ▄▀▀▀▀▄▄▀▀▄▄▀ █ █
█ ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀ █
 ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
⟩ ⟩ ⟩       ▄████▄  ▄████▄
      ████████████████
      ████████████████
       ██████████████
        ▀██████████▀
██        ▀██████▀        ██
██▌   ▄            ▄   ▐██
███  ███▄          ▄███  ███
▀███▄ ▀███▄      ▄███▀ ▄███▀
  ▀████████      ████████▀
     ▀████▀      ▀████▀
     ▄   ▄▄      ▄▄   ▄
     ▀█████      █████▀
Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 04, 2021, 07:55:39 PM
Merited by ETFbitcoin (2)
 #8

Quote
That said, the problem is the applicability of this idea because with current bitcoin core implementation, we don't keep track of the blockchain state, using Merkle Trees, explicitly.
On the other hand, proposals for UTXO commitment (using a Merkle Root  somewhere in the bloc or on a side-chain, whatever) are being abandoned for a while; it is a total unfortunate, BTW, because without such a commitment there will be no light client option available for bitcoin.

Quote
And in that video it's mentioned that the UTXO size is 5GB which is already greater than people's dbcache limits of 500MB, 2 and even 4 GB and I think that's a reason why Miller's UTXO merkle tree might be a better idea since it's size-agnostic
.
Answering both in short points:
1-The paper in the video was published in Mar21, the Utreexo project is still going, so the topic is currently under research focus not abandoned.

2-Miller proposal was implemented in 2014 https://github.com/amiller/redblackmerkle,
published as an ACM notice
Miller, A., Hicks, M., Katz, J., Shi, E.: Authenticated data structures, generically.
ACM SIGPLAN Notices 49(1), 411–423 (2014),

but if u watched the video u'll find out that it did not give the performance improvement that deserves the book keeping of a Red-Black tree data structure as compared to regular BST binary search tree.

3-Easier from the video to understand that the problem that led to the 3 proposals or researches of a RBT from Miller, Forest in Utreexo, Trie in the video paper is the overhead of continuous excessive deletion& insertion of UTXOs (the tree branches get modified which may lead to a degenerate tree with longer paths, each time all internal nodes that contain the leaf hash inside their concatenated hashes must be modified)
.
-What I'm saying if u noticed that every output comes from an input don't handle them separately and "distract" ur data structure, just modify the values in place with the hierarchy of the tree almost the same.
.
I hoped for a community familiar with the problem, to find out if I'm missing something. Is my seemingly obvious easy solution was thought of before and rejected for a reason?or it just didn't cross their minds?Huh
.
.
.
Last note, the memory requirements was the original motive of having Stateless nodes/clients ... nodes that do not keep the full Merkle Tree/Forest/Trie, they just the proof path (called witness) of the UTXOs they need to verify. Each insertion/deletion modify the parents and siblings hashes along the way, and thus may affect other proofs especially if the deletion involved a swap of tree branches (watch the video) and we wish to minimize that.
oryhp
Newbie
*
Offline Offline

Activity: 26
Merit: 44


View Profile
May 04, 2021, 09:09:18 PM
 #9

For transitions (caused by each txn in the block) that involve fewer inputs than outputs the spent nodes are replaced/updated with a Merkle Root while for transactions with equal number of inputs and outputs the nodes are updated to the new output and for the worst case when a transaction has more inputs than outputs the extra inputs should be arbitrarily chosen for deletion which is the most expensive operation for Merkle Trees.

I'm probably missing something. My current understanding is that a 1-2 transaction (1 input, 2 outputs) would replace the leaf of the input with a parent and two children, meaning you increase the depth of the tree by 1. Since most of the transactions on Bitcoin are 1-2, we can expect these children outputs to also split into two forming a chain which makes it no longer log(n) but rather linear.
Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 04, 2021, 10:17:57 PM
 #10

Quote
Since most of the transactions on Bitcoin are 1-2, we can expect these children outputs to also split into two forming a chain which makes it no longer log(n) but rather linear.
.
Well, we r sure to have less number of deletions and no swaps in the tree branches (in addition to avoiding their overhead, keeping the locality of reference mentioned in the video)
However, I can't be quite sure about the resulting tree without trying on the old available data (benchmarks).
In the video paper they tried to benefit from locality of reference by reserving the original order of UTXOs in the tree/Trie, here I like keep the history of one coin or amount of crypto how it got spent in a subtree ( correlated hashes) will this lead to a better (or maybe worse) tree structure? Will this add a new piece of info (tracing money) that could be useful somehow (I don't know maybe tracing stolen money thru hacking, I don't want to rush things)?
.
I'm first like exploring did I miss something that makes this idea wrong or silly somehow?
NotATether
Hero Member
*****
Online Online

Activity: 602
Merit: 1877


Cryptographic Crawler


View Profile WWW
May 05, 2021, 09:08:24 AM
 #11

Quote
Since most of the transactions on Bitcoin are 1-2, we can expect these children outputs to also split into two forming a chain which makes it no longer log(n) but rather linear.
.
Well, we r sure to have less number of deletions and no swaps in the tree branches (in addition to avoiding their overhead, keeping the locality of reference mentioned in the video)
However, I can't be quite sure about the resulting tree without trying on the old available data (benchmarks).
In the video paper they tried to benefit from locality of reference by reserving the original order of UTXOs in the tree/Trie, here I like keep the history of one coin or amount of crypto how it got spent in a subtree ( correlated hashes) will this lead to a better (or maybe worse) tree structure? Will this add a new piece of info (tracing money) that could be useful somehow (I don't know maybe tracing stolen money thru hacking, I don't want to rush things)?
.
I'm first like exploring did I miss something that makes this idea wrong or silly somehow?

I think first we need to figure out how to keep a utreexo tree balanced, which I assume you want to go forward with, since if Merkle tree is used then every time a UTXO is replaced, more hashes need to be updated because it tries to keep everything under one root unlike utreexo.

The video does mention that briefly (I don't recall seeing utreexo discussed here before) and proposes a few balancing tree structures - trie, red-black, and AVL.

IMO a trie doesn't make sense to represent a UTXO set with since those are intended for prefixes, and UTXO's have no ancestry.

AVL tree and red-black trees have similar performance except when you want to look up a specific UTXO (AVL trees are slightly faster). Also the extra bit doesn't take up much space overall.

The unfamiliarity is probably because not many people here have heard of or tried the utreexo method.

Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 05, 2021, 12:39:40 PM
Last edit: May 06, 2021, 02:01:34 AM by Shymaa-Arafat
 #12

Quote
think first we need to figure out how to keep a utreexo tree balanced, which I assume you want to go forward with, since if Merkle tree is used then every time a UTXO is replaced, more hashes need to be updated because it tries to keep everything under one root unlike utreexo.

Up till now I didn't modify the tree structure, just a simple BST.
»»»
To be accurate I think I should call it just Binary tree for now because the meaning of relative order could be tricky somehow.
««««
Utreexo on the other hand, uses a forest ( keeps a max of "log n" unconnected subtrees. So, yes maybe it will modify less hashes than my approach sometimes but not necessarily, since in the most frequent 1:2 case I do his 3steps in just 1 larger step.
.
Quote
AVL tree and red-black trees have similar performance except when you want to look up a specific UTXO (AVL trees are slightly faster). Also the extra bit doesn't take up much space overall.

As stated in the "suggesting Trie" paper, AVL Trees will probably involve more branch swaps with deletions which will(plus the extra complexity) distract more the original order we r trying to keep hoping for a better  Locality of reference.
....
FogOfChess
Newbie
*
Offline Offline

Activity: 1
Merit: 0


View Profile
May 06, 2021, 05:23:58 PM
 #13

Hey, I'm one of the authors of the paper in the video that was linked. Thanks Shymaa for emailing me, it's cool that people are discussing these ideas.

In this particular formulation, I would say that some of the major strengths are low bookkeeping overhead (since this proposal doesn't seem to use a more complicated data structure than just a Merkle tree). In fact, it may be possible to do something like this in the UTREEXO codebase itself. UTREEXO is somewhat agnostic, as I think they've stated on their freenode channel, about how exactly the recombination takes place. It's possible for efficiency reasons they do actually try to do this "replacement" tactic - from my experience with their codebase, they keep their tree in a flat file where the location within the file indicates the location of the node in the tree. With this replacement, they would not have to do much movement of values within that database, so that seems like a benefit.

In terms of this criterion of "keeping the transactions in order" from our paper, this technique doesn't necessarily do this. For this reason, I might expect the proofs to be larger, but ultimately I don't see why they should be larger than what UTREEXO currently does.
Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 06, 2021, 06:21:15 PM
Last edit: May 09, 2021, 05:31:08 AM by Shymaa-Arafat
 #14

Quote
Hey, I'm one of the authors of the paper in the video that was linked. Thanks Shymaa for emailing me, it's cool that people are discussing these ideas
Thanks due to u for the paper first, and for the assuring reply this my first 6months in the crypto world.
And yes I like discussed ur paper a million times, this one in mixed Arabic/English
https://m.facebook.com/story.php?story_fbid=1409002919454152&id=100010333725264


Quote
In terms of this criterion of "keeping the transactions in order" from our paper, this technique doesn't necessarily do this.
-I don't know, maybe in most cases I like "trace" how a certain amount of currency got spent.
Quote

For this reason, I might expect the proofs to be larger, but ultimately I don't see why they should be larger than what UTREEXO currently does.
-The truth is I didn't exactly sense as welcoming reaction in the Utreexo gethub discussions?. I don't know should try again or find another "coding partner" who is familiar with the environment to help in the programming, testing,...etc.
.
Mile Thanks again, for the assuring reply 🙏
...
»»»One more thing, if this idea could trace how an amount of money got spent, do u think it could be used with some enhancements in introducing an analogy to NFT in Bitcoin?Huh
I mean maybe another object than UTXO called DUTXO for Distinguished that keeps track of the history of ownership in a tree (if it is to be divided or shared)Huh
Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 09, 2021, 05:37:18 AM
 #15

I should also mention that I found that it seems  according to this article that Merkle Trees r officially implemented in the most traditional way!
https://codetrips.com/2020/08/02/modern-c-implementing-a-merkle-tree/

In there u can read ...

Quote
Rather obviously, this results in a highly unbalanced tree:
however, as the purpose of a MerkleTree is to hold data and ensure its validity against tampering or computation errors (and not, for example, search efficiency) this is a very convenient trade-off: re-balancing the tree after each (or even several) addition would require recomputing almost all of the nodes’ hashes (apart from the leaf nodes, which remain unchanged); given that this would likely require many cryptographic operations (which are notoriously CPU intensive) it would be a rather poor trade-off.
NotATether
Hero Member
*****
Online Online

Activity: 602
Merit: 1877


Cryptographic Crawler


View Profile WWW
May 14, 2021, 08:31:39 AM
Last edit: May 14, 2021, 09:15:43 AM by NotATether
 #16

@Shymaa-Arafat

I just ran the utreexo server overnight. The driver program itself isn't the best as I had to stop Bitcoin Core before running the server and then delete an incomplete pollardFile, but the client is also up and running.

Now I am seeing output like this:

Code:
got to height 170000
got to height 171000
got to height 172000
got to height 173000
got to height 174000
got to height 175000
got to height 176000
got to height 177000
got to height 178000
got to height 179000
Block 180000 add 1655251 del 1231778 pol nl 423473 roots 10 he 1654444 re 1259415 ow 0
 plus 0.00 total 204.56

I believe "total" is just the running time, but I'm unable to make sense of the rest of the data. Perhaps "add" and "del" represent the number of addition and deletion operations, or UTXOs that have been added or deleted? And maybe "roots" is the number of utreexo roots.

Can you explain what I'm seeing? I think I'll also take a closer look at the way Core itself is constructing its transaction database.

EDIT: Apparently it is using so much resources that it managed to crash my machine and force me to do a power cycle.

Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 14, 2021, 10:14:12 AM
 #17

Quote
Perhaps "add" and "del" represent the number of addition and deletion operations, or UTXOs that have been added or deleted? And maybe "roots" is the number of utreexo roots.

I'm not part of the Utreexo project, I can just guess that those r the no of deletions & insertions performed, the next is the difference ie the increase in the no of UTXOs, root 10 probably no of roots (Utreexo is a forest of trees, so means after this block they have 10 trees), maybe the next 2 numbers r the max, and min height of those 10 trees (If so seems degenerate?)
.
If those r really no of deletions & insertions, this means my replacement suggestion will make a significant performance improve,  about 16/(16+12) meaning 3/7 of the work will be saved.

However, we r not sure about the resulting tree structure?gives better results in one tree or in forest?
ie the no of operations is reduced to almost half, but the single operation may become more complex if the number of concatenated hashes that need to be updated increases. Also, the storage.
I think my tree will resemble a transaction graph that's why I'm studying it.
.
Quote
EDIT: Apparently it is using so much resources that it managed to crash my machine and force me to do a power cycle.

Bolton Bailey said the drawback in his Trie implementation is that it runs in several days for it involves many disk accesses, while Utreexo runs in 1 day; but the resulting forest is about 133G I think, the Trie is 63G.
.
Thanks for the info & the trying effort.
kcalvinalvinn
Newbie
*
Offline Offline

Activity: 1
Merit: 0


View Profile
May 15, 2021, 08:18:19 AM
 #18


Code:
got to height 170000
got to height 171000
got to height 172000
got to height 173000
got to height 174000
got to height 175000
got to height 176000
got to height 177000
got to height 178000
got to height 179000
Block 180000 add 1655251 del 1231778 pol nl 423473 roots 10 he 1654444 re 1259415 ow 0
 plus 0.00 total 204.56

I believe "total" is just the running time, but I'm unable to make sense of the rest of the data. Perhaps "add" and "del" represent the number of addition and deletion operations, or UTXOs that have been added or deleted? And maybe "roots" is the number of utreexo roots.

Can you explain what I'm seeing? I think I'll also take a closer look at the way Core itself is constructing its transaction database.


Block 180000 -> The current height it's at
add -> all the added leaves (TXOs)
del -> all the deletions that happened (STXOs)
pol nl -> pollard numLeaves. All the current UTXOs
roots -> Utreexo roots
he -> Height. Height of the Utreexo tree
re -> RememberEver. The count of all the leaves remembered through caching. This is stale and we have a newer caching algorithm.

Quote
EDIT: Apparently it is using so much resources that it managed to crash my machine and force me to do a power cycle.

Yeah that's a known problem https://github.com/mit-dci/utreexo/issues/264. cmd/utreexoclient (CSN) likely has a memory leak. All the binaries in cmd/ aren't really maintained that much as there's more focus on the btcd fork with utreexo https://github.com/mit-dci/utcd. None of the binaries are really user friendly yet. We're still finalizing the accumulator design so not really focused on user-friendliness of the binaries.
Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 16, 2021, 05:26:23 AM
Merited by ETFbitcoin (1)
 #19

To give the complete picture here, the previous reply came as a response of me starting an issue #270  inquiring about the run results at the Utreexo site
https://github.com/mit-dci/utreexo/issues/270
Summary of the rest not said here:
 -yes total is the total running time.
-Still not convinced why the forest height is too large, 1654444 is not O(log N) by any case?!
and the print code line doesn't clear it
Quote

c.ScanBlock(blocknproof.Block)
 
    if c.CurrentHeight%10000 == 0 {
       fmt.Printf("Block %d add %d del %d %s plus %.2f total %.2f \n",
          c.CurrentHeight, totalTXOAdded, totalDels, c.pollard.Stats(),          plustime.Seconds(), time.Since(starttime).Seconds())

They r all part of the c.pollard.Stats() that I couldn't find its complete trace in the given code
.
To be honest and showing the complete picture, he also gave a "not preferring" reply to me proposing the same idea at their site
https://github.com/mit-dci/utreexo/discussions/268

Quote
While the design where additions and deletions happen at the same time would help a bit, hash savings would be limited and wouldn't help with the real bottleneck, the proof size.
.
The fact that the proof size depends on the tree height (as stated in their paper and known in the literature), the resulting height in their runs is encouraging me to pursue on with my idea (not have to be cooperated with their project). Also the transaction graph analysis; replacement is analogy to address reuse. So if they prevented address reuse because it violates anonymity connecting user data together, the same Locality of reference will probably be inherited in my Merkle Tree which decreases the proof size by having many repeated internal nodes in the batch(will be fetched only once)
Shymaa-Arafat
Member
**
Offline Offline

Activity: 95
Merit: 59


View Profile
May 16, 2021, 09:06:25 AM
 #20

They finally replied
Quote
he -> Hashes ever. All the hashing ever done.
.
Well, I closed the issue but now I wonder again from where came the difference than the no of insertions?
165251-1654444=807?
-Also, if it's obviously should be the same why calculate it to appear in the o/p?
Is it just double-checking, or the difference is due to address reuse?
-If it is address reuse, how do they check it?
Something still not clear
Pages: [1] 2 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!