Erlay looks flawed to me - a superior Invitation-less bitcoin protocol proposal

watashi-kokoto (OP)

Sr. Member

Offline

Activity: 687
Merit: 269

Erlay looks flawed to me - a superior Invitation-less bitcoin protocol proposal

July 30, 2024, 02:37:06 PM

With interest, I read the paper "Bandwidth-Efficient Transaction Relay in Bitcoin" (Erlay Project). It seems quite flawed, and I believe I can offer a better approach.

Let's reconsider full mempool reconciliation. We can train a reconciliation AI model on a CPU. The training times for an ordinary CPU are as follows:

0.147s for 100 transactions, 100 inner Merkle tree nodes
3.984s for 1,000 transactions, 1,000 inner Merkle tree nodes
Very long for 5,000 transactions, 5,000 inner Merkle tree nodes

Now, applying bucketing (sharding by the initial hash byte, which divides the problem into 256 "equal" subproblems) on the same CPU yields:

0.004s * 256 = 1.024s for 1,000 transactions, 1,000 inner Merkle tree nodes, training 256 models on 8 data points each [48 * 256 bytes model]
0.029s * 256 = 7.424s for 5,000 transactions, 5,000 inner Merkle tree nodes, training 256 models on 40 data points each [157 * 256 bytes model]
0.057s * 256 = 14.592s for 10,000 transactions, 10,000 inner Merkle tree nodes, training 256 models on 80 data points each [258 * 256 bytes model]

Surprisingly, it's faster? And it will benefit from future CPU advancements. Maybe mining farms won’t mind?

If Party A sends this 66KB (44KB more realistically) model to Party B, Party B will immediately be able to respond with only the transactions that Party A doesn't have. The best part: notice how the AI model is smaller than a simple vector of hashes? It's about 3x-4x smaller.

Our model (actually, 256 models) will be able to predict the following boolean values based on TX/tree internal node 256-bit hash:

reconcile(model, hash1) = true // I have this object, and it is a TX hash (Merkle tree leaf). It is NOT a root/inner node.
reconcile(model, hash2) = true or false // I don't have this object (false positive/false negative)
reconcile(model, hash3) = false // This object is NOT a TX HASH but a Merkle tree root/inner node hash

Here is the proposed protocol:

ProtocolPacketUnsolicitedData:
field: "block_header": raw_header ([80]byte) // Note: contains the Merkle root
field: "block_height": 814201

ProtocolPacketReconcileRequest:
field: "merkle_root": hash
field: "my_best_model_hash_or_hashes": {model_hash} // Multiple or none, used only by a proxy; otherwise, one
field: "block_height": 814201

ProtocolPacketReconcileModelRequest:
field: "requested_model": model_hash ([32]byte)

ProtocolPacketReconcileModelResponse:
field: "model_hash": model_hash ([32]byte)
field: "model_response": raw_model ([]byte)

ProtocolPacketReconcileResponse:
field: "block_height": 814201
field: "merkle_root": ([32]byte)
field: "model_hash_vector": {model1_hash, model2_hash} ([][32]byte) // Used only by a proxy; otherwise, empty
field: "transactions": raw_txs ([][]byte)
field: "merkle_nodes": raw_inodes_left_right_hashes ([][2][32]byte)

Full node protocol:

A block is mined by us/new TX signed, and added to the chain/mempool respectively.
If it’s a block, we send it to everyone using the unsolicited ProtocolPacketUnsolicitedData packet.
Initialize reconciliation for the just-mined block (height), and perform periodic reconciliation for the next candidate block (height+1).
When a new reconciliation model (based on Merkle root and nodes) is trained, we send it to all peers via ProtocolPacketReconcileRequest.
The other party may download it using ProtocolPacketReconcileModelRequest/ProtocolPacketReconcileModelResponse.
The other party will perform reconciliation and directly send us the objects that we don't have for that specific height/candidate block.

Full node protocol - Handling ProtocolPacketReconcileRequest from another node:

If the block_height is far in the past or future, ignore it.
If the known Merkle root is on the chain or known chain branch, ignore it; we already have this.
If the sender's my_best_model_hash is different, update the my_best_model_hash for this connection and attempt to download the full model using ProtocolPacketReconcileModelRequest.
Once we have the ProtocolPacketReconcileModelResponse (the full model), determine if this is a mempool reconciliation or a block reconciliation based on the block_height.
Perform inference on the model and our view of the mempool/on-chain block. For each mempool/on-chain block object:
For each Merkle internal node, check if the model says the other node thinks it is truly an internal node. If the model is wrong, send the inode object.
For each transaction hash leaf, check if the model says the other node thinks it is truly a transaction hash leaf. If the model is wrong, send the whole transaction.
Send the ProtocolPacketReconcileResponse.

Proxy/Relay/Non-reconciliating node protocol - Handling ProtocolPacketReconcileRequest:

If the block_height is far in the past or future, ignore it.
If the known Merkle root is on the chain or known chain branch, ignore it; we already have this.
When a ProtocolPacketReconcileRequest comes in, set the inbound hash for this connection and add the hash to the outbound bag for all other connections.
Proxy the ProtocolPacketReconcileRequest to all other connections, containing the same block_height, same merkle_root, but their own connection's outbound hash bag set into the my_best_model_hash_or_hashes.

Proxy/Relay/Non-reconciliating node protocol - Handling ProtocolPacketReconcileResponse:

Determine which objects some peers don't have based on their connection's latest available model for that block height/merkle root.
Construct a ProtocolPacketReconcileResponse if they made a request before.
Send the TX data/tree inode data to them.

Proxy/Relay/Non-reconciliating node protocol - Handling ProtocolPacketReconcileModelRequest:

Identify which connection has the inbound model hash.
Set up mapping to send the response based on the model hash to whoever requested it.
Proxy the request to the node that has the inbound model hash, then proxy its response back (and save the response to the connection; this is how the proxy discovers models).

NotFuzzyWarm

Legendary

Offline

Activity: 3738
Merit: 2666

Evil beware: We have waffles!

Re: Erlay looks flawed to me - a superior Invitation-less bitcoin protocol proposal

July 30, 2024, 09:06:13 PM
Last edit: July 30, 2024, 10:23:05 PM by NotFuzzyWarm

And my guess is that it would not be Legacy compatible so would be just another fork of Bitcoin - aka, an altcoin and not Bitcoin...

- For bitcoin to succeed the community must police itself - My info useful? Donations welcome! 3NtFuzyWREGoDHWeMczeJzxFZpiLAFJXYr
-Sole remaining active Primary developer of cgminer, Kano's repo is here
-Support Sidehacks miner development. Donations to: 1BURGERAXHH6Yi6LRybRJK7ybEm5m5HwTr

gmaxwell

Moderator
Legendary

Offline

Activity: 4242
Merit: 8654

Re: Erlay looks flawed to me - a superior Invitation-less bitcoin protocol proposal

July 30, 2024, 09:28:43 PM
Last edit: July 30, 2024, 09:42:15 PM by gmaxwell

Merited by vjudeu (1)

Quote from: watashi-kokoto on July 30, 2024, 02:37:06 PM

With interest, I read the paper "Bandwidth-Efficient Transaction Relay in Bitcoin" (Erlay Project). It seems quite flawed, and I believe I can offer a better approach.

In what respect? The core reconciliation mechanism in erlay is arguably close to information theoretically optimal: If peer A wants to tell peer B about the IDs a knows and B doesn't know, it need only send as much data as the number of missing IDs, no matter how many they have in common even though A has no idea what txn B are missing.

Reconciling relay sets is almost functionally equivalent to reconciling mempools, however mempools can have persistent differences due to policy differences between nodes or due to ties in preferences between conflicting transactions. This means that there is a risk of continually leaking bandwidth due to a persistent difference in the mempool, which I think makes reconciling relay sets more attractive.

Quote

Let's reconsider full mempool reconciliation. We can train a reconciliation AI model on a CPU. The training times for an ordinary CPU are as follows:

What precisely is this machine learning model learning to predict? Absent such a description your post is hard to distinguish from the significant numbers of non-technical conmen in the cryptocurrency space that just spew jargon.

Quote

Now, applying bucketing (sharding by the initial hash byte, which divides the problem into 256 "equal" subproblems) on the same CPU yields:

The author of transactions can control the bytes of their hash, so any scheme that hopes to distribute work/load with respect to transactions are vulnerable to attack (e.g. attacker mines flood of transaction that all begin with a common byte).

Quote

If Party A sends this 66KB (44KB more realistically) model to Party B, Party B will immediately be able to respond with only the transactions that Party A doesn't have. The best part: notice how the AI model is smaller than a simple vector of hashes? It's about 3x-4x smaller.

That is a substantial amount of data, several times what the Bitcoin network uses today to relay a block in the presence of common transaction, which itself is substantially larger than we know is possible (existing mechanism is used for implementation simplicity and to reduce the latency impact of decode time).

Checking a friend's node, just to given an example, I see that the last 7 blocks took the following number of bytes to relay to this node: 38227, 18617, 30581, 31484, 41335, 42155, 24038.

Quote

reconcile(model, hash3) = false // This object is NOT a TX HASH but a Merkle tree root/inner node hash

Transaction order is a property of both the dependency graph and the miners selection algorithm, as a result attempting to reconcile on interior nodes will mostly just waste bandwidth because even with the transactions are the same the alignment in the tree is often different. Efficient reconciliation protocols effectively don't use any bandwidth for common data, so there shouldn't be an advantage on reconciling on interior nodes.

Quote

Full node protocol:

   A block is mined by us/new TX signed, and added to the chain/mempool respectively.
   If it’s a block, we send it to everyone using the unsolicited ProtocolPacketUnsolicitedData packet.
   Initialize reconciliation for the just-mined block (height), and perform periodic reconciliation for the next candidate block (height+1).
   When a new reconciliation model (based on Merkle root and nodes) is trained, we send it to all peers via ProtocolPacketReconcileRequest.
   The other party may download it using ProtocolPacketReconcileModelRequest/ProtocolPacketReconcileModelResponse.
   The other party will perform reconciliation and directly send us the objects that we don't have for that specific height/candidate block.

This and the following text has no coverage for if it's not a block but a new transaction. Erlay is *not* a block relay protocol. It is a transaction relay protocol, and given my above efficiency comment I don't think what you're imagining is particularly interesting as a block relay protocol compared to what is already deployed.

vjudeu

Hero Member

Offline

Activity: 825
Merit: 1977

Re: Erlay looks flawed to me - a superior Invitation-less bitcoin protocol proposal

July 31, 2024, 07:39:20 AM

Quote

Exactly. Mining a single byte in transaction ID is very easy. There are transactions with four mined bytes, for example: https://mempool.space/tx/000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51

█▀▀▀
█
█
█
█
█
█
█
█
█
█
█
█▄▄▄

▀▀▀▀▀▀▀▀▀▀▀
e
▄▄▄▄▄▄▄▄▄▄▄

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
c.h.
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄

▀▀▀█
█
█
█
█
█
█
█
█
█
█
█
▄▄▄█

▄██████▄▄▄
█████████████▄▄
███████████████
███████████████
███████████████
███████████████
███░░█████████
███▌▐█████████
█████████████
███████████▀
██████████▀
████████▀
░▀▀██▀▀

Pages: [1]

Bitcoin Forum > Bitcoin > Development & Technical Discussion > Erlay looks flawed to me - a superior Invitation-less bitcoin protocol proposal

« previous topic next topic »