Bitcoin Forum
Topic: Using references in order to compress the byte size of transactions
kabab (OP)
Newbie
Activity: 1
Merit: 0
July 11, 2018, 06:31:20 AM
 #1

I was thinking of ways a blockchain's format might be designed so that its transaction records can be made more compact. One piece of low-hanging fruit appears to be the following.

Transaction inputs spend previous outputs on the chain. If subsequent inputs only referenced those outputs instead of recording them by value, then perhaps the byte size of transaction records could be reduced significantly.

The idea would work something like this: whenever a new output is recorded on the chain, an unused (monotonically increasing) numeric ID is recorded next to it so that subsequent blocks may reference that output by ID. Since the space of IDs so defined can probably fit in something like 8-10 bytes, it seems like a considerable saving.
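
A minimal sketch of that numbering scheme, assuming IDs are handed out in block order as each block is connected. The `OutputIdRegistry` class and its method names are hypothetical, made up for illustration; nothing like it exists in Bitcoin today.

```python
class OutputIdRegistry:
    """Hands out the next unused numeric ID to every newly confirmed output."""

    def __init__(self):
        self.next_id = 0
        self.by_id = {}  # ID -> (txid, output index)

    def connect_block(self, block):
        """Assign IDs to all outputs of a block.

        `block` is a list of (txid, number_of_outputs) pairs in block order.
        Returns a mapping (txid, output index) -> assigned ID.
        """
        assigned = {}
        for txid, n_outputs in block:
            for vout in range(n_outputs):
                self.by_id[self.next_id] = (txid, vout)
                assigned[(txid, vout)] = self.next_id
                self.next_id += 1
        return assigned


registry = OutputIdRegistry()
ids = registry.connect_block([("aa" * 32, 2), ("bb" * 32, 1)])
print(ids[("bb" * 32, 0)])  # third output recorded so far -> ID 2
```

A later input would then carry only the small integer, and every node would resolve it through its own copy of the registry.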

Am I overlooking something? (Is/was this something already considered?)
odolvlobo
Legendary
Activity: 4312
Merit: 3214
July 11, 2018, 07:05:52 PM
 #2

How would two different wallets agree on the IDs of the outputs when they construct their transactions? If the IDs are set when the transaction is included in a block, then how would an output that is not yet in a block be referenced?
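
For comparison, the hash-based reference used today does not have this problem, because a txid depends only on the transaction's own bytes. A minimal sketch, assuming `raw_tx` holds the serialized transaction without witness data (the function name and the placeholder bytes are made up for illustration):

```python
import hashlib


def txid_from_raw(raw_tx: bytes) -> str:
    """Return the txid as usually displayed: double SHA-256, byte-reversed, hex."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()


# Placeholder bytes standing in for a real serialized transaction.
unconfirmed_tx = b"placeholder, not a real transaction"
print(txid_from_raw(unconfirmed_tx))
```

Any wallet that sees the same bytes computes the same txid, whether or not a miner has included the transaction yet. A miner-assigned ID would only exist after confirmation.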

Join an anti-signature campaign: Click ignore on the members of signature campaigns.
PGP Fingerprint: 6B6BC26599EC24EF7E29A405EAF050539D0B2925 Signing address: 13GAVJo8YaAuenj6keiEykwxWUZ7jMoSLt
aliashraf
Legendary
Activity: 1456
Merit: 1174
Always remember the cause!
July 11, 2018, 08:03:40 PM
Last edit: July 11, 2018, 09:01:15 PM by aliashraf
 #3

Quote from: odolvlobo on July 11, 2018, 07:05:52 PM
> How would two different wallets agree on the IDs of the outputs when they construct their transactions? If the IDs are set when the transaction is included in a block, then how would an output that is not yet in a block be referenced?

For reasons other than reducing transaction size, I like the idea of using references to blockchain data instead of the real addresses. For example, by referencing a transaction in a block, a wallet implies its adherence to the state of the blockchain at least up to the referenced block.

Your objection could easily be resolved by supporting both the legacy transaction format and the compact form proposed by the OP.
achow101
Moderator
Legendary
expert
Activity: 3388
Merit: 6635
Just writing some code
July 11, 2018, 10:04:37 PM
Merited by Foxpup (3), ABCbits (1)
 #4

Quote from: kabab on July 11, 2018, 06:31:20 AM
> Transaction inputs spend previous outputs on the chain. If subsequent inputs only referenced those outputs instead of recording them by value, then perhaps the byte size of transaction records could be reduced significantly.

They already are referred to by reference, not by value. The reference is known as the outpoint, which consists of the hash of the transaction containing the output and the 0-based index of the output. This is a reference to where a node can find the output; it is not the output itself.
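
For concreteness, an outpoint serializes to 36 bytes: the 32-byte transaction hash followed by a 4-byte little-endian output index. A small sketch (the helper name and the example txid are made up for illustration):

```python
import struct


def serialize_outpoint(txid_hex: str, vout: int) -> bytes:
    """Serialize an outpoint: 32-byte txid (internal byte order) + uint32 index."""
    txid = bytes.fromhex(txid_hex)[::-1]  # displayed txids are byte-reversed
    if len(txid) != 32:
        raise ValueError("txid must be 32 bytes")
    return txid + struct.pack("<I", vout)


outpoint = serialize_outpoint("ab" * 32, 1)
print(len(outpoint))  # 36
```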

Quote from: aliashraf on July 11, 2018, 08:03:40 PM
> For reasons other than reducing transaction size, I like the idea of using references to blockchain data instead of the real addresses.

There is no such thing as "real addresses", and Bitcoin already uses references in many places.

aliashraf
Legendary
Activity: 1456
Merit: 1174
Always remember the cause!
July 12, 2018, 11:03:27 AM
Last edit: July 12, 2018, 07:45:13 PM by aliashraf
 #5

Quote from: achow101 on July 11, 2018, 10:04:37 PM
> Quote from: kabab on July 11, 2018, 06:31:20 AM
>> Transaction inputs spend previous outputs on the chain. If subsequent inputs only referenced those outputs instead of recording them by value, then perhaps the byte size of transaction records could be reduced significantly.
> They already are referred to by reference, not by value. The reference is known as the outpoint, which consists of the hash of the transaction containing the output and the 0-based index of the output. This is a reference to where a node can find the output; it is not the output itself.
> Quote from: aliashraf on July 11, 2018, 08:03:40 PM
>> For reasons other than reducing transaction size, I like the idea of using references to blockchain data instead of the real addresses.
> There is no such thing as "real addresses", and Bitcoin already uses references in many places.

You should be more cautious about calling a transaction input (the outpoint field) a reference; technically, it is not.

A reference should carry valuable information about an external event or piece of data, which is not the case with the current Bitcoin implementation of input addresses. Let's examine it more closely:

First, we have a transaction whose output(s) are tweaked (properly prefixed and padded) RIPEMD-160 hash(es) of the respective public key(s).
Please note: once this transaction has propagated through the network and sits in the mempool, it is no more valuable than when it was created.

Now suppose that, for some reason, we want to 'refer' to one of the outputs by saying that it is the nth output of the transaction (identified by its hash/ID) instead of using the original wallet address used for the output. This is how Bitcoin actually implements inputs, as you have correctly mentioned. Is that really a reference?

No! It is not. The hash of a value is not a reference to it. Hashes don't contain any information of their own, and they don't lead you to any new data or information.
Ideally speaking, the hash of a value is nothing other than the value itself, a version of it. Having the ID of a transaction won't give you any new information about it; you need access to the original, and to get that you have to find it, probably by searching a data structure.

This is why you can conventionally call Bitcoin transaction inputs 'real addresses': they are real, an alternative version of the reality with the same information value, obtainable from the transaction.

I'm not questioning this technique. It is actually good for many reasons, but it is neither the only option nor the best one.

What the OP suggests is a true reference to the blockchain data. To make it more formal:

A blockchain can be understood and treated as an immutable ordered list of transactions. One can trivially derive an ordered list of outputs from the transaction list.
An input could then identify the output it spends by that output's absolute position in the derived list, as an alternative to the current approach of using the original transaction's hash.

My first evaluations suggest consequences wider than just compression; e.g. it helps security by making bootstrap/long-range attacks orders of magnitude more difficult.

pebwindkraft
Sr. Member
Activity: 257
Merit: 343
July 12, 2018, 11:20:52 AM
Merited by achow101 (2), ABCbits (1)
 #6

Quote from: aliashraf on July 12, 2018, 11:03:27 AM
> ...
> You should be more cautious about calling a transaction input a reference; technically, it is not.
>
> A reference should carry valuable information about an external event or piece of data, which is not the case with the current Bitcoin implementation of input addresses. Let's examine it more closely:
>
> First, we have a transaction whose output(s) are tweaked (properly prefixed and padded) RIPEMD-160 hash(es) of the respective public key(s).
> Please note: once this transaction has propagated through the network and sits in the mempool, it is no more valuable than when it was created.
>
> Now suppose that, for some reason, we want to 'refer' to one of the outputs by saying that it is the nth output of the transaction (identified by its hash/ID) instead of using the original wallet address used for the output. This is how Bitcoin actually implements inputs, as you have correctly mentioned. Is that really a reference?

I don't get the mixture between hashes and index. Maybe I am missing something?

The previous transaction is found by giving its hash, and the output to spend from is given as a number starting from zero. So both are pointers into a previous tx, as opposed to providing the whole data structure of a previous tx, which saves a lot of space... and one could call this a reference?

On addresses it is clear that we are talking about data representation. The Bitcoin address is derived from the public key via some conversions and mathematical functions, which are not (as per today's knowledge) reversible.

So when it comes to data for the input, it can be considered a reference, whereas Bitcoin addresses don't appear in a tx, only the public key or its hash. They are not references in this sense.
aliashraf
Legendary
Activity: 1456
Merit: 1174
Always remember the cause!
July 12, 2018, 12:35:15 PM
Last edit: July 12, 2018, 01:09:17 PM by aliashraf
Merited by ABCbits (1), pebwindkraft (1)
 #7

Quote from: pebwindkraft on July 12, 2018, 11:20:52 AM
> Quote from: aliashraf on July 12, 2018, 11:03:27 AM
>> ...
>> You should be more cautious about calling a transaction input a reference; technically, it is not.
>>
>> A reference should carry valuable information about an external event or piece of data, which is not the case with the current Bitcoin implementation of input addresses. Let's examine it more closely:
>>
>> First, we have a transaction whose output(s) are tweaked (properly prefixed and padded) RIPEMD-160 hash(es) of the respective public key(s).
>> Please note: once this transaction has propagated through the network and sits in the mempool, it is no more valuable than when it was created.
>>
>> Now suppose that, for some reason, we want to 'refer' to one of the outputs by saying that it is the nth output of the transaction (identified by its hash/ID) instead of using the original wallet address used for the output. This is how Bitcoin actually implements inputs, as you have correctly mentioned. Is that really a reference?
> I don't get the mixture between hashes and index. Maybe I am missing something?
>
> The previous transaction is found by giving its hash, and the output to spend from is given as a number starting from zero. So both are pointers into a previous tx, as opposed to providing the whole data structure of a previous tx, which saves a lot of space... and one could call this a reference?

No. It is not a reference. It is just a hash. A hash is not an index or a pointer (pointers are indexes into an ordered list of bytes, i.e. memory), although it is common practice, when there is no ambiguity, to 'treat' one like a reference.

When you give someone a hash of some data, you have just offered an alternative version of the same data. It is temporarily useless, though once the data is revealed the hash can be used for comparison and security purposes. In the end, no value is added to the data.

By comparison, a reference to the data (or to its hash, as is the case here) yields both the data (or its hash) and its location in a list. References are richer than the raw data. They are processed data, i.e. information. Even a simple pointer to a data structure is more valuable than a copy of it.

In the context of this discussion, an index to a confirmed transaction (precisely, a reference to one of its outputs) on the blockchain is far more information-rich than a hash of the same transaction. E.g. by embedding it as an input, the user is acknowledging the state of the blockchain at the height of the containing block.
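
To illustrate what "location in a list" can provide: if every output is numbered by its absolute position on the chain, the height of the containing block can be recovered from the number alone, given the per-block output counts. A sketch under that assumption (the function name and the example counts are hypothetical):

```python
from bisect import bisect_right
from itertools import accumulate


def height_of_output(position: int, outputs_per_block: list) -> int:
    """Return the height of the block that created the output at `position`.

    outputs_per_block[h] is the number of outputs created in block h.
    """
    ends = list(accumulate(outputs_per_block))  # ends[h] = outputs in blocks 0..h
    total = ends[-1] if ends else 0
    if not 0 <= position < total:
        raise IndexError("no output at that position")
    return bisect_right(ends, position)


counts = [1, 3, 2]  # blocks 0, 1 and 2 create 1, 3 and 2 outputs
print(height_of_output(4, counts))  # 2
```

A txid by itself carries no such positional information; a node has to look it up in an index to learn where the transaction was confirmed.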
odolvlobo
Legendary
Activity: 4312
Merit: 3214
July 12, 2018, 05:58:55 PM
 #8

Another issue is that any fork in the block chain might result in transactions that are valid on one branch being considered invalid on the other branch because the references to the outputs in the competing blocks would be different (assuming that the reference is determined by the miner).
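
A toy illustration of the problem, assuming each miner numbers outputs in the order they appear in its own block (all names and data below are made up):

```python
def number_outputs(block, first_id):
    """Map sequential IDs to (txid, output index), in block order.

    `block` is a list of (txid, number_of_outputs) pairs.
    """
    outpoints = [(txid, vout) for txid, n in block for vout in range(n)]
    return {first_id + i: outpoint for i, outpoint in enumerate(outpoints)}


# Two competing blocks at the same height both include transaction "pay".
branch_a = number_outputs([("coinbase_a", 1), ("pay", 2)], first_id=100)
branch_b = number_outputs([("coinbase_b", 1), ("other", 3), ("pay", 2)], first_id=100)

print(branch_a[101])  # ('pay', 0)
print(branch_b[101])  # ('other', 0)
```

A transaction spending ID 101 spends the first output of "pay" on one branch and a different output on the other, so it cannot simply be carried over after a reorganization.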

aliashraf
Legendary
Activity: 1456
Merit: 1174
Always remember the cause!
July 12, 2018, 07:40:22 PM
Last edit: July 12, 2018, 09:13:45 PM by aliashraf
 #9

Quote from: odolvlobo on July 12, 2018, 05:58:55 PM
> Another issue is that any fork in the block chain might result in transactions that are valid on one branch being considered invalid on the other branch because the references to the outputs in the competing blocks would be different (assuming that the reference is determined by the miner).

It is rather an advantage (the one I'm in love with ;) ). It helps security, gives wallets more power in securing the blockchain, and should be considered a disruptive improvement with a wide range of socio-economic and political consequences that deserve in-depth investigation and analysis.

It is worth mentioning that referencing data on the blockchain by using the index of the output as the input is a brilliant idea, but in its naive form it doesn't help security significantly, because an adversary can rewrite the blockchain so that he would be able to "steal" the transactions.

To avoid this, we have to remember the 28 bytes of valuable space that would become available! 8 bytes suffice to address more than 18 quintillion (2^64 ≈ 1.8 × 10^19) outputs, practically an infinity, so out of the current 36-byte outpoint field we are left with 28 bytes to make things really hard for the adversary. Say, by adding a strong checksum/truncated reference to the containing block's hash and, even more, another checksum for the latest block the wallet is willing to help finalize, which is typically a much "younger" block.
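
One possible way to lay out those 36 bytes, purely as a sketch: the 8/14/14 split, the choice of which end of each hash to keep, and every name below are assumptions made for illustration, not a worked-out format.

```python
import hashlib
import struct

OUTPOINT_SIZE = 36                         # size of today's outpoint field
ID_SIZE = 8                                # 64-bit output ID
TAG_SIZE = (OUTPOINT_SIZE - ID_SIZE) // 2  # 14 bytes per truncated block hash


def pack_reference(output_id: int, containing_block: bytes, recent_block: bytes) -> bytes:
    """Pack an output ID plus two truncated block hashes into 36 bytes."""
    return struct.pack("<Q", output_id) + containing_block[:TAG_SIZE] + recent_block[:TAG_SIZE]


def matches_chain(ref: bytes, containing_block: bytes, recent_block: bytes) -> bool:
    """Check that a reference commits to the given pair of block hashes."""
    expected = containing_block[:TAG_SIZE] + recent_block[:TAG_SIZE]
    return ref[ID_SIZE:] == expected


# Stand-in 32-byte values; real block hashes would come from the chain.
block_x = hashlib.sha256(b"block x").digest()
block_y = hashlib.sha256(b"block y").digest()
rewritten = hashlib.sha256(b"rewritten block x").digest()

ref = pack_reference(12345, block_x, block_y)
print(len(ref))                                # 36
print(matches_chain(ref, block_x, block_y))    # True
print(matches_chain(ref, rewritten, block_y))  # False
print(2 ** 64)                                 # 18446744073709551616
```

With a layout like this, a reference created against one version of the chain no longer matches once the containing block has been replaced.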

The OP is not interested in this subject and is just thinking about compacting stuff. I don't care about compression. The main point is user contribution to security, a totally new horizon which I'm already embracing and adopting in my personal work, PoCW (Proof of Contributive Work).
