Tx Chain Clarification

etotheipi (OP)

Legendary

Offline

Activity: 1428
Merit: 1093

Core Armory Developer

Tx Chain Clarification

July 08, 2011, 04:17:45 AM

#1

I have been trying to figure out exactly what information exists in each transaction and each block. I understand the merkle trees and the chain of block headers, and how everything is linked through hashes. What I don't understand is what is actually contained in a TxIn and OutPoint object. Does each transaction reference a previous TxOut? Does that TxOut have to have enough BTC to cover the amount of the TxIn? If so, doesn't it get complicated to track partial balances in all previous TxOuts that eventually have to be accumulated if the user wants to empty those accounts?

Perhaps I'm missing something. There are three things I really want to know:

(1) Are the transactions chained to one another such that you can simply follow the chain backwards to get all the information you need to know about a particular address, given the last transaction involving that address?
(2) Is the specification telling me that I will need to hold the blockchain if I want to construct a transaction? What is the minimum amount of information I would need to hold on my phone, for my lite phone-client to be able to construct a valid transaction? I was planning to keep only the block headers, balance of the account, and the private key. I don't want to store any block information. Will I have to keep previous transaction data?
(3) What information is contained in a transaction that prevents it from being repeated/re-broadcast by an attacker? Is it linked to a specific block? If that transaction is broadcast just as the next block is solved, does it have to be re-broadcast? I thought a transaction could be included in any future block, but then I don't know what would prevent someone you just paid from re-broadcasting and paying themselves again.

And on a side note, I haven't quite figured out how block "timestamping" works. In some places it looks like timestamps are 32-bit unsigned numbers, in some cases they are block numbers between 0 and 2015. I am uncertain how the unix-time timestamps could work when you have unsynchronized clocks across all nodes. If I want to reference a specific block, do I only provide the header hash and the node searches the headers for it? Or is it okay to say "Block #122,245" ?

Thanks for your patience with my questions. I am anxious to start work on a new client, but the specification isn't clear enough for me.
-Eto

Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here! (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)

error

Hero Member

Offline

Activity: 588
Merit: 500

Re: Tx Chain Clarification

July 08, 2011, 04:22:50 AM

#2

I think you may have inputs and outputs reversed.

For any given transaction, there are inputs (existing coins being spent) and outputs (new coins being received). The exception is generation, which has no inputs.

Each input carries a reference to its previous output, including the transaction hash and the index (offset). Thus, you can trace all the way back to the original generated coin if you want.
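A minimal Python sketch of that reference structure may help. The class and field names (OutPoint, TxIn, TxOut, Tx, tx_hash, index) are illustrative assumptions, not the client's actual serialization, and generation is modeled as a transaction with an empty input list, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutPoint:
    tx_hash: str  # hash of the earlier transaction being spent from
    index: int    # which output of that transaction


@dataclass(frozen=True)
class TxOut:
    value: int    # amount, in the smallest unit
    script: str   # spending condition (a placeholder string here)


@dataclass(frozen=True)
class TxIn:
    prevout: OutPoint  # the previous output this input claims
    script_sig: str    # data satisfying that output's condition


@dataclass(frozen=True)
class Tx:
    tx_hash: str
    inputs: tuple   # tuple of TxIn; empty for generation in this sketch
    outputs: tuple  # tuple of TxOut


def trace_to_generation(tx, tx_by_hash):
    """Follow the first input of each transaction back to a generation."""
    path = [tx.tx_hash]
    while tx.inputs:
        tx = tx_by_hash[tx.inputs[0].prevout.tx_hash]
        path.append(tx.tx_hash)
    return path
```

A transaction with several inputs has several such histories; the sketch follows only the first input at each step.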

For a smartphone app, you probably want to implement simplified payment verification, which is described in the paper.

3KzNGwzRZ6SimWuFAgh4TnXzHpruHMZmV8

etotheipi (OP)


Re: Tx Chain Clarification

July 08, 2011, 04:36:34 AM

#3

Are you referencing Satoshi's original paper? Which paper?

So it's not sufficient to say "Address X is transferring 10 BTC to Address Y." It requires including hashes of previous transactions with TxOuts that sum up to the specified TxIn amount in this new transaction? Why does having references to previous TxOuts matter? A complete node is going to have to search the entire transaction history after the last TxOut to verify that the coins weren't spent in another transaction, anyway?

In my head, it makes sense that you would only need to specify "X sends N BTC to Y" and the nodes will check to see if address X has the desired level of funds remaining to execute that transaction, based on the transaction history. I think I'm still missing something...

Perhaps I just need to see the paper you mentioned. I browsed Satoshi's paper a couple weeks ago, but have since forgotten everything except the higher-level concepts. I assumed the actual BTC implementation varied from his original vision, as most ideas like this need tweaking to get from theory to implementation.

-Eto


theymos

Administrator
Legendary

Offline

Activity: 5180
Merit: 12884

Re: Tx Chain Clarification

July 08, 2011, 05:01:15 AM

#4

Transferring from addresses instead of from previous outputs is more complicated, and it wouldn't allow for the scripting system.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD

patvarilly

Guest

Re: Tx Chain Clarification

July 08, 2011, 07:23:34 AM

#5

Quote from: etotheipi on July 08, 2011, 04:17:45 AM

I have been trying to figure out exactly what information exists in each transaction and each block. I understand the merkle trees and the chain of block headers, and how everything is linked through hashes. What I don't understand is what is actually contained in a TxIn and OutPoint object. Does each transaction reference a previous TxOut? Does that TxOut have to have enough BTC to cover the amount of the TxIn? If so, doesn't it get complicated to track partial balances in all previous TxOuts that eventually have to be accumulated if the user wants to empty those accounts?

Perhaps I'm missing something. There are three things I really want to know:

(1) Are the transactions chained to one another such that you can simply follow the chain backwards to get all the information you need to know about a particular address, given the last transaction involving that address?

My current understanding of how this works, derived from https://en.bitcoin.it/wiki/Transactions, https://en.bitcoin.it/wiki/Script and looking at transactions in blockexplorer, is as follows. It's probably easier to think of Bitcoins being stored at transaction outputs, each of which has a value and an associated Bitcoin address (a hash of a public key). One can get at those Bitcoins by specifying the full public key for the address and a signature for the transaction. Since this signature can only be created using the private key for that address, only the "owner" of that address can get at those Bitcoins. Each transaction then says "take *all* the Bitcoins that were at TxOut n_1 of Tx t_1, n_2 of t_2, ..., [here are all the right public keys and signatures] and create TxOut's holding b_1 BTC at address a_1, b_2 BTC at address a_2, ...".

There are never "partial balances" stored at a given transaction output. This is why when you send a small number of BTC to someone, the transaction actually has a TxIn with a lot of BTC, and two TxOut's, one with the BTC for the recipient of the send, and one with the "change" sent back to you (at a different address, which never shows up in the client GUI). See, for example, this transaction: http://blockexplorer.com/tx/56faebec0694f42c201b0afbd2327dc823b8298b3aa4bb3313bab3e2fe026f44. On the other hand, there may be many unclaimed TxOut's in the block chain belonging to the same address, so it's not true that given the last transaction for a particular address, you can reconstruct the total number of BTC that can be sent from that address.
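The "whole output plus change" idea can be sketched in a few lines of Python. This is only an illustration under assumed names: balance, build_spend, and the (outpoint, value) tuples are hypothetical, amounts are integers in the smallest unit, and input selection is simply first-come.

```python
def balance(unspent):
    """Spendable total: the sum of all unclaimed outputs, not one number stored anywhere."""
    return sum(value for _outpoint, value in unspent)


def build_spend(unspent, amount, recipient, change_address, fee=0):
    """Pick whole outputs until they cover amount + fee, then return the remainder as change.

    unspent is a list of ((tx_hash, index), value) pairs.
    Returns (inputs, outputs), where outputs is a list of (address, value).
    """
    selected, total = [], 0
    for outpoint, value in unspent:
        selected.append(outpoint)  # an output is always claimed in full
        total += value
        if total >= amount + fee:
            break
    if total < amount + fee:
        raise ValueError("insufficient funds")
    outputs = [(recipient, amount)]
    change = total - amount - fee
    if change > 0:
        outputs.append((change_address, change))  # change goes back to the sender
    return selected, outputs
```

With unclaimed outputs worth 50 and 30, sending 60 consumes both and creates a 60 output for the recipient plus a 20 change output.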

Quote from: etotheipi on July 08, 2011, 04:17:45 AM

(2) Is the specification telling me that I will need to hold the blockchain if I want to construct a transaction? What is the minimum amount of information I would need to hold on my phone, for my lite phone-client to be able to construct a valid transaction? I was planning to keep only the block headers, balance of the account, and the private key. I don't want to store any block information. Will I have to keep previous transaction data?

You will need to know a transaction with unclaimed TxOut's to the address that you're sending from. You don't need to store the whole blockchain, though at present, you will have to have downloaded and analyzed the blockchain starting from the time when the sender's address first had any Bitcoins. For example, BitcoinJ (the Java library) keeps a list of addresses that belong to you. Whenever it receives a new block at the tip of the main chain, it scans it for transactions with any of your addresses at TxIn's or TxOut's, and keeps a local copy of those transactions. It ignores everything else in the block. In the future, presumably there will be a way of either (a) asking a trusted service or a number of your peers for all recent transactions involving a given address [possibly through pattern matching to avoid revealing to these third parties that you're the address' owner], or (b) asking your peers to only forward the parts of blocks that are relevant to you. You can then verify that a transaction has been incorporated into the block chain if you know its merkle branch.
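Checking a merkle branch against the root in a block header could look roughly like the Python sketch below. It assumes double SHA-256 over concatenated 32-byte hashes and duplication of the last hash on odd-sized levels; the function names are made up for illustration, and the byte-order conventions of real serialized blocks are ignored.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    if len(level) % 2:
        level = level + [level[-1]]  # odd count: pair the last hash with itself
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Sibling hashes needed to rebuild the root from leaves[index]."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])  # sibling of the current node
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(leaf, branch, index, root):
    """Recompute the root from a leaf and its branch; index says left or right at each level."""
    h = leaf
    for sibling in branch:
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index >>= 1
    return h == root
```

A light client would only need verify_branch, the block headers, and a branch supplied by a peer; merkle_root and merkle_branch are included so the sketch can be exercised on its own.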

Quote from: etotheipi on July 08, 2011, 04:17:45 AM

(3) What information is contained in a transaction that prevents it from being repeated/re-broadcast by an attacker? Is it linked to a specific block? If that transaction is broadcast just as the next block is solved, does it have to be re-broadcast? I thought a transaction could be included in any future block, but then I don't know what would prevent someone you just paid from re-broadcasting and paying themselves again.

Any TxOut can only be redeemed by a TxIn of a single transaction. This rule gets enforced when the transaction is incorporated into the block by a miner. Nodes will refuse to accept blocks with transactions violating this rule, so, for instance, the blocks from a malicious miner that incorporate double spends from a single TxOut won't propagate through the network.

You can broadcast a transaction as many times as you want, and it will only be incorporated once into the block chain. My understanding is that the client broadcasts its transactions to all of its connected peers, and rebroadcasts them every 30 minutes until it sees the transaction incorporated in a block.

Someone else can't take your transaction and pay themselves because the signatures at each TxIn cover the *entire* transaction, not just the TxOut that they're redeeming. Your so-called "friend" who you just paid would have to create a new transaction redeeming the TxOut from your address and sending it elsewhere, but he is unable to generate a valid signature for this new transaction since he doesn't have your private key.

Quote from: etotheipi on July 08, 2011, 04:17:45 AM

And on a side note, I haven't quite figured out how block "timestamping" works. In some places it looks like timestamps are 32-bit unsigned numbers, in some cases they are block numbers between 0 and 2015. I am uncertain how the unix-time timestamps could work when you have unsynchronized clocks across all nodes.

The block timestamps are all 32-bit UTC unix times that are "network-adjusted". That is, the client gets from its peers (in the version message) what time they think it is, and computes the offset from local time to each of the peer times. The median offset is thereafter used to convert local time to network time. This only synchronizes the clients roughly, but enough to get the difficulty calculation mostly right, which is all that timestamps are even used for. The way this works is that a block will be accepted by a node only if its timestamp is greater than the median timestamp of the preceding 11 blocks (so you can't skew the next retarget with a timestamp that is too far in the past) and if the timestamp is no further than 2 hours into the future (so you can't skew it with a timestamp that is too far ahead either). Retargets happen every 2016 blocks (about two weeks), which is where I guess your 0 and 2015 limits are coming from.
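Those two timestamp rules and the median-offset adjustment can be sketched in Python as follows. The function and constant names are assumptions for illustration, and statistics.median stands in for however the client actually picks its median.

```python
from statistics import median

MAX_FUTURE_SECONDS = 2 * 60 * 60  # "no further than 2 hours into the future"
MEDIAN_SPAN = 11                  # number of preceding blocks considered


def network_time(local_time, peer_times):
    """Shift local time by the median offset reported by peers."""
    if not peer_times:
        return local_time
    offsets = [t - local_time for t in peer_times]
    return local_time + int(median(offsets))


def timestamp_acceptable(block_time, previous_times, now_network):
    """Apply both checks: above the recent median, and not too far ahead."""
    recent = previous_times[-MEDIAN_SPAN:]
    if recent and block_time <= median(recent):
        return False  # not greater than the median of the preceding blocks
    return block_time <= now_network + MAX_FUTURE_SECONDS
```

Note that a block may carry a timestamp earlier than its parent's and still pass, as long as it stays above the median of the last 11.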

Quote from: etotheipi on July 08, 2011, 04:17:45 AM

If I want to reference a specific block, do I only provide the header hash and the node searches the headers for it? Or is it okay to say "Block #122,245" ?

You can only request specific blocks by hash, not by height on the block chain. Blockexplorer allows you to search by height, and every client with an accurate blockchain can determine what block is at a particular height, but there's no way in the protocol to ask a peer for a block by height.

Quote from: etotheipi on July 08, 2011, 04:17:45 AM

Thanks for your patience with my questions. I am anxious to start work on a new client, but the specification isn't clear enough for me.
-Eto

Hope that helped!

JoelKatz

Legendary

Offline

Activity: 1596
Merit: 1012

Democracy is vulnerable to a 51% attack.

Re: Tx Chain Clarification

July 08, 2011, 10:15:28 AM
Last edit: July 08, 2011, 02:28:39 PM by JoelKatz

#6

Quote from: etotheipi on July 08, 2011, 04:17:45 AM

I have been trying to figure out exactly what information exists in each transaction and each block. I understand the merkle trees and the chain of block headers, and how everything is linked through hashes. What I don't understand is what is actually contained in a TxIn and OutPoint object. Does each transaction reference a previous TxOut?

Every transaction pulls in coins from one or more places and squirts coins out to one or more places. The input can be by claiming an output from a previous transaction or from generation. The output can, at least as far as the chain is concerned, be almost anything.

Quote

Does that TxOut have to have enough BTC to cover the amount of the TxIn? If so, doesn't it get complicated to track partial balances in all previous TxOuts that eventually have to be accumulated if the user wants to empty those accounts?

You always claim the entire output. You can send the 'change' wherever you want in the outputs of this transaction.

Quote

(1) Are the transactions chained to one another such that you can simply follow the chain backwards to get all the information you need to know about a particular address, given the last transaction involving that address?

Don't bring addresses into this part of it. Addresses exist at another level in the protocol. As far as the hash chain is concerned, the input of a later transaction claims the output of a previous transaction. One way to claim an output, if it is the kind of output that pays to an address, is to hold the key corresponding to that address.

Quote

(2) Is the specification telling me that I will need to hold the blockchain if I want to construct a transaction? What is the minimum amount of information I would need to hold on my phone, for my lite phone-client to be able to construct a valid transaction? I was planning to keep only the block headers, balance of the account, and the private key. I don't want to store any block information. Will I have to keep previous transaction data?

If you don't keep the transactions, there is no way you can claim their outputs. You do not want to keep account balances, you want to keep claimable transaction outputs. Account balances are just a convenient shorthand and they actually don't correspond to anything specific in the blockchain (because an account can have many addresses).

Quote

(3) What information is contained in a transaction that prevents it from being repeated/re-broadcast by an attacker? Is it linked to a specific block? If that transaction is broadcast just as the next block is solved, does it have to be re-broadcast? I thought a transaction could be included in any future block, but then I don't know what would prevent someone you just paid from re-broadcasting and paying themselves again.

The transaction claims specific outputs from previous transactions or claims the generation and fees. Once a transaction's output is claimed, claiming it again is no longer a valid transaction. If an attacker re-broadcast the transaction, nodes would simply ignore it because its claims failed.
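The "claimed once, then gone" rule is easy to sketch in Python as a set of unspent outputs. The class and method names here are hypothetical, and signature and script checks are left out entirely; only the bookkeeping that makes a replay fail is shown.

```python
class UnspentOutputs:
    """Tracks which (tx_hash, index) outputs are still claimable."""

    def __init__(self):
        self._unspent = {}  # (tx_hash, index) -> value

    def add_outputs(self, tx_hash, values):
        for i, value in enumerate(values):
            self._unspent[(tx_hash, i)] = value

    def is_unspent(self, outpoint):
        return outpoint in self._unspent

    def apply(self, tx_hash, inputs, output_values):
        """Accept a transaction only if every input is still unspent.

        Returns True and updates the set on success; returns False and
        leaves the set untouched otherwise.
        """
        if len(set(inputs)) != len(inputs):
            return False  # same output listed twice
        if any(op not in self._unspent for op in inputs):
            return False  # already claimed, or never existed
        if sum(self._unspent[op] for op in inputs) < sum(output_values):
            return False  # outputs exceed inputs
        for op in inputs:
            del self._unspent[op]
        self.add_outputs(tx_hash, output_values)
        return True
```

Applying the same transaction a second time returns False, because the outputs it claims were removed the first time.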

Quote

And on a side note, I haven't quite figured out how block "timestamping" works. In some places it looks like timestamps are 32-bit unsigned numbers, in some cases they are block numbers between 0 and 2015. I am uncertain how the unix-time timestamps could work when you have unsynchronized clocks across all nodes. If I want to reference a specific block, do I only provide the header hash and the node searches the headers for it? Or is it okay to say "Block #122,245" ?

The risk with supplying a block number is that what was block 122,245 a minute ago may not be block 122,245 now. The block timestamps are placed at the originating node and are not trustworthy. If you need to refer to a specific block, use a hash.

I am an employee of Ripple. Follow me on Twitter @JoelKatz
1Joe1Katzci1rFcsr9HH7SLuHVnDy2aihZ BM-NBM3FRExVJSJJamV9ccgyWvQfratUHgN

Jan

Legendary

Offline

Activity: 1043
Merit: 1002

Re: Tx Chain Clarification

July 08, 2011, 05:10:15 PM

#7

Satoshi's paper is vague, and doesn't provide much detail. I consider it a concept paper. The bitcoin client is the real source of truth.

Mycelium lets you hold your private keys private.

etotheipi (OP)


Re: Tx Chain Clarification

July 10, 2011, 01:15:12 AM

#8

Thank you so much for these replies! This is exactly what I wanted to know. And now it makes more sense why clients can eventually "forget" ancient parts of the global transaction register, because once all the outputs are "used", that transaction no longer provides any value to the network.

Why would the block number change? Since every new block has the hash of the previous block, we have a pretty unambiguous, linear chain of blocks. Where do the timestamps come into this?


JoelKatz


Re: Tx Chain Clarification

July 10, 2011, 01:20:07 AM

#9

Quote from: etotheipi on July 10, 2011, 01:15:12 AM

Why would the block number change? Since every new block has the hash of the previous block, we have a pretty unambiguous, linear chain of blocks. Where do the timestamps come into this?

The block corresponding to a particular block number would change if the client receives a different block with that same block number and that later block wins out.
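A small Python sketch shows how the block at a given height can change. It assumes the simple "longest chain wins" rule discussed in this thread, with blocks stored as a hash-to-parent mapping; the names are illustrative, and a real client's selection rule may differ in detail.

```python
def best_chain(parent_of, tips):
    """Return the longest chain, genesis first, among the given tips.

    parent_of maps each block hash to its parent's hash (None for genesis).
    """
    def chain_to(tip):
        chain = []
        while tip is not None:
            chain.append(tip)
            tip = parent_of[tip]
        return chain[::-1]  # list position equals block height

    return max((chain_to(t) for t in tips), key=len)
```

If a competing branch grows longer, the hash found at a given index of the returned list changes, while each hash still names exactly one block.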


etotheipi (OP)


Re: Tx Chain Clarification

July 10, 2011, 01:27:38 AM

#10

So you're saying there is some ambiguity when the block is first included in the network? Perhaps that block becomes invalid and replaced with a different block?

But, just like tx confirmations, after a few "confirmations", that block height is going to become absolutely static. Did I miss anything?


JoelKatz


Re: Tx Chain Clarification

July 10, 2011, 01:32:03 AM

#11

Quote from: etotheipi on July 10, 2011, 01:27:38 AM

So you're saying there is some ambiguity when the block is first included in the network? Perhaps that block becomes invalid and replaced with a different block?

Correct.

Quote

But, just like tx confirmations, after a few "confirmations", that block height is going to become absolutely static. Did I miss anything?

Correct in practice. In theory, someone could present a longer chain that invalidates the last 400 blocks.


etotheipi (OP)


Re: Tx Chain Clarification

July 10, 2011, 01:42:54 AM

#12

I guess my point is, either that block is valid or not. If it's valid, it has both a static block number and a valid hash. If it's not, none of it matters.

The reason it concerns me is that it seems rather inefficient to have to search through hundreds of thousands of block headers just to find the right one. If that hash is valid, then so is its linear height from the genesis block, and you might as well include both. The client can check that block 125,329 in his own blockchain has <hash>, and if not, he will do a full search for it.

Now that I think about it, the block headers are probably stored in a tree-structure indexed by hash, in which case everything I just said is irrelevant... I think I'll go dig into the source code before attacking this thread again.
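For what it's worth, the idea of a hash-keyed index can be sketched in Python with a plain dict, which stands in for whatever structure the client really uses. HeaderIndex and its methods are hypothetical names; the point is only that lookup by hash needs no linear scan, and that the height can be stored alongside each header.

```python
class HeaderIndex:
    """Maps block hash -> (height, previous hash)."""

    def __init__(self):
        self._by_hash = {}

    def add(self, block_hash, prev_hash):
        """Record a header; its height is its parent's height plus one."""
        if prev_hash is None:
            height = 0  # genesis block
        else:
            height = self._by_hash[prev_hash][0] + 1
        self._by_hash[block_hash] = (height, prev_hash)
        return height

    def height_of(self, block_hash):
        """Return the stored height, or None for an unknown hash."""
        entry = self._by_hash.get(block_hash)
        return None if entry is None else entry[0]
```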

Thanks,
-Eto
