Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: arch_stanton on September 21, 2017, 07:57:40 AM



Title: Hashing coinbase transaction to get txid
Post by: arch_stanton on September 21, 2017, 07:57:40 AM
I'm not able to reproduce the txid of a coinbase transaction by hashing it. For any other transaction, the double SHA-256 hash gives me the txid. Here's what I'm trying:
Code:
import hashlib

txid = 'd3104f6e12f47b9fef672820ed8721670b087aaa801591b43fa2210bf3887649'
txRaw = "0100000001c4efdc5025c9816bd6cc098b205ec7a5e5d91b398969e5350d4ce528310ea7a4000000006b483045022100a33944bd7354dd464b605d30f7f94f728f41c3dc58d434d918b4685e645b183e02200d37ec139c4d0de363ddd66bdb29ff19aac538644440dee950d03b935f1d3900012103eb38b8ea461b42ec464f738e890ab0c9ca909ef2e7df9599e5d939cb441e5390feffffff02b0ac1600000000001976a91492b00d72d4ef77d5710e71c415be831900e8739488acd0480a00000000001976a9149cadc280f1873709b80d005764e7a8741ee8d94788ac796b0700"
print('**')
print('Raw transaction:')
print(txRaw)
print('**')
data = bytes.fromhex(txRaw)
# Double SHA-256 of the raw bytes; the txid is this digest in reversed byte order
hash = hashlib.sha256(hashlib.sha256(data).digest()).digest()
print('txid:')
print(txid)
print('Hashed Raw tx:')
print(hash[::-1].hex())

Output when using a normal transaction:
Code:
**
Raw transaction:
0100000001c4efdc5025c9816bd6cc098b205ec7a5e5d91b398969e5350d4ce528310ea7a4000000006b483045022100a33944bd7354dd464b605d30f7f94f728f41c3dc58d434d918b4685e645b183e02200d37ec139c4d0de363ddd66bdb29ff19aac538644440dee950d03b935f1d3900012103eb38b8ea461b42ec464f738e890ab0c9ca909ef2e7df9599e5d939cb441e5390feffffff02b0ac1600000000001976a91492b00d72d4ef77d5710e71c415be831900e8739488acd0480a00000000001976a9149cadc280f1873709b80d005764e7a8741ee8d94788ac796b0700
**
txid:
d3104f6e12f47b9fef672820ed8721670b087aaa801591b43fa2210bf3887649
Hashed Raw tx:
d3104f6e12f47b9fef672820ed8721670b087aaa801591b43fa2210bf3887649

Output when using a coinbase transaction:
Code:
**
Raw transaction:
010000000001010000000000000000000000000000000000000000000000000000000000000000ffffffff3103816b07244d696e656420627920416e74506f6f6c6b2f4542312f4144362f4e59412f332059c36d7be1550000df320000ffffffff0238252e4d000000001976a914660371326d3a2e064c278b20107a65dad847e8a988ac0000000000000000266a24aa21a9edc11e8cdbd8d442b27bf8f273395baa83b5da4c9c3d87fbc539dad742480437100120000000000000000000000000000000000000000000000000000000000000000000000000
**
txid:
d0783f480343fb37b009b5e3db90ccad85e2c8314639de98b3bb66c396dd7915
Hashed Raw tx:
4495cc1d511e61de206cd5b18998eeab11932d42c6b1d264c400a20eefc97e99

I haven't been able to find any documentation about this. Does anybody know why this is?


Title: Re: Hashing coinbase transaction to get txid
Post by: DannyHamilton on September 21, 2017, 07:51:46 PM
You appear to have a bad copy of the raw transaction.

When I look at the raw transaction for that txid in the blockchain (coinbase transaction for block height 486,273, block hash 00000000000000000083cbfd33b63c2ac10e703266c5749bf3ce2fbff88f5791), I get the following:

Code:
01000000 01000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00ffffff ff310381 6b07244d
696e6564 20627920 416e7450 6f6f6c6b
2f454231 2f414436 2f4e5941 2f332059
c36d7be1 550000df 320000ff ffffff02
38252e4d 00000000 1976a914 66037132
6d3a2e06 4c278b20 107a65da d847e8a9
88ac0000 00000000 0000266a 24aa21a9
edc11e8c dbd8d442 b27bf8f2 73395baa
83b5da4c 9c3d87fb c539dad7 42480437
10000000 00

When I calculate hashlib.sha256(hashlib.sha256(data).digest()).digest() on that data, I get the correct result.

You seem to be working with the following:
Code:
01000000 00010100 00000000 00000000
00000000 00000000 00000000 00000000
00000000 000000ff ffffff31 03816b07
244d696e 65642062 7920416e 74506f6f
6c6b2f45 42312f41 44362f4e 59412f33
2059c36d 7be15500 00df3200 00ffffff
ff023825 2e4d0000 00001976 a9146603
71326d3a 2e064c27 8b20107a 65dad847
e8a988ac 00000000 00000000 266a24aa
21a9edc1 1e8cdbd8 d442b27b f8f27339
5baa83b5 da4c9c3d 87fbc539 dad74248
04371001 20000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00

Notice the extra "0120" near the end, followed by 36 bytes of "00" where the correct copy has only 4 (so 32 of them are extra)?
Also notice that you have an extra byte of "00" between the 4-byte version (the 01000000) at the beginning, and the "01" representing the number of inputs?
Then notice that you have an extra "01" byte immediately following the in-counter?
These extra bytes are resulting in an invalid hash calculation.


Title: Re: Hashing coinbase transaction to get txid
Post by: arch_stanton on September 22, 2017, 05:15:47 AM
Thanks. That's really strange. I'm getting the raw transaction from https://blockchain.info/tx/d0783f480343fb37b009b5e3db90ccad85e2c8314639de98b3bb66c396dd7915?format=hex . I haven't tried getting the data from my node, since I'm currently running tests in regtest mode. By the way, I'm having this problem in regtest as well. I'll look more into this when I get home from work.


Title: Re: Hashing coinbase transaction to get txid
Post by: DannyHamilton on September 22, 2017, 03:06:43 PM
- snip -
I'm getting the raw transaction from https://blockchain.info/...
- snip -

Blockchain.info has a reputation for having issues. I wouldn't recommend ever using them as a source for anything significant.


Title: Re: Hashing coinbase transaction to get txid
Post by: arch_stanton on September 22, 2017, 04:47:53 PM
My node is still a couple of days behind on syncing, so I can't get the transaction from it yet. But I tried getting the raw transaction from electrum wallet and chainquery.com, and they all have the long tail starting with 12... Where did you get your raw transaction?


Title: Re: Hashing coinbase transaction to get txid
Post by: DannyHamilton on September 22, 2017, 04:54:43 PM
Quote from: arch_stanton
I tried getting the raw transaction from electrum wallet . . . and they all have the long tail starting with 12

That's surprising. Might want to report that. Sounds like an Electrum bug, and they are usually pretty good about fixing their bugs.

Quote from: arch_stanton
Where did you get your raw transaction?

https://blockexplorer.com/api/rawtx/d0783f480343fb37b009b5e3db90ccad85e2c8314639de98b3bb66c396dd7915


Title: Re: Hashing coinbase transaction to get txid
Post by: achow101 on September 22, 2017, 04:56:19 PM
Quote from: arch_stanton
But I tried getting the raw transaction from electrum wallet and chainquery.com, and they all have the long tail starting with 12...

They seem to be giving you the transaction in the extended serialization format (witness serialization when there are witnesses). This is actually incorrect and in violation of the segwit specification, which states that the witness serialization for a non-witness transaction (as the coinbase transaction is) is the legacy non-extended (no witnesses) serialization format.

The extra 0001 between the version number and number of inputs is the marker and flag bytes for extended serialization. The 012000... at the end of the transaction is an incorrect and invalid witness (it means that there is a witness of 1 stack item which is 32 bytes in length and all 0's. During verification, this would be pushed to the stack, but it is incorrect to have that there at all).

Edit: That was wrong, explained in a post below.
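
The marker and flag bytes described above give a quick way to tell which serialization a raw transaction uses. A minimal sketch, assuming hex-encoded input; the function name uses_extended_serialization is made up for illustration, and the two test strings are just the first nine bytes of the two transactions from the first post:

```python
def uses_extended_serialization(raw_hex):
    """Return True if the hex-encoded transaction carries the segwit marker and flag."""
    raw = bytes.fromhex(raw_hex)
    # Bytes 0-3 are the version. In the legacy format byte 4 is the input
    # count, which is never zero for a valid transaction, so a 0x00 there can
    # only be the marker. The flag byte that follows is currently 0x01.
    return len(raw) > 5 and raw[4] == 0x00 and raw[5] == 0x01


# Prefix of the normal transaction from the first post
print(uses_extended_serialization("0100000001c4efdc50"))  # False
# Prefix of the coinbase transaction from the first post
print(uses_extended_serialization("010000000001010000"))  # True
```

Only the six leading bytes are inspected, so this says nothing about whether the rest of the transaction is well formed.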


Title: Re: Hashing coinbase transaction to get txid
Post by: DannyHamilton on September 22, 2017, 05:03:55 PM
Quote from: achow101
They seem to be giving you the transaction in the extended serialization format (witness serialization when there are witnesses). This is actually incorrect and in violation of the segwit specification, which states that the witness serialization for a non-witness transaction (as the coinbase transaction is) is the legacy non-extended (no witnesses) serialization format.

Achow101, thanks for explaining what's happening there. I've been too busy, and haven't had a chance to learn and understand the SegWit changes yet. It's clear that I'm going to need to start absorbing some of that material or my knowledge will become stale and useless around here.


Title: Re: Hashing coinbase transaction to get txid
Post by: arch_stanton on September 22, 2017, 05:43:54 PM
Ok, great. But my regtest setup (version 14.1) gives me a similar raw coinbase transaction:
Code:
020000000001010000000000000000000000000000000000000000000000000000000000000000ffffffff050282060101ffffffff020e642500000000002321031a70a95a57f63b7d63e46803d385fd483b77f7a71a7a753ffcf7aafb67ec3edcac0000000000000000266a24aa21a9edad9d9c439414e2d257fda34cdc00667c0ecd7d78256721a4178046b49ee7382f0120000000000000000000000000000000000000000000000000000000000000000000000000
Is the implementation error in the node? Maybe working in 15.0?


Title: Re: Hashing coinbase transaction to get txid
Post by: link2yasar on September 22, 2017, 06:25:01 PM
Quote from: arch_stanton
My node is still a couple of days behind on syncing, so I can't get the transaction from it yet. But I tried getting the raw transaction from electrum wallet and chainquery.com, and they all have the long tail starting with 12... Where did you get your raw transaction?

Thank you very much. I am a Python developer and I will take a look into it. Thanks


Title: Re: Hashing coinbase transaction to get txid
Post by: achow101 on September 22, 2017, 07:05:49 PM
Quote from: arch_stanton
Ok, great. But my regtest setup (version 14.1) gives me a similar raw coinbase transaction:
Code:
020000000001010000000000000000000000000000000000000000000000000000000000000000ffffffff050282060101ffffffff020e642500000000002321031a70a95a57f63b7d63e46803d385fd483b77f7a71a7a753ffcf7aafb67ec3edcac0000000000000000266a24aa21a9edad9d9c439414e2d257fda34cdc00667c0ecd7d78256721a4178046b49ee7382f0120000000000000000000000000000000000000000000000000000000000000000000000000
Is the implementation error in the node? Maybe working in 15.0?

Hmm. That's interesting. I can reproduce that. I will investigate and see what's up with that.

Regardless, the txid is calculated from the non-witness serialization of a transaction, so you should strip out the witness parts of this and hash the remaining. That will get you the txid.



Ok, so apparently that is actually expected behavior. From BIP 141:
Quote
and the coinbase's input's witness must consist of a single 32-byte array for the witness reserved value.
This is done to allow for future extensibility.
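
To make the BIP 141 quote concrete, here is a small sketch that decodes the tail of the coinbase transactions posted in this thread. The tail is rebuilt by hand from the bytes discussed above (0120, then 32 zero bytes, then the 4-byte lock time); the variable names are only for illustration:

```python
# Tail of the coinbase transaction, starting right after the last output:
# witness for the single input, followed by the 4-byte lock time.
tail = bytes.fromhex("0120" + "00" * 32 + "00000000")

item_count = tail[0]                   # number of witness stack items
item_len = tail[1]                     # length of the single item (0x20)
reserved = tail[2:2 + item_len]        # the witness reserved value
locktime = int.from_bytes(tail[2 + item_len:], "little")

print(item_count)      # 1
print(item_len)        # 32
print(reserved.hex())  # 64 hex zeros
print(locktime)        # 0
```

This also accounts for the 36 zero bytes noticed earlier in the thread: 32 belong to the witness reserved value and the last 4 are the lock time, which is present in both serializations.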


Title: Re: Hashing coinbase transaction to get txid
Post by: DannyHamilton on September 22, 2017, 07:20:24 PM
- snip -
the txid is calculated from the non-witness serialization of a transaction, so you should strip out the witness parts of this and hash the remaining. That will get you the txid.
- snip -

This is the key information to take away from this thread.  Achow101 correct me if I'm mistaken, but I believe that is true regardless of whether it is a legacy transaction or a SegWit transaction.  This thread has taught me how to identify a SegWit transaction and how to identify the witness portion of the SegWit transaction. But regarding the topic of the thread (calculating the txid), the key is to remove the witness information before hashing.
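
The "strip the witness, then hash" step can be sketched as follows. This is a rough illustration rather than a validated parser: the function names read_varint, strip_witness and txid_from_raw are made up here, and the code assumes a well-formed transaction in which everything between the last output and the final 4 lock-time bytes is witness data.

```python
import hashlib


def read_varint(raw, pos):
    """Decode a CompactSize integer at raw[pos]; return (value, next position)."""
    first = raw[pos]
    if first < 0xfd:
        return first, pos + 1
    size = {0xfd: 2, 0xfe: 4, 0xff: 8}[first]
    return int.from_bytes(raw[pos + 1:pos + 1 + size], "little"), pos + 1 + size


def strip_witness(raw):
    """Return the legacy (non-witness) serialization of a raw transaction."""
    if raw[4:6] != b"\x00\x01":
        return raw  # no marker/flag, so this is already the legacy format
    start = pos = 6  # skip the version (4 bytes) plus marker and flag
    n_inputs, pos = read_varint(raw, pos)
    for _ in range(n_inputs):
        pos += 36  # previous txid (32 bytes) + output index (4 bytes)
        script_len, pos = read_varint(raw, pos)
        pos += script_len + 4  # scriptSig + sequence number
    n_outputs, pos = read_varint(raw, pos)
    for _ in range(n_outputs):
        pos += 8  # output value
        script_len, pos = read_varint(raw, pos)
        pos += script_len  # scriptPubKey
    # Assumption: the bytes from pos up to the last 4 are the witness data.
    return raw[:4] + raw[start:pos] + raw[-4:]


def txid_from_raw(raw_hex):
    """Double SHA-256 of the legacy serialization, shown in reversed byte order."""
    legacy = strip_witness(bytes.fromhex(raw_hex))
    digest = hashlib.sha256(hashlib.sha256(legacy).digest()).digest()
    return digest[::-1].hex()
```

A legacy transaction passes through strip_witness unchanged, so the same txid_from_raw call should work for both kinds of input. Hashing the full extended serialization instead gives a different value, which is what the first post ran into.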


Title: Re: Hashing coinbase transaction to get txid
Post by: arch_stanton on September 23, 2017, 06:25:16 AM
Ok, this is great info. But why is the segwit data only attached to coinbase transactions? That is the case for all the random transactions I've checked.
Edit: Sorry, read the above post again and got it this time. Thanks for your answers.