jratcliff63367 (OP)
Member
Offline
Activity: 82
Merit: 10
December 29, 2014, 10:03:41 PM Last edit: December 29, 2014, 10:19:20 PM by jratcliff63367
I'm working on my bitcoin parser and I'm running across some cases where I cannot decode the public key in an output script. Here is an example: see the coinbase transaction in block #199,975:
https://blockchain.info/tx/870f2daaf1e6bf44fd23c98b152f5f5b45beeb7066eb135840809cb528579e87
You will note that the output address that blockchain.info shows is: 1DgaASdtGgUavpNUE8ESBq3gmPbHh2ALnC
The hex address for this is: 0x00, 0x8b, 0x1d, 0x6a, 0x31, 0xb0, 0x19, 0xe2, 0xda, 0x16, 0xde, 0x77, 0xf6, 0x0c, 0x62, 0x3b, 0x14, 0x42, 0xd5, 0xec, 0x2e, 0x24, 0x3a, 0x00, 0x61
Normally I find the public key in the output script, but in this case I cannot figure out how/where blockchain.info came up with "1DgaASdtGgUavpNUE8ESBq3gmPbHh2ALnC" for the output script containing:
ChallengeScriptLength: 35 bytes long
21 - Push 0x21 bytes on the stack
02 14 71 c3 e2 c3 3f 1a 52 55 a5 14 cb f9 db d3 22 f8 2b 95 78 34 e5 90 ab 99 ff 00 26 30 eb b7 ee
ac - OP_CHECKSIG
Any help/advice on what I'm missing to extract the public key from this output script would be most appreciated. How does this stream of 33 bytes (02 14 71 c3 e2 c3 3f 1a 52 55 a5 14 cb f9 db d3 22 f8 2b 95 78 34 e5 90 ab 99 ff 00 26 30 eb b7 ee) turn into the address '1DgaASdtGgUavpNUE8ESBq3gmPbHh2ALnC'?
Thanks,
John

DannyHamilton
Legendary
Offline
Activity: 3514
Merit: 4894
December 29, 2014, 10:22:13 PM
Quote from: jratcliff63367
- snip - Normally I find the public key in the output script, but in this case I cannot figure out how/where blockchain.info came up with "1DgaASdtGgUavpNUE8ESBq3gmPbHh2ALnC" for the output script containing: "021471c3e2c33f1a5255a514cbf9dbd322f82b957834e590ab99ff002630ebb7ee OP_CHECKSIG" - snip -

This transaction appears to use the obsolete "pay-to-pubkey" script instead of the more commonly used "pay-to-pubkey-hash". Therefore, the value given in the script IS the public key. To see this public key represented as a bitcoin address, you need to follow the steps described here:
https://en.bitcoin.it/wiki/Technical_background_of_Bitcoin_addresses
Since you already have the public key, you are at the end of "step 1".
Step 2: Perform SHA-256 hashing on the public key
A579A1CEDA2894FDDB360E9D2941907835A290C72BFEC0443E1EFF64AAA61EAA
Step 3: Perform RIPEMD-160 hashing on the result of SHA-256
8B1D6A31B019E2DA16DE77F60C623B1442D5EC2E
Step 4: Add version byte in front of RIPEMD-160 hash (0x00 for Main Network)
008B1D6A31B019E2DA16DE77F60C623B1442D5EC2E
Step 5: Perform SHA-256 hash on the extended RIPEMD-160 result
F83469929812FE7835C4650BBC6E32DB092B545F0D546B597C8BFBA0C7C26BD8
Step 6: Perform SHA-256 hash on the result of the previous SHA-256 hash
243A00619F100F1A1D84D418198875C023F45290574D5BAC64FEA8B7897B1852
Step 7: Take the first 4 bytes of the second SHA-256 hash. This is the address checksum
243A0061
Step 8: Add the 4 checksum bytes from step 7 at the end of the extended RIPEMD-160 hash from step 4. This is the 25-byte binary Bitcoin Address
008B1D6A31B019E2DA16DE77F60C623B1442D5EC2E243A0061
Step 9: Convert the result from a byte string into a base58 string using Base58Check encoding. This is the most commonly used Bitcoin Address format
1DgaASdtGgUavpNUE8ESBq3gmPbHh2ALnC
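Step 9 is the only step above that needs no hash function, so it can be shown on its own. The following is a minimal sketch of the Base58 conversion in C++, not code from this thread: the function name base58Encode is made up for illustration, and it assumes the 25-byte binary address (version byte, hash, and checksum) has already been produced by steps 2 through 8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: converts a byte string to its Base58 text form.
// Leading zero bytes become leading '1' characters, as in Bitcoin addresses.
std::string base58Encode(const uint8_t *data, size_t length)
{
    static const char *kAlphabet =
        "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

    // Count leading zero bytes; each one maps to a single '1'.
    size_t zeros = 0;
    while (zeros < length && data[zeros] == 0) ++zeros;

    // Repeatedly multiply the running base-58 number by 256 and add the next byte.
    std::vector<uint8_t> digits; // base-58 digits, least significant first
    for (size_t i = zeros; i < length; ++i) {
        uint32_t carry = data[i];
        for (size_t j = 0; j < digits.size(); ++j) {
            carry += static_cast<uint32_t>(digits[j]) << 8;
            digits[j] = static_cast<uint8_t>(carry % 58);
            carry /= 58;
        }
        while (carry > 0) {
            digits.push_back(static_cast<uint8_t>(carry % 58));
            carry /= 58;
        }
    }

    // Emit the leading '1's, then the digits from most to least significant.
    std::string result(zeros, '1');
    for (auto it = digits.rbegin(); it != digits.rend(); ++it)
        result += kAlphabet[*it];
    return result;
}
```

Feeding it the 25 bytes from step 8 should reproduce the address shown in step 9.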

jratcliff63367 (OP)
December 29, 2014, 10:28:37 PM
Thanks so much for the detailed response. I looked a little bit further and the output is in the format of a 'compressed public key'. My parser already takes into account the 65 byte uncompressed public key which was only used for a little while in the early lifetime of the blockchain. These compressed public keys are rarely used as well because most of the time the 20 byte RIPEMD160 hash of the public key is what is usually stored in the output script.
http://bitcoin.stackexchange.com/questions/3059/what-is-a-compressed-bitcoin-key
Here is the snippet of code I currently use to convert a 65 byte uncompressed public key to a bitcoin address, which matches your explanation.
bool bitcoinPublicKeyToAddress(const uint8_t input[65], // The 65 byte long ECDSA public key; first byte will always be 0x04 followed by two 32 byte components
                               uint8_t output[25])      // A bitcoin address (in binary) is always 25 bytes long.
{
    bool ret = false;
    if ( input[0] == 0x04 )
    {
        uint8_t hash1[32];                                 // holds the intermediate SHA256 hash computations
        SHA256::computeSHA256(input,65,hash1);             // Compute the SHA256 hash of the input ECDSA public key
        output[0] = 0;                                     // Store a network byte of 0 (i.e. 'main' network)
        RIPEMD160::computeRIPEMD160(hash1,32,&output[1]);  // Compute the RIPEMD160 (20 byte) hash of the SHA256 hash
        SHA256::computeSHA256(output,21,hash1);            // Compute the SHA256 hash of the RIPEMD160 hash + the one byte header (for a checksum)
        SHA256::computeSHA256(hash1,32,hash1);             // Now compute the SHA256 hash of the previously computed SHA256 hash (for a checksum)
        output[21] = hash1[0];                             // Store the checksum in the last 4 bytes of the binary address
        output[22] = hash1[1];
        output[23] = hash1[2];
        output[24] = hash1[3];
        ret = true;
    }
    return ret;
}
What I'm wondering is how do I take the 33 byte 'compressed' form into account?

DannyHamilton
December 29, 2014, 10:43:30 PM
Quote from: jratcliff63367
What I'm wondering is how do I take the 33 byte 'compressed' form into account?

Note that in the transaction, there isn't a bitcoin address. The transaction was not "sent to a bitcoin address", it was sent to a compressed public key. If you still feel like you want to create a bitcoin address representation for that public key for some reason, then you simply hash the 33 byte public key, instead of hashing the 65 byte public key. Something like this:
bool bitcoinCompressedPublicKeyToAddress(const uint8_t input[33], // The 33 byte long ECDSA public key; first byte will always be either 0x02 or 0x03 followed by one 32 byte component
                                         uint8_t output[25])      // A bitcoin address (in binary) is always 25 bytes long.
{
    bool ret = false;
    if ( ( input[0] == 0x02 ) || ( input[0] == 0x03 ) )
    {
        uint8_t hash1[32];                                 // holds the intermediate SHA256 hash computations
        SHA256::computeSHA256(input,33,hash1);             // Compute the SHA256 hash of the input ECDSA public key
        output[0] = 0;                                     // Store a network byte of 0 (i.e. 'main' network)
        RIPEMD160::computeRIPEMD160(hash1,32,&output[1]);  // Compute the RIPEMD160 (20 byte) hash of the SHA256 hash
        SHA256::computeSHA256(output,21,hash1);            // Compute the SHA256 hash of the RIPEMD160 hash + the one byte header (for a checksum)
        SHA256::computeSHA256(hash1,32,hash1);             // Now compute the SHA256 hash of the previously computed SHA256 hash (for a checksum)
        output[21] = hash1[0];                             // Store the checksum in the last 4 bytes of the binary address
        output[22] = hash1[1];
        output[23] = hash1[2];
        output[24] = hash1[3];
        ret = true;
    }
    return ret;
}

jratcliff63367 (OP)
December 29, 2014, 10:47:07 PM
Ok, I got it to work. I just had to make a 'compressed' version of my routine. Here is the code:
bool bitcoinCompressedPublicKeyToAddress(const uint8_t input[33], // The 33 byte long compressed ECDSA public key; first byte will always be 0x02 or 0x03 followed by the 32 byte component
                                         uint8_t output[25])      // A bitcoin address (in binary) is always 25 bytes long.
{
    bool ret = false;
    if ( input[0] == 0x02 || input[0] == 0x03 )
    {
        uint8_t hash1[32];                                 // holds the intermediate SHA256 hash computations
        SHA256::computeSHA256(input,33,hash1);             // Compute the SHA256 hash of the input ECDSA public key
        output[0] = 0;                                     // Store a network byte of 0 (i.e. 'main' network)
        RIPEMD160::computeRIPEMD160(hash1,32,&output[1]);  // Compute the RIPEMD160 (20 byte) hash of the SHA256 hash
        SHA256::computeSHA256(output,21,hash1);            // Compute the SHA256 hash of the RIPEMD160 hash + the one byte header (for a checksum)
        SHA256::computeSHA256(hash1,32,hash1);             // Now compute the SHA256 hash of the previously computed SHA256 hash (for a checksum)
        output[21] = hash1[0];                             // Store the checksum in the last 4 bytes of the binary address
        output[22] = hash1[1];
        output[23] = hash1[2];
        output[24] = hash1[3];
        ret = true;
    }
    return ret;
}

DannyHamilton
December 29, 2014, 11:01:34 PM
Quote from: jratcliff63367
Ok, I got it to work. I just had to make a 'compressed' version of my routine. Here is the code:

Just for educational purposes, it could have all been done in the one routine with something like the following:
bool bitcoinPublicKeyToAddress(const uint8_t input[65], // The ECDSA public key (33 or 65 bytes); first byte indicates if the public key is compressed or not
                               uint8_t output[25])      // A bitcoin address (in binary) is always 25 bytes long.
{
    bool ret = false;
    uint8_t hash1[32]; // holds the intermediate SHA256 hash computations
    if ( input[0] == 0x04 ) // Uncompressed public key. First byte is 0x04 followed by two 32 byte components
    {
        SHA256::computeSHA256(input,65,hash1); // Compute the SHA256 hash of the input ECDSA public key
    }
    else if ( input[0] == 0x02 || input[0] == 0x03 ) // Compressed public key. First byte is 0x02 or 0x03 followed by the 32 byte component
    {
        SHA256::computeSHA256(input,33,hash1); // Compute the SHA256 hash of the input ECDSA public key
    }
    if ( input[0] == 0x02 || input[0] == 0x03 || input[0] == 0x04 )
    {
        output[0] = 0;                                     // Store a network byte of 0 (i.e. 'main' network)
        RIPEMD160::computeRIPEMD160(hash1,32,&output[1]);  // Compute the RIPEMD160 (20 byte) hash of the SHA256 hash
        SHA256::computeSHA256(output,21,hash1);            // Compute the SHA256 hash of the RIPEMD160 hash + the one byte header (for a checksum)
        SHA256::computeSHA256(hash1,32,hash1);             // Now compute the SHA256 hash of the previously computed SHA256 hash (for a checksum)
        output[21] = hash1[0];                             // Store the checksum in the last 4 bytes of the binary address
        output[22] = hash1[1];
        output[23] = hash1[2];
        output[24] = hash1[3];
        ret = true;
    }
    return ret;
}

jratcliff63367 (OP)
December 29, 2014, 11:46:44 PM Last edit: December 30, 2014, 12:00:50 AM by jratcliff63367
Danny, you were so helpful figuring out that compressed public key, maybe you can help me out with this one too. Take a look at this transaction:
https://blockchain.info/tx/195c96a25c4e63f641a2ddd920a9d00d2db947752f41c6be6d4cdd21e54f7ae8
It has two output scripts, the second one contains a RIPEMD160 hash as follows:
OP_HASH160 49206c6f76652077697a6b6964303537203c3300 OP_EQUAL
When I run "49206c6f76652077697a6b6964303537203c3300" through my code to produce the ASCII key, I get "17ffAqKkfMLF5QAYwFEDkRtF2mJMzrkmpi" as a result. However, blockchain.info says the ASCII address is "38Mg6NpCDFedAZrz4LtpB4FBBHb5WqwHaN". This looks suspect to me, to begin with, because it doesn't start with '1'. I tend to think that my representation is correct and blockchain.info has a bug. Can you confirm?
*EDIT* and here is one more:
https://blockchain.info/tx/d37e9d75ea61dd3f019626f077d74081bca0e80336ae9263cb362c094444c075
The third output script is only 60 bytes long, and once you remove the push data operator and the OP_CHECKSIG, there are only 58 bytes left. I'm used to getting uncompressed public keys of 65 bytes in length. What does a 58 byte long public key mean? How is it supposed to be interpreted?
Thanks,
John

DannyHamilton
December 29, 2014, 11:58:24 PM
Quote from: jratcliff63367
- snip - However, blockchain.info says the ASCII address is "38Mg6NpCDFedAZrz4LtpB4FBBHb5WqwHaN". This looks suspect to me, to begin with, because it doesn't start with '1'. I tend to think that my representation is correct and blockchain.info has a bug. Can you confirm? - snip -

Bitcoin addresses that start with a 1 are "pay to pubkey hash" addresses. The leading 1 comes from the version byte (0x00 on the main network), which tells the wallet software that it should create/recognize a script in the format of:
OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
Bitcoin addresses that start with a 3 are "pay to script hash" addresses. The leading 3 comes from the version byte (0x05 on the main network), which tells the wallet software that it should create/recognize a script in the format of:
OP_HASH160 <20-byte-hash-value> OP_EQUAL
So blockchain.info is not wrong here: your code prefixed the 20-byte hash with version byte 0x00, while a pay-to-script-hash output needs 0x05 in front of the hash before the checksum is computed.
The representation of the pay-to-script-hash address is described in BIP-0013:
base58-encode: [one-byte version][20-byte hash][4-byte checksum]
For more details on how to use pay-to-script-hash you can take a look at BIP-0016.
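The script templates discussed so far in this thread can be told apart by length and a few fixed bytes. Here is a rough sketch of such a check in C++; the names ScriptType, classifyOutputScript, and addressVersionByte are invented for this example, it only recognizes the three exact templates mentioned in the thread, and anything else is reported as unknown.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names for the output script templates mentioned in this thread.
enum class ScriptType { PayToPubKey, PayToPubKeyHash, PayToScriptHash, Unknown };

// Opcode values from the Bitcoin script language.
const uint8_t OP_DUP         = 0x76;
const uint8_t OP_EQUAL       = 0x87;
const uint8_t OP_EQUALVERIFY = 0x88;
const uint8_t OP_HASH160     = 0xa9;
const uint8_t OP_CHECKSIG    = 0xac;

ScriptType classifyOutputScript(const uint8_t *script, size_t length)
{
    // OP_DUP OP_HASH160 <push 20> <20 byte hash> OP_EQUALVERIFY OP_CHECKSIG
    if (length == 25 && script[0] == OP_DUP && script[1] == OP_HASH160 &&
        script[2] == 20 && script[23] == OP_EQUALVERIFY && script[24] == OP_CHECKSIG)
        return ScriptType::PayToPubKeyHash;

    // OP_HASH160 <push 20> <20 byte hash> OP_EQUAL
    if (length == 23 && script[0] == OP_HASH160 && script[1] == 20 &&
        script[22] == OP_EQUAL)
        return ScriptType::PayToScriptHash;

    // <push 33 or 65> <public key> OP_CHECKSIG
    if (((length == 35 && script[0] == 33) || (length == 67 && script[0] == 65)) &&
        script[length - 1] == OP_CHECKSIG)
        return ScriptType::PayToPubKey;

    return ScriptType::Unknown;
}

// Main-network version byte to put in front of the 20 byte hash before the
// checksum is computed. Returns -1 when no address form applies.
int addressVersionByte(ScriptType type)
{
    switch (type) {
    case ScriptType::PayToPubKey:      // shown by block explorers as a '1' address
    case ScriptType::PayToPubKeyHash:  return 0x00;
    case ScriptType::PayToScriptHash:  return 0x05;
    default:                           return -1;
    }
}
```

With this split, the address routine only needs the 20-byte hash and the version byte, whichever template the hash came from.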

jratcliff63367 (OP)
December 30, 2014, 12:11:40 AM
I guess I have a similar question. I'm encountering a number of output scripts which have non-standard lengths for the public key.
Here is one where the public key is 37 bytes long:
https://blockchain.info/tx/49c22f63beb61811ab87b6bfd0b04335e5684bea89d2760213e1f650853cb6f1
The data for the key is:
4d 65 73 73 61 67 65 3a 20 68 74 74 70 3a 2f 2f 69 2e 69 6d 67 75 72 2e 63 6f 6d 2f 73 5a 38 64 30 2e 6a 70 67
The first byte is 0x4d.
Here is one where the public key is 58 bytes long:
https://blockchain.info/tx/d37e9d75ea61dd3f019626f077d74081bca0e80336ae9263cb362c094444c075
The data for the key is:
4d 65 73 73 61 67 65 3a 20 42 69 74 63 6f 69 6e 50 61 72 61 2e 64 65 20 44 69 76 69 64 65 6e 64 65 6e 7a 61 68 6c 75 6e 67 20 31 20 76 6f 6d 20 30 37 2e 30 39 2e 32 30 31 32
The first byte is 0x4d as well.
I'm not used to seeing public keys in the output scripts of such odd sizes. Why do they appear, and what is the proper rule to convert them into a bitcoin address? Is there a place that documents all of the ways these signatures can be encoded (and converted to RIPEMD160 format) from a purely hex-dump perspective?
Thanks,
John

DannyHamilton
December 30, 2014, 12:35:34 AM
Quote from: jratcliff63367
- snip - I'm not used to seeing public keys in the output scripts of such odd sizes. Why do they appear, and what is the proper rule to convert them into a bitcoin address? - snip -

According to this post by DeathAndTaxes:
https://bitcointalk.org/index.php?topic=675321.0
there are 17112 unspent outputs that are permanently unspendable due to having an invalid public key. I suspect that these are two of those. If you want to create a bitcoin address representation of an invalid public key that matches the representation that blockchain.info has decided to use, you can probably just calculate the address by hashing the invalid key. As such, your bitcoinPublicKeyToAddress routine could be made more generic by counting up the number of bytes in the public key and then hashing that many bytes, regardless of whether there are 33 bytes, 65 bytes, or some other value entirely.
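Before hashing, a parser may still want to record whether the pushed data even looks like a public key. A small sketch of that check in C++ follows; PubKeyFormat and classifyPublicKey are hypothetical names, and the check only looks at length and prefix byte. It does not verify that the bytes describe a point on the curve, so "Compressed" or "Uncompressed" here means "plausibly encoded", not "valid".

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical classification of data pushed in front of OP_CHECKSIG.
enum class PubKeyFormat { Compressed, Uncompressed, Invalid };

PubKeyFormat classifyPublicKey(const uint8_t *key, size_t length)
{
    // Compressed key: 0x02 or 0x03 followed by one 32 byte component.
    if (length == 33 && (key[0] == 0x02 || key[0] == 0x03))
        return PubKeyFormat::Compressed;

    // Uncompressed key: 0x04 followed by two 32 byte components.
    if (length == 65 && key[0] == 0x04)
        return PubKeyFormat::Uncompressed;

    // Anything else (such as the 37 and 58 byte blobs starting with 0x4d)
    // cannot be a correctly encoded public key.
    return PubKeyFormat::Invalid;
}
```

A generic address routine could hash the data either way and use this result only to flag outputs that are likely unspendable.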

jratcliff63367 (OP)
December 30, 2014, 12:55:23 AM
So, as I'm researching all of the output scripts that I cannot successfully decode the bitcoin address for, I realize I may be going about this the wrong way. I looked for common patterns in the scripts, which 99.999% of all output scripts normally conform to. In the early days of bitcoin people used the full 65 byte public key, then later switched to the 20 byte RIPEMD160 format nearly everywhere. However, there are a number of other formats, including things like multi-sig, script hashes, and others, which create a lot of permutations.
I guess what I'm looking for is the following: I want a routine which can accept a transaction output script as raw binary data and provide as output the public bitcoin addresses (*exactly* what would show up in blockchain.info as the output address). Something like:
struct BitcoinAddress
{
    uint8_t key[25]; // the 25 byte binary bitcoin address
};
uint32_t findOutputAddresses(const uint8_t *scriptData, // The raw output script data
                             uint32_t scriptLength,     // The length of the output script.
                             BitcoinAddress *keys,      // The array of keys to return
                             uint32_t maxOutputKeys);   // The maximum number of output keys allowed
If someone could implement this in C++ with absolutely *no* external dependencies on any other code of any kind (except stdint.h) that would be ideal. I have a routine already that does this, however that routine clearly doesn't take every possible permutation into account, and I gather that acquiring the magic knowledge of how to parse the output scripts in every flavor is time consuming.
If anyone thinks they can write such a routine and has an interest, I would gladly pay a reasonable bitcoin bounty for the code snippet. It could also be very educational. Again, the requirements are that it accepts the raw binary blob of an output script and returns one or more bitcoin addresses which, when printed in BASE58 ASCII, exactly match what would show up in blockchain.info. It must be a single source file, C++, and have zero external dependencies on anything other than <stdint.h>.
Anyone up for that challenge? Up until now the number of keys I failed to parse was statistically insignificant, but that is no longer the case due to recent changes in how people are forming their transactions. I want to finalize my 'end of the year' statistics on the blockchain but, to do so, I cannot let any more public keys slip by unaccounted for.
Thanks,
John

DannyHamilton
December 30, 2014, 01:44:52 AM
- snip - Anyone up for that challenge? - snip -
It's an interesting sounding project, but I don't have time for it this week. Hopefully you'll find someone else to help you out. You may want to create a thread in "Marketplace" or "Services" requesting a programmer. I currently have a post there offering to pay a perl programmer to write a quick little script for me.

hhanh00
December 30, 2014, 04:42:39 AM
I don't see why you want to decode these bogus pub keys. They are obviously invalid. It's just someone who wanted to put a text message into the blockchain. OP_RETURN is the right way to do it. The client can drop them from the UTXO set.
4d6573736167653a20687474703a2f2f692e696d6775722e636f6d2f735a3864302e6a7067 in ascii is "Message: http://i.imgur.com/sZ8d0.jpg", and the 58 byte one is "Message: BitcoinPara.de Dividendenzahlung 1 vom 07.09.2012".
PS: the clue is that they begin with the same bytes and 65 is 'e', 61 is 'a' in ascii.
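That clue can be automated: if every byte of the pushed data is printable ASCII, the "public key" is almost certainly a text message. A minimal sketch in C++, with asPrintableAscii as a made-up helper name and the printable range taken as 0x20 through 0x7e:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: returns the data as text if every byte is printable
// ASCII (0x20..0x7e), otherwise returns an empty string.
std::string asPrintableAscii(const uint8_t *data, size_t length)
{
    std::string text;
    for (size_t i = 0; i < length; ++i) {
        if (data[i] < 0x20 || data[i] > 0x7e)
            return std::string(); // binary data, not a text message
        text += static_cast<char>(data[i]);
    }
    return text;
}
```

Real public keys contain effectively random bytes, so they will almost never pass this test.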

jratcliff63367 (OP)
December 30, 2014, 06:35:14 AM
Whether they are 'bogus' or not, my goal is to replicate what blockchain.info reports for these. For most of these scripts blockchain.info does report some valid 'address'. Essentially I use 'blockchain.info' to debug my own code, so matching their 'interpretation' of the output script signatures is really what I'm trying to do.
Maybe someone who works at blockchain.info can explain how they interpret all of these rather bizarrely formed script signatures?
This also reminds me why I'm not that thrilled with the whole scripting mechanism of bitcoin. Some clear standards on format and layout of input and output data would make things a lot cleaner. Now, with 20/20 hindsight, I wish that bitcoin was completely and utterly hard coded, no scripts at all, and any 'experimental' programmable crap was reserved for side chains.

hhanh00
December 30, 2014, 07:32:53 AM
Then do what Danny told you.
Quote from: jratcliff63367
Essentially I use 'blockchain.info' to debug my own code, so matching their 'interpretation' of the output script signatures is really what I'm trying to do.

I wouldn't use a buggy implementation as a reference for my own code but it's your call.

amaclin
Legendary
Offline
Activity: 1260
Merit: 1019
December 30, 2014, 08:04:10 AM
Quote from: hhanh00
I don't see why you want to decode these bogus pub keys. They are obviously invalid. It's just someone who wanted to put a text message into the blockchain. OP_RETURN is the right way to do it. The client can drop them from the UTXO set.

Inserting arbitrary data with OP_RETURN is the right way, of course. But this was standardized only in 2014. There were many other ways in the past. BTW, clients can also treat p2pk outputs with invalid public keys as provably unspendable.

jratcliff63367 (OP)
December 30, 2014, 07:57:46 PM
Just a follow-up: I found out that the bulk of my issues were simply properly formed multi-sig addresses, which my parser had not been updated to take into account. I'm now making sure I can fully account for multi-sig addresses. I'm sure other people have noticed this, but as of a little over a month ago there has been an explosion in the use of multi-sig addresses and a shit-ton of bitcoins are moving to them.
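For readers updating a parser in the same way, here is a rough C++ sketch of pulling the public keys out of a bare multi-sig output script. It rests on assumptions that the post does not confirm: that the script has the form OP_m <key 1> ... <key n> OP_n OP_CHECKMULTISIG with every key pushed by a single length byte of 33 or 65, and the names MultisigInfo and parseBareMultisig are invented for illustration. Multi-sig outputs wrapped in pay-to-script-hash carry only a 20-byte hash in the output script, so this routine does not apply to them.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for a bare multi-sig output script.
struct MultisigInfo {
    int required = 0;                       // m: signatures needed
    int total = 0;                          // n: public keys listed
    std::vector<std::vector<uint8_t>> keys; // raw key bytes as pushed
};

// Parses OP_m <key 1> ... <key n> OP_n OP_CHECKMULTISIG.
// Returns false (leaving 'out' untouched) for any other script.
bool parseBareMultisig(const uint8_t *script, size_t length, MultisigInfo &out)
{
    const uint8_t OP_1 = 0x51, OP_16 = 0x60, OP_CHECKMULTISIG = 0xae;

    if (length < 4 || script[0] < OP_1 || script[0] > OP_16) return false;
    if (script[length - 1] != OP_CHECKMULTISIG) return false;

    MultisigInfo info;
    info.required = script[0] - OP_1 + 1;

    // Walk the key pushes; the last two bytes are OP_n and OP_CHECKMULTISIG.
    size_t pos = 1;
    while (pos < length - 2) {
        size_t keyLength = script[pos];
        if (keyLength != 33 && keyLength != 65) return false;
        if (pos + 1 + keyLength > length - 2) return false;
        info.keys.emplace_back(script + pos + 1, script + pos + 1 + keyLength);
        pos += 1 + keyLength;
    }

    uint8_t countOp = script[length - 2];
    if (countOp < OP_1 || countOp > OP_16) return false;
    info.total = countOp - OP_1 + 1;

    // The declared key count must match what was actually pushed.
    if (info.total != static_cast<int>(info.keys.size())) return false;
    if (info.required > info.total) return false;

    out = info;
    return true;
}
```

Each extracted key can then be run through the same hash-and-encode routine used for single public keys, giving one address per key.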