Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: riplin on July 26, 2013, 06:45:54 PM



Title: Best way to extract addresses from sigScript and PkScript?
Post by: riplin on July 26, 2013, 06:45:54 PM
Hi everyone,

Is there a best practice method of extracting addresses from the sigScript and PkScript?

For example, is it safe to assume that data pushes of 20 bytes are addresses and 33 / 65 (if starting with 0x02 / 0x03 for 33 and 0x04 for 65) are public keys?

I know that there are transactions that don't have addresses, like this one: https://blockchain.info/tx/a4bfa8ab6435ae5f25dae9d89e4eb67dfa94283ca751f393c1ddc5a837bbc31b

But are there transactions that have more than one? Do I have to run the scripts in order to find out which one is actually used?

I also read somewhere that it's possible to get the public key from the signature or is that not the case?


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: jl2012 on July 26, 2013, 06:48:18 PM
Certainly it's not safe to assume that. Why don't you just use a regular expression to extract the address?


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: riplin on July 26, 2013, 06:54:08 PM
Certainly it's not safe to assume that. Why don't you just use a regular expression to extract the address?

What would be the criteria for a regular expression then? There are 20 bytes in an address, all of them random. :)



Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: jl2012 on July 26, 2013, 06:59:19 PM
Certainly it's not safe to assume that. Why don't you just use a regular expression to extract the address?

What would be the criteria for a regular expression then? There are 20 bytes in an address, all of them random. :)



In perl regex

^76a914(.{40})88ac$
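
For illustration, here is a minimal Python sketch of the same idea: match the hex-encoded script against the pay-to-pubkey-hash pattern and turn the captured 20-byte hash into a mainnet address. The function names `base58check` and `p2pkh_address` are made up for this example, and the pattern is tightened to hex digits, which is an assumption about the input being a lowercase or uppercase hex string.

```python
import hashlib
import re

# Pattern from the post above, restricted to hex digits (assumption:
# the script is supplied as a hex string).
P2PKH_HEX = re.compile(r"^76a914([0-9a-f]{40})88ac$")
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: bytes, payload: bytes) -> str:
    """Encode version + payload with a 4-byte double-SHA256 checksum."""
    data = version + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Each leading zero byte is written as a literal '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + encoded


def p2pkh_address(script_hex: str):
    """Return the mainnet address for a P2PKH script, or None if no match."""
    match = P2PKH_HEX.match(script_hex.lower())
    if match is None:
        return None
    # Version byte 0x00 is the mainnet pay-to-pubkey-hash prefix.
    return base58check(b"\x00", bytes.fromhex(match.group(1)))
```

Anything that does not match the pattern (P2SH, bare public keys, non-standard scripts) simply returns None here, so this covers only the one template.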


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: riplin on July 26, 2013, 07:04:41 PM

In perl regex

^76a914(.{40})88ac$

Great, now there's another thing I have to figure out. Thanks though. :)


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: kjj on July 26, 2013, 07:05:54 PM
If you aren't doing a full scripting engine, you should make templates and search them until you find a hit.

jl2012 gives an example of the "standard" transaction template.

Also, see https://bitcointalk.org/index.php?topic=126865.msg1348297#msg1348297
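
A rough sketch of what "make templates and search them" could look like on raw script bytes, in Python. The `TEMPLATES` table and `match_template` are hypothetical names for this example, and only two templates are listed; a real list would be longer.

```python
# Each template: (name, prefix bytes, payload length, suffix bytes).
TEMPLATES = [
    # OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    ("p2pkh", bytes.fromhex("76a914"), 20, bytes.fromhex("88ac")),
    # OP_HASH160 <20 bytes> OP_EQUAL
    ("p2sh", bytes.fromhex("a914"), 20, bytes.fromhex("87")),
]


def match_template(script: bytes):
    """Return (template name, payload) for the first template that fits, else None."""
    for name, prefix, size, suffix in TEMPLATES:
        if (len(script) == len(prefix) + size + len(suffix)
                and script.startswith(prefix)
                and script.endswith(suffix)):
            return name, script[len(prefix):len(prefix) + size]
    return None
```

The exact-length check matters: without it, a longer non-standard script that merely starts and ends with the right bytes would be reported as a hit.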


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: Peter Todd on July 26, 2013, 07:06:20 PM
I have a pull-req that might help you: https://github.com/bitcoin/bitcoin/pull/2830

Basically it adds a "decodescript" RPC call that takes a script and decodes it into human-readable form. That includes the opcodes as well as the address itself, either pay-to-pubkey-hash (standard addresses) or P2SH (or non-standard if the script doesn't match a standard address form). It also lets you calculate the P2SH address that would correspond to that script; if you don't understand what I mean by that statement, read up on P2SH.
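
To give an idea of what decoding a script into human-readable form involves, here is a small Python disassembler sketch. It is not the RPC's implementation and does not reproduce its output format; `disassemble` and the short `OPCODE_NAMES` table are assumptions for this example and cover only a handful of opcodes.

```python
# Small subset of opcode names; anything else is printed as OP_UNKNOWN_0x..
OPCODE_NAMES = {
    0x00: "OP_0",
    0x6A: "OP_RETURN",
    0x76: "OP_DUP",
    0x87: "OP_EQUAL",
    0x88: "OP_EQUALVERIFY",
    0xA9: "OP_HASH160",
    0xAC: "OP_CHECKSIG",
    0xAE: "OP_CHECKMULTISIG",
}
# OP_PUSHDATA1/2/4 carry a little-endian length of 1, 2 or 4 bytes.
PUSHDATA_WIDTH = {0x4C: 1, 0x4D: 2, 0x4E: 4}


def disassemble(script: bytes) -> str:
    """Return the script as space-separated opcode names and hex data pushes."""
    parts = []
    i = 0
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4E:
            if op <= 0x4B:
                size = op  # opcodes 1..75 push that many bytes directly
            else:
                width = PUSHDATA_WIDTH[op]
                if i + width > len(script):
                    raise ValueError("push length runs past end of script")
                size = int.from_bytes(script[i:i + width], "little")
                i += width
            if i + size > len(script):
                raise ValueError("push data runs past end of script")
            parts.append(script[i:i + size].hex())
            i += size
        elif 0x51 <= op <= 0x60:
            parts.append(f"OP_{op - 0x50}")  # OP_1 .. OP_16
        else:
            parts.append(OPCODE_NAMES.get(op, f"OP_UNKNOWN_{op:#04x}"))
    return " ".join(parts)
```

Walking the pushes properly like this, rather than scanning for byte patterns, avoids mistaking data inside a push for an opcode.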

The bigger question though is why exactly do you need to do this? Are you trying to write a library?


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: piotr_n on July 26, 2013, 07:08:37 PM
Ultimately there is no such thing as "the address".
Yes, you can do these things and they will work in 90+% of cases, but never for all of them.
If you can get a miner to include it, you can introduce a tx that creates a new address format - whatever you can think of.
I guess you should be able to name it then ;)

As for extracting the public key (and so the address) from the signature - it is possible, though from what I recall a signature on its own can recover to more than one candidate key, so the output might be a false address. So it only makes sense if you already have the right one, so you can check.


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: riplin on July 26, 2013, 07:14:56 PM
If you aren't doing a full scripting engine, you should make templates and search them until you find a hit.

jl2012 gives an example of the "standard" transaction template.

Also, see https://bitcointalk.org/index.php?topic=126865.msg1348297#msg1348297

I have a full script engine, but I'm writing an SPV node now that isn't doing full script evaluation. It just wants to extract the addresses to test against the local wallet. I guess it doesn't really matter too much if I get false positives, since they'll be filtered out by the bloom filter / local db anyway.



Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: piotr_n on July 26, 2013, 07:20:30 PM
I guess you only need to analyze the data from pk_scripts of unspent outputs - the inputs just reference a previous tx id and output index (who cares).
The txout scripts are in general pretty straightforward.
Recent ones contain a hash of the public key plus a few opcodes.
Older ones contained the entire public key.
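
For the older form with the entire public key, a hedged Python sketch: accept a script that is exactly one 33- or 65-byte push followed by OP_CHECKSIG, and check the key's prefix byte as suggested in the first post. `extract_p2pk_pubkey` is a made-up name, and the check is on format only; it does not verify that the bytes are a valid curve point.

```python
OP_CHECKSIG = 0xAC


def extract_p2pk_pubkey(script: bytes):
    """Return the public key from a '<pubkey> OP_CHECKSIG' script, or None."""
    if len(script) < 2 or script[-1] != OP_CHECKSIG:
        return None
    push_len = script[0]
    pubkey = script[1:-1]
    # The single push must account for everything between the first and last byte.
    if push_len != len(pubkey):
        return None
    # Compressed keys: 33 bytes, prefix 0x02 or 0x03.
    if push_len == 33 and pubkey[0] in (0x02, 0x03):
        return pubkey
    # Uncompressed keys: 65 bytes, prefix 0x04.
    if push_len == 65 and pubkey[0] == 0x04:
        return pubkey
    return None
```

To compare against a wallet address, the key would then be hashed with SHA-256 followed by RIPEMD-160; that step is left out here because RIPEMD-160 is not available in every Python hashlib build.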


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: kjj on July 26, 2013, 07:34:03 PM
If you aren't doing a full scripting engine, you should make templates and search them until you find a hit.

jl2012 gives an example of the "standard" transaction template.

Also, see https://bitcointalk.org/index.php?topic=126865.msg1348297#msg1348297

I have a full script engine, but I'm writing an SPV node now that isn't doing full script evaluation. It just wants to extract the addresses to test against the local wallet. I guess it doesn't really matter too much if I get false positives, since they'll be filtered out by the bloom filter / local db anyway.

In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: riplin on July 26, 2013, 07:39:33 PM
In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.

So it's building up: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

and just matching on that?


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: piotr_n on July 26, 2013, 07:41:06 PM
In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.

So it's building up: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

and just matching on that?

Yes, but in the early blocks they were using different scripts (the entire public key followed by OP_CHECKSIG).


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: kjj on July 26, 2013, 08:02:55 PM
In that case, why do you bother extracting the pubkey?  The satoshi client just compares the script to a stored copy of the script that matches a key.

So it's building up: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

and just matching on that?

Yes, exactly.

Understanding this is the key to making sense of compressed keys and WIFs too.
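
A sketch of the matching approach discussed here, in Python: build the expected script once per wallet key hash and look incoming output scripts up in a dictionary. This illustrates the idea as described in this thread, not the actual client code; `p2pkh_script` and `WalletScripts` are hypothetical names.

```python
def p2pkh_script(pubkey_hash: bytes) -> bytes:
    """OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG."""
    if len(pubkey_hash) != 20:
        raise ValueError("pubkey hash must be 20 bytes")
    return bytes([0x76, 0xA9, 0x14]) + pubkey_hash + bytes([0x88, 0xAC])


class WalletScripts:
    """Maps precomputed output scripts back to the wallet's key hashes."""

    def __init__(self, pubkey_hashes):
        self._scripts = {p2pkh_script(h): h for h in pubkey_hashes}

    def owner_of(self, script_pubkey: bytes):
        """Return the matching key hash, or None if the output is not ours."""
        return self._scripts.get(script_pubkey)
```

With this layout nothing is parsed at all: an output either equals one of the stored scripts byte for byte or it is ignored, so there are no false positives from stray 20-byte pushes.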


Title: Re: Best way to extract addresses from sigScript and PkScript?
Post by: riplin on July 26, 2013, 08:34:58 PM
Yes, exactly.

Understanding this is the key to making sense of compressed keys and WIFs too.

I understand compressed keys and WIFs ;)

Anyway, I've decided not to do a simple memcmp, but rather something more generic:

Code:
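        // Returns the single 20-byte hash (pubkey hash or script hash) found in
        // aScript, or null if there is none, more than one, or the script is malformed.
        internal static Hash160 GetAddress(byte[] aScript)
        {
            Hash160 hash = null;
            ParseState parseState = ParseState.OP_ANY;
            // The callback runs once per opcode; returning false aborts the parse.
            ScriptParser parser = new ScriptParser(aScript, delegate(Opcode aOpcode, byte[] aData)
            {
                switch (parseState)
                {
                    case ParseState.OP_ANY:
                        // OP_HASH160 must be followed by a 20-byte push.
                        if (aOpcode == Opcode.OP_HASH160)
                            parseState = ParseState.OP_20;

                        // A 33- or 65-byte push is treated as a public key candidate.
                        if ((aOpcode == (Opcode)33) || (aOpcode == (Opcode)65))
                        {
                            // Reject scripts that yield more than one address.
                            if (hash != null)
                                return false;

                            if (Utils.VerifyPublicKey(aData, aData.Length) != 1)
                                return false;

                            hash = new Hash160(Utils.RIPEMD160SHA256(aData));
                        }
                        break;

                    case ParseState.OP_20:
                        // Expect the 20-byte hash itself.
                        if (aOpcode != (Opcode)20)
                            return false;

                        if (hash != null)
                            return false;

                        hash = new Hash160(aData);
                        parseState = ParseState.OP_EQUALVERIFY;
                        break;

                    case ParseState.OP_EQUALVERIFY:
                        // OP_EQUAL covers the P2SH form, OP_EQUALVERIFY the pubkey-hash form.
                        if (aOpcode != Opcode.OP_EQUAL && aOpcode != Opcode.OP_EQUALVERIFY)
                            return false;

                        parseState = ParseState.OP_ANY;
                        break;
                    default:
                        return false;
                }
                return true;
            });

            // Also require that the script did not end in the middle of a
            // HASH160 <hash> EQUAL[VERIFY] sequence.
            if (parser.Parse() && parseState == ParseState.OP_ANY)
            {
                return hash;
            }

            return null;
        }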