Bitcoin Forum
May 14, 2024, 04:24:56 PM *
Author Topic: Question on the scriptSig and scriptPubKey  (Read 6317 times)
BitcoinScholar (OP)
February 08, 2013, 01:42:07 AM
 #1

Not very long and I'm back with more questions. I've learned quite a bit about Bitcoin through my pursuits on here and now I have more technical questions. Right now I'm studying the scriptSig of the input portion of a transaction and the scriptPubKey of the output portion of a transaction. I've found some answers and understand aspects but others I don't.

Pertaining to the scriptSig, I see two parts: the public key and the signature. I am comfortable with the public-key part. When you try to spend BTC you refer back to the output of the previous tx, and the program hashes your public key, thus finding your address; since the preceding output sent the BTC to your address, it is verified that you have the key. That seems simple to me, now at least.

When it comes to the signature portion I'm lost. I know that somehow the private key you provide verifies your public key through hashes. This makes sense, I understand how that happens. My question is: how do you provide your private key to the input of your tx to process it and find a line-up with your public key without putting your private key in the record? Does the process just use the private key but do nothing else with it? Also, I know you sign a hash of the tx and I get how the hash is made from the tx. But what exactly is that signature? Is it a hash of a combination of your private and public key? I found this information:

"The script contains two components, a signature and a public key. The public key belongs to the redeemer of the output transaction and proves the creator is allowed to redeem the outputs value. The other component is an ECDSA signature over a hash of a simplified version of the transaction. It, combined with the public key, proves the transaction was created by the real owner of the address in question. Various flags define how the transaction is simplified and can be used to create different types of payment."

here: https://en.bitcoin.it/wiki/Transactions

and I'm trying to understand the italicized part.
kjj
February 08, 2013, 02:12:58 AM
 #2

Look at the picture on that page.  It shows that the scriptSig includes both the signature and the pubkey that verifies the signature.  We then hash the pubkey and compare it to the hash embedded in the prevout.

The wikipedia page explains, in a way, how the signature is actually calculated.  The OP_CHECKSIG page on the bitcoin wiki goes into some detail on the simplification stage.

Gavin Andresen
Chief Scientist


February 08, 2013, 03:38:03 AM
 #3

The other component is an ECDSA signature over a hash of a simplified version of the transaction.

The magic of public key crypto is that you can give somebody your public key, some data, and a signature, and they can be certain that:

a) that particular signature could only have been created by somebody that has the private key that corresponds to the public key
b) the data hasn't been changed in any way

They don't need to know the private key-- you keep it secret.

The "hash over..." bit is the way digital signatures work-- you sign a hash of the data, and not the data itself, because the hash is much smaller.

The "...simplified version of the transaction" bit is complicated. The data signed is the transaction minus all its scriptSig signatures, plus (almost always) the previous transaction's scriptPubKey. See the OP_CHECKSIG page on the wiki for all the gory details.
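
For readers who want to see points a) and b) in action, here is a minimal, unoptimized sketch of ECDSA over secp256k1 in pure Python (3.8+). The function names (`point_add`, `scalar_mult`, `sign`, `verify`) are made up for this illustration, and the random nonce and non-constant-time arithmetic make it unsuitable for real keys. It only shows that the verifier needs the public key, the data and the signature, never the private key.

```python
import hashlib
import secrets

# secp256k1 domain parameters (the curve Bitcoin uses).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                       # a + (-a) = infinity
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P   # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def digest(data):
    """Double SHA-256 of the data, as an integer (you sign the hash, not the data)."""
    return int.from_bytes(hashlib.sha256(hashlib.sha256(data).digest()).digest(), "big")


def sign(priv, data):
    """Return an (r, s) signature; uses a random nonce for simplicity."""
    z = digest(data)
    while True:
        k = secrets.randbelow(N - 1) + 1
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * priv) % N
        if r and s:
            return (r, s)


def verify(pub, data, sig):
    """Check a signature using only the public key."""
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    z = digest(data)
    pt = point_add(scalar_mult(z * w % N, G), scalar_mult(r * w % N, pub))
    return pt is not None and pt[0] % N == r


priv = secrets.randbelow(N - 1) + 1       # stays secret
pub = scalar_mult(priv, G)                # can be published
sig = sign(priv, b"simplified transaction bytes")
assert verify(pub, b"simplified transaction bytes", sig)
assert not verify(pub, b"tampered transaction bytes", sig)
```

The last two lines mirror the two guarantees above: the signature checks out against the public key, and changing the data makes verification fail.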

DannyHamilton
February 08, 2013, 04:17:32 PM
 #4

It sounds to me like you are struggling to understand what ECDSA (Elliptic Curve Digital Signature Algorithm) is and how it works.  This is a well established and widely used cryptographic algorithm for "signing" information using a private key in such a way that anyone with the associated public key can prove that the signature could only have been provided by the holder of the private key.  The private key is never revealed (which is why it is called "private"), but the public key must be revealed (which is why it is called "public").  I don't fully understand the process, but it has been vetted by enough people that I trust it.

You can find some of the process involved here:
http://en.wikipedia.org/wiki/ECDSA

When sending bitcoin, the bitcoin address is provided in the output but the public key is not.  It is not possible to determine what the public key is if you only have the bitcoin address.

Since the public key is required to verify the signature, and since the owner of the private key can easily/quickly calculate both the public key and the bitcoin address, the owner of the private keys scans the blockchain for outputs that are associated with addresses that can be generated from those private keys.  Then they choose one and calculate the public key from the private key.  The public key is then presented along with the signature in the scriptSig.

Now there is enough information available to prove that the spender has the right to do so.  Being given the public key in the scriptSig means that anyone can:
Code:
Base58(CONCATENATE(VERSION, RIPEMD-160(SHA-256(public key)), SUBSTR(0, 4, SHA-256(SHA-256(CONCATENATE(VERSION, RIPEMD-160(SHA-256(public key))))))))
To verify that the resulting bitcoin address is the same as the address provided in the previous output.
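
As a rough sketch of that address check in Python: the helper names `hash160` and `base58check` are invented for this example. Note that the 4-byte checksum is taken from a double SHA-256 over the version byte plus the 20-byte hash. `hash160` relies on `hashlib.new("ripemd160")`, which only works if the local OpenSSL build still provides RIPEMD-160, so treat that part as an assumption about your environment.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def hash160(pubkey: bytes) -> bytes:
    """RIPEMD-160(SHA-256(pubkey)); needs ripemd160 support in hashlib."""
    return hashlib.new("ripemd160", sha256(pubkey)).digest()


def base58check(version: bytes, payload: bytes) -> str:
    """Encode version + payload + 4-byte checksum in Base58."""
    data = version + payload
    data += sha256(sha256(data))[:4]          # checksum over version + payload
    num = int.from_bytes(data, "big")
    encoded = ""
    while num:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Each leading zero byte is written as a literal '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def address_from_pubkey(pubkey: bytes) -> str:
    """Main-network pay-to-pubkey-hash address (version byte 0x00)."""
    return base58check(b"\x00", hash160(pubkey))
```

A verifier would call `address_from_pubkey` on the public key found in the scriptSig and compare the result with the address paid by the previous output. For an all-zero 20-byte hash, `base58check(b"\x00", bytes(20))` gives `1111111111111111111114oLvT2`.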

Being given the public key and the signature in the scriptSig means that using ECDSA, anyone can confirm that the signature provided could only have been provided by a person who holds the private key associated with the given public key.

Since the public key and signature provide proof of control of the private key, and the public key can be confirmed as the correct public key for the given address, authorization to spend the previous output is verified.

Understanding the fact that a signature created by a private key can be verified with the public key, and that the same signature cannot be created with the public key is essential to understanding how the bitcoin transaction verification process works.  Understanding how and why the signature can be verified and how and why the same signature can't be created if you only have the public key is not essential to understanding how the transaction verification process works.

You'll probably want to seek out a cryptography forum if you really want to understand the internals of SHA-256, RIPEMD-160, and ECDSA.
DeathAndTaxes
Gerald Davis


February 08, 2013, 04:30:37 PM
Last edit: February 08, 2013, 04:44:56 PM by DeathAndTaxes
 #5

It sounds to me like you are struggling to understand what ECDSA (Elliptic Curve Digital Signature Algorithm) is and how it works.

Or more generally the concept of Asymmetric Cryptography (also known as public key cryptography).

Really to start understanding Bitcoin one needs to have a very good understanding of the following concepts:
Cryptographic Hash Function
Asymmetric Cryptography
Digital Signatures
Cryptographic Nonce  <- used in mining not transactions

(Some wikipedia links to get the OP started).

Note I didn't reference a specific hash function or the asymmetric cryptography algorithm used.  It is important to understand in general terms what these are, how they work, and why they are used.   For example why do we use a cryptographic hash instead of the original value?  What does a digital signature prove?  How can we verify the authenticity of a digital signature with the public key if it was signed by the private key?

Then one should be familiar with the algorithms used:
SHA-256 (cryptographic hash function)
RIPEMD-160 (cryptographic hash function)
ECC (Elliptic Curve Cryptography)
ECDSA (ECC Digital Signature Algorithm)
SECP256K1 (specific ECC curve used by Bitcoin)

One doesn't need to know for example the internal dataflow of SHA-256 but one does need to know what it is and at a high level how it works. Purpose?  Block size? Hash size?


None of these are Bitcoin specific and if someone doesn't have a fairly good understanding of them, then any explanation on Bitcoin really becomes a confusing mess of Bitcoin concepts AND the underlying cryptographic concepts.

It would be like trying to learn double-entry bookkeeping without understanding arithmetic.  It simply can't be done.

DannyHamilton
February 08, 2013, 05:54:54 PM
 #6

. . .
Really to start understanding Bitcoin one needs to have a very good understanding of the following concepts:
Cryptographic Hash Function
Asymmetric Cryptography
Digital Signatures
Cryptographic Nonce  <- used in mining not transactions

(Some wikipedia links to get the OP started).

Note I didn't reference a specific hash function or the asymmetric cryptography algorithm used.  It is important to understand in general terms what these are, how they work, and why they are used.   For example why do we use a cryptographic hash instead of the original value?  What does a digital signature prove?  How can we verify the authenticity of a digital signature with the public key if it was signed by the private key?

Then one should be familiar with the algorithms used:
SHA-256 (cryptographic hash function)
RIPEMD-160 (cryptographic hash function)
ECC (Elliptic Curve Cryptography)
ECDSA (ECC Digital Signature Algorithm)
SECP256K1 (specific ECC curve used by Bitcoin)
. . .

I see what you did there.  ;)

I cheated.  I answered most of it . . . then posted, then went back and cleaned up the post and added additional details.
BitcoinScholar (OP)
February 09, 2013, 11:26:51 PM
 #7

I mostly understand the scriptSig and scriptPubKey now. What happens in a transaction is that a private key goes through the process of ultimately proving that the address the BTC was "sent" to matches. This initially "signs" it, but then it must be verified with the public key. The public key then also verifies the signature and provides further evidence that the signature is valid. This is now open to go to the scriptPubKey portion of the script, which ultimately just specifies which address is the recipient of the BTC or signature. Within the output is also the quantity sent. Then, when this is received by the new owner, they have to go through the same input process, prove possession of the address storing the value, etc., thus the system acts as a series of electronic signatures.

Looking at the necessary things for the input portion of a transaction, I see that they are 1) the previous tx, 2) the index and 3) the scriptSig. I understand the scriptSig now (in a basic way, I believe) and the index, which just refers to the specific output of the tx in question. I don't understand, however, how the "previous tx" represents the previous transaction (sounds strange, but let me explain). Does the previous tx represent a hash of the finished signature of the previous tx? And how is this used as an essential part of the input?

I have a few little theories. Maybe the "previous tx" section of an input is just a hash of the referenced output address? Again, maybe it's just a hash of the previous tx signature? But then, I think, scriptPubKey only gives the output address. If scriptPubKey gives more than just the output address I think it would explain this last piece of the puzzle and I'd understand the basics of the script process. My impression is that scriptPubKey just contains the value and the output address.
fengshu
May 04, 2014, 12:42:48 PM
 #8

The other component is an ECDSA signature over a hash of a simplified version of the transaction.

The magic of public key crypto is that you can give somebody your public key, some data, and a signature, and they can be certain that:

a) that particular signature could only have been created by somebody that has the private key that corresponds to the public key
b) the data hasn't been changed in any way

They don't need to know the private key-- you keep it secret.

The "hash over..." bit is the way digital signatures work-- you sign a hash of the data, and not the data itself, because the hash is much smaller.

The "...simplified version of the transaction" bit is complicated. The data signed is the transaction minus all its scriptSig signatures, plus (almost always) the previous transaction's scriptPubKey. See the OP_CHECKSIG page on the wiki for all the gory details.


Does the signature in the transaction look like this:
ECDSASignature(Hash(Transaction-scriptSig)+PreTransaction_scriptPubKey)

?

telepatheic
May 04, 2014, 01:05:26 PM
 #9

See this wiki page for more details, the data which is signed is effectively:
Code:
SHA256(SHA256(modified_transaction))

The modified transaction is very complicated to construct and removes the signature and public key and inputs that are not being signed.
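
To make the nesting concrete, here is a tiny Python sketch. The bytes in `modified_transaction` are only a placeholder, not a real serialized transaction. The point is that the scriptPubKey is substituted into the transaction before hashing, so the hash covers it, and the ECDSA signature is then computed over the resulting 32-byte digest.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """SHA256(SHA256(data)): the 32-byte digest that gets signed."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


# Placeholder standing in for the serialized modified transaction.
modified_transaction = b"tx with scriptSigs blanked and scriptPubKey substituted"
signing_digest = sha256d(modified_transaction)
print(signing_digest.hex())
```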
DeathAndTaxes
Gerald Davis


May 04, 2014, 03:42:19 PM
 #10

See this wiki page for more details, the data which is signed is effectively:
Code:
SHA256(SHA256(modified_transaction))

The modified transaction is very complicated to construct and removes the signature and public key and inputs that are not being signed.

Sadly it is probably one of the worst decisions Satoshi made.  The complexity adds nothing and is one of the root causes of transaction malleability.  Not really sure what Satoshi was trying to achieve.  Normally the signature is OUTSIDE of the payload and trying to stick it back inside the input adds no value.

It would have been far simpler to do something like.

Step 1) Construct entire transaction (minus signature) in a canonical form
Step 2) Hash the entire transaction.  This becomes the tx_id as well as the digest for the signature
Step 3) Sign the hash in step 2 with the private key(s) and append to signature body.

You would end up with something like:
tx header
in[n]  (list of inputs)
out[n] (list of outputs)
sign[n] (list of signatures)

To verify:
Step 1) Remove the signatures from the tx body and save.
Step 2) Hash the remaining tx.  This is the tx id and the digest for the signature verification.
Step 3) Verify each of the signatures using the pubkey(s) and the transaction hash.
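
A toy Python sketch of the layout proposed above, to show the property it is after: the tx id is computed with the signature list removed, so it cannot change when a signature is re-encoded. The JSON serialization and the field names `in`, `out` and `sign` are assumptions made for this illustration and have nothing to do with Bitcoin's actual wire format.

```python
import hashlib
import json


def canonical_body(tx: dict) -> bytes:
    """Serialize everything except the signatures in one canonical form."""
    body = {"in": tx["in"], "out": tx["out"]}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def tx_id(tx: dict) -> str:
    """Hash of the body; doubles as the digest each signature would cover."""
    return hashlib.sha256(canonical_body(tx)).hexdigest()


tx = {
    "in": [{"prev": "aa" * 32, "index": 0}],
    "out": [{"to": "some-address", "amount": 50}],
    "sign": ["123,456"],
}
# The same signature numbers written with a different encoding.
tx_reencoded = dict(tx, sign=["0123,0456"])

assert tx_id(tx) == tx_id(tx_reencoded)
```

In this arrangement a third party who rewrites the signature bytes gets a transaction with the same id, which is the behaviour the proposal is aiming for.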

Honestly I have no idea what Satoshi was trying to accomplish with the overly complicated mess that is Bitcoin tx signatures but given a few other questionable decisions (using uncompressed pubkeys, non-canonical signatures, including pubkey in inputs when they could be reconstructed from the signature, etc) I believe as smart as Satoshi was ECDSA wasn't his strong suit.  He used it but he wasn't an expert at it.





telepatheic
May 04, 2014, 04:52:39 PM
 #11

Reading Satoshi's original code is extremely insightful. The script.cpp file has basically not changed since Satoshi wrote it. Everyone has been too scared to suggest changing it to something logical. I've yet to see any real use of the scripting functionality beyond multi-signatures. There must have been some big idea behind it but nobody knows what, even Gavin doesn't have a clue why Satoshi wrote it like he did.
Nicolas Dorier
May 04, 2014, 07:16:07 PM
 #12

I posted an article yesterday where I explain that.
Please take a look and, if you like it, vote: http://www.codeproject.com/Articles/768412/NBitcoin-The-most-complete-Bitcoin-port-Part-Crypt ;)
Satoshi's code is somewhat hard to read for someone not used to C++ development.
My port is easier to understand: https://github.com/NicolasDorier/NBitcoin (Script class)
A signature is represented by the type TransactionSignature.

The process to determine the hash that you need to sign with your key is specified in the Script.SignatureHash method.
https://github.com/NicolasDorier/NBitcoin/blob/master/NBitcoin/Script.cs#L358

telepatheic
May 04, 2014, 08:02:26 PM
 #13

Your code looks really good. What unit tests do you use (how do you make sure you don't break compatibility with bitcoin core) ?
Nicolas Dorier
May 04, 2014, 08:06:38 PM
 #14

Your code looks really good. What unit tests do you use (how do you make sure you don't break compatibility with bitcoin core) ?

I ported the unit tests of bitcoin core, along with their own data driven tests. (Actually, I had to implement some bugs of the core implementation to not break compatibility, like the openssl bug I documented in the ScriptEvaluationContext.CheckSig method)
Clone my project, I gave the category "Core" to the tests coming from the core implementation.



In fact I ported their tests before implementing them.

My Node Server implementation is not 100% done yet, but you can start to talk with the network.
The crypto part is entirely ported though.

TierNolan
May 04, 2014, 10:50:53 PM
 #15

Sadly it is probably one of the worst decisions Satoshi made.  The complexity adds nothing and is one of the root causes of transaction malleability.  Not really sure what Satoshi was trying to achieve.  Normally the signature is OUTSIDE of the payload and trying to stick it back inside the input adds no value.

The signature is outside the payload.  You wouldn't be able to sign it otherwise.

To calculate hash for signing, you set all the input scripts to length zero (and copy some other stuff around) and then get the hash of the result.

This "locks" the transaction in place and prevents any changes without breaking the signature.

The inputs aren't signed (since they are set to zero length arrays), so you can add the signatures to the transaction without breaking the signature.

The tx-id hash depends on the signed part of the transaction and the added signatures.  

Malleability is caused by the fact that you can encode the signature in many ways.

In pseudocode:

Code:
signing hash = hash(transaction with input scripts blanked)

signature = sign(signing hash, private key)

final transaction = transaction with signature added to the inputs

tx-id = hash(final transaction)

The signature is basically two numbers.  It would be like encoding 123, 456 as 0123, 0456.  They both represent the same pair of numbers, so both are valid signatures.
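
The same point as a toy Python sketch. The byte strings are stand-ins rather than real transaction data, and `decode_signature` is a made-up helper: two encodings of the same pair of numbers leave the signing hash alone but give two different tx-ids.

```python
import hashlib


def double_sha256(data: bytes) -> str:
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()


def decode_signature(encoded: bytes) -> tuple:
    """Parse 'r,s' into two integers; leading zeros do not change the value."""
    r, s = encoded.split(b",")
    return int(r), int(s)


# Toy stand-in for the transaction with its input scripts blanked.
UNSIGNED_TX = b"inputs-with-empty-scripts|outputs"
signing_hash = double_sha256(UNSIGNED_TX)      # what the signature covers

sig_plain = b"123,456"
sig_padded = b"0123,0456"                      # same numbers, padded encoding

# The tx-id is taken over the final transaction, signature bytes included.
txid_plain = double_sha256(UNSIGNED_TX + b"|" + sig_plain)
txid_padded = double_sha256(UNSIGNED_TX + b"|" + sig_padded)

assert decode_signature(sig_plain) == decode_signature(sig_padded)
assert txid_plain != txid_padded
```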

The ideal solution would be to use the signing hash[ * ] to refer to previous inputs and the (current) tx-id just for computing the merkle tree in the blocks.

[ * ] The signing hash is the hash of the transaction with all inputs set to zero

You would end up with something like:
tx header
in[n]  (list of inputs)
out[n] (list of outputs)
sign[n] (list of signatures)

To verify:
Step 1) Remove the signatures from the tx body and save.
Step 2) Hash the remaining tx.  This is the tx id and the digest for the signature verification.
Step 3) Verify each of the signatures using the pubkey(s) and the transaction hash.

It already works that way, except the sign[n] values are added to the inputs after signing.

telepatheic
May 04, 2014, 10:59:46 PM
 #16

The inputs aren't signed (since they are set to zero length arrays), so you can add the signatures to the transaction without breaking the signature.

It isn't that simple, the current input, with the signature and anything before the last OP_CODESEPARATOR removed, are signed (In fact even that is a gross simplification). See https://en.bitcoin.it/wiki/OP_CHECKSIG
TierNolan
May 04, 2014, 11:26:41 PM
 #17

It isn't that simple, the current input, with the signature and anything before the last OP_CODESEPARATOR removed, are signed (In fact even that is a gross simplification). See https://en.bitcoin.it/wiki/OP_CHECKSIG

For transaction malleability, that doesn't matter.  If the script for the output being sent doesn't have an OP_CODESEPARATOR in it, then you can ignore that effect. 

Assuming the spender is spending standard transaction coins, there won't be any OP_CODESEPARATORS to deal with.

With normal transactions, to get the signature hash

- you blank out all the inputs
- copy the scriptPubKey of the output you are spending into the input you are signing
- add the hash_type to the end of the transaction (expanded to 4 bytes)

Everything that is signed is locked-down.
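
Those three steps can be sketched in Python roughly as follows. This covers only the plain SIGHASH_ALL case with no OP_CODESEPARATOR, and the class and function names (`TxIn`, `TxOut`, `sighash_all`, `txid`) are invented for the example. The field layout follows the original transaction serialization; treat it as a sketch to compare against the OP_CHECKSIG wiki page rather than a reference implementation.

```python
import hashlib
import struct
from dataclasses import dataclass, replace

SIGHASH_ALL = 1


@dataclass(frozen=True)
class TxIn:
    prev_txid: bytes            # 32 bytes, in serialized byte order
    prev_index: int
    script_sig: bytes = b""
    sequence: int = 0xFFFFFFFF


@dataclass(frozen=True)
class TxOut:
    value: int                  # amount in satoshis
    script_pubkey: bytes


def varint(n: int) -> bytes:
    """Variable-length integer used for counts and script lengths."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def serialize(version, inputs, outputs, locktime) -> bytes:
    out = struct.pack("<I", version) + varint(len(inputs))
    for txin in inputs:
        out += txin.prev_txid + struct.pack("<I", txin.prev_index)
        out += varint(len(txin.script_sig)) + txin.script_sig
        out += struct.pack("<I", txin.sequence)
    out += varint(len(outputs))
    for txout in outputs:
        out += struct.pack("<q", txout.value)
        out += varint(len(txout.script_pubkey)) + txout.script_pubkey
    return out + struct.pack("<I", locktime)


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def sighash_all(version, inputs, outputs, locktime, index, script_pubkey) -> bytes:
    # Step 1: blank out every input script.
    blanked = [replace(txin, script_sig=b"") for txin in inputs]
    # Step 2: copy the scriptPubKey being spent into the input being signed.
    blanked[index] = replace(blanked[index], script_sig=script_pubkey)
    # Step 3: append the hash type as 4 little-endian bytes, then double-hash.
    preimage = serialize(version, blanked, outputs, locktime)
    return sha256d(preimage + struct.pack("<I", SIGHASH_ALL))


def txid(version, inputs, outputs, locktime) -> bytes:
    """Raw tx-id digest: covers the full transaction, scriptSigs included."""
    return sha256d(serialize(version, inputs, outputs, locktime))
```

Because `sighash_all` overwrites every `script_sig` before hashing, filling in or re-encoding a scriptSig leaves the signing hash unchanged, while `txid` over the same transaction does change.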

The problem is the "blank all the inputs" step, that means that the inputs don't affect the signing hash but they do affect the txid hash.

If the tx-id wasn't affected by the transaction inputs, then malleability would not be an issue.

You send funds to a particular tx-id, and then the transaction is changed, and so the funds are credited to a different transaction output.  This breaks refund transactions which assumed a particular tx-id.

DeathAndTaxes
Gerald Davis


May 04, 2014, 11:32:08 PM
 #18

It already works that way, except the sign[n] values are added to the inputs after signing.

Which is pointless but the reality is that isn't correct.  Part of the inputs ARE included and part of the inputs are placeholders and then this convoluted mess is arranged into a modified transaction and then signed.  Then after the fact the signature is dumped back into where the placeholders are.

Quote
If the tx-id wasn't affected by the transaction inputs, then malleability would not be an issue.

Or if, like any other digital signature system, the entire message (in this case the complete tx minus the signature(s)) was hashed and signed, then there would be no difference between the tx id (hash) and the digest of the signature (the exact same hash).

TierNolan
May 05, 2014, 12:02:15 AM
 #19

Or if, like any other digital signature system, the entire message (in this case the complete tx minus the signature(s)) was hashed and signed, then there would be no difference between the tx id (hash) and the digest of the signature (the exact same hash).

There are some benefits in being able to blank parts of the transaction out. 

Malleability itself could have been fixed, if the tx-id of the input wasn't included in the signing process.

Having said that, I broadly agree. 

The basic signing process should just use hash(transaction without inputs | hash_type) as the signing hash.  The signing hash should be used to refer to the previous transaction.

The extra complexities could be added with different hash_types, if absolutely necessary.

Everything, including the signatures, should be included in the hash for the block merkle root though.  That makes it much easier for archiving purposes and initial download.

gmaxwell
May 05, 2014, 01:10:48 AM
Last edit: May 05, 2014, 01:52:45 AM by gmaxwell
 #20

The basic signing process should just use hash(transaction without inputs | hash_type) as the signing hash.  The signing hash should be used to refer to the previous transaction.
Oh no, that wouldn't be good in general, at least unless you could opt out of it.

Consider, You pay Alice.  The transaction isn't confirming because your fees were not competitive. So you double spend its inputs in a new transaction with better fees in order to achieve atomic exclusion. Oops: Moments after your replacement transaction a prior payer, Peggy, poses a parallel payment and in your present position this is no perk since her payments were paired: price and pubkey parroted. Preclusion prevented by a profusion of parallel property, both payments are processed and Alice, pleased with her profit, parts leaving you peevish.
 
There certainly are cases where it would be good to be able to mask the inputs (generally where you're doing something interesting and would be absolutely sure never to reuse a public key as part of your protocol), but in the common case the additional control and precision is very important, not just to prevent stupidity but to avoid suffering losses due to the inconsistency that is inherent in a distributed system.
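
A toy Python sketch of the double-payment scenario described above. The dictionaries and coin names are invented and this is not a real transaction format. If the signing hash leaves the inputs out, a signature made for the stuck payment is equally valid for a copy that spends a different coin paid to the same key.

```python
import hashlib


def digest(parts) -> str:
    return hashlib.sha256(repr(parts).encode()).hexdigest()


def sighash_without_inputs(tx: dict) -> str:
    """Hypothetical signing hash that ignores which coins are spent."""
    return digest(("out", tx["outputs"]))


def sighash_with_inputs(tx: dict) -> str:
    """Signing hash that also commits to the exact coins being spent."""
    return digest(("in", tx["inputs"], "out", tx["outputs"]))


# Your original, stuck payment to Alice, spending coin X.
pay_alice = {"inputs": [("coin_X", 0)], "outputs": [("alice", 100_000)]}
# The same payment rebound to coin Y, which Peggy later sent to the same key.
replayed = {"inputs": [("coin_Y_from_peggy", 0)], "outputs": [("alice", 100_000)]}

# Without inputs in the hash, the old signature also authorizes the replay.
assert sighash_without_inputs(pay_alice) == sighash_without_inputs(replayed)
# Committing to the inputs ties the signature to coin X only.
assert sighash_with_inputs(pay_alice) != sighash_with_inputs(replayed)
```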

And fengshu, it's generally preferred that people not bump old threads.