Bitcoin Forum
November 02, 2024, 10:59:16 AM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1] 2 »  All
  Print  
Author Topic: BIP70 : Question about canonical representation of signed data  (Read 1912 times)
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 28, 2014, 06:18:04 PM
 #1

BIP70 signs the "protocol buffer" serialized data.
One big implementation problem I had, is that the same message can be encoded differently, because default fields are most of the time not serialized by protobuf implementations.

So what I'm doing, is that I'm verifying the signature against the message as I receive it, and just put the signature to empty.
This works fine, but I suspect a canonical representation of the protocol message would be easier to implement.

I see nothing specified in the BIP, am I missing something ?

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 28, 2014, 06:25:06 PM
 #2

I think there should be no "optional" fields in the proto schema. The would fix the problem.

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 28, 2014, 06:55:33 PM
 #3

I am confused since, gavin's implementation at https://bitcoincore.org/~gavin/createpaymentrequest.php
Generate a PaymentRequest that sometimes include default values (paymentdetailsversion), and sometimes not (details.expires).

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
May 28, 2014, 06:58:17 PM
 #4

Why are you reserializing things? The PaymentDetails is stored as binary in the protobuf exactly so you don't have to care about the details of how exactly it was serialized, the signature covers the exact binary representation as stored in the PaymentRequest. I don't think anyone else had any implementation problems with this.
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 28, 2014, 09:25:22 PM
Last edit: May 28, 2014, 10:19:31 PM by Nicolas Dorier
 #5

Mike, this is how I currently do, but I suspect implementers don't interop between them yet.
Also, BitcoinJ does not support signing payment message. As far as I know, I have one of the only working implementation with what gavin did. (And I suspect his code can't confirm correctly a payment I generate)
You need to deserialize to zero the signature, and then you need to reserialize so you can get the bytes that were hashed + signed with the certificate, so you can verify the signature.

My implementation support that great, but I'm creating test vectors for BIP70, and I am almost sure that even if they comply with the BIP, some of the tests vector will not pass for some implementations.

If you think that normal, then be it. I just don't find that correct to not have a canonical way of signing a PaymentRequest, this will lead to buggy implementations. With bad interoperability between different implementations. Such bug are kind of hard to find.

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 28, 2014, 10:27:42 PM
 #6

Anyway talk is cheap

I put some test vector here https://github.com/NicolasDorier/bips/blob/master/bip-0070.mediawiki
Just try the third one against your implementation, if it works, you can probably ignore this post. If not and I'm wrong, we probably need to review the BIP70, to be more explicit about how to serialize a PaymentRequest before hashing and signing.

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
May 29, 2014, 10:57:42 AM
 #7

You may well be right that there are interop problems, but I still don't understand the issue you're having.

You do not need to reserialize anything to check the signature. PaymentDetails is stored in the protobuf as "required bytes", not an embedded submessage. To verify the signature, you just take that raw byte array, take the raw signature, and apply them with the public key. No protobuf reserialization takes place at any point.

To sign a payment request, same thing: you serialize PaymentDetails to binary, sign the binary you serialized, and store the binary you serialized in the PaymentRequest message.
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 29, 2014, 11:51:23 AM
Last edit: May 29, 2014, 04:24:00 PM by Nicolas Dorier
 #8

I think I understand why you don't understand why I need to reserialize.

You think the signature is applied on the "serialized_payment_details", which is, as you said, just a byte[] blob.
But the signature is not done on the "serialized_payment_details" but on the whole "paymentrequest".

To sign a payment request, same thing: you serialize PaymentDetails to binary, sign the binary you serialized, and store the binary you serialized in the PaymentRequest message.

So if I understand you correctly, your sentence is wrong. You don't sign the PaymentDetails' binaries. You need to sign the whole PaymentRequest message without the signature.
This mean that after you store the payment details in the PaymentRequest, you need to serialize the PaymentRequest, get the signature, deserialize, insert the signature, reserialize.
On the other hand, the receiver must : deserialize, remove the signature, serialize, check the signature against the whole binary.

This deserialize/serialize process is where lots of bug lies. Because some implementation of protobuf will remove (or add) default value.
This means that a particular implementation will seem to work well, but fail to do so with other implementation, thus giving implementation problems, hard to discover.
It is by no mean impossible to make a good implementation, but not as easy as it looks.

I added some test vectors that should test against that type of problems, but having a canonical way of expressing a serialized paymentrequest would be better.

The simplest example is the payment_details_version, Gavin's make it appear in the serialized PaymentRequest. My Protobuf version, by default, strip it.
I support both, but are both correct ? From the specification it seems both are ok.

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
May 29, 2014, 07:16:43 PM
 #9

Oh, right. I'm misremembering the spec. I'm not sure why it was done that way: we were aware of the canonicalisation problems, which is why the details field is binary in the first place.

I think most protobuf implementations will put fields into the serialized data if they were set at all, even if they match the default value. That's why Gavin's messages end up with the version field set. If your protobuf implementation is dropping fields that are set to the default value at load time, that's probably a bug (though from the perspective of most users, not an important one).
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 29, 2014, 07:33:50 PM
 #10

Oh, right. I'm misremembering the spec. I'm not sure why it was done that way: we were aware of the canonicalisation problems, which is why the details field is binary in the first place.

I think most protobuf implementations will put fields into the serialized data if they were set at all, even if they match the default value. That's why Gavin's messages end up with the version field set. If your protobuf implementation is dropping fields that are set to the default value at load time, that's probably a bug (though from the perspective of most users, not an important one).

I see, so I think you responded to my question.
I was wondering why Gavin was not serializing default values of the details (like details.expires), but serializing default value of the paymentrequest. (like pki_type or payment_details_version)

Imagine that I am receiving from the wire a PaymentRequest without the payment_details_version nor the pki_type for example. (default are 1 and "none")
If I understand, to correctly verify the signature, I must :
-deserialize PaymentRequest,
-zero the signature,
-reserialize PaymentRequest, but this time asking my framework to add the default fields
-check signature of the resulting bytes

What I did was :
-deserialize PaymentRequest,
-zero the signature,
-reserialize PaymentRequest, as it was before but this time asking my framework to add the default fields
-check signature of the resulting bytes

I was checking the signature against exactly what I received from the wire. But was clearly wrong isn't it ?

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Cryddit
Legendary
*
Offline Offline

Activity: 924
Merit: 1132


View Profile
May 29, 2014, 07:43:23 PM
 #11

I think it's generally a mistake to use anything of which there are different possible implementations (like protobufs) in software where you're taking signatures over the output.

You want to avoid calls to any library where someone might change an "implementation detail" like whether or not default values are encoded, because people making those decisions are not making them with the idea that absolutely reliably identical binary representation is what anybody wants.  In fact, that's an irrelevant question to them; they're implementing protobufs specifically so that people "don't have to worry about" such piffling details as binary representation, and we're actively subverting their intent by introducing a dependency on the exact binary  details.

Honestly, if you want to depend on a binary format, and a signature of that binary format, my estimation is that you should be including stdio, not protobufs.



Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 29, 2014, 08:04:56 PM
 #12

I think it's generally a mistake to use anything of which there are different possible implementations (like protobufs) in software where you're taking signatures over the output.

You want to avoid calls to any library where someone might change an "implementation detail" like whether or not default values are encoded, because people making those decisions are not making them with the idea that absolutely reliably identical binary representation is what anybody wants.  In fact, that's an irrelevant question to them; they're implementing protobufs specifically so that people "don't have to worry about" such piffling details as binary representation, and we're actively subverting their intent by introducing a dependency on the exact binary  details.

Honestly, if you want to depend on a binary format, and a signature of that binary format, my estimation is that you should be including stdio, not protobufs.





It's not so much the question of whether there is different implementation.
The question is more about : If I receive a requestpayment with missing (optional) fields. Should the signed data include these missing fields or not. (see my previous response)
Depending on the implementation of protobuf we are using, by default, this is the case... or not. But the question remains the same.

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Cryddit
Legendary
*
Offline Offline

Activity: 924
Merit: 1132


View Profile
May 29, 2014, 08:41:53 PM
 #13

We have made a mistake in choosing an implementation where that question (whether defaults are encoded or not) even makes any sense.

We are using something that isn't a direct command to write out a specific binary format, and then we are introducing a dependency on its specific binary form.  Bzzzt, wrong answer.

Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
May 30, 2014, 10:37:10 AM
 #14

Imagine that I am receiving from the wire a PaymentRequest without the payment_details_version nor the pki_type for example. (default are 1 and "none")
If I understand, to correctly verify the signature, I must :
-deserialize PaymentRequest,
-zero the signature,
-reserialize PaymentRequest, but this time asking my framework to add the default fields
-check signature of the resulting bytes

My understanding is that in this case they would NOT be in the re-serialized data, because they were not in the original. And the protobuf library does follow this (iirc).

But it sounds like you're saying you tested this, and in fact the default Google implementations put default values onto the wire, even if they weren't in the original protobuf? Why would they work this way? It's not a behaviour I know about.
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
May 30, 2014, 10:47:37 AM
 #15

We are using something that isn't a direct command to write out a specific binary format, and then we are introducing a dependency on its specific binary form.  Bzzzt, wrong answer.

No, using protobufs is the correct approach. We need an extensible format that has good cross language support. If we rolled something from scratch, we'd end up creating something that looks basically the same as protobufs, but perhaps is better specified in terms of details like when the implementations should write default fields. However, given that, we may as well just impose some additional rules that protobuf implementations used in BIP70 should follow and save everyone a lot of time, given that it's easier to patch a library than write one from scratch.

The BIP70 spec does insist that fields are serialized in numerical order. Once we figure out what's going on here exactly (I suspect a .NET protobuf implementation that doesn't match Google's), we can just update BIP70 to insist on a particular handling of optional fields with a default. This is still much, much easier than defining some new format just for Bitcoin.

Nicolas, which protobuf library are you using? Let's just go check the code.
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 30, 2014, 11:06:21 AM
 #16

We are using something that isn't a direct command to write out a specific binary format, and then we are introducing a dependency on its specific binary form.  Bzzzt, wrong answer.

No, using protobufs is the correct approach. We need an extensible format that has good cross language support. If we rolled something from scratch, we'd end up creating something that looks basically the same as protobufs, but perhaps is better specified in terms of details like when the implementations should write default fields. However, given that, we may as well just impose some additional rules that protobuf implementations used in BIP70 should follow and save everyone a lot of time, given that it's easier to patch a library than write one from scratch.

The BIP70 spec does insist that fields are serialized in numerical order. Once we figure out what's going on here exactly (I suspect a .NET protobuf implementation that doesn't match Google's), we can just update BIP70 to insist on a particular handling of optional fields with a default. This is still much, much easier than defining some new format just for Bitcoin.

Nicolas, which protobuf library are you using? Let's just go check the code.

It is not so much a problem of library, I am using protobuf-net, and can ask it to do whatever I want, this is a solid library.
My incomprehension is only that : The BIP70 spec does insist that fields are serialized in numerical order, your sentence is true ONLY for the data to sign.
The source of the headache is that the proto schema, define "optional" fields that are not "optional" for the signature.

The proposition I have is to remove the "optional" from payment_details_version of PaymentRequest from the schema.
In this way the data that is signed is the same than the data you receive from the wire and no headache happens.

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
May 31, 2014, 04:39:55 PM
 #17

Anyway, I made a pull request with the tests that check for such pitfall : https://github.com/bitcoin/bips/pull/69

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
June 02, 2014, 05:46:30 AM
 #18

The version is optional! If missing it's assumed to be 1, but that doesn't mean it has to appear when reserializing. Changing optional to required would not be compatible.

We need to figure out which library is doing the wrong thing and fix it. Perhaps the issue is the PHP implementation instead? That would not really surprise me.
Nicolas Dorier (OP)
Hero Member
*****
Offline Offline

Activity: 714
Merit: 662


View Profile
June 02, 2014, 09:19:29 AM
Last edit: June 02, 2014, 09:30:48 AM by Nicolas Dorier
 #19

Mike Hearn I think I did not explained the problem well enough.

I quote :
Quote
Signature : digital signature over a hash of the protocol buffer serialized variation of the PaymentRequest message, with all fields serialized in numerical order (all current protocol buffer implementations serialize fields in numerical order) and signed using the public key in pki_data. Before serialization, the signature field must be set to an empty value so that the field is included in the signed PaymentRequest hash but contains no data.

Note the "with all fields serialized in numerical order". This means that even if the payment from the wire does not include the version (because optional), it should still be serialized to compute the signature.
This is the tricky part, easy to get wrong.

There is no bug of library, just default behavior that are not the same. These default behavior (if you are not careful) will make you appear your implementation works when it will not interoperate with other.
The default behavior in question is whether the library will serialize in the wire a detailversion with value of 1 or not.

In the pull request, you will see that : payreq1_without_detailversion.paymentrequest and payreq1.paymentrequest, both have (and should have) the same signature.
My guess is that every single implementation get one of these wrong. (Except if I misunderstood the signature part in the bip)

Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1134


View Profile
June 02, 2014, 12:50:51 PM
 #20

My reading of what the BIP says is that the fields that are serialized, are serialized in numerical order. Not that all fields including optional fields must be redundantly written to the wire.

Like I said, someone (i.e. one of us) is going to have to sit down and see which library is getting this wrong, because it shouldn't be something that developers have to care about.
Pages: [1] 2 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!