Bitcoin Forum
Author Topic: What would you change about the Bitcoin protocol?  (Read 12854 times)
oskar (OP)
Newbie
*
Offline Offline

Activity: 9
Merit: 0


View Profile
March 10, 2011, 09:54:15 PM
 #21

- use something like sha512(data) + whirlpool(data) + RIPEMD320(data) + GOST(data) + what_else_good_hash_out_there( data ) , where + is concatenation instead of sha256(sha256(data)).

Why make life of KGB, NSA, FED and lol ArtFortz any easier than it could be?
If you just mean to make it take longer to generate, why can't the protocol just gradually increment the number of iterations of SHA256 you must perform?

If you mean to make it less likely for hash vulnerabilities to be found, I think it may be better to just specify the hash algorithm that is used, so the network can easily move to a new algorithm if SHA256 proves to be insecure (as it eventually will be).

EDIT: The reason I think it may not be a good idea to use such a large variety of hash algorithms is that it would make it harder to create an implementation of bitcoin. If I want to write a bitcoin client in javascript, I can easily find a JS library with SHA256 -- but finding one with whirlpool, RIPEMD320, etc, may prove difficult.
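To make the two options concrete, here is a minimal Python sketch comparing the current sha256(sha256(data)) with a concatenation of several digests. Whirlpool, RIPEMD320 and GOST are not reliably available in Python's hashlib, so the algorithm list below is a stand-in (an assumption for illustration, not the proposal's exact set).

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """The construction Bitcoin uses today: sha256(sha256(data))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

# Stand-in algorithms (assumption): chosen only because hashlib always
# ships them; they are NOT the ones named in the proposal.
STAND_IN_ALGORITHMS = ("sha512", "sha3_256", "blake2b")

def concatenated_hash(data: bytes, algorithms=STAND_IN_ALGORITHMS) -> bytes:
    """Hash the same data with every algorithm and join the digests."""
    return b"".join(hashlib.new(name, data).digest() for name in algorithms)

if __name__ == "__main__":
    sample = b"block header bytes"
    print(len(double_sha256(sample)), "bytes for double SHA-256")
    print(len(concatenated_hash(sample)), "bytes for the concatenation")
```

Note the trade-off this makes visible: the concatenated digest is several times longer, and every implementation needs every algorithm in the list.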
comboy
Sr. Member
****
Offline Offline

Activity: 247
Merit: 252



View Profile
March 10, 2011, 10:43:12 PM
 #22

I would:
- use BERT for binary serialisation
Nice one.
- use YAML for text serialisation
As much as I like it for config files, I think JSON is more straightforward when it comes to parsing.

- use something like sha512(data) + whirlpool(data) + RIPEMD320(data) + GOST(data) + what_else_good_hash_out_there( data ) , where + is concatenation instead of sha256(sha256(data)).

And this one is beautifully interesting. Why not put every strong hash in existence in there? Even if one fails totally, we're still good. It would also take longer before mining centralizes, although in the long run it would probably end up with ASICs too. It takes some simplicity out of the design and makes implementation more complex, but it works. I like this idea very much.

What I would change (already mentioned earlier): I would lower the expected time between blocks. It seems to me that the BTC design takes growing computational power into account, but ignores the fact that the network is speeding up too. I really don't think that with current connection speeds 10 minutes is needed for propagation.

Variance is a bitch!
grondilu
Legendary
*
Offline Offline

Activity: 1288
Merit: 1076


View Profile
March 11, 2011, 06:01:19 PM
 #23


The protocol could be more permissive about what kind of key could be used.  Basically it could accept not just elliptic curves, but any openssl supported key.

realnowhereman
Hero Member
*****
Offline Offline

Activity: 504
Merit: 502



View Profile
April 26, 2011, 03:35:12 PM
 #24

    My grumbles (don't read this as aggressive, I love bitcoin, but you did ask):

  • Big endian would have been nicer, just to keep with the tradition set by most other TCP/IP protocols; ntohs() et al. sort out the swapping for us anyway.  The Internet Protocol spec requires it for all packet headers, as do many other protocols.
  • Even if not big endian, it would have been nice to be consistent.  There are both big and little endian values in the bitcoin protocol, which makes it necessary to check the protocol documentation to be sure, instead of just remembering one rule.
  • Putting a locally-determined address and port number in an application-level message was a bad idea.  Local applications don't know what their globally visible address is -- they can be behind NAT.  It's not like the remote peer even needs it -- that information is handed over by the operating system when connection is established.  FTP has demonstrated for years what a nightmare this is.  The official client has to jump through "whatsmyip" hoops to get this information, and it was completely unnecessary.  If we're talking single points of failure, as an attacker I'd go for whatsmyip (et al) and bitcoin is in trouble.
  • The server half of a connection shouldn't speak first.  It makes life easy for sniffers.
  • Is verack even necessary?  Connect.  Client says "I speak version 10".  If the server is willing to speak "version 10" it can answer "I will speak version 10" (regardless of its true version); if it is not then it can say "I will speak version 5", the client can decide if it is willing to speak version 5 and continue or hang up.  verack isn't necessary and makes start up more complicated than needed.
  • Did we really need RIPE-MD and SHA256?  If you want 160 bits of hash, then just truncate the SHA-256 hash.  The output bits are meant to be evenly distributed, so there shouldn't be any grouping issues.
  • Why use double SHA-256 for the message checksum?  Checksums are there to ensure that data isn't unintentionally corrupted from A to B.  That checksum doesn't need to be cryptographically secure.  Even if it were -- what advantage is there to double hashing for a checksum?
  • Block download should have been most recent first.  Each peer must know what its current block chain tips are.  Those should be requestable by a command.  Then getblocks should have sent the one requested and then its parent, then its parent, etc, etc, until we hit the genesis block.  For comparison, see how git stores its "branches" -- the branch is simply a pointer to the commit at the head of that branch.
  • No thought seems to have been given to a single multi-user system -- two users of one computer with two wallets that must each remain private from the other.  The main bitcoin server should be started up during boot, and it should be responsible for making itself identical to any other arbitrary node and the information it holds is just as public.  It would be advantageous to run a local copy simply for speed (and generation). Then, a wallet/transaction client would connect to that (or any other node) to perform transactions.  There are no commands to support such a light client -- e.g. requesting transaction validation/confirmation.
  • There is no way to query a node for the transactions it has queued.  If I've been disconnected for a long time, I can use getblocks to find out what I've missed.  There is no way to find out what transaction broadcasts I've missed.  This is relevant for early confirmation indications.  Instead of waiting ten minutes for confirmation, I can at least get a hint that the transaction is queued by my peers.
  • 64-bit maximum resolution for VarInt type storage?  The message header will only allow you to send 32-bits of payload; making it impossible that any particular vector in a message will ever need 64-bits to specify its length.  On the same theme: the messages are limited to sizes that are way smaller than 32-bit lengths.
  • Considering how freely 64-bit fields have been chucked around, the time fields are 32 bits in some places.  64-bit times are useful.
  • ... but not for the version message... which uses 64 bits to store the number of seconds since the unix epoch.  Why?  That's 584 billion years of range.  Why not have specified it as number of microseconds past the epoch, in case the extra accuracy is ever useful?  That'll still get you 584 thousand years.
  • Why break the version field on base-10 boundaries?  Base-16 boundaries make everything easier to process (not needing division for display) and read in a hex dump.
  • Why specify transaction values in "tens of nano coins"?  Again: a base-2 split would be better.  A fixed-point base-2 unit doesn't suffer strange rounding errors from 2-to-10 conversion.  I haven't looked but I'll bet the bitcoin client is filled with loads of "convertNanoCoinsToCoins64BitWithRounding()" type functions.  Those special rounding rules have to be implemented by all future clients now.
  • Why is there no field in the version message for specifying the application version?  Application and protocol versions are different things.  As more implementations of the client appear, this is important.  What if version 5.6 of SomeNewClient is widely distributed but has a bug in it?  That bug could be worked around if only we knew that we were talking to version 5.6 of SomeNewClient.
  • The services field is underused.  What about flags to say whether a node is a generator, whether it will accept transactions, whether it will broadcast transactions, whether it keeps a peer directory (for the addr message), whether it keeps a full block chain or just headers, whether it should be noted down as a seednode?
  • It's possible to request only the header of a block, but not to request only the body.  A headers-only client has to download a load of bytes it already has when it wants to look at the chain in detail.
  • Perhaps I've misunderstood, but it seems that getblocks and getheaders both include the protocol version.  Don't we already know the protocol version?
  • Again: I haven't looked, but doesn't the alert message allow a DoS?  It's non-trivial to verify a signature, and presumably every alert message has to be checked for validity?  What's to stop an attacker sending broken alert messages all through the network, continuously?  That'll use up plenty of CPU.

Phew. :-)

I'm sure I'm wrong about a good proportion of those, but they're what I thought of while reading the protocol spec.
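The arithmetic behind the time-field point can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; it assumes unsigned counters and a 365.25-day year.

```python
# Assumption: a Julian year of 365.25 days; the exact year length barely
# changes the result at this scale.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

def years_covered(bits: int, ticks_per_second: int = 1) -> float:
    """Years an unsigned counter of `bits` bits can count before wrapping."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_YEAR

if __name__ == "__main__":
    print(f"32-bit seconds:      {years_covered(32):,.0f} years")
    print(f"64-bit seconds:      {years_covered(64):,.0f} years")
    print(f"64-bit microseconds: {years_covered(64, 10**6):,.0f} years")
```

So a 64-bit count of seconds covers roughly 584 billion years, a 64-bit count of microseconds roughly 584 thousand, and a 32-bit count of seconds only about 136.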

1AAZ4xBHbiCr96nsZJ8jtPkSzsg1CqhwDa
BitterTea
Sr. Member
****
Offline Offline

Activity: 294
Merit: 250



View Profile
April 26, 2011, 04:08:34 PM
 #25

Wow. Most constructive first post ever. Thanks, realnowhereman.

Or is satoshi airing his regrets? :)
grondilu
Legendary
*
Offline Offline

Activity: 1288
Merit: 1076


View Profile
April 26, 2011, 04:45:37 PM
 #26

Wow. Most constructive first post ever. Thanks, realnowhereman.

+1

The guy has never posted here and yet he seems to know the code in full details.  Amazing.

I very much wish Satoshi could answer those questions.

jpsoto
Newbie
*
Offline Offline

Activity: 10
Merit: 1


View Profile
April 26, 2011, 05:47:51 PM
 #27

Satoshi ... please, it would be nice to know your position
BitterTea
Sr. Member
****
Offline Offline

Activity: 294
Merit: 250



View Profile
April 26, 2011, 06:02:02 PM
 #28

Satoshi hasn't visited the forums since December, I wouldn't count on an answer.

I just hope that someday, when Bitcoin has either succeeded or failed, Satoshi can reveal him/her/themselves and take credit for his/her/their work (if he/she/they so choose).
Luke-Jr
Legendary
*
expert
Offline Offline

Activity: 2576
Merit: 1186



View Profile
April 26, 2011, 06:14:55 PM
 #29

Would use INT128 so it can be claimed that Bitcoin is 'infinitely divisible' (well that's near enough for most people) as a selling point.
Why not just store numbers in packets as ASCII characters? That would eliminate both length restrictions as well as the above-mentioned endianness concerns. Bittorrent's bencode stores them this way for this reason.
It doesn't eliminate divisibility problems. Obviously the solution is a varint fraction. First, you have the numerator, made up of a variable number of 7-bit sequences (using the high bit for "more bits to follow"). Then, the same encoding for the denominator. That way you can have exactly 1⁄7 of a bitcoin... ;)
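For the curious, the varint fraction described above could look roughly like this in Python. The post does not say which 7-bit group goes first, so least-significant-group-first (LEB128 style) is an assumption here, as are all the function names.

```python
from fractions import Fraction

def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as 7-bit groups; high bit = more follows."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)   # more groups follow
        else:
            out.append(low7)          # final group, high bit clear
            return bytes(out)

def decode_varint(buf: bytes, pos: int = 0):
    """Return (value, next_position) for the varint starting at `pos`."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos

def encode_fraction(amount: Fraction) -> bytes:
    """Numerator varint followed by denominator varint."""
    return encode_varint(amount.numerator) + encode_varint(amount.denominator)

def decode_fraction(buf: bytes) -> Fraction:
    numerator, pos = decode_varint(buf)
    denominator, _ = decode_varint(buf, pos)
    return Fraction(numerator, denominator)

if __name__ == "__main__":
    wire = encode_fraction(Fraction(1, 7))
    print(wire.hex(), decode_fraction(wire))
```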

ribuck
Donator
Hero Member
*
Offline Offline

Activity: 826
Merit: 1039


View Profile
April 26, 2011, 06:22:44 PM
 #30

Give it a rest please, Luke-Jr. Decimal numbers work just fine for 99.9999999% of people.
jpsoto
Newbie
*
Offline Offline

Activity: 10
Merit: 1


View Profile
April 26, 2011, 06:23:49 PM
 #31

Satoshi hasn't visited the forums since December, I wouldn't count on an answer.

I just hope that someday, when Bitcoin has either succeeded or failed, Satoshi can reveal him/her/themselves and take credit for his/her/their work (if he/she/they so choose).

Really, this info is worrying ... Seemingly this forum is the sole way to coordinate and plan efforts.

Is Satoshi backing Bitcoin yet?

Maybe, senior members should drive Bitcoin's future with energy and confidence.
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1128


View Profile
April 26, 2011, 06:42:30 PM
 #32

  • The server half of a connection shouldn't speak first.  It makes life easy for sniffers.

This is already changed, though I don't think I care one way or another. The ports are fixed today so if for some reason you want to find BitCoin nodes you can just check if port 8333 is open.

  • Block download should have been most recent first.

That would mean you can't start verifying the chain until you have fully downloaded it, complicating the client significantly (you would have to store the chain then reverse it).

  • There is no way to query a node for the transactions it has queued.  If I've been disconnected for a long time, I can use getblocks to find out what I've missed.  There is no way to find out what transaction broadcasts I've missed.  This is relevant for early confirmation indications.  Instead of waiting ten minutes for confirmation, I can at least get a hint that the transaction is queued by my peers.

That would be nice but not hard to add.

  • Why is there no field in the version message for specifying the application version?  Application and protocol versions are different things.  As more implementations of the client appear, this is important.  What if version 5.6 of SomeNewClient is widely distributed but has a bug in it?  That bug could be worked around if only we knew that we were talking to version 5.6 of SomeNewClient.

Satoshi didn't give much thought to re-implementations. I am using the subVer field for this in my client. It's unused today except by very old clients, and it allows alert broadcasts to target it if necessary (though I don't have the keys and my code doesn't pay attention to alerts anyway).

  • The services field is underused.  What about flags to say whether a node is a generator, whether it will accept transactions, whether it will broadcast transactions, whether it keeps a peer directory (for the addr message), whether it keeps a full block chain or just headers, whether it should be noted down as a seednode?

It works OK. In practice today all nodes are the same. There's no way you can be just a peer directory, or whatever. Seed nodes are chosen by Gavin/other devs based on which nodes have a good track record of availability, it doesn't make sense to let people elect themselves for that role.

  • It's possible to request only the header of a block, but not to request only the body.  A headers-only client has to download a load of bytes it already has when it wants to look at the chain in detail.

It's only 80 bytes, not a big deal. In practice SPV clients need to download the full blocks after their wallet gains a key anyway. In future a pattern matching protocol will be needed to avoid that overhead.

  • Again: I haven't looked, but doesn't the alert message allow a DoS?  It's non-trivial to verify a signature, and presumably every alert message has to be checked for validity?  What's to stop an attacker sending broken alert messages all through the network, continuously?  That'll use up plenty of CPU.

Yes but you can easily DoS a node with bad transactions already.
realnowhereman
Hero Member
*****
Offline Offline

Activity: 504
Merit: 502



View Profile
April 26, 2011, 07:32:32 PM
 #33

  • The server half of a connection shouldn't speak first.  It makes life easy for sniffers.

This is already changed, though I don't think I care one way or another. The ports are fixed today so if for some reason you want to find BitCoin nodes you can just check if port 8333 is open.

That's good.

I'm actually trying to think more generally.  The default port number should be just that: a default.  It really shouldn't make any difference if someone paranoid wants to run it on a different port, in which case, it's nicer if the service doesn't identify itself to anyone who comes knocking.

  • Block download should have been most recent first.

That would mean you can't start verifying the chain until you have fully downloaded it, complicating the client significantly (you would have to store the chain then reverse it).

I'm not sure I agree with that.  What do you mean by "verifying it"?  (I'm only guessing here, I'm pretty new to Bitcoin, but I'm very familiar with git, and the two are remarkably similar).  The chain is verified when you hit the genesis block; but why does the genesis block have to be the hard coded one?  I could hard code any combination of block hash and count in and tell the client that the chain was valid if that block is found.  Further: I can't see any fundamental reason why the entire block chain is needed, surely the very old blocks where all transactions have been spent could be pruned at present -- in which case why can't they be pruned during download, and not downloaded at all?

This idea of "reversing" a chain is pretty crazy as well; the blocks point to their parents, not to their children, so the natural traversal direction is backwards.  If I were designing a structure to hold blocks, I'd have one of the fields in it be "Block *Parent", doesn't that tell you what the traversal direction should be?  Transactions also naturally want to verify backwards since they refer to past transactions not future transactions.
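A tiny Python sketch of the structure I have in mind. The `Block` class and its fields are purely illustrative (not the real client's data structure); the point is only that a parent pointer makes tip-to-genesis the natural walk.

```python
from dataclasses import dataclass
from typing import Iterator, Optional

@dataclass
class Block:
    """Hypothetical block record: each block only knows its parent."""
    name: str
    parent: Optional["Block"] = None

def walk_to_genesis(tip: Block) -> Iterator[Block]:
    """Follow parent pointers from the chain tip back to the genesis block."""
    node: Optional[Block] = tip
    while node is not None:
        yield node
        node = node.parent

if __name__ == "__main__":
    genesis = Block("genesis")
    tip = Block("block 2", Block("block 1", genesis))
    print([block.name for block in walk_to_genesis(tip)])
```

This is the same shape as a git branch: a pointer to the newest commit, with history reachable only through parents.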

  • There is no way to query a node for the transactions it has queued.  If I've been disconnected for a long time, I can use getblocks to find out what I've missed.  There is no way to find out what transaction broadcasts I've missed.  This is relevant for early confirmation indications.  Instead of waiting ten minutes for confirmation, I can at least get a hint that the transaction is queued by my peers.

That would be nice but not hard to add.

Good.  Here, my thoughts are that this could go some way toward mitigating the 10 minute wait for confirmation.  There is a pre-confirmation level if you know how many peers are holding your transaction for inclusion in a block.  Obviously it's not as valid as real confirmations, but it's a long uncomfortable gap between 0 and 1, when it's money on the line.

  • Why is there no field in the version message for specifying the application version?  Application and protocol versions are different things.  As more implementations of the client appear, this is important.  What if version 5.6 of SomeNewClient is widely distributed but has a bug in it?  That bug could be worked around if only we knew that we were talking to version 5.6 of SomeNewClient.

Satoshi didn't give much thought to re-implementations. I am using the subVer field for this in my client. It's unused today except by very old clients, and it allows alert broadcasts to target it if necessary (though I don't have the keys and my code doesn't pay attention to alerts anyway).

Well yes; the question was what would you change in the protocol.  I'd change that.  The fact that Satoshi didn't give much thought to reimplementations is the reason it's needed.  The existing monoculture of clients is simply not acceptable for a financial service.  It's not acceptable for email servers, DNS servers or bittorrent clients, why would it be okay for storing real money?

  • The services field is underused.  What about flags to say whether a node is a generator, whether it will accept transactions, whether it will broadcast transactions, whether it keeps a peer directory (for the addr message), whether it keeps a full block chain or just headers, whether it should be noted down as a seednode?

It works OK. In practice today all nodes are the same. There's no way you can be just a peer directory, or whatever. Seed nodes are chosen by Gavin/other devs based on which nodes have a good track record of availability, it doesn't make sense to let people elect themselves for that role.

"There's no way" at present.  My biggest problem with the bitcoin client is it is so monolithic.  It should be perfectly possible for a client to announce what services it supplies and only supply them.  The fact that there is only one bit in the services field kind of makes my point.

As to the idea that "Gavin/other devs" choose the seed nodes: what kind of peer to peer network is that?  I don't mind hints in the client, but it is really for the peers to decide if they are willing to act as seed nodes, not for developers to pick them.  I'd also envisage it only as a hint to other peers.  If I connect to some random peer and it has announced itself as a seed, I can make special note of that fact and use that node again in the future.  Similarly for answers to getaddr - I would pass seed nodes in preference to non seed nodes.  There is no advantage to pretending to be a seed node, so it's unlikely that people would lie.

  • It's possible to request only the header of a block, but not to request only the body.  A headers-only client has to download a load of bytes it already has when it wants to look at the chain in detail.

It's only 80 bytes, not a big deal. In practice SPV clients need to download the full blocks after their wallet gains a key anyway. In future a pattern matching protocol will be needed to avoid that overhead.

I'm sorry, but that's a pretty bad attitude -- if 80 bytes don't need sending, then don't send them.  80 bytes times the 120,000 blocks (approx. at time of writing) is 9.6 MB that doesn't need downloading.  I'm thinking more about thin clients anyway, and I am particularly interested in maintaining a sparsely populated chain, with the thin client downloading as little as possible at all times.  My idea is that a thin client only needs to verify transactions for its own addresses, in which case a headers-only chain can be traversed backwards only grabbing full blocks along that transaction path.  The full block chain becomes more a local cache than a necessity (and a flag to indicate that that's how the client is behaving in the services field... oops, but there isn't one).

  • Again: I haven't looked, but doesn't the alert message allow a DoS?  It's non-trivial to verify a signature, and presumably every alert message has to be checked for validity?  What's to stop an attacker sending broken alert messages all through the network, continuously?  That'll use up plenty of CPU.

Yes but you can easily DoS a node with bad transactions already.

Okay.

I'm not trying to completely slag off bitcoin; there is a lot to admire and it's exciting and new enough a project that it's got me really interested.  That interest has manifested in my working on an independent bitcoin client (as I'm sure are many others).  At the moment though, I'm simply scratching an itch, and this thread was a place to write down those thoughts I've had during my researches.

1AAZ4xBHbiCr96nsZJ8jtPkSzsg1CqhwDa
Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1128


View Profile
April 26, 2011, 10:32:16 PM
 #34

Yes sorry, to be clear the points I didn't reply to I mostly agree with :-)

Quote
What do you mean by "verifying it"?

Checking that the block contents are valid, ie, that the blocks connect together into a chain that follows the rules and that the transactions within the blocks connect together correctly (are properly signed etc).

You can't do that backwards because then transactions would constantly be referring to dependencies you haven't seen yet.

Quote
I could hard code any combination of block hash and count in and tell the client that the chain was valid if that block is found.

You're thinking of thin clients that don't do full block chain verifications. Then it could be worked backwards, sure. The protocol was created for full nodes to talk to each other though, and they have to store the entire chain.

It's true that theoretically the chain can be pruned as long as you're willing to take the risk the chain will never fork beyond that point. Transactions that are buried under enough blocks can be deleted as long as the merkle branches of the remaining transactions are preserved.
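To illustrate what "preserving the merkle branches" buys you, here is a rough Python sketch. It assumes the Bitcoin-style tree (double SHA-256 over concatenated pairs, last hash duplicated on odd levels); the leaf hashes are toy values and the function names are made up for this example.

```python
import hashlib

def dsha(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def _next_level(level):
    """Pair up hashes, duplicating the last one when the count is odd."""
    if len(level) % 2:
        level = level + [level[-1]]
    return level, [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    if not leaves:
        raise ValueError("need at least one leaf hash")
    level = list(leaves)
    while len(level) > 1:
        _, level = _next_level(level)
    return level[0]

def merkle_branch(leaves, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        padded, level = _next_level(level)
        branch.append(padded[index ^ 1])  # the neighbour in this pair
        index //= 2
    return branch

def verify_branch(leaf, branch, index, root):
    """Recompute the root from one leaf plus its branch; no other leaves needed."""
    current = leaf
    for sibling in branch:
        current = dsha(sibling + current) if index & 1 else dsha(current + sibling)
        index //= 2
    return current == root

if __name__ == "__main__":
    txids = [dsha(bytes([i])) for i in range(5)]   # toy transaction hashes
    root = merkle_root(txids)
    proof = merkle_branch(txids, 3)
    print(verify_branch(txids[3], proof, 3, root))
```

A node that keeps the header (which commits to the root), one transaction and its branch can drop every other transaction in the block and still prove that transaction was included.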

Quote
As to the idea that "Gavin/other devs" choose the seed nodes: what kind of peer to peer network is that?

Seed nodes are only used if IRC bootstrapping fails for some reason. Nodes keep a record of advertised addresses and try to reuse them in future regardless of if they are in the seed list or not. The idea is to allow you to join the network even if the pre-arranged rendezvous point is unreachable and the nodes you've seen before are offline. That's why they are selected based on long term availability and hard coded into the client.

Quote
I'm thinking more about thin clients anyway, and I am particularly interested in maintaining a sparsely populated chain, with the thin client downloading as little as possible at all times.  My idea is that a thin client only needs to verify transactions for its own addresses, in which case a headers-only chain can be traversed backwards only grabbing full blocks along that transaction path.

Thin clients cannot verify transactions because they cannot prove there is no double spend. So thin clients need at least some headers, but they only need to download the contents of blocks to find transactions that interest them. A more efficient (but less private) way is to ask peers with full copies to do that selection for them, like polling a mailbox, but the protocol does not support that right now.

Re-downloading 80 bytes is nothing compared to the size of a full block, even today. Avoiding downloading the 80 bytes in the first place, sure, that's useful and can be achieved by bootstrapping from a recent checkpoint if the wallet is empty.

ByteCoin
Sr. Member
****
expert
Offline Offline

Activity: 416
Merit: 277


View Profile
April 26, 2011, 10:46:11 PM
 #35

I think realnowhereman's suggestions for improvement are excellent and comprehensive.

The only point I disagree with is that the fundamental unit of currency being "tens of nanocoins" is fine. I find it very surprising that so many people have trouble with converting between a decimal representation of a number of bitcoin and the number of fundamental units.
The mathematics is so elementary.
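A minimal Python sketch of that conversion, assuming the fundamental unit is 10^-8 of a coin ("tens of nanocoins") and non-negative amounts; the helper names are hypothetical. Going through Decimal rather than float sidesteps the base-2 rounding issue entirely.

```python
from decimal import Decimal

UNITS_PER_COIN = 10 ** 8   # one unit = ten nanocoins

def coins_to_units(text: str) -> int:
    """Parse a decimal coin amount such as '0.1' into integer base units."""
    units = Decimal(text) * UNITS_PER_COIN
    if units != units.to_integral_value():
        raise ValueError("amount has more than 8 decimal places")
    return int(units)

def units_to_coins(units: int) -> str:
    """Format non-negative integer base units as a decimal coin amount."""
    if units < 0:
        raise ValueError("this sketch only handles non-negative amounts")
    whole, fraction = divmod(units, UNITS_PER_COIN)
    return f"{whole}.{fraction:08d}"

if __name__ == "__main__":
    print(coins_to_units("0.1"))
    print(units_to_coins(coins_to_units("0.1") + coins_to_units("0.2")))
```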

I'd like to add a suggestion that we revamp the Hash160ToAddress via the Base58Encoding so that leading zeros are not handled in the current peculiar fashion. It's hard to approve of a function that takes a fixed length input and outputs a more human friendly "address" which can vary in length between 27 characters and 34 characters inclusive.
Code:
1111111111111111111114oLvT2 
1ByteCosnsUNJun4KL3HSt1NfFdXpzoRTy
are both valid addresses. It makes the first step of address verification on websites vastly more complex than it needs to be.
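The length variation comes from the leading-zero rule: each leading zero byte becomes a literal "1" instead of taking part in the base-58 number. A rough Python sketch of the encoding as I understand it (version byte, 20-byte hash, first 4 bytes of a double SHA-256 as checksum); the function name is mine, not the client's.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def hash160_to_address(hash160: bytes, version: int = 0) -> str:
    """Base58Check-encode a 20-byte hash, sketching the current behaviour."""
    data = bytes([version]) + hash160
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    # Treat the whole thing as one big-endian number and convert to base 58.
    number = int.from_bytes(raw, "big")
    digits = ""
    while number:
        number, remainder = divmod(number, 58)
        digits = ALPHABET[remainder] + digits

    # The peculiar part: every leading zero byte is emitted as a '1'.
    leading_zero_bytes = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zero_bytes + digits

if __name__ == "__main__":
    short = hash160_to_address(bytes(20))          # all-zero hash
    long_ = hash160_to_address(b"\xff" * 20)       # all-ones hash
    print(len(short), short)
    print(len(long_), long_)
```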


  • Block download should have been most recent first.
That would mean you can't start verifying the chain until you have fully downloaded it, complicating the client significantly (you would have to store the chain then reverse it).
There's nothing stopping you downloading the chain in reverse one by one when you get notified of a new block. Your software could also store the chain in reverse and process it with no increased difficulty. There should be support in the network protocol for clients efficiently downloading the chain in reverse. As well as "start height", "version" could return the hash of the last received block to facilitate downloading most recent blocks first.

Not downloading and in fact discarding and forgetting old blocks is of course a central feature of my "balance sheets" proposal.

ByteCoin

PS
You can't [verify the block chain] backwards because then transactions would constantly be referring to dependencies you haven't seen yet.
That's just a programming problem, you keep track of the unresolved dependencies and check them when you've worked back enough blocks.

It's true that theoretically the chain can be pruned as long as you're willing to take the risk the chain will never fork beyond that point. 
That's a reality we already live with. There's little point verifying the block chain before block 105000 due to the "checkpointing" in the current (0.3.20.2) version of bitcoin (see main.cpp).
xf2_org
Member
**
Offline Offline

Activity: 98
Merit: 13


View Profile
April 27, 2011, 03:32:25 AM
 #36

    My grumbles (don't read this as aggressive, I love bitcoin, but you did ask):

  • Big endian would have been nicer, just to keep with the tradition set by most other TCP/IP protocols ntohs(), et al sort out the swapping for us anyway.  The Internet Protocol spec requires it for all packet headers, and many other protocols.

The overwhelming majority of all hosts that will use this protocol are little endian.

Naively suggesting big endian because it's "network byte order" is simply buying what the Oracle (ex. Sun) and other RISC folks are selling.  It makes zero engineering sense for bitcoin.

However, bitcoin should have been written like proper software, with bytesex conversions, but it was not.

Quote
  • Even if not big endian; it would have been nice to be consistent.  There are big and little endian values in the bitcoin protocol.  Making it necessary to check the protocol documentation to be sure, instead of just remembering one.

Where are the big endian values?  The network addresses are unavoidable, as are the SHA256 words.
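To show what the mix looks like on the wire, a small Python sketch with the struct module. As far as I know the port inside a network address is the big-endian exception while ordinary integer fields are little endian; treat the field choices here as illustrative rather than a spec reference.

```python
import struct

def le_u32(value: int) -> str:
    """Hex of a little-endian unsigned 32-bit integer (typical field)."""
    return struct.pack("<I", value).hex()

def be_u16(value: int) -> str:
    """Hex of a big-endian ("network order") unsigned 16-bit integer."""
    return struct.pack(">H", value).hex()

def le_u16(value: int) -> str:
    return struct.pack("<H", value).hex()

if __name__ == "__main__":
    print("uint32 1, little endian:", le_u32(1))
    print("port 8333, big endian:  ", be_u16(8333))
    print("port 8333, little endian:", le_u16(8333))
```

A parser that assumes one byte order for everything would read port 8333 as 36128, which is the kind of mistake a single consistent rule avoids.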

Quote
  • Putting a locally-determined address and port number in an application-level message was a bad idea.  Local applications don't know what their globally visible address is -- they can be behind NAT.  It's not like the remote peer even needs it -- that information is handed over by the operating system when connection is established.  FTP has demonstrated for years what a nightmare this is.  The official client has to jump through "whatsmyip" hoops to get this information, and it was completely unnecessary.  If we're talking single points of failure, as an attacker I'd go for whatsmyip (et al) and bitcoin is in trouble.

Mostly agreed.

Quote
  • The server half of a connection shouldn't speak first.  It makes life easy for sniffers.

This is fixed in 0.3.21.

This also enables accept filtering (TCP_DEFER_ACCEPT in Linux, there is also a BSD equivalent).

Quote
  • Is verack even necessary?  Connect.  Client says "I speak version 10".  If the server is willing to speak "version 10" it can answer "I will speak version 10" (regardless of its true version); if it is not then it can say "I will speak version 5", the client can decide if it is willing to speak version 5 and continue or hang up.  verack isn't necessary and makes start up more complicated than needed.

It's just a sharing of capabilities.

Quote
  • Did we really need RIPE-MD and SHA256?  If you want 160 bits of hash, then just truncate the SHA-256 hash.  The output bits are meant to be evenly distributed, so there shouldn't be any grouping issues.

A little late to change now Smiley

Quote
  • Why use double SHA-256 for the message checksum?  Checksums are there to ensure that data isn't unintentionally corrupted from A to B.  That checksum doesn't need to be cryptographically secure.  Even if it were -- what advantage is there to double hashing for a checksum?

Uses more CPU power, making it harder to generate a proof-of-work, I imagine.  Also a little late to change now.

Quote
  • Block download should have been most recent first.  Each peer must know what its current block chain tips are.  Those should be requestable by a command.  Then getblocks should have sent the one requested and then its parent, then its parent, etc, etc, until we hit the genesis block.  For comparison, see how git stores its "branches" -- the branch is simply a pointer to the commit at the head of that branch.

Interesting idea.

Quote
  • No thought seems to have been given to a single multi-user system -- two users of one computer with two wallets that must each remain private from the other.  The main bitcoin server should be started up during boot, and it should be responsible for making itself identical to any other arbitrary node; the information it holds is just as public.  It would be advantageous to run a local copy simply for speed (and generation).  Then, a wallet/transaction client would connect to that (or any other node) to perform transactions.  There are no commands to support such a light client -- e.g. requesting transaction validation/confirmation.

I wouldn't say "no thought."  satoshi can only do so much all by himself.  There are tons of satoshi ideas still unimplemented...  Just getting the basic, ugly client out there obviously took a ton of work.

Patches welcome!  Smiley

Quote
  • There is no way to query a node for the transactions it has queued.  If I've been disconnected for a long time, I can use getblocks to find out what I've missed.  There is no way to find out what transaction broadcasts I've missed.  This is relevant for early confirmation indications.  Instead of waiting ten minutes for confirmation, I can at least get a hint that the transaction is queued by my peers.

This has privacy implications.

Quote
  • 64-bit maximum resolution for VarInt type storage?  The message header will only allow you to send 32-bits of payload; making it impossible that any particular vector in a message will ever need 64-bits to specify its length.  On the same theme: the messages are limited to sizes that are way smaller than 32-bit lengths.

Agreed.  Too late to change, though.
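
For anyone following along, the variable-length integer being discussed can be sketched roughly as below. This assumes the layout given in the protocol documentation (a marker byte, then a little-endian 16-, 32- or 64-bit value); the names `encode_varint` and `decode_varint` are made up for illustration and are not taken from the client.

```python
import struct
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in the 1/3/5/9-byte variable-length form."""
    if n < 0:
        raise ValueError("varint cannot be negative")
    if n < 0xFD:
        return struct.pack("<B", n)            # small values are stored in one byte
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)  # marker 0xfd + 16-bit little endian
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)  # marker 0xfe + 32-bit little endian
    if n <= 0xFFFFFFFFFFFFFFFF:
        return b"\xff" + struct.pack("<Q", n)  # marker 0xff + 64-bit little endian
    raise ValueError("value does not fit in 64 bits")


def decode_varint(buf: bytes) -> Tuple[int, int]:
    """Return (value, number_of_bytes_consumed) for a varint at the start of buf."""
    first = buf[0]
    if first < 0xFD:
        return first, 1
    fmt, size = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[first]
    return struct.unpack_from(fmt, buf, 1)[0], 1 + size
```

The 9-byte form is the one the quoted bullet objects to: a message whose header length field is 32 bits can never carry a vector long enough to need it.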

Quote
  • Considering how freely 64-bit fields have been chucked around, the time fields are 32 bits in some places.  64-bit times are useful.

100% agreed... sigh

Quote
  • ... but not for the version message... which uses 64 bits to store the number of seconds since the unix epoch.  Why?  That's 584 billion years of resolution.  Why not have specified it as number of microseconds past the epoch, in case the extra accuracy is ever useful?  That'll still get you 584 thousand years.

Who knows Smiley  As you note, the time storage is all over the place.
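
The figures in the quoted bullet are easy to check. A quick sketch, assuming a year of 365.25 days (my choice for the estimate, not something from the spec); `range_in_years` is a throwaway helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed year length for a rough estimate


def range_in_years(bits: int, ticks_per_second: int) -> float:
    """Years covered by an unsigned counter of the given width and tick rate."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print("64-bit seconds:      %.3e years" % range_in_years(64, 1))
    print("64-bit microseconds: %.3e years" % range_in_years(64, 10 ** 6))
    print("32-bit seconds:      %.1f years" % range_in_years(32, 1))
```

So 64 bits of seconds is on the order of hundreds of billions of years, 64 bits of microseconds is still hundreds of thousands of years, and an unsigned 32-bit seconds field runs out after roughly 136 years.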

Quote
  • Why break the version field on base-10 boundaries?  Base-16 boundaries make everything easier to process (not needing division for display) and read in a hex dump.

Some software does it like that, so perhaps satoshi simply copied an existing encoding scheme.
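
To illustrate the complaint in the quoted bullet: the sketch below assumes the version integer packs its parts on decimal boundaries (so 0.3.21 would be 32100), and compares it with a hypothetical one-byte-per-component layout. Both function names and the hex layout are assumptions for illustration only.

```python
def split_decimal(version: int):
    """Split a base-10 packed version: major*10**6 + minor*10**4 + revision*10**2 + build."""
    major, rest = divmod(version, 1000000)
    minor, rest = divmod(rest, 10000)
    revision, build = divmod(rest, 100)
    return major, minor, revision, build


def split_hex(version: int):
    """Split a hypothetical base-16 packed version with one byte per component."""
    return (
        (version >> 24) & 0xFF,
        (version >> 16) & 0xFF,
        (version >> 8) & 0xFF,
        version & 0xFF,
    )


if __name__ == "__main__":
    print(split_decimal(32100))   # decimal packing needs three divisions
    print(split_hex(0x00031500))  # byte packing needs only shifts and masks
```

The byte-packed form is also readable directly in a hex dump (`00 03 15 00`), which is the other half of the point being made.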

Quote
  • Why specify transaction values in "tens of nano coins"?  Again: a base-2 split would be better.  A fixed-point base-2 unit doesn't suffer strange rounding errors from 2-to-10 conversion.  I haven't looked but I'll bet the bitcoin client is filled with loads of "convertNanoCoinsToCoins64BitWithRounding()" type functions.  Those special rounding rules have to be implemented by all future clients now.

Voluminous discussions about rounding on the forums Smiley

Quote
  • Why is there no field in the version message for specifying the application version?  Application and protocol versions are different things.  As more implementations of the client appear, this is important.  What if version 5.6 of SomeNewClient is widely distributed but has a bug in it?  That bug could be worked around if only we knew that we were talking to version 5.6 of SomeNewClient.

You're not the first to point this out, either Smiley

Quote
  • The services field is underused.  What about flags to say whether a node is a generator, whether it will accept transactions, whether it will broadcast transactions, whether it keeps a peer directory (for the addr message), whether it keeps a full block chain or just headers, whether it should be noted down as a seednode?

Yep, this will be used for future enumeration of capabilities.

Quote
  • It's possible to request only the header of a block, but not to request only the body.  A headers-only client has to download a load of bytes it already has when it wants to look at the chain in detail.

Yep (but don't forget merkle details).  Client mode is a key interest for us all, to ensure the bitcoin network remains usable as transaction rates increase.

Quote
  • Perhaps I've misunderstood, but it seems that getblocks and getheaders both include the protocol version.  Don't we already know the protocol version?

Welcome to the Department of Redundant Redundancies Smiley

Quote
  • Again: I haven't looked, but doesn't the alert message allow a DoS?  It's non-trivial to verify a signature, and presumably every alert message has to be checked for validity?  What's to stop an attacker sending broken alert messages all through the network, continuously?  That'll use up plenty of CPU.

True.

Quote
I'm sure I'm wrong about a good proportion of those, but they're what I thought of while reading the protocol spec.

Nope, not wrong...  bitcoin protocol is... unique.

realnowhereman
April 27, 2011, 08:27:10 AM
 #37

    My grumbles (don't read this as aggressive, I love bitcoin, but you did ask):

  • Big endian would have been nicer, just to keep with the tradition set by most other TCP/IP protocols; ntohs() et al. sort out the swapping for us anyway.  The Internet Protocol spec requires it for all packet headers, as do many other protocols.

The overwhelming majority of all hosts that will use this protocol are little endian.

Naively suggesting big endian because it's "network byte order" is simply buying what the Oracle (ex. Sun) and other RISC folks are selling.  It makes zero engineering sense for bitcoin.

However, bitcoin should have been written like proper software, with bytesex conversions, but it was not.

I hardly think I'm naive.  And I hardly see how endianness affects sales of Sun or Oracle equipment.  One or the other endianness could have been picked all those years ago (personally I think little endian would have been better -- it's much easier for little processors to do maths with), but what is, is.

C code can and should be written with no assumptions about the endianness of the system.  And I doubt there is very much difference between

x = c[0] << 8 | c[1];

and

x = c[1] << 8 | c[0];

from the processor's point of view.  Therefore "engineering sense" is not really a consideration.
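
The same point can be sketched in Python: reading bytes individually and shifting them gives the same result on any host, whichever byte order the wire format uses. `read_u16_be` and `read_u16_le` are illustrative names only.

```python
def read_u16_be(buf: bytes) -> int:
    """Decode a big-endian 16-bit value: most significant byte first."""
    return (buf[0] << 8) | buf[1]


def read_u16_le(buf: bytes) -> int:
    """Decode a little-endian 16-bit value: least significant byte first."""
    return (buf[1] << 8) | buf[0]


if __name__ == "__main__":
    wire = bytes([0x12, 0x34])
    # Neither function inspects the host's byte order, so both are portable.
    print(hex(read_u16_be(wire)), hex(read_u16_le(wire)))
```

Either decoder is one shift and one OR, which is why the choice of wire order costs nothing once the code stops assuming the host's layout.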

As to your assertion that the "overwhelming majority" of clients are little endian: I would suggest it is you who is being naive.  At present it is almost certainly true.  However, have a think about what endianness smartphones are, and how many billion more of them there will be than desktops in five years' time -- surely bitcoin should be aiming to be on every single one of them?  The endianness of the platform is irrelevant, though; it is simply a standard that network protocols use big endian.  There was no good engineering reason for bitcoin to break with that tradition.

Quote
  • Even if not big endian; it would have been nice to be consistent.  There are big and little endian values in the bitcoin protocol.  Making it necessary to check the protocol documentation to be sure, instead of just remembering one.

Where are the big endian values?  The network addresses are unavoidable, as are SHA256 words.

Network address, port number, and SHA256 are unavoidably big endian? There's your answer then.  To be consistent, everything else should have been big endian.

Quote
  • The server half of a connection shouldn't speak first.  It makes life easy for sniffers.

This is fixed in 0.3.21.

This also enables accept filtering (TCP_DEFER_ACCEPT in Linux, there is also a BSD equivalent).

Both good news.

Quote
  • Is verack even necessary?  Connect.  Client says "I speak version 10".  If the server is willing to speak "version 10" it can answer "I will speak version 10" (regardless of its true version); if it is not then it can say "I will speak version 5", the client can decide if it is willing to speak version 5 and continue or hang up.  verack isn't necessary and makes start up more complicated than needed.

It's just a sharing of capabilities.

I wasn't saying version was unnecessary, I'm saying that verack was unnecessary.  Therefore the additional complication of coding a handler for verack was unnecessary.  No information is passed in verack that couldn't have been exchanged another way.
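
A rough sketch of the two-message exchange proposed in the quoted bullet, with no separate acknowledgement step. This is a hypothetical design, not how the current client behaves; `server_reply` and `client_accepts` are invented names.

```python
def server_reply(client_version: int, server_max: int) -> int:
    """Version the server offers to speak.

    If the server can speak the client's version it echoes it back
    (regardless of its own true version); otherwise it offers its highest.
    """
    return min(client_version, server_max)


def client_accepts(offered: int, client_min: int, client_version: int) -> bool:
    """The client continues only if the offer is within the range it supports."""
    return client_min <= offered <= client_version


if __name__ == "__main__":
    offer = server_reply(client_version=10, server_max=5)
    print(offer, client_accepts(offer, client_min=5, client_version=10))
```

Everything either side needs is carried in the two version messages, so a client that dislikes the offer just hangs up.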

Quote
  • Did we really need RIPE-MD and SHA256?  If you want 160 bits of hash, then just truncate the SHA-256 hash.  They are meant to be evenly distributed, so there shouldn't be any grouping issues.

A little late to change now Smiley

Well that's true of a lot of my points.  But the question was "what would you change if you could?", not "what should we change?"

Nothing's ever too late to change of course, if you really want to.  Ready?

return (Block < 500000) ? ripemd(sha256(buffer)) : truncate(sha256(sha256(buffer)))
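
Spelled out a little more, that one-liner might look like the sketch below. The switch height of 500000 is just the number from the line above, and the function names are invented. RIPEMD-160 is not available in every Python/OpenSSL build, so that branch is guarded rather than assumed.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def truncated_double_sha256(data: bytes, length: int = 20) -> bytes:
    """Proposed replacement: first 160 bits (20 bytes) of SHA-256(SHA-256(data))."""
    return sha256(sha256(data))[:length]


def ripemd160_of_sha256(data: bytes) -> bytes:
    """Existing scheme: RIPEMD-160 over SHA-256(data)."""
    try:
        h = hashlib.new("ripemd160")
    except ValueError as exc:  # raised when the local OpenSSL lacks RIPEMD-160
        raise RuntimeError("RIPEMD-160 is unavailable in this build") from exc
    h.update(sha256(data))
    return h.digest()


def address_hash(data: bytes, block_height: int, switch_height: int = 500000) -> bytes:
    """Pick the hashing scheme by block height, as in the one-liner above."""
    if block_height < switch_height:
        return ripemd160_of_sha256(data)
    return truncated_double_sha256(data)
```

Both branches return 20 bytes, so nothing downstream of the hash would need to change size.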

Quote
  • Why use double SHA-256 for the message checksum?  Checksums are there to ensure that data isn't unintentionally corrupted from A to B.  That checksum doesn't need to be cryptographically secure.  Even if it were -- what advantage is there to double hashing for a checksum?

Uses more CPU power, making it harder to generate a proof-of-work, I imagine.  Also a little late to change now.

Which would be a valid point if we were talking about anything other than a checksum.  Most protocols simply sum the bytes to generate a checksum.  CRC would have been a step up.  A single SHA256 would have been overkill.  Double SHA256?  That's the site nuked from orbit.  All those smartphones with their limited batteries are not going to thank anyone for the double SHA256 they have to do on every message they receive.
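
For comparison, here are the three options side by side. This is only a sketch: it assumes the message checksum is the first four bytes of the double SHA-256 of the payload, as the protocol documentation describes, and the function names are mine.

```python
import hashlib
import zlib


def byte_sum_checksum(payload: bytes) -> int:
    """Simplest option: sum of all bytes, truncated to 32 bits."""
    return sum(payload) & 0xFFFFFFFF


def crc32_checksum(payload: bytes) -> int:
    """A step up: CRC-32, which also catches reordered bytes."""
    return zlib.crc32(payload) & 0xFFFFFFFF


def double_sha256_checksum(payload: bytes) -> bytes:
    """What the protocol does: first 4 bytes of SHA-256(SHA-256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


if __name__ == "__main__":
    msg = b"example payload"
    print(byte_sum_checksum(msg), hex(crc32_checksum(msg)), double_sha256_checksum(msg).hex())
```

All three produce 32 bits; the difference is how much work each one takes to get there, which is the battery argument above.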

Quote
  • No thought seems to have been given to a single multi-user system -- two users of one computer with two wallets that must each remain private from the other.  The main bitcoin server should be started up during boot, and it should be responsible for making itself identical to any other arbitrary node; the information it holds is just as public.  It would be advantageous to run a local copy simply for speed (and generation).  Then, a wallet/transaction client would connect to that (or any other node) to perform transactions.  There are no commands to support such a light client -- e.g. requesting transaction validation/confirmation.

I wouldn't say "no thought."  satoshi can only do so much all by himself.  There are tons of satoshi ideas still unimplemented...  Just getting the basic, ugly client out there obviously took a ton of work.

Patches welcome!  Smiley

I'm working on it.

My problem is that it isn't "extra work" that needed doing.  Just different work.  So "Satoshi can only do so much" isn't the argument.

Quote
  • There is no way to query a node for the transactions it has queued.  If I've been disconnected for a long time, I can use getblocks to find out what I've missed.  There is no way to find out what transaction broadcasts I've missed.  This is relevant for early confirmation indications.  Instead of waiting ten minutes for confirmation, I can at least get a hint that the transaction is queued by my peers.

This has privacy implications.

What?  I'm sorry but that's nonsense.

You're saying that transactions that would have been broadcast to me had I been connected to the network are private?  That I can't ask another node "what transactions should I be holding to add to a generated block" because of privacy concerns?  The whole of the blockchain is public knowledge, the only privacy that Bitcoin supplies is the assumed anonymity of addresses.


Quote
  • Why specify transaction values in "tens of nano coins"?  Again: a base-2 split would be better.  A fixed-point base-2 unit doesn't suffer strange rounding errors from 2-to-10 conversion.  I haven't looked but I'll bet the bitcoin client is filled with loads of "convertNanoCoinsToCoins64BitWithRounding()" type functions.  Those special rounding rules have to be implemented by all future clients now.

Voluminous discussions about rounding on the forums Smiley

I'm not surprised.  The problem I expected was that conversion from base 2 "decimals" to base 10 decimals, while simple, is not unambiguous (there are some base-10 fractions that can't be represented exactly in base 2).  I haven't looked though, so I don't know if problems like that are actually problems in real bitcoin.
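
A small illustration of the representation problem, and of why integer base units avoid it. The sketch assumes the smallest unit is 10^-8 of a coin (the "tens of nano coins" from the quoted bullet); `to_units` is a made-up helper.

```python
from decimal import Decimal

UNITS_PER_COIN = 10 ** 8  # assumed: smallest unit is 10 nano coins


def to_units(amount: str) -> int:
    """Convert a decimal string such as "0.1" to integer base units, exactly."""
    units = Decimal(amount) * UNITS_PER_COIN
    if units != units.to_integral_value():
        raise ValueError("amount has more than 8 decimal places")
    return int(units)


if __name__ == "__main__":
    # Binary floating point cannot hold 0.1, 0.2 or 0.3 exactly.
    print(0.1 + 0.2 == 0.3)
    # Integer base units parsed from decimal strings add up exactly.
    print(to_units("0.1") + to_units("0.2") == to_units("0.3"))
```

The conversion only goes wrong when a decimal amount passes through a binary fraction on the way; parsing the decimal string straight into integer units sidesteps that.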

Quote
  • It's possible to request only the header of a block, but not to request only the body.  A headers-only client has to download a load of bytes it already has when it wants to look at the chain in detail.

Yep (but don't forget merkle details).  Client mode is a key interest for us all, to ensure the bitcoin network remains usable as transaction rates increase.

As I say, I'm very new to bitcoin; but I had thought that the merkle tree is calculated rather than stored (at least publicly).  The merkle tree would be another convenient thing to be able to request; but that's nicely isolated enough that another couple of commands "getmerkle" and "merkle" could easily be added to provide it in lieu of the whole block.  The point is the same though: a thin client should be able to request only those bits of information it doesn't have.
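
On the merkle tree being calculated rather than stored: a minimal sketch of the calculation, assuming the usual scheme of double SHA-256 over concatenated pairs with the last hash duplicated when a level has an odd count. Byte-order details of real transaction hashes are ignored here, and the names are illustrative.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes):
    """Fold a list of transaction hashes down to a single root hash."""
    if not tx_hashes:
        raise ValueError("need at least one transaction hash")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

A thin client holding only headers could check one transaction against the root if a peer sent it the sibling hashes along the path, which is the kind of data a "getmerkle" request might return.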

Quote
I'm sure I'm wrong about a good proportion of those, but they're what I thought of while reading the protocol spec.

Nope, not wrong...  bitcoin protocol is... unique.

Anyway, thanks for responding.  I understand that these things happen when one is designing something new; hindsight is 20/20.  For all my complaints, I do recognise the core idea of bitcoin is a good one, and it was Satoshi that came up with that, not me.  So what's all my smart-ass criticism really worth, eh?

I feel I was late to the game both for git and bitcoin.  I was working (years ago) on a project that used hashes of hashes to make chains, and never even realised the potential of the technique (I'm certainly not saying I invented it) for distributed systems.  It's funny, I remember my father trying to explain "content addressable memory" to me when I was young, and I couldn't see the point of it.

1AAZ4xBHbiCr96nsZJ8jtPkSzsg1CqhwDa
bytekoinz
April 27, 2011, 08:59:03 AM
 #38

I would change it so that vital parts of the protocol can be performed without a connection to the internet, and only encrypted blocks of ciphertext / already-signed data would need to touch the internet. People could decrypt and verify signatures on a different machine with no connection to the internet; the data to hash would also be transferred this way, and coins would be mined and stored completely offline prior to a signed/ciphertext transfer of value. This would prevent hackers from attacking the network with buffer overflows and similar, rooting all of the clients and destroying the value of Bitcoin. Such an attack could exploit flaws either in the bitcoin client itself or in other applications running in a shared environment. I doubt many Bitcoin users are taking security measures capable of defending against intelligence agency / military / super l33t hackers in general, and such an attacker could likely take over the network. By removing critical processes from the internet entirely and having only secured/signed/encrypted data online, you can remove the risk from hackers completely. This is the only way to remove that risk completely, but most users are not even securing themselves nearly as well as they could be while connected to the internet, and the technical expertise required to do this is significantly beyond that of the average computer user.
bytekoinz2
April 27, 2011, 09:04:23 AM
 #39

I should also add that data should be transferred between the internet-connected machine and the disconnected machine via a CD which is then discarded, so an attacker cannot use the CD as a compromise vector to carry data from the disconnected machine to the connected machine and then back to the attacker. Also, at least one backup of the disconnected machine's drive should be made periodically, in case a compromise attempts to wipe the drive rather than steal the wallet.
realnowhereman
May 02, 2011, 08:00:35 PM
 #40

A few more:

  • If the "version" code prefixed on bitcoin addresses had been anything other than zero, they wouldn't have been a variable length.  At present they can vary between 27 and 34 characters.
  • The "sequence" number is stored in the TxIn record.  If I've understood, the sequence number allows us to replace one transaction with a later version, why then is the sequence number in the TxIn field?  Surely if we change the inputs we'll change the outputs?  The sequence number should be part of the transaction record, not the txin record.
  • OP_CHECKSIG:  Woah!  It really doesn't seem right to me that a script operator is so complex.  It just feels wrong that the scripts need to be filtered, truncated, and reencoded into a copy of the transaction; and this is done for every TXIN.. and comes up with a different result.  I understand the problem, every component of the transaction needs signing as a whole by all the owners of the source TXOUTS, but OP_CHECKSIG reads like some sort of Frankenstein monster.  My first instinct is that the signatures shouldn't have been part of the script, they should have been indexed fields in the transaction header, referenced with an OP_FETCHSIG(1) or similar in the script.  That way the script would be static and the signature could remain outside the block that it is signing.  I suppose to be more generic it would have been better to have a simple parameter block as part of the transaction header, and the operator would be OP_FETCHPARAM; but the idea is the same.  It's not like this is even out of character.  OP_CHECKSIG already requires out-of-script data.
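
On the first point in the list above (address length), a sketch of why a zero version byte gives variable-length addresses. It assumes the usual address construction: version byte, 20-byte hash, 4-byte double-SHA-256 checksum, all base58 encoded with each leading zero byte written as a "1". The version value 5 is only an arbitrary non-zero example, and `base58check` is an illustrative name.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload + 4-byte checksum in base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    n = int.from_bytes(raw, "big")
    digits = ""
    while n > 0:
        n, rem = divmod(n, 58)
        digits = ALPHABET[rem] + digits
    # Leading zero bytes vanish from the big integer, so each is written as "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + digits


if __name__ == "__main__":
    for version in (0, 5):
        for hash160 in (bytes(20), b"\xff" * 20):
            addr = base58check(version, hash160)
            print(version, len(addr), addr)
```

With version 0 the leading byte is always zero, so the length depends on how many further zero bytes the hash happens to start with. With the non-zero example used here, both extremes of the hash come out the same length, though that is not guaranteed for every possible version value.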

1AAZ4xBHbiCr96nsZJ8jtPkSzsg1CqhwDa