work2heat
Newbie
Offline
Activity: 21
Merit: 0
|
|
January 14, 2015, 09:21:55 PM |
|
BlueMatt wrote a pretty extensive agreement testing framework and discovered several corner cases that were not previously known. He's generally of the opinion that achieving a correct reimplementation is more or less intractable. His approach made progress, especially in unearthing weird behaviours, but doesn't result in great confidence of soundness; and doesn't show an implementation free of its own buts not related to non-obvious behavior in Bitcoin Core. Our strategy in Bitcoin Core as of late has been toward compartmentalizing and simplifying in order to make code reuse more reasonable; and also get things structured to be more agreeable to approaches that would possibly make formal analysis more realistic, but thats a long term effort. Is there somewhere I can read about the kinds of corner cases that were discovered and how they were dealt with? Also, what would it take to produce a formally verifiable implementation of bitcoin? Is such a thing even possible? Can we expect a PhD thesis to deliver the matter in the next decade?
|
|
|
|
gmaxwell
Moderator
Legendary
Offline
Activity: 4284
Merit: 8816
|
|
January 14, 2015, 11:10:26 PM |
|
Is there somewhere I can read about the kinds of corner cases that were discovered and how they were dealt with?
They're sort of scattered in various places, for example: http://sourceforge.net/p/bitcoin/mailman/message/29699385/Also, what would it take to produce a formally verifiable implementation of bitcoin? Is such a thing even possible? Can we expect a PhD thesis to deliver the matter in the next decade?
One question with formal verification is what are you formally verifying? If the question is the most important question in consensus "is this function equivalent to that function for all inputs" there is one very easy to have a strong guarantee that the answer is true: make them be exactly the same function. Formally proving two functions written in different languages compute the same thing is much harder, and likely not practically possible if the languages are very different or the algorithms used internally substantially different. To formally verify other properties we first need to extract the part that needs to be verified into it's smallest self contained realization. You can look at whats been done with SEL4 to see whats currently possible (and how much work it takes for a rather small program) at the state of the art. In Bitcoin core we're working on the isolation which is an important first step on both with the "use the exact same code for the consensus critical parts" approach as well as the "formally show some code achieves certain properties". I suspect that on the latter we won't make much progress in the near term outside of some parts in the underling cryptographic code unless some researchers pick it up as an area of study, but we should hopefully be well positioned to take advantage of formal analysis tools as that space matures.
|
|
|
|
davec
Newbie
Offline
Activity: 39
Merit: 0
|
|
January 15, 2015, 12:45:11 AM Last edit: January 15, 2015, 01:20:58 AM by davec |
|
I personally think the question is wrong here. Not even Bitcoin Core is technically 100% compatible with the same version of itself, so asking that question is starting with a false premise. As previously mentioned, there is a lot of work to be done in the formal verification space, but even then, due to hardware differences, compiler optimization differences, OS differences, and a myriad of other factors, I personally think it's unlikely there will ever be absolute 100% certainty. The question then becomes, does there really have to be?
To me, what it really comes down to is probabilities. The popular assumption is that Bitcoin Core has an extremely low probability that it will fork against itself. So, let's go with that premise and say that the probability of Bitcoin Core forking against itself is something like 1 in a 100 trillion (1e14).
When an alternative implementation passes all consensus tests and covers all known corner cases, then it too has a relatively low probability of forking. If that is not true, then the consensus tests need to be improved. For the sake of argument, let's assume that such an alternative is 3 orders of magnitude, or 1000x, more likely to fork. Well, then the forking probability relative to the Bitcoin Core forking probability previously stated would then be 1 in 100 billion (1e11).
Moreover, if/when a fork does happen, it identifies a case that the consensus tests missed and can be used to improve the consensus tests which in turn further reduces the probability for all nodes involved (including Bitcoin Core) that validate against the consensus tests.
It is estimated that in any given year the odds that you will die from an impact of an asteroid or comet are between 1 in 3,000 and 1 in 250,000. However, that probability is still low enough that I bet you don't live in a steel box and refuse to go outside just in case.
|
|
|
|
tl121
|
|
January 15, 2015, 01:06:43 AM |
|
I was going to make a post similar in content to davec's, but he already made my points.
One of the reasons why I didn't get involved with bitcoin at the beginning was because it was a complex distributed system that had a code base but no real specification. It was genius and luck that bitcoin took off and still works. There should be a formal specification, but the problem will be keeping it in sync with the code base. This will be politically impossible so long as there is only a single code base. So for the long term good of bitcoin it is imperative that there be different code bases.
There will be forks. There are airline crashes. People still fly. In the aerospace safety world, the phrase is, "All progress comes as the result of crashes."
|
|
|
|
gmaxwell
Moderator
Legendary
Offline
Activity: 4284
Merit: 8816
|
|
January 15, 2015, 01:18:49 AM Last edit: January 15, 2015, 02:07:43 AM by gmaxwell |
|
To me, what it really comes down to is probabilities. The popular assumption is that Bitcoin Core has an extremely low probability that it will fork against itself. So, let's go with that premise and say that the probability of Bitcoin Core forking against itself is something like 1 in a 100 trillion (1e14).
Probabilities are not an effective model to reason about bugs in an adversarial environment. (Even in conventional reliability engineering-- without invoking an adversary-- you must first separate systematic failure from random failure in order to reason about failure distributions.) "If you have a website with a billion pixels and due to a bug clicking one gives the reader one of your bitcoins, and you have one visitor an hour and a normal visitor clicks on 100 pixels, how long until you've lost 100 bitcoins?" .... This doesn't make sense, and thinking about consensus bugs in this manner doesn't make sense. Existing consensus bugs we've seen triggered against implementation were 10^-100 scale events if your model is random monkies pounding on keyboards to produce transactions. Virtually all the errors we care about are not random failures, and to the extent that there is an element of randomness an attacker can behave adaptively to bring out the otherwise 'rare' behavior that is harmful to your interests. the consensus tests need to be improved Thats always true, the tests are woefully inadequate and would continue to be even if they were terrabytes in size. That said, improving them has high value. I'm looking forward to contributions on that front. Probably one of the really important things an alternative implementation can do for the ecosystem is produce new tests as a side effect of their own independent efforts to obtain agreement. (Though, sadly, there seems to have been at least some level of people keeping at least implementation errors private in order to use them for marketing purposes... with all the different ways to monetize a security flaw other than reporting it, we cannot really count on reports.) But thinking that passing a prefab test suite that someone else wrote alone is sufficient-- which is certainly the impression given there-- sounds like magical thinking. It's not sufficient, merely necessary, and one shouldn't rely on it alone. (If this wasn't the message you were intending to convey, it wasn't clear to me.) Anyone can grind at some tests until they pass but doing, sadly, reduces the predictive power of the tests for errors not present in the tests. (This is partially why I prefer the block tests in Bitcoin runs in CI rather than being used locally by developers; we get a concrete measurement of detectable consensus bugs which are introduced by a change.) Yes, by some definition the tests are inadequate if there is a failure in spite of passing; but they're always going to be inadequate because they cannot be sound, at least not by themselves. (A test coupled with a formal analysis that showed it met some definition of completeness would be interesting.) There should be a formal specification,
Philosophically there should be, but in terms of benefit to the world the value is quite possibly negative. A non-executable or not widely used specification is a dead letter. The behavior of the participants defines the system. You can point blame all day this way that when some reading of the specification disagrees with some other reading; but its little comfort when all the funds are gone. As a pedagogical artifact a good specification may be highly useful; but it can also cause harm by feeding deep misunderstandings like having software which you believe-does (instead of actually-does) what you believe the specification says (rather than what everyone else actually-does, regardless of what they believe the specification says) is an acceptable situation to be in. "But I followed the spec!" "Too bad, all your coins have been double spent away because you ended up on a different chain from the other participants." Compared to other efforts to improve system integrity I think specification work beyond what exists now has fairly low returns. (Doubly so because existing alternative implementers have frequently made mistakes in parts of the system that are clearly specified and have tests in the normal tests suites.). In Bitcoin the behavior of the system is all there is. You have no other recourse, no higher authority. Being surprised or unhappy with at the Bitcoin network for disagreeing with a ream of paper is, in some ways, like being mad at the universe for not being completely defined by Einsteins' description of general relativity. Like the universe, the Bitcoin network doesn't give a darn about political gamesmanship. It does what it does, not what you think it should do. To the extent that people want to work on specification work, especially if it was specification work that clearly enough defined to permit formal reasoning, it's work I would support-- in spite of my reservations about potential net-negative value from it.
|
|
|
|
davec
Newbie
Offline
Activity: 39
Merit: 0
|
|
January 15, 2015, 01:59:36 AM |
|
Probabilities are entirely the wrong model to reason about bugs in an adversarial environment.
We can't just cherry pick what to apply our arguments to. The entire argument here is entirely based around the probability that Bitcoin Core is less likely to fork than alternatives, because the exact same logic you describe applies to multiple versions of Bitcoin Core on different compilers, OSes, hardware, etc, as much as it does to alternatives. Of course, in an adversarial environment, you have bad faith actors looking to exploit those probabilities. You're asserting that there is a much lower probability that they will be able to do so between two versions of Bitcoin Core. That's fair, but it doesn't make any sense to use probabilities to assert that and then turn around and say they don't matter when applied to a different implementation. Either probabilities don't matter and the entire argument about whether Bitcoin Core is more or less likely (probability) to fork or not is moot, or they really do matter. I claim they absolutely do matter because Bitcoin Core isn't immune to forks either and there is clearly some chance involved that forks will occur between any two pieces of Bitcoin software. But thinking that passing a prefab test suite that someone else wrote alone is sufficient-- which is certainly the impression given there-- sounds like magical thinking. It's not sufficient, merely necessary, and one shouldn't rely on it alone. (If this wasn't the message you were intending to convey, it wasn't clear to me.)
The claim isn't that a prefab test suite is entirely sufficient, it's that it lowers the probability of a fork occurring. When is the probability low enough that it's generally safe to transact? If you claim there isn't one, then Bitcoin, as a whole, is unsafe to transact on because the entirety of consensus is probability based to begin with.
|
|
|
|
gmaxwell
Moderator
Legendary
Offline
Activity: 4284
Merit: 8816
|
|
January 15, 2015, 02:03:51 AM |
|
We can't just cherry pick what to apply our arguments to. The entire argument here is entirely based around the probability
I'd like to invite you to point out any place where I argued anything about probability, except to rebut your reliance on that argument. Bitcoin is an experimental system, and in some regards it is pushing the envelope about the kinds of systems we know how to engineer and is doing so under considerable resource restrictions.
|
|
|
|
Realpra
|
|
January 17, 2015, 12:30:21 AM |
|
https://github.com/conformal/btcdI don't know if it works and is done yet, but here you go... hahaha. In truth though Bitcoin is an agreement between humans and as such it can never die. Hash algorithms and signing algorithms can be switched and bugs can be fixed. Consensus splits can be resolved - ALL of those have already happened before. (Original payment method was supposed to be IP based, 2010 had a script bug letting everyone spend anything and as noted 2013 had a major fork event) Gavin's scalability roadmap breaks the Bitcoin protocol, but because people support it and because they believe it is aligned with the concept of "Bitcoin" it will happen eventually. However the system remains stable because you need 90+% majority for such things to work and only simple changes aligned with the human vision can attain that. It's extremely interesting and beautiful in a way. You guys should read Asimovs Foundation book series.
|
|
|
|
grau
|
|
January 17, 2015, 10:25:06 AM |
|
However the system remains stable because you need 90+% majority for such things to work and only simple changes aligned with the human vision can attain that. It's extremely interesting and beautiful in a way.
You are right. The real foundation of Bitcoin is the shared desire to have it, that will overcome any technical obstacles.
|
|
|
|
work2heat
Newbie
Offline
Activity: 21
Merit: 0
|
|
January 18, 2015, 03:27:01 AM |
|
One question with formal verification is what are you formally verifying? Right. So maybe we can try to think about how to break bitcoin up into modular parts (eg. secp256k1, vm/script, encoding/decoding, merkle tree, create/verify block header, add block to tree, etc.). Then we can start with verifying the parts, and with verified parts you are much more likely to have a valid whole. Maybe at that point we can all band together and build an asic for it, then all run the asic instead of building software on the myriad OS's (which practically precludes the possibility of formal verification, as you've noted). Then maybe we'll have a safe bitcoin?! It's extreme but I'll entertain it.
|
|
|
|
adam3us
|
|
January 18, 2015, 06:18:04 PM |
|
I think its fair to say its fundamentally hard (in a computer science formal sense) to make two implementations that are consensus identical in an adversarial setting. Consensus is a new and tougher version of the host security challenges. You dont even have to make an implementation buffer overflow in a steerable way, nor perform something it shouldnt have; just most divergences become sufficient to adversarially steer to the advantage of whoever finds the divergence first and is inclined to exploit it. About the former simpler problem, Meredith Patterson gave an interesting research summary presentation on the limits of (computer) language security: https://archive.org/details/The_Science_of_Insecurity_I encourage people, even those who think they know a decent amount about host security, to have a listen. Its quite interesting. You can see that any network protocol with messages interpreted by software forms a computer language, as well as bitcoin incorporating explicit languages like script as part of its operation. Adam
|
hashcash, committed transactions, homomorphic values, blind kdf; researching decentralization, scalability and fungibility/anonymity
|
|
|
grau
|
|
January 18, 2015, 09:00:55 PM |
|
Great talk. Too bad, that Bitcoin messages use length encoding, even in scripts.
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
January 18, 2015, 09:10:08 PM |
|
If you need to make use of some sort of custom node (server backend for example) I would recommend running a copy of bitcoind in wallet only mode. Then have your custom node connect only to the bitcoind under your direct control. With bitcoind acting like a bitcoin firewall it will prevent entire categories of consensus split isolation attacks. Just keep in mind it isn't a magic bullet. If your consensus rules are bad enough your custom node may drop blocks passed by the bitcoind and thus "stall out" but that is preferable to being fed a false history by a malicious node.
|
|
|
|
grau
|
|
January 19, 2015, 07:02:38 AM |
|
If you need to make use of some sort of custom node (server backend for example) I would recommend running a copy of bitcoind in wallet only mode. Then have your custom node connect only to the bitcoind under your direct control.
This is a very good advice. I'd add that your code should do SPV validation, then a split with respect to the border router is limited to common bugs on your side.
|
|
|
|
hhanh00 (OP)
|
|
January 19, 2015, 08:08:43 AM |
|
Good advice but that defeats the reason I'm running my node. I'd like something small, light on memory usage and disk space that still does full validation. I ended up writing it myself.
|
|
|
|
grau
|
|
January 19, 2015, 08:46:02 AM |
|
Good advice but that defeats the reason I'm running my node. I'd like something small, light on memory usage and disk space that still does full validation. I ended up writing it myself.
A wallet disabled, UI stripped satoshi client is the closest you can get at the moment to a small consensus compatible full validation. I wish one could exclude also RPC code at compile time. Your own code that goes beyond SPV could help understanding, aid debugging or define a side chain.
|
|
|
|
hhanh00 (OP)
|
|
January 19, 2015, 10:57:26 AM |
|
Mmmm - well, it bootstraps, syncs and validates the blockchain. It passes the acceptance tests from Matt too. As a relay node, it keeps a tx pool, validates, relays new tx and you can trim old blocks just by deleting files. I've been running for a while. At about 2k lines of code, it fits my requirement of small code that I can fit in my brain but I understand it's not 100% compatible and will never be.
|
|
|
|
grau
|
|
January 19, 2015, 12:13:35 PM |
|
Mmmm - well, it bootstraps, syncs and validates the blockchain. It passes the acceptance tests from Matt too. As a relay node, it keeps a tx pool, validates, relays new tx and you can trim old blocks just by deleting files. I've been running for a while. At about 2k lines of code, it fits my requirement of small code that I can fit in my brain but I understand it's not 100% compatible and will never be.
I know it is possible to get that far in a few months, as I did myself. The emphasis is on 100% in this discussion, that is unlikely to be reached with an unrelated code base. This is not just a theoretical problem, since the smallest disagreement is suitable to fork your node off the majority. How much that costs depends on your use. The combo satoshi border router + SPV extension is simply a much safer bet if you really deal with money.
|
|
|
|
Nicolas Dorier
|
|
January 23, 2015, 01:27:48 PM |
|
My original intention was to build a full node with NBitcoin. I quickly abandoned the idea, you can't imagine the number of corner case you have to take care of... and it is a moving target. Frankly speaking, I learned to appreciate the work that core developers were doing when I tried to do it by myself. It is boring, difficult, and high in responsability.
As DeathAndTaxes said, I end up having my own bitcoind, and my software connect directly to the trusted node. 100% is not possible, and you can expect that if the reward is worth it, an attacker will end up exploiting the 0,00000001%
|
Bitcoin address 15sYbVpRh6dyWycZMwPdxJWD4xbfxReeHe
|
|
|
hhanh00 (OP)
|
|
January 23, 2015, 03:13:11 PM |
|
Well - in my opinion, it's not a sustainable long term situation. I understand there are a lot of factors at play but putting the burden of all the core client work on a single implementation is an unfair responsibility to carry. Besides, it seems that some companies are willing to take the risk of running alternate clients. In my case, I simply aim to be at least as good as them while having a code base that I can be proud of.
|
|
|
|
|