Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: patvarilly on June 30, 2011, 04:31:56 AM



Title: Error in block chain update logic? -- EDIT: False alarm...
Post by: patvarilly on June 30, 2011, 04:31:56 AM
Dear all,

I think I've found an unusual bug in the block chain update logic (main.cpp).  The problem is likely to be triggered sporadically by users who run bitcoin only for a few minutes or hours a day.

Here's what happens.  Suppose that you've run bitcoin, and at the time you close it, the best block chain looked like this:

You: A -> B -> C

On closing bitcoin, you record that the tip of the best block chain (pindexBest) is C.  Just after you close Bitcoin, a fork in the block chain occurs that invalidates C (say, three miners produced blocks in quick succession, and you drew the short end of the stick), so the actual block chain from there on is:

Network: A -> B -> C
              |
              + -> D -> E -> ...


Then, the next day, you wake up and start running Bitcoin again.  At this point you think pindexBest is C, so as soon as your first peer sends you a "version" command (main.cpp:ProcessMessage()), you ask for all the blocks from C forward.  Your peer sees that there are no blocks following C, and doesn't send you any blocks as a result.  Hence, your view of the block chain remains this one:

You: A -> B -> C

Later on, a new block X is added to the tip of the block chain, so that

Network: A -> B -> C
              |
              + -> D -> E -> ... -> X


You get an "inv" message about it, and to you, X looks like an orphan chain:

You: A -> B -> C

          ? -> X


So naturally (main.cpp:ProcessBlock()), you issue a "getblocks" message to get all the blocks "from C's successor up to X".  When your peers get this message, they find that there is no successor to C, so they send nothing back!  Now, no matter how many new blocks you learn about, it seems that they're all being added to an orphan chain, and the main chain has stopped growing!

Is there something I'm missing?  All calls to "PushGetBlocks" seem to pass pindexBest as a starting point.
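
To make the scenario concrete, here is a toy model of the behavior described above. It assumes that a "getblocks" request identifies its starting point by a single hash, which is exactly the premise being questioned in this post. The names Chain and respondSingleHash are hypothetical and do not come from main.cpp; block hashes are stood in for by single letters.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model: a peer's main chain is an ordered list of block
// "hashes" (single letters here), from the genesis block to the tip.
using Chain = std::vector<std::string>;

// Assumed, simplified responder: given ONE starting hash, return the hashes
// that follow it on the responder's main chain, or nothing at all if the
// starting hash is not on that chain.
Chain respondSingleHash(const Chain& mainChain, const std::string& start) {
    // A real chain always contains at least the genesis block.
    assert(!mainChain.empty());
    auto it = std::find(mainChain.begin(), mainChain.end(), start);
    if (it == mainChain.end()) {
        return {};  // start is stale or unknown: nothing to send
    }
    return Chain(it + 1, mainChain.end());
}
```

Under this assumption, a request starting at C against a network chain A -> B -> D -> E -> X comes back empty, while a request starting at B would return D, E and X.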

The situation where this bug is triggered is rare, but not impossible.  In particular, it's very hard to trigger inside a continuously running instance of bitcoin, but if you run bitcoin only sporadically and you have the bad fortune of closing it with a soon-to-be stale notion of the tip of the main chain, you're toast.  Presently, to fix it, it seems to me that you'd have to trash your local copy of the blockchain and redownload it from another peer.

One way to fix this would be for a client that receives an unbounded "getblocks" starting from a block off the main chain (or from a block it has never seen) to act specially.  Instead of returning nothing, it returns the hashes of the last 500 blocks on the main chain, call them P -> ... -> X (nothing requires the reply to a "getblocks" to correspond to the blocks that were actually asked for).  If that's enough for the peer to see the fork in the main chain, it will trigger a reorganize, and the universe will be at peace with itself once again.  Otherwise, the peer will send another "getblocks" that starts off the main chain (or from an unknown block) and runs up to P.  That nonsense request should also be recognized, and the 500 blocks preceding P should be sent back.  This repeats as many times as needed until the peer sees the block chain split and reorganizes its chain.  Perhaps this can stop at the latest checkpoint known to the peer, to stop a malicious peer from forcing you to send back all of the headers of the main chain down to the genesis block.
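
A minimal sketch of the responder side of this proposal, under stated assumptions: chains are modeled as ordered lists of letter "hashes", an empty stop string means an unbounded request, the stop hash is treated as exclusive, and maxHashes stands in for the 500-block limit. The function name respondWithFallback and its parameters are hypothetical; this is not how main.cpp handles "getblocks", and the checkpoint cut-off is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Chain = std::vector<std::string>;  // genesis first, tip last

// Hypothetical responder with the proposed fallback.
Chain respondWithFallback(const Chain& mainChain, const std::string& start,
                          const std::string& stop, std::size_t maxHashes) {
    assert(maxHashes > 0);
    const auto lo = mainChain.begin();
    auto hi = mainChain.end();
    if (!stop.empty()) {
        // Bounded request: only consider blocks strictly before `stop`.
        auto s = std::find(lo, mainChain.end(), stop);
        if (s != mainChain.end()) hi = s;
    }
    auto from = std::find(lo, hi, start);
    if (from != hi) {
        // Normal case: `start` is on our main chain, send what follows it.
        ++from;
        const auto avail = static_cast<std::size_t>(hi - from);
        return Chain(from, from + std::min(maxHashes, avail));
    }
    // Fallback: `start` is stale or unknown, so send the last `maxHashes`
    // hashes before the upper bound instead of sending nothing.
    const auto avail = static_cast<std::size_t>(hi - lo);
    return Chain(hi - std::min(maxHashes, avail), hi);
}
```

With a network chain A -> B -> D -> E -> F -> G -> X and maxHashes set to 2, a stale request from C first yields G, X; repeating it with stop set to G yields E, F; and repeating it with stop set to E yields B, D, at which point the requester recognizes B and can see the fork.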

What do you think?


Title: Re: Error in block chain update logic?
Post by: MoonShadow on June 30, 2011, 04:40:17 AM
I would say that this would explain some strange behaviors I've seen from my own client, which I don't run continuously.


Title: Re: Error in block chain update logic?
Post by: LightRider on June 30, 2011, 04:56:30 AM
I have a similar usage pattern but haven't had this problem. Does seem like something that should be fixed though.


Title: Re: Error in block chain update logic?
Post by: theymos on June 30, 2011, 07:10:40 AM
getblocks does not send just the latest hash. It sends a CBlockLocator object, which contains many hashes going far back into the chain.
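
As a rough illustration of why a locator avoids the problem, here is a toy sketch. It assumes the locator lists hashes densely near the tip and then at doubling intervals, always ending at the genesis block, and that the responder resumes from the first listed hash it finds on its own main chain. buildLocator, respondToLocator and denseCount are hypothetical names for this sketch, not the actual CBlockLocator code, and the exact step schedule is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Chain = std::vector<std::string>;  // genesis first, tip last

// Requester side: list hashes from the tip backwards, one by one at first,
// then with a doubling step once more than `denseCount` entries are listed,
// and always finish with the genesis block.
Chain buildLocator(const Chain& myChain, std::size_t denseCount) {
    assert(!myChain.empty());
    Chain locator;
    std::size_t step = 1;
    std::size_t i = myChain.size() - 1;  // index of our tip
    while (i > 0) {
        locator.push_back(myChain[i]);
        i = (i > step) ? i - step : 0;   // walk back, clamping at genesis
        if (locator.size() > denseCount) step *= 2;
    }
    locator.push_back(myChain.front());  // genesis is common to everyone
    return locator;
}

// Responder side: find the first locator entry that is on our main chain
// and return everything after it.
Chain respondToLocator(const Chain& mainChain, const Chain& locator) {
    for (const auto& hash : locator) {
        auto it = std::find(mainChain.begin(), mainChain.end(), hash);
        if (it != mainChain.end()) return Chain(it + 1, mainChain.end());
    }
    return {};  // no common block at all (different genesis)
}
```

In the scenario from the first post, a client whose chain is A -> B -> C builds the locator C, B, A. The responder does not have C on its main chain, but it does have B, so it replies with D, E and everything up to X, and the client can reorganize.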


Title: Re: Error in block chain update logic?
Post by: patvarilly on June 30, 2011, 07:28:11 AM
Quote from: theymos on June 30, 2011, 07:10:40 AM
getblocks does not send just the latest hash. It sends a CBlockLocator object, which contains many hashes going far back into the chain.

Ah!  Thanks, that's very smart, and I hadn't noticed that feature.