Title: bitcoind stops responding to RPC requests
Post by: mndrix on March 24, 2011, 09:01:25 PM
While operating CoinPal, I've had the bitcoin daemon hang (http://bitcointalk.org/index.php?topic=2555.msg71611#msg71611) several times. The behavior has been the same each time: RPC calls time out without a response. I restart the daemon, it catches up with the blockchain, and it works correctly again for several hours or days before hanging again.
Here's the information I have. When I notice the daemon has hung, the tail of debug.log always looks something like this. I can watch the log indefinitely and see only similar messages streaming by as normal:
Code:
IRC got join
If I look backwards in the debug log to the last activity not related to addresses and IRC, I usually get something similar to this:
Code:
IRC got join
That last "sendfrom" request never sends a response. When the daemon hangs again, what information should I collect so that developers can diagnose the problem?
Title: Re: bitcoind stops responding to RPC requests
Post by: jgarzik on March 24, 2011, 09:09:17 PM
Have you played around with -rpctimeout?
Title: Re: bitcoind stops responding to RPC requests
Post by: ArtForz on March 25, 2011, 08:42:49 AM
I think we got a deadlock in there...
rpc: sendfrom
  CRITICAL_BLOCK(cs_mapWallet)
    SendMoneyToBitcoinAddress(strAddress, nAmount, wtx)
      SendMoney(scriptPubKey, nValue, wtxNew, fAskFee)
        CRITICAL_BLOCK(cs_main)
...
processmessages:
  CRITICAL_BLOCK(cs_main)
    ProcessMessage(pfrom, strCommand, vMsg)
      AddToWalletIfMine()
        AddToWallet(wtx)
          CRITICAL_BLOCK(cs_mapWallet)
Title: Re: bitcoind stops responding to RPC requests
Post by: Mike Hearn on March 25, 2011, 11:35:48 AM
Oops. Should RPCs be run with the BFL held?
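ArtForz's trace is a lock-order inversion: the RPC thread holds cs_mapWallet and waits for cs_main, while the message thread holds cs_main and waits for cs_mapWallet. A minimal C++ sketch of the usual remedy, where both paths take cs_main before cs_mapWallet, might look like the following. Only the two lock names come from the thread; the function bodies, the counter, and the use of std::recursive_mutex as a stand-in for CRITICAL_BLOCK are assumptions for illustration, not bitcoind's actual code.

```cpp
#include <mutex>
#include <thread>

// Stand-ins for bitcoind's critical sections. The names mirror the thread;
// the recursive_mutex type is an assumption made for this sketch.
std::recursive_mutex cs_main;
std::recursive_mutex cs_mapWallet;
int wallet_tx_count = 0;  // toy shared state, guarded by cs_mapWallet

// Inner helper that needs cs_main, like SendMoney in the trace.
void send_money() {
    std::lock_guard<std::recursive_mutex> main_lock(cs_main);
    std::lock_guard<std::recursive_mutex> wallet_lock(cs_mapWallet);
    ++wallet_tx_count;
}

// RPC path: takes cs_main FIRST, then cs_mapWallet, so the nested
// cs_main acquisition inside send_money() is only a recursive re-entry.
void rpc_sendfrom() {
    std::lock_guard<std::recursive_mutex> main_lock(cs_main);
    std::lock_guard<std::recursive_mutex> wallet_lock(cs_mapWallet);
    send_money();
}

// Network path: same order, cs_main then cs_mapWallet.
void process_message() {
    std::lock_guard<std::recursive_mutex> main_lock(cs_main);
    std::lock_guard<std::recursive_mutex> wallet_lock(cs_mapWallet);
    ++wallet_tx_count;
}

// Hammer both paths from two threads; returns the number of updates made.
int run_both_paths(int iterations) {
    wallet_tx_count = 0;
    std::thread rpc([iterations] {
        for (int i = 0; i < iterations; ++i) rpc_sendfrom();
    });
    std::thread net([iterations] {
        for (int i = 0; i < iterations; ++i) process_message();
    });
    rpc.join();
    net.join();
    return wallet_tx_count;
}
```

Because every thread acquires the two locks in the same order, no thread can hold the second lock while waiting for the first, so the circular wait in the trace cannot form.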
Title: Re: bitcoind stops responding to RPC requests
Post by: Gavin Andresen on March 25, 2011, 12:35:19 PM
Quote: Oops. Should RPCs be run with the BFL held?
D'oh! sendfrom should definitely CRITICAL_BLOCK(cs_main). Nice catch, ArtForz.
Title: Re: bitcoind stops responding to RPC requests
Post by: mikegogulski on March 25, 2011, 12:46:11 PM
ArtForz, you've got BTC 5.00 incoming from me for spotting this. Very well done.
Title: Re: bitcoind stops responding to RPC requests
Post by: Gavin Andresen on March 25, 2011, 01:01:00 PM
Does anybody have experience with valgrind --tool=helgrind or other automated tools for finding potential deadlocks?
Running it on bitcoind I'm getting a huge number of false positives... Should we just document every method that holds one or more locks? I'm worried there are other possible deadlocks lurking.
Title: Re: bitcoind stops responding to RPC requests
Post by: ShadowOfHarbringer on March 25, 2011, 01:35:26 PM
Quote: Oops. Should RPCs be run with the BFL held?
Quote: D'oh! sendfrom should definitely CRITICAL_BLOCK(cs_main). Nice catch, ArtForz.
Which version will the patch be scheduled for?
Title: Re: bitcoind stops responding to RPC requests
Post by: ArtForz on March 25, 2011, 01:52:10 PM
Well, a quick manual check suggests that for cs_main + cs_mapWallet, only sendfrom and sendmany in rpc.cpp are doing the wrong thing.
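The manual check ArtForz describes can in principle be automated at run time: record every ordered pair of locks that is held together, and flag any acquisition that reverses a pair seen before. The sketch below is a hypothetical, single-threaded model of that idea. The class LockOrderChecker and the helper trace_is_consistent are invented for illustration; a real checker would need a per-thread stack of held locks and its own synchronization. It is not a feature of bitcoind or of valgrind as discussed in this thread.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lock-order recorder (illustration only).
class LockOrderChecker {
public:
    // Record that `name` is acquired while the locks in held_ are held.
    // Returns false if this acquisition reverses an order seen earlier.
    bool acquire(const std::string& name) {
        bool consistent = true;
        for (const std::string& outer : held_) {
            if (outer == name) continue;  // recursive re-entry, no new edge
            if (seen_.count(std::make_pair(name, outer)) > 0) {
                consistent = false;       // (name, outer) was seen before
            }
            seen_.emplace(outer, name);   // remember "outer before name"
        }
        held_.push_back(name);
        return consistent;
    }

    // Release the most recently acquired lock.
    void release() {
        if (!held_.empty()) held_.pop_back();
    }

private:
    std::vector<std::string> held_;                        // current stack
    std::set<std::pair<std::string, std::string>> seen_;   // observed orders
};

// Replay one call path's lock acquisitions; true if no inversion was found.
bool trace_is_consistent(LockOrderChecker& checker,
                         const std::vector<std::string>& locks) {
    bool ok = true;
    for (const std::string& name : locks) {
        if (!checker.acquire(name)) ok = false;
    }
    for (std::size_t i = 0; i < locks.size(); ++i) checker.release();
    return ok;
}
```

Replaying the sendfrom path (cs_mapWallet, then cs_main) followed by the processmessages path (cs_main, then cs_mapWallet) through one checker makes the second replay report an inversion, which is the pattern found by hand above.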
Title: Re: bitcoind stops responding to RPC requests
Post by: mikegogulski on March 25, 2011, 01:55:38 PM
@Gavin: Document? Always a good thing. This is tricky stuff, as ArtForz has shown. My own experience goes like:
1: If you don't really have to lock, push into a serial action queue;
2: when you really do have to lock, prepare everything beforehand, then lock, alter and unlock as swiftly as possible; and
3: er, yeh, document, at least so that you can recall what the heck you were up to when you decided you needed that lock.
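Point 1 in the list above, the serial action queue, can be sketched as a single worker thread that runs posted actions one at a time, so the state those actions touch needs no lock of its own. SerialQueue and count_via_queue are hypothetical names used for illustration; nothing here is taken from bitcoind.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical serial action queue: one worker executes actions in order.
class SerialQueue {
public:
    SerialQueue() : worker_([this] { run(); }) {}

    // Drain any remaining actions, then stop the worker.
    ~SerialQueue() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Hand an action to the worker; the lock is held only for the push.
    void post(std::function<void()> action) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            actions_.push_back(std::move(action));
        }
        ready_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return done_ || !actions_.empty(); });
            if (actions_.empty()) return;  // done_ is set and queue is drained
            std::function<void()> action = std::move(actions_.front());
            actions_.pop_front();
            lock.unlock();                 // run the action without the lock
            action();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::function<void()>> actions_;
    bool done_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

// Several producers post increments; only the worker ever touches `total`.
int count_via_queue(int producers, int per_producer) {
    int total = 0;
    {
        SerialQueue queue;
        std::vector<std::thread> threads;
        for (int p = 0; p < producers; ++p) {
            threads.emplace_back([&queue, &total, per_producer] {
                for (int i = 0; i < per_producer; ++i) {
                    queue.post([&total] { ++total; });
                }
            });
        }
        for (std::thread& t : threads) t.join();
    }  // the queue's destructor drains pending actions and joins the worker
    return total;
}
```

The queue's own mutex is held only while pushing or popping, which also reflects point 2: prepare the action outside the lock, then lock, alter and unlock quickly.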
Obviously, this becomes really hard when we're dealing with what are essentially library primitives for manipulating the dataset. If I were sober at the moment I'd produce a precompiler macro that would flag potential nested locks in the control flow. Fortunately, I'm not sober.
Title: Re: bitcoind stops responding to RPC requests
Post by: ArtForz on March 25, 2011, 02:26:12 PM
Another one:
setaccount
  CRITICAL_BLOCK(cs_mapAddressBook)
    GetAccountAddress(strOldAccount)
      CRITICAL_BLOCK(cs_mapWallet)
processmessages:
  CRITICAL_BLOCK(cs_main)
    ProcessMessage(pfrom, strCommand, vMsg)
      AddToWalletIfMine()
        AddToWallet(wtx)
          CRITICAL_BLOCK(cs_mapWallet)
            walletdb.WriteName(PubKeyToAddress(vchDefaultKey), "")
              CRITICAL_BLOCK(cs_mapAddressBook)
Title: Re: bitcoind stops responding to RPC requests
Post by: mndrix on March 25, 2011, 02:28:24 PM
Well done. Let me know when a patch makes it into a beta/nightly build and I'll run it in production to test.
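When a code path needs two locks at the same point, as the setaccount and AddToWallet paths do with cs_mapAddressBook and cs_mapWallet, one option is to acquire both through std::lock, which applies a deadlock-avoidance algorithm regardless of the order of its arguments. This is a swapped-in standard-library technique shown as a sketch; nobody in the thread proposed it, and the function bodies and counters below are assumptions for illustration.

```cpp
#include <mutex>
#include <thread>

// Lock names mirror the trace; std::mutex is an assumption for this sketch.
std::mutex cs_mapAddressBook;
std::mutex cs_mapWallet;
int accounts_renamed = 0;  // toy state, updated only with both locks held
int txs_stored = 0;        // toy state, updated only with both locks held

// setaccount-like path: names the address book lock first.
void set_account() {
    std::unique_lock<std::mutex> book(cs_mapAddressBook, std::defer_lock);
    std::unique_lock<std::mutex> wallet(cs_mapWallet, std::defer_lock);
    std::lock(book, wallet);  // acquires both without risking deadlock
    ++accounts_renamed;
}

// AddToWallet-like path: names the wallet lock first (opposite order).
void add_to_wallet() {
    std::unique_lock<std::mutex> wallet(cs_mapWallet, std::defer_lock);
    std::unique_lock<std::mutex> book(cs_mapAddressBook, std::defer_lock);
    std::lock(wallet, book);  // argument order does not matter to std::lock
    ++txs_stored;
}

// Run both paths concurrently; returns the total number of updates.
int run_concurrently(int iterations) {
    accounts_renamed = 0;
    txs_stored = 0;
    std::thread rpc([iterations] {
        for (int i = 0; i < iterations; ++i) set_account();
    });
    std::thread net([iterations] {
        for (int i = 0; i < iterations; ++i) add_to_wallet();
    });
    rpc.join();
    net.join();
    return accounts_renamed + txs_stored;
}
```

The caveat is that std::lock only helps when both locks are taken together in one place. In the trace above the second lock is acquired several calls deeper, so applying this would mean hoisting both acquisitions to the top of each path.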
Title: Re: bitcoind stops responding to RPC requests
Post by: jgarzik on March 26, 2011, 05:28:12 PM
Pull request: https://github.com/bitcoin/bitcoin/pull/136
Direct link to commit (patch): https://github.com/jgarzik/bitcoin/commit/4feff786546448e2c436956ad77b9081167e3124
Unfortunately the commit is larger than it should be for easy reading, because large blocks of code were un-indented.
Title: Re: bitcoind stops responding to RPC requests
Post by: slush on April 02, 2011, 06:47:29 PM
Today I had problems similar to mndrix's; bitcoind froze during payouts. It was the second time in the pool's history, but the first time with the sendmany command.
mndrix, did you successfully test jgarzik's patch?
Title: Re: bitcoind stops responding to RPC requests
Post by: mndrix on April 11, 2011, 03:15:15 PM
Quote: mndrix, did you successfully test jgarzik's patch?
I haven't tested the patch. My feeble attempts to compile Bitcoin from source have failed (which speaks to my ignorance, not to a problem with Bitcoin). Does anyone know if the patch is available in a release candidate build for Linux yet?
Title: Re: bitcoind stops responding to RPC requests
Post by: slush on April 11, 2011, 09:17:42 PM
This patch is already in bitcoin upstream; it looks like more people watched it :). I'll try to use it in the pool tomorrow...
Title: Re: bitcoind stops responding to RPC requests
Post by: mndrix on April 15, 2011, 05:16:49 PM
Coin{Pal,Card} are now running a nightly build including the deadlock changes. I'll report here if bitcoind hangs again.
Title: Re: bitcoind stops responding to RPC requests
Post by: Stephen Gornick on April 16, 2011, 06:18:04 AM
Just an FYI -- gjs278 shared a monit script to restart:
Restart bitcoind automatically if it crashes or dies using Monit: http://bitcointalk.org/index.php?topic=5911.0
Title: Re: bitcoind stops responding to RPC requests
Post by: Stephen Gornick on April 19, 2011, 06:21:32 AM
Quote: Coin{Pal,Card} are now running a nightly build including the deadlock changes. I'll report here if bitcoind hangs again.
Was CoinPal's April 18th service issue related to this? Your post (http://bitcointalk.org/index.php?topic=2555.msg88826#msg88826) mentioned "I've restarted some server components and the site appears to be working fine now".
Title: Re: bitcoind stops responding to RPC requests
Post by: mndrix on April 22, 2011, 02:28:52 PM
Quote: Coin{Pal,Card} are now running a nightly build including the deadlock changes. I'll report here if bitcoind hangs again.
Quote: Was CoinPal's April 18th service issue related to this? Your post (http://bitcointalk.org/index.php?topic=2555.msg88826#msg88826) mentioned "I've restarted some server components and the site appears to be working fine now".
I'm glad you brought this up, sgornick. I should have mentioned it here. That particular outage was caused by an error in my code that leaked open file handles. It wasn't related to bitcoind. Since upgrading to a nightly build on April 15th, I haven't had any problems with bitcoind hanging. I almost certainly would have seen one by now if the problem were still present. Thanks all for your help diagnosing and fixing the bug.