Bitcoin Forum
Topic: bitcoind stops responding to RPC requests
mndrix (OP)
Michael Hendricks
March 24, 2011, 09:01:25 PM
 #1

While operating CoinPal, I've had the bitcoin daemon hang several times.  The behavior has been the same each time: RPC calls time out without a response.  I restart the daemon, it catches up with the blockchain, and it works correctly again for several hours or days before it hangs again.

Here's the information I have.  When I notice the daemon has hung, the tail of debug.log always looks roughly like this.  I can watch the log indefinitely and see only similar messages streaming by, as they normally would:

Code:
IRC got join
IRC got join
AddAddress()
IRC got new address
IRC got join
IRC got join

If I look backwards in the debug log to the last activity not related to addresses and IRC, I usually get something similar to this:

Code:
IRC got join
received: inv (37 bytes)
  got inventory: tx 1d95d66a217e5fbe49bd  new
askfor tx 1d95d66a217e5fbe49bd   0
sending getdata: tx 1d95d66a217e5fbe49bd
sending: getdata (37 bytes)
received: inv (37 bytes)
  got inventory: tx 1d95d66a217e5fbe49bd  new
askfor tx 1d95d66a217e5fbe49bd   1300914187000000
received: inv (37 bytes)
  got inventory: tx 1d95d66a217e5fbe49bd  new
askfor tx 1d95d66a217e5fbe49bd   1300914307000000
received: tx (617 bytes)
ThreadRPCServer method=sendfrom
IRC got join

That last "sendfrom" request never sends a response.

When the daemon hangs again, what information should I collect so that developers can diagnose the problem?
jgarzik
March 24, 2011, 09:09:17 PM
 #2

Have you played around with -rpctimeout?

Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
ArtForz
March 25, 2011, 08:42:49 AM
 #3

I think we got a deadlock in there...

rpc:
sendfrom
    CRITICAL_BLOCK(cs_mapWallet)
        SendMoneyToBitcoinAddress(strAddress, nAmount, wtx)
            SendMoney(scriptPubKey, nValue, wtxNew, fAskFee)
                CRITICAL_BLOCK(cs_main)
                    ...

processmessages:
CRITICAL_BLOCK(cs_main)
    ProcessMessage(pfrom, strCommand, vMsg)
        AddToWalletIfMine()
            AddToWallet(wtx)
                CRITICAL_BLOCK(cs_mapWallet)
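
For readers following along: the two call trees above take cs_mapWallet and cs_main in opposite orders, so each thread can end up holding one lock while waiting forever for the other. Below is a minimal C++ sketch of the usual remedy, which is to take the locks in the same order on every path. It is not the actual bitcoind code: the std::recursive_mutex objects, the *_like function names, and the wallet_entries counter are hypothetical stand-ins, and treating the locks as re-entrant is an assumption of the sketch.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for the two locks named in the call trees above.
// Modeled as re-entrant so a helper may re-acquire a lock its caller already
// holds (an assumption of this sketch).
std::recursive_mutex cs_main;
std::recursive_mutex cs_mapWallet;
int wallet_entries = 0;  // placeholder for shared wallet state

// Inner helper, loosely analogous to SendMoney: takes cs_main on its own.
void send_money_like() {
    std::lock_guard<std::recursive_mutex> main_lock(cs_main);
    ++wallet_entries;
}

// RPC path with consistent ordering: cs_main first, then cs_mapWallet.
void rpc_sendfrom_like() {
    std::lock_guard<std::recursive_mutex> main_lock(cs_main);
    std::lock_guard<std::recursive_mutex> wallet_lock(cs_mapWallet);
    send_money_like();  // re-acquires cs_main, which this thread already holds
}

// Message path: also cs_main first, then cs_mapWallet.
void process_message_like() {
    std::lock_guard<std::recursive_mutex> main_lock(cs_main);
    std::lock_guard<std::recursive_mutex> wallet_lock(cs_mapWallet);
    ++wallet_entries;
}

// Single-threaded smoke check: runs each path once and returns the counter.
int run_both_paths() {
    const int before = wallet_entries;
    rpc_sendfrom_like();
    process_message_like();
    assert(wallet_entries == before + 2);
    return wallet_entries;
}
```

Because both paths take cs_main before cs_mapWallet, neither thread can be holding the second lock while it waits for the first.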

bitcoin: 1Fb77Xq5ePFER8GtKRn2KDbDTVpJKfKmpz
i0coin: jNdvyvd6v6gV3kVJLD7HsB5ZwHyHwAkfdw
Mike Hearn
March 25, 2011, 11:35:48 AM
 #4

Oops. Should RPCs be run with the BFL held?
Gavin Andresen
Chief Scientist
March 25, 2011, 12:35:19 PM
 #5

Quote from: Mike Hearn on March 25, 2011, 11:35:48 AM
Oops. Should RPCs be run with the BFL held?

D'oh!

sendfrom should definitely CRITICAL_BLOCK(cs_main).  Nice catch ArtForz.

How often do you get the chance to work on a potentially world-changing project?
mikegogulski
March 25, 2011, 12:46:11 PM
 #6

ArtForz, you've got BTC 5.00 incoming from me for spotting this. Very well done.

FREE ROSS ULBRICHT, allegedly one of the Dread Pirates Roberts of the Silk Road
Gavin Andresen
March 25, 2011, 01:01:00 PM
 #7

Does anybody have experience with valgrind --tool=helgrind or other automated tools for finding potential deadlocks?

Running it on bitcoind I'm getting a huge number of false positives...

Should we just document every method that holds one or more locks?  I'm worried there are other possible deadlocks lurking.
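
One lightweight alternative to a full race detector is to record lock acquisition order at run time and flag any pair of locks that has been seen in both orders. The sketch below is a hypothetical, single-threaded illustration of that idea, not a feature of bitcoind or valgrind. The LockOrderChecker class and its method names are invented here, and a real version would need a per-thread list of held locks plus its own synchronization.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Records which locks were held when another lock was acquired, and reports
// when the reverse ordering of the same pair has already been observed.
class LockOrderChecker {
public:
    // Call when a code path acquires the lock called `name`.
    // Returns false if a currently held lock was previously acquired
    // while `name` was held, i.e. a potential lock-order inversion.
    bool acquire(const std::string& name) {
        bool consistent = true;
        for (const std::string& held : held_) {
            if (held == name) continue;  // re-entrant acquisition, ignore
            if (seen_.count(std::make_pair(name, held)) > 0) consistent = false;
            seen_.insert(std::make_pair(held, name));
        }
        held_.push_back(name);
        return consistent;
    }

    // Call when the most recently acquired lock is released.
    void release() {
        assert(!held_.empty());
        held_.pop_back();
    }

private:
    std::vector<std::string> held_;  // locks held by the current path, in order
    std::set<std::pair<std::string, std::string>> seen_;  // (earlier, later) pairs
};
```

Replaying the sendfrom path (acquire "cs_mapWallet", then "cs_main", release both) followed by the message path (acquire "cs_main", then "cs_mapWallet") makes the last acquire return false, which is the inversion ArtForz spotted by hand.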

ShadowOfHarbringer
March 25, 2011, 01:35:26 PM
 #8

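Quote from: Gavin Andresen on March 25, 2011, 12:35:19 PM
Quote from: Mike Hearn on March 25, 2011, 11:35:48 AM
Oops. Should RPCs be run with the BFL held?
D'oh!

sendfrom should definitely CRITICAL_BLOCK(cs_main).  Nice catch ArtForz.

Which version will the patch be scheduled for?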

ArtForz
March 25, 2011, 01:52:10 PM
 #9

Well, a quick manual check suggests that, for cs_main + cs_mapWallet, only sendfrom and sendmany in rpc.cpp are doing the wrong thing.

mikegogulski
March 25, 2011, 01:55:38 PM
 #10

@Gavin: Document? Always a good thing. This is tricky stuff, as ArtForz has shown. My own experience goes like: 1: If you don't really have to lock, push into a serial action queue; 2: when you really do have to lock, prepare everything beforehand, then lock, alter and unlock as swiftly as possible; and 3: er, yeh, document, at least so that you can recall what the heck you were up to when you decided you needed that lock.

Obviously, this becomes real hard when we're dealing with what are essentially library primitives for manipulating the dataset.

If I were sober at the moment I'd produce a precompiler macro that would flag potential nested locks in the control flow. Fortunately, I'm not sober.
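
As an illustration of point 2 above (prepare everything first, then lock, alter, and unlock quickly), here is a minimal C++ sketch. It is unrelated to bitcoind's actual data structures: ledger, ledger_mutex, format_entry, and record_payment are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

std::mutex ledger_mutex;          // guards `ledger` only
std::vector<std::string> ledger;  // hypothetical shared state

// Pure helper: touches no shared state, so it needs no lock.
std::string format_entry(const std::string& account, long amount) {
    return account + ":" + std::to_string(amount);
}

void record_payment(const std::string& account, long amount) {
    // Validation and formatting happen before the lock is taken.
    assert(amount >= 0);
    std::string entry = format_entry(account, amount);

    {
        // Critical section: a single push_back, then the lock is released.
        std::lock_guard<std::mutex> lock(ledger_mutex);
        ledger.push_back(std::move(entry));
    }
}
```

Keeping the critical section this small also means no other lock is taken while ledger_mutex is held, so this path cannot take part in a lock-order inversion.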

ArtForz
March 25, 2011, 02:26:12 PM
 #11

Another one
setaccount
    CRITICAL_BLOCK(cs_mapAddressBook)
        GetAccountAddress(strOldAccount)
            CRITICAL_BLOCK(cs_mapWallet)

processmessages:
CRITICAL_BLOCK(cs_main)
    ProcessMessage(pfrom, strCommand, vMsg)
        AddToWalletIfMine()
            AddToWallet(wtx)
                CRITICAL_BLOCK(cs_mapWallet)
                    walletdb.WriteName(PubKeyToAddress(vchDefaultKey), "")
                        CRITICAL_BLOCK(cs_mapAddressBook)

mndrix (OP)
Michael Hendricks
March 25, 2011, 02:28:24 PM
 #12

Well done.  Let me know when a patch makes it into a beta/nightly build and I'll run it in production to test.
jgarzik
March 26, 2011, 05:28:12 PM
 #13

Pull request: https://github.com/bitcoin/bitcoin/pull/136

Direct link to commit (patch): https://github.com/jgarzik/bitcoin/commit/4feff786546448e2c436956ad77b9081167e3124

Unfortunately the commit is larger than it should be for easy reading, because large blocks of code were un-indented.


slush
April 02, 2011, 06:47:29 PM
 #14

Today I had a problem similar to mndrix's: bitcoind froze during payouts. It was the second time in the pool's history, but the first time with the sendmany command.

mndrix, did you successfully test jgarzik's patch?

mndrix (OP)
Michael Hendricks
April 11, 2011, 03:15:15 PM
 #15

Quote from: slush on April 02, 2011, 06:47:29 PM
mndrix, did you successfully test jgarzik's patch?

I haven't tested the patch.  My feeble attempts to compile Bitcoin from source have failed (which speaks to my ignorance, not to a problem with Bitcoin).  Does anyone know if the patch is available in a release candidate build for Linux yet?
slush
April 11, 2011, 09:17:42 PM
 #16

This patch is already in bitcoin upstream; it looks like more people watched it :). I'll try to use it in the pool tomorrow...

mndrix (OP)
Michael Hendricks
April 15, 2011, 05:16:49 PM
 #17

Coin{Pal,Card} are now running a nightly build including the deadlock changes.  I'll report here if bitcoind hangs again.
Stephen Gornick
April 16, 2011, 06:18:04 AM
 #18

Just an FYI -- gjs278 shared a monit script to restart:

  Restart bitcoind automatically if it crashes or dies using Monit:
    http://bitcointalk.org/index.php?topic=5911.0


Stephen Gornick
April 19, 2011, 06:21:32 AM
 #19

Quote from: mndrix on April 15, 2011, 05:16:49 PM
Coin{Pal,Card} are now running a nightly build including the deadlock changes.  I'll report here if bitcoind hangs again.

Was CoinPal's April 18th service issue related to this? Your post mentioned "I've restarted some server components and the site appears to be working fine now".


mndrix (OP)
Michael Hendricks
April 22, 2011, 02:28:52 PM
 #20

Quote from: Stephen Gornick on April 19, 2011, 06:21:32 AM
Quote from: mndrix on April 15, 2011, 05:16:49 PM
Coin{Pal,Card} are now running a nightly build including the deadlock changes.  I'll report here if bitcoind hangs again.

Was CoinPal's April 18th service issue related to this? Your post mentioned "I've restarted some server components and the site appears to be working fine now".

I'm glad you brought this up, sgornick.  I should have mentioned it here.  That particular outage was caused by an error in my code that leaked open file handles.  It wasn't related to bitcoind.

Since upgrading to a nightly build on April 15th, I haven't had any problems with bitcoind hanging.  I almost certainly would have seen one by now if the problem were still present.  Thanks all for your help diagnosing and fixing the bug.