I'm describing the problem here from my own setup: bitcoind runs on my desktop (which has a slow Atom CPU and is frequently under load from other things), and poclbm runs on a dedicated miner with a 5970.
The problem is this: after running bitcoind for a while, I see "Problems communicating with bitcoin RPC, data: None" messages on the miner. The frequency of these increases over time (presumably with the size of bitcoind's transaction cache). The getwork interval is set to 5s.
It has been suggested to use the git version with -limitfreerelay=1. I tried that and it may have taken off some pressure, but the problem still occurs.
After a couple of hours my miner doesn't do any work anymore, because getwork fails continuously.
I'm quoting from #bitcoin-dev to elaborate the problem and possible workarounds and/or solutions:
<molecular> ArtForz, am running into slight getwork troubles at 66 tx cache size already? does that make sense?
<ArtForz> nope
<tcatm> molecular: what's the trouble? getwork taking long time to return?
<ArtForz> do you have a really slow disk or high I/O load?
<molecular> yes
<molecular> getwork takes several seconds
<molecular> at some point it takes longer than my getwork interval (currently 5s)
<molecular> I have slowish cpu (atom)
<ArtForz> yeah, that might do it
<molecular> which is also under load from other shit since it's my desktop
This made sense to me. ArtForz kept analysing:
<ArtForz> can you check if it's actually pegging the CPU?
<ArtForz> because here it seemed to be more I/O than CPU bound
<molecular> I can't see it using cpu in htop. trying to verify that
<molecular> there's some iotop app? what's it called?
<ArtForz> iotop ?
I'm emerging iotop on my desktop, while the chatter continues:
<ArtForz> what stalls getwork is CreateNewBlock
<ArtForz> and I suspect *that* is more I/O than cpu bound
<ArtForz> I don't really know why though, it doesnt *look* like it does lots of I/O
<molecular> CreateNewBlock is in O(n) with n == <number of tx in cache>?
<ArtForz> yes
<molecular> ArtForz, why do you think IO might be the problem?
<molecular> it's all in RAM
<ArtForz> yes, it does a fopen/fseek/fread/fclose for blk0001.dat
<tcatm> so that's probably what slows it down
<ArtForz> (if the input is from a tx thats already in a block)
<ArtForz> and it does a lookup in blkindex DB for every call, too
<molecular> "if (!txPrev.ReadFromDisk(txdb, txin.prevout, txindex))" <- you mean this, tcatm?
<tcatm> molecular: yep
<ArtForz> yeah, that sounds like it might cause slowdowns, especially if you dont have enough free memory to keep blk0001 cached
<tcatm> how does that code find the transaction in blk0001.dat? is there an index for txhash -> blkhash?
<ArtForz> yes
<ArtForz> blkindex.dat for txhash->offset
<ArtForz> urrr... why the F are we not caching that stuff?
<molecular> slush, workaround will relieve the pain, but in the end we should fix stuff in bitcoin
<ArtForz> it's not like tx can magically appear in blocks while no new block comes along
<molecular> I think ArtForz might've just identified the root of the problem?
<ArtForz> could be...
<ArtForz> can't think of a elegant way to work around it though
<ArtForz> thats... really weird
<ArtForz> well, we do 2 things really with that prev tx
<ArtForz> 1. check if it's in a block (can be cached between block updates)
<ArtForz> 2. if it is, get the value of the output referred to (same thing)
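To make the caching idea from the last few lines concrete, here is a minimal sketch, not actual bitcoind code. It assumes the two facts ArtForz lists (is the previous tx in a block, and what is the value of the referenced output) can be remembered per input and only thrown away when a new block comes along. All names here (OutPoint, PrevOutInfo, PrevOutCache, OnNewBlock) are hypothetical, and the slow lookup is passed in as a callback standing in for the blkindex.dat / blk0001.dat read.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for "previous tx hash + output index".
using OutPoint = std::pair<std::string, uint32_t>;

// The two things we need to know about a previous output.
struct PrevOutInfo {
    bool inBlock;   // 1. is the previous tx already in a block?
    int64_t value;  // 2. if so, the value of the referenced output
};

class PrevOutCache {
public:
    // The loader represents the expensive path (index lookup plus
    // fopen/fseek/fread/fclose); it is an assumption of this sketch.
    using Loader = std::function<PrevOutInfo(const OutPoint&)>;

    explicit PrevOutCache(Loader slowLookup)
        : slowLookup_(std::move(slowLookup)) {}

    // Return the cached answer; hit the slow path only on a miss.
    const PrevOutInfo& Get(const OutPoint& out) {
        auto it = cache_.find(out);
        if (it == cache_.end()) {
            it = cache_.emplace(out, slowLookup_(out)).first;
            ++misses_;
        }
        return it->second;
    }

    // Tx cannot appear in blocks unless a new block arrives, so this
    // is the only point where cached answers can go stale.
    void OnNewBlock() { cache_.clear(); }

    std::size_t Misses() const { return misses_; }

private:
    Loader slowLookup_;
    std::map<OutPoint, PrevOutInfo> cache_;
    std::size_t misses_ = 0;
};
```

With something like this, repeated getwork calls between two blocks would pay the disk cost once per input instead of once per input per call. Clearing the whole cache on every new block is the simplest invalidation rule; a real fix would presumably also have to handle chain reorganisations.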
I'm stopping here and posting a link back to #bitcoin-dev...