Title: Performance of Account structures in bitcoind
Post by: arosca on April 26, 2014, 04:40:48 AM

I was curious to see what kind of negative performance effect a large number of accounts has on bitcoind. The results are not pretty. My tests were not particularly scientific, but here's what I've learned.
Methodology

I created 50K accounts in an empty wallet with a small balance. The resulting wallet file is approximately 13MB. Creating accounts takes approximately 0.03 seconds per account.

Code:
for N = 1 to 5e4

I then executed a sequence of 10K random transfers between these accounts. These transfers take approximately 0.1 seconds each, on average. Again, this is in an empty wallet, and all of these transfers are internal to the wallet (no transactions are actually sent to the bitcoin network).

Code:
client.move(account1, account2, smallAmount)

Next, I executed the following sequence of external transfers (i.e. actual network transactions), on each transfer sending funds from a random account to another random account.
Code:
account1 = getRandomAccount()

Results

After each step above, I recorded the size of the wallet file, the time it took for bitcoind to start up (i.e. to initialize by reading the wallet and other database files), and the time it took to actually execute the transfers. Here is the summary of my results:

http://i.snag.gy/1Zh8Z.jpg

The results are surprisingly bad. The wallet.dat file ballooned to 85MB (!) after only 660 transfers. I have no idea what could possibly take up so much space, but I'll try to inspect the file using BerkeleyDB tools and will add to this post if I gain some insight.

The really bad news is that transfers end up taking several seconds each, on average. As expected, the duration increases as the number of transactions in the wallet goes up. I inspected the bitcoind logs and it appears that most of the delay is because wallet.dat is flushed to disk after each transfer.

Other Observations

I was able to severely corrupt the wallet file by terminating the bitcoind process. I did not lose any keys, but the account balance information was corrupted. In essence, I was able to lose track of the correct balance in each account without any effort at all.

Conclusions

Others have said, both here in this forum and elsewhere: don't use Accounts in a server environment. More importantly, bitcoind itself does not seem to be suitable for any type of system where a large number of transactions is expected to occur. A different solution is needed. There is only one commercial, enterprise-level solution I am aware of (https://bitsofproof.com/?page_id=323).

Additionally, BerkeleyDB (which bitcoind uses to store account, address, and all other data) does not appear to be a sufficiently robust solution if you really care about account balances. I do not know enough about it to comment, but it is possible that it would perform better if it were implemented differently.
For example, I would like to see an option for transactional replication of all wallet data to a separate disk or server. This would at least ensure that an internally consistent copy of the wallet database exists. As things stand now, if the wallet file gets corrupted, everything is lost, and I was able to corrupt the file very easily (and unintentionally).

I am contemplating starting an open source alternative to the built-in bitcoind account management infrastructure. It would still use bitcoind for interfacing with the network, but would use a more robust database setup to store and handle account data. More about this in a separate post.

Title: Re: Performance of Account structures in bitcoind
Post by: gmaxwell on April 26, 2014, 09:33:29 AM

Quote
I was able to severely corrupt the wallet file by terminating the bitcoind process. I did not lose any keys, but the account balance information was corrupted. In essence, I was able to lose track of the correct balance in each account without any effort at all.

Can you provide some more information here? Were you running the release binaries? What version? What operating system? How did you kill the process? What state was it in when you brought it back up? What errors did you receive? Would it be possible for you to provide the corrupted wallet and database/ directory to me?

I ask because last year I ran a loop killing the process under load for more than a month, killing it thousands and thousands of times trying to tease out some rare issues, and was not able to generate a single instance of corruption that way. Before I start trying to reproduce your experience I want to have a comparable setup.

Generally, use of the 'account' functionality is not recommended: it wasn't designed for what most people who try to use it expect to use it for, and other methods (which support durability across hardware failure) should be used instead.
With regard to large amounts of transactions, there I must disagree: for better or worse, some of the largest bitcoin-using sites collect their transactions in a wallet that uses bitcoind. Unfortunately, none of the people interested in those high transaction load applications are contributing to the code base, but they tell me that they don't need to because it currently works for them with reasonable considerations.

If you've automated your tests enough that they could be run against a testnet/regtest wallet out of a script, it might be useful to get them imported into the integration testing used for Bitcoin Core; it's quite shy on wallet-related tests.

Quote
The really bad news is that transfers end up taking several seconds each, on average

I assume you were spending unconfirmed coins in these transactions? Taking several seconds per spend is a known artifact of the current software behavior: the code that traverses unspent coins has factorial-ish complexity. While it could be improved (there are patches available, and simply disabling spending unconfirmed outputs avoids it), since the overall network capacity is not very great I've mostly considered this bug helpful at discouraging inept denial of service attacks, so I haven't personally considered it a priority. (And most of the people who've noticed it who have mentioned it to me appear to have just been conducting tests or attempting denial of service attacks…)

Title: Re: Performance of Account structures in bitcoind
Post by: arosca on April 26, 2014, 01:58:08 PM

Quote
Can you provide some more information here? Were you running the release binaries? What version? What operating system? How did you kill the process? What state was it in when you brought it back up? What errors did you receive? Would it be possible for you to provide the corrupted wallet and database/ directory to me?

I want to start by saying I think bitcoind overall is solid. This whole experiment started informally.
A friend of mine is working on a project that requires accounts and I'm mostly exploring the topic out of curiosity. I saw a lot of posts recommending not to use the account features, and I wanted to see for myself how far I could take things before they break.

I was running an older version (which happened to be installed with my Armory instance), 8.2.2-beta (80202). You're absolutely right, I should probably try this again on the latest version.

I'll be happy to provide the database files to you (it's all on testnet), but they are currently very large. The wallet.dat file is 85MB. Please contact me directly and I'll send you a download link.

I am running on Windows 7 and making calls from Python 2.7. I didn't intentionally kill the process, but when I initially set up my code I used this construct, which seems to have caused the problem:

Code:
process = subprocess.Popen([r'C:\Program Files (x86)\Bitcoin\daemon\bitcoind.exe', '-testnet', '-rpcuser=test', '-rpcpassword=test1'])

Code:
process = subprocess.Popen([r'C:\Program Files (x86)\Bitcoin\daemon\bitcoind.exe', '-testnet', '-rpcuser=test', '-rpcpassword=test1'])

Code:
Warning: Warning: error reading wallet.dat! All keys read correctly, but transaction data or address book entries might be missing or incorrect.

Code:
>>> client.getbalance()

Quote
I assume you were spending unconfirmed coins in these transactions? Taking several seconds per spend is a known artifact of the current software behavior: the code that traverses unspent coins has factorial-ish complexity. While it could be improved (there are patches available, and simply disabling spending unconfirmed outputs avoids it), since the overall network capacity is not very great I've mostly considered this bug helpful at discouraging inept denial of service attacks, so I haven't personally considered it a priority.
(And most of the people who've noticed it who have mentioned it to me appear to have just been conducting tests or attempting denial of service attacks…)

I'm not sure, but I believe the inputs were all confirmed. I started out with 5 confirmed BTC and sent 0.0001 to a random address in the wallet on each iteration. Code is below.

It seems to me that all or most of the delay was not in code but rather in disk operations, and more specifically in flushing wallet.dat (which is now 85MB). In any case I don't consider this to be a major issue.

This is the code I used in my test:

Populate wallet with 50K accounts and test duration of moving funds internally between accounts:

Code:
import subprocess

Perform external transfers between accounts:

Code:
import subprocess

Title: Re: Performance of Account structures in bitcoind
Post by: DocJeff on April 26, 2014, 03:54:40 PM

Quote
I was curious to see what kind of negative performance effect a large number of accounts has on bitcoind. The results are not pretty. My tests were not particularly scientific, but here's what I've learned.

https://en.bitcoin.it/wiki/Accounts_explained

From the wiki:

Code:
Account Weaknesses

Title: Re: Performance of Account structures in bitcoind
Post by: arosca on April 26, 2014, 08:06:10 PM

Quote
https://en.bitcoin.it/wiki/Accounts_explained
From the wiki:
Code:
Account Weaknesses

Yep, no doubt, but I wanted to quantify this and see what the limits of bitcoind are in terms of managing accounts. How many users can I realistically handle before I run into trouble?

Title: Re: Performance of Account structures in bitcoind
Post by: 2112 on April 26, 2014, 08:38:11 PM

Quote
How many users can I realistically handle before I run into trouble?

One, before running into trouble. With two users the trouble starts: the bitcoind "accounts" are unlike any other "accounts" anywhere in the known universe. Any accountant will object to using them because they violate the principles of accounting.

As with many things in Bitcoin, there is however an unexpected benefit: the enterprises interested in using the built-in accounts have a history of losing customers' bitcoins due to fraud or gross negligence. The two best-known cases are Instawallet and BitFloor. Again, as with many things Bitcoin, it is hard to come by definite proof of a cause-effect relationship in enterprises that are by design made un-auditable and un-accountable.
But it seems to be a useful quick litmus test.

Title: Re: Performance of Account structures in bitcoind
Post by: arosca on April 26, 2014, 09:12:17 PM

After everything that I've read and experienced, I completely agree. I think the fundamental principles of how accounts are implemented are OK (you can get a list of transactions that "explain" the balance in each account), but the technical implementation is not great. In my opinion this is in part due to the BerkeleyDB implementation. A more robust database solution is needed to handle accounts, including an option for offsite transactional replication. I don't know enough about BerkeleyDB, but it does appear that it supports replication. This feature is not implemented in bitcoind, however.
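To make the idea of keeping account balances in a separate database concrete, here is a minimal sketch and nothing more. It assumes SQLite, through Python's standard sqlite3 module, as a stand-in for a "more robust database"; nobody in this thread proposed SQLite specifically. The table layout and the function names (open_ledger, credit, move, balance) are hypothetical and are not part of bitcoind. Each internal transfer is written as two rows inside one database transaction, so a crash leaves either both rows or neither, and every balance is still "explained" by a list of entries.

```python
import sqlite3


def open_ledger(path=":memory:"):
    """Open (or create) the ledger. The schema is a hypothetical illustration."""
    db = sqlite3.connect(path)
    with db:
        db.execute(
            "CREATE TABLE IF NOT EXISTS entries ("
            " id INTEGER PRIMARY KEY,"
            " account TEXT NOT NULL,"
            " amount INTEGER NOT NULL,"  # signed amount in satoshis
            " memo TEXT)"
        )
        db.execute(
            "CREATE INDEX IF NOT EXISTS entries_by_account ON entries(account)"
        )
    return db


def balance(db, account):
    """A balance is simply the sum of all entries recorded for the account."""
    row = db.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM entries WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]


def credit(db, account, amount, memo=""):
    """Record an incoming deposit (e.g. after a payment has confirmed)."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with db:  # commits on success, rolls back if an exception is raised
        db.execute(
            "INSERT INTO entries (account, amount, memo) VALUES (?, ?, ?)",
            (account, amount, memo),
        )


def move(db, src, dst, amount, memo=""):
    """Internal transfer: debit and credit are committed together or not at all."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with db:
        if balance(db, src) < amount:
            raise ValueError("insufficient funds")
        db.execute(
            "INSERT INTO entries (account, amount, memo) VALUES (?, ?, ?)",
            (src, -amount, memo),
        )
        db.execute(
            "INSERT INTO entries (account, amount, memo) VALUES (?, ?, ?)",
            (dst, amount, memo),
        )


if __name__ == "__main__":
    ledger = open_ledger()
    credit(ledger, "alice", 1000, "deposit")
    move(ledger, "alice", "bob", 250, "internal transfer")
    print(balance(ledger, "alice"), balance(ledger, "bob"))
```

Amounts are stored as integer satoshis to avoid floating-point rounding. In a design like this, bitcoind would only be asked to receive and send actual network transactions, while replication and backup would be handled by whatever the chosen database provides.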
Title: Re: Performance of Account structures in bitcoind
Post by: wumpus on April 27, 2014, 06:35:04 AM

Quote
Others have said, both here in this forum and elsewhere: don't use Accounts in a server environment.

We already know that. You could just have asked :) It is one of the worst parts of the bitcoind code.

Everyone wants something else from the account system, but the conclusion is that it belongs at a higher level (with the database), not with the wallet. Maintaining third-party balances is not part of the responsibility of Bitcoin Core. There are plans to completely remove the account system in a future revision of the JSON RPC API (see https://github.com/bitcoin/bitcoin/issues/3816 ). Labelling of addresses will be kept, but not accounts-with-balances.

Quote
I am contemplating starting an open source alternative to the built-in bitcoind account management infrastructure. It would still use bitcoind for interfacing with the network, but would use a more robust database setup to store and handle account data. More about this in a separate post.

Great idea. That's what it should be, a solution on top.

Title: Re: Performance of Account structures in bitcoind
Post by: arosca on April 27, 2014, 01:14:19 PM

Quote
Great idea. That's what it should be, a solution on top.

I posted some ideas here: https://bitcointalk.org/index.php?topic=586013.0

I appreciate any feedback!