Bitcoin Forum
Author Topic: fuck you wallet format!!!! (RANT)  (Read 3190 times)
kokjo (OP)
Legendary
*
Offline Offline

Activity: 1050
Merit: 1000

You are WRONG!


View Profile
May 04, 2012, 07:25:58 AM
 #1

I have one question in mind right now: "WHY???"
WHY have you designed the wallet format this way? It sucks!
You are even using dbname/tables, but you only have one table in wallet.dat (the "main" table), and you are using it incorrectly.
I'm reading this code right now:
https://github.com/joric/pywallet/blob/master/pywallet.py#L1283

WHY have you not put each type in its own table? I don't get it! It's a mess.
A table for settings, one for keys, one for accounts, ...
And I see you are using public keys as the database key. Why not use the hash/address as the key to both the public key and the private key?

What is the benefit of this madness? The dude that came up with this db scheme is a bad database coder.


/rant over

"The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts." -Bertrand Russell
check_status
Full Member
***
Offline Offline

Activity: 196
Merit: 100


Web Dev, Db Admin, Computer Technician


View Profile
May 04, 2012, 12:58:44 PM
 #2

Is it insecure, slow, poorly arranged code?
If it's python, can't you modify it and use your own improvements? Submit your improvements to joric?

For Bitcoin to be a true global currency the value of BTC needs always to rise.
If BTC became the global currency & money supply = 100 Trillion then ⊅1.00 BTC = $4,761,904.76.
P2Pool Server List | How To's and Guides Mega List |  1EndfedSryGUZK9sPrdvxHntYzv2EBexGA
kokjo (OP)
Legendary
*
Offline Offline

Activity: 1050
Merit: 1000

You are WRONG!


View Profile
May 04, 2012, 01:07:20 PM
 #3

Is it insecure, slow, poorly arranged code?
If it's python, can't you modify it and use your own improvements? Submit your improvements to joric?
It's not the code, it's the structure of the wallet database. The code looks fine. I would have written it another way, but that's irrelevant.

"The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts." -Bertrand Russell
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
May 04, 2012, 01:10:47 PM
 #4

I think the fact that it is a database in the first place is a silly design decision.

Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
May 04, 2012, 01:35:07 PM
 #5

I believe the use of only one table is the result of the glue layer on top of BDB (which I suppose was designed by Satoshi). This way, only one open database object needs to be kept around per file. It's an ugly system, but it seems reasonable at least. As far as I know, Satoshi didn't really like the idea of alternate clients, so I don't think he designed with ease of interoperability in mind. Once that layer was in place, it was probably all too easy to add fields to (mostly) the wallet for various pieces of data (and I did so as well, later on...).

Regarding the choice for BDB in the first place: it makes sense for the transaction/block index database. These are large, need frequent updates and queries, and need a high degree of transaction atomicity. For wallets and IP addresses, I believe better solutions are possible. Both are only read at startup, and subsequently updated but mostly appended to. I hope we can move to a better format (especially for the wallet) soon. One that isn't prone to corruption under flaky hardware (perhaps by only appending to it, and occasionally rewriting it), isn't linked to a fixed database environment directory, and is backward compatible...
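
To make the "only appending to it" idea concrete, here is a minimal Python sketch of an append-only record log. It is not the format Pieter experimented with; the 8-byte header (payload length plus CRC32) and the function names are assumptions chosen for illustration.

```python
import os
import struct
import zlib

# Assumed record layout: 4-byte payload length, 4-byte CRC32, then the payload.
HEADER = struct.Struct("<II")


def append_record(path, payload):
    """Append one length-prefixed, checksummed record and flush it to disk."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())


def read_records(path):
    """Replay the log, stopping at the first truncated or corrupt record."""
    with open(path, "rb") as f:
        data = f.read()
    records = []
    pos = 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        payload = data[pos + HEADER.size : pos + HEADER.size + length]
        if len(payload) != length or zlib.crc32(payload) != crc:
            break  # torn write at the tail: keep everything before it
        records.append(payload)
        pos += HEADER.size + length
    return records
```

A write that is interrupted halfway only damages the tail, so a reader can still recover every earlier record. The occasional rewrite would copy the live records into a fresh file.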

I do Bitcoin stuff.
kwukduck
Legendary
*
Offline Offline

Activity: 1937
Merit: 1001


View Profile
May 04, 2012, 04:46:24 PM
 #6

I think the fact that it is a database in the first place is a silly design decision.

^^ this

14b8PdeWLqK3yi3PrNHMmCvSmvDEKEBh3E
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
May 04, 2012, 05:16:58 PM
 #7

I don't think it being a database is a silly idea (maybe that is because I work with databases every day). More applications than you think use databases internally. Any mission-critical I/O is going to need support for:

atomic transactions
read/write verification
high level of availability
dataloss recovery
backups

Once you implement all that, you have just built a good chunk of the "plumbing" that makes a database, and it likely isn't as good as databases that already have tens of thousands of hours of development behind them.

Still, the "db" as used by the wallet is horrible. It really isn't a database. More like a loosely typed flat file stuffed inside a database. So you get all the disadvantages of flat files combined with all the complexity and disadvantages of a database. Then it is compounded by the "weird" choice of Berkeley DB over SQLite. Anyone have insight into that decision, or was it simply a case of "using what you know"?

Having everything stored as varchar and recorded as a set of type, key, value is ..... yuck.
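
For readers who have not looked inside the file, the following Python sketch shows roughly what "type, key, value in one table" means and what it costs a reader. The byte layout (a one-byte length prefix followed by the type name at the front of each key) and the sample records are assumptions for illustration, not a dump of a real wallet.dat.

```python
from collections import defaultdict

# Hypothetical records as they might sit in the single "main" table:
# the record type is packed into the front of the key.
flat_table = {
    b"\x04name" + b"1ExampleAddr": b"savings",
    b"\x03key" + b"<pubkey bytes>": b"<privkey bytes>",
    b"\x07version": b"\x01\x00\x00\x00",
}


def split_by_type(table):
    """Group flat records into per-type tables by parsing the key prefix."""
    tables = defaultdict(dict)
    for raw_key, value in table.items():
        n = raw_key[0]  # one-byte length of the type name
        rec_type = raw_key[1 : 1 + n].decode("ascii")
        tables[rec_type][raw_key[1 + n :]] = value
    return dict(tables)
```

Every reader has to do this kind of prefix parsing before it can answer a question like "give me all the keys", which is the work separate tables would have done for free.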

@Pieter:
I never realized Satoshi didn't appreciate the need for alt clients. Strange that he saw the danger of centralization everywhere else but didn't notice the danger that centralization of development would bring. Honestly, IMHO even the term "alt client" is dangerous. Hopefully in time Bitcoin will evolve to a point where there are simply compatible clients, and the project that began as the Satoshi client (under whichever name it evolves to) is just one of many peers.
randomproof
Member
**
Offline Offline

Activity: 61
Merit: 10


View Profile
May 04, 2012, 05:21:21 PM
 #8

You would think that the wallet only stored the public/private key pairs.  In that case it could have been a flat file.  But it seems that all transaction history is stored in there, too, which is what is really complicating things.  The issue I see is that there are so many files in the data directory.  I've read somewhere that this version of bitcoin was not intended to take off, so maybe that's part of the reason.  I'd suggest using an SQL database of the blockchain which would make the job of pruning unneeded data (redeemed transactions) easier, then you could execute some simple SQL statements and get address balances and transaction details.  You could also build in most of blockexplorer.com into the client.
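
A rough Python sketch of the "simple SQL statements" idea, using the standard sqlite3 module. The `outputs` table, its columns, and the sample rows are made up for illustration; no client is claimed to use this schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE outputs (
    txid     TEXT    NOT NULL,
    vout     INTEGER NOT NULL,
    address  TEXT    NOT NULL,
    satoshis INTEGER NOT NULL,
    spent_by TEXT,              -- NULL while the output is unspent
    PRIMARY KEY (txid, vout)
);
CREATE INDEX outputs_by_address ON outputs (address);
""")
conn.executemany(
    "INSERT INTO outputs VALUES (?, ?, ?, ?, ?)",
    [
        ("tx1", 0, "addrA", 5000, None),
        ("tx1", 1, "addrB", 2000, "tx2"),
        ("tx2", 0, "addrA", 1500, None),
    ],
)


def balance(conn, address):
    """Sum the unspent outputs paying to one address."""
    row = conn.execute(
        "SELECT COALESCE(SUM(satoshis), 0) FROM outputs "
        "WHERE address = ? AND spent_by IS NULL",
        (address,),
    ).fetchone()
    return row[0]
```

With a layout like this, pruning redeemed outputs would be a single `DELETE ... WHERE spent_by IS NOT NULL`, which is the kind of convenience the post is pointing at.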

Donations to me:   19599Y3PTRF1mNdzVjQzePr67ttMiBG5LS
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
May 04, 2012, 07:28:15 PM
 #9

Any mission critical I/O is going to need support for:

atomic transactions
read/write verification
high level of availability
dataloss recovery
backups

These benefits have not been brought to Bitcoin by the choice of using a database to hold the wallet.  While I agree that a real database system should be used for a mission critical database, I don't regard a bitcoin wallet as a database.  I see it as a document, like an excel spreadsheet.  Using a database engine for a bitcoin wallet to me is as ridiculous as using a power drill screwdriver to take apart a pocket watch.

In my view, the wallet (containing keypairs) should totally be separate from the file that contains the transaction history.  You don't use your real wallet to keep all of your receipts for everything you bought since you were twelve, neither should Bitcoin do the same.

Atomic transactions: The wallet should be nothing more than a repository of keypairs, so the only transaction that should ever need to take place is the occasional topping up of the keypool, and perhaps saving labels attached to addresses. (For example, a keypool created with 500 keypairs, and topped back up to 500 when the reserve reaches 100, would make transactions and/or modification of the wallet happen so rarely that it would hardly be inefficient to achieve atomicity simply by writing a brand new file, swapping it with the old one via renaming, and then deleting the old one, as though it were a document.)
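
The write-new-file-then-rename approach described above can be sketched in a few lines of Python. The function name and the temporary file prefix are assumptions; `os.replace` swaps the new file over the old one in a single step, so there is no separate delete.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data`, all or nothing."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".wallet-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # get the bytes on disk before the swap
        os.replace(tmp_path, path)  # atomic rename on POSIX; overwrites on Windows
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader opening `path` sees either the complete old wallet or the complete new one, never a half-written mix.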

Read/write verification: Nothing unique to a database - if verification is needed, this is easy to code in the client.  Write file, close file, read file, verify file (or hash).
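
As a sketch of the write, close, read, verify sequence in Python (the function name is hypothetical, and SHA-256 is just one reasonable choice of hash):

```python
import hashlib


def write_verified(path, data):
    """Write `data`, reopen the file, and confirm the bytes hash the same."""
    expected = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected:
        raise IOError("wallet file failed read-back verification")
    return actual
```

One caveat: the read-back may be served from the operating system's cache rather than the physical disk, so this catches software-level mistakes more reliably than failing hardware.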

High level of availability:  How does bdb increase availability as compared to say a flat xml file?  If anything, it has hindered it, judging by the number of people who have gotten "critical DB errors" and lost access to their wallet.

Data loss recovery:  If anything, people have lost more wallets as a result of bdb than have ever gained from it.  At least with XML you can manually hack together in a text editor.  bdb, no way.

Backups: Another place where bdb has offered no benefit, and the unusual placement (especially on Windows clients, as compared to the typical Windows user) has made backups difficult for the average user.  Rather, the client should treat a wallet like a document, the same way Microsoft Word treats a .doc file as a document, where a user can back it up the same way they'd back up a letter, such as by clicking File - Save As - and then choosing removable media as the destination for the file, or by dragging a copy out of their Documents folder and onto their flash drive etc.



Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
May 05, 2012, 02:53:40 AM
 #10

I think you're preaching to the choir. I don't think many people consider bdb for the wallet a good idea. I've already experimented with my own format to function as a drop-in replacement for the database file. It doesn't look too hard, but obviously compatibility issues will complicate things.

I do Bitcoin stuff.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
May 05, 2012, 03:43:55 AM
 #11

You know what would be really nice down the road is if the database layer was abstracted in a way that allowed someone to use their own SQL server as the data store for the block chain, but defaulted to something statically linked for the sake of slimness for those who won't use this feature.  Likewise, the "in-memory transaction pool" also ought to be maintained as a database table.  This would help with the development of other applications (like payment processing, shopping carts) without requiring them to hack or tangle with bitcoind.  Likewise, I'm sure someone else has thought of this first.

Mike Hearn
Legendary
*
expert
Offline Offline

Activity: 1526
Merit: 1129


View Profile
May 05, 2012, 07:28:31 PM
 #12

I don't understand the request to have wallets not contain transactions. You need transactions to create spends, which is what a wallet is for.

Satoshi's wallet design was clearly built with lightweight/SPV mode in mind, though the rest of it was never fully implemented. In that design you MUST store the transactions that are relevant to the keys in your wallet.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
May 05, 2012, 08:07:22 PM
 #13

I don't understand the request to have wallets not contain transactions. You need transactions to create spends, which is what a wallet is for.

Satoshi's wallet design was clearly built with lightweight/SPV mode in mind, though the rest of it was never fully implemented. In that design you MUST store the transactions that are relevant to the keys in your wallet.

In my mind, that need should be accommodated by the client maintaining an index that allows a rapid lookup of all of the transactions that are associated with any given hash160, directly from the block chain... the same way it already maintains an index of all unspent transactions.

If it worked this way, then it would be trivial for a user to close one wallet (File - Close) and open another (File - Open), the same way I might close one spreadsheet and open another.  And commands like importprivkey would run rather instantly (or O(log n) to be specific).

Right now, the idea that one must perform a lengthy "rescan" to switch to another wallet defies all common sense from the perspective of a typical user, and adds no useful benefit (except perhaps the non-consumption of the disk space that such an index would require).

That index, by the way, doesn't have to be exhaustive to be effective... a hashtable index that had just 32 bits of key and 32 bits of block reference (at the expense of collisions causing occasional unnecessary reads of blocks) would still be highly useful for finding all relevant transactions given a set of addresses in a reasonable timeframe without being unduly large.
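
A small Python sketch of such a lossy index, assuming the 32-bit key is simply the first four bytes of the hash160 and the block reference is a block height. The class and method names are hypothetical.

```python
import struct
from collections import defaultdict


def short_key(hash160):
    """First 32 bits of the hash160, as an unsigned integer."""
    return struct.unpack(">I", hash160[:4])[0]


class AddressIndex:
    def __init__(self):
        self._buckets = defaultdict(set)  # 32-bit key -> block heights

    def add(self, hash160, height):
        """Record that a block mentions this hash160."""
        self._buckets[short_key(hash160)].add(height)

    def candidate_blocks(self, hash160):
        """Blocks that may mention this hash160; collisions add extras."""
        return sorted(self._buckets.get(short_key(hash160), ()))
```

The caller still has to read each candidate block and check the full hash160, which is the "occasional unnecessary reads of blocks" cost mentioned above. In exchange, each entry is only 8 bytes.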

Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
May 05, 2012, 08:11:21 PM
 #14

That requires the full blockchain (which is something that will get removed further and further from wallets in the future, imho), plus an extra additional index on top of it. Furthermore it misses any ability to store local data (address labels, accounts, comments, ...).

I believe what you are looking for is Electrum or another lightweight client, which keeps all that data and I assume fast rescan ability on a server.

I agree with you that we shouldn't need a lengthy rescan to switch wallets (afaik, we don't, but switching is far from how easy it should be), but the solution is adding multiple wallet support to the client. Using the blockchain as your transaction store may sound fun, but I don't think it's a viable end-user way of working in the future.

I do Bitcoin stuff.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
May 05, 2012, 08:21:20 PM
Last edit: May 05, 2012, 08:54:47 PM by casascius
 #15

That requires the full blockchain (which is something that will get removed further and further from wallets in the future, imho), plus an extra additional index on top of it. Furthermore it misses any ability to store local data (address labels, accounts, comments, ...).

I am not sure I see it the same way.

I have proposed building an index on a set of data (the block chain) which is data maintained by the client, separate from the wallet.  The fact that down the road, that set of data maintained by bitcoind may be reduced (a partial block chain) has nothing to do with whether an index can be built upon it.  Especially when the index is only going to be referencing the portion of the block chain that will be kept not discarded.

Importantly, a properly implemented index can always be thrown away and rebuilt, so if you ever change your mind in the future as to what the index should look like, or will be making a major change to how much or how the blockchain data is kept, the new client version can simply dump the index and rebuild it upon installation.

Local data like address labels, accounts, and comments are fair game for the wallet file (the same way that if I yellow-highlighted cells in Excel, that that highlighting would be persisted in my .xls file when I went to save it).  They are the user's data, which is what a typical user would expect to be saved in a user file.

I believe what you are looking for is Electrum or another lightweight client, which keeps all that data and I assume fast rescan ability on a server.

Actually, given that I have no real unmet needs for the way I transact, I am not really looking for anything, other than to bridge the gap between the mindset of the developers and the mindset of the average user that will be downloading the client.  This way, new users have a greater likelihood of saying "Aha this is what I was looking for", rather than "WTF I don't understand".

MatthewLM
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 05, 2012, 09:31:52 PM
 #16

What do people think about other wallet formats?
Pieter Wuille
Legendary
*
qt
Offline Offline

Activity: 1072
Merit: 1174


View Profile WWW
May 05, 2012, 10:14:27 PM
 #17

I believe what you are looking for is Electrum or another lightweight client, which keeps all that data and I assume fast rescan ability on a server.

Actually, given that I have no real unmet needs for the way I transact, I am not really looking for anything, other than to bridge the gap between the mindset of the developers and the mindset of the average user that will be downloading the client.  This way, new users have a greater likelihood of saying "Aha this is what I was looking for", rather than "WTF I don't understand".

Let me try to formulate things differently. I'm only talking about the Bitcoin reference client (its wallet) being used by end users. Using address-to-block indexes in fat servers that serve many thin clients, is obviously a good idea.

But for an end user who is running the bitcoin software (for now with a full node, later perhaps in SPV mode when it is implemented), I see no reason to replace storing the transactions in the wallet by an on-the-fly scan through the block database (even with an extra index to speed it up). First of all, the endpoints (in particular the receiver) are ultimately responsible for having the transaction around: in case the transaction is not yet in the blockchain, the sender and receiver of the transaction are the ones who will keep broadcasting it. In case of a reorganisation, a transaction may be lost, and again the owners are responsible for keeping the transaction alive. Second, further developments with multisig transactions will require transactions to be negotiated (which is non-trivial and requires many communication steps) before they can be published and mined into the chain.

It seems to me your largest issue is usability. And I agree, there are many improvements possible for the end user using a local bitcoin wallet. But the solution is supporting multiple wallets, improving efficiency, improving the user interface, simplifying saving and restoring backups (tagged with, for example, the first block that would need to be scanned for incoming transactions), .... Adding an extra index may be viable right now, but I can't believe that such a requirement will be necessary for how bitcoin wallets will be used in the future.

That said, I didn't say such an index is a bad idea - there are certainly uses (in particular when the node functions as a back-end for thin clients, or runs a large service). It's not a priority right now, but that can change. I just don't think it is the thing end users need now.


I do Bitcoin stuff.
casascius
Mike Caldwell
VIP
Legendary
*
Offline Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
May 05, 2012, 10:42:28 PM
 #18

It seems to me your largest issue is usability. And I agree, there are many improvements possible for the end user using a local bitcoin wallet. But the solution is supporting multiple wallets, improving efficiency, improving the user interface, simplifying saving and restoring backups (tagged with, for example, the first block that would need to be scanned for incoming transactions), .... Adding an extra index may be viable right now, but I can't believe that such a requirement will be necessary for how bitcoin wallets will be used in the future.

Can I suggest that the ability to maintain, open, and close multiple wallets at will is a compelling benefit for a typical end user that would justify the index?  Also would be the ability to import or sweep funds from private keys in non-exponential time from handheld bitcoin cash.  I mean, that's a pretty huge benefit: I can hand someone bitcoins on a QR code, and they can scan and sweep the funds, either some or all of them.  End users can "be their own bank" by printing their own cash at home, and I can tell a restaurant (like Meze Grill in NYC who recently refused my bitcoins due to difficulty accepting them) that all they have to buy is a $250 USB QR code scanner, they can be accepting home-printed bitcoin cash with the official bitcoin client in no time.

If you give us this index with the official blessing of being a core feature, then others can add the rest of these features that depend on it so you don't have to be burdened with it.  Give us the framework, the infrastructure, and let others put in the effort of carrying it to the level of practical application.

westkybitcoins
Legendary
*
Offline Offline

Activity: 980
Merit: 1004

Firstbits: Compromised. Thanks, Android!


View Profile
May 05, 2012, 11:00:55 PM
 #19

I think the fact that it is a database in the first place is a silly design decision.

Yep.

Bitcoin is the ultimate freedom test. It tells you who is giving lip service and who genuinely believes in it.
...
...
In the future, books that summarize the history of money will have a line that says, “and then came bitcoin.” It is the economic singularity. And we are living in it now. - Ryan Dickherber
...
...
ATTENTION BFL MINING NEWBS: Just got your Jalapenos in? Wondering how to get the most value for the least hassle? Give BitMinter a try! It's a smaller pool with a fair & low-fee payment method, lots of statistical feedback, and it's easier than EasyMiner! (Yes, we want your hashing power, but seriously, it IS the easiest pool to use! Sign up in seconds to try it!)
...
...
The idea that deflation causes hoarding (to any problematic degree) is a lie used to justify theft of value from your savings.
etotheipi
Legendary
*
expert
Offline Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
May 06, 2012, 01:30:46 AM
 #20

What do people think about other wallet formats?

For reference, I went as far in the opposite direction as I could, when creating the Armory wallet format.  I hate the Satoshi wallet format as much as kokjo.  Armory uses a simple binary format, easy to read, and only two operations on it are ever used:  append, or overwrite-in-place-with-same-data-size.   I documented it here: 

http://bitcoinarmory.com/index.php/armory-wallet-files

I had two goals in mind when I made the wallet format:

  • I want 100% control of what happens in the wallet file.  Inspired by the wallet-not-actually-encrypted bug in 0.4.0
  • I want it to be dead simple for other developers to be able to read (and maybe modify) the wallet files

There's quite a bit of extra wallet-management code to protect against corruption & errors, and enforce atomic operations, but that's in code -- it doesn't affect the simplicity for other developers to read the files.    The most important feature is that when I encrypt my wallet, the encrypted key is guaranteed to overwrite the original unencrypted key, which prevents any leaks happening when I back it up to Dropbox, etc.  Same with deleting data:  it's overwritten with zeros in-place.  I know the overwrite may not happen in-place on-disk, but there's nothing I can do about that -- at least when someone copies the wallet file from my HDD, the binary file will not have any surprises in it.
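
The "overwrite-in-place-with-same-data-size" operation can be illustrated with a short Python sketch. This is not Armory's code; the function names, offsets, and field sizes are assumptions, and the real layout is in the documentation linked above.

```python
import os


def overwrite_in_place(path, offset, new_bytes, old_size):
    """Overwrite a fixed-size field without changing the file length."""
    if len(new_bytes) != old_size:
        raise ValueError("replacement must be exactly the same size")
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)
        f.flush()
        os.fsync(f.fileno())


def wipe_in_place(path, offset, size):
    """Delete a field by overwriting it with zero bytes."""
    overwrite_in_place(path, offset, b"\x00" * size, size)
```

Because the encrypted key lands on exactly the bytes the plaintext key occupied, a later copy of the file contains no leftover plaintext, subject to the on-disk caveat mentioned in the post.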

Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here!    (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)
splatster
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile
May 06, 2012, 02:35:16 AM
 #21

I think the fact that it is a database in the first place is a silly design decision.
I couldn't agree more.

Why can't we use a simple format?
etotheipi
Legendary
*
expert
Offline Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
May 06, 2012, 02:48:25 AM
 #22

I think the fact that it is a database in the first place is a silly design decision.
I couldn't agree more.

Why can't we use a simple format?

It's because the wallet is sensitive stuff, and one of the benefits of a database engine is the ACID/atomic operations:  it guarantees that data is written as intended or not written at all  to the database.  No matter what nanosecond the power goes out, it's supposed to be impervious to corruption.   When you're talking about private keys protecting millions of dollars, it's a very good idea to have atomic operations... but it comes with the downside that the database is a kind of blackbox and you don't always know what it's doing (hence the 0.4.0 wallet-not-actually-encrypted bug).

In my wallet format, I created atomic operations using a backup file and some flag files that detect when corruption has happened and automatically restore an uncorrupted version. It's probably only 90-95% as effective as a real ACID/atomic database, but I've tested the heck out of it and it does work. In the end, it's pretty rare that this logic would even be triggered because Armory doesn't keep the wallet file open. It only does open-modify-close operations occasionally.
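
One way a backup file plus flag files could provide this kind of recovery is sketched below in Python. This is a guess at the general shape, not Armory's actual logic; the file suffixes and function names are invented for the example.

```python
import os
import shutil


def _paths(wallet):
    # Hypothetical naming: one backup copy plus one flag file per copy.
    return wallet + ".backup", wallet + ".backup_unsafe", wallet + ".main_unsafe"


def _touch(path):
    with open(path, "wb") as f:
        f.flush()
        os.fsync(f.fileno())


def recover(wallet):
    """Repair whichever copy was mid-write when the process died."""
    backup, backup_flag, main_flag = _paths(wallet)
    if os.path.exists(main_flag):
        # The main file may be torn: roll back to the last good backup.
        shutil.copyfile(backup, wallet)
        os.remove(main_flag)
    if os.path.exists(backup_flag):
        # The backup may be torn; the main file is intact, so clear the
        # flag and let the next update recopy the backup.
        os.remove(backup_flag)


def safe_update(wallet, modify):
    """Run modify(wallet) so that a crash leaves one good copy on disk."""
    backup, backup_flag, main_flag = _paths(wallet)
    recover(wallet)
    _touch(backup_flag)
    shutil.copyfile(wallet, backup)  # snapshot the known-good state
    os.remove(backup_flag)
    _touch(main_flag)
    modify(wallet)  # the only step that touches the wallet itself
    os.remove(main_flag)
```

At any instant at most one of the two copies is flagged as unsafe, so there is always one copy to trust after a crash. It is weaker than a real transactional database, which fits the "90-95%" estimate above.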

I shared this experience with the devs, and while some of them thought it was interesting, their attitude was "let's not reinvent the wheel -- this is a solved problem, let's use an existing solution."  I understand that attitude.  But to me, the simplicity of the file format with 100% control over the data is worth every ounce of effort I put into it.

splatster
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile
May 06, 2012, 02:59:45 AM
 #23

I think the fact that it is a database in the first place is a silly design decision.
I couldn't agree more.

Why can't we use a simple format?

It's because the wallet is sensitive stuff, and one of the benefits of a database engine is the ACID/atomic operations:  it guarantees that data is written as intended or not written at all  to the database.  No matter what nanosecond the power goes out, it's supposed to be impervious to corruption.   When you're talking about private keys protecting millions of dollars, it's a very good idea to have atomic operations... but it comes with the downside that the database is a kind of blackbox and you don't always know what it's doing (hence the 0.4.0 wallet-not-actually-encrypted bug).

In my wallet format, I created atomic operations using a backup file and some flag files that detect when corruption has happened and to be able to detect and restore an uncorrupted version automatically.  It's probably only 90-95% as effective as a real ACID/atomic database, but I've tested the heck out of it and it does work.   In the end, it's pretty rare that this logic would even be triggered because Armory doesn't keep the wallet file open.  It only does open-modify-close operations occasionally.

I shared this experience with the devs, and while some of them thought it was interesting, their attitude was "let's not reinvent the wheel -- this is a solved problem, let's use an existing solution."  I understand that attitude.  But to me, the simplicity of the file format with 100% control over the data is worth every ounce of effort I put into it.

So you have a solution which combines the safety and the simplicity of each option, yet it hasn't been implemented into the mainstream client?  I might have to make armory my main wallet if something similar never hits the main client.
etotheipi
Legendary
*
expert
Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
May 06, 2012, 03:10:07 AM
 #24

So you have a solution which combines the safety and the simplicity of each option, yet it hasn't been implemented in the mainstream client?  I might have to make Armory my main wallet if something similar never hits the main client.

Bear in mind that my solution is not a perfect replacement for ACID operations.  It's only a "good" solution (as far as I can tell).  However, I think it's a lot less important with deterministic wallets.  Would-be wallet corruption is so rare to begin with.  Now take into account that 90% of that will be recoverable.  Now take into account that your wallet is completely recoverable from your very first paper or digital backup*.

And most users will make a first backup.  The issue with the Satoshi client is that it requires a persistent backup solution, and end-users are very bad at that.

There's some semi-frequent discussion on IRC about what the devs would like to do with the wallet, but I haven't heard any consensus.  However, I do know that they will be implementing deterministic wallets soon, too.  So regardless of disk corruption/failure protections, the Satoshi wallets may have the same only-one-time-backup-needed property.  But of course, it will probably be a while before they get around to it...

*Exception: this isn't true if you imported keys after making the first backup. 


Founder and CEO of Armory Technologies, Inc.
Armory Bitcoin Wallet: Bringing cold storage to the average user!
Only use Armory software signed by the Armory Offline Signing Key (0x98832223)

Please donate to the Armory project by clicking here!    (or donate directly via 1QBDLYTDFHHZAABYSKGKPWKLSXZWCCJQBX -- yes, it's a real address!)
casascius
Mike Caldwell
VIP
Legendary
*
Offline

Activity: 1386
Merit: 1136


The Casascius 1oz 10BTC Silver Round (w/ Gold B)


View Profile WWW
May 06, 2012, 11:46:04 AM
 #25

It's because the wallet is sensitive stuff, and one of the benefits of a database engine is the ACID/atomic operations:  it guarantees that data is written as intended or not written at all  to the database.  No matter what nanosecond the power goes out, it's supposed to be impervious to corruption.   When you're talking about private keys protecting millions of dollars, it's a very good idea to have atomic operations... but it comes with the downside that the database is a kind of blackbox and you don't always know what it's doing (hence the 0.4.0 wallet-not-actually-encrypted bug).

The ability of a database engine to provide atomic operations is no stronger than the underlying OS's ability to reliably report that all writes to a certain point have been flushed to disk when asked - using operations that anybody can call from any application, not just a database engine.

The magic that brings an atomic operation to a database is nothing more complicated than replacing "write a record" with the following flowchart:

1. Write a record somewhere that says you intend to make a particular write, including details of the substance of the write, so the write can be repeated if this is the only record of it.
2. Ensure that that record is committed to disk before continuing.
3. Make the write as intended.
4. Make sure the write done in step 3 is committed to disk before continuing.
5. Eliminate the record you created in step 1.
6. Ensure that the elimination done in step 5 has completed before allowing step 1 to occur on a future write.
7. When your program starts up, have it look for any unremoved records like the one created in step 1.  Confirm that they were written completely.  If so, simply perform the write operation that the record says you planned to make (which will have no effect if the prior write was successful).  If such records were not written completely, discard them.

This is simple enough for computer science students to implement in their homework.

The only magic that a database engine brings to the table is the ability for these seven steps to run at a high level of performance with lots of concurrent operations, in an effort to mitigate the performance penalty of tripling the burden of doing writes.

Since a Bitcoin wallet is only updated eternities apart (in terms of compute time, especially when it is limited only to data created or changed by the user), the perceived performance penalty of doing 3 writes instead of 1 ought to be so negligible as to make a full blown database engine completely unnecessary even when ACID properties are desirable.
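
As a rough illustration of the seven steps listed above, here is a minimal Python sketch. It is not code from any Bitcoin client: the `.journal` file name, the CRC32 framing of the intent record, and the function names `journaled_write` and `recover` are all assumptions made up for this example. It relies on `os.fsync`, which only asks the OS to flush; whether the drive honors the request is outside the program's control.

```python
import json
import os
import zlib


def _sync_dir(path):
    """Flush the containing directory (POSIX only) so a create/delete is durable."""
    if os.name != "posix":
        return
    fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def encode_record(offset, data):
    """Intent record: 4-byte CRC32 followed by a JSON payload (hypothetical layout)."""
    payload = json.dumps({"offset": offset, "data": data.hex()}).encode()
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def decode_record(raw):
    """Return (offset, data), or None if the record was not written completely."""
    if len(raw) <= 4 or zlib.crc32(raw[4:]) != int.from_bytes(raw[:4], "big"):
        return None
    rec = json.loads(raw[4:])
    return rec["offset"], bytes.fromhex(rec["data"])


def _apply(wallet_path, offset, data):
    """Steps 3-4: make the write in place and wait for it to reach disk."""
    with open(wallet_path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def journaled_write(wallet_path, offset, data):
    journal = wallet_path + ".journal"
    with open(journal, "wb") as f:          # step 1: record the intent
        f.write(encode_record(offset, data))
        f.flush()
        os.fsync(f.fileno())                # step 2: commit the record
    _sync_dir(journal)
    _apply(wallet_path, offset, data)       # steps 3-4
    os.remove(journal)                      # step 5: eliminate the record
    _sync_dir(journal)                      # step 6: make the removal durable


def recover(wallet_path):
    """Step 7: run at startup. Returns True if a pending write was replayed."""
    journal = wallet_path + ".journal"
    if not os.path.exists(journal):
        return False
    with open(journal, "rb") as f:
        rec = decode_record(f.read())
    if rec is not None:
        _apply(wallet_path, *rec)           # harmless if already applied
    os.remove(journal)
    _sync_dir(journal)
    return rec is not None
```

Calling `recover(path)` once at startup and `journaled_write(path, offset, data)` for each change covers the whole flowchart. Replaying a complete record is safe because writing the same bytes at the same offset twice has no further effect.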


Companies claiming they got hacked and lost your coins sounds like fraud so perfect it could be called fashionable.  I never believe them.  If I ever experience the misfortune of a real intrusion, I declare I have been honest about the way I have managed the keys in Casascius Coins.  I maintain no ability to recover or reproduce the keys, not even under limitless duress or total intrusion.  Remember that trusting strangers with your coins without any recourse is, as a matter of principle, not a best practice.  Don't keep coins online. Use paper or hardware wallets instead.
Remember remember the 5th of November
Legendary
*
Offline

Activity: 1862
Merit: 1011

Reverse engineer from time to time


View Profile
May 06, 2012, 12:01:02 PM
 #26

Thread title made me laugh. A user on YouTube going by the name TheRadBrad often says such phrases. He is funny :D

BTC:1AiCRMxgf1ptVQwx6hDuKMu4f7F27QmJC2
etotheipi
Legendary
*
expert
Offline

Activity: 1428
Merit: 1093


Core Armory Developer


View Profile WWW
May 06, 2012, 12:56:43 PM
 #27

It's because the wallet is sensitive stuff, and one of the benefits of a database engine is the ACID/atomic operations:  it guarantees that data is written as intended or not written at all  to the database.  No matter what nanosecond the power goes out, it's supposed to be impervious to corruption.   When you're talking about private keys protecting millions of dollars, it's a very good idea to have atomic operations... but it comes with the downside that the database is a kind of blackbox and you don't always know what it's doing (hence the 0.4.0 wallet-not-actually-encrypted bug).

The ability of a database engine to provide atomic operations is no stronger than the underlying OS's ability to reliably report that all writes to a certain point have been flushed to disk when asked - using operations that anybody can call from any application, not just a database engine.

The magic that brings an atomic operation to a database is nothing more complicated than replacing "write a record" with the following flowchart:

1. Write a record somewhere that says you intend to make a particular write, including details of the substance of the write, so the write can be repeated if this is the only record of it.
2. Ensure that that record is committed to disk before continuing.
3. Make the write as intended.
4. Make sure the write done in step 3 is committed to disk before continuing.
5. Eliminate the record you created in step 1.
6. Ensure that the elimination done in step 5 has completed before allowing step 1 to occur on a future write.
7. When your program starts up, have it look for any unremoved records like the one created in step 1.  Confirm that they were written completely.  If so, simply perform the write operation that the record says you planned to make (which will have no effect if the prior write was successful).  If such records were not written completely, discard them.

This is simple enough for computer science students to implement in their homework.

The only magic that a database engine brings to the table is the ability for these seven steps to run at a high level of performance with lots of concurrent operations, in an effort to mitigate the performance penalty of tripling the burden of doing writes.

Since a Bitcoin wallet is only updated eternities apart (in terms of compute time, especially when it is limited only to data created or changed by the user), the perceived performance penalty of doing 3 writes instead of 1 ought to be so negligible as to make a full blown database engine completely unnecessary even when ACID properties are desirable.

Well this is exactly what I do in my wallet, except with an extra step of repeating 1-5 on a backup immediately after the main file is updated.  I do this so that if the file gets corrupted I have a guaranteed working backup, and the flag files tell me which one is the corrupted one...

HOWEVER: one of the criticisms of this technique (which I would think equally applies to any app trying to do atomic operations) is that you don't have control over when data actually gets written to disk.  And there's no guarantee that the writes happen in the same order that you issued them. 
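
To make the flag-file idea and the ordering concern concrete, here is a small Python sketch. It is not Armory's actual code: the `.updating` flag suffix and the function names are hypothetical, and for brevity it rewrites whole files rather than modifying them in place. Each step calls `os.fsync` before the next one starts, which is the usual way an application requests ordering; it is still only a request, and a drive with a volatile write cache may not honor it. The sketch also assumes both files already exist after the first successful update, and it skips syncing the directory entries.

```python
import os
import shutil

FLAG_SUFFIX = ".updating"  # hypothetical name for the flag files


def _write_synced(path, data):
    """Write a file and ask the OS to flush it before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def update_with_backup(main_path, backup_path, contents):
    """Update the main file first, then repeat the same steps on the backup."""
    for target in (main_path, backup_path):
        flag = target + FLAG_SUFFIX
        _write_synced(flag, b"")         # flag on disk: target may be half-written
        _write_synced(target, contents)  # the actual update
        os.remove(flag)                  # flag gone: target is complete


def check_and_repair(main_path, backup_path):
    """Run at startup. A leftover flag marks which copy was interrupted."""
    if os.path.exists(main_path + FLAG_SUFFIX):
        # Main was interrupted; the backup still holds the previous good version.
        shutil.copyfile(backup_path, main_path)
        os.remove(main_path + FLAG_SUFFIX)
        return "restored main"
    if os.path.exists(backup_path + FLAG_SUFFIX):
        # Main finished; only the backup was interrupted.
        shutil.copyfile(main_path, backup_path)
        os.remove(backup_path + FLAG_SUFFIX)
        return "restored backup"
    return "ok"
```

Because the two files are never updated at the same time, at most one flag can be left behind, so there is always one copy known to be intact.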


Mike Hearn
Legendary
*
expert
Offline

Activity: 1526
Merit: 1129


View Profile
May 06, 2012, 06:41:49 PM
 #28

I'm not sure the performance impact of maintaining key/hash160 -> tx indexes is really worth having a simpler wallet format. People who feel really strongly about this can adjust the code to build such an index and measure the impact. Block chain download/processing is already expensive enough, IMHO.
MatthewLM
Legendary
*
Offline

Activity: 1190
Merit: 1004


View Profile
May 06, 2012, 07:34:44 PM
 #29

    What do people think about other wallet formats?

    For reference, I went as far in the opposite direction as I could, when creating the Armory wallet format.  I hate the Satoshi wallet format as much as kokjo.  Armory uses a simple binary format, easy to read, and only two operations on it are ever used:  append, or overwrite-in-place-with-same-data-size.   I documented it here: 

    http://bitcoinarmory.com/index.php/armory-wallet-files

    I had two goals in mind when I made the wallet format:

    • I want 100% control of what happens in the wallet file.  Inspired by the wallet-not-actually-encrypted bug in 0.4.0
    • I want it to be dead simple for other developers to be able to read (and maybe modify) the wallet files

    There's quite a bit of extra wallet-management code to protect against corruption & errors, and enforce atomic operations, but that's in code -- it doesn't affect the simplicity for other developers to read the files.    The most important feature is that when I encrypt my wallet, the encrypted key is guaranteed to overwrite the original unencrypted key, which prevents any leaks happening when I back it up to Dropbox, etc.  Same with deleting data:  it's overwritten with zeros in-place.  I know the overwrite may not happen in-place on-disk, but there's nothing I can do about that -- at least when someone copies the wallet file from my HDD, the binary file will not have any surprises in it.

    Your format seems pretty good. Where would I be able to find out more about the error-correcting checksums? You say they can fix up to one byte. Sounds good enough to me but what do I know? Would people say that's the only error correction needed?
    etotheipi
    Legendary
    *
    expert
    Offline

    Activity: 1428
    Merit: 1093


    Core Armory Developer


    View Profile WWW
    May 06, 2012, 07:43:13 PM
     #30

    Your format seems pretty good. Where would I be able to find out more about the error-correcting checksums? You say they can fix up to one byte. Sounds good enough to me but what do I know? Would people say that's the only error correction needed?

    I used dumb error-correction:  it's just regular checksums as seen elsewhere in the Bitcoin protocol.  All I do is hash the field, and add the first four bytes to the end of that field.  If a byte goes bad, I just iterate through the field changing single bytes until it matches the checksum again.

    Hashing like this is not really intended for error correction, but 4 bytes is enough to do it reliably, and it is dead simple to use.  And given the remarkably low frequency of hard-drive errors, one byte should be enough.  If more than one byte in the same checksummed field went bad, there are probably bigger problems.
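
The brute-force correction described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the Armory implementation: it assumes the checksum is the first four bytes of a double SHA-256, as used elsewhere in the Bitcoin protocol, and the function names are invented. It tries every single-byte substitution, so a field of n bytes costs at most 255·n hash computations, and the chance of a wrong candidate matching by accident is roughly 255·n in 2^32.

```python
import hashlib


def checksum(field: bytes) -> bytes:
    """First four bytes of SHA-256(SHA-256(field)) (assumed checksum scheme)."""
    return hashlib.sha256(hashlib.sha256(field).digest()).digest()[:4]


def verify_and_fix(field: bytes, chk: bytes):
    """Return the field if it matches, a one-byte-corrected copy, or None."""
    if checksum(field) == chk:
        return field
    candidate = bytearray(field)
    for i in range(len(candidate)):
        original = candidate[i]
        for value in range(256):
            if value == original:
                continue
            candidate[i] = value
            if checksum(bytes(candidate)) == chk:
                return bytes(candidate)
        candidate[i] = original  # put the byte back before moving on
    return None  # more than one bad byte, or the checksum itself is damaged
```

Note that this sketch does not handle the case where the stored checksum is the damaged part; a fuller version would have to decide how to treat a field that cannot be matched at all.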

    I decided not to use something more appropriate like Reed-Solomon because I thought it would obfuscate the fields (i.e. -- 256 bytes of data has to be converted to 268 bytes of coefficients, meaning you need another library in order to read the data even if you don't care about the error correction).  I just recently found out that's not true under my circumstances, so I may consider switching to it on my next wallet upgrade. 




    MatthewLM
    Legendary
    *
    Offline

    Activity: 1190
    Merit: 1004


    View Profile
    May 06, 2012, 08:08:34 PM
     #31

    There seem to be a few simple Reed-Solomon algorithms available online, so I don't think it would be a major problem to implement, assuming these algorithms work.