mndrix
Michael Hendricks
January 04, 2012, 03:11:46 PM
Quote: It's an all-or-nothing deal. Either load the whole blockchain into RAM, or don't load any of it. It's too inefficient and pointless to try and use virtual memory.
Very few transactions need to read the entire blockchain. A reasonable blockchain index could be kept in RAM and the entire blockchain mmap'd to virtual memory. As best I can tell, Bitcoin transactions tend to access recent blocks far more often than ancient blocks. The OS virtual memory system should handle that access pattern efficiently. One possible implementation would be an in-memory index mapping a Bitcoin address to the first block in which it appeared. Private key sweep code must only traverse, and possibly fetch from disk, subsequent blocks. With this kind of access pattern, the OS is likely to keep the last several thousand blocks in RAM and rarely fetch extra data from disk.
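A rough sketch of the index idea described above, not code from Armory or any Bitcoin client. `FirstSeenIndex`, `Block` and `countMentions` are hypothetical names, and a block is simulated as a plain list of address strings; a real client would parse them out of the mmap'd block file. The point is only that a lookup for one address can skip every block before its first appearance.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a parsed block: just the addresses it touches.
using Block = std::vector<std::string>;

// Hypothetical in-memory index: address -> height of the first block mentioning it.
class FirstSeenIndex {
public:
    // Record every address in `block`. Blocks are assumed to be added in
    // increasing height order, so the first insertion is the earliest one.
    void addBlock(std::size_t height, const Block& block) {
        for (const std::string& addr : block) {
            firstSeen_.emplace(addr, height);  // no effect if addr is already present
        }
    }

    // Returns true and sets `height` if the address has ever appeared.
    bool lookup(const std::string& addr, std::size_t& height) const {
        auto it = firstSeen_.find(addr);
        if (it == firstSeen_.end()) return false;
        height = it->second;
        return true;
    }

private:
    std::unordered_map<std::string, std::size_t> firstSeen_;
};

// Count the blocks that mention `addr`, visiting only blocks at or after its
// first appearance. `visited` reports how many blocks were actually touched.
std::size_t countMentions(const std::vector<Block>& chain, const FirstSeenIndex& index,
                          const std::string& addr, std::size_t& visited) {
    visited = 0;
    std::size_t start = 0;
    if (!index.lookup(addr, start)) return 0;  // never seen: nothing to read
    assert(start < chain.size());              // index must match this chain
    std::size_t hits = 0;
    for (std::size_t h = start; h < chain.size(); ++h) {
        ++visited;
        for (const std::string& a : chain[h]) {
            if (a == addr) { ++hits; break; }
        }
    }
    return hits;
}
```

With recent addresses, `visited` stays small, which is the access pattern the OS page cache tends to serve well.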

etotheipi (OP)
Core Armory Developer
January 04, 2012, 03:45:10 PM
Quote: It's an all-or-nothing deal. Either load the whole blockchain into RAM, or don't load any of it. It's too inefficient and pointless to try and use virtual memory.
Quote from mndrix: Very few transactions need to read the entire blockchain. A reasonable blockchain index could be kept in RAM and the entire blockchain mmap'd to virtual memory. As best I can tell, Bitcoin transactions tend to access recent blocks far more often than ancient blocks. The OS virtual memory system should handle that access pattern efficiently. One possible implementation would be an in-memory index mapping a Bitcoin address to the first block in which it appeared. Private key sweep code must only traverse, and possibly fetch from disk, subsequent blocks. With this kind of access pattern, the OS is likely to keep the last several thousand blocks in RAM and rarely fetch extra data from disk.
I need to look more into mmap and the Windows equivalent. I seem to remember concluding that you still needed the address space for the file in RAM, but that it instead behaved as a sort of RAM-based cache for the file. Also, I didn't like the platform-dependence of it. But for such a big change it might be worth fighting that battle... if it truly does save the RAM.
Right now Armory doesn't maintain any disk-index at all. It completely rescans the blockchain on every load, and reaccumulates the balance and outputs of each wallet. This is possible because of how extraordinarily fast my blockchain scanning code is... even on my slow computer, it takes less than 20s to cold-boot Armory on the main network and that's only single-threaded! Sure, this is not a good long-term design, but it wasn't intended to be -- 10s-20s load time is perfectly acceptable to me for the next couple of months until I get something more sane in there. And there are no issues with synchronizing index files to the blk0001.dat file... there are no index files!
Data structures are my specialty, and I already know how to handle all the maps/indexes for a more-efficient, non-scanning-every-load client (even easier if mmap does what I need).
It's easy enough to maintain a master index of addresses and blockchain locations; I've even implemented it and saw that it takes something like 150 MB. It's just not a priority before my first release, since the scan's runtime is already acceptable.
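As an illustration of what a single pass over the block file could record, here is a minimal sketch. It is not Armory's scanner: `BlockLocation` and `indexBlockFile` are hypothetical names, and it assumes the blk0001.dat layout of the time, where each record is 4 main-network magic bytes (F9 BE B4 D9), a 4-byte little-endian length, and then the serialized block. It only records where each block sits; parsing transactions and addresses is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Main-network magic bytes that precede each block record (assumed layout).
static const std::uint8_t kMagic[4] = {0xF9, 0xBE, 0xB4, 0xD9};

struct BlockLocation {
    std::size_t offset;  // byte offset of the serialized block (after magic + size)
    std::uint32_t size;  // serialized block length in bytes
};

// Read a 4-byte little-endian integer.
static std::uint32_t readLE32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walk the buffer once and record where each block lives.
// Stops at the first record that has a bad magic value or is truncated.
std::vector<BlockLocation> indexBlockFile(const std::uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::vector<BlockLocation> out;
    std::size_t pos = 0;
    while (len - pos >= 8) {                      // room for magic + size
        if (std::memcmp(data + pos, kMagic, 4) != 0) break;
        std::uint32_t size = readLE32(data + pos + 4);
        if (size > len - pos - 8) break;          // truncated record
        out.push_back({pos + 8, size});
        pos += 8 + static_cast<std::size_t>(size);
    }
    return out;
}
```

The same function works whether `data` points at a buffer read into RAM or at a memory-mapped view of the file.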

SgtSpike
January 04, 2012, 04:10:13 PM
Quote: It's an all-or-nothing deal. Either load the whole blockchain into RAM, or don't load any of it. It's too inefficient and pointless to try and use virtual memory.
Quote from mndrix: Very few transactions need to read the entire blockchain. A reasonable blockchain index could be kept in RAM and the entire blockchain mmap'd to virtual memory. As best I can tell, Bitcoin transactions tend to access recent blocks far more often than ancient blocks. The OS virtual memory system should handle that access pattern efficiently. One possible implementation would be an in-memory index mapping a Bitcoin address to the first block in which it appeared. Private key sweep code must only traverse, and possibly fetch from disk, subsequent blocks. With this kind of access pattern, the OS is likely to keep the last several thousand blocks in RAM and rarely fetch extra data from disk.
I can see what you mean if an index was created for addresses, but for transactions? What good does a transaction index do when you are looking for transactions to do with specific addresses? Regardless, I can see how an index would make it work.

P4man
January 04, 2012, 04:22:12 PM
Looking very very good indeed, congrats. Subbed and I'll test it tomorrow.

gnar1ta$
January 04, 2012, 05:49:17 PM
Looks great, unfortunately I use OSX.

etotheipi (OP)
Core Armory Developer
January 04, 2012, 05:56:35 PM
Quote from gnar1ta$: Looks great, unfortunately I use OSX.
Once I get official build-instructions out there for Windows and Linux (and actually make the first release), I'm sure someone will figure out how to port it to OSX. I've tried hard to use platform-independent code, so hopefully it won't be too difficult.

etotheipi (OP)
Core Armory Developer
January 04, 2012, 07:06:34 PM
It is platform specific if I have to use a completely different implementation for each platform, unless they have identical interface calls (which I doubt). Specifically, I'll have to do preprocessor branching to compile one set of code in Windows and another branch for Linux (and maybe yet another for OSX). The question is how much code has to be branched to make it work? My guess is that it's a lot...
As for the address space comment, am I correct that mmap doesn't help me? It sounds like if the blockchain is 2.0 GB, I have to reserve 2.0 GB of RAM to mmap it -- therefore I'm not really benefitting any folks who only have 1-2 GB of total system RAM. Perhaps someone with more mmap experience than me can explain if there's anything I can get out of it, and how much different the code is going to look between platforms.

Libreyseco
January 04, 2012, 07:08:02 PM
It's great, I like it. Thank you for this incredible Bitcoin client.

Mike Hearn
January 04, 2012, 09:16:22 PM
It's probably just a few lines of code. They have nearly identical behaviors so it's really not a big deal.
mmap maps regions of address space to files. When you access them the OS faults the underlying file region into memory if necessary. Therefore you do not need 2GB of RAM to load a 2GB file. You do however need 2GB of address space. On a 64 bit system (regardless of how much physical RAM you have) this is not an issue. On 32 bit you're not going to achieve that because most OS' fragment the address space somewhat and indeed typically reserve the top one or two gigs of address space for the kernel.
Having taken a look at the code, I would be careful with recommending users put any significant money into this app. It's great that you implemented useful features, but I see the following:
    if(!blockIsNewTop) {
       cout << "Block data did not extend the main chain!" << endl;
       // TODO: Not sure if there's anything we need to do if this block
       // didn't extend the main chain.
    }
    if(blockchainReorg) {
       cout << "This block forced a reorg! (and we're going to do nothing...)" << endl;
       //TODO: do something important (besides seg-faulting)
    }
Yes, you probably need to do something important. Re-org handling is a chore but a Bitcoin implementation that doesn't do it is incorrect and will mysteriously fail, suffer security problems or corrupt its internal data structures at some random future point in time. I hope you address this in future and provide a convincing set of unit tests. And in general tackle correctness before cool features.
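To make the mmap description above concrete, here is a minimal POSIX-only sketch of a read-only file mapping. `MappedFile` is a hypothetical wrapper written for this illustration; a Windows build would need the CreateFileMapping/MapViewOfFile calls instead, which are not shown. The calls used (`open`, `fstat`, `mmap`, `munmap`, `close`) are the standard POSIX ones.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Read-only view of a whole file. Mapping reserves address space, not RAM:
// pages are faulted in from the file only when they are first touched.
class MappedFile {
public:
    explicit MappedFile(const char* path) {
        int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st;
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                             PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const unsigned char*>(p);
                size_ = static_cast<std::size_t>(st.st_size);
            }
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }

    ~MappedFile() {
        if (data_) ::munmap(const_cast<unsigned char*>(data_), size_);
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    bool ok() const { return data_ != nullptr; }
    std::size_t size() const { return size_; }

    // Reading a byte makes the OS page in just that region of the file.
    unsigned char at(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    const unsigned char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

On a 32-bit build, the `mmap` call would be expected to fail for a file approaching 2 GB, which is the address-space limit described above; `ok()` would then return false.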

BitcoinBug
January 04, 2012, 09:32:41 PM
Looks like a hell of a lot of work. Congrats!

etotheipi (OP)
Core Armory Developer
January 04, 2012, 09:37:38 PM (Last edit: January 05, 2012, 10:23:49 PM by etotheipi)
Quote from Mike Hearn: ... Yes, you probably need to do something important. Re-org handling is a chore but a Bitcoin implementation that doesn't do it is incorrect and will mysteriously fail, suffer security problems or corrupt its internal data structures at some random future point in time. I hope you address this in future and provide a convincing set of unit tests. And in general tackle correctness before cool features.
You should look closer at the code, because you completely misunderstood what you read. I already ran an exhaustive blockchain re-organization unit-test and posted it on the forums a couple of months ago. Not only does my code gracefully handle re-orgs and double spends, but I even packaged and documented it, and put it on the forums for everyone else. I spent 1-2 weeks painstakingly creating that unit test and stepping through the debugger hundreds of times to make sure it works perfectly. I'm not pleased that you accuse me of "not focusing on correctness" when I have 2500 lines of unit testing code in my project.
Those comments you referenced were for additional things that I might want to do on a re-org, outside of updating the blockchain data structures. At the moment, there isn't anything else. (To see the reorg-handling code, do a search for the variable txJustInvalidated_, and the function "reassessAfterReorg" in BlockUtils.cpp)
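For readers unfamiliar with what re-org handling involves, the core step can be sketched as finding where the old and new chains diverge. This is a generic illustration and not the logic in BlockUtils.cpp: `HeaderNode`, `HeaderMap` and `findForkPoint` are hypothetical names, and block hashes are shortened to plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical block header record: only the fields this sketch needs.
struct HeaderNode {
    std::string parent;  // hash of the previous block ("" for genesis)
    std::size_t height;  // distance from genesis
};

using HeaderMap = std::unordered_map<std::string, HeaderNode>;

// Find the last common ancestor of the old and new chain tips, and list the
// blocks on the old branch that are no longer part of the main chain.
// Assumes both tips descend from a common block and that every ancestor is
// present in `headers`.
std::string findForkPoint(const HeaderMap& headers, std::string oldTip, std::string newTip,
                          std::vector<std::string>& disconnected) {
    assert(headers.count(oldTip) == 1 && headers.count(newTip) == 1);
    disconnected.clear();
    while (oldTip != newTip) {
        const HeaderNode& a = headers.at(oldTip);
        const HeaderNode& b = headers.at(newTip);
        if (a.height >= b.height) {
            disconnected.push_back(oldTip);  // its transactions need re-checking
            oldTip = a.parent;               // step the old branch back
        } else {
            newTip = b.parent;               // step the new branch back
        }
    }
    return oldTip;
}
```

Transactions that appear only in the `disconnected` blocks are the ones a wallet would have to reassess, since they may no longer be confirmed on the new main chain.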

SgtSpike
January 04, 2012, 10:06:27 PM
Quote from Mike Hearn: It's probably just a few lines of code. They have nearly identical behaviors so it's really not a big deal. mmap maps regions of address space to files. When you access them the OS faults the underlying file region into memory if necessary. Therefore you do not need 2GB of RAM to load a 2GB file. You do however need 2GB of address space. On a 64 bit system (regardless of how much physical RAM you have) this is not an issue. On 32 bit you're not going to achieve that because most OS' fragment the address space somewhat and indeed typically reserve the top one or two gigs of address space for the kernel.
Well, that is a good point, but then, who is running a 64-bit OS with less than 4GB of RAM? Most people only utilize a 64-bit implementation of an OS so they can address 4GB+ of RAM. I bet the percentage of users using less than 4GB of RAM on a 64-bit OS would be extremely small, which would render the work of implementing virtual memory into this client rather useless.
If it's not a big deal to implement, go for it, but if it's going to take away from development of other features, or bug fixes, etc, I would first do a market study to find out exactly how many people NEED a virtual memory implementation. You know, a cost vs benefit analysis and all of that.

jothan
January 04, 2012, 11:09:08 PM
Quote from etotheipi: It is platform specific if I have to use a completely different implementation for each platform, unless they have identical interface calls (which I doubt). Specifically, I'll have to do preprocessor branching to compile one set of code in Windows and another branch for Linux (and maybe yet another for OSX). The question is how much code has to be branched to make it work? My guess is that it's a lot... As for the address space comment, am I correct that mmap doesn't help me? It sounds like if the blockchain is 2.0 GB, I have to reserve 2.0 GB of RAM to mmap it -- therefore I'm not really benefitting any folks who only have 1-2 GB of total system RAM. Perhaps someone with more mmap experience than me can explain if there's anything I can get out of it, and how much different the code is going to look between platforms.
You would need a thin layer of glue code; it should not be that bad. On 32 bits, you may need to split the block chain into 256 MiB or 512 MiB chunks, map only a limited number of them at a time, and so stay under the 1-3 GiB mmap limit.
As for efficiency, it all depends on your access patterns. The memory system will conspire to keep often-used pages in RAM to speed up their access. Using madvise (or its equivalent on other platforms) before doing a full scan may help. If there is no such thing on Windows, the glue code can be an empty function.
I also understand your need to keep things simple at this point. mmaping on 64 bits should, except for the initial setup, be identical to your current scheme.
I also forgot to say good job! Looks wonderful so far! Keep it up! I'll fork some coins over sometime soon (I'm waiting for a money transfer). Cheers!
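The chunking and glue-code suggestions above could look roughly like the following. This is a sketch under assumptions, not tested Armory code: `ChunkWindow`, `chunkFor` and `adviseSequential` are hypothetical names, the 256 MiB chunk size is simply the figure from the post, and the non-POSIX branch is deliberately an empty function as proposed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Chunk size taken from the suggestion above. Mapping offsets must be
// multiples of the OS page size; 256 MiB is, for common 4 KiB pages.
static const std::uint64_t kChunkSize = 256ULL * 1024 * 1024;

struct ChunkWindow {
    std::uint64_t mapOffset;  // file offset at which the chunk mapping starts
    std::uint64_t mapLength;  // number of bytes to map for this chunk
    std::uint64_t inChunk;    // position of the requested byte inside the chunk
};

// Work out which chunk of a `fileSize`-byte file holds byte `offset`.
ChunkWindow chunkFor(std::uint64_t offset, std::uint64_t fileSize) {
    assert(offset < fileSize);
    ChunkWindow w;
    w.mapOffset = (offset / kChunkSize) * kChunkSize;  // round down to chunk start
    std::uint64_t remaining = fileSize - w.mapOffset;
    w.mapLength = remaining < kChunkSize ? remaining : kChunkSize;  // last chunk is short
    w.inChunk = offset - w.mapOffset;
    return w;
}

// Thin glue: hint that a mapped range is about to be scanned front to back.
#if defined(__unix__) || defined(__APPLE__)
#include <sys/mman.h>
inline void adviseSequential(void* addr, std::size_t len) {
    ::madvise(addr, len, MADV_SEQUENTIAL);  // advisory only, result ignored
}
#else
inline void adviseSequential(void*, std::size_t) {}  // no-op where unavailable
#endif
```

`mapOffset` and `mapLength` are the values that would be passed to the platform's mapping call, so only that one call needs a per-platform branch.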

btcinstant
January 05, 2012, 02:53:25 AM
Brilliant, just brilliant!

Mike Hearn
January 05, 2012, 09:17:14 AM
No, it's I who should apologise. You're right, there is handling of re-orgs in there. I was taking a quick look over the code before heading to sleep last night (I try for every new client that is open source) and stopped reading at that point. It's a little confusing that the comments refer to parts of this code saying it's not implemented (this happens in addNewBlock too), when they are, but that doesn't excuse me not checking more carefully.
I sent you a donation by way of apology.
I've got a few more questions - do you run through the re-org tests manually? I also did a quick look for code that verifies the wallet contents after re-orgs in unittest.py, but couldn't find it.
When a spend is created and is waiting to confirm, then a re-org invalidates the tx that was spent, does that show up in the ui? Where is it handled?

etotheipi (OP)
Core Armory Developer
January 05, 2012, 01:05:15 PM (Last edit: January 05, 2012, 02:12:05 PM by etotheipi)
Quote from Mike Hearn: No, it's I who should apologise. You're right, there is handling of re-orgs in there. I was taking a quick look over the code before heading to sleep last night (I try for every new client that is open source) and stopped reading at that point. It's a little confusing that the comments refer to parts of this code saying it's not implemented (this happens in addNewBlock too), when they are, but that doesn't excuse me not checking more carefully. I sent you a donation by way of apology. I've got a few more questions - do you run through the re-org tests manually? I also did a quick look for code that verifies the wallet contents after re-orgs in unittest.py, but couldn't find it. When a spend is created and is waiting to confirm, then a re-org invalidates the tx that was spent, does that show up in the ui? Where is it handled?
Thanks for the apology. It sounds like I need to go through the code and clean up some old comments. I like to leave myself "TODO"s and random comments about what isn't done yet, and then forget to remove them.
The reorg test is actually in the C++ unittests: cppForSwig/BlockUtilsTest.cpp, and I believe the .dat files have been checked in as part of the test, too.
It's not part of the UI yet but you just reminded me that I need to make it part of the UI. Perhaps check the isValid_ flag for a LedgerEntry and pop up some kind of message in the rare event it goes false. I think the user should be notified that they just became a victim...
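The UI check floated above might be as small as the following sketch. The `LedgerEntry` struct here is a hypothetical, stripped-down stand-in that keeps only the isValid_ flag mentioned in the post plus an assumed `txHashHex` field; `invalidatedWarnings` is likewise an invented helper, and how the messages are displayed is left to the GUI.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, minimal ledger entry; the real class would carry more fields.
struct LedgerEntry {
    std::string txHashHex;  // assumed field: transaction hash as hex text
    bool isValid_;          // false once a re-org has invalidated the transaction
};

// Build one warning message per entry whose transaction is no longer valid.
// A GUI could show these in a pop-up after each new block is processed.
std::vector<std::string> invalidatedWarnings(const std::vector<LedgerEntry>& ledger) {
    std::vector<std::string> warnings;
    for (const LedgerEntry& le : ledger) {
        if (!le.isValid_) {
            warnings.push_back("Transaction " + le.txHashHex +
                               " is no longer valid after a chain re-organization.");
        }
    }
    return warnings;
}
```

An empty result means nothing in the ledger was invalidated, so the common case costs one pass over the entries and shows nothing.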

LightRider
January 05, 2012, 04:45:58 PM
Good work, I'll be testing this!

etotheipi (OP)
Core Armory Developer
January 07, 2012, 06:37:19 AM
More teasers! And I need some recommendations... I figured you folks would appreciate seeing more screenshots, anyway.
This dialog is loaded whenever you try to send money from a watching-only wallet. The goal is to explain how the user can execute the offline transaction without needing external advice (yes, some users will still not totally understand it, but that's why this is only available in the "Advanced" usermode). I have spent a lot of time trying to fix the layout and the wording, and I still feel like it could be better. What better way than to get feedback from the potential users!
-Eto
P.S. - Sorry I haven't gotten around to the Windows build instructions ... I want to get the pre-alpha code out as soon as possible before battling Windows issues. Building in Windows really is a PITA. See post #63 if you want to compile and run in Ubuntu, which is only a few shell commands.

SgtSpike
January 07, 2012, 07:55:20 AM
Quote from etotheipi: More teasers! And I need some recommendations... I figured you folks would appreciate seeing more screenshots, anyway. This dialog is loaded whenever you try to send money from a watching-only wallet. The goal is to explain how the user can execute the offline transaction without needing external advice (yes, some users will still not totally understand it, but that's why this is only available in the "Advanced" usermode). I have spent a lot of time trying to fix the layout and the wording, and I still feel like it could be better. What better way than to get feedback from the potential users! -Eto P.S. - Sorry I haven't gotten around to the Windows build instructions ... I want to get the pre-alpha code out as soon as possible before battling Windows issues. Building in Windows really is a PITA. See post #63 if you want to compile and run in Ubuntu, which is only a few shell commands.
Dude, sounds good to me! Question though: Can you not generate offline transactions directly on the machine where the private keys reside?