Bitcoin Forum
Author Topic: LevelDB reliability?  (Read 4372 times)
classicsucks (OP), Hero Member
Activity: 686 | Merit: 504
March 10, 2016, 08:03:30 PM  #1

From: https://en.wikipedia.org/wiki/LevelDB

LevelDB is widely noted for being unreliable and databases it manages are prone to corruption.[citation needed] Detailed analyses of vulnerabilities in LevelDB's design have been performed by researchers from University of Wisconsin[14][15] confirming multiple causes for LevelDB's poor reliability. The analysis shows that LevelDB consistency depends on filesystem operations being atomic and order-preserving, but real-world filesystems do not provide guarantees for either of these.

NOTE: Neutrality is disputed.

Thoughts?
achow101, Moderator, Legendary
Activity: 3374 | Merit: 6551
Just writing some code
March 10, 2016, 08:15:13 PM  #2

LevelDB being stupid is one of the major reasons that people have to reindex on Bitcoin Core crashes. There have been proposals to replace it but so far there are no plans on doing so. However people are working on using different databases in Bitcoin Core and those are being implemented and tested.

jl777, Legendary
Activity: 1176 | Merit: 1132
March 10, 2016, 09:10:42 PM  #3

Quote from: achow101 on March 10, 2016, 08:15:13 PM
LevelDB being stupid is one of the major reasons that people have to reindex on Bitcoin Core crashes. There have been proposals to replace it but so far there are no plans on doing so. However people are working on using different databases in Bitcoin Core and those are being implemented and tested.

Maybe the most reliable DB is no DB at all? Use efficiently encoded read only files that can be directly memory mapped.

https://bitcointalk.org/index.php?topic=1387119.0
https://bitcointalk.org/index.php?topic=1377459.0
https://bitco.in/forum/forums/iguana.23/

James

watashi-kokoto, Sr. Member
Activity: 682 | Merit: 268
March 10, 2016, 10:54:04 PM  #4

Quote from: classicsucks on March 10, 2016, 08:03:30 PM
Thoughts?

In my opinion LevelDB is better than the previous Berkeley DB Bitcoin used.

Look at the Berkeley DB wallets! They became corrupted because some mac os guy renamed his wallet.dat while Bitcoin was running.

Yes, we know that Mac filesystems are a pile of crap.


LevelDB looks like simple key value storage, so I don't see what exactly the googol engineers screwed up.
jl777, Legendary
Activity: 1176 | Merit: 1132
March 10, 2016, 11:06:13 PM  #5

Quote from: classicsucks on March 10, 2016, 08:03:30 PM
Thoughts?

Quote from: watashi-kokoto on March 10, 2016, 10:54:04 PM
In my opinion LevelDB is better than the previous Berkeley DB Bitcoin used.

Look at the Berkeley DB wallets! They became corrupted because some mac os guy renamed his wallet.dat while Bitcoin was running.

Yes, we know that Mac filesystems are a pile of crap.

LevelDB looks like simple key value storage, so I don't see what exactly the googol engineers screwed up.
why use a DB for an invariant dataset?
After N blocks, the blockchain doesn't change, right?

So there is no need for the ability to do general DB operations. There are places where a DB is the right thing, but it is like comparing CPU mining to ASIC mining.

The CPU can do any sort of calcs, like the DB can do any sort of data operations.
ASIC does one thing, but really fast. A hardcoded dataset is the same. Think of it as an ASIC analogue to a DB.

So nothing wrong with a DB at all, you just can't compare ASIC to CPU.

2112, Legendary
Activity: 2128 | Merit: 1065
March 11, 2016, 01:06:22 AM  #6
Merited by ABCbits (2)

Quote from: jl777 on March 10, 2016, 11:06:13 PM
why use a DB for an invariant dataset?
That sounds like an exam question for DBMS 101 course.

1) independence of logical data from physical storage
2) sharing of the dataset between tasks and machines
3) rapid integrity verification
4) optimization of storage method to match the access patterns
5) maintenance of transactional integrity with related datasets and processes
6) fractional/incremental backup/restore while accessing software is online
7) support for ad-hoc queries without the need to write software
8) ease of integration with new or unrelated software packages
9) compliance with accounting and auditing standards
10) easier gathering of statistics about access patterns

I think those 10 answers would be good enough for A or A-, maybe B+ in a really demanding course/school.

gmaxwell, Moderator, Legendary
Activity: 4158 | Merit: 8382
March 11, 2016, 01:17:40 AM (last edit: March 11, 2016, 01:32:14 AM)  #7
Merited by ABCbits (5)

Quote from: achow101 on March 10, 2016, 08:15:13 PM
LevelDB being stupid is one of the major reasons that people have to reindex on Bitcoin Core crashes. There have been proposals to replace it but so far there are no plans on doing so. However people are working on using different databases in Bitcoin Core and those are being implemented and tested.

This is incorrect.

LevelDB needs a "filesystem interface layer". It doesn't come with one for windows; when leveldb is used inside Chrome it uses special chrome specific APIs to talk to the file system.  A contributor provided a windows layer for Bitcoin which is what allowed Bitcoin to use leveldb in the first place.

This windows filesystem interface layer was incorrect: it failed to flush to disk at all the points which it should. It was fixed rapidly as soon as someone brought reproduction instructions to Wladimir and he reproduced it.   There was much faffing about replacing it, mostly by people who don't contribute often to core-- in my view this was an example of bad cargo-cult "engineering" where instead of actual engineering people pattern-match buzzwords and glue black boxes together: "I HURD YOU NEED A DATABASE. SOMEONE ONCE TOLD ME THAT MYCROSAFT SEQUAL IS A GREAT DATABASE. IT HAS WEBSCALE". When the actual system engineers got engaged, the problem was promptly fixed.

This is especially irritating because leveldb is not a generic relational database, it is a highly specialized transactional key/value store. Leveldb is much more like an efficient disk-backed MAP implementation than it is like anything you would normally call a database. Most other "database" systems people suggest are not within three orders of magnitude in performance for our specific very narrow use case. The obvious alternatives-- like LMDB have other limitations (in particular LMDB must mmap the files, which basically precludes using it on 32 bit systems-- a shame because I like LMDB a lot for the same niche leveldb covers; leveldb also has extensive corruption detection, important for us because we do not want to incorrectly reject the chain due to filesystem corruption).

I think it's more likely that Bitcoin Core would eventually move to a custom data structure than to another "database" (maybe a swap to LMDB if they ever support non-mmap operations... maybe); as doing so would basically be a requirement for performance utxo set commitments.

A large number of these corruption reports were also being caused by anti-virus software randomly _deleting_ files out from under Bitcoin Core. It turns out that there are virus "signatures" that are as short as 16 bytes long... and AV programs avoid deleting random files all over the user's system through a set of crazy heuristics like extension matching which failed to preclude the Bitcoin information (though I'm sure actual viruses have no problem abusing these heuristics to escape detection). Core implemented a whitening scheme that obfuscates the stored state in order to avoid these problems or any other potential for hostile blockchain data to interact with weird filesystem or storage bugs.
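
Roughly, the whitening idea looks like this (a minimal sketch, not Bitcoin Core's actual implementation; the per-database random key and the function name are assumed for illustration):

Code:
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of XOR "whitening": values are XORed with a random per-database key
// before being written and XORed again on read, so attacker-chosen blockchain
// bytes never appear verbatim on disk where an AV signature could match them.
std::vector<uint8_t> XorObfuscate(const std::vector<uint8_t>& data,
                                  const std::vector<uint8_t>& key)
{
    std::vector<uint8_t> out(data.size());
    for (size_t i = 0; i < data.size(); ++i)
        out[i] = data[i] ^ key[i % key.size()];  // applying it twice restores the original
    return out;
}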

Right now it's very hard to corrupt the chainstate on Windows in Bitcoin Core 0.12+. There still may be some corner case bugs but they're now rare enough that they're hard to distinguish from broken hardware/bad drivers that inappropriately write cache or otherwise corrupt data-- issues which no sane key value store could really deal with. If you're able to reproduce corruption like that, I'd very much like to hear from you.

We've suffered a bit, as many other Open Source projects do -- in that comparatively few skilled open source developers use Windows (and, importantly, few _continue_ to use windows once they're hanging out with Linux/BSD users; if nothing else they end up moving to Mac)-- so we're extra dependent on _good_ trouble reports from Windows users whenever there is a problem which is Windows specific...

Quote from: jl777 on March 10, 2016, 11:06:13 PM
why use a DB for an invariant dataset?
After N blocks, the blockchain doesn't change, right?
Bitcoin Core does not store the blockchain in a database (or leveldb) and never has. The blockchain is stored in pre-allocated append only files on the disk as packed raw blocks in the same format they're sent across the network.  Blocks that get orphaned are just left behind (there are few enough that it hardly matters.)
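
For reference, each record in those append-only blk*.dat files is framed roughly like this (a sketch based on the commonly documented layout of network magic, length, then the raw serialized block; treat the details as illustrative rather than authoritative):

Code:
#include <cstdint>
#include <cstdio>
#include <vector>

// Illustrative framing of one record in an append-only blk*.dat file:
// 4-byte network magic, 4-byte little-endian length, then the raw block
// exactly as it was received from the network.
void AppendBlock(std::FILE* f, const std::vector<uint8_t>& rawBlock)
{
    static const uint8_t magic[4] = {0xF9, 0xBE, 0xB4, 0xD9};  // mainnet message start
    const uint32_t size = static_cast<uint32_t>(rawBlock.size());
    std::fwrite(magic, 1, sizeof(magic), f);
    std::fwrite(&size, 1, sizeof(size), f);                     // assumes little-endian host
    std::fwrite(rawBlock.data(), 1, rawBlock.size(), f);
}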


[Lecture about generic reasons to use a RDBMS]
None of which are applicable to the storage of a disk backed map storing highly compressed state information at the heart of a cryptographic consensus algorithm, but good points generally.
jl777, Legendary
Activity: 1176 | Merit: 1132
March 11, 2016, 01:28:26 AM  #8

Quote from: gmaxwell on March 11, 2016, 01:17:40 AM
Bitcoin Core does not store the blockchain in a database (or leveldb) and never has. The blockchain is stored in pre-allocated append only files on the disk as packed raw blocks in the same format they're sent across the network.  Blocks that get orphaned are just left behind (there are few enough that it hardly matters.)

OK, so what is the DB used for? Will everything still work without the DB?

If the dataset isn't changing, all the lookup tables can be hardcoded.

jl777, Legendary
Activity: 1176 | Merit: 1132
March 11, 2016, 01:33:30 AM  #9

Quote from: jl777 on March 10, 2016, 11:06:13 PM
why use a DB for an invariant dataset?

Quote from: 2112 on March 11, 2016, 01:06:22 AM
That sounds like an exam question for DBMS 101 course.

1) independence of logical data from physical storage
2) sharing of the dataset between tasks and machines
3) rapid integrity verification
4) optimization of storage method to match the access patterns
5) maintenance of transactional integrity with related datasets and processes
6) fractional/incremental backup/restore while accessing software is online
7) support for ad-hoc queries without the need to write software
8) ease of integration with new or unrelated software packages
9) compliance with accounting and auditing standards
10) easier gathering of statistics about access patterns

I think those 10 answers would be good enough for A or A-, maybe B+ in a really demanding course/school.

OK, you can get an A

However, memory mapped files share a lot of the same advantages you list:

1) independence of logical data from physical storage - yes
2) sharing of the dataset between tasks and machines - yes (you do need both endian forms)
3) rapid integrity verification - yes, even faster as once verified, no need to verify again
4) optimization of storage method to match the access patterns - yes that is exactly what has been done
5) maintenance of transactional integrity with related datasets and processes - yes
6) fractional/incremental backup/restore while accessing software is online  - yes

7) support for ad-hoc queries without the need to write software - no
8) ease of integration with new or unrelated software packages - no
9) compliance with accounting and auditing standards - not sure
10) easier gathering of statistics about access patterns - not without custom code

So it depends on if 7 to 10 trump the benefits and if the resources are available to get it working
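
For points 1 to 6, the simplest form of the memory-mapped approach looks roughly like this (a minimal POSIX sketch with an assumed function name, not iguana's actual code):

Code:
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Minimal POSIX sketch: map a read-only data file and return a pointer that
// can be indexed directly, letting the OS page cache act as the "database".
const uint8_t* MapReadOnly(const char* path, size_t* lenOut)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return nullptr; }
    void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (p == MAP_FAILED) return nullptr;
    *lenOut = static_cast<size_t>(st.st_size);
    return static_cast<const uint8_t*>(p);
}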

James

classicsucks (OP), Hero Member
Activity: 686 | Merit: 504
March 11, 2016, 10:08:06 AM  #10

Thanks to everyone for the informed replies.

It seems that the underlying filesystem is the wildcard when using LevelDB. Probably it was developed and tested on top of ext3 or ext4... Who knows what happens if one is using btrfs or whatever...

I suppose the performance would be killed by having an intermediate write caching layer?
hhanh00, Sr. Member
Activity: 467 | Merit: 266
March 11, 2016, 10:58:31 AM  #11

Quote from: achow101 on March 10, 2016, 08:15:13 PM
LevelDB being stupid is one of the major reasons that people have to reindex on Bitcoin Core crashes. There have been proposals to replace it but so far there are no plans on doing so. However people are working on using different databases in Bitcoin Core and those are being implemented and tested.

Quote from: jl777 on March 10, 2016, 09:10:42 PM
Maybe the most reliable DB is no DB at all? Use efficiently encoded read only files that can be directly memory mapped.

https://bitcointalk.org/index.php?topic=1387119.0
https://bitcointalk.org/index.php?topic=1377459.0
https://bitco.in/forum/forums/iguana.23/

James

LevelDb is used to store the UTXO set. How is that read only?

kushti, Full Member
Activity: 315 | Merit: 103
March 11, 2016, 11:13:52 AM  #12

LevelDB is surely about weak consistency for performance's sake, as are most NoSQL databases.

Blockchain systems probably need versioned immutable databases with rollback support and efficient background cleanup of old versions. There are no known implementations around, though.

jl777, Legendary
Activity: 1176 | Merit: 1132
March 11, 2016, 01:37:26 PM  #13

Quote from: achow101 on March 10, 2016, 08:15:13 PM
LevelDB being stupid is one of the major reasons that people have to reindex on Bitcoin Core crashes. There have been proposals to replace it but so far there are no plans on doing so. However people are working on using different databases in Bitcoin Core and those are being implemented and tested.

Quote from: jl777 on March 10, 2016, 09:10:42 PM
Maybe the most reliable DB is no DB at all? Use efficiently encoded read only files that can be directly memory mapped.

https://bitcointalk.org/index.php?topic=1387119.0
https://bitcointalk.org/index.php?topic=1377459.0
https://bitco.in/forum/forums/iguana.23/

James

Quote from: hhanh00 on March 11, 2016, 10:58:31 AM
LevelDb is used to store the UTXO set. How is that read only?

The UTXO set falls into the write-once category. Once an output is spent, you can't spend it again. The difference with the UTXO set is explained here: https://bitco.in/forum/threads/30mb-utxo-bitmap-uncompressed.941/

So you can calculate the OR'able bitmap for each bundle in parallel (as soon as all its prior bundles are there). Then to create the current utxo set, OR the bitmaps together.

What will practically remain volatile is the bitmap, but the overlay bitmap for each bundle is read only. This makes a UTXO check a matter of finding the index of the vout and checking a bit.
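
As a rough illustration of the bitmap scheme just described (invented names and word sizes, not iguana's actual layout):

Code:
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the idea above: each bundle contributes a read-only
// bitmap of the outputs it spends; OR-ing the bundle bitmaps yields the current
// spent set, and a UTXO check becomes a single bit test.
std::vector<uint64_t> CombineSpentBitmaps(const std::vector<std::vector<uint64_t>>& bundles,
                                          size_t words)
{
    std::vector<uint64_t> spent(words, 0);
    for (const auto& b : bundles)
        for (size_t i = 0; i < words && i < b.size(); ++i)
            spent[i] |= b[i];
    return spent;
}

bool IsUnspent(const std::vector<uint64_t>& spent, uint64_t outputIndex)
{
    return (spent[outputIndex / 64] & (1ULL << (outputIndex % 64))) == 0;
}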

James

kushti, Full Member
Activity: 315 | Merit: 103
March 11, 2016, 01:51:06 PM  #14

Quote from: kushti on March 11, 2016, 11:13:52 AM
Blockchain systems probably need versioned immutable databases with rollback support and efficient background cleanup of old versions. There are no known implementations around, though.

Well, in fact we have a half-done solution for that, used in our open modular blockchain framework Scorex ( https://github.com/ScorexProject/Scorex ). We can externalize it and make a pull request to bitcoinj if some Java dev would like to help with the Java part.

hhanh00, Sr. Member
Activity: 467 | Merit: 266
March 12, 2016, 02:30:27 AM  #15

Quote from: achow101 on March 10, 2016, 08:15:13 PM
LevelDB being stupid is one of the major reasons that people have to reindex on Bitcoin Core crashes. There have been proposals to replace it but so far there are no plans on doing so. However people are working on using different databases in Bitcoin Core and those are being implemented and tested.

Quote from: jl777 on March 10, 2016, 09:10:42 PM
Maybe the most reliable DB is no DB at all? Use efficiently encoded read only files that can be directly memory mapped.

https://bitcointalk.org/index.php?topic=1387119.0
https://bitcointalk.org/index.php?topic=1377459.0
https://bitco.in/forum/forums/iguana.23/

James

Quote from: hhanh00 on March 11, 2016, 10:58:31 AM
LevelDb is used to store the UTXO set. How is that read only?

Quote from: jl777 on March 11, 2016, 01:37:26 PM
The UTXO set falls into the write-once category. Once an output is spent, you can't spend it again. The difference with the UTXO set is explained here: https://bitco.in/forum/threads/30mb-utxo-bitmap-uncompressed.941/

So you can calculate the OR'able bitmap for each bundle in parallel (as soon as all its prior bundles are there). Then to create the current utxo set, OR the bitmaps together.

What will practically remain volatile is the bitmap, but the overlay bitmap for each bundle is read only. This makes a UTXO check a matter of finding the index of the vout and checking a bit.

James

Not sure if we are talking about the same thing. Following your link, it seems you are describing the internal data structures used by a block explorer, which aren't necessarily optimal for a bitcoin node.
In particular, you use a 6 byte locator. Given a new incoming transaction that can spend any utxo (hash+vout), do you need to map it to a locator? And if so, how is it done?


jl777, Legendary
Activity: 1176 | Merit: 1132
March 12, 2016, 03:21:32 AM  #16

Quote from: achow101 on March 10, 2016, 08:15:13 PM
LevelDB being stupid is one of the major reasons that people have to reindex on Bitcoin Core crashes. There have been proposals to replace it but so far there are no plans on doing so. However people are working on using different databases in Bitcoin Core and those are being implemented and tested.

Quote from: jl777 on March 10, 2016, 09:10:42 PM
Maybe the most reliable DB is no DB at all? Use efficiently encoded read only files that can be directly memory mapped.

https://bitcointalk.org/index.php?topic=1387119.0
https://bitcointalk.org/index.php?topic=1377459.0
https://bitco.in/forum/forums/iguana.23/

James

Quote from: hhanh00 on March 11, 2016, 10:58:31 AM
LevelDb is used to store the UTXO set. How is that read only?

Quote from: jl777 on March 11, 2016, 01:37:26 PM
The UTXO set falls into the write-once category. Once an output is spent, you can't spend it again. The difference with the UTXO set is explained here: https://bitco.in/forum/threads/30mb-utxo-bitmap-uncompressed.941/

So you can calculate the OR'able bitmap for each bundle in parallel (as soon as all its prior bundles are there). Then to create the current utxo set, OR the bitmaps together.

What will practically remain volatile is the bitmap, but the overlay bitmap for each bundle is read only. This makes a UTXO check a matter of finding the index of the vout and checking a bit.

James

Quote from: hhanh00 on March 12, 2016, 02:30:27 AM
Not sure if we are talking about the same thing. Following your link, it seems you are describing the internal data structures used by a block explorer, which aren't necessarily optimal for a bitcoin node.
In particular, you use a 6 byte locator. Given a new incoming transaction that can spend any utxo (hash+vout), do you need to map it to a locator? And if so, how is it done?


iguana is a bitcoin node that happens to update a block explorer level dataset.
The data structures are optimized for parallel access, so multicore searches can be used.
However, even with a single core searching linearly (backwards), it is quite fast to find any txid.

Each bundle of 2000 files has a hardcoded hash table for all the txid's in it, so it is a matter of doing a hash lookup until it is found. I don't have timings of fully processing a full block yet, but I don't expect it would take more than a few seconds to update all vins and vouts.

Since txid's are already high entropy, there is no need to do an additional hash, so I XOR all 4 64-bit long ints of the txid together to create an index into an open hash table, which is created to be never more than half full, so it will find any match in very few iterations. Since everything is memory mapped, after the initial access to swap it in, each search will take less than a microsecond.
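
Roughly, that lookup could look like this (a hypothetical sketch with invented structure and field names, not iguana's actual tables):

Code:
#include <cstdint>
#include <cstring>

// Hypothetical illustration of the lookup described above: fold the 32-byte
// txid into 64 bits by XOR-ing its four 64-bit words, then probe an
// open-addressed table that is kept at most half full.
struct TxidSlot { uint8_t txid[32]; uint32_t itemIndex; bool used; };

int64_t FindTxid(const TxidSlot* table, uint64_t tableSize /* power of two */,
                 const uint8_t txid[32])
{
    uint64_t words[4];
    std::memcpy(words, txid, 32);
    uint64_t h = words[0] ^ words[1] ^ words[2] ^ words[3];
    for (uint64_t i = 0; i < tableSize; ++i) {
        uint64_t slot = (h + i) & (tableSize - 1);  // linear probing
        if (!table[slot].used) return -1;           // empty slot: not present
        if (std::memcmp(table[slot].txid, txid, 32) == 0)
            return table[slot].itemIndex;
    }
    return -1;
}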

hyc, Member
Activity: 88 | Merit: 16
March 12, 2016, 09:22:36 AM  #17

Quote from: gmaxwell on March 11, 2016, 01:17:40 AM
LevelDB needs a "filesystem interface layer". It doesn't come with one for windows; when leveldb is used inside Chrome it uses special chrome specific APIs to talk to the file system.  A contributor provided a windows layer for Bitcoin which is what allowed Bitcoin to use leveldb in the first place.

This windows filesystem interface layer was incorrect: it failed to flush to disk at all the points which it should. It was fixed rapidly as soon as someone brought reproduction instructions to Wladimir and he reproduced it.   There was much faffing about replacing it, mostly by people who don't contribute often to core-- in my view this was an example of bad cargo-cult "engineering" where instead of actual engineering people pattern-match buzzwords and glue black boxes together: "I HURD YOU NEED A DATABASE. SOMEONE ONCE TOLD ME THAT MYCROSAFT SEQUAL IS A GREAT DATABASE. IT HAS WEBSCALE". When the actual system engineers got engaged, the problem was promptly fixed.

This seems to ignore the large number of non-Windows-related corruption occurrences.

Quote from: gmaxwell on March 11, 2016, 01:17:40 AM
This is especially irritating because leveldb is not a generic relational database, it is a highly specialized transactional key/value store. Leveldb is much more like an efficient disk-backed MAP implementation than it is like anything you would normally call a database. Most other "database" systems people suggest are not within three orders of magnitude in performance for our specific very narrow use case. The obvious alternatives-- like LMDB have other limitations (in particular LMDB must mmap the files, which basically precludes using it on 32 bit systems-- a shame because I like LMDB a lot for the same niche leveldb covers; leveldb also has extensive corruption detection, important for us because we do not want to incorrectly reject the chain due to filesystem corruption).

I think it's more likely that Bitcoin Core would eventually move to a custom data structure than to another "database" (maybe a swap to LMDB if they ever support non-mmap operations... maybe); as doing so would basically be a requirement for performance utxo set commitments.

LevelDB is not a transactional data store, it doesn't support full ACID semantics. It lacks Isolation, primarily, and its Atomicity features aren't actually reliable. No storage system that relies on multiple files for storage can offer true Atomicity - the same applied to BerkeleyDB too.

Meanwhile, despite relying on mmap, LMDB works perfectly well on 32 bit systems. And unlike LevelDB, LMDB is fully supported on Windows. (Also unlike LevelDB, LMDB is *fully supported* - LevelDB isn't actively maintained any more.)

Quote from: gmaxwell on March 11, 2016, 01:17:40 AM
A large number of these corruption reports were also being caused by anti-virus software randomly _deleting_ files out from under Bitcoin Core. It turns out that there are virus "signatures" that are as short as 16 bytes long... and AV programs avoid deleting random files all over the user's system through a set of crazy heuristics like extension matching which failed to preclude the Bitcoin information (though I'm sure actual viruses have no problem abusing these heuristics to escape detection). Core implemented a whitening scheme that obfuscates the stored state in order to avoid these problems or any other potential for hostile blockchain data to interact with weird filesystem or storage bugs.

Right now it's very hard to corrupt the chainstate on Windows in Bitcoin Core 0.12+. There still may be some corner case bugs but they're now rare enough that they're hard to distinguish from broken hardware/bad drivers that inappropriately write cache or otherwise corrupt data-- issues which no sane key value store could really deal with. If you're able to reproduce corruption like that, I'd very much like to hear from you.

We've suffered a bit, as many other Open Source projects do -- in that comparatively few skilled open source developers use Windows (and, importantly, few _continue_ to use windows once they're hanging out with Linux/BSD users; if nothing else they end up moving to Mac)-- so we're extra dependent on _good_ trouble reports from Windows users whenever there is a problem which is Windows specific...
TierNolan, Legendary
Activity: 1232 | Merit: 1083
March 12, 2016, 05:40:29 PM  #18
Merited by ABCbits (4)

Quote from: jl777 on March 11, 2016, 01:28:26 AM
OK, so what is the DB used for? Will everything still work without the DB?

All blocks are stored in the same format in which they are received, in append-only files.  There are undo (reverse) files which are stored separately (and they are append-only files too).

You can take the UTXO set as it is for block 400,000 and then apply the undo file for block 400,000 and you get the state for block 399,999.

This means that the UTXO set can be stepped back during a chain reorg.  You use the undo (reverse) files to step back until you hit the fork and then move forward along the new fork.

The database stores the UTXO set that matches the chaintip at any given time.  It is only ever changed to add/remove a block atomically.

Each new block atomically updates the UTXO set.
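
Conceptually, stepping back one block with its undo data looks something like this (a rough sketch with invented types, not Bitcoin Core's actual structures):

Code:
#include <cstdint>
#include <cstring>
#include <map>
#include <utility>
#include <vector>

// Sketch of the reorg step described above: undoing a block removes the
// outputs it created and restores, from the undo data, the outputs it spent.
struct OutPoint {
    uint8_t txid[32];
    uint32_t vout;
    bool operator<(const OutPoint& o) const {
        int c = std::memcmp(txid, o.txid, 32);
        return c != 0 ? c < 0 : vout < o.vout;
    }
};
struct Coin { int64_t value; std::vector<uint8_t> scriptPubKey; };

struct BlockUndo {
    std::vector<OutPoint> created;                 // outputs the block added
    std::vector<std::pair<OutPoint, Coin>> spent;  // outputs the block consumed
};

void DisconnectBlock(std::map<OutPoint, Coin>& utxo, const BlockUndo& undo)
{
    for (const OutPoint& op : undo.created) utxo.erase(op);           // drop block-created coins
    for (const auto& rec : undo.spent) utxo[rec.first] = rec.second;  // restore spent coins
}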

Block-hashes are mapped to (type 'b' record):
  • previous hash
  • block height
  • which file it is stored in
  • file offset for the block in block*****.dat
  • file offset for undo data in rev*****.dat
  • block version
  • merkle root
  • timestamp
  • target (bits)
  • nonce
  • status (Headers validated, block received, block validated etc)
  • transaction count

Each key is prefixed with a code to indicate what type of record it is, and all records are stored in the same database.

DB_COINS = 'c';
DB_BLOCK_FILES = 'f';
DB_TXINDEX = 't';
DB_BLOCK_INDEX = 'b';

c: Maps txid to unspent outputs for that transaction
f: Maps file index to info about that file
t: Maps txid to the location of the transaction (file number + offset)
b: Maps block hash to the block header info (see above)

The 't' field doesn't have all the transactions unless txindex is enabled.

These are single record only fields (I think):

DB_BEST_BLOCK = 'B';
DB_FLAG = 'F';
DB_REINDEX_FLAG = 'R';
DB_LAST_BLOCK = 'l';
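
As an illustration of how those one-byte prefixes let unrelated record types share a single LevelDB database, a key is just the prefix byte followed by the record's identifier (hypothetical helper, not Bitcoin Core's actual serialization code):

Code:
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: compose a LevelDB key from a one-byte record prefix
// ('c', 'f', 't', 'b', ...) and the record's identifier (e.g. a txid or
// block hash), so all record types can live in the same database.
std::vector<uint8_t> MakeKey(char prefix, const uint8_t* id, size_t idLen)
{
    std::vector<uint8_t> key;
    key.reserve(1 + idLen);
    key.push_back(static_cast<uint8_t>(prefix));
    key.insert(key.end(), id, id + idLen);
    return key;
}

// Example: MakeKey('c', txid, 32) would address the unspent outputs of a txid.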

watashi-kokoto, Sr. Member
Activity: 682 | Merit: 268
March 12, 2016, 06:19:52 PM  #19

Quote from: TierNolan on March 12, 2016, 05:40:29 PM
All blocks are stored ....

Really interesting info, thanks a lot
2112, Legendary
Activity: 2128 | Merit: 1065
March 13, 2016, 04:29:33 AM  #20

Quote from: jl777 on March 11, 2016, 01:33:30 AM
However, memory mapped files share a lot of the same advantages you list:

1) independence of logical data from physical storage - yes
2) sharing of the dataset between tasks and machines - yes (you do need both endian forms)
3) rapid integrity verification - yes, even faster as once verified, no need to verify again
4) optimization of storage method to match the access patterns - yes that is exactly what has been done
5) maintenance of transactional integrity with related datasets and processes - yes
6) fractional/incremental backup/restore while accessing software is online  - yes

7) support for ad-hoc queries without the need to write software - no
8) ease of integration with new or unrelated software packages - no
9) compliance with accounting and auditing standards - not sure
10) easier gathering of statistics about access patterns - not without custom code

So it depends on if 7 to 10 trump the benefits and if the resources are available to get it working

James
I really don't want to be discouraging to you. You do very creative and innovative work, but you know less than zero about databases, and it is going to hamper you. You write like an autodidact or maybe you went to some really bad school.

1) no, mmap()-ed files as used by you don't provide independence of the physical storage layout. It is fixed in your code and in your proposed filesystem usage. Not even simple tasks like adding additional disk volumes with more free space are possible while online.

2) it doesn't seem like your code does any locking, so currently you can't even do sharing on a single host. Shared mmap() over NFS (or MapViewOfFile() over SMB) is only for desperadoes.

3) here you slip from zero knowledge to willful, aggressive ignorance. Marking files read-only and using SquashFS is not storage integrity verification. Even amateur non-programmers, like musicians or video producers are on average more aware of the need to verify integrity.

4) unfortunately you've got stuck in a rut of exclusively testing the initial sync. This is very non-representative access pattern. Even non-programmers in other thread are aware of this issue. This is the reason why professional database systems have separate tools for initial loading (like SQL*Loader for Oracle or BULK INSERT in standard SQL).

5) I don't believe you know what you're talking about. I'm pretty sure that you've never heard of https://en.wikipedia.org/wiki/Two-phase_commit_protocol and couldn't name any https://en.wikipedia.org/wiki/Transaction_processing_system to save your life.

6) I couldn't find any trace of support for locking or incremental backup/restore in your code. Personally, you look like somebody who rarely backs up, even less often restores and verifies. Even amateur, but experienced non-programmers seem to be more aware of the live-backup issues.

So not 7/10 or even 6/10. It is 0/10 with big minus for getting item 3) so badly wrong.

Again, I don't want to be discouraging to your programming project, although I don't fully comprehend it. Just please don't write about something you have no understanding of.

People like Gregory Maxwell, Mike Hearn or Peter Todd are getting paid to pretend to not understand. Old quote:

Quote from: Upton Sinclair
It is difficult to get a man to understand something, when his salary depends upon his not understanding it!
