Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: PremiumCodeX on November 17, 2016, 08:13:45 PM



Title: What DB do you use at your end?
Post by: PremiumCodeX on November 17, 2016, 08:13:45 PM
I was researching the ways how people make queries in the blockchain and I got the result that many create custom DBs on their ends to support their ways. If you use an additional DB at your end (other than core's leveldb) to store blockchain data, what DB do you use?


Title: Re: What DB do you use at your end?
Post by: TransaDox on November 18, 2016, 08:53:08 AM
SQLite (https://sqlite.org/index.html)


Title: Re: What DB do you use at your end?
Post by: CIYAM on November 18, 2016, 08:57:50 AM
SQLite (https://sqlite.org/index.html)

Whilst it is a great small SQL DB it isn't suitable for multiple users due to its use of global locking (unless they have completely reworked that in recent years).

I would favour using something like MySQL if wanting to have something that will scale hugely and handle multiple concurrent users.


Title: Re: What DB do you use at your end?
Post by: MadGamer on November 18, 2016, 09:04:24 AM
I was researching the ways how people make queries in the blockchain and I got the result that many create custom DBs on their ends to support their ways. If you use an additional DB at your end (other than core's leveldb) to store blockchain data, what DB do you use?


Both MSSQL Server (If you are using Windows to host it) and MySQL should be great choices . SQLite or Realm.io should be considered as an alternative for MSSQL or MySQL If you are building a mobile app.


Title: Re: What DB do you use at your end?
Post by: PremiumCodeX on November 19, 2016, 09:47:46 PM
So, most people use MySQL here, but also recommend SQLite and Realm.io. I am missing Oracle, though. In my story, I was using Oracle for several months up to now and it has nice features like highly customizable locking configuration. Although I used and agree with the excellence of the listed other DBs, why am I so lonely with my Oracle as a favorit?


Title: Re: What DB do you use at your end?
Post by: MadGamer on November 20, 2016, 06:38:35 PM
So, most people use MySQL here, but also recommend SQLite and Realm.io. I am missing Oracle, though. In my story, I was using Oracle for several months up to now and it has nice features like highly customizable locking configuration. Although I used and agree with the excellence of the listed other DBs, why am I so lonely with my Oracle as a favorit?

Oracle is considered as one of the most powerful RDBMS, don't under estimate it. It's mostly the companies that use Oracle and here why (https://www.quora.com/Why-would-companies-use-Oracle-database-when-there-is-free-MySQL) as for the differences you could check this article from Udemy (https://blog.udemy.com/oracle-vs-mysql-vs-sql-server/) or Oracle documentation (https://docs.oracle.com/cd/E12151_01/doc.150/e12155/oracle_mysql_compared.htm#i1026116)


Title: Re: What DB do you use at your end?
Post by: CIYAM on November 20, 2016, 06:50:40 PM
I have worked with many RDBMS engines (including DB/2 and Oracle) and they are all very powerful things.

There is really no reason to prefer one over the other in regards to what you are wanting to do.

The only thing that no current RDBMS handles well is actual "rollback" in terms of a "re-org" (normally a DB "rollback" is actually just restoring a backup and then "rolling forward" the newer txs from the log).

AFAIA no modern DB can do this (although I do have a design for exactly this).


Title: Re: What DB do you use at your end?
Post by: PremiumCodeX on November 21, 2016, 07:02:05 PM
I have worked with many RDBMS engines (including DB/2 and Oracle) and they are all very powerful things.

There is really no reason to prefer one over the other in regards to what you are wanting to do.

The only thing that no current RDBMS handles well is actual "rollback" in terms of a "re-org" (normally a DB "rollback" is actually just restoring a backup and then "rolling forward" the newer txs from the log).

AFAIA no modern DB can do this (although I do have a design for exactly this).


With that argument you have raised my curiosity.

Would you mind, in the essence at least, elaborating your design for me?

How should a rollback happen in your view?


Title: Re: What DB do you use at your end?
Post by: CIYAM on November 21, 2016, 07:53:12 PM
How should a rollback happen in your view?

A real "rollback" would require that the transaction log would have kept the full details of any deleted record (no major DB in the world does that AFAIA).

So if you want to "reverse history" it would be easy enough to recover "deleted records" using the transaction log.

Hope that makes sense.


Title: Re: What DB do you use at your end?
Post by: PremiumCodeX on November 26, 2016, 09:34:53 PM
How should a rollback happen in your view?

A real "rollback" would require that the transaction log would have kept the full details of any deleted record (no major DB in the world does that AFAIA).

So if you want to "reverse history" it would be easy enough to recover "deleted records" using the transaction log.

Hope that makes sense.


Correct me if I misunderstood something, but would not that require a MUCH bigger size of HDD? In addition, the log writing and reading speed would significantly increase that would result in slower DBs, would not it? At large DBs, speed is essential because some queries could take up to 20 or more minutes even without the mentioned idea.


Title: Re: What DB do you use at your end?
Post by: doof on November 27, 2016, 04:39:08 AM
How should a rollback happen in your view?

A real "rollback" would require that the transaction log would have kept the full details of any deleted record (no major DB in the world does that AFAIA).

So if you want to "reverse history" it would be easy enough to recover "deleted records" using the transaction log.

Hope that makes sense.


Correct me if I misunderstood something, but would not that require a MUCH bigger size of HDD? In addition, the log writing and reading speed would significantly increase that would result in slower DBs, would not it? At large DBs, speed is essential because some queries could take up to 20 or more minutes even without the mentioned idea.

Yes, I used MS SQL for a project www.blockchainsql.io.  450GB database.  If you optimise right, queries can be lightning fast.


Title: Re: What DB do you use at your end?
Post by: doof on November 27, 2016, 04:40:53 AM
Eg:  total chain size

select sum(block.size) as totalsize from block


Title: Re: What DB do you use at your end?
Post by: CIYAM on November 27, 2016, 05:20:01 AM
Correct me if I misunderstood something, but would not that require a MUCH bigger size of HDD?

As deletion (and in Bitcoin rollbacks) are infrequent operations it wouldn't practically be as significant an increase as you are imagining (but it would use more space for the translation log).

In addition, the log writing and reading speed would significantly increase that would result in slower DBs, would not it? At large DBs, speed is essential because some queries could take up to 20 or more minutes even without the mentioned idea.

Normal RDBMS "queries" do not involve the log at all so it would not have any speed effect upon read operations (it would only slow down "delete" operations).


Title: Re: What DB do you use at your end?
Post by: PremiumCodeX on November 27, 2016, 10:50:31 AM
Yes, I used MS SQL for a project www.blockchainsql.io.  450GB database.  If you optimise right, queries can be lightning fast.

Cool. Is that your project or did you build a project that used the mentioned project as something like an API?

At a 450-GB DB I think queries could be optimized up to less than 1 minute speed with your home gaming PC.

Correct me if I misunderstood something, but would not that require a MUCH bigger size of HDD?

As deletion (and in Bitcoin rollbacks) are infrequent operations it wouldn't practically be as significant an increase as you are imagining (but it would use more space for the translation log).

In addition, the log writing and reading speed would significantly increase that would result in slower DBs, would not it? At large DBs, speed is essential because some queries could take up to 20 or more minutes even without the mentioned idea.

Normal RDBMS "queries" do not involve the log at all so it would not have any speed effect upon read operations (it would only slow down "delete" operations).


It depends on the use-case then because in some DBs there are frequent delete operations. But I agree that most DBs are not like that so the possibility is worth to take into consideration. The question is how much it would increase the necessary space and the issue of SRP because even though rollbacks are infrequent operations, when they need them they usually need them ASAP.


Title: Re: What DB do you use at your end?
Post by: CIYAM on November 27, 2016, 11:31:34 AM
It depends on the use-case then because in some DBs there are frequent delete operations. But I agree that most DBs are not like that so the possibility is worth to take into consideration.

For blockchains (whose very purpose is to be a non-mutating structure) the "deletes" are only re-orgs (which although not that uncommon are generally not very large delete operations as most re-orgs are only one or two blocks deep) and depending upon how things are structured (such as if the structure of the chain itself is separated from the content of the blocks) would not need to involve much data.

The question is how much it would increase the necessary space and the issue of SRP because even though rollbacks are infrequent operations, when they need them they usually need them ASAP.

A "true rollback" (which as I said isn't being used by an major RBDMS that I am aware of) would actually be much faster than any other method (i.e. you either are going to have to issue update queries to act as though things are being unwound or restore an earlier backup which you'd then have to perform a partial log restore from).

Understand also that even if deletes involve appending the data to the log you would still be able to truncate the log for "checkpoints" (at which point any of that extra wasted space is recovered).


Title: Re: What DB do you use at your end?
Post by: BuySomeBitcoins on November 28, 2016, 06:42:43 PM
The choice of the database depends on your needs, mostly I work on market places and cybersecurity [fraud] so I use MongoDB - Oracle


Title: Re: What DB do you use at your end?
Post by: TransaDox on November 29, 2016, 12:50:03 AM
A "true rollback" (which as I said isn't being used by an major RBDMS that I am aware of) would actually be much faster than any other method (i.e. you either are going to have to issue update queries to act as though things are being unwound or restore an earlier backup which you'd then have to perform a partial log restore from).

Understand also that even if deletes involve appending the data to the log you would still be able to truncate the log for "checkpoints" (at which point any of that extra wasted space is recovered).


SQLite enables one to create "Savepoints (https://www.sqlite.org/lang_savepoint.html)" which are like GOTO statements for rollback.


Title: Re: What DB do you use at your end?
Post by: CIYAM on November 29, 2016, 08:04:06 AM
SQLite enables one to create "Savepoints (https://www.sqlite.org/lang_savepoint.html)" which are like GOTO statements for rollback.

An interesting feature although from reading the documentation I think that is just for use with nested transactions.

The concept of a blockchain re-org though is not the same thing as you need to "rollback" transactions that have already been committed (something rather alien to how RDBMS transactions operate).


Title: Re: What DB do you use at your end?
Post by: piotr_n on November 29, 2016, 01:00:44 PM
If you use an additional DB at your end (other than core's leveldb) to store blockchain data, what DB do you use?
I use my own solution, based on a hashmap (acting as the index) holding pointers to the records.

It makes accessing the db very fast, but needs much more RAM comparing to leveldb or berkeley.

Apart from the hashmap index and the records, there is of course also a code for the disk operations - to keep the files in sync with the memory.

It's simple and rather archaic, but has been proven to work great for UTXO-db purposes - I would not swap it for any other db engine I know.
Although I'm also quite interested in other solutions, which I don't know - there must be some more optimal ways of doing it.

https://github.com/piotrnar/gocoin/tree/master/lib/qdb


Title: Re: What DB do you use at your end?
Post by: borris123 on November 29, 2016, 02:26:31 PM
Nexus has created their own DB 10x faster than googles.

https://twitter.com/mirraxFTC/status/798217450739220481

https://bitcointalk.org/index.php?topic=657601.0


Title: Re: What DB do you use at your end?
Post by: PremiumCodeX on December 04, 2016, 09:10:08 PM
Thank you for suggestion! I too read very positive opinions on NEXUS, seems worthy to try out next.

@CIYAM, I find your description a bit difficult to imagine. Maybe because I have studied for my finals the whole day and my brain is tired  :P Have you got a model or - even better a - prototype of such a DB to see?


Title: Re: What DB do you use at your end?
Post by: Rick Storm on December 05, 2016, 12:14:06 PM
But what you don't have in the log is the content of the record XXX which therefore cannot be restored unless you actually restore an older backup (this is the meaning of D in ACID transaction which is what all serious RDBMS do - i.e. once deleted always deleted).
...
and it should be clear that they could not do this simply because the information has not been preserved in the transaction log

This doesn't sound right. Afaik all major databases store the deleted record in the transaction log, e.g. SQL Server:

https://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/


Title: Re: What DB do you use at your end?
Post by: CIYAM on December 05, 2016, 12:49:13 PM
That certainly is a surprise to me (thanks for the link and I deleted my own post to avoid any further confusion). Thinking about it more I guess it makes sense that in order to do a "rollback" it would need to be able to restore deleted data (as the size of a transaction could be huge it has to write all the recovery information and keep that at least until the commit has been completed and all final tx data has been flushed to disk).

But can you force rollbacks to occur after a successful commit (and especially after a checkpoint)?

Certainly I'm not familiar with any standard SQL commands for doing such a thing (and I wouldn't count nested transactions if supported as really being a solution) so I still don't see how you could use a traditional RDMBS to achieve a blockchain "re-org" (which is the main point that I was perhaps poorly trying to make).

Assuming that most RDBMS engines do work the same way regarding logging then perhaps it would at least be potentially possible to add some method of tagging the DB state at a certain "block height" and therefore later be able to "rewind" the DB back to the state it was at that specified tag (although I don't think that there is any such standard method).


Title: Re: What DB do you use at your end?
Post by: PremiumCodeX on December 08, 2016, 10:55:51 PM
I am glad because I have not closed this discussion yet lol. Thank you Rick Storm for enlightening me about this matter! I think, you could do this "tagging" with triggers. You could define the block height as a custom condition in a trigger and in the trigger set a savepoint where data is stored too. I too am not sure about how you could "tell the DBMS" to do that so I would be thankful if someone attached some example in any SQL.


Title: Re: What DB do you use at your end?
Post by: uanode on December 08, 2016, 11:56:08 PM
We are using PostgreSQL for these tasks whose performance and capabilities are endless.


Title: Re: What DB do you use at your end?
Post by: btc_enigma on December 09, 2016, 05:58:26 AM
mysql is slow in terms of read performance

Why not use leveldb as used by core. It has very good read speeds. Also you don't require concurrent writes. Only need to write a block that comes in every 10 minutes. Most of the times are reads. I doubt any other RDBMS can match the performance of leveldb


Title: Re: What DB do you use at your end?
Post by: tardi on December 15, 2016, 08:59:46 AM
I would insert the outputs from bitcoin into a mysql database and then query it.


Title: Re: What DB do you use at your end?
Post by: TransaDox on December 15, 2016, 09:32:35 PM
That certainly is a surprise to me (thanks for the link and I deleted my own post to avoid any further confusion). Thinking about it more I guess it makes sense that in order to do a "rollback" it would need to be able to restore deleted data (as the size of a transaction could be huge it has to write all the recovery information and keep that at least until the commit has been completed and all final tx data has been flushed to disk).

But can you force rollbacks to occur after a successful commit (and especially after a checkpoint)?

Certainly I'm not familiar with any standard SQL commands for doing such a thing (and I wouldn't count nested transactions if supported as really being a solution) so I still don't see how you could use a traditional RDMBS to achieve a blockchain "re-org" (which is the main point that I was perhaps poorly trying to make).

Assuming that most RDBMS engines do work the same way regarding logging then perhaps it would at least be potentially possible to add some method of tagging the DB state at a certain "block height" and therefore later be able to "rewind" the DB back to the state it was at that specified tag (although I don't think that there is any such standard method).


Rollback is a bit of a red herring for an immutable block chain. Just store everything, good or bad, and adjust the query.

SQLite also has the ability to treat multiple, separate database files as a single database at run-time (max 64). One could create separate DB files for working DB and historical but I don't even see the need for that.

BTW. SQLite can insert 1,000,000 records in under 10 secs and wipes the floor with My/MSSQL and Postgres in terms of performance. What it doesn't have is authentication so the others are a better choice for web servers.


Title: Re: What DB do you use at your end?
Post by: fikihafana on December 17, 2016, 03:58:31 PM
I would use mongodb to maximize performance and scalability


Title: Re: What DB do you use at your end?
Post by: piotr_n on December 17, 2016, 04:07:25 PM
You guys have no idea what you are talking about.

At this moment UTXO database holds over 45 millions records, from over 16 millions transactions.

What you need is an instant access to any of these records - otherwise verifying of blocks and transactions will be too slow.

No SQL database (or mongodb) can give you such a performance.
These kind of DBs are meant for complex queries, but they would have killed your bitcoin node had you tried to use them for handling the UTXO set.


Title: Re: What DB do you use at your end?
Post by: PremiumCodeX on December 18, 2016, 12:02:34 PM
You guys have no idea what you are talking about.

At this moment UTXO database holds over 45 millions records, from over 16 millions transactions.

What you need is an instant access to any of these records - otherwise verifying of blocks and transactions will be too slow.

No SQL database (or mongodb) can give you such a performance.
These kind of DBs are meant for complex queries, but they would have killed your bitcoin node had you tried to use them for handling the UTXO set.

Are you suggesting using a No-SQL DB? I have not used a NoSQL DB yet. Is it any better than SQL DBs when handling BTC transaction data? It could be a good idea since BTC is often looked as an experiment, why not to use a new technology with it.


Title: Re: What DB do you use at your end?
Post by: piotr_n on December 18, 2016, 02:26:06 PM
It is not about the language/API of the queries, but about the speed of accessing the records.

In SQL database just parsing the SQL query takes too much time.
But modern NoSQL databases (like MongoDB) are also not good here, because they are designed for complex queries and fancy data structures.
This all comes at a cost of performance and memory usage.

What you need for bitcoin is a simple [binary_key] -> [binary_record] database.
LevelDB is the most common one I know that can be used for that.
But IMHO even LevelDB is far away from optimal for bitcoin's UTXO set.

I'd say that if you want your bitcoin node to perform really well, you ought to make your own DB engine.
Depending how much RAM you can spare, I'd lean towards keeping all the data inside the memory.
Write data to the disk only to make the memory changes persistent and read from the disk only when loading the DB data while booting your node.
This requires more than 2 GB of RAM for the current UTXO-set, but gives the optimal performance.

That's for UTXO database - which is the tricky one here.
As for the blocks, you might just as well use a pure file system - no additional database engine is really necessary here.


Title: Re: What DB do you use at your end?
Post by: clickerz on December 18, 2016, 03:56:39 PM
I would use mongodb to maximize performance and scalability

MongoDB is quite good and in fact it is gaining more followers also. It is also now incorporated in MEAN Stack development and it is becoming popular. I like MongoDb for its scalability. So far my also with performance is common database being used especially on web development.


Title: Re: What DB do you use at your end?
Post by: TransaDox on December 20, 2016, 01:38:59 AM
mysql is slow in terms of read performance

Why not use leveldb as used by core. It has very good read speeds. Also you don't require concurrent writes. Only need to write a block that comes in every 10 minutes. Most of the times are reads. I doubt any other RDBMS can match the performance of leveldb
Because LevelDB is not ACID compliant and prone (read, renowned for) corrupting the DB. This is why we are being told to back up the chain, make copies etc. Bitcoin should be using an ACID compliant DB and the only one that gets close to the performance of a Key/Value DB for sequential access is SQLite. Plus, one has other niceties like live backup and in memory attached DBs.