terrytibbs
|
|
November 20, 2011, 09:51:51 PM |
|
How is system load? Is the database server busy with CPU? IO? Is there free memory?
My instance runs on a dedicated quad-core Xeon with about 16GB of free memory (right now). Disk I/O is extremely low and doesn't even begin to reach what this box is capable of. I'm now increasing (what I'm assuming is) the number of bytes it keeps before committing to the db, and will report back.
|
|
|
|
slush
Legendary
Offline
Activity: 1386
Merit: 1097
|
|
November 24, 2011, 01:59:46 PM |
|
John, is there any plan to add firstbits support? I mean looking up an address by its firstbits and generating firstbits from a full address. I found that it's very hard to implement with the current data structures: the database does not store the address itself, only the pubkey hash. Although you implemented address lookup by prefix, it is case-sensitive, and I don't see how to make a case-insensitive lookup with just a pubkey_hash in the database.
I'd like to submit a patch for Abe that extends the pubkey table with an "address" column and the number of the block where the address first appeared. AFAIK the block number of first appearance is also hard to obtain with the current data structures, because it requires heavy joins on fast-growing db tables. I'm just asking whether there's a possibility of such a patch being accepted upstream.
Storing such data in extra columns goes against third normal form, which is usually the wrong solution. However, this is not SQL homework but a real application with millions of records, and such a patch would make firstbits resolution much easier and blazingly fast. Indexing addresses (not just pubkey hashes) could also be very useful for other projects, like the Casascius coin analyzer (https://bitcointalk.org/index.php?topic=52537.0).
|
|
|
|
John Tobey (OP)
|
|
November 25, 2011, 04:11:26 PM |
|
John, is there any plan to add firstbits support? I mean looking up an address by its firstbits and generating firstbits from a full address. I found that it's very hard to implement with the current data structures: the database does not store the address itself, only the pubkey hash. Although you implemented address lookup by prefix, it is case-sensitive, and I don't see how to make a case-insensitive lookup with just a pubkey_hash in the database. I'd like to submit a patch for Abe that extends the pubkey table with an "address" column and the number of the block where the address first appeared. AFAIK the block number of first appearance is also hard to obtain with the current data structures, because it requires heavy joins on fast-growing db tables. I'm just asking whether there's a possibility of such a patch being accepted upstream. Storing such data in extra columns goes against third normal form, which is usually the wrong solution. However, this is not SQL homework but a real application with millions of records, and such a patch would make firstbits resolution much easier and blazingly fast. Indexing addresses (not just pubkey hashes) could also be very useful for other projects, like the Casascius coin analyzer (https://bitcointalk.org/index.php?topic=52537.0).
Yes, I have been thinking about supporting firstbits. I would consider a patch that adds address and first block_height (or block_id) to pubkey.
If I were doing it myself, I would try a new table "firstbits" with address_version, pubkey_id, block_id, firstbits. "address" could be an optional field for applications that want it. Storing firstbits directly would give us simple two-way lookups. Putting address or firstbits in pubkey would make me nervous about chain splits (where each side remains active) and firstbits adoption by alt chains. However, ideally I'd like to support this design for apps that want denormalization for performance. So any design involving a new table would add a view to make it look as if the fields were in pubkey, and a design that adds columns to pubkey should wrap it with a view that provides constant "00" address_version.
Abe doesn't yet have a way to turn features such as firstbits on or off. I would like to let users turn features on or off at install time: firstbits, coin-days destroyed, namecoin stuff, etc. I would store a flag in configvar for each feature (such as 'firstbits'='yes'/'no') and skip the processing associated with deselected features.
This is just my vision at the moment; I don't have any code beyond what you see. Any patch that looks useful to somebody, I'd probably accept. If it compromises too much in some area, I'd put it on its own branch until the compromises become options.
By the way, I don't know how firstbits.com would handle two addresses with the same unique prefix first appearing in the same block. I would give the shorter prefix to the address in the first transaction within the block, with ties going to the first txout within a transaction.
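The prefix assignment discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Abe code and not necessarily what firstbits.com does: it assumes firstbits means the shortest case-insensitive prefix that no earlier address shares, and that the caller has already sorted addresses by first appearance (block, then transaction within the block, then txout within the transaction, as suggested above). The function name and the sample addresses in the usage note are hypothetical.

```python
def assign_firstbits(addresses_in_order):
    """Map each address to its shortest case-insensitive prefix that is
    not a prefix of any address appearing earlier in the list.

    Assumption: the input is already ordered by first appearance
    (block height, then tx position, then txout position).
    """
    seen_prefixes = set()  # every lowercased prefix of every earlier address
    result = {}
    for address in addresses_in_order:
        if address in result:
            continue  # only the first appearance counts
        lowered = address.lower()
        firstbits = lowered  # fallback: the whole address
        for length in range(1, len(lowered) + 1):
            if lowered[:length] not in seen_prefixes:
                firstbits = lowered[:length]
                break
        result[address] = firstbits
        # Record all prefixes so later addresses cannot reuse them.
        for length in range(1, len(lowered) + 1):
            seen_prefixes.add(lowered[:length])
    return result
```

For example, calling it on the made-up strings "1ABC", "1abd", "1Xyz" (in that order) gives the first one the prefix "1", forces the second out to its full length because it shares "1ab" with the first, and gives the third "1x". A table like the proposed "firstbits" table could store the resulting value next to pubkey_id and block_id for two-way lookups.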
|
|
|
|
MORA
|
|
December 17, 2011, 03:04:05 PM Last edit: December 17, 2011, 10:45:46 PM by MORA |
|
Hi, I have gotten abe to sync up with the bitcoin dir. It took about 12 hours on an i7-920 server with SSD, but there's a lot of other stuff going on on that server. The CPU was not pegged, so disk I/O was probably the bottleneck.
Anyway... After it syncs it listens for HTTP connections to show the interface and all. Does it still sync new data at this point, or does it need to be restarted on a timer?
After a few minutes it loses the MySQL connection, probably due to a timeout, and it does not recover gracefully; instead it crashes.
hostname - - [17/Dec/2011 15:52:12] "GET /favicon.ico HTTP/1.1" 200 3774
Traceback (most recent call last):
  File "Abe/DataStore.py", line 1874, in catch_up
    store.catch_up_dir(dircfg)
  File "Abe/DataStore.py", line 1892, in catch_up_dir
    ds = open_blkfile()
  File "Abe/DataStore.py", line 1885, in open_blkfile
    store._refresh_dircfg(dircfg)
  File "Abe/DataStore.py", line 2058, in _refresh_dircfg
    WHERE dirname = ?""", (dircfg['dirname'],))
  File "Abe/DataStore.py", line 458, in selectrow
    store.sql(stmt, params)
  File "Abe/DataStore.py", line 372, in sql
    store.cursor.execute(cached, params)
  File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (2006, 'MySQL server has gone away')
Warning: failed to catch up /fast/bitcoin/.bitcoin: (2006, 'MySQL server has gone away') {'blkfile_number': 1, 'dirname': '/fast/bitcoin/.bitcoin', 'chain_id': None, 'id': Decimal('1'), 'blkfile_offset': 828778091}
Traceback (most recent call last):
  File "/usr/lib/python2.6/wsgiref/handlers.py", line 93, in run
    self.result = application(self.environ, self.start_response)
  File "/fast/bitcoin/abe/Abe/abe.py", line 198, in __call__
    abe.store.rollback()
  File "Abe/DataStore.py", line 578, in rollback
    store.conn.rollback()
OperationalError: (2006, 'MySQL server has gone away')
I would like to build a set of scripts that provide the same service as the now-defunct bitcoinnotify'er, but in a way that lets everyone run their own if they want to. So my plan is to use abe to parse the bitcoind files, and then write some PHP scripts that use the MySQL database to check whether there are new transactions for any of the monitored addresses, and whether any of the monitored transactions has the number of confirmations needed for a notification to be sent (be it email, post, db change, etc.).
[EDIT] Just checked my my.cnf: the timeout is set to 60 seconds, so if abe sends a keep-alive less often than that, the connection will be terminated by the SQL server.
-Does anyone know what the keep-alive setting is in abe?
-Abe should also handle a lost MySQL connection more gracefully, maybe attempt y reconnects with y*10 seconds delay before quitting, so a monitor script can spot the missing process.
I tried setting wait_timeout to 1 hour in MySQL. So far abe has been idle for 1500 seconds, so I don't think there is any keep-alive built in. I think it should be an option for the user to either use keep-alive or remake the connection when needed, since high-traffic sites may prefer a live connection that is ready to use, while sites that just keep the blocks updated in the DB will only see action when a new block is ready (if abe updates it while running).
I will try to look in the source, but Python is not really my strong side, so jumping right into making a thread that sends keep-alives or reconnects the SQL driver may be a bit rough.
[EDIT2] This small fix handles the error when accessing a web page after the SQL connection has been killed for whatever reason. It simply tries to reconnect and execute once more when an execute fails. It's a workaround rather than an actual bug fix, since the real fix (IMHO) should be to either keep the connection alive, or close it when done and make a new one when needed.
$ git diff
diff --git a/Abe/DataStore.py b/Abe/DataStore.py
index e256115..d013219 100644
--- a/Abe/DataStore.py
+++ b/Abe/DataStore.py
@@ -402,8 +402,12 @@ class DataStore(object):
         try:
             store.cursor.execute(cached, params)
         except Exception, e:
-            store.sqllog.info("EXCEPTION: %s", e)
-            raise
+            try:
+                store.reconnect();
+                store.cursor.execute(cached, params)
+            except Exception, e:
+                store.sqllog.info("EXCEPTION: %s", e)
+                raise
 
     def ddl(store, stmt):
         if stmt.lstrip().startswith("CREATE TABLE "):
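The "attempt y reconnects with y*10 seconds delay before quitting" idea from this post could look roughly like the sketch below. It is a standalone illustration, not a patch against Abe: the function name and its parameters are made up here, and `connect` stands for any zero-argument callable returning a DB-API connection. A real version would probably catch only the driver's connection errors rather than every exception, and retrying in the middle of a multi-statement transaction may not be safe.

```python
import time


def execute_with_reconnect(connect, stmt, params=(), conn=None,
                           max_retries=3, delay_step=10, sleep=time.sleep):
    """Execute stmt, reconnecting and retrying when something fails.

    Waits retry_number * delay_step seconds before each retry and
    re-raises the last error once max_retries is exceeded, so that an
    external monitor script can notice the process has quit.
    Returns (connection, cursor).
    """
    retries = 0
    while True:
        try:
            if conn is None:
                conn = connect()  # hypothetical connection factory
            cursor = conn.cursor()
            cursor.execute(stmt, params)
            return conn, cursor
        except Exception:  # assumption: narrow this to the driver's errors
            retries += 1
            if retries > max_retries:
                raise
            conn = None  # drop the broken connection; reconnect next pass
            sleep(retries * delay_step)
```

The caller keeps the returned connection and passes it back in on the next call, so a healthy connection is reused and only a dead one is replaced.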
|
|
|
|
Red Emerald
|
|
December 18, 2011, 08:04:47 AM Last edit: December 19, 2011, 04:00:11 AM by Red Emerald |
|
I'm watching this. I've checked out the source but haven't played with it yet.
EDIT: Oh yeah. Firstbits support will be awesome. I'm looking forward to giving someone 11235813 as my address.
|
|
|
|
John Tobey (OP)
|
|
December 19, 2011, 03:23:11 AM |
|
MORA,
Thanks for the comments and workaround. Indeed, db idle timeouts are a problem. I use two workarounds but haven't settled on a default approach. I have a cron job request the homepage every minute to trigger the catch_up code. And there is a "catch_up_thread" branch in git that does this automatically on a separate thread. It may need merging with the master branch to get the latest features, and I have not tested it as well. ThomasV implemented it for ecdsa.org and I think uses it in Electrum.
Reconnecting automatically is a good idea. I think your patch will work in practice, although I see a slight chance of database corruption if it tries to reconnect in the middle of a transaction. The chance is remote, since transaction durations won't normally approach the idle timeout, but the 12-hour init makes me extremely cautious about corruption. My long-term plan is to test "begin transaction" for portability and explicitly start each transaction. The start of a transaction would be the time to reconnect if needed.
For now, let me know if the workarounds prove inadequate.
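The long-term plan described here, reconnecting only at the start of a transaction, might be sketched as follows. This is not Abe's DataStore; the class, its method names and the `is_alive` callback are hypothetical, and sqlite3 is used purely as a stand-in so the example runs anywhere. With MySQLdb the liveness check would presumably be something like the connection's ping method, which is an assumption to verify against the driver's documentation.

```python
import sqlite3


def sqlite_is_alive(conn):
    """Stand-in liveness probe: a trivial query on the connection."""
    try:
        conn.execute("SELECT 1")
        return True
    except sqlite3.Error:
        return False


class ReconnectingStore(object):
    """Reconnect only at transaction boundaries, never mid-transaction."""

    def __init__(self, connect, is_alive):
        self._connect = connect    # zero-argument connection factory
        self._is_alive = is_alive  # callable(conn) -> bool
        self.conn = connect()
        self.in_transaction = False

    def begin(self):
        if self.in_transaction:
            raise RuntimeError("transaction already open")
        # No work has been done yet, so replacing the connection
        # here cannot lose or half-apply anything.
        if not self._is_alive(self.conn):
            self.conn = self._connect()
        self.in_transaction = True
        return self.conn

    def commit(self):
        self.conn.commit()
        self.in_transaction = False

    def rollback(self):
        self.conn.rollback()
        self.in_transaction = False
```

If the connection dies after begin() has returned, the error still propagates to the caller, which is the point: the half-finished transaction is abandoned rather than silently continued on a fresh connection.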
|
|
|
|
MORA
|
|
December 20, 2011, 01:04:53 PM |
|
Okay, so the catch-up only happens when a page is requested from the web server?
The workaround I posted is not perfect. After a night of no activity, the web interface did not respond and the database was not updated, with no errors on the console. Today when I restarted it, I didn't get the block catch-up messages either, but it did catch up just fine, so maybe my terminal is broken.
|
|
|
|
John Tobey (OP)
|
|
December 22, 2011, 01:56:55 AM |
|
Okay, so the catch-up only happens when a page is requested from the web server?
Yes. You can also force a catch-up by running the program with --no-serve in another process from the command line while the server listens. But this won't help with the database idle connection timeouts.
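A periodic catch-up along these lines could be driven by a small loop such as the one below. It is a sketch under assumptions: only the --no-serve flag comes from this thread, while the rest of the command line (module path and config file name) is a guess at a typical invocation and should be adjusted to the local setup. The function and parameter names are made up for the example.

```python
import subprocess
import time

# Assumed invocation; only --no-serve is taken from the discussion above.
ABE_COMMAND = ["python", "-m", "Abe.abe", "--config", "abe.conf", "--no-serve"]


def catch_up_repeatedly(command, interval_seconds=60, max_runs=None,
                        run=subprocess.call, sleep=time.sleep):
    """Run the catch-up command, sleep, and repeat.

    max_runs=None loops forever; run and sleep are injectable so the
    loop can be exercised without starting a process or waiting.
    Returns the number of runs performed.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        status = run(command)
        runs += 1
        if status != 0:
            print("catch-up exited with status %s" % status)
        sleep(interval_seconds)
    return runs
```

Calling catch_up_repeatedly(ABE_COMMAND) would then load new blocks about once a minute. Because each run finishes before the next step starts, a follow-up script can be slotted in after the command without worrying about overlapping with the loader.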
|
|
|
|
MORA
|
|
December 22, 2011, 01:36:37 PM Last edit: December 22, 2011, 07:51:49 PM by MORA |
|
Yes. You can also force a catch-up by running the program with --no-serve in another process from the command line while the server listens. But this won't help with the database idle connection timeouts.
Thanks. I made a small bash script that runs it with --no-serve, then sleeps and repeats (for now I don't need the web interface). That way I can also run my own script after Abe completes and be sure there are no locking problems.
To replicate the notify system I need to find new transactions and monitor them.
SELECT tx_id, txout_value, pubkey_hash
  FROM txout_detail
 WHERE tx_id > xyz
where xyz is the last completed block, is a good start; it finds the rows needed. Then, to find which blocks they were in (this could of course be JOINed into the first query):
SELECT block_id FROM block_tx WHERE tx_id = X
But do you have a good way to find the number of confirmations (limited to, say, 6 or 10)? From what I understand, the process is to select the next_block_id from block_next; if one is found, that's 1 confirmation. Then repeat with the result until there is no result or the required number is found.
(UPDATE) One way I can think of is:
SELECT b1.next_block_id
  FROM block_next b1
  JOIN block_next b2 ON b2.block_id = b1.next_block_id
  JOIN block_next b3 ON b3.block_id = b2.next_block_id
  JOIN block_next b4 ON b4.block_id = b3.next_block_id
  JOIN block_next b5 ON b5.block_id = b4.next_block_id
  JOIN block_next b6 ON b6.block_id = b5.next_block_id
 WHERE b1.block_id = xyz
This will not tell how many confirmations the block has, only whether it has one for each join. It's a rough way to test, but it could do if you think it's correct.
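The repeated lookup described in this post can also be written as a loop instead of a fixed chain of joins, which returns the actual count up to a limit. The sketch below uses the block_next table and column names mentioned above but is otherwise an assumption-laden illustration: the function name is made up, it uses the "?" parameter style, and it follows whichever successor the database returns first, so on a fork it may walk a side branch.

```python
import sqlite3


def count_successors(conn, block_id, limit=6):
    """Count how many blocks follow block_id via block_next, up to limit.

    One successor is treated as one confirmation, matching the
    description above. Only one successor is followed per step.
    """
    count = 0
    current = block_id
    while count < limit:
        row = conn.execute(
            "SELECT next_block_id FROM block_next WHERE block_id = ?",
            (current,)).fetchone()
        if row is None:
            break  # reached the tip of this branch
        current = row[0]
        count += 1
    return count
```

This costs at most `limit` small indexed queries per block, and unlike the six-way join it distinguishes "3 so far" from "none yet".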
|
|
|
|
John Tobey (OP)
|
|
December 25, 2011, 06:52:29 AM |
|
But do you have a good way to find the number of confirmations (limited to, say, 6 or 10)? From what I understand, the process is to select the next_block_id from block_next; if one is found, that's 1 confirmation. Then repeat with the result until there is no result or the required number is found.
Yes, that will work. For the common case where the transaction is on the main branch, you can just subtract its block height from the longest chain's height. chain_candidate.in_longest will equal 1 when a block is on main. For the top block, you can use:
SELECT b.block_height
  FROM block b
  JOIN chain c ON c.chain_last_block_id = b.block_id
 WHERE c.chain_id = 1
or as in DataStore.get_block_number:
SELECT MAX(block_height)
  FROM chain_candidate
 WHERE chain_id = 1 AND in_longest = 1
This is probably the right thing even if the transaction is not on the main branch, since users won't care about confirmations on dead ends.
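The height subtraction could be wrapped up as in the sketch below. It reuses the second query quoted above and the chain_candidate column names from it; everything else (the function name, the use of sqlite3 as a stand-in database, the "?" parameter style) is an assumption made for illustration. Whether to add one for the block that contains the transaction is a convention left to the caller.

```python
import sqlite3


def confirmations_by_height(conn, block_height, chain_id=1):
    """Return (height of the longest chain's top block) - block_height.

    Returns None when the chain has no blocks marked in_longest.
    """
    row = conn.execute(
        "SELECT MAX(block_height) FROM chain_candidate"
        " WHERE chain_id = ? AND in_longest = 1",
        (chain_id,)).fetchone()
    top = row[0]
    if top is None:
        return None  # MAX() over no rows yields NULL
    return top - block_height
```

Unlike walking block_next, this is a single query regardless of how deep the block is, which suits a monitor script that checks many transactions per run.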
|
|
|
|
MORA
|
|
December 27, 2011, 09:25:26 PM |
|
I am almost done with my little project, and have started to look into the license issue brought up earlier in the thread.
I would like to release my script as public domain. However, since it makes use of a database populated by AGPL software, I'm not sure that's allowed. -Also, one of the PHP libs I use is public domain already.
I have also made a website to host a service around my script for those who do not want to run it themselves, and I would like to keep that closed source for now, both for security until the worst bugs are located, and to prevent 100 identical pages popping up and going down. The website only accesses Abe data in one place that I could remove (latest block height in main chain); other than that it's user administration only.
The monitor script looks in a database populated by the website (or phpMyAdmin for that matter) and in the Abe database for new transactions, so it touches both parts.
I have not modified Abe, since --no-serve works quite well for my purpose.
In short: do you consider the data entered into the MySQL database protected by the AGPL, or can we write scripts that build on the data without having to choose the AGPL as the license?
|
|
|
|
BkkCoins
|
|
December 28, 2011, 10:02:08 AM |
|
The license for using and changing the software has no bearing on the data in the database. If it did, every website using Apache or MySQL would be infringing their licenses, but they're not. The license specifically deals with copying, distributing and changing the software. But if you need to be more confident, you could read more on the GNU and FSF websites.
|
|
|
|
John Tobey (OP)
|
|
December 28, 2011, 06:26:07 PM |
|
@MORA,
Congratulations, and thanks for asking about the license. Short answer: I am not a lawyer, the software license does not cover the data, and nothing in the AGPL prevents you from putting your own code in the public domain if that code's dependencies are compatible with the GPLv3 or AGPL.
Now, it gets a little hairy if you offer a proprietary service based on Abe's tables, and it needs a running Abe to keep those tables up to date. Maybe the law would consider that a "work based on" Abe even though the service only directly reads the tables. If in doubt, describe your plan to me. If I find it in keeping with the spirit of collaboration and the goals of Abe and Bitcoin, I will write a license exception giving it explicit permission to use Abe.
|
|
|
|
John Tobey (OP)
|
|
December 28, 2011, 07:29:47 PM |
|
I have also made a website to host a service around my script for those who do not want to run it themselves, and I would like to keep that closed source for now, both for security until the worst bugs are located, and to prevent 100 identical pages popping up and going down.
If you intend to release the source and are concerned about bugs affecting its output (misleading users about their bitcoin holdings) then I suggest you use the Bitcoin testnet during beta.
If you want to test the site live on the Internet and are concerned about security bugs (exploits) then I suggest you protect it with a password and give it to only a few trustworthy testers. Also, use security best practices where possible, such as a dedicated OS account or virtual host with no access to non-empty wallets.
If you want to get something out in public quickly but are afraid of exploits or misguided forks of your code, use common sense. If the site is non-commercial (free to use and free of advertising) I won't care much about the license until its proprietary features make it the most popular site running Abe.
|
|
|
|
MORA
|
|
December 28, 2011, 07:44:02 PM |
|
If you intend to release the source and are concerned about bugs affecting its output (misleading users about their bitcoin holdings) then I suggest you use the Bitcoin testnet during beta.
If you want to test the site live on the Internet and are concerned about security bugs (exploits) then I suggest you protect it with a password and give it to only a few trustworthy testers. Also, use security best practices where possible, such as a dedicated OS account or virtual host with no access to non-empty wallets.
If you want to get something out in public quickly but are afraid of exploits or misguided forks of your code, use common sense. If the site is non-commercial (free to use and free of advertising) I won't care much about the license until its proprietary features make it the most popular site running Abe.
Thanks for your reply.
I believe I have taken all the steps needed to protect the setup. I do web programming as my day job, so it's been done before. But of course I cannot guarantee that I have not missed a bug, or that the code handles every specific char in someone's password, or that an odd but valid bitcoin address will be accepted, etc.
I plan to release the monitor part and announce the website at the same time, and then after 1-2 weeks of public testing, I will release the website code itself.
I prefer public domain, and AFAIK the license I publish my work under should not depend on the libs/code it uses or links, as long as the license is less restrictive. After all, the project is already a mix of 4 different licenses.
If you like, I can send you a link to the website and the monitor code by PM.
|
|
|
|
MORA
|
|
December 28, 2011, 07:50:03 PM |
|
Now, it gets a little hairy if you offer a proprietary service based on Abe's tables, and it needs a running Abe to keep those tables up to date. Maybe the law would consider that a "work based on" Abe even though the service only directly reads the tables. If in doubt, describe your plan to me. If I find it in keeping with the spirit of collaboration and the goals of Abe and Bitcoin, I will write a license exception giving it explicit permission to use Abe.
Yes, since it needs the data, one could argue that it's a work based on Abe. But since the only interface is indeed the database, both parts of the solution can be replaced and still work. However, I would consider the schema work to be covered by the AGPL and unique at the time it was published.
|
|
|
|
John Tobey (OP)
|
|
December 28, 2011, 10:04:29 PM |
|
I plan to release the monitor part and announce the website at the same time, and then after 1-2 weeks of public testing, I will release the website code itself.
If you stick to this schedule, there is no problem.
But since the only interface is indeed the database, both parts of the solution can be replaced and still work.
In theory, yes, but I would not be surprised if a court still considered the combination a work based on Abe. RMS discussed this in 1992 in regard to linking executables and libraries. I think the AGPL is designed to reproduce the situation in the context of online services.
What the lawyer said surprised me; he said that judges would consider such schemes to be "subterfuges" and would be very harsh toward them. He said a judge would ask whether it is "really" one program, rather than how it is labeled.
But it seems to me you intend to stay within the spirit of the license, so I apologize for going off topic.
|
|
|
|
slush
|
|
January 01, 2012, 09:51:39 PM |
|
Two weeks ago I installed Abe on one of my VPSes in order to run an Electrum server. Although it's a quad-core Xeon, it has poor I/O performance, so the initial indexing took around four days. A few days ago my database crashed and Abe ended up in an inconsistent state for some reason, which forced me to reindex the whole blockchain again (another four days of waiting).
Because of that experience, I decided to provide a MySQL dump of Abe to the public as a torrent file. If you want to install Abe, feel free to download the following file; it's a clean blockchain index up to block 160095: http://mining.bitcoin.cz/media/download/abe-160095.sql.gz.torrent
|
|
|
|
MORA
|
|
January 02, 2012, 08:27:19 AM |
|
Have you tested whether this dump works? I read somewhere in the documentation that Abe does not handle changes to the block file too well. Are the block files on two systems guaranteed to be identical, i.e. will the data point to the right location in files generated on a different system?
|
|
|
|
slush
|
|
January 02, 2012, 04:02:13 PM |
|
MORA, actually I don't know how Abe handles bitcoind's blockchain, but I would be really surprised if two initial imports of the blockchain led to two different database structures.
|
|
|
|
|