John Tobey (OP)
|
|
January 02, 2012, 05:14:24 PM |
|
It's not that the dumps would be different, but the file offset stored in the datadir table would probably not be at a block boundary in the local block file. The solution is to reset the pointer. Abe should have a command-line option for this, or even do it automatically, but currently we do it with:

UPDATE datadir SET blkfile_number = 1, blkfile_offset = 0;

The next run will spend a few(?) minutes scanning the block file, skipping blocks already loaded via the dump. Thanks for the torrent!
|
|
|
|
MORA
|
|
January 02, 2012, 06:09:23 PM |
|
So those of us who have Abe running on a hosted machine with bandwidth to spare could offer a "live" dump to speed things up. Or we could just seed the torrent.
|
|
|
|
John Tobey (OP)
|
|
January 02, 2012, 06:36:51 PM |
|
Yup, or someone with time to spare might write export and import functions, dumping and loading the data in a bitcoin-specific, db-neutral format. If that runs pretty fast, write a translator from block files to that format, and it might approach the speed of torrent+mysql for the initial load. The main thing, I suspect, is to create indexes after the tables have data.
|
|
|
|
slush
Legendary
Offline
Activity: 1386
Merit: 1097
|
|
January 02, 2012, 07:17:07 PM |
|
John, will Abe check database consistency after starting up from an initial db import, and check that the db and the local blockchain in bitcoind are the same? I mean - isn't blindly using an export from some unknown entity a potential attack vector?
At least the torrent file has a checksum, so anybody who trusts me can trust the torrent download, too. But it would be nice to know that Abe is checking it by itself...
|
|
|
|
John Tobey (OP)
|
|
January 03, 2012, 02:06:09 AM Last edit: January 03, 2012, 02:17:04 AM by John Tobey |
|
John, will Abe check database consistency after starting up from an initial db import, and check that the db and the local blockchain in bitcoind are the same? I mean - isn't blindly using an export from some unknown entity a potential attack vector?
Abe verifies proof of work and, as of 0.6, transaction Merkle trees on import. Yes, an export/import tool like this should come with caveats about trust. There's a verify.py script (possibly out of date) that verifies the Merkle roots already loaded, and it would be simple to add proof-of-work checks there or as part of an import tool. Of course, if it is part of a system for fast loading of a local, known-good block chain, it's not so vulnerable.

Edit: By "verifies proof of work" I do not mean checking hashes against the target or difficulty, just verifying that the "previous block hash" is indeed the hash of the previous block's header. Adding a target check would be nice, though challenging for alternative chains that represent target and proof differently.
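The linkage check described here can be sketched in a few lines of Python. This is an illustrative reconstruction, not Abe's actual code: it checks only that each header's "previous block hash" field (bytes 4:36 of the 80-byte serialized header) matches the double-SHA256 of the header before it, with no target or difficulty check.

```python
import hashlib

def block_hash(header):
    """Double-SHA256 of an 80-byte serialized block header."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()

def links_ok(headers):
    """Verify the chain linkage the post describes: each header's
    'previous block hash' field (bytes 4:36) must equal the hash of
    the preceding header.  No proof-of-work target check is done."""
    for prev, cur in zip(headers, headers[1:]):
        if cur[4:36] != block_hash(prev):
            return False
    return True
```

A target check would additionally decode the nBits field and compare the hash against it, which is where alternative chains with different proof representations complicate things.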
|
|
|
|
slush
Legendary
Offline
Activity: 1386
Merit: 1097
|
|
January 06, 2012, 04:59:11 AM |
|
Today I tried to understand Abe's source code, and although I'm still confused, I may understand it a bit more than before. From what I see, Abe parses the blockfile and reconstructs blockchains and transactions in SQL with many checks. What happens when a block stored in the blockfile is orphaned, or the blockchain is forked? Does Abe handle such cases correctly? Afaik the blockfile is just a dumb store of block structures, so Abe should already be doing all the validation.
Why I'm asking:
I don't like that Abe needs the blockchain stored locally; it makes it far less flexible. For example, running a full Abe installation (bitcoind + database + application) on a VPS is pretty problematic because of memory consumption and required disk I/O (bitcoind itself uses the disk a lot, plus Abe keeps the disk busy with database writes). For running Stratum servers (where I want to use Abe internally, at least for the initial implementation), I need as small a footprint as possible, so that I can run a Stratum server even on a cheap VPS.
I already have some experience with communicating over the Bitcoin P2P protocol, so I have an idea of patching Abe to load blocks and transactions directly from the network. In that case, Abe would need only a (trusted?) bitcoin node to connect to on port 8333. Unfortunately, my networking code does not do any block/transaction validation; it just receives messages and parses them into Python objects. So my question is related to this: when I feed Abe those deserialized data from the P2P network, will Abe check everything necessary to keep a consistent index in the database?
|
|
|
|
MORA
|
|
January 06, 2012, 07:26:51 AM |
|
I don't like that Abe needs blockchain stored locally, it makes it far less flexible, For example running full Abe installation (bitcoind + database + application) on VPS is pretty problematic, because of memory consumption and required disk I/O (bitcoind itself is using disk a lot, plus Abe makes disk busy by database writes).
Hmm, I run it on a (good) VPS. Granted, it uses quite a bit of disk space, but other than that it works fine. The Abe block viewer is a bit slow, but I don't think that would change just by moving the bitcoind files off the server?

up 2 days, 11 min, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 78 total, 1 running, 77 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1027308k total, 1013608k used, 13700k free, 15900k buffers

Of that, 707MB is cache, so real usage is just 316MB of memory. You could split it out across several servers, but of course that would not make it cheaper. Abe just needs the files; it should not care whether bitcoind is in fact running on the machine, i.e. a network-mounted share would do. And Abe will connect to a MySQL server other than localhost without problems.
|
|
|
|
slush
Legendary
Offline
Activity: 1386
Merit: 1097
|
|
January 06, 2012, 10:14:40 AM |
|
Hmm, I run it on a (good) VPS. Granted, it uses quite a bit of disk space, but other than that it works fine. The Abe block viewer is a bit slow, but I don't think that would change just by moving the bitcoind files off the server?
The key is RAM; it's pretty slow for you because the database doesn't fit in RAM and every request spins the HDD a lot. In an ideal world, the full Abe database would be loaded into memory, which is pretty hard to achieve on a VPS (the MySQL database is actually around 4.5GB), but at least the database indexes should fit into memory (around 1.5 GB), which is doable. A server with less memory will give poor performance, exactly as you're reporting. Moving bitcoind off the machine can save around 200 MB of RAM and a significant portion of the disk I/O. Your idea of mounting the blockfile over the net would probably work, you're right. But it is still more of a hack than a real solution; you still need disk access to the blockchain, and handling failover of an NFS mount is much harder than providing a pool of trusted P2P nodes to connect to. If John confirms that my idea of feeding from the P2P network will work, I'll try it. Otherwise I'll set up NFS mounts...
|
|
|
|
John Tobey (OP)
|
|
January 06, 2012, 05:24:14 PM |
|
Today I tried to understand Abe's source code, and although I'm still confused, I may understand it a bit more than before. From what I see, Abe parses the blockfile and reconstructs blockchains and transactions in SQL with many checks. What happens when a block stored in the blockfile is orphaned, or the blockchain is forked? Does Abe handle such cases correctly? Afaik the blockfile is just a dumb store of block structures, so Abe should already be doing all the validation.
Abe has logic to attach orphaned blocks and reorganize a forked chain. As far as I know it works, but it is the area I would most like to test when I have time. Relevant code: adopt_orphans and _offer_block_to_chain in Abe/DataStore.py.
Why I'm asking:
I don't like that Abe needs the blockchain stored locally; it makes it far less flexible. For example, running a full Abe installation (bitcoind + database + application) on a VPS is pretty problematic because of memory consumption and required disk I/O (bitcoind itself uses the disk a lot, plus Abe keeps the disk busy with database writes). For running Stratum servers (where I want to use Abe internally, at least for the initial implementation), I need as small a footprint as possible, so that I can run a Stratum server even on a cheap VPS.
I already have some experience with communicating over the Bitcoin P2P protocol, so I have an idea of patching Abe to load blocks and transactions directly from the network. In that case, Abe would need only a (trusted?) bitcoin node to connect to on port 8333. Unfortunately, my networking code does not do any block/transaction validation; it just receives messages and parses them into Python objects. So my question is related to this: when I feed Abe those deserialized data from the P2P network, will Abe check everything necessary to keep a consistent index in the database?
Abe does not validate blocks beyond what's needed to "checksum" a chain up to a trusted current-block hash. Complete block validation is very hard and not on my priority list, though I might add hooks to use external logic. (Wrapping Abe.DataStore.import_block with a subclass might suffice.)

I don't expect a problem feeding Abe deserialized data. You would need a structure like the one created by Abe.deserialize.parse_Block, with one extra element: 'hash', whose value is the block header hash as a binary string. The structure is based on Gavin's BitcoinTools. You would pass that structure "b" to store.import_block(b, frozenset([1])). (chain_id 1 = main BTC chain.) Abe.DataStore.import_blkdat does this for every block in blk0*.dat that was not previously loaded.

One way to shrink the footprint would be to add support for binary SQL types. Abe supports the SQL 1992 BIT type and tests for it at installation, but the only database that passes the test is SQLite, which is unsuitable for large servers. On MySQL and all the others, Abe falls back to binary_type=hex and stores scripts and hashes in hexadecimal, wasting half the bytes. Relevant code is in DataStore: configure_binary_type, _set_sql_flavour (beneath the line "val = store.config.get('binary_type')"), and _sql_binary_as_hex, where Abe translates DDL from standard BIT types to CHARs of twice the length.

Another improvement would be to remove unneeded features (or, ideally, make them optional) such as the Coin-Days Destroyed calculation (block_tx.satoshi_seconds_destroyed etc.) and the unused pubkey.pubkey column.
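The feeding loop slush describes might look roughly like this. Only import_block, parse_Block, and the 'hash' requirement come from the post above; receive_blocks() is a hypothetical stand-in for his P2P deserializer, and the glue is a sketch, not Abe's actual API.

```python
import hashlib

def add_block_hash(b, raw_header):
    """Attach the extra 'hash' element that import_block expects:
    the double-SHA256 of the 80-byte serialized block header,
    as a binary string."""
    b['hash'] = hashlib.sha256(
        hashlib.sha256(raw_header).digest()).digest()
    return b

# Hypothetical glue: receive_blocks() is assumed to yield
# (raw_header, parsed_block_dict) pairs from the P2P layer, where
# parsed_block_dict has the shape produced by Abe.deserialize.parse_Block.
#
# for raw_header, b in receive_blocks():
#     add_block_hash(b, raw_header)
#     store.import_block(b, frozenset([1]))  # chain_id 1 = main BTC chain
```

The hash has to be computed P2P-side because parse_Block-style structures carry the deserialized fields, while the header hash is defined over the serialized bytes.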
|
|
|
|
MORA
|
|
January 13, 2012, 12:31:35 PM |
|
How do you support multiple chains? I would like to add LTC and NMC to my database, but as I understand it, I would have to run their forks of bitcoind, which would each make a new dir of blockchain files.
|
|
|
|
John Tobey (OP)
|
|
January 14, 2012, 05:45:35 PM |
|
How do you support multiple chains? I would like to add LTC and NMC to my database, but as I understand it, I would have to run their forks of bitcoind, which would each make a new dir of blockchain files.
See the comments about "datadir" in the sample abe.conf.
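For reference, multiple datadir entries in abe.conf might look something like this. The paths and the exact key set here are illustrative; the comments in the sample abe.conf shipped with Abe are authoritative.

```
datadir = [
    { "dirname": "/home/abe/.bitcoin",  "chain": "Bitcoin"  },
    { "dirname": "/home/abe/.litecoin", "chain": "Litecoin" },
    { "dirname": "/home/abe/.namecoin", "chain": "Namecoin" } ]
```

Each entry points Abe at one daemon's block-file directory, so the LTC and NMC forks of bitcoind keep their own datadirs and Abe scans all of them into one database.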
|
|
|
|
MORA
|
|
January 22, 2012, 02:04:04 PM |
|
Silly me, of course it was documented. I tried the SQL file from the torrent today, since the VPS was having a hard time catching up. It didn't work out too well: after importing the SQL, one needs to manually recreate all the views, because they contain a SECURITY DEFINER pointing to an invalid user. But then Abe can import; however, once it's done skipping rows, it fails because it tries to insert a block at id 1. If I rerun it, it tries id 2, 3, 4, etc. I gave up in the end and started the import over again. Poor VPS.
|
|
|
|
slush
Legendary
Offline
Activity: 1386
Merit: 1097
|
|
January 22, 2012, 02:39:23 PM |
|
MORA, did you update the file pointer in the DB, as someone stated here?
|
|
|
|
MORA
|
|
January 22, 2012, 02:54:49 PM |
|
Yes, I added it to the bottom of the import script. I also ran it again after fixing the views, to make it rescan and then fail :/
|
|
|
|
John Tobey (OP)
|
|
January 22, 2012, 05:58:54 PM |
|
But then Abe can import; however, once it's done skipping rows, it fails because it tries to insert a block at id 1. If I rerun it, it tries id 2, 3, 4, etc.
There must be a problem with the identifier sequences. You can save the database by setting them to correct values. For portability, Abe supports several methods of ID generation (called "sequences" in some DBMSs). You can find out which implementation it chose with:

mysql> select configvar_value from configvar where configvar_name = 'sequence_type';

I assume this value is 'mysql' in your case. The 'mysql' sequence implementation associates an empty table with just one column (an auto_increment) with each sequenced table. For example, the next `block_seq`.`id` would become the next `block`.`block_id`. Apparently, the dump/load process did not preserve the tables' internal counters. This script might fix things for you:

INSERT INTO block_seq (id) SELECT MAX(block_id) FROM block;
DELETE FROM block_seq;
INSERT INTO magic_seq (id) SELECT MAX(magic_id) FROM magic;
DELETE FROM magic_seq;
INSERT INTO policy_seq (id) SELECT MAX(policy_id) FROM policy;
DELETE FROM policy_seq;
INSERT INTO chain_seq (id) SELECT MAX(chain_id) FROM chain;
DELETE FROM chain_seq;
INSERT INTO datadir_seq (id) SELECT MAX(datadir_id) FROM datadir;
DELETE FROM datadir_seq;
INSERT INTO tx_seq (id) SELECT MAX(tx_id) FROM tx;
DELETE FROM tx_seq;
INSERT INTO txout_seq (id) SELECT MAX(txout_id) FROM txout;
DELETE FROM txout_seq;
INSERT INTO pubkey_seq (id) SELECT MAX(pubkey_id) FROM pubkey;
DELETE FROM pubkey_seq;
INSERT INTO txin_seq (id) SELECT MAX(txin_id) FROM txin;
DELETE FROM txin_seq;
If you have a chance to try it, please let us know the result.
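The per-table pattern in the script above is mechanical, so it can also be generated instead of typed out. A small hypothetical sketch (the table list is taken from the script above) that prints the same statements:

```python
# Tables that have an associated <name>_seq helper table under Abe's
# 'mysql' sequence implementation, per the repair script above.
TABLES = ["block", "magic", "policy", "chain", "datadir",
          "tx", "txout", "pubkey", "txin"]

def reset_statements(table):
    """Return the two statements that bump <table>_seq's auto_increment
    counter past MAX(<table>_id), then clear the helper row.  The
    counter survives the DELETE, which is the point of the trick."""
    return [
        "INSERT INTO %s_seq (id) SELECT MAX(%s_id) FROM %s;"
        % (table, table, table),
        "DELETE FROM %s_seq;" % table,
    ]

for t in TABLES:
    for stmt in reset_statements(t):
        print(stmt)
```

Piping the output into the mysql client would have the same effect as running the script by hand.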
|
|
|
|
Remember remember the 5th of November
Legendary
Offline
Activity: 1862
Merit: 1011
Reverse engineer from time to time
|
|
January 26, 2012, 04:41:38 PM Last edit: January 27, 2012, 10:22:31 PM by Remember remember the 5th of November |
|
Here is an offset for you guys to use if you indexed above block 93,000 (I rounded this) and somehow messed up Abe.
Change the offset to 52169990 and Abe will not start from 0; use offset 930044092 for block 163k or so.
|
BTC:1AiCRMxgf1ptVQwx6hDuKMu4f7F27QmJC2
|
|
|
molecular
Donator
Legendary
Offline
Activity: 2772
Merit: 1019
|
|
January 27, 2012, 08:57:26 AM |
|
I'm experiencing a problem I can't quite put my finger on:

litecoin@void:~/bitcoin-abe$ Abe/abe.py --config=abe_pg.conf --no-serve
block_tx 69604 230654
block_tx 69604 230655
Failed to catch up {'blkfile_number': 1, 'dirname': '/home/litecoin/.litecoin', 'chain_id': None, 'id': Decimal('2'), 'blkfile_offset': 393411764}
Traceback (most recent call last):
  File "/home/litecoin/bitcoin-abe/Abe/DataStore.py", line 2141, in catch_up
    store.catch_up_dir(dircfg)
  File "/home/litecoin/bitcoin-abe/Abe/DataStore.py", line 2162, in catch_up_dir
    store.import_blkdat(dircfg, ds)
  File "/home/litecoin/bitcoin-abe/Abe/DataStore.py", line 2277, in import_blkdat
    store.import_block(b, chain_ids = chain_ids)
  File "/home/litecoin/bitcoin-abe/Abe/DataStore.py", line 1597, in import_block
    b['ss'] = prev_ss + ss_created - b['ss_destroyed']
TypeError: unsupported operand type(s) for +: 'NoneType' and 'long'
any idea?
|
PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0 3F39 FC49 2362 F9B7 0769
|
|
|
Remember remember the 5th of November
Legendary
Offline
Activity: 1862
Merit: 1011
Reverse engineer from time to time
|
|
January 27, 2012, 10:15:07 PM |
|
Had that happen to me. I tried everything; in the end I had to start ALL OVER again.
|
BTC:1AiCRMxgf1ptVQwx6hDuKMu4f7F27QmJC2
|
|
|
molecular
Donator
Legendary
Offline
Activity: 2772
Merit: 1019
|
|
January 28, 2012, 09:18:11 AM |
|
Had that happen to me. I tried everything; in the end I had to start ALL OVER again.
Clearly the reason is a missing value for "block_satoshi_seconds". Here are the last 2 blocks:

litecoin=> select * from block where block_id >= 68441;
-[ RECORD 1 ]---------+-----------------------------------------------------------------------------
block_id              | 68441
block_hash            | 32b88211213afa2967292d62d9ef7062701c0eefba5fc93357150007475f48f0
block_version         | 1
block_hashmerkleroot  | 479d09b39383373e6966073d9a5d36fbe7daf844f6a69305e959850449f84331
block_ntime           | 1326989099
block_nbits           | 486596529
block_nnonce          | 32342
block_height          | 67997
prev_block_id         | 68440
block_chain_work      | 00000000000000000000000000000000000000000000000000000000000000008cf2a594e20c
block_value_in        | 50001000270
block_value_out       | 55001000270
block_total_satoshis  | 341354059492795
block_total_seconds   | 9016434
block_satoshi_seconds | 1341706815903229013235
block_total_ss        | 1752272617217106557696
block_num_tx          | 2
block_ss_destroyed    | 203189392579199160
-[ RECORD 2 ]---------+-----------------------------------------------------------------------------
block_id              | 68442
block_hash            | 36228438219fb8599bcdd96163c025e1930b97383c025f9b4950636f9875c3d9
block_version         | 1
block_hashmerkleroot  | 3803d4347db77a2277e484f06f2043f643e8cc1914fbcbc7b344b959744f3b8b
block_ntime           | 1326989519
block_nbits           | 486596529
block_nnonce          | 32838
block_height          | 67998
prev_block_id         | 68441
block_chain_work      | 00000000000000000000000000000000000000000000000000000000000000008cf3ca8e7ea6
block_value_in        | 56233821563
block_value_out       | 61233821563
block_total_satoshis  | 341359059492795
block_total_seconds   | 9016854
block_satoshi_seconds |
block_total_ss        |
block_num_tx          | 4
block_ss_destroyed    |
Now, trying to fix this somehow, I can see 2 options:
- remove that last block (block_id 68442) from the db and have it be reinserted (I don't know how exactly)
- fill block 68442's (the last block's) block_satoshi_seconds and block_total_ss with correct values
Ideas?
|
PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0 3F39 FC49 2362 F9B7 0769
|
|
|
molecular
Donator
Legendary
Offline
Activity: 2772
Merit: 1019
|
|
January 28, 2012, 09:53:34 AM |
|
OK, I did this:

litecoin=> update block set block_satoshi_seconds=1335443292248335697512, block_total_ss=1750296912974188663950, block_ss_destroyed=218178294634342867 where block_id=68442;

which I think is correct (values taken from another instance). Back to an operational state.
|
PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0 3F39 FC49 2362 F9B7 0769
|
|
|
|