941  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 08, 2016, 08:52:26 PM
So say, @james, you only have the block data from 1 to 400000, stored locally on an SSD or whatever - say a ramdisk :)
How long does it take to "load" so that you have all the database content as of block 400000?
Do you mean reprocessing all the raw data from scratch? With 8 cores that would take 30 minutes or less, maybe a bit longer, depending on whether you validate all sigs from block 0.

If you have all the processed bundle files, then in the time it takes to memory map 200 files, the data set is accessible to the CPU. So that is up to the OS/HDD, but I am seeing it map all the files in a few seconds on a low-end laptop and less than a second on a VPS server.

Once the bundle files are mapped, block explorer (not just blockchain) level data is available for all addresses, blocks and tx.

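For illustration, the mapping step itself is just ordinary read-only mmap calls, roughly along these lines (a minimal sketch; the function name and error handling are illustrative, not the actual iguana code):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Map one bundle file read-only so the CPU can address its contents directly.
// Returns a pointer to the mapping and stores its size in *sizep, or NULL on error.
void *map_bundle_readonly(const char *fname,size_t *sizep)
{
    struct stat st;
    void *ptr;
    int fd = open(fname,O_RDONLY);
    if ( fd < 0 )
        return NULL;
    if ( fstat(fd,&st) < 0 || st.st_size == 0 )
    {
        close(fd);
        return NULL;
    }
    ptr = mmap(NULL,(size_t)st.st_size,PROT_READ,MAP_SHARED,fd,0);
    close(fd); // the mapping stays valid after the descriptor is closed
    if ( ptr == MAP_FAILED )
        return NULL;
    *sizep = (size_t)st.st_size;
    return ptr;
}

Mapping a couple hundred such files is just that many open/fstat/mmap calls, which is why the whole dataset becomes addressable within seconds.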
James
942  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 08, 2016, 07:51:40 PM
Assuming you already have the blockchain internally, it might be easier to just directly generate the bundle files. The key is the encoding, and if interop is not a constraint, you can just use a similar model without making it exactly compatible.
Sounds good, but can you express it in a way that I can understand - what does it actually mean the software has to do?
to encode the tx data in the following format:


struct iguana_txid
 {
    bits256 txid;
    uint32_t txidind,firstvout,firstvin,locktime,version,timestamp,extraoffset;
    uint16_t numvouts,numvins;
 } __attribute__((packed));

struct iguana_unspent
 {
    uint64_t value;
    uint32_t txidind,pkind,prevunspentind,scriptoffset;
    uint16_t hdrsi:14,type:4,vout:14;
 } __attribute__((packed));

struct iguana_spend
 {
    uint32_t spendtxidind,scriptoffset:31,coinbase:1;
    int16_t prevout;
    uint16_t numsigs:4,numpubkeys:4,p2sh:1,sighash:4,external:1,sequenceid:2;
 } __attribute__((packed));

struct iguana_pkhash
 {
    uint8_t rmd160[20];
    uint32_t pkind,pubkeyoffset;
 } __attribute__((packed));

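As a small illustration of how the index fields are meant to be used (a sketch under the assumption, described elsewhere in this thread, that unspents are stored per bundle as arrays indexed by unspentind starting at 1, with each txid carrying its firstvout):

#include <stdint.h>

// unspentind of a tx's Nth output, given the firstvout and numvouts fields from
// struct iguana_txid above; returns 0 (invalid) if the vout is out of range.
uint32_t unspentind_from_vout(uint32_t firstvout,uint16_t numvouts,uint16_t vout)
{
    return (vout < numvouts) ? firstvout + vout : 0;
}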
I describe this in the iguana child board in a lot more detail

Since I am not you, I do not know what expression requirements exist to allow proper parsing and semantic understanding by you.

Maybe you can ask the specific questions after reading the iguana threads?

James
943  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 08, 2016, 06:13:25 PM
James, can you please describe how you actually implemented your db?
I'm especially interested in the index part, which (according to you) fits into the CPU's cache.

I've been trying to read through your code, but it's shitloads of code and I don't have that much patience.
I once implemented sipa's secp256k1 functions in Go, so I do have some patience... but it has its limits :)

Say, I'd like to implement the read-only database you made, in another language - what does it take?

You would need a high-end CPU to fit it all in the L caches, and even then not all of the "DB" would fit into CPU cache unless you partitioned queries for the different datasets to different cores and strongly influenced the contents of each core's cache.

iguana_ramchain.c has most of what is needed, but right now it is a construction zone, as I am changing the data structures to support lossless encoding/decoding, and it is quite tricky doing that with all the constraints.

Assuming you already have the blockchain internally, it might be easier to just directly generate the bundle files. The key is the encoding, and if interop is not a constraint, you can just use a similar model without making it exactly compatible. Of course, having an independent implementation would be a fantastic cross check, so if you are up for that we can work out a strict memory layout. Then you will know you made things bug-free if it produces the identical file, and debugging is quite a chore without this...

I am starting to describe the internal data formats in some detail: https://bitco.in/forum/forums/iguana.23/

The codebase is highly tuned and optimized to do the parallel sync, and because of this I have each peer's thread create the first pass file(s). Then helper threads aggregate a bundle and generate the bundle file. The ramchain_iterate handles both the raw and expanded forms, so porting iguana_ramchain.c is probably a good place to start.

Functions ending with PT run in a peer's thread, and not all of them are needed. The bundlesaveHT function runs from a helper thread and is the one doing the combining.

James
944  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 08, 2016, 05:29:21 PM
This looks like the right way to go, but...

1) I failed to install the software on my Linux box following the docs, or to build the sources...

2) I don't think a TCP-based API is suitable for the needs of the DB we're discussing here

How do you imagine browsing through the content of the current 35M+ UTXO records with this TCP-based API?
If you just want an embeddable DB, http://sophia.systems/ is pretty good. I see they released a version 2 with hybrid in-memory, which is good, as my feedback to them was that having everything on HDD made it too slow.

Sophia 1.2 was 10x smaller than BDB and much faster, so the 2.0 version should be that much better.

But nothing would have the performance/footprint profile of the read-only memory-mapped bundles. I just can't double the size of the code just to add a DB, especially as I could use a deterministic reverse hash chain for a wallet's keypairs instead. Loss of the wallet passphrase would compromise the privkeys, but that seems the same result as storing independent privkeys in a DB encrypted with the same passphrase.

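To make the idea concrete, a deterministic reverse hash chain in this sense could be sketched as below. This is only an illustration, not the SuperNET wallet code: it assumes OpenSSL's SHA256 and that the i-th key is the passphrase hash iterated N-i times, so handing out keys in increasing i walks the chain backwards and an exposed key does not reveal the later ones.

#include <openssl/sha.h>
#include <stdint.h>
#include <string.h>

// key[i] = SHA256 applied (N - i) times to SHA256(passphrase), 0 <= i <= N.
// Since key[i] == SHA256(key[i+1]), knowing key[i] does not reveal key[i+1],
// but anyone holding key[i] can verify key[i+1] when it is later revealed.
void reverse_chain_key(unsigned char key[32],const char *passphrase,uint32_t N,uint32_t i)
{
    uint32_t iters = N - i;
    SHA256((const unsigned char *)passphrase,strlen(passphrase),key);
    while ( iters-- > 0 )
        SHA256(key,32,key);
}

As noted above, everything still hangs off the passphrase, so losing it compromises all the derived privkeys.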
Making slow but steady progress with the debugging and documenting it here: https://bitco.in/forum/threads/debugging-status.950/

James
945  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 07, 2016, 08:55:27 PM
There are some Intel chips with 128MB L4 cache (shared CPU/GPU). Obviously the further "away" it is from the cores, the slower it will be, but still you'd expect it to be at least twice as fast as the RAM.

http://www.extremetech.com/extreme/188776-how-l1-and-l2-cpu-caches-work-and-why-theyre-an-essential-part-of-modern-chips

https://forums.aida64.com/topic/2864-i7-5775c-l4-cache-performance/

The 10x to 40x speed improvements create very nonlinear performance boosts when the dataset is small enough.

James
946  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 06, 2016, 08:22:06 PM
Looks pretty interesting. I was looking for a lightweight DB to use for storing the privkeys for the wallet.

Thanks!

With the custom RAM data structures I have achieved millions of operations per second, but what performance level is possible depends on the specifics. Getting the entire dataset to fit inside the CPU's Lx cache is what gets things going really, really fast. Once things need to "swap" out to DRAM, it slows down sharply. And if it has to hit the HDD... even if it is an SSD, it is an order of magnitude or more slower.

But for non-performance-critical things like tracking wallet privkeys, these performance issues are not really an issue (unless you are satoshi and have thousands and thousands of privkeys).

James
947  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 06, 2016, 04:53:28 AM
You're mixing things up, probably because you don't really understand what the UTXO database actually does inside the bitcoin software.

Speed of accessing the records is important for obvious reasons.
The traditional reason is that you want to verify new transactions/blocks quickly and efficiently.
And there are new reasons, like hardware wallets or any kind of stealth payments.

The current approach from satoshi's client, where the wallet is glued to the node and keeps track of its addresses, updating their balances as new blocks appear - IMHO this solution has proven to be unreliable. Plus it is totally unfeasible for private offline wallets.

An efficient UTXO database that can quickly browse through all the records is definitely a must for the next generation of bitcoin node software.
I am just not aware of any existing engines that would be ready to replace the currently used LevelDB.
I think one eventually needs to be (and will be) created explicitly for Bitcoin, just like sipa's secp256k1 lib was created explicitly for Bitcoin, as the originally used OpenSSL implementation wasn't good enough to handle a crucial part of the software.
I had this discussion about 3 years ago with etotheipi, the ex-CEO of ex-Armory:

New blockchain management in Armory

https://bitcointalk.org/index.php?topic=144015.0

Now we are sure that Armory was a total loss as a business: they delivered the promised software but achieved exactly zero sales, not even a consulting contract for the ex-employees.

I'm not going to repeat the arguments from 3 years ago; interested readers can study the Armory trainwreck.

AFAIK Armory sometime in 2015 also understood that LevelDB is not really suitable and attempted to change the underlying engine to LightningDB:

https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database

Notwithstanding the above, I agree with you that distributed cryptocurrencies both pose unique requirements and provide unique opportunities for optimization. In particular, traditional database ACID is overkill for UTxO; something much simpler will work, thanks to the extraordinary replication that is fundamental to the concept of distributed cryptocurrencies.

I always like to collect the losers who deny that transactional consistency is critical. GLBSE and ASICMINER were the two most well-known occurrences of double payment. A couple of days ago BitBet joined the club, this time blaming it on miners. I'm going to quote the whole Qntra article here for my future reference:
A Miner Problem

Posted on March 2, 2016 by Mircea Popescu
As announced in #bitcoin-assets, BitBet was attacked earlier today through a transaction withholding mechanism. The attack unfolded as follows :

1. A BitBet bet (Jeb Bush will be Republicans' 2016 Presidential Nominee) was closed and resolved on 21 February. This created a list of winners with a reasonable expectation to be paid their winnings.

2. A first transaction, call it A1, was broadcast to satisfy the claims of the winners, spending some inputs and offering a fee of 0. This transaction was, as you'd expect, neither mined nor included in the mempools of most Bitcoin nodes.

3. A second transaction was broadcast, spending the same inputs as A1 but including a fee of 0.0001, call it A2. For a ~1.5kb transaction this fee is on the low side, so it could perhaps be argued that it not being mined was to be expected. Nevertheless, transaction A2 was also not included in the mempools of most Bitcoin nodes.

4. As neither transaction A1 nor A2 was mined after 54 (48) hours, a further transaction was broadcast, spending the same inputs as A1 and A2 and including a fee of 0.000175, call it A3. By any measure, a fee in excess of 10 satoshi per byte should be sufficient to have transactions mined. Nevertheless, contrary to expectation, transaction A3 was not included in either a block or the mempools of most Bitcoin nodes.

5. After a further 48 hours, a fourth transaction was broadcast, spending the same inputs as A1, A2 and A3, and including a fee of 0.00022, call it A4. Just like the previous three, transaction A4 was neither included in a block nor advertised by most Bitcoin nodes.

6. After a further 16 hours, transaction B was broadcast, which included the same outputs as transactions A1-A4 but different inputs. Transaction B, like transaction A4, included a fee of 0.00022. Transaction B was advertised by most Bitcoin nodes immediately thereafter, and was included in a block within half an hour of being broadcast.

7. Two hours after B was included in a block, transaction A1 was re-broadcast by an unknown third party. Twenty minutes later, the 0 fee, week old, not-advertised-by-anyone-ever transaction was included in a block.

On the basis of these events, the following allegations can readily be supported :

• That the notion of "a majority of Bitcoin nodes" is void of content, most of the relay network being under the control of the same entity and supporting functionality not contemplated by the Bitcoin protocol (such as selective mothballing of specific transactions).(1) This specifically means that "someone"(2) has the ability to nuke transactions out of the network irrespective of whether they are validly signed or not, directly replicating functionality already available to fiat governments in fiat payment processors such as Visa or Paypal.

• That a cartel of Bitcoin miners is deliberately and systematically withholding blocks for an interval of about 20 minutes to a half hour, so as to ensure themselves a (significant) advantage over any would-be competitors.(3)

• That neither of the above items makes sense or could long survive without the other, meaning that necessarily the culprit is one and the same. Note for the record that an entity controlling 51% of the network (as is by a large margin the case with the Antpool/F2pool cartel) can safely withhold blocks for an indefinite period – if a competitor publishes a longer chain, all the cartel has to do is keep mining on its own; eventually it will prevail.(4)

Note that there are no alternative theories that more parsimoniously explain the chain of observed phenomena, and consequently the validity of the above allegations is a matter of objective truth. One may disagree on the strength of subjective feeling, or put forth his own interpretations, but these – much as in the case of alternative explanations for biological evolution – suffer from a fundamental logical weakness.

Update March 2, 2016 18:49 UTC:  In a further development on this story, Bitbet has now announced a moratorium on payments until the issue is resolved. (N. ed.)

1. You should remember that the Bitcoin relay network has for many years been in very poor shape – there were as few as 60 nodes active last year around this time. Since then the situation has only deteriorated – the jury is still out as to whether BlueMatt's doohicky actually helped or hindered while it was functional, but there's no argument that the network degraded YoY.

2. Who, according to your own preference, may be the NSA or the Antpool/F2pool cartel.

3. The A1 transaction wasn't broadcast and then included in a block – at the time it was broadcast, the block containing it had already been mined – it simply hadn't yet been shared with the rest of the plebs is all.

4. 51% means that out of a day's 144 blocks, one has 74, and consequently gains a 3 block advantage over a competitor chain every single day. It is true that the cartel does not generally wish to advertise its presence (yet), and so to date they've avoided major reorgs. Bear in mind however that it is also the case that the remaining minority is not united but fragmented – so they don't actually need to resort to major reorgs.
regardless of what "DB" is used, of course transactional consistency is critical. Not sure how you got the impression that this was not the case.

Now in the case the network does a deep reorg, then regardless of what DB or non-DB is used, things will be out of sync, until the deep reorg propagates. As long as the non-DB solution generates the same results as a DB solution, then whatever the external issues are, are external to the local storage methods

James
948  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 10:51:52 PM
I think it's very important to be able to browse through all the records in the shortest possible time.

I disagree; the major requirements are:

verify utxo spend
verify utxo amount
remove used txo
add new utxos from block
reorganize revert utxo
Yes, this was my thinking with my design. Most of the operations above are very fast.
Adding utxos from a new block does require updating all the various datasets, but all in all it does not take much time.

The reorg/revert utxo is the only slow one, but I delay finalizing a bundle until 10 blocks past it, so the odds of a reorg invalidating a bundle are pretty small. If it happens, it takes several minutes to regenerate the bundle file.

The realtime bundle should be small enough that any reorg can simply recalculate from the max reorg depth.
949  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 09:56:02 PM
Once all the balances for all the addresses are verified at all the bundle boundaries, then "all" that is left is validating the realtime bundle. The brute force way would just be to regenerate a bundle that does not yet have all its blocks as each new block comes in. I guess it won't be so bad, as both the bloom filter and the open hashtable were designed to be incrementally added to.
Although I'm not quite sure what you mean by "brute force", indeed the way to go is to keep just a blockchain-secured snapshot of the utxo db.
It's the only way to go.
Everybody knows that, but nobody does anything about it :)
I have sha256 hashes of all the internal datasets, so it is possible to quickly find out where any mismatch is between two datasets (though it is contaminated with endian dependence at this time). How do you know who put the utxo hash into the blockchain? If there is a way to do this without being vulnerable to an attacker posting his own version, it would be great to have.

By brute force, I mean iterating through each block, tx and address and comparing the results to a known-good data source.

Once it confirms at a specific bundle boundary, it is all wrapped into a read-only filesystem, which by its nature is pretty tamper-proof and also transparently adds a standard lz compression/decompression layer.

This means the only thing that would ever need to be recalculated is the most recent bundle (max 2000 blocks), and even that can be reduced with partial bundles saved as the realtime bundle grows. It is a tradeoff between startup calculation time and runtime backups.

James
950  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 09:40:16 PM
Sounds like good stuff.
I'll give it a better look once I get sober :)

So can you refer to some specific code that implements this?
github.com/jl777/SuperNET

In the SuperNET/iguana dir the iguana_*.c files have this, mostly iguana_ramchain.c. There are no comments to speak of, and I like to pack as much into each line as possible, so it is 3x to 5x denser than most of the C code out there.

I will properly document everything once it is all finalized.

As you might imagine, it is quite tricky to get all the bits in the right place, so validating the dataset is my current priority. I added the ability for it to act as a lossless codec for the raw transactions, which allows a brute force comparison with the reference bitcoind.

Once that is confirmed, the next step is to validate the block explorer level data, but I am not familiar enough with any block explorer to generate the reference data. Hopefully someone around here is.

Once all the balances for all the addresses are verified at all the bundle boundaries, then "all" that is left is validating the realtime bundle. The brute force way would just be to regenerate a bundle that does not yet have all its blocks as each new block comes in. I guess it won't be so bad, as both the bloom filter and the open hashtable were designed to be incrementally added to.

James
951  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 09:25:01 PM
Quote
Not sure what sort of things "like to do anything on each single one of them" you would want to do across all UTXO. To my thinking, other than the total balance, the UTXO set does not span across all addresses, so if you can extract all the relevant tx per address efficiently, that is sufficient.
OK, so let me ask you this way.

You have your utxo database, or so you call it.
You get a new transaction and you have to verify it.
How long does it take?

I hope you know that you need to fetch each of the tx's inputs from the db, verify their consistency and then put the changed records back into the db.

FETCH:
For each vin, the txid/vout is mapped to the corresponding unspentind.
I have a bitwise hashtable that is at most 50% full, so an open hash table works pretty well. I think the worst case is about 20 collisions, but each one is just a matter of incrementing to the next item.

Once the txid is found, the vout tells us where the unspent is, as each txid has the firstvout as part of its data. At most 200 bundles need to be checked, but on average scanning backwards finds the txid in fewer than N/2 bundles, for obvious reasons.

Using listunspent as a comparison: that also has to traverse a linked list, and it would need to do that for each bundle, so updating the utxo is an order of magnitude less work. I don't have benchmarks yet, but it is probably in the 100 microsecond range.

VERIFY:
Once the unspentind is retrieved, the utxo bitmap is checked (instant)

PUT:
Assuming all is well, the utxo bitmap bit is set to spent (instant).

On startup the utxo bitmap does need to be regenerated and updated from the last bundle boundary. I plan to make utxo dataset saves more frequent than the 2000 block boundaries, maybe every 10 blocks, so only 10 blocks' worth needs to be replayed to get to a current utxo set.

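For illustration, the fetch/verify/put path above could be sketched roughly like this (not the actual iguana structures; the table layout and the first-4-bytes hash are placeholders):

#include <stdint.h>
#include <string.h>

struct txid_slot { uint8_t txid[32]; uint32_t txidind; }; // txidind == 0 means empty slot

// Linear-probe lookup in an open hash table that is kept at most 50% full.
// Returns the txidind, or 0 if the txid is not in this bundle.
uint32_t openhash_find_txid(const struct txid_slot *table,uint32_t tablesize,const uint8_t txid[32])
{
    uint32_t i,ind,h;
    memcpy(&h,txid,sizeof(h)); // cheap illustrative hash: first 4 bytes of the txid
    ind = h % tablesize;
    for (i=0; i<tablesize; i++, ind=(ind+1)%tablesize)
    {
        if ( table[ind].txidind == 0 ) // empty slot -> not present
            return 0;
        if ( memcmp(table[ind].txid,txid,sizeof(table[ind].txid)) == 0 )
            return table[ind].txidind;
    }
    return 0;
}

// utxo bitmap: a clear bit means unspent, a set bit means spent
int utxo_is_spent(const uint8_t *bitmap,uint32_t unspentind)
{
    return (bitmap[unspentind >> 3] >> (unspentind & 7)) & 1;
}

void utxo_mark_spent(uint8_t *bitmap,uint32_t unspentind)
{
    bitmap[unspentind >> 3] |= (uint8_t)(1 << (unspentind & 7));
}

Keeping the table no more than 50% full is what keeps the linear probe short.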
James
952  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 08:43:29 PM
In its raw form, with all the data scattered across the 200 bundles, a simplistic search for a specific address took 2.5 milliseconds on a 1.4GHz i5. Not a parallel process, but serial. Using a faster CPU and 8 cores, I think a complete search and sum for a specific address could be done in times of around 100 nanoseconds.
Yes, but I'm not interested in checking balances of addresses. I just want to know how fast it is to go through each of the UTXO db records - i.e. to do anything on each single one of them.

As for the rest, the memory mapping, storing it on disk and stuff, I think you're on to something.
But first I just want to know how fast your utxo db works compared to mine. :)

It took 2.5 milliseconds to do a listunspent for an address with thousands of tx on my laptop. It could have been any address (once I get all the bugs fixed).

If you wanted to scan all addresses, the totally unoptimized method would take a long time, as it is designed for parallel searching of specific addresses, but once the optimized dataset is there, a rich list could be calculated in a few seconds.

Not sure what sort of things "like to do anything on each single one of them" you would want to do across all UTXO. To my thinking, other than the total balance, the UTXO set does not span across all addresses, so if you can extract all the relevant tx per address efficiently, that is sufficient.

Maybe I missed something that needs to rescan all utxo? With my setup importprivkey is not needed, as all addresses are basically already there. If the millisecond time is too slow, it can of course be dereferenced into 6 bytes per utxo.

All utxo are in a linked list inside each bundle, and a hash table allows finding the end point directly. Additionally, a bloom filter is added to each bundle to quickly determine whether there is any need to search within the bundle.

If you just want all utxo to scan through, just iterate through the utxo bitmap, extract the utxo data and put it all into a single memory mapped file. But I am not sure it is worth spending a GB to do this, as all the unspent data is already in memory-mapped files, so 6 bytes per unspent is enough to make a list for each address's utxo. Then with a direct lookup of that address, you find all its utxo.

struct iguana_unspent // 28 bytes
 {
    uint64_t value;
    uint32_t txidind,pkind,prevunspentind,scriptoffset;
    uint16_t hdrsi:14,type:4,vout:14;
 } __attribute__((packed));

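As a sketch of how such a per-address list can be walked (illustrative only, assuming the struct above stored as a per-bundle array indexed from 1, prevunspentind linking the previous unspent of the same address, and a spent bitmap as described earlier):

#include <stdint.h>

// Walk one bundle's unspent list for an address backwards via prevunspentind,
// starting from the most recent unspentind for that address (found via the
// bundle's hash table / bloom filter). A set bit in spentbits means spent.
uint64_t sum_address_unspents(const struct iguana_unspent *U,const uint8_t *spentbits,uint32_t lastunspentind)
{
    uint64_t balance = 0;
    uint32_t ind = lastunspentind;
    while ( ind != 0 )
    {
        if ( ((spentbits[ind >> 3] >> (ind & 7)) & 1) == 0 ) // bit clear -> still unspent
            balance += U[ind].value;
        ind = U[ind].prevunspentind;
    }
    return balance;
}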
What I am saying is that there is no need for a DB. All operations can be done essentially in RAM via memory-mapped files, with a few exceptions for realtime address balances, which need r/w memory.

James
953  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 08:23:20 PM
For utxo, my approach with iguana creates a parallel set of datasets in read-only memory-mapped files. Arguably, this is the ultimate DB, as it simply can't get any faster than this.

All things needed for block explorer level queries are precalculated and put into the read-only files.

By having each bundle of 2000 blocks be mostly independent of all the other bundles, this not only allows for parallel syncing of the blockchain but also for using multiple cores for all queries regarding tx history.

I also calculate all address balances for each bundle, so by adding all the values across all the bundles you end up with the current balance as of the most recent bundle. Special-case handling of the realtime bundle gets us up-to-date block explorer data for all blocks, tx and addresses (even multisig).

I don't quite have everything done, but the basic structure is showing very good results. I also remove as much redundancy as possible during the creation of the read-only files, so instead of taking more space than the raw blockchain, it ends up taking about half the space. About half of that is the signatures, which can be purged on non-relaying nodes after validation.

The format of the utxo vector would just be the location of the unspent that is being spent; the location of the spend is implicit in the order of the vins.

[u0, u1, u2, u3, ...] is a list of 6 byte locators for the unspents that the corresponding vins in the bundle are spending, where each ui contains the bundle id and the unspentind. Each unspentind is simply the order it appears in the bundle (starting at 1).

The above is not directly usable, but it is invariant and can be created once and put into the read-only file(system). Using mksquashfs reduces the size of the index tables by almost 50%.

Given such a list for all bundles, a parallel multicore process can traverse them all and update a bitmap indicating which outputs are spent, so a 0 bit means unspent. Maybe 1 bit of space per utxo is as good as it gets, but it is 1 bit per vout, so more like 30MB in size. However, I think this can be compressed pretty well, with most of the early vouts already being spent (ignoring the satoshi ones). I haven't had a chance to generate an up-to-date utxo bitmap, but I hope to get one in the next week.

At 30MB or smaller, the entire utxo bitmap will fit into the CPU's L cache and provide 10x faster performance than RAM. RAM is actually quite slow compared to things that can be done entirely inside the CPU.

James
I'm sorry, I don't have time to dig into your code, though I'm sure it's good stuff.
So how do you actually store the db in memory, in simple words?

I use hashmaps and it's pretty fast, but also very primitive.
Plus it takes loads of memory and ages to load from the disk.
And writing it to disk is a separate problem.

I'm sure there must be plenty of room to improve it.

Can you say how many seconds you need to browse through all the current utxo records and calc, let's say, the sum of all the values?
With my engine I cannot get below 4 seconds anymore.

The UTXO db is over 35 million records now.
It really needs better shit than leveldb.
In its raw form, with all the data scattered across the 200 bundles, a simplistic search for a specific address took 2.5 milliseconds on a 1.4GHz i5. Not a parallel process, but serial. Using a faster CPU and 8 cores, I think a complete search and sum for a specific address could be done in times of around 100 nanoseconds.

But if the use case is to keep all the totals updated, then after each bundle is completed the thing to do is update the utxo as of that bundle boundary. That is a one-time calculation of adding the spends starting from the previous bundle boundary.

What I am saying is to create a write-once dataset in memory-mapped files. So it is stored in the file; memory mapping it allows the CPU direct access to it.

The one-time update would iterate through the bundle's spends and update the utxo bitmap, but it would also update the total balances for each address. So at the end of each bundle, there would be an up-to-date total of all spends for each address.

Within each bundle is also a set of balances for all addresses that occur in that bundle. Again, with a one-time update, all the address balances can be updated with the changes from that bundle. This is not a mainstream case, so I don't do this automatically in iguana, but it wouldn't add too much time to the overall processing, assuming these total balances are updated at each bundle boundary.

It would take one addition per vin in each bundle to get the total spends for the addresses in that bundle, and then one addition per address to the global balances. Doing it this way, getting an up-to-date balance would be a matter of indexing into the global balance list and then adjusting it by the realtime bundle's vins and vouts.

In one implementation I updated all the balances as new tx came in, so it took no time to get a total summary. Checking that total by summing all the accounts took a few milliseconds, but sorting it into a rich list did end up taking a few seconds. It used up HDD space for storing all the balances, 16 bytes per address, so the space required does add up if all addresses need to be updated, but a CPU can do tens of millions of adds quite quickly through RAM.

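A rough sketch of that bookkeeping (the names and the global address index are illustrative, not the iguana data structures): fold each finished bundle's per-address deltas into a global balance table, then answer a query by indexing that table and adjusting by the realtime bundle.

#include <stdint.h>

// At each bundle boundary, fold that bundle's per-address balance deltas into
// the global balance table (one int64_t per known address).
void fold_bundle_into_balances(int64_t *global,const int64_t *bundle_delta,uint32_t numaddrs)
{
    uint32_t i;
    for (i=0; i<numaddrs; i++)
        global[i] += bundle_delta[i];
}

// Current balance = global total as of the last bundle boundary, adjusted by
// the realtime bundle's vins/vouts for this address (computed elsewhere).
int64_t address_balance(const int64_t *global,uint32_t addrind,int64_t realtime_delta)
{
    return global[addrind] + realtime_delta;
}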
James
954  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 04:12:57 PM
It really is a pointless question.

I think it's very important to be able to browse through all the records in the shortest possible time.

For whom is the speed so important?

For the people who treat finances responsibly, transactional integrity will be the most important specification, immediately followed by the capability to integrate with existing financial and accounting middleware. Speed would be secondary or even tertiary, especially since the UTxO set is widely replicated and six confirmations take one hour on average.


All users value speed. Otherwise we would all still be using the slowest CPU that eventually completes the task.

While speed and reliability can be a tradeoff decision, it is actually possible to get more reliability and speed at the same time. For example, a result that depends on just one subsystem can be more reliable than one that requires many parts to all work.

Of course it is most likely that any such speedy and reliable system is more complicated to write, but once written it can be validated.

James
955  Bitcoin / Development & Technical Discussion / Re: An optimal engine for UTXO db on: March 05, 2016, 03:56:58 PM
For utxo, my approach with iguana creates a parallel set of datasets in read-only memory-mapped files. Arguably, this is the ultimate DB, as it simply can't get any faster than this.

All things needed for block explorer level queries are precalculated and put into the read-only files.

By having each bundle of 2000 blocks be mostly independent of all the other bundles, this not only allows for parallel syncing of the blockchain but also for using multiple cores for all queries regarding tx history.

I also calculate all address balances for each bundle, so by adding all the values across all the bundles you end up with the current balance as of the most recent bundle. Special-case handling of the realtime bundle gets us up-to-date block explorer data for all blocks, tx and addresses (even multisig).

I don't quite have everything done, but the basic structure is showing very good results. I also remove as much redundancy as possible during the creation of the read-only files, so instead of taking more space than the raw blockchain, it ends up taking about half the space. About half of that is the signatures, which can be purged on non-relaying nodes after validation.

The format of the utxo vector would just be the location of the unspent that is being spent; the location of the spend is implicit in the order of the vins.

[u0, u1, u2, u3, ...] is a list of 6 byte locators for the unspents that the corresponding vins in the bundle are spending, where each ui contains the bundle id and the unspentind. Each unspentind is simply the order it appears in the bundle (starting at 1).

The above is not directly usable, but it is invariant and can be created once and put into the read-only file(system). Using mksquashfs reduces the size of the index tables by almost 50%.

Given such a list for all bundles, a parallel multicore process can traverse them all and update a bitmap indicating which outputs are spent, so a 0 bit means unspent. Maybe 1 bit of space per utxo is as good as it gets, but it is 1 bit per vout, so more like 30MB in size. However, I think this can be compressed pretty well, with most of the early vouts already being spent (ignoring the satoshi ones). I haven't had a chance to generate an up-to-date utxo bitmap, but I hope to get one in the next week.

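A sketch of what replaying such a spend vector into the bitmaps could look like (the 16-bit bundle id / 32-bit unspentind split is an illustrative guess at the 6-byte layout, not the actual iguana encoding):

#include <stdint.h>

// One 6-byte spend locator: the bundle the spent output lives in (hdrsi) plus
// its unspentind inside that bundle.
struct spend_locator { uint16_t hdrsi; uint32_t unspentind; } __attribute__((packed));

// Replay one bundle's spend vector into the per-bundle spent bitmaps
// (spentbits[hdrsi] is the bitmap of the bundle the locator points into).
void mark_spends(const struct spend_locator *vec,uint32_t numvins,uint8_t **spentbits)
{
    uint32_t i;
    for (i=0; i<numvins; i++)
    {
        uint32_t ind = vec[i].unspentind;
        spentbits[vec[i].hdrsi][ind >> 3] |= (uint8_t)(1 << (ind & 7));
    }
}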
At 30MB or smaller, the entire utxo bitmap will fit into the CPU's L cache and provide 10x faster performance than RAM. RAM is actually quite slow compared to things that can be done entirely inside the CPU.

James

956  Alternate cryptocurrencies / Announcements (Altcoins) / Re: [PRE-ANN] WAVES. Ultimate crypto-tokens blockchain platform. on: March 05, 2016, 01:44:48 PM
I confirm that I will be advising and investing in this project, especially on methods to add BTC security to other chains and enabling asset mobility with the asset passport system.

James
957  Alternate cryptocurrencies / Announcements (Altcoins) / Re: Declaration of Independence - Atomic Cross Chain Asset Standard on: March 05, 2016, 12:53:37 AM
Once each block has those, then how can an attacker rewrite the history?

It's very easy for an attacker to rewrite history. Suppose the normal chain has blocks A1, A2, A3 which reference T1, T2, T3.

The attacker will create alternative blocks A1', A2', A3' which also reference T1, T2, T3.

Perhaps he won't be able to create block A4 before T4 is known, but this has nothing to do with rewriting history.

Once things go past the +/- 1 Ti block segment, maybe there are ways to overcome the earliest/latest bracketing. Maybe you can point out the obvious way to overcome the entire network checking each submitted block for a valid time sequence? It is possible, but it isn't totally wrong.

When you analyze consensus algorithms you should consider the behavior of a new node which joins the network and doesn't have any pre-assumptions. So, suppose a new node connects to one honest node and one attacker's node. Suppose the honest node gives it a chain which ends with A1...A100 and the attacker's chain ends with blocks A1'...A100'. How can the new node tell which of them is valid? Timestamps don't help here.

It can be mathematically proven that your block-based timestamps are no better than ordinary UNIX timestamps, assuming that node clocks aren't horribly out of sync.
I believe your mathematical proof is not using the proper assumptions. Of course you can make some assumptions to prove what you say, but a bigger context can provide the external reference for the bootstrapping node.

With many chains all part of the system, they will all have hashes from the other chains. Wouldn't that require an attacker to compromise the majority of participating chains?

Even assuming the attacker has more than half the nodes, the new node will see many nodes with A1...A100 along with the A1'...A100'; however, on all the other chains the A1'...A100' references are not there.

So all the participating chains provide security to all the other chains. If each chain specifies which external chains it considers valid, this will prevent an attacker from setting up a bunch of fake external chains.

Thank you for the specific example; this is what is needed to make a bulletproof system. I believe you are agreeing that the sequence server does work for nodes that are current, and that the issue is the history attack. Let me know how you can mathematically prove that having a set of external chains cross-verifying each other cannot provide an external reference for a bootstrapping node. Bootstrapping a node is a different case from confirming blocks, so it might be that more than one method is needed to cover both aspects.

James

P.S. While I would like the BTC chain to also have the data, clearly it won't go into Bitcoin Core, so the problem of how to validate data in the BTC chain needs to be solved before that can be used.
958  Alternate cryptocurrencies / Announcements (Altcoins) / Re: Declaration of Independence - Atomic Cross Chain Asset Standard on: March 04, 2016, 12:10:36 PM
Of course, if the external chain is some arbitrary ledger that can be updated at will by some corporate entity, then these security assumptions do not hold up. What is required is another rule in the external chain that after R permanent Ti blocks have passed, a block becomes permanent.

Aha, so we just need to have a consensus to have consensus, got it.

You're begging the question. The whole point of this exercise is to make sure that transaction history cannot be rewritten. If you just assume that it cannot be rewritten, then you don't need all this Ti mumbo-jumbo.

Inclusion of Ti does not make it any harder to rewrite the history, as the attacker has access to all Ti and thus will be able to create a seemingly valid chain which references all Ti blocks but is completely different.

I think you've got things wrong: to improve security you need to reference alt-coin blocks from Bitcoin blocks, not the other way around. This is known as anchoring; it's already used in several projects (for example, Factom) and it can make alt-coin consensus as strong as Bitcoin's. (See here: https://bitcointalk.org/index.php?topic=313347.0)

Inclusion of Ti allows creating new consensus rules that establish both a "before" and an "after" time relationship.

The "after" follows simply from common sense: if you refer to a specific blockhash, your block most likely happened after that blockhash came into existence. The exact time is a bit fuzzy, but we can narrow it down to a relatively small time frame.

Did I get that part totally wrong?

So how do we get the "before" part? This requires the chain to do some validation of all blocks and make sure each one refers to the currently valid Ti. There is a propagation delay that could make things off by one, but we assume propagation times in the network don't span a bitcoin block generation time.

If I didn't get things totally wrong, this means each block happened after Ti but before Ti+2 occurred.

So we can now tag each block with an "earliest" and "latest" time reference.

Once each block has those, how can an attacker rewrite the history? He can only create blocks within the allowed "earliest" and "latest" range of 2 bitcoin blocks. Yes, within these 20 minutes I guess he can do whatever he wants, if that is the assumption. So any transaction within this timeframe is like a transaction without enough confirmations.

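As an illustration of the bracketing rule (a sketch only; the types, names and the off-by-one allowance are illustrative assumptions, not a specification):

#include <stdint.h>
#include <string.h>

// Each submitted block carries the BTC blockhash Ti it saw. The network accepts
// it only if that is the current Ti (or the previous one, to allow for the
// off-by-one from propagation delay), which brackets the block between
// time(Ti) and roughly time(Ti) + 2 BTC block intervals.
struct ti_ref { uint8_t btchash[32]; uint32_t btctime; };

int check_ti_reference(const uint8_t refhash[32],
                       const struct ti_ref *current,const struct ti_ref *previous,
                       uint32_t btc_blocktime,uint32_t *earliest,uint32_t *latest)
{
    const struct ti_ref *match = NULL;
    if ( memcmp(refhash,current->btchash,32) == 0 )
        match = current;
    else if ( memcmp(refhash,previous->btchash,32) == 0 ) // off-by-one allowed
        match = previous;
    if ( match == NULL )
        return 0; // stale or unknown reference -> reject the block
    *earliest = match->btctime;                 // block cannot predate this Ti
    *latest = match->btctime + 2*btc_blocktime; // approx: before Ti+2 appears
    return 1;
}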
Once things go past the +/- 1 Ti block segment, maybe there are ways to overcome the earliest/latest bracketing. Maybe you can point out the obvious way to overcome the entire network checking each submitted block for a valid time sequence? It is possible, but it isn't totally wrong.

Also, I think you are the one that has it backwards, as it requires some blockchain group signature to validate data that is put into the BTC chain for use as a cross reference. That is possible, but it is a bit on the complicated side, though I do have ideas on how to do it. I just don't have all the details yet.

For now, I want to make sure the time sequence can be established across all chains that follow the same simple set of rules.

James
959  Alternate cryptocurrencies / Announcements (Altcoins) / Re: Declaration of Independence - Atomic Cross Chain Asset Standard on: March 04, 2016, 11:59:20 AM
Hi James,

How does this relate to the (concept of the) BlockNet project? Any thoughts?

thanks
No idea about the BlockNet details.
960  Alternate cryptocurrencies / Announcements (Altcoins) / Re: Declaration of Independence - Atomic Cross Chain Asset Standard on: March 04, 2016, 11:58:46 AM

Where is your claimed Privatebet project?
Where is your claimed Skynet project?
Where is your claimed NXT2PAY project?
Where is your claimed Pangea project?
Where is your claimed crypto777 project?
Where is your claimed InstantDEX project?
Where is your claimed NXTcoinsco project?
Where is your claimed NXTprivacy project?
Where is your claimed jl777hodl project?
Where is your claimed SuperHODL project?
Where is your claimed SNN project?
Where is your claimed SpaceMesh project?
Where is your claimed omnigames project?
Skynet: http://docs.finhive.com/api/docs.cgi?run=page&sub=home
Privatebet, Pangea, crypto777, InstantDEX, NXTcoinsco and NXTprivacy are in the SuperNET codebase: docs.supernet.org

jl777hodl and SuperHODL have been complete for over a year, since they are still doing the HODL thing.

Spacemesh https://docs.google.com/spreadsheets/d/1ya9fMS0Z0-xSew2rhLcHKbB9y4HVERwk9t-G1W4PzlY/edit?pref=2&pli=1

But you got me on the others; those projects I am not directly involved in, and they are of unknown status.

Also, didn't you notice this is a call for an interop standard and that I am not asking for any funds?

James