Author Topic: [PRE-ANN] Qeditas: A Formal Library as a Bitcoin Spin-Off  (Read 15002 times)
Bill White (OP)
Member, Activity: 118, Merit: 11
June 12, 2015, 01:57:36 PM
Last edit: August 05, 2015, 08:41:46 PM by Bill White
#61

I added some OCaml code to the qeditas git repo:

git clone git://qeditas.org/qeditas.git

It includes code for type checking and proof checking, cryptography related code, and code for assets (including documents with definitions, proofs, etc.), transactions, compact trees (a form of Merkle tree for approximating the state) and blocks. There is still a significant amount to be done (consensus code, networking code, interface). However, it's probably best for the code so far to be public. This shows the project is progressing and gives people a chance to comment on the code or ask questions about it.

It can be compiled with configure and make or with the script makebytecode.

In some ways Qeditas has diverged from the Coq formalization (qeditastheory). After adding theories, it was unclear how to handle ownership of objects and propositions. Someone can claim ownership of a proposition after proving it, but it was unclear if this ownership should be specific to a theory or should hold for all theories. Making it general seems to be wrong because some propositions are provable in some theories but not in others. As a simple example, consider excluded middle, i.e., the proposition (forall x:Prop, x or not x): it is provable in a classical theory but not in an intuitionistic one. On the other hand, making ownership specific to theories introduces the problem that someone could simply start a new theory with a minor difference from an existing theory and then copy and republish all the work done for the existing theory.

My decision for now is to have (potentially) separate owners for both the general version and the specific version. This means Alice can be the first to prove excluded middle in Theory A and become the owner of the general version as well as the specific version for Theory A. Later Bob could prove excluded middle in a different Theory B, but he could only claim ownership of the specific version for Theory B (since Alice already owns the general version). If someone wanted to import excluded middle (without a proof) in Theory A, Alice could require the purchase of two kinds of rights: general and specific. If someone wanted to import excluded middle (without a proof) in Theory B, Alice could require general rights and Bob could require specific rights.
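The two-level ownership rule above can be sketched as a small registry. This is an illustrative Python model only (the class and method names are hypothetical, not Qeditas code, which is OCaml):

```python
# Hypothetical sketch of the two-level ownership rule: first prover anywhere
# owns the general version, first prover per theory owns the specific version.

class OwnershipRegistry:
    def __init__(self):
        self.general = {}   # proposition hash -> owner of the general version
        self.specific = {}  # (theory id, proposition hash) -> owner

    def claim(self, theory, prop, prover):
        """Record a proof; setdefault keeps the earliest claimant."""
        self.general.setdefault(prop, prover)
        self.specific.setdefault((theory, prop), prover)

    def rights_needed(self, theory, prop):
        """Importing prop (without proof) into theory requires rights from
        both the general owner and the theory-specific owner, if they exist."""
        owners = set()
        if prop in self.general:
            owners.add(self.general[prop])
        if (theory, prop) in self.specific:
            owners.add(self.specific[(theory, prop)])
        return owners

reg = OwnershipRegistry()
reg.claim("A", "excluded_middle", "Alice")  # Alice proves it first, in Theory A
reg.claim("B", "excluded_middle", "Bob")    # Bob proves it later, in Theory B
assert reg.rights_needed("A", "excluded_middle") == {"Alice"}
assert reg.rights_needed("B", "excluded_middle") == {"Alice", "Bob"}
```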

Qeditas will support "bounties" on unproven propositions, but these should only be placed on the theory specific versions. If there is a bounty on a general proposition, then anyone can publish an inconsistent theory (e.g., the theory with one axiom: False), give a trivial proof of the proposition in that theory, and then claim the bounty. This restriction on bounties should be enforced at the protocol level, but it currently isn't.

The code for signing via endorsements is there (see signat.ml and script.ml). Endorsements are Bitcoin signed text messages of the form

endorse <Qeditas address>

People can use these to sell their initial distribution even before the Qeditas network is running. I will make a separate post about endorsements with details.
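Composing and checking the endorsement message text can be sketched as follows. This is only an illustration of the message format (the "endorse <address>" form, which a later post in this thread settles on); the actual signature is a standard Bitcoin signed text message, handled in signat.ml, and the address shown is a placeholder:

```python
# Illustrative sketch of the endorsement message format only; signing and
# signature verification (Bitcoin signed messages) are not shown.

def endorsement_message(qeditas_addr: str) -> str:
    return f"endorse {qeditas_addr}"

def parse_endorsement(msg: str):
    """Return the endorsed Qeditas address, or None if the format is wrong."""
    prefix = "endorse "
    if msg.startswith(prefix):
        return msg[len(prefix):]
    return None

assert parse_endorsement(endorsement_message("QXYZ")) == "QXYZ"  # placeholder address
assert parse_endorsement("some other message") is None
```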

Bill White (OP)
June 23, 2015, 09:56:32 PM
Last edit: June 26, 2015, 03:29:53 PM by Bill White
#62

I'm once again having trouble with github, so none of my repositories are visible at the moment. It's unfortunate since github is such a popular website for open source projects, but I suspect the best option is for me to stop using it. I will set up git on a server I control and find some way that people can still clone and pull the code from it. I will also need to go through my posts and remove the links to my github account.

In the meantime, it's still possible to clone my five repositories in their current state even without my account being visible. I have up-to-date copies already, but if anyone else wants them:

git clone https://github.com/billlwhite/cryptohash.git
git clone https://github.com/billlwhite/ledgertheory.git
git clone https://github.com/billlwhite/qeditastheory.git
git clone https://github.com/billlwhite/egal.git
git clone https://github.com/billlwhite/qeditas.git

It's unclear how much longer they'll be there. In any case, I won't be updating them on github anymore.

sfultong
Newbie, Activity: 28, Merit: 0
June 26, 2015, 01:01:24 AM
#63

Using bitbucket or another public git repo provider is probably more expedient. It's rather bizarre that you've experienced visibility issues with github.
Bill White (OP)
June 26, 2015, 02:38:54 PM
#64

Using bitbucket or another public git repo provider is probably more expedient. It's rather bizarre that you've experienced visibility issues with github.

Thank you for the suggestion. I looked at bitbucket briefly the first time github made my account invisible, but at this point I'd just prefer to minimize my dependence on third parties.

The five git repos I had on github are now available to be cloned from qeditas.org:

Code:
git clone git://qeditas.org/qeditas.git
git clone git://qeditas.org/qeditastheory.git
git clone git://qeditas.org/cryptohash.git
git clone git://qeditas.org/ledgertheory.git
git clone git://qeditas.org/egal.git

They can also be viewed using gitweb here:

qeditas.org/gitweb

If someone wants to contribute, clone the relevant repo and then send me the information so that I can add your repo as a remote. Then let me know (here or PM or billwhite at bitmessage.ch) when you have pull requests.

Bill White (OP)
July 07, 2015, 09:06:57 PM
#65

I added some (untested) consensus code. It's proof-of-stake with a bit of proof-of-storage. The code is in block.ml:

http://qeditas.org/gitweb/?p=qeditas.git;a=blob;f=src/block.ml;h=852f51b4c8cb1147970fc48ef40df54a62a20c51;hb=HEAD

Every block is signed by a staker who has a hit. The function check_hit determines if there's a hit. A hit is determined by hashing the assetid of the staker with the number of seconds since the last block ("deltatime") and a stake modifier. If there is no proof-of-storage element, there is a hit if the hash value is less than the target times the number of coins in the staked asset.
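A toy Python model of the hit check just described (the real check_hit is OCaml, in src/block.ml; the hash construction and serialization here are simplified stand-ins, not the actual format):

```python
import hashlib

# Toy model: hash the staker's asset id with deltatime and the stake
# modifier; a hit occurs when the value is below target * staked coins.

def hit_value(asset_id: bytes, deltatime: int, stake_modifier: bytes) -> int:
    h = hashlib.sha256(asset_id + deltatime.to_bytes(8, "big") + stake_modifier)
    return int.from_bytes(h.digest(), "big")

def check_hit(asset_id: bytes, deltatime: int, stake_modifier: bytes,
              target: int, coins: int) -> bool:
    # More coins or a higher target can only make a hit easier, never harder.
    return hit_value(asset_id, deltatime, stake_modifier) < target * coins

v = hit_value(b"asset", 60, b"\x01" * 32)
assert check_hit(b"asset", 60, b"\x01" * 32, v + 1, 1)   # just above the value: hit
assert not check_hit(b"asset", 60, b"\x01" * 32, v, 1)   # strict inequality: no hit
```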

I was originally thinking of calculating the stake modifier like Blackcoin does (or at least the description of Blackcoin in the Neucoin white paper). In the end I did something a little simpler. It can be gamed a bit, but I think it's not enough to be dangerous. I prefer a transparent method of choosing the stake modifier to an obscure one.

For each block there is a 256-bit current stake modifier and a 256-bit future stake modifier. The current stake modifier is the one used to determine the current hit. The next stake modifier is obtained by dropping one bit, shifting all the bits, and adding a new bit popped from the future stake modifier. The person who stakes the block can freely choose one new bit to push onto the new future stake modifier. Note that this means the stake modifier for the next 256 blocks is always known.
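The rotation of the two modifiers can be sketched as below. The representation (integers) and which end the bits enter and leave are assumptions for illustration; only the shape of the scheme follows the description above:

```python
MASK = (1 << 256) - 1  # modifiers are 256-bit values

# Toy model of the stake-modifier rotation: the current modifier drops a
# bit and absorbs one popped from the future modifier; the staker pushes
# one freely chosen bit onto the far end of the new future modifier.

def next_modifiers(current: int, future: int, staker_bit: int):
    popped = future & 1                                       # bit popped from future
    new_current = ((current << 1) | popped) & MASK            # drop one bit, shift, add
    new_future = ((future >> 1) | (staker_bit << 255)) & MASK # staker's chosen bit
    return new_current, new_future

cur, fut = next_modifiers(0, 0b1, 1)
assert cur == 1          # the popped future bit entered the current modifier
assert fut == 1 << 255   # the staker's bit takes 256 more blocks to surface,
                         # so the next 256 current modifiers are already fixed
```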

The proof-of-storage component works as follows: If the staker includes a valid proof-of-storage, then the coins in the staked asset count for 25% more when calculating if there is a hit. A valid proof-of-storage is either an approximate term (tm) or an approximate document (pdoc). It should be such that everything is abbreviated by a hash except one leaf. When this leaf is hashed along with the hashroot of the tm/pdoc and the address of the staker, it should give a special kind of hash (1 chance in 65536). This can be used so that a staker can know in advance which proofs-of-storage are possible. In addition, the proof-of-storage should hash with the deltatime and stake modifier to give a value less than the current target times the modified stake with the extra 25%.
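The effect of a valid proof-of-storage on the hit check can be rendered as a toy calculation. The "first 16 bits of the hash are zero" condition below is my way of modelling "1 chance in 65536"; the real hashing and serialization differ:

```python
import hashlib

# Toy rendering: a valid proof-of-storage makes the staked coins count
# 25% more when checking for a hit.

def storage_proof_valid(leaf: bytes, doc_hashroot: bytes, staker_addr: bytes) -> bool:
    h = hashlib.sha256(leaf + doc_hashroot + staker_addr).digest()
    return h[0] == 0 and h[1] == 0   # models the 1-in-65536 condition

def effective_stake(coins: int, has_valid_proof: bool) -> int:
    return coins * 5 // 4 if has_valid_proof else coins

assert effective_stake(1000, True) == 1250   # 25% bonus
assert effective_stake(1000, False) == 1000
```

Because the condition depends only on the leaf, the document's hashroot, and the staker's address (not on deltatime), a staker can indeed know in advance which proofs-of-storage are possible.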

The main thing that remains is to write the networking code and to start testing.

When the real chain starts, it will need to be seeded with 512 bits for the current and future stake modifier. My plan is to announce a Bitcoin block height in advance. When a block of that height is mined (and not orphaned), then everyone can apply sha256 to the block hash to obtain 256 bits for the current stake modifier and then apply sha256 again for the future stake modifier. This is a way to ensure the initial 256 stake modifiers were not chosen maliciously.
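The seeding step above is simple enough to state exactly: once the announced Bitcoin block is mined, everyone derives the same two 256-bit modifiers from its block hash.

```python
import hashlib

# Derive the 512 seed bits from an agreed Bitcoin block hash:
# sha256 once for the current stake modifier, sha256 again for the future one.

def seed_modifiers(btc_block_hash: bytes):
    current = hashlib.sha256(btc_block_hash).digest()
    future = hashlib.sha256(current).digest()
    return current, future

cur, fut = seed_modifiers(bytes.fromhex("00" * 32))  # placeholder block hash
assert len(cur) == 32 and len(fut) == 32             # 256 bits each
assert fut == hashlib.sha256(cur).digest()
```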

I expect to do at least one test run before the real staking begins. I will announce any such test run here.

Bill White (OP)
July 12, 2015, 02:33:04 PM
#66

Given the way stake modifiers are computed, it makes sense to insist that an asset is at least 512 blocks old (or from the initial distribution) before it can stake. Consequently, the first 512 blocks must be staked from the initial distribution. I was concerned that there wouldn't be enough actively staking assets to make it through this initial phase, but I've thought of a way to ensure it.

As I wrote in the white paper, Qeditas units will have 4 more decimal points of precision than Bitcoin. One satoshi in a Bitcoin address at the snapshot will correspond to 10,000 Cantors (cants) in the initial distribution for Qeditas. I had intended to distribute this as one asset per address, but instead I will split it over 100 assets. For example, if you had 1 satoshi in an address, then the corresponding Qeditas address will start with 100 assets worth 100 cants each. More realistically, if you had 0.1 bitcoins in an address, then the corresponding Qeditas address will start with 100 assets worth 0.001 fraenks (1 zerm) each.
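The arithmetic above can be checked directly (constants follow the post: 4 more decimal places than Bitcoin, so 1 satoshi corresponds to 10,000 cants; the helper name is illustrative):

```python
CANTS_PER_SATOSHI = 10_000
CANTS_PER_FRAENK = 10 ** 12   # 10^8 satoshis per BTC times 10^4 cants per satoshi

def initial_assets(satoshis: int):
    """Split an address's snapshot balance into 100 equal assets, in cants."""
    per_asset = satoshis * CANTS_PER_SATOSHI // 100
    return [per_asset] * 100

# 1 satoshi -> 100 assets worth 100 cants each
assert initial_assets(1) == [100] * 100
# 0.1 BTC = 10,000,000 satoshis -> 100 assets worth 0.001 fraenks (1 zerm) each
assert initial_assets(10_000_000)[0] / CANTS_PER_FRAENK == 0.001
```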

This way even if someone only has one funded address, they will still be able to stake up to 100 times in the first 512 blocks. After the first 512 blocks, staking rewards will start to become eligible to stake.

Bill White (OP)
July 19, 2015, 08:46:34 PM
Last edit: July 22, 2015, 06:27:37 PM by Bill White
#67

It's time for another update. I am currently testing and modifying the consensus code. The first tests made it clear that once the first block rewards mature (after 512 blocks) they stake far more often than the initial distribution. To avoid this, I'm including coinage. Coins mature after 512 blocks and then age quadratically for 16384 blocks (approximately 16 weeks). The coins in the initial distribution start out as mature. (Otherwise, there would be no coins to stake for the first 512 blocks.) Since the initial distribution starts aging immediately and the first block rewards only start aging after 512 blocks, addresses with a reasonable balance in the initial distribution should continue having a chance to stake even after block rewards start maturing.

On another note, I added a "proof of forfeiture" component which is similar to the Slasher idea described in early 2014 by Vitalik Buterin:

https://blog.ethereum.org/2014/01/15/slasher-a-punitive-proof-of-stake-algorithm/

I hadn't intended to do this because I didn't see short term forks as a serious problem. However, I now think there is a potential problem. Suppose B0 is a block. Suppose Alice and Bob sign different successor blocks B1 and B1'. Other participants need to know whether to sign a successor to B1 or B1'. Using the current stake modifier, the same participant will have the power to sign a successor to B1, to B1', to neither, or to both. The economically rational decision is to sign two successors: B2 to B1 and B2' to B1'. The reason is that if the staker only signs one, there is a chance that his choice will be orphaned and he will lose the staking reward. This same argument is a reason for every participant to sign two (or more successors), leaving a future (economically irrational) staker to decide which fork will win.

In summary, I don't see a serious problem if some participants sign blocks on different branches, but I do see a serious problem if every participant signs blocks on different branches.

To counteract this, the signer of a block can include a "proof of forfeiture" in the block. Such a proof consists (more or less) of two block header chains (each of length <= 6) which prove there is a fork from a common point to two block headers signed using the same staker's address. If such a proof is included, the coinstake transaction is allowed to include as inputs any currency assets owned by the double signer which were created in the last 6 blocks. In essence, the double signer forfeits his stake rewards to the new signer who exposed the double signing.
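The shape of the evidence check can be sketched as below. The header representation (dicts with prev/id/staker fields) is a stand-in for real block headers, and the validation is deliberately loose; only the structure follows the description above:

```python
MAX_CHAIN = 6  # each evidence chain is at most 6 headers long

# Toy check of "proof of forfeiture" evidence: two short header chains
# forking from a common point, each containing a header signed by the
# same staker address.

def proves_forfeiture(common, chain_a, chain_b, staker):
    def chain_ok(chain):
        if not 1 <= len(chain) <= MAX_CHAIN:
            return False
        prev = common
        for hdr in chain:
            if hdr["prev"] != prev:   # headers must link back to the fork point
                return False
            prev = hdr["id"]
        return True
    def signed(chain):
        return any(h["staker"] == staker for h in chain)
    return (chain_ok(chain_a) and chain_ok(chain_b)
            and chain_a[0]["id"] != chain_b[0]["id"]   # genuinely two branches
            and signed(chain_a) and signed(chain_b))

a = [{"prev": "g", "id": "a1", "staker": "alice"}]
b = [{"prev": "g", "id": "b1", "staker": "alice"}]
assert proves_forfeiture("g", a, b, "alice")       # Alice signed on both branches
assert not proves_forfeiture("g", a, a, "alice")   # the same branch twice proves nothing
```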

What remains is to do more testing and to write the networking code.

Bill White (OP)
July 22, 2015, 06:51:17 PM
#68

I wasn't happy with the testing results, so I modified how coins age. Coins now age quadratically: the coinage is essentially (floor(B/512))^2 where B is the number of blocks since the asset's creation. The maximum value for B is 16384 (about 16 weeks) at which point the coinage remains at the maximum coinage of 32^2 = 1024. There are two exceptions to this:

1. The initial distribution starts off with the maximum coinage.

2. Addresses can have assets that are "locked" until a certain block. Locked assets age twice as quickly, but then lose the ability to stake at all 32 blocks before they become spendable. (Such locks can be used to loan stake to a third party staker without losing the ability to spend the asset later.)
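The rule and its two exceptions can be sketched as one function. The handling of locked assets (doubled age, staking cut off 32 blocks before the lock expires) is my reading of the description above, and the function name is illustrative:

```python
MATURATION = 512
MAX_AGE_BLOCKS = 16384
MAX_COINAGE = (MAX_AGE_BLOCKS // MATURATION) ** 2   # 32^2 = 1024

def coinage(blocks_since_creation, initial_dist=False,
            locked_until=None, height=None):
    if initial_dist:
        return MAX_COINAGE             # exception 1: starts fully mature
    b = blocks_since_creation
    if locked_until is not None:
        if height is not None and height >= locked_until - 32:
            return 0                   # loses staking ability 32 blocks early
        b *= 2                         # exception 2: locked assets age twice as fast
    b = min(b, MAX_AGE_BLOCKS)         # age caps out after ~16 weeks
    return (b // MATURATION) ** 2      # quadratic: (floor(B/512))^2

assert coinage(511) == 0               # still immature
assert coinage(512) == 1
assert coinage(16384) == 1024          # maximum coinage
assert coinage(100000) == 1024         # capped
assert coinage(0, initial_dist=True) == 1024
```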

The exact details of coinage are still open to change based on testing. The goal is to ensure that if no "bitcoin whales" are staking, the first weeks of stake rewards do not overwhelm the staking ability of the initial distribution. In one of the latest tests, someone who had 0.025 bitcoins in the snapshot would've been able to stake by block 3844 (roughly 26 days in). In reality, this will depend on how much of the initial distribution is being staked during those first 26 days.

Bill White (OP)
August 05, 2015, 08:48:58 PM
#69

The testing branch now contains some unit tests. Some of the unit tests failed, which required debugging, so much of the code has changed. Unit tests still need to be written for assets, transactions, ctrees, ctree grafts and blocks.

When I originally talked about signed endorsements I said they would have the form "endorse <Qeditas address>". However, in the first version of the code the message was expected to be "Qeditas endorsement <Qeditas address>". The "endorse <Qeditas address>" format is simpler, so I changed the code to match it. I also edited a post in this thread from June which claimed the format would be the longer one.

Earlier there was some discussion of having a market for signed endorsements before the network starts running. This probably isn't a good idea. There would be nothing preventing someone from selling their endorsements multiple times.

Bill White (OP)
August 17, 2015, 07:34:42 PM
#70

Quick update: I've finished writing unit tests for the assets module. These are committed in the testing branch. Unit tests for transactions, compact trees and blocks still need to be written. Also, the networking code still needs to be written.

Bill White (OP)
September 02, 2015, 12:58:20 PM
#71

There are about 26 formal documents that were released with Egal, and obviously Qeditas should be able to process all of these (e.g., check the proofs). To ensure this, I added serializations of those Egal documents in files on the testing branch of Qeditas and added unit tests to check them. Fortunately, they all do check properly (though the code is slow). In the process it became clear that having one exception "CheckingFailure" was inadequate, and so now there are a number of new exceptions which make it clearer why checking fails.

kushti
Full Member, Activity: 317, Merit: 103
September 02, 2015, 02:17:40 PM
#72

Nice to see progress with the project!

From the PoS algo description, it seems grinding by deltatime is possible (like proposed for Neucoin: https://nxtforum.org/general-discussion/neucoin's-40-page-white-paper-rebuts-all-nothing-at-stake-objections/msg176475/#msg176475 ).

Ergo Platform core dev. Previously IOHK Research / Nxt core dev / SmartContract.com cofounder.
Bill White (OP)
September 04, 2015, 01:22:31 PM
Last edit: September 14, 2015, 03:30:25 PM by Bill White
#73

Nice to see progress with the project!

From the PoS algo description, it seems grinding by deltatime is possible (like proposed for Neucoin: https://nxtforum.org/general-discussion/neucoin's-40-page-white-paper-rebuts-all-nothing-at-stake-objections/msg176475/#msg176475 ).

Thank you. Regarding grinding over the time stamp, the situation should become clearer once the first trial run starts. I can say two things at the moment. First, the term "grinding" suggests searching over a large space (thus becoming proof of work). A search space of 7200 (each second for the next 2 hours) wouldn't qualify as proof of work, in my opinion. Nevertheless, the real question is whether or not a staker could gain some advantage by using a time stamp in the future. It's not clear to me what the advantage would be, for the same reasons koubiac gave in the discussion at nxtforum.

(Edit, September 14, 2015: Reading the next paragraph again, it's misleading. Rather than rewrite it, I will write a separate post below.)

In the specific case of Qeditas, here's what I would expect to happen. Suppose Alice sees she can stake 3600s from now. She can create and sign a block with the corresponding future time stamp and publish it. Bob will see the time stamp is in the future. If Bob will be able to stake in less than 3600s, there is no reason for him to build on Alice's block. He can instead create his own block (closer to the current time) and publish it. Suppose Bob will be able to stake in 3700s from now. He could create a successor to Alice's block now and hope participants will continue to build on this chain. However, it seems more likely that someone other than Alice will be able to stake in less than 3600s from now. If that's the case, then someone, Carol, will publish a block with a more appropriate time stamp. Presumably people would build on Carol's block. If that's the case, Alice and Bob will have endangered their opportunities to stake 3600s and 3700s from now. The reason is that Qeditas allows a staker to forfeit the rewards of a recent staker who signed blocks on two branches. If Alice or Bob signed a block in the near future, a staker that follows could use their original published blocks to take their rewards.

In practice, we will have to see what happens.  Can you describe a scenario in which a staker can gain an advantage (stake more blocks?) by claiming a time stamp in the future?

kushti
September 09, 2015, 09:35:35 PM
#74

Nice to see progress with the project!

From the PoS algo description, it seems grinding by deltatime is possible (like proposed for Neucoin: https://nxtforum.org/general-discussion/neucoin's-40-page-white-paper-rebuts-all-nothing-at-stake-objections/msg176475/#msg176475 ).

Thank you. Regarding grinding over the time stamp, the situation should become clearer once the first trial run starts. I can say two things at the moment. First, the term "grinding" suggests searching over a large space (thus becoming proof of work). A search space of 7200 (each second for the next 2 hours) wouldn't qualify as proof of work, in my opinion. Nevertheless, the real question is whether or not a staker could gain some advantage by using a time stamp in the future. It's not clear to me what the advantage would be, for the same reasons koubiac gave in the discussion at nxtforum.

In the specific case of Qeditas, here's what I would expect to happen. Suppose Alice sees she can stake 3600s from now. She can create and sign a block with the corresponding future time stamp and publish it. Bob will see the time stamp is in the future. If Bob will be able to stake in less than 3600s, there is no reason for him to build on Alice's block. He can instead create his own block (closer to the current time) and publish it. Suppose Bob will be able to stake in 3700s from now. He could create a successor to Alice's block now and hope participants will continue to build on this chain. However, it seems more likely that someone other than Alice will be able to stake in less than 3600s from now. If that's the case, then someone, Carol, will publish a block with a more appropriate time stamp. Presumably people would build on Carol's block. If that's the case, Alice and Bob will have endangered their opportunities to stake 3600s and 3700s from now. The reason is that Qeditas allows a staker to forfeit the rewards of a recent staker who signed blocks on two branches. If Alice or Bob signed a block in the near future, a staker that follows could use their original published blocks to take their rewards.

In practice, we will have to see what happens.  Can you describe a scenario in which a staker can gain an advantage (stake more blocks?) by claiming a time stamp in the future?

1. I call any iteration over a parameter space (allowed by the protocol), instead of using the parameter as intended, a "grinding attack". The terminology isn't stable here yet. It's interesting where the line between PoW and small-space "grinding attacks" lies, if we're moving toward more precise terminology.

2. If every forger follows Alice, then all the possible blocks for the coming 2 hours will be published immediately. Also please note we're talking not about the standard software implementation (where timestamping is honest) but about modified software, so forgers' behaviour isn't predictable (a forger could extend any fork allowed by the protocol, or multiple forks in fact).

3. Alice has a chance to extend a chain immediately, and under a longest-chain rule that means more blocks produced. Maybe other blockchain quality functions are safe against this; I'm not sure at the moment.

Bill White (OP)
September 14, 2015, 04:08:43 PM
#75

In the specific case of Qeditas, here's what I would expect to happen. Suppose Alice sees she can stake 3600s from now. She can create and sign a block with the corresponding future time stamp and publish it. Bob will see the time stamp is in the future. If Bob will be able to stake in less than 3600s, there is no reason for him to build on Alice's block. He can instead create his own block (closer to the current time) and publish it. Suppose Bob will be able to stake in 3700s from now. He could create a successor to Alice's block now and hope participants will continue to build on this chain. However, it seems more likely that someone other than Alice will be able to stake in less than 3600s from now. If that's the case, then someone, Carol, will publish a block with a more appropriate time stamp. Presumably people would build on Carol's block. If that's the case, Alice and Bob will have endangered their opportunities to stake 3600s and 3700s from now. The reason is that Qeditas allows a staker to forfeit the rewards of a recent staker who signed blocks on two branches. If Alice or Bob signed a block in the near future, a staker that follows could use their original published blocks to take their rewards.

In practice, we will have to see what happens.  Can you describe a scenario in which a staker can gain an advantage (stake more blocks?) by claiming a time stamp in the future?

1. I call any iteration over a parameter space (allowed by the protocol), instead of using the parameter as intended, a "grinding attack". The terminology isn't stable here yet. It's interesting where the line between PoW and small-space "grinding attacks" lies, if we're moving toward more precise terminology.

2. If every forger follows Alice, then all the possible blocks for the coming 2 hours will be published immediately. Also please note we're talking not about the standard software implementation (where timestamping is honest) but about modified software, so forgers' behaviour isn't predictable (a forger could extend any fork allowed by the protocol, or multiple forks in fact).

3. Alice has a chance to extend a chain immediately, and under a longest-chain rule that means more blocks produced. Maybe other blockchain quality functions are safe against this; I'm not sure at the moment.

After rereading the paragraph I wrote describing what I would expect to happen with Qeditas, I realized that I had forgotten some details about how proof of stake in Qeditas has been implemented. I implied that if Alice sees she can stake 3600s from now, she can either sign a block now with a future time stamp or wait for an hour and sign a block then. This was wrong. Alice's ability to stake in 3600s depends on the current stake modifier, which changes with each block. She (along with everyone else) will know in advance what the current stake modifier will be for the next 256 blocks (roughly the next 2 days), so she can precompute whether she will get a chance to stake in the next few hours, regardless of whether she signs a block with a false future time stamp now. If she will get another chance to sign a block, then signing one that is likely to be orphaned now will risk her chance to stake in the near future (due to forfeiture of recent rewards for proven double signers). On the other hand, if her chance to stake "now" with a false future time stamp will be her only chance to stake in the near future (the next hour or so), then it is rational for her to sign a block with the false future time stamp and publish it. It would still be up to others whether or not to build on her block. Obviously if other participants also had the chance to stake with the current stake modifier within the next hour, then they would be inclined to publish their own blocks with a more accurate time stamp. Adding the block with the earliest time stamp should lead to the chain with the most cumulative stake.

Regarding grinding, I have been more concerned about possible grinding attacks on the future stake modifier. I've designed the code so that each time someone stakes a block they can choose 1 bit that will be part of the stake modifiers 257-512 blocks in the future. In principle, a staker could explore some subset of the possible future stake modifiers in order to determine whether it's advantageous to choose a 0 or a 1. Optimally, everyone would choose the bit randomly, but it's unrealistic to expect this. I expect people will learn that sometimes they can precompute whether a 0 or a 1 is likely to help them stake at some point roughly 2 days in the future and choose that bit selfishly. I also expect that most of the time neither bit will be obviously preferable, so the choice will effectively be random. As long as the new bit is chosen randomly often enough, it should prevent a large staker from taking over the network (i.e., staking every block). As always, however, this is an experiment for which we don't yet know the results.

sfultong
September 30, 2015, 02:08:29 PM
#76

Hi Bill,

I notice you haven't pushed any changes to the git repos recently. How are things coming?

Briefly looking through your code, it doesn't seem like you have worked on creating the ledger snapshot from bitcoin. That's actually what I'm going to be working on right now, with my bitcoin spinoff toolkit.

I was tempted to write the spinoff toolkit in ocaml (I much prefer it to C++) and maybe submit a pull request to you, but ultimately I wanted something that was easy to integrate with existing cryptocurrency codebases and something that more of the general public could contribute to. Of course, in reality, I'm unlikely to get contributions either way.

Well, I hope things are going well for you. Qeditas seems like a very exciting and daunting project.
Bill White (OP)
October 01, 2015, 03:16:55 PM
#77

Hi Bill,

I notice you haven't pushed any changes to the git repos recently. How are things coming?

Briefly looking through your code, it doesn't seem like you have worked on creating the ledger snapshot from bitcoin. That's actually what I'm going to be working on right now, with my bitcoin spinoff toolkit.

I was tempted to write the spinoff toolkit in ocaml (I much prefer it to C++) and maybe submit a pull request to you, but ultimately I wanted something that was easy to integrate with existing cryptocurrency codebases and something that more of the general public could contribute to. Of course, in reality, I'm unlikely to get contributions either way.

Well, I hope things are going well for you. Qeditas seems like a very exciting and daunting project.

The Qeditas code is in progress, but still unfinished. I decided to push my most recent commits to the remote repository this morning. Coincidentally, the thing I've been doing the past few days is creating the initial ledger (as a compact/Merkle tree) from the snapshot I made of the Bitcoin block chain last March.

You're correct that code to create a snapshot is not in any of the git repositories. I did write some ocaml code to do this (in cooperation with bitcoind) but most of the code is a mess. For example, there are references to specific directories and filenames where I saved intermediate results. It took roughly 6 weeks (from mid-February to the end of March of this year) to successfully construct the snapshot I wanted (through the first 350,000 blocks). The end result of the snapshot is here:

https://mega.co.nz/#F!0xhzUbTS!h-LeyiTbF9dXgusKCc_bfQ

It's split into p2pkh balances and p2sh balances. In each case the address is given as the important 20 bytes in hex and then the balance of satoshis at the snapshot. There are also two small python scripts people can use to check their balances by giving the corresponding Bitcoin address. Once I had the snapshot I wanted to move on to coding Qeditas itself, so I never spent more time with the code to generate the snapshot.

The biggest problem I faced was resource bounds. What I tried to do would either require more RAM than I had, too much disk space, or too much time. I spent most of my time modifying my strategy for computing the snapshot to work around these bounds. A more reasonable strategy probably would have been to buy more advanced hardware. Here are a few things I can say that might be helpful:

1. The blocks are all stored in .bitcoin/blocks/*.dat files, and one can read the relevant data from there without needing bitcoind (now bitcoin-cli, I suppose). However, this is risky: the files include orphan blocks, and I'm unsure whether the blocks are guaranteed to be in the correct order. After realizing this, I used bitcoind's getblockhash to explicitly obtain the hash of each of the 350,000 blocks in the main chain. With those hashes in hand, it's possible either to read blocks from the *.dat files, skipping orphans, or to call bitcoind getblock on each hash. I tried both, and am no longer sure which variant was used to make the final snapshot.
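For anyone going the raw-file route, the record layout in those files is simple: each block is stored as the 4-byte network magic, a 4-byte little-endian length, then the serialized block. A minimal sketch of splitting a file's contents into raw block records (assuming the standard mainnet magic; as noted above, this yields orphans too):

```python
import struct

MAGIC = bytes.fromhex("f9beb4d9")  # mainnet network magic as stored on disk

def read_raw_blocks(data):
    """Yield raw serialized blocks from the contents of a blocks/*.dat file.
    Caveat: orphan blocks are included and main-chain order isn't guaranteed,
    so results still need filtering against the list of main-chain hashes."""
    i = 0
    blocks = []
    while i + 8 <= len(data):
        if data[i:i + 4] != MAGIC:
            i += 1  # skip zero padding between records
            continue
        (length,) = struct.unpack_from("<I", data, i + 4)
        block = data[i + 8:i + 8 + length]
        if len(block) < length:
            break  # truncated final record
        blocks.append(block)
        i += 8 + length
    return blocks
```

Each raw block would then be checked against the getblockhash list before its txs are processed.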

2. I went through all the txs in the main chain and computed that only the first 7 bytes of the 32-byte hash are needed to uniquely identify a tx. There are two exceptions: txs that have 'twin' txs sharing the same txid (since the txs are byte-for-byte identical):

d5d27987d2a3dfc724e359870c6644b40e497bdc0589a033220fe15429d88599
e3bf3d07d4b0375638d5f1db5255fe07ba2c4cb067cd81b84ee974b6585fb468

These were due to early Bitcoin behavior that effectively ignored coinbase inputs, so two identical coinbase txs could produce the same txid. I chose to ignore the duplicates and count each tx only once. My understanding is that in Bitcoin the relevant txos can only be spent once despite appearing in the block chain twice.

The important point is that 7 bytes were sufficient to identify txs instead of 32. Since there have been many txs since the end of March, it's possible 7 bytes is no longer enough.
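Anyone redoing this over a newer chain tip would want to re-verify the claim rather than trust it. A minimal sketch of the check, which also treats identical 'twin' txids as a single tx rather than a collision:

```python
def txid_prefixes_unique(txids, nbytes=7):
    """Check whether the first nbytes of each txid identify it uniquely.
    Exact duplicate txids (the 'twin' coinbase cases) are counted once
    and do not register as prefix collisions."""
    seen = {}
    for txid in txids:
        prefix = bytes.fromhex(txid)[:nbytes]
        if prefix in seen and seen[prefix] != txid:
            return False  # two distinct txids share a prefix
        seen[prefix] = txid
    return True
```

Running this over all main-chain txids would confirm (or refute) whether 7 bytes still suffice today.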

3. My first attempt to create a snapshot was to go forward through the blocks, storing each utxo until it was spent. In principle, at the end I would have had all the utxos and hence the snapshot. This ran out of memory. As a workaround, I tried reading a certain number of blocks, saving the state (utxos so far), and then restarting. This still wasn't good enough. Next I processed all the txs in reverse order, keeping in memory all the spent txos whose creation I had not yet seen. Each time I encountered a new txo, either it was already known to be spent in the future or I knew it belonged in the snapshot. This also ran out of memory. At this point, I would have to carefully review my notes to say exactly what I did, but it's at least clear that I saved some very large files named "spentxxx" and "unspentxxx", where xxx ranged from 000 to roughly 250 -- mirroring the names of the block*.dat files. I do roughly remember creating some of these files by processing the blocks forward and others by processing them backward, then extracting the snapshot from a combination of the information.
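The reverse-order idea described above can be sketched in a few lines, ignoring the memory problem by assuming the tx list fits in RAM. The tx representation here (dicts with "txid", "inputs" as (txid, vout) pairs, and "outputs" as values) is a simplification of my own, not the actual data structures used:

```python
def snapshot_backward(txs):
    """Walk txs from newest to oldest. Outpoints consumed by already-seen
    (i.e. later) txs are remembered in `spent`; any output not in that set
    when we reach its creating tx must be unspent at the snapshot height."""
    spent = set()   # outpoints known to be spent by some later tx
    utxos = {}      # outpoint -> value: the snapshot under construction
    for tx in reversed(txs):
        for vout, value in enumerate(tx["outputs"]):
            outpoint = (tx["txid"], vout)
            if outpoint in spent:
                spent.remove(outpoint)  # resolved; safe to forget
            else:
                utxos[outpoint] = value
        for outpoint in tx["inputs"]:
            spent.add(outpoint)  # spends an output of some earlier tx
    return utxos
```

The `spent` set is exactly the part that grew without bound in practice, which is why the real computation had to be split across the "spentxxx"/"unspentxxx" files.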

I hope you can find a more robust solution. Creating a snapshot was a surprisingly difficult hurdle for creating a spinoff.

sfultong
Newbie
*
Offline Offline

Activity: 28
Merit: 0


View Profile
October 01, 2015, 04:15:25 PM
 #78

Thanks Bill, that's very helpful.

Did you filter out dust?
Bill White (OP)
Member
**
Offline Offline

Activity: 118
Merit: 11

Qeditas: A Formal Library as a Bitcoin Spin-Off


View Profile WWW
October 01, 2015, 06:45:50 PM
 #79

I didn't filter out dust, though it would be a reasonable thing to do. I just checked, and roughly 9.5% of the addresses in the snapshot hold exactly 1 satoshi.

I'm changing the units in Qeditas to have 4 extra decimal places, so those addresses will start with 10,000 cants (the atomic currency unit of Qeditas). It's obviously still dust though.
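The unit change is just a factor of 10^4, and a dust filter on top of it is a one-liner. A sketch, where the 546-satoshi threshold is only an illustrative value borrowed from Bitcoin's common dust limit, not anything Qeditas has decided on:

```python
CANTS_PER_SATOSHI = 10_000  # Qeditas adds 4 decimal places

def satoshis_to_cants(sats):
    """Convert a snapshot balance in satoshis to cants."""
    return sats * CANTS_PER_SATOSHI

def filter_dust(balances, min_satoshis=546):
    """Drop addresses below a dust threshold, converting the rest to cants.
    The default threshold is hypothetical, chosen only for illustration."""
    return {addr: satoshis_to_cants(v)
            for addr, v in balances.items()
            if v >= min_satoshis}
```

Under this scheme a 1-satoshi address becomes 10,000 cants if kept, or disappears entirely if a dust filter like the above were applied to the snapshot.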

rendravolt
Hero Member
*****
Offline Offline

Activity: 2156
Merit: 521



View Profile
October 01, 2015, 06:47:35 PM
 #80

Thanks dev, this is what I need.  Smiley