I do agree too, and it seems like this was a misunderstanding based on Ohad's post.
I think we still have some "fine point" details to work out, but at least we now agree on the actual goal.
Many processes don't have the properties necessary to be authenticated. For example, you can verify the work of a miner with a simple hash function, but you can't verify the work of a neural network as simply.
Really, you can! An ANN is just a composition of sigmoids arranged in a graph, and you can certainly authenticate over sigmoid functions, graphs, and their composition. You can't assert something like "it hit a correct error rate," because there's no way to define what correctly meeting an arbitrary objective would mean, but you can certainly assert "evaluation and backprop/annealing were applied correctly" and infer from that that the error rate hit was the same as what you would've gotten running locally, which is all we desire.
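To make that concrete, here is a minimal sketch of the idea, assuming a toy one-weight-per-layer "network" and deterministic floating point (names like `forward_with_receipt` are hypothetical illustrations, not anything from the Zennet spec): hash-chain each intermediate value of the sigmoid composition so that a verifier can replay the steps and compare receipts.

```python
import hashlib
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward_with_receipt(weights, x):
    """Evaluate a chain of sigmoid applications, hash-chaining every
    intermediate value so the evaluation itself is tamper-evident."""
    receipt = hashlib.sha256(repr(x).encode()).hexdigest()
    for w in weights:
        x = sigmoid(w * x)  # one step of the composition
        receipt = hashlib.sha256((receipt + repr(x)).encode()).hexdigest()
    return x, receipt

# The publisher replays (or spot-checks) the same steps and compares
# receipts; an incorrect evaluation shows up as a receipt mismatch.
out, proof = forward_with_receipt([0.5, -1.2, 2.0], 0.3)
```

Note that nothing here asserts the network is *good*, only that evaluation was applied as prescribed, which is exactly the distinction above.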
If the publisher has 1000 hosts on his VM and wants to verify their work one by one, it would take a lot of computational power on his side.
This is pretty central to the debate. While this has historically been true, what we are finding in contemporary work in this field is that most of these overheads actually stem from combining authentication for correctness with privacy preservation. One of the key insights that the work of Socrates1024 brings to the table is that if you remove the privacy-preservation criterion (and, as a side effect of the change, remove a particular type of termination criterion - though one that I think doesn't apply to the Zennet case anyway), the overhead of authentication is greatly reduced.
What we are mostly discussing now in our "side-band" IRC debate is whether introducing a scale on the "granularity" of authenticated big-step reductions creates, or can create, enough of an inflection point to keep the overhead within acceptable bounds without sacrificing the utility of the authentication. We think that, at least for Zennet's "resource measure authentication only" goals, it can.
You can think of our approach (greatly simplified) as something like this: authenticating (hashing over and signing a receipt for) each individual opcode (or, worse, each micro-coded state transition) creates a huuuuuuge overhead. Authenticating over every stack scope (function call) state requires hashing only over call operations, creating far less overhead. Authenticating over "an entire system run," starting from some confirmed-correct hardware loaded with some confirmed-correct software, requires only one hash over a certification that you did turn the power on, in the correct state, and ran the thing from there. It becomes pretty self-evident that we have a clean gradient of overheads here.
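A quick toy illustration of that gradient (made-up trace sizes; `hash_chain` is just a stand-in for whatever receipt scheme is actually used): the authentication work is proportional to how many state snapshots the chosen granularity forces you to hash.

```python
import hashlib

def hash_chain(snapshots):
    """Hash-chain a sequence of state snapshots; the cost scales with
    how many snapshots the chosen granularity dictates."""
    h = hashlib.sha256(b"genesis").digest()
    for s in snapshots:
        h = hashlib.sha256(h + s).digest()
    return h.hex(), len(snapshots)

# Hypothetical trace: a million "opcodes" grouped into a thousand "calls."
opcodes = [i.to_bytes(4, "little") for i in range(1_000_000)]
calls = [b"".join(opcodes[i:i + 1000]) for i in range(0, len(opcodes), 1000)]
whole_run = [b"".join(opcodes)]

for label, snaps in (("per-opcode", opcodes),
                     ("per-call", calls),
                     ("whole-run", whole_run)):
    _, n = hash_chain(snaps)
    print(label, "hashes:", n)  # 1000000, then 1000, then 1
```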
Our goal, in a nutshell, is to make the granularity of the "authenticated step" naturally large enough to be efficient but small enough to still be meaningful. We're attempting to do this with, basically, a model of authenticating not over the entire computation but only over each step involving a resource access. The key realization is that we don't care whether the actual math done between these resource accesses is correct - that is not what we are trying to certify! What we are trying to certify is that resource access is done in the correct order, accounted for correctly in the "billing receipt" created by the worker, and that any tampering cannot be hidden from the publisher, i.e. cannot be done without evidence of the tampering appearing on the receipt. We think that, under our new approach, for the worker to "get away with something" he will either have to provide visible evidence of his shenanigans to the publisher or else expend significantly more resources than just running the computation correctly in order to generate a "believable" lie that will not stand out to the publisher on the receipt.
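Here is a minimal sketch of what such a tamper-evident "billing receipt" could look like. All names are hypothetical, and this leaves out the signature over the chain head that a real worker would also have to produce; it only shows the chaining over resource-access events.

```python
import hashlib
import json

GENESIS = hashlib.sha256(b"receipt-genesis").hexdigest()

class BillingReceipt:
    """Hash-chain only the resource-access events (not the math run
    between them), making their order and amounts tamper-evident."""
    def __init__(self):
        self.entries = []
        self.head = GENESIS

    def record(self, resource, amount):
        entry = {"resource": resource, "amount": amount, "prev": self.head}
        self.head = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self.entries.append(entry)

def verify(entries, claimed_head):
    """Publisher-side check: replay the chain and compare heads."""
    h = GENESIS
    for e in entries:
        if e["prev"] != h:
            return False
        h = hashlib.sha256(json.dumps(e, sort_keys=True).encode()).hexdigest()
    return h == claimed_head

r = BillingReceipt()
r.record("cpu_seconds", 0.42)
r.record("disk_read_bytes", 4096)
assert verify(r.entries, r.head)  # editing or reordering any entry fails
```

In a real protocol the head would presumably be signed and streamed to the publisher periodically, so a worker can't quietly rewrite history after the fact; this sketch shows only the chaining itself.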
We are sketching out some details of a "toy proof of concept" (literally an adaptation of "hello world") to help confirm our theory.
Also, I assume by 'work' we don't mean running a mathematical operation across hosts.
Well, of course we do! What we don't mean is running some specific prescribed operation, but rather any general operation we please. The work function can basically be summarized as "run our authenticated hypervisor and give us IO to the inside."
I don't know the infrastructure for the VM,
HEH, neither do we, yet! It will probably not stray too far from what was originally proposed, just with a small piece of the VM either added or removed, depending upon how you look at it. We've tossed around a few ideas for different approaches, but decided to defer many of those details until after we can prove the basic premise with a "hello world" type example being authenticated.
but the system may assume all hosts are online and cooperating in a non-malicious way, so it can build and operate an entire OS across them. If one host acts maliciously, it would endanger the integrity of the whole VM. From this perspective, a single defective host in a thousand endangers the entire system, not just 1/1000 of the work.
This is also somewhat central to our debate. I think at this point we have about 5 to 7 opinions between the three of us, hehe. Pretty much all we *do* agree on so far is that this particular aspect needs a much more formal treatment in the specification!
I agree with HMC here. Any kind of benchmarking used must be run alongside the process. Any host can benchmark high and detach resources after the process has begun.
I think you missed something key here. Benchmarking is continuous and ongoing in any case. In other words, your job is benchmarked "alongside the process," so if you start out benchmarking high and then go about removing applied resources, you will not be able to (assuming we can get the ancillary issues sorted out) continue billing without also reducing your billing rate correspondingly. We all agree that this will work fine and that rates will converge appropriately.
What we don't agree on is the meaningfulness of the initial "baseline" benchmark that you start from, used for your initial rounds of billing before this convergence starts to "settle into" the correct values via the linear decomposition. I don't dispute the validity of the linear solve itself, only the applicability of a single "canonical" or general benchmark to the initial billing for an arbitrary process.
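For a rough flavor of the linear decomposition being referred to (made-up numbers, with NumPy's least-squares solver standing in for whatever estimator Zennet actually uses): each billing interval yields a vector of measured resource usage and a total charge, and the per-unit prices fall out of a linear solve.

```python
import numpy as np

# Rows: billing intervals. Columns: measured resource usage per interval
# (say CPU-seconds, RAM GB-hours, I/O GB). Numbers are invented.
usage = np.array([[10.0, 2.0, 0.5],
                  [ 8.0, 1.5, 0.7],
                  [12.0, 2.5, 0.4],
                  [ 9.0, 2.2, 0.6]])

# Total billed per interval, generated here from per-unit prices
# [0.08, 0.15, 0.30] so the solve can recover them exactly.
charged = np.array([1.25, 1.075, 1.455, 1.23])

# Least-squares estimate of the implied per-unit prices; with more
# intervals the estimate converges, whatever the initial baseline was.
prices, *_ = np.linalg.lstsq(usage, charged, rcond=None)
print(prices)  # ~ [0.08, 0.15, 0.30]
```

The dispute above is only about the prices assumed for the first few intervals, before a fit like this has any data to converge on.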
The details on this are a bit too deep and maths-y to get into here, I think. Join #zennet and we can wade into it if you'd like. :-)
... This may introduce another problem, as an aside: any open-source OS selected must be heavily changed.
Yes, this has come up as well. We'd obviously like to avoid something as (insanely) effort-intensive as authenticating an entire kernel and/or VM. Although Ohad briefly considered it as an option, I discouraged such a "moon shot" goal, favoring instead an approach more like a special-purpose VM layer.
I think if points 2 and 3 are solved, this won't arise.
I agree! If we can solve 2 and 3, then any "lower-dimensional" non-linearity introduced into the pricing model by an "attacking" worker becomes immediately quite visible, and the publisher can reliably abort.
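Concretely (toy numbers again, same hedges as the pricing sketch above), the check could be as simple as asking whether each new interval's charge still fits the linear model established so far:

```python
import numpy as np

# Intervals already billed, by now converged; columns are resources.
history = np.array([[10.0, 2.0], [8.0, 1.5], [12.0, 2.5], [9.0, 2.2]])
paid    = np.array([1.20, 0.95, 1.45, 1.12])   # consistent with [0.1, 0.1]
prices, *_ = np.linalg.lstsq(history, paid, rcond=None)

def looks_linear(usage_row, charge, threshold=0.05):
    """Does this new interval's charge fit the established linear
    model?  If not, the publisher can simply abort the job."""
    return abs(charge - usage_row @ prices) <= threshold

print(looks_linear(np.array([11.0, 2.1]), 1.31))  # honest -> True
print(looks_linear(np.array([11.0, 2.1]), 2.60))  # padded -> False
```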
If we can identify well-behaved nodes that give verifiable results with verifiable resource usage, this incentive wouldn't exist. Any pricing model based on this would be sound.
Exactly. The conclusion we do all solidly agree on is that if we can verify enough such that a "big lie" becomes very self-evident and "creating lots of continuous small lies over time" becomes very computationally expensive, then the rest of the model follows soundly from that. (Assuming ID cost is correct, i.e. my point #1 is solidly addressed as well.)