Bitcoin Forum
December 15, 2024, 09:28:01 AM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 3 4 5 6 7 [8] 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 »  All
  Print  
Author Topic: [PRE-ANN][ZEN][Pre-sale] Zennet: Decentralized Supercomputer - Official Thread  (Read 57103 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 02:51:42 AM
 #141


it is clear that:
1. you want the computational work to be proven

Ideally, yes.

Quote
2. i want to control the risk expectation and decrease it to practical values

As I've said I'd accept this as compromise, but have yet to see the mechanic by which it is decreased to practical values.

Quote
i claim that your method is less practical, but i'm open to hear more.

Excellent.  Why, specifically, do you claim that my method is less practical?  Also what, specifically, would you like to hear more about?

Quote
you claim that my method won't work.

I only claim that they "shouldn't work, as described."

Quote
now let's focus on either ways:
if we want to talk about my approach, then miscalc and procfs spoof are indeed different.

GAH how are they at all different? :-)

In either case the attacker claims to execute one reduction but actually executes another related reduction.  Where is the difference?  Either I am executing a different algorithm than specified for the job itself (miscalc) or I am executing a different algorithm than specified for the consumption sampling (procfs spoof) but either way I'm doing the same thing - a different reduction from what the publisher thinks I am doing.

HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 02:54:28 AM
 #142

it's true for ANY vectors. calling them "procfs msmts" doesnt change the picture.
i can linearly span the last 100 chars you just typed on your pc by using say 200 weather readings each from 200 different cities.
of course, only once, otherwise the variance will be too high to make it somehow informative.
but for a given program, the variance over several runnings is negligible (and in any case can be calculated, and taken into account on the least squares algo).

Again, I don't disagree, but how does any of this have any bearing on "my arbitrary program?"  How do you get around the lack of functional extension?  How do you make the vectors, in any way, relate back to the job program?
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 02:58:42 AM
 #143

Huh  If benchmarks don't have to be accurate than why do them at all?

that's the very point: i dont need accurate benchmarks. i just need them to be different and to keep the system busy. that's all!! then i get my data from reading procfs of the benchmark's running. if you understand the linear independence point, you'll understand this.

Wha?  I guess I don't understand the linear independence point. Oh, wait, I already knew that I didn't understand that.  Wink

If the benchmarks don't need to be accurate then why are they in the model at all?  If everyone can just reply 3.141 for every benchmark result and that is somehow "just ok" then what purpose does the benchmark serve?

My only conclusion is that it isn't actually "just ok."

Quote
oh well
so you know how to write software that applies to all future hw?

Sure, that is precisely what the full lambda cube is for!

What I don't know how to do is devise any scheme to meaningfully benchmark all future hw.  You briefly claimed you had.  I didn't hold my breath.  Good thing.
Yuzu
Sr. Member
****
Offline Offline

Activity: 368
Merit: 250



View Profile
September 20, 2014, 03:07:59 AM
 #144

 I have this marked for reading later.
ohad
Hero Member
*****
Offline Offline

Activity: 897
Merit: 1000

http://idni.org


View Profile WWW
September 20, 2014, 03:26:32 AM
 #145

what you say point to many misunderstandings of what i said. i dont blame you. the higher probability is me not being clear enough, plus i do have a language barrier when it comes to english:

1. those benchmarks has nothing to do with benchmarking. i don't even read their result. it's just i need programs that extensively use various system components, and such programs i'll typically find under the shelve of "benchmarks". but that's all.

1.1 the info i do gather from the benchmark running is the procfs readings of the running itself.

1.2 now let's go on with the simple case (on the pdf it's where s=t1), when the user just sets one number: "i want 1 zencoin for every hour of full load". so the client will blow his pc with "benchmarks", take procfs measurements, and compute the pricing vector. call this latter vector g. this vector plays the main role.

1.3 at every few seconds, the client takes procfs measurements vector. this is a vector of n components. the first component is CPU User Time. the second is CPU Kernel time. the third is RAM Sequential Bytes Read, and so on many procfs vars. then we take the inner product of g with the procfs msmts vector, and that's the amount to be paid.

1.4 the location of the vars as components in the vector also answers your misunderstanding regarding the decomposition of a procfs msmts of a given program to a linear combination of procfs msmts of k>>n programs.

2. all the publisher wants is to fold his protein in 10 minutes over zennet rather than 2 weeks in the univerty's lab. he does not need uptime guarantees such as bitcoin. even if he's folding various proteins 24/7.

3. as for the id pow, just common sense. one estimates a confidence bound of the reward of a spoof/spam/scam and determines a POW which is significantly more expensive that it.

Quote
GAH how are they at all different? :-)

In either case the attacker claims to execute one reduction but actually executes another related reduction.  Where is the difference?  Either I am executing a different algorithm than specified for the job itself (miscalc) or I am executing a different algorithm than specified for the consumption sampling (procfs spoof) but either way I'm doing the same thing - a different reduction from what the publisher thinks I am doing.

4. they're equivalent from the attacker's/provider's point of view, but not from the publisher's one who wants to detect them, as i stated.

5. what do you mean by functional extension? like feature mapping, rkhs etc.?
Quote
Quote
oh well
so you know how to write software that applies to all future hw?
Sure, that is precisely what the full lambda cube is for!

What I don't know how to do is devise any scheme to meaningfully benchmark all future hw.  You briefly claimed you had.  I didn't hold my breath.  Good thing.

6. you're both a scientist and an engineer, right? but when you wrote it, you were only a scientist. we'll get back to this point Wink
Quote
Excellent.  Why, specifically, do you claim that my method is less practical?  Also what, specifically, would you like to hear more about?

7. the methods i've encountered in the literature were seemed to me less practical. i do not rule out that you found or read about good enough methods. and of course i'd like to hear more about such technologies, and even more than hear, depends on the specific ideas.

more answers to give but let's first establish a wider basis of agreement and understanding.

Tau-Chain & Agoras
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 03:46:13 AM
 #146

as i wrote, i'll be glad to discuss with you methods for proving computation. not even discuss but maybe even work on it.

I'm not sure there is much to work on.  Just replace the VM with something authenticating, and that's that, right?  There's a good list of options at http://outsourcedbits.org/coding-list/ covering everything from basic two-party garbling (you don't need more then two) to full multi-party morphisms.

There is also the work of SCIPR labs, but nothing so "drop in" yet.  I hear M$ and IBM both have some really nice tech for it, too, but I doubt they'd share.

One of the other folks who fits in my small elevator is Socrates1024, aka AMiller.  He has something I've come to call "Miller's Merkle Trees" which he calls generalized authenticated data structures.  This work started from his now-infamous model to reduce block chain bloat.  He has recently extended this into a reduction model for general authenticated computing without requiring Yao's garbling step and the overhead that comes with the logic table rewrite.  Verification happens in reduced complexity from the prover challenge itself.  This seems almost perfectly suited to your goals.  I'm sure he'd love to see practical application, and almost everything you'd need is in his github in "some form" or another, hehe.

Many of us have suspected for some time now that there is some middle ground between the approaches, with a preservation of privacy over data but not over the circuit construction itself and without the overheads of it.  I tend to believe this is probably the case, but as far as I'm aware it is still an open question.

I'm probably not the best to explain, since I'm only standing on the shoulders of the giants with this stuff.  You probably want to talk to the giants directly.  Wink  I can try to make a few introductions if you'd like.
ohad
Hero Member
*****
Offline Offline

Activity: 897
Merit: 1000

http://idni.org


View Profile WWW
September 20, 2014, 03:53:21 AM
 #147

as i wrote, i'll be glad to discuss with you methods for proving computation. not even discuss but maybe even work on it.

I'm not sure there is much to work on.  Just replace the VM with something authenticating, and that's that, right?  There's a good list of options at http://outsourcedbits.org/coding-list/ covering everything from basic two-party garbling (you don't need more then two) to full multi-party morphisms.

There is also the work of SCIPR labs, but nothing so "drop in" yet.  I hear M$ and IBM both have some really nice tech for it, too, but I doubt they'd share.

One of the other folks who fits in my small elevator is Socrates1024, aka AMiller.  He has something I've come to call "Miller's Merkle Trees" which he calls generalized authenticated data structures.  This work started from his now-infamous model to reduce block chain bloat.  He has recently extended this into a reduction model for general authenticated computing without requiring Yao's garbling step and the overhead that comes with the logic table rewrite.  Verification happens in reduced complexity from the prover challenge itself.  This seems almost perfectly suited to your goals.  I'm sure he'd love to see practical application, and almost everything you'd need is in his github in "some form" or another, hehe.

Many of us have suspected for some time now that there is some middle ground between the approaches, with a preservation of privacy over data but not over the circuit construction itself and without the overheads of it.  I tend to believe this is probably the case, but as far as I'm aware it is still an open question.

I'm probably not the best to explain, since I'm only standing on the shoulders of the giants with this stuff.  You probably want to talk to the giants directly.  Wink  I can try to make a few introductions if you'd like.
thanks much for the info. will go over it.
will be delighted for introductions, of course. also if you want to be a part of zennet, let's consider it

Tau-Chain & Agoras
ohad
Hero Member
*****
Offline Offline

Activity: 897
Merit: 1000

http://idni.org


View Profile WWW
September 20, 2014, 04:10:23 AM
 #148

after a quick look on the link:
1. recall that on zennet we speak about native code only. no jvms or so.
2. data privacy is nice to have. much of the data is not private (just numbers), and on any case, homeomorphic encryptions etc. can take place over zennet using the any existing implementation. all zennet does is giving you ssh. if your verification tool gives you suspicious info, then just disconnect. that's part of your distributed app, which i do nothing to help you distribute it, rather than giving you passwordless ssh access to remote machines.
3. we're speaking about arbitrary native code (rules out projects for specific calcs). verifiable calcs are well and economically mitigated, as above.
but that's only after a quick look

Tau-Chain & Agoras
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 04:16:21 AM
 #149

1. those benchmarks has nothing to do with benchmarking. i don't even read their result. it's just i need programs that extensively use various system components, and such programs i'll typically find under the shelve of "benchmarks". but that's all.

1.1 the info i do gather from the benchmark running is the procfs readings of the running itself.

Sure, this is what I meant by the output/result of the benchmarking.  No confusion there.

Quote
1.2 now let's go on with the simple case (on the pdf it's where s=t1), when the user just sets one number: "i want 1 zencoin for every hour of full load". so the client will blow his pc with "benchmarks", take procfs measurements, and compute the pricing vector. call this latter vector g. this vector plays the main role.

1.3 at every few seconds, the client takes procfs measurements vector. this is a vector of n components. the first component is CPU User Time. the second is CPU Kernel time. the third is RAM Sequential Bytes Read, and so on many procfs vars. then we take the inner product of g with the procfs msmts vector, and that's the amount to be paid.

The procfs measurements in 1.2 and in 1.3 are entirely unrelated.  The load that any given benchmarking algorithm, B, puts on the system will not be representative of the load the job algorithm, P, puts on the system.  No combination of any algorithms B2, B3, ... BN will decompose onto a meaningful relation with P's load.  This is the fundamental problem.  No measurement of any other algorithm tells us anything about P unless that algorithm is isomorphic to P.  Even then it doesn't tell us anything about our particular run of P, the performance profile of which will likely be dominated by external factors.

Quote
1.4 the location of the vars as components in the vector also answers your misunderstanding regarding the decomposition of a procfs msmts of a given program to a linear combination of procfs msmts of k>>n programs.

?

Other than the well ordering making the subsequent decomposition consistent with the original g vector, I fail to see how it relates.  Is there something more and special to this?

Quote
2. all the publisher wants is to fold his protein in 10 minutes over zennet rather than 2 weeks in the univerty's lab. he does not need uptime guarantees such as bitcoin. even if he's folding various proteins 24/7.

Sure, I've never disputed this, nor do I see it as any problem.  (With the exception of the absconder cases, which I consider a separate concern anyway.)  You brought in uptimes into the discussion, in an unrelated context.  Let's move on.

Quote
3. as for the id pow, just common sense. one estimates a confidence bound of the reward of a spoof/spam/scam and determines a POW which is significantly more expensive that it.

But how do they make the determination?  How am I supposed to pick a target without knowing anything about any participants capacity?  How can I have any reason the believe that my target is not way too low, creating no incentive not to abscond.  I can just set the target unreasonably high, but then I significantly reduce my possible workers.

How can we say the puzzle is sufficiently difficult if we can't quantify difficulty?

Quote
4. they're equivalent from the attacker's/provider's point of view, but not from the publisher's one who wants to detect them, as i stated.

Only in that, without authentication, they have separate mitigation to the fact that they are not actually detectable.  (Even this is arguable, as in both cases mitigation can only really be performed by way of comparison to a third party result!)

Quote
5. what do you mean by functional extension? like feature mapping, rkhs etc.?

I mean http://ncatlab.org/nlab/show/function+extensionality as in proving two algorithms equivalent, or proving a morphism path as continuous from one to the other under homotopy.  If you can't show that the benchmark work is a functional extension from the job work than any measurements over the benchmark work have no relation to the job work.  In other words, if you can't assert any relation between the algorithms then you can't assert any relation between the workloads of executing the algorithms!

Quote
6. you're both a scientist and an engineer, right? but when you wrote it, you were only a scientist. we'll get back to this point Wink

I've been both, at various points.  I try to avoid ever being both simultaneously, I find it tends to just cause problems.


Quote
7. the methods i've encountered in the literature were seemed to me less practical.

Until the past few years, they have been almost entirely impractical.  Times have changed rapidly, here.

Quote
i do not rule out that you found or read about good enough methods.

It concerns me that you wouldn't have been aware of these developments, considering the market you're about to try to penetrate.  Know your competition, and all that.

Quote
and of course i'd like to hear more about such technologies, and even more than hear, depends on the specific ideas.

I've sent you a PM wrt this.

Quote
more answers to give but let's first establish a wider basis of agreement and understanding.

Sure.  Which of our disagreements and/or misunderstandings should we focus on?
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 04:22:45 AM
 #150

after a quick look on the link:
1. recall that on zennet we speak about native code only. no jvms or so.
2. data privacy is nice to have. much of the data is not private (just numbers), and on any case, homeomorphic encryptions etc. can take place over zennet using the any existing implementation. all zennet does is giving you ssh. if your verification tool gives you suspicious info, then just disconnect. that's part of your distributed app, which i do nothing to help you distribute it, rather than giving you passwordless ssh access to remote machines.
3. we're speaking about arbitrary native code (rules out projects for specific calcs). verifiable calcs are well and economically mitigated, as above.
but that's only after a quick look

Then what you want, certainly, is Miller's work, and not Yao.
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 04:29:16 AM
 #151

thanks much for the info. will go over it.
will be delighted for introductions, of course. also if you want to be a part of zennet, let's consider it

I'd consider it.  However demands on my time are already pretty extreme.   Undecided
ohad
Hero Member
*****
Offline Offline

Activity: 897
Merit: 1000

http://idni.org


View Profile WWW
September 20, 2014, 04:32:01 AM
Last edit: September 20, 2014, 04:11:49 PM by ohad
 #152

Quote
Other than the well ordering making the subsequent decomposition consistent with the original g vector, I fail to see how it relates.  Is there something more and special to this?

it agrees with the ordering of all procfs measurements vectors.
say it's a two dimensional vector: [<UserTime>, <KernelTime>], like [5,3.6].
now say I run your small task with various inputs many times. I get some average vector of procfs, of the form [<UserTime>, <KernelTime>], plus some noise with some variance.
this vector can be written as a*[<B1.UserTime>, <B1.KernelTime>]+b*[<B2.UserTime>, <B2.KernelTime>] plus some noises with covariances.
of course, this is only if B1.UserTime*B2.KernelTime-B1.KernelTime*B2.UserTime does not equal zero (this is just the determinant). if it's zero, we need B3, B4 and so on until we get rank 2.
this is how to decompose a procfs vector of a program into of several given (rank-n) procfs vector programs.

Quote
Quote
i do not rule out that you found or read about good enough methods.
It concerns me that you wouldn't have been aware of these developments, considering the market you're about to try to penetrate.  Know your competition, and all that.

i did that research, just didn't find something good enough. the best i found involved transmitting vm snapshots.
no human being is able to read all papers regarding distributed computing.

Quote
3. as for the id pow, just common sense. one estimates a confidence bound of the reward of a spoof/spam/scam and determines a POW which is significantly more expensive that it.
Quote
But how do they make the determination?  How am I supposed to pick a target without knowing anything about any participants capacity?  How can I have any reason the believe that my target is not way too low, creating no incentive not to abscond.  I can just set the target unreasonably high, but then I significantly reduce my possible workers.

How can we say the puzzle is sufficiently difficult if we can't quantify difficulty?

just dont give too much work to a single provider. that's a good practice for many reasons.
in addition, begin slowly. hash some strings, see you get correct answers etc., fell your provider before massive computation (all automatically of course).

Tau-Chain & Agoras
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 06:17:20 AM
 #153

Quote
Other than the well ordering making the subsequent decomposition consistent with the original g vector, I fail to see how it relates.  Is there something more and special to this?
it agrees with the ordering of all procfs measurements vectors.
...
this is how to decompose a procfs vector of a program into of several given (rank-n) procfs vector programs.

Right, so the ordering only matters to keep the decomposition consistent with the g values, not more. This is what I suspected.

Quote
i did that research, just didn't find something good enough. the best i found involved transmitting vm snapshots.
no human being is able to read all papers regarding distributed computing.

I'm just quite surprised you didn't catch it.  The service industry has been "abuzz" about the post-Gentry work.

Quote
Quote
3. as for the id pow, just common sense. one estimates a confidence bound of the reward of a spoof/spam/scam and determines a POW which is significantly more expensive that it.
Quote
But how do they make the determination?  How am I supposed to pick a target without knowing anything about any participants capacity?  How can I have any reason the believe that my target is not way too low, creating no incentive not to abscond.  I can just set the target unreasonably high, but then I significantly reduce my possible workers.

How can we say the puzzle is sufficiently difficult if we can't quantify difficulty?

just dont give too much work to a single provider. that's a good practice for many reasons.

This doesn't answer the question at all.  There really does need to be a global difficulty, as far as I can tell.  I've been mulling this one over for awhile now, and just don't see a way around it.  For an arbitrary worker, I can't know if he is doing his id pow with cpu, gpu, pen and paper, asic, pocket calculator, or some super specialized super efficient worker that outperforms all the rest.  What is an appropriate difficulty target to set on his puzzle?

ohad
Hero Member
*****
Offline Offline

Activity: 897
Merit: 1000

http://idni.org


View Profile WWW
September 20, 2014, 10:36:03 AM
 #154

1. those benchmarks has nothing to do with benchmarking. i don't even read their result. it's just i need programs that extensively use various system components, and such programs i'll typically find under the shelve of "benchmarks". but that's all.

1.1 the info i do gather from the benchmark running is the procfs readings of the running itself.

Sure, this is what I meant by the output/result of the benchmarking.  No confusion there.

Quote
1.2 now let's go on with the simple case (on the pdf it's where s=t1), when the user just sets one number: "i want 1 zencoin for every hour of full load". so the client will blow his pc with "benchmarks", take procfs measurements, and compute the pricing vector. call this latter vector g. this vector plays the main role.

1.3 at every few seconds, the client takes procfs measurements vector. this is a vector of n components. the first component is CPU User Time. the second is CPU Kernel time. the third is RAM Sequential Bytes Read, and so on many procfs vars. then we take the inner product of g with the procfs msmts vector, and that's the amount to be paid.

The procfs measurements in 1.2 and in 1.3 are entirely unrelated.  The load that any given benchmarking algorithm, B, puts on the system will not be representative of the load the job algorithm, P, puts on the system.  No combination of any algorithms B2, B3, ... BN will decompose onto a meaningful relation with P's load.  This is the fundamental problem.  No measurement of any other algorithm tells us anything about P unless that algorithm is isomorphic to P.  Even then it doesn't tell us anything about our particular run of P, the performance profile of which will likely be dominated by external factors.


true, they're unrelated.
as for load decomposition,
why on earth the programs should be identical/homomorphic/isomorphic?
programs that performs 1000 FLOPs will take about 1/2 than programs that perform 2000 similar FLOPs, even if they're two entirely different algos. i'm not counting on that anywhere, just pointing to the fact that i'm apparently not interested at all in such functional equivalence.
how any kind of such equivalence is related to resource consumption?
Quote
I mean http://ncatlab.org/nlab/show/function+extensionality as in proving two algorithms equivalent, or proving a morphism path as continuous from one to the other under homotopy.  If you can't show that the benchmark work is a functional extension from the job work than any measurements over the benchmark work have no relation to the job work.  In other words, if you can't assert any relation between the algorithms then you can't assert any relation between the workloads of executing the algorithms!

again, how come you tie so closely between workloads and specific algo's operation?
maybe it's needed for proven comps, but for my approach estimating consumption and risk reducing?

Quote
1.4 the location of the vars as components in the vector also answers your misunderstanding regarding the decomposition of a procfs msmts of a given program to a linear combination of procfs msmts of k>>n programs.

?

Other than the well ordering making the subsequent decomposition consistent with the original g vector, I fail to see how it relates.  Is there something more and special to this?
[/quote]
where are we stuck on agreeing over the procfs vector decomposition?
maybe you're looking for something deep like functional extension, but those vectors can be spanned by almost any enough random vectors. like all vectors.
Quote
Quote
Quote
4. they're equivalent from the attacker's/provider's point of view, but not from the publisher's one who wants to detect them, as i stated.

Only in that, without authentication, they have separate mitigation to the fact that they are not actually detectable.  (Even this is arguable, as in both cases mitigation can only really be performed by way of comparison to a third party result!)

sometimes the verification is so fast that it can be done on the publisher's computers. so as for miscalc, not always 3rd party is needed. moreover, yes, i do assume that each publisher rents many providers and is able to compare between them.

I still don't understand which flaw you claim have found.
You claim that people will just create many addresses, make jobs for a few seconds, and this way fool the whole world and get rich until zennet is doomed?
So many mechanisms were offered to prevent this. Such as:
1. Easy outlier detection cause of [large population and] normal estimators.
2. Keeping local history and building a list of trusted addresses, by exposuring yourself to risk slowly, not counting on an address fully right from the first second.
3. You can always ask it to hash something and verify the result! Moreover, you can spend the your first new seconds with your new provider with just proving his work. Yes, they can became malicious a moment after, but: you'll find out pretty quickly and never work with that address, while requiring more POW than this address has.
4. i'll be glad of a detailed scenario of an attacker trying to earn. i dont claim he won't make a penny. but show me how he'll make more than a penny a day. Can you pinpoint the problem? for miscalc and procfs spoofing, we may assume we dont have the fancy pricing model, and we can discuss it as we were pricing according to raw procfs.

Tau-Chain & Agoras
CryptoPiero
Member
**
Offline Offline

Activity: 98
Merit: 10


View Profile
September 20, 2014, 12:34:19 PM
 #155

peled1986
Legendary
*
Offline Offline

Activity: 882
Merit: 1002


View Profile
September 20, 2014, 01:10:28 PM
 #156


Super interesting conversation between ohad and HMC   Smiley
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 06:31:48 PM
 #157

true, they're unrelated.
as for load decomposition,
why on earth the programs should be identical/homomorphic/isomorphic?

First, homomorphism is an entirely seperate thing. I only want the isomorphism.

Second, they "should" be isomorphic because the performance profile of any given algorithm has to be assumed as potentially unique to that algorithm.  No benchmark you could throw at the system will necessarily "load" that system in the same way(s) that my program P will "load" that system.  What we ultimately want to get to is a measure of "how hard is the system working on P" and we can't formulate this beginning from some baseline that has nothing to do with P!

Quote
programs that performs 1000 FLOPs will take about 1/2 than programs that perform 2000 similar FLOPs, even if they're two entirely different algos. i'm not counting on that anywhere, just pointing to the fact that i'm apparently not interested at all in such functional equivalence.

Sure, but this reasoning is only really valid if we look at a single measure in isolation - which is not what we intend, per your decomposition!  If we run your benchmarks and generate our g values across all metrics, and then use a spin loop for P, we will see 100% processor usage, but no other load.  Does this mean that we are loading the system with 100% of "P work?"  YES, but your model will initially decide that the usage is actually lower!

Quote
how any kind of such equivalence is related to resource consumption?

If we can formulate our g in relation to P then our measures relate, too.  If we take our g measure using a variety of spin loops (potentially "extended" from P in some way) and then perform our decomposition against our P spin loop your decomposition will "know" from g that the only meaningful dimension of the performance profile is the cpu, and will correctly measure P as loading the system to 100%.

Obviously this is a contrived example to illustrate the point, no-one will want to pay to run a busy loop P.  However taking the point to an extreme like this makes it simple to illustrate.

Quote
again, how come you tie so closely between workloads and specific algo's operation?

Because every algorithm combined with a particular reduction of that algorithm has a unique performance profile criteria!  A performance is unique to a particular run.  We can never know how our particular run will behave, but we can't measure it against the baseline of some other algorithm(s) in a meaningful way.  Disk usage or page faults or etc in our benchmarks have NO bearing on our measure of our busy loop P!

Quote
maybe it's needed for proven comps, but for my approach estimating consumption and risk reducing?

It is necessary in either case to derive meaningful measure in the decomposition.

Also, this notion should actually be turned the other way round - proven comps should be considered needed for estimating consumption, as AMiller pointed out on IRC.

" 12:21 < amiller> imo it's not a bad idea to have some pricing structure like that, i'm not sure whether it's novel or not, i feel like the biggest challenge (the one that draws all my attention) is how to verify that the resources used are actually used, and this doens't address that" [SIC]


Quote
where are we stuck on agreeing over the procfs vector decomposition?
maybe you're looking for something deep like functional extension, but those vectors can be spanned by almost any enough random vectors. like all vectors.

The problem is that such a span is not meaningful relative to subsequent measure unless some functional extension exists.  Unless our benchmarks are "similar" to our P busy loop, they only introduce noise to our decomposition of our measures over P.


Quote
sometimes the verification is so fast that it can be done on the publisher's computers. so as for miscalc, not always 3rd party is needed.

Again, I'm assuming most computations will not have referential transparency, will not be pure, precluding this.

Quote
moreover, yes, i do assume that each publisher rents many providers and is able to compare between them.

Again, I'd rather find solutions than defer to (potentially costly) mitigation.

Quote
I still don't understand which flaw you claim have found.
You claim that people will just create many addresses, make jobs for a few seconds, and this way fool the whole world and get rich until zennet is doomed?

Among other behaviors.  I basically assume that people will do "all the same crap they did with CPUShare et al that made those endeavors fail miserably."

Quote
So many mechanisms were offered to prevent this. Such as:

Except as offered, they don't prevent anything at all.  They presume to probabilistic avoid the concern, except that there is no formalism, yet, around why we should believe they will serve to avoid any of them to any reasonable probable degree.  Again, I defer to AMiller who always puts things so much better than I ever could:

"12:25 < amiller> there are no assumptions stated there that have anything to do with failure probability, malicious/greedy hosts, etc. that would let you talk about risk and expectation"

Quote
3. You can always ask it to hash something and verify the result! Moreover, you can spend the your first new seconds with your new provider with just proving his work. Yes, they can became malicious a moment after, but: you'll find out pretty quickly and never work with that address, while requiring more POW than this address has.

Again, how is the challenge difficulty to be set?  This is still unanswered!

Quote
4. i'll be glad of a detailed scenario of an attacker trying to earn. i dont claim he won't make a penny. but show me how he'll make more than a penny a day.

How much he stands to make largely depends on his motive and behavior.  Some attackers might make 0 illegitimate profit, but prevent anyone *else* from being able to accept jobs, for example.  Some attackers might burn addresses, and just make whatever "deposit" payments he can take.  Some attackers might fake their computation, and make the difference in energy cost between his fake work and the legitimate work.  Whether or not he'll make more than a penny a day depends on how capable his approach is, how naive/lax his victims are, and how much traction the network itself has gained.

My concern is not that some attacker will get rich, my concern is that the attackers leeching their "pennies a day" will preclude the network from being able to gain any traction at all, and will send it the way of CPUShare et al.

It is very important to remember that, for some attackers, a dollar a day would be a fortune.  Starving people are starving.  Some of those starving people are smart and own computers, too.

Quote
Can you pinpoint the problem? for miscalc and procfs spoofing, we may assume we dont have the fancy pricing model, and we can discuss it as we were pricing according to raw procfs.

The root of the problem is just the combination of everything I've already described.  Wink  It is all predicated from lack of authentication over the g values, and made worse by the lack of cost for identity.

HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 07:17:39 PM
 #158


Super interesting conversation between ohad and HMC   Smiley

But the big question: Who is wrong?  Wink

Is there anyone out there in internetland who wants to jump in with a new perspective?
ohad
Hero Member
*****
Offline Offline

Activity: 897
Merit: 1000

http://idni.org


View Profile WWW
September 20, 2014, 08:27:51 PM
 #159

Quote
Quote
3. You can always ask it to hash something and verify the result! Moreover, you can spend the your first new seconds with your new provider with just proving his work. Yes, they can became malicious a moment after, but: you'll find out pretty quickly and never work with that address, while requiring more POW than this address has.
Again, how is the challenge difficulty to be set?  This is still unanswered!

Easily: requirind ID POW is like requiring your BTC pubkey to begin with a certain amount of zeros.
When I connect to you, I give you some challange to sign with your privkey, to make sure you own that pubkey with tons of POW.
So with this approach, "good people" will use one addr and not gen new one every time, both for the id pow, and for keeping history with good name on the publisher's local data.

As for AMiller's, see the following chat (he didn't answer yet):

amiller:
so the conditions for gauss markov theorem are weaker than IID, but they're still have to be independent and the distributions have to have the same variance
i don't know what you mean by as for lying, connected to actual outcome
if many other nodes perform the same job,
then you have to assume that those nodes failing are all independent
but on the other hand if *all* the nodes know they can cheat and get away with it, they might all cheat
so it's not justified to assume that the probability of any individual node cheating is uncorrelated with any of the others
this is kind of the difference between adversarial reasoning and, i dunno what to call the alternative, "optimistic" reasoning

me: the issue you raise is simpler than the linear model: say i have 10^9 work items to distribute. i begin dividing them among client. after one job is done by a worker (the smaller job, the better) i can tell if they cheated or not, or, i can tell how much they charged me for that work item. if i don't like the numbers i see, i can always disconnect. it will all happen automatically, by the publisher specifying their prices and margins.
the publisher may also begin slowly, giving at the beginning some hashing commands just to see the correct result. they won't pay more for working slow. they pay according to stationary resource consumptions.

The main point is that the exposure is short (few secs), while the publisher can perform any risk-decreasing behavior. Even running a small background task just for sanity verification. Of course, the attacker can modify rare random calculations, and lowering this risk (the miscalc risk) can be done at either:
1. have an easily proven jobs. you claim that they are rare cause of impurity. I claim that for many if not most common problems, one can formalize them and pick a method of solution that'd be distributed and with main massive pure task such as huge eigenvalue problem.
2. increase the number of times each work item is calculated (exp. decrease risk)
3. selected known or trusted providers.
4. require high ID POW, so high such that there is no incentive to lose it, just for the price of a few seconds of calculations.

note again: even if the provider has computing monsters, the publisher doesn't must use all resources. he can run, at first, simple cheap tasks. so the potentially lost several seconds can contain minimal amount of work to be paid.
-- this latter solution of course does not really help you identify scammers, but it does help you slowly gain experience with the good guys.
will be back later with more answers.

Tau-Chain & Agoras
HunterMinerCrafter
Sr. Member
****
Offline Offline

Activity: 434
Merit: 250


View Profile
September 20, 2014, 08:55:26 PM
 #160

When I connect to you, I give you some challange to sign with your privkey, to make sure you own that pubkey with tons of POW.

How do we know what constitutes "tons" for any given puzzle solver?!?!?!?!  Without a global difficulty how do we know what is sufficiently difficult?  How do we know that our "tons" is not too little to fail to avoid the scammers and not too much to disqualify the non-scammers?  How should we pick an number that we can be confident falls in the middle, here, without knowing anything about the worker and how he will perform his hashing?

Quote
4. require high ID POW, so high such that there is no incentive to lose it, just for the price of a few seconds of calculations.

How do we know how high is high enough?  How do we know how high is too high?  How do we pick the target?  You keep dodging this question, somehow.  Wink

I'll let AMiller respond to the rest, since it is intended for him.  I don't want to interject there and just muddy things further.  Grin
Pages: « 1 2 3 4 5 6 7 [8] 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!