The idea is simply brilliant. The question of how unused resources could be put to work in a (semi-)trustless and distributed way has been asked many times, and micropayment channels might be the missing piece of the solution.
After a quick read I have a few questions; excuse me if I missed the general concept:
xennet.io describes the idea of decentralized supercomputing where access to VMs via SSH can be rented or sold. Contracts are negotiated over a P2P network and payments are made via payment channels for actual work, which, according to the description, can be measured. This seems pretty straightforward.
xennetdocs, in contrast, mentions elements like XenFS and XenTube, proof of storage and much more. So what's the plan? Distributed HPC or MaidSafe 2.0?
This particular statement made me wonder:
1. Publisher A broadcasts an announcement (ann) to the blockchain, saying it is seeking providers. The ann contains information about the required systems in terms of hardware capabilities, and the publisher's IP address.
2. Provider B polls on the blockchain. Once an ann that matches its filter is found, it connects to A's IP address.
I'd like to quote Satoshi:
We define an electronic coin as a chain of digital signatures. (...) The problem of course is the payee can't verify that one of the owners did not double-spend the coin.
We need a way for the payee to know that the previous owners did not sign any earlier transactions. For our purposes, the earliest transaction is the one that counts, so we don't care about later attempts to double-spend.
In this paper, we propose a solution to the double-spending problem using a peer-to-peer distributed timestamp server to generate computational proof of the chronological order of transactions.
The solution we propose begins with a timestamp server. A timestamp server works by taking a hash of a block of items to be timestamped and widely publishing the hash, such as in a newspaper or Usenet post [2-5]. The timestamp proves that the data must have existed at the time, ...
The blockchain is basically a ledger where data is published in an ordered structure; it provides an answer to the question of which piece of data came first.
Peer discovery and contract negotiation don't seem like tasks that require such properties and might just as well be handled by other communication networks. Once two peers are matched, they can continue to communicate over an isolated channel anyway. I don't really see the benefit of using a blockchain here; the BitTorrent Mainline DHT, with likely over 25 million participants, is probably a prime example of how it could be done, too -- without any delay caused by block confirmations. You may also take a look at the colored coins projects or bitsquare (to name another concrete example), which intend to use an overlay network for publishing orders.
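To illustrate the alternative, here is a minimal sketch of announcement-based peer discovery over a generic key-value DHT. The `DHT` interface, the topic key and the announcement fields are hypothetical and only meant to show the flow, not any concrete library or the actual Xennet format:

import json
import hashlib

# Hypothetical generic DHT interface (announce/lookup), NOT a concrete library;
# an in-memory dict stands in for the distributed store.
class DHT:
    def __init__(self):
        self._store = {}

    def announce(self, key, value):
        self._store.setdefault(key, []).append(value)

    def lookup(self, key):
        return self._store.get(key, [])

def topic_key(topic):
    # Derive a 160-bit DHT key from a human-readable topic, BitTorrent-style.
    return hashlib.sha1(topic.encode()).hexdigest()

# Publisher A announces its requirements and contact address under a well-known topic.
dht = DHT()
ann = {"min_ram_gb": 16, "min_cores": 8, "gpu": True, "addr": "203.0.113.7:9000"}
dht.announce(topic_key("xennet/announcements"), json.dumps(ann))

# Provider B polls the same topic and filters announcements locally -- no block
# confirmations involved; matched peers then talk over a direct channel.
my_hw = {"ram_gb": 32, "cores": 12, "gpu": True}
for raw in dht.lookup(topic_key("xennet/announcements")):
    a = json.loads(raw)
    if my_hw["ram_gb"] >= a["min_ram_gb"] and my_hw["cores"] >= a["min_cores"]:
        print("match, connecting to", a["addr"])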
I assume this is directly linked to my third note or question:
Why do you want to create a new coin at all? Apart from the fact that this would be a huge and complex task on its own, not even looking at all the implications and security risks, I seem to miss the underlying need in the first place.
To quote:
One would naturally ask: why isn't Xennet planned to be implemented over Bitcoin? The answer is mainly the following: in order to initiate a micropayment channel, it is necessary to deposit money in a multisig address, and the other party has to wait for confirmations of this deposit. This can make the waiting time for the beginning of work to last 30-90 minutes, which is definitely unacceptable.
I'm not sure if this is indeed linked to my previous comment (with something like: tx with "announcement" -> block confirmation -> tx with "accept" -> confirmation -> tx to "open channel" -> ...), but let's assume for a moment this is only about opening the payment channel. I would humbly disagree here and wonder: how do you come up with a delay of 30-90 minutes? When I look at
Gavin's chart, which shows the relation between fees and the delay until inclusion in a block, it seems you can be pretty sure a transaction is confirmed within one block at a cost of 0.0005-0.0007 BTC per 1000 bytes, or within two blocks at a cost of about 0.00045 BTC per 1000 bytes of transaction size. Given that opening the micropayment channel amounts to funding the multisig address via a standard pay-to-script-hash transaction of usually about 230 bytes, it comes down to a cost of about 0.0001-0.0002 BTC to ensure with high probability that the channel is opened within one or two blocks.
With a block interval of usually 10 minutes -- and, due to the increasing total computation power of the network, often even less than that -- this is far away from 30-90 minutes.
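As a sanity check on those numbers, a back-of-envelope calculation (the fee rates and the 230-byte size are the figures quoted above, not measured values):

# Rough cost and waiting-time estimate for opening a payment channel.
TX_SIZE_BYTES = 230                       # typical multisig-funding transaction (figure from above)
FEE_PER_KB_1_BLOCK = (0.0005, 0.0007)     # BTC/1000 bytes for ~1-block inclusion (from Gavin's chart)
FEE_PER_KB_2_BLOCKS = 0.00045             # BTC/1000 bytes for ~2-block inclusion

def fee(rate_per_kb, size_bytes=TX_SIZE_BYTES):
    return rate_per_kb * size_bytes / 1000.0

low, high = (fee(r) for r in FEE_PER_KB_1_BLOCK)
print(f"1-block inclusion: {low:.5f} - {high:.5f} BTC")          # ~0.00012 - 0.00016 BTC
print(f"2-block inclusion: {fee(FEE_PER_KB_2_BLOCKS):.5f} BTC")  # ~0.00010 BTC

# Expected wait for 1-2 confirmations at ~10 minutes per block:
print("expected wait:", 1 * 10, "to", 2 * 10, "minutes")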
This timeframe, depending on the level of trust in the other party, could be used to begin with the work (probably not wise), but also to set up everything that's needed in general and especially to run the benchmark (let's call this proof-of-benchmark) to measure the system's capabilities. The incentives during this period seem reasonably balanced, given that one party at least pays the transaction fee to open the channel, while the other party spends computational resources to run the benchmark.
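For illustration only, a minimal sketch of what such a benchmark run on the provider side could look like -- timing a dense matrix multiplication to estimate sustained GFLOPS. The metric and the sizes are my own choice, not anything specified by Xennet:

import time
import numpy as np

def matmul_gflops(n=2048, repeats=3):
    """Estimate sustained GFLOPS via a dense n x n matrix multiplication."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - t0)
    flops = 2 * n ** 3              # multiply-adds in an n x n x n matmul
    return flops / best / 1e9

print(f"~{matmul_gflops():.1f} GFLOPS sustained (double precision)")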
Another question I'd like to throw in: is an almost instant start even required here? I'm not familiar with HPC and what is usually computed, but I would assume tasks that require heavy resources usually run over longer periods of time. Say (number pulled out of thin air) a lab wants to run some kind of simulation over the next 7 days; then I'd say it doesn't really matter whether the work begins within 5 or 30 minutes.
My last question derives from my lack of knowledge in this field, too: would it be possible to game the system or even produce bogus data? With Bitcoin mining it's pretty simple: there is a heavy task of finding a nonce which produces a hash with some specific properties. The task can easily take quite some time to solve, but the solution can be verified almost instantaneously.
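To make that asymmetry concrete, a toy version of the hash puzzle (the difficulty and encoding are arbitrary):

import hashlib
from itertools import count

DIFFICULTY = 5                      # number of leading hex zeros required (arbitrary)

def solve(data: bytes) -> int:
    """Expensive: try nonces until the hash meets the target."""
    for nonce in count():
        if hashlib.sha256(data + str(nonce).encode()).hexdigest().startswith("0" * DIFFICULTY):
            return nonce

def verify(data: bytes, nonce: int) -> bool:
    """Cheap: a single hash suffices to check the solution."""
    return hashlib.sha256(data + str(nonce).encode()).hexdigest().startswith("0" * DIFFICULTY)

data = b"some block contents"
nonce = solve(data)                 # may take seconds to minutes
assert verify(data, nonce)          # returns instantly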
If the "usual work" done in HPC environments has similar properties, then I see a golden future. If this is not the case and if results may not even be verifiable at all, then this could be a significant problem.
Looking forward to your answers. Cheers!
Thanks Dex!
It is important to bear in mind that Xennet is only IaaS (Infrastructure as a Service) and not PaaS (Platform) or SaaS (Software as a Service). Xennet brings you access to hardware. It does not need to help you manage your workers and their up/downtime, latency and so on, or to divide the distributed work between them -- Xennet won't do it, simply because there are plenty of wonderful tools out there for that (e.g. Hadoop). We do not aim to innovate in this field. All we do is bring more metal to existing distributed applications.
Even though Xennet is an infrastructure, it's a strong and versatile one. Any native code can be executed, hence more applications can be built on top of Xennet. One of them is XenFS. Xennet+XenFS will provide both computational power and storage, but they're different layers, one on top of the other. So we'll have HPC, Big Data, Cloud and Storage, all decentralized and open for applications on top of it.
Xennet does not implement algorithms to verify the correctness of the execution. Such algorithms are a very hot topic in academic research nowadays, and once there is a good way to do it, we will probably use it. But Xennet can still be a fair market even without such verification. We cannot totally eliminate the probability of incorrect computation, but we can make this probability as small as we want.
For example, if we perform each piece of work twice, it'll cost twice as much, but the probability of an undetected mistake will be squared. Linear growth in cost gives exponential decrease in risk. And that's not the only mechanism: the micropayments protocol bounds the time window a fraud can affect to several seconds. As for storage in XenFS, the mechanism for bounding the probability is totally different, as described in the RFC.
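A quick illustration of that trade-off, under the simplifying assumption that providers err independently and a mistake survives only if every replica errs (the 5% error rate is an arbitrary example, not a Xennet figure):

# Cost grows linearly with the number of replicas; undetected-error risk shrinks exponentially.
p_single_error = 0.05        # example probability that one provider returns a bad result
cost_single = 1.0            # cost of running the task once (normalized)

for replicas in (1, 2, 3, 4):
    cost = replicas * cost_single
    p_undetected = p_single_error ** replicas   # probability that every replica errs (simplified model)
    print(f"{replicas} replicas: cost {cost:.0f}x, undetected-error probability {p_undetected:.6f}")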
Maliciously misleading measurements will be taken care of by analysing the normally distributed residuals obtained from the pseudo-inverse. See the linear algebra part. The errors of the least squares problem are known to be normally distributed (Gauss proved that), so one can configure at which point in the distribution's tails to disconnect from the other party, i.e. once the probability of misleading measurements is high enough. It's a bit complex, I know, but the user will only have to configure the rejection probability as a percentage.
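A toy sketch of that idea (the model, the measurement vector and the threshold are made up for illustration; this is not the actual Xennet algorithm): fit the reported measurements by least squares via the pseudo-inverse, treat the residuals as roughly normal, and reject a party whose residual lands in the configured tail.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical setup: each row of A describes a task, x holds the host's true per-resource
# speeds, and b contains the runtimes the provider reports back.
A = rng.random((50, 3))
true_x = np.array([1.0, 2.0, 0.5])
b = A @ true_x + rng.normal(scale=0.05, size=50)   # honest, noisy measurements
b[7] += 2.0                                        # one maliciously inflated measurement

x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)      # least squares via the pseudo-inverse
residuals = b - A @ x_hat
z = residuals / residuals.std(ddof=A.shape[1])     # standardized residuals, approx. N(0, 1)

reject_probability = 0.001                         # the only knob the user configures
threshold = norm.isf(reject_probability / 2)       # two-sided tail cutoff
suspicious = np.flatnonzero(np.abs(z) > threshold)
print("suspicious measurements:", suspicious)      # expected to flag index 7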
Moreover, a publisher can rent, say, 10K hosts, and after a few seconds drop the less-efficient 50%.
Another risk-decreasing behavior is that each provider works simultaneously for several publishers. Say each works for 10 publishers in parallel; then the risk to both sides decreases 10-fold. See my answers earlier in this thread.
You claim that the confirmation time (for the micropayment initiation via the multisig deposit) may be much less than 30-90 minutes (I stated those numbers while thinking of 6 confirmations; DPOS gives you more security with fewer confirmations). You mentioned that high fees may help. Note that this deposit has to be made for each provider separately, and in total not many coins are involved, since usually small amounts are transferred, also for risk management, so the fee cannot be high. In addition, even if the confirmation took only 5 minutes, that's still a lot. Look: the world's fastest supercomputer is roughly equivalent to 8,000 AMD 280X GPUs in terms of TFLOPS. So a full day of the fastest supercomputer's work could be done in minutes or even seconds, without Xennet having even 1% of all GPUs. Moreover, Xennet is an infrastructure for more applications, like XenFS and XenTube. If someone wants to publish an encoding and streaming task over XenTube, will they have to wait 5 minutes before they begin watching?
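A rough back-of-envelope behind that comparison, using approximate public figures (about 4 TFLOPS single precision per 280X and on the order of 30-35 PFLOPS for the then-fastest supercomputer); the GPU count and the speed-up factor are estimates, not measurements:

# Back-of-envelope: how many consumer GPUs roughly match the top supercomputer?
TFLOPS_PER_280X = 4.0               # approx. single-precision peak of an AMD R9 280X
TOP_SUPERCOMPUTER_TFLOPS = 33_000   # approx. ~33 PFLOPS (order of magnitude)

gpus_needed = TOP_SUPERCOMPUTER_TFLOPS / TFLOPS_PER_280X
print(f"~{gpus_needed:,.0f} GPUs to match it")        # on the order of 8,000

# If a network aggregates N times that capacity, a full day of supercomputer
# work compresses accordingly:
aggregate_factor = 100              # hypothetical: 100x the top machine's throughput
print(f"one day of work in ~{24 * 60 / aggregate_factor:.0f} minutes")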
So 5 minutes is still too long. In addition, POW is pretty obsolete; it's also centralized, de facto. I tend to find DPOS much better. What is your opinion?
As for the benchmark, the publisher does not run a benchmark each time it connects to a provider (otherwise a huge waste would take place). The provider's client runs benchmarks from time to time, and the linear algebra algorithm is smart enough to compensate for different hardware showing the same measurements while one is actually more efficient, for fluctuations caused by other tasks running in the background, and so on.
Why publish the ann over the blockchain? It will be up to the user's choice whether to make the ann prunable or not. We might even support propagating it between the nodes without it getting into the blockchain (but we have to think about cases where nodes have no incentive to pass anns along). Some users will want their ann to be public and persistent, for the sake of reputation.
I hope I touched on every point and didn't forget any related strengths worth presenting within this scope.
Thanks again,
Ohad.