Astrohacker (OP)
|
 |
July 22, 2011, 10:13:55 PM |
|
Proofnet: a P2P broadcast network based on proof-of-work to minimize spam
There is no standard decentralized way to broadcast information around the world. After trying to solve many other problems, particularly the problem of trust, I've realized it would be extremely useful to have such a system. The basic idea is to have a peer-to-peer (P2P) network where anyone on the network can broadcast any kind of information to their peers, who rebroadcast the information, until everyone has it. The problem, of course, is spam, since if such a system ever became popular, it would be flooded with valueless spam. One way to solve this problem is to use bitcoins to pay people to rebroadcast your message. But it would be hard to know if you can trust anonymous nodes and thus no one would be willing to spend bitcoins on it. Another method is to attach proof-of-work to each message. This is the method that I propose. Nodes rebroadcast all the messages they can, but give preference to greater work. It would still be possible to spam this network, but it would be expensive. The protocol I've designed for this is called "proofnet". In this article I will explain how the basic message format looks, and describe some applications. If people are interested in this, I will create an example implementation in Python.
The message format
The nodes will be connected to one another in a web. The nodes will then broadcast messages to one another. A message might look like this:
(proof-of-work hash)(message)
The (proof-of-work hash) is a SHA-256 hash of the message that follows. The lower the hash, the more work went into it. Nodes rebroadcast as many messages as they can, giving priority to messages with high proof-of-work (i.e., a lower hash). There are problems with this simplistic format, however. The only way to increase the work would be to change the message. So we should include a nonce:
(proof-of-work hash)(nonce)(message)
The nonce is just a random number that is iterated until the proof-of-work hash is low enough that other nodes are likely to rebroadcast it. Another problem is that spammers might find one high proof-of-work message, and then rebroadcast it again and again. People who just joined the network would be duped by the spam messages. Thus, we should include a timestamp.
(proof-of-work hash)(nonce)(utc timestamp)(message)
Nodes will only rebroadcast messages that 1) they've never seen before, and 2) are fairly new, say within the past 10 minutes. That way, we can be sure no one is just spamming the same old messages. All nodes will need to keep their clocks accurate otherwise their messages will be ignored.
Another problem is that if this protocol became popular, the network would be saturated with messages, and many users would not have enough computing power or bandwidth to be able to participate. So we'd like to allow for separate channels.
(proof-of-work hash)(nonce)(utc timestamp)(channel hash)(message)
The channel hash is a SHA-256 hash of a string like "Proofnet1" or "My awesome channel". The reason for using a hash of the string is to make the format simpler since the hash will always be a constant number of bytes (32), and for some level of privacy, since only people who are privy to the channel name will know what the channel is about (they cannot practically invert the hash to find out).
It's worth noting that the nonce, UTC timestamp, channel hash, and message must all be included in the proof-of-work hash: 1) The nonce must be included to easily allow people to compute different proof-of-work hashes until they get a low one. 2) The UTC timestamp must be included so that you can't just change the UTC time without redoing the proof-of-work, thus you can't just spam old messages. 3) The channel hash must be included so that spammers can't just take a message from one channel and spam it on all others without redoing the proof-of-work. 4) The message itself must obviously be included in the proof-of-work hash so that spammers can't just change the message and rebroadcast it without redoing the proof-of-work. In summary, all information must be included in the proof-of-work hash so that every new broadcast message requires new work.
There is one more piece we need. Since this is going to be a general information broacast network, we need to support different message types.
(proof-of-work hash)(nonce)(utc timestamp)(channel hash)(message type hash)(message)
The (message type hash) is a SHA-256 hash of a string describing the message, like "UTF-8 text" or "JPG Image". We take the hash so that the format is simpler and also to allow for some secrecy, just like with the channel hash. If a client doesn't know what string goes into a hash, it doesn't need to know.
This allows anyone to broadcast any kind of information, and the more computational power they have, the more people will get it. Messages with higher work (lower proof-of-work hashes) are declared more valuable and thus given preference to being rebroadcast. There is no guarantee a given message will make it to every node. The format allows for different channels to exist while not allowing messages on one channel to be rebroadcast on another without redoing the work. It also allows information of any kind to be broadcast and is not limit to text. It allows messages of any size to be broadcast, but longer messages take more work to hash, and thus smaller messages will be easier to broadcast.
Proofnet can broadcast any kind of information, but the standard Proofnet client will include a few applications by default: 1) Text broadcast. 2) Signed text broadcast, like a decentralized IRC. 3) Signed, encrypted messages, like encrypted email, except the recipient is not guaranteed to receive the message. 4) A trust-declaration broadcast system, like decentralized eBay rankings.
1. Text broadcast
Text broadcast is the simplest application. The message type is the string "proofnet:text". And the message is just UTF-8 text. When a client receives this, they can immediately read the message. They will also know how much work went into the message. Messages with more work will look more impressive. The author will be unknown.
2. Signed text broadcast
It is often valuable to know who is sending a message. The message type for this will be "proofnet:textfrom" and the message will take this form:
(message)=(from public key)(public key signature of utf-8 text)(utf-8 text)
This would be like a decentralized IRC. A slightly more sophisticated version could encrypt messages with a password, and only people who know the password could decrypt the message.
3. Private messages
It would be nice to send private messages over this network. The message type will be "proofnet:textto". The message looks like this:
(message)=(to public key)(encrypted message) (encrypted message)=encrypted[(from public key)(from public key signature of reply-to public key and utf-8 text)(reply-to public key)(utf-8 text)]
People would know they are receiving a message by spotting their public key, but no one else would know who sent it or what the message is. A new reply-to address is included to encourage further privacy. Conversations could take place where the main identities of the participants aren't known, and what they are saying isn't known, except to the people participating.
4. Trust rankings
The internet is filled with scammers. We need a decentralized ranking system, like eBay rankings but on a P2P network. We can do this by allowing public keys to advertise how much they trust other public keys on Proofnet. A "trust ranking" is a ranking between 0 and 1. User X might trust User Y with 0.9, and then User Y trusts User Z with 0.5, and thus User X would know that User Z should be trusted with about 0.45. Trust is relative; objective rankings are meaningless because new identities can be created at will and thus someone can cheaply rank theirself up. Only relative rankings through the web of trust make sense.
The message type is "proofnet:trustrank". The message format looks like this:
(message)=(from public key)(public key being ranked)(signed decimal and message)(fixed point decimal from 0 to 1)(optional utf-8 text message explanation)
To send out a scammer alert, you can give someone a ranking of 0 and put an enormous amount of work into the message. To know trust rankings, you would need to monitor the network continuously, or download them from a service that saves all of them. There could a be a channel devoted specifically to trust rankings, e.g. "Proofnet Trust Channel".
A similar application could be created for a social network, e.g. expressing someone as a friend rather than expressing how much you trust them. The social network could include any arbitrary features we wanted, like posting messages on a wall, posting and labeling images, etc. Since most people will not be on the network continuously, there would need to be services that are on it continuously and will give you the information later. Or rather than using centralized services for this, old broadcasted data could be shared via BitTorrent.
Bitcoin
Proofnet is useful to bitcoin in at least two ways. 1) It might be valuable to encode bitcoin transaction broadcasts in this format, making it harder to spam the network with transactions. 2) The trust network would enable people to know who they can trust on the internet and thus who is worth giving bitcoins to.
Other applications
It is fun to consider how many applications of Proofnet there could be.
* It would be possible to implement decentralized torrent trackers and indexes on Proofnet, eliminating the only central weakness in BitTorrent file transfers. Peers who are seeding a given torrent could broadcast themselves on the "example.torrent" channel. Torrent files could be searched for, requested, and sent via broadcasted messages, in a way that is completely anonymous, except for the IPs that you connect to for file transfer.
* A decentralized data storage service. A request could be sent out to save a file of a certain file size, and responses would come back from public keys with a price given in bitcoins. The user could look at the trust ranking of the public keys of the responses, and also at the prices, and decide who is worth saving their data with. This would be a completely anonymous file storage service, but you would still know if you can trust the people you are dealing with.
* A decentralized replacement for DNS. Names could be assigned to IPs by using the information on the trust network or social network. e.g., if your friend who has a trust rating of 0.9 thinks wikileaks.org points to some address, and your enemy with a ranking of 0.001 thinks wikileaks.org points to another IP, you would trust your friend. This is very different than Namecoin, which aims to establish objective control over names. But in this case, names would be controlled by popular opinion.
* A decentralized Twitter. The message format is just like "proofnet:textfrom" except limited to 140 characters. Messages longer than that would not be rebroadcast. To follow someone, just log on to the channel and display broadcasts from that user, or use a website that logs the history of that channel.
* It would be possible to broadcast images, streaming audio, and streaming video. It would be necessary to put information about the chronological order of a piece of information for anything that is streamed, and there would be no guarantee that every piece would reach every node. There are also bandwidth limitations; one channel would probably only be able to support one streaming video at a time.
* It would be possible to send alerts about important events. If a very important event occurs, then put a lot of work into the message, and people will quickly learn about it all over the world. The more work, the more impressive the message, and the more people will notice.
The P2P network
It does not actually matter what technology is used to connect peers together, although the standard client will obviously support TCP/IP and could be extended to support IPv6 in the future. Nodes can find each other in many ways: 1) An IRC channel could be established for the purpose of allowing nodes to find each other, like with Bitcoin. 2) Nodes could broadcast themselves on the network that they are allowing connections to get more nodes connected that are already on the network. 3) Lists of nodes could be pasted on the web or shared in torrents.
PyProofnet: The first client
I have not actually written a client for Proofnet, but I will if people express interest. It will be written in Python.
Conclusion
Proofnet is a decentralized, general purpose information broadcast system operating on a P2P network and based on proof-of-work, where anyone can send a message to their peers, but messages are only rebroadcast by their peers if they demonstrate enough work. Any kind of information can be broadcast. It is reasonably secure against spammers since spammers would need to do a lot of work to spam their messages. There are many conceivable applications that could be developed on top of this system. Since it is decentralized, it has the advantage over centralized systems that no one can limit what you do with it. I will write an example implementation in Python if people are interested. Comments and criticisms appreciated.
|