Bitcoin Forum
September 19, 2024, 04:43:59 AM *
News: Latest Bitcoin Core release: 27.1 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1] 2 3 4 »  All
  Print  
Author Topic: CoinShuffle: Practical Decentralized Coin Mixing for Bitcoin  (Read 23776 times)
TimRuffing (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 10


View Profile
April 12, 2014, 10:07:39 AM
Last edit: April 24, 2014, 04:55:28 PM by TimRuffing
Merited by ABCbits (9)
 #1

Hello,

we (Tim Ruffing, Pedro Moreno-Sanchez, and Aniket Kate) are group of researches at Saarland University in Germany.
We have written a research paper in which we propose a new coin mixing protocol to improve anonymity in Bitcoin. The protocol is called CoinShuffle.

The key innovation is that it is decentralized and it does not require a mixing server. Among other advantages, this means that no mixing server learns which output addresses and input addresses in a mixing transaction belong together. Since CoinShuffle is based on the idea of CoinJoin, theft of coins is excluded as well. CoinShuffle does not require complex cryptography and performs well in practice, even in scenarios with a large number (about 50) of participants.

A preliminary version of the paper is available.

This is the abstract:
Quote
The decentralized currency network Bitcoin is emerging as a potential new way for performing financial transactions across the globe. Its use of pseudonyms towards protecting users' privacy has been an attractive feature to many of its adopters. Nevertheless, due to the inherent public nature of the Bitcoin transaction ledger, users' privacy is severely restricted to linkable anonymity, and a few Bitcoin transaction deanonymization attacks have been reported so far.

In this paper, we propose CoinShuffle, a completely decentralized Bitcoin mixing protocol that allows users to utilize Bitcoin in a truly anonymous manner. CoinShuffle is inspired from the accountable anonymous group communication protocol Dissent and enjoys several advantages over its predecessor Bitcoin mixing protocols. It does not require any (trusted, accountable or untrusted) third party and it is perfectly compatible with the current Bitcoin system. CoinShuffle introduces only a small communication overhead for its users, while it completely avoids additional anonymization fees and minimizes the computation and communication overhead for the rest of the Bitcoin system.


A comparison to other approaches to improve anonymity, e.g., Zerocoin and Mixcoin, can be found in Section 8 in the paper.

We would be happy to hear your feedback and critique about our proposal. All details can be found in the paper.
sundance
Newbie
*
Offline Offline

Activity: 22
Merit: 29


View Profile
April 13, 2014, 04:47:37 AM
 #2

A very nice paper, guys, thanks. Will be in touch to exchange more ideas.

I have a similar paper with another approach that I will share with the community.
wladston
Full Member
***
Offline Offline

Activity: 157
Merit: 102


Always remember to be awesome.


View Profile WWW
April 16, 2014, 01:34:30 PM
 #3

Would really like to know the opinion of Gmaxwell and the rest of the dev team regarding this research. To me this looks like something that could be well integrated into Bitcoin Thin clients.

Love programming? Checkout my book at http://code.energy/book
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
April 16, 2014, 02:44:42 PM
 #4

The effect is that your output is mixed with all the honest participants?  If there was 3 honest and 7 colluding, then your output is sent to one of the 3 remaining outputs at random.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
TimRuffing (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 10


View Profile
April 17, 2014, 12:02:35 PM
 #5

Since you cannot know who is honest, you always mix with everybody.

To clarify:
Assume there are 10 participants in the protocol. 3 of them are honest, including you, and 7 are malicious and collude. Assume the protocol has been finished successfully.
If the 7 malicious guys would like to find out which of the 10 output addresses is yours, they always know that your address is not among their 7 addresses, just because they know their own 7 addresses. Thus your address can only be among the 3 remaining addresses.
(This "attack" is always possible in every form of coin mixing, no matter how you organize the mixing.)

However, the protocol ensures that the 7 guys cannot tell which of the 3 honest output addresses is your output address.

So in a nutshell: You shuffle all 10 addresses, but shuffling your address with the 7 malicious addresses does not help you. In the end, you get "anonymity among the 3 honest addresses".
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
April 17, 2014, 12:36:25 PM
 #6

So in a nutshell: You shuffle all 10 addresses, but shuffling your address with the 7 malicious addresses does not help you. In the end, you get "anonymity among the 3 honest addresses".

Yeah, that's what I meant.  You get maximum mixing.

A client could be created that constantly mixes your coins.  Leaking zero info would be hard though. 

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
TimRuffing (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 10


View Profile
April 24, 2014, 11:41:51 AM
Last edit: April 28, 2014, 02:35:19 PM by TimRuffing
 #7

As suggested by some readers, to illustrate our idea better, let me give a small example with (say) four participants Alice, Bob, Charlie, Dave. The participants have exactly 1 BTC at each of their respective addresses A, B, C, D. Assume the participants already know that they would like to run the protocol with each other and they know the addresses of each other. (Finding other participants can be done via a P2P protocol, for example.)

The participants create fresh addresses A', B', C', D' but do not show them to each other. The goal of CoinJoin-based mixing is to create a mixing transaction with input addresses A, B, C, D and output addresses A', B', C', D' to hide the relation between the coins and their owners. (If it is not clear to you why such transactions are possible, I recommend reading the thread about CoinJoin). However, if we would stick to that particular order A', B', C', D' of output addresses, everybody would learn that A belongs to A', B belongs to B', and so on. So we need to shuffle the list of output addresses to make sure that the linkage of input and output addresses remains hidden. But just shuffling the output addresses in the created transaction does not suffice: For example, if everybody just announced his output addresses during the protocol in plain, i.e., Alice announces A', everybody would learn that A' belongs to Alice. So we have to make sure that the messages that are sent during the protocol do not break the anonymity. CoinShuffle solves exactly this problem.

A successful run of the protocol looks as follows: (Note that the description is simplified; the full details are in the paper.)

All messages are signed using the private signing key belonging to the input address of the sender of the message. I'm omitting the signatures in the following description to simplify the presentation.

Phase 1: Key exchange
Each participant (except for Alice) creates an key pair of a public key encryption scheme, consisting of a public encryption key and a private decryption key. We call the public encryption keys ekB, ekC, and ekD. Each participant announces his public encryption key, signed with the signature key corresponding to his/her input address.

Phase 2: Shuffling
Once everybody knows the public encryption key each other, the shuffling can start:

Alice encrypts her output address A' with all the encryption keys, in a layered manner. That is, Alice encrypts A' first for Dave, obtaining enc(ekD, A'). Then this ciphertext is encrypted for Charlie, obtaining enc(ekC, enc(ekD, A')) and so on for Dave. This resulting message is sent to Bob:
Alice -> Bob: enc(ekB, enc(ekC, enc(ekD, A')))

Bob gets the message, decrypts it, obtaining enc(ekC, enc(ekD, A')).
He also creates a nested encryption of his address, obtaining enc(ekC, enc(ekD, B')).
Now Bob has a list two ciphertexts, containing A' and B'. Bob shuffles this list randomly, i.e., either exchanges the two entries or leave them. Say we are in the case that they are exchanged. Bob sends the shuffled list to Charlie:
Bob -> Charlie: enc(ekC, enc(ekD, B')) ; enc(ekC, enc(ekD, A'))

Participant C does the same: He decrypts the two entries in the list, adds his own entry and shuffles the list:
Charlie -> Dave: enc(ekD, B') ; enc(ekD, C') ; enc(ekD, A')

Dave does the same again: He decrypts all entries, obtaining B', C', A'. He adds his own address D' and
shuffles the list. The resulting shuffled list is sent to everybody:
Dave -> everybody: D', B', C', A'

Phase 3: Creating the transaction
Every participant receives the list of output addresses and can verify that his output address is indeed there. If yes, he signs the transaction. If, e.g., Bob sees that his address is not there, he would lose his coin by performing the transaction, so he obviously does not want to sign. (This is the main idea of CoinJoin.)
In the case that Bob's address is not there, somebody must have cheated during the run of the protocol. Bob complains and the participants enter an additional phase to find out who cheated. CoinShuffle makes sure that this "blame phase" always exposes at least one cheating participant (and none can be accused falsely). This cheating participant can then be excluded from a subsequent run of the protocol: Say Alice cheated. Then Bob, Charlie and Dave can run the protocol again without Alice.

The key point is that in phase 2, only the participant that performed the shuffling knows the relation between the messages in the list that he received and the messages in the list that he sent.
For example, only Charlie knows that he left the message containing B' in the first position, because the encryption ensures that nobody can relate enc(ekC, enc(ekD, B')) and enc(ekD, B'). But even Charlie does not know that this was the message with Bob's address.
In the end, all addresses of honest participants are shuffled as explained in my previous posting. Nobody knows the permutation. (A detailed argument can be found in the paper.)


Note: The protocol works even if participants do not have exactly 1 BTC at their input addresses. It suffices that they have at least 1 BTC. In that case, they create a transaction that sends 1 BTC to each of the shuffled output addresses and the remaining coins to a change addresses that the users announce in the beginning. This can be done as for normal Bitcoin transactions. (This idea is also described in CoinJoin already.)

By the way, we managed to improve the execution time. Using our prototype implementation, a protocol run with 50 participants takes roughly 3 minutes now in the setting that we consider in the paper.

A comparison with other approaches
Mixcoin
The main innovation of Mixcoin is accountability for mixing servers (mixes): If the mix server steals coins, the user obtains a cryptographic proof of this theft and can hold the mix accountable. This is done in public: Everybody can verify this proof and the mix will hopefully lose its reputation, and nobody will use the mix in the future. The mix can still steal money but it will be caught and probably has to go out of business.
In contrast, the advantage of CoinShuffle is that it prevents stealing of coins in the first place instead of providing accountability only after the theft. Additionally, a centralized mixing server is not at all necessary in CoinShuffle.

Zerocoin / Zerocash / Anoncoin
Zerocoin (and the upcoming optimized Zerocash) are great because they provide "built-in anonymity" by using quite novel cryptography, e.g. ZK-SNAKRS. However, are their own currencies. Zerocoin and Zerocash are not compatible with Bitcoin, they need their own protocol extensions and chain. For instance, Zerocoin is being implemented in Anoncoin, an altcoin. CoinShuffle works directly on top of Bitcoin, without changing the Bitcoin protocol or forking the chain.

CoinSwap
Most important, the participants know which coins belong to which user in CoinSwap, so the anonymity is limited. Furthermore, CoinSwap needs at least 4 transactions and the corresponding fees. CoinShuffle needs only one transaction. However, CoinSwap is essentially a two party protocol, so it requires less interaction and coordination. The original CoinSwap thread provides a detailed comparison to CoinJoin, which provides the basis for CoinShuffle.
TierNolan
Legendary
*
Offline Offline

Activity: 1232
Merit: 1104


View Profile
April 24, 2014, 12:29:30 PM
 #8

Interesting.

More info on the blame system would be nice.

How do you handle a case where a participant refuses to send on a message to the next person?

Does each of the communications have to kept secret? 

When you say Bob -> Charlie, do you mean it is a broadcast?  Are the messages digitally signed too?

You should explain the blame stage too.

Assume that Bob cheats, he includes 2 addresses for himself.

Alice to Bob: enc(ekB, enc(ekC, enc(ekD, A')))

Bob to Charlie: enc(ekC, enc(ekD, B')) ; enc(ekC, enc(ekD, X'))

Charlie to Dave:  enc(ekD, B') ; enc(ekD, C') ; enc(ekD, X')

Dave to all: D', B', C', X'

Alice complains that her address is missing.

Each participant publishes their private key.  The process can be stepped backwards, until the "mistake" is detected.

1LxbG5cKXzTwZg9mjL3gaRE835uNQEteWF
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
April 24, 2014, 05:56:17 PM
 #9

With reference to the overview diagram given above and assuming all participants are honest and all communications secure:

Once the procedure is complete and everyone receives the shuffled addresses from Dave, it appears that Bob can determine Alice's new address by encrypting each of the new addresses first with Dave's key, then with Charlie's, and comparing the results with the data he received from Alice.  What is it that I'm missing?
wladston
Full Member
***
Offline Offline

Activity: 157
Merit: 102


Always remember to be awesome.


View Profile WWW
April 24, 2014, 07:06:31 PM
 #10

@teukon's observation seems relevant. However if Alice includes some random data along with her address, but Dave only publishes a list of addresses (without the extra random data) them Bob couldn't determine Alice's address.

Love programming? Checkout my book at http://code.energy/book
BitCoinDream
Legendary
*
Offline Offline

Activity: 2380
Merit: 1205

The revolution will be digital


View Profile
April 24, 2014, 07:16:48 PM
 #11


(This "attack" is always possible in every form of coin mixing, no matter how you organize the mixing.)


I think a trusted third party approach is required to nullify this attack...

wladston
Full Member
***
Offline Offline

Activity: 157
Merit: 102


Always remember to be awesome.


View Profile WWW
April 24, 2014, 07:23:15 PM
 #12

@BitCoinDream, @TimRuffing is correct, this attack (all other participants colluding) isn't fixable by mixing. If Bob, Charlie and Dave are malicious, no matter how you mix the coins, these bad guys can look at the output transaction and determine Alice's address (it will be the only address in the transaction not owned by them).

Love programming? Checkout my book at http://code.energy/book
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
April 24, 2014, 07:56:25 PM
 #13

@teukon's observation seems relevant. However if Alice includes some random data along with her address, but Dave only publishes a list of addresses (without the extra random data) them Bob couldn't determine Alice's address.

This was my first thought.  I checked out the paper to see if something along these lines had been included but found nothing.  I'd add to this and suggest that all participants add some random data to each level of the layered encryption.  If they add random data just to their address at the beginning, then Bob and Dave, working together, could determine Alice's (and therefore also Charlie's) address.

I assume something in the presented method tackles this, for if everyone is able to rebuild and compare in this way, then there's simply no point to the layered encryption in the first place.  Participants could pass unencrypted addresses with the same result.

Another potential remedy is to undergo multiple rounds, where each participant has a chance to be somewhere in the middle of the chain.  Assuming no more than one dishonest node, I'm fairly sure "maximum mixing" can be achieved in O(log(n)) rounds (where n is the number of participants) but it's not clear to me how devastating colluding members would be in this model.

Clarification would be appreciated.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
April 24, 2014, 08:06:14 PM
 #14

I think a trusted third party approach is required to nullify this attack...

Perhaps your idea of a trusted third party in this situation is just a special case of an honest 5th party, called Erik say.
TimRuffing (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 10


View Profile
April 24, 2014, 10:29:09 PM
 #15

How do you handle a case where a participant refuses to send on a message to the next person?

Does each of the communications have to kept secret?  

When you say Bob -> Charlie, do you mean it is a broadcast?
The communication does not have to be secret. And you already indicate a possible answer to your first question: In the case that a participant, say Charlie, claims not to have received a message from his predecessor, broadcasts are a possibility to find out which of the parties really failed. Without going into details, the PeerReview protocol is one possible fully-fledged solution that we mention in our paper but explaining the whole idea would be too much for this thread. To demonstrate that those problems are solvable, I can sketch a simpler (but probably not so efficient) solution later if you are interested.

Are the messages digitally signed too?
Yes, all messages are signed using the private signing key belonging to the input address. I added that to the posting. (Sorry, I simplified too much with respect to that aspect. Wink)

You should explain the blame stage too.
Your example is already quite nice. It shows the most interesting case, namely what happens if something goes wrong during the shuffling. Then the participants publish their private decryption keys and everybody is able to replay every step of the shuffling to find out who misbehaved. (Publishing the decryption keys is not a problem, because the participants have to restart the protocol anyway. Somebody may be able to link, e.g., A and A', but no transaction involving A' is generated. The output address A' is just discarded and in the subsequent run, the users will create fresh output addresses and fresh encryption/decryption keys.)

Things can also go wrong in other phases too. Let me give one example, namely for phase 1: Somebody could send different encryption keys to different users in phase 1, e.g., Dave sends ekD1 to Bob and Charlie, but ekD2 to Alice. (This is called equivocation.) Dave will then learn that A' is Alice's output address: It's the only output address encrypted with ekD2. And if Charlie colludes with Dave, Charlie would not report that he obtained ciphertexts encrypted with different keys during the encryption. To exclude this attack, the full protocol contains an additional equivocation check between phase 2 and 3: Everybody signs the list of all encryption keys and broadcasts the list together with the signature.  Only if everybody has the same view on the encryption keys, the protocol continues normally. (I did not include this step the posting above for simplicity, the goal of the posting was rather to get the core idea across.)
If the participants have different views on the list of keys, this will be detected in this additional step. In the example above: Alice would publish the signed list (ekB, ekC, ekD1) but Bob would publish (ekB, ekC, ekD2). In that case, the blame phase would be entered too. From the point of view of the users, there are two possible cases now: Either Dave has really equivocated, or one of Alice and Bob is claiming to have received a key from Dave that he/she actually has not received. To find out who is malicious, Alice and Bob can just publish the signed messages that have they received from Dave in phase 1. That will clarify if they have published a wrong list of keys or if Dave has equivocated by sending different keys. So one malicious user is exposed at the end of the blame.

---

@teukon's observation seems relevant. However if Alice includes some random data along with her address, but Dave only publishes a list of addresses (without the extra random data) them Bob couldn't determine Alice's address.
This was my first thought.  I checked out the paper to see if something along these lines had been included but found nothing.  I'd add to this and suggest that all participants add some random data to each level of the layered encryption.  If they add random data just to their address at the beginning, then Bob and Dave, working together, could determine Alice's (and therefore also Charlie's) address.
There seems to be a misconception about encryption in general. Every secure* encryption scheme always uses randomness for the encryption to make sure that encrypting the same message twice does not yield the same ciphertext. This is exactly to exclude attacks like the one described above; such attacks are a general problem in cryptography,  not only in this protocol. The added randomness is built in the encryption algorithm itself, one does not have to add randomness manually to the message before giving it to the algorithm. One typically leaves the randomness implicit: When I write enc(ek, m), I actually mean enc(ek, m, r), where r is fresh randomness.
Try it: Take any encryption tool and try to encrypt the same message twice. Wink

* This holds for every standard cryptographic definition of "secure". In particular, it holds for IND-CCA, the security definition that we require to be achieved by the used encryption scheme.

---

@BitcoinDream:
A trusted third party does not help here! (That's why we would like to avoid it.) Mixing is not possible if you are the only honest user. You need at least one other user that is also honest.
Assume you are Bob. Further assume that Alice, Charlie and Dave are malicious and collude. No matter how you organize the mixing with a trusted third party or not, Alice, Charlie and Dave can observe the output of the mixing (e.g., by looking at the blockchain). Thus they observe four output addresses A', B', C', D'. Because they know that A', C' and D' are their own addresses, they can determine that B' is your address.
teukon
Legendary
*
Offline Offline

Activity: 1246
Merit: 1011



View Profile
April 24, 2014, 11:34:37 PM
 #16

@teukon's observation seems relevant. However if Alice includes some random data along with her address, but Dave only publishes a list of addresses (without the extra random data) them Bob couldn't determine Alice's address.
This was my first thought.  I checked out the paper to see if something along these lines had been included but found nothing.  I'd add to this and suggest that all participants add some random data to each level of the layered encryption.  If they add random data just to their address at the beginning, then Bob and Dave, working together, could determine Alice's (and therefore also Charlie's) address.
There seems to be a misconception about encryption in general. Every secure* encryption scheme always uses randomness for the encryption to make sure that encrypting the same message twice does not yield the same ciphertext. This is exactly to exclude attacks like the one described above; such attacks are a general problem in cryptography,  not only in this protocol. The added randomness is built in the encryption algorithm itself, one does not have to add randomness manually to the message before giving it to the algorithm. One typically leaves the randomness implicit: When I write enc(ek, m), I actually mean enc(ek, m, r), where r is fresh randomness.
Try it: Take any encryption tool and try to encrypt the same message twice. Wink

* This holds for every standard cryptographic definition of "secure". In particular, it holds for IND-CCA, the security definition that we require to be achieved by the used encryption scheme.

Aha, ok.  I suspected something like this had been baked in at a low level.  I was just surprised when I didn't see something like your "enc(ek, m, r)" in Section 5.2, or a note in Section 5.1 explaining that this was included in the definition of "Enc", or even a note about this in the otherwise quite verbose Section 6.1 - Unlinkability.

I think I was thrown off the scent a little by the use of "the ciphertext" from Section 5.1.

Quote
We denote by Enc(ek; m) the ciphertext that encrypts the message m with the encryption key ek.

Anyway, thanks for the explanation.  I was sure it wasn't a problem; I just wanted to understand.
TwinWinNerD
Legendary
*
Offline Offline

Activity: 1680
Merit: 1001


CEO Bitpanda.com


View Profile WWW
July 05, 2014, 03:10:14 PM
 #17

Funny that a "non-bitcoin"-coin will be the first(probably) to implement this: https://twitter.com/comefrombeyond/status/485429369268350977

#NXT!

bahamapascal
Hero Member
*****
Offline Offline

Activity: 695
Merit: 500



View Profile
July 05, 2014, 03:56:03 PM
 #18

Funny that a "non-bitcoin"-coin will be the first(probably) to implement this: https://twitter.com/comefrombeyond/status/485429369268350977

#NXT!

Awesome! Not sure, but I think Nxt seems to have the biggest developer team behind it, they really role out one innovation after another  Shocked

If all those innovations prove to be working, which they seem to, maybe Bitcoin can copy some of them.Would be cool!

msin
Legendary
*
Offline Offline

Activity: 1470
Merit: 1004


View Profile
July 05, 2014, 04:26:16 PM
 #19

Awesome! Not sure, but I think Nxt seems to have the biggest developer team behind it, they really role out one innovation after another  Shocked

If all those innovations prove to be working, which they seem to, maybe Bitcoin can copy some of them.Would be cool!


Not sure about the biggest, but definitely the best looking, which I think is the most important aspect of any dev team.  Anyway, looking forward to coinshuffle on Nxt.
bluemeanie1
Sr. Member
****
Offline Offline

Activity: 280
Merit: 257


bluemeanie


View Profile WWW
July 05, 2014, 05:11:04 PM
 #20

Funny that a "non-bitcoin"-coin will be the first(probably) to implement this: https://twitter.com/comefrombeyond/status/485429369268350977

#NXT!

How is Come From Beyond going to implement Coinshuffle if NXT doesn't have inputs and outputs?

#Vaporware!

-bm

Just who IS bluemeanie?    On NXTautoDAC and a Million Stolen NXT

feel like your voice isn't being heard? PM me.   |   stole 1M NXT?
Pages: [1] 2 3 4 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!