Using mixing transactions to improve anonymity


Meni Rosenfeld:
In this post I will discuss what I call a "mixing transaction", which allows you to move bitcoins from one of your addresses A1 to another address of yours A2, without a clear trace linking the two addresses. This is similar to the ideas presented in this comment, and this one. Such transactions can act as a primitive to help improve anonymity, though how exactly to incorporate them in a complete anonymity solution is beyond the scope of this post.

Mixing transactions can be done completely p2p without a centralized service which risks fraud, leaking information, being shut down and so on.

Step 1: 2-party mixing transaction

User A who wants to perform a mixing transaction finds user B who also wants to make such a transaction (peer finding for this will probably be done on a dedicated p2p network). A wants to send X bitcoins from address A1 to A2, and B wants to send X bitcoins from B1 to B2. They exchange information about A1, A2, B1, B2 and construct a transaction with 2 inputs and 2 outputs. One input is X BTC from A1, the other is X BTC from B1 (we can assume A1 and B1 have a redeemable output with exactly X BTC, since such an output can be prepared in advance from a larger output); one output is X BTC to A2, the other is X BTC to B2. The inputs and outputs appear in random order. A and B sign the transaction with the private keys corresponding to A1 and B1, respectively, and broadcast the signed transaction to the network.

A will have X BTC deducted from A1 and X added to A2, as desired; likewise for B. Of course, no party can run away with the money because the funds are exchanged in a single atomic transaction. An outside observer cannot deduce from this transaction that A1 and A2 are related; they have equal chance to belong to the same person or to two unrelated people. If A routes his funds through 10 mixing transactions, the address where the funds end up is related as strongly to A's original address as to 1023 other addresses, each of which could belong to a different person. This indeed makes tracing difficult. Of course, this will require that a large number of people participate in this anonymization method.

The weakness of a 2-party transaction is that B knows certainly that A1 and A2 belong to the same person and could leak this information. This may not be devastating if enough mixing transactions are made, but we want a more robust system where privacy is maintained with as few transactions as possible.

Step 2: 3-party mixing transaction

User A finds peers B and C. A wants to send X BTC from A1 to A2, and so on. They exchange information about A1, B1 and C1; however, they do not openly share the addresses A2, B2 and C2. We want each participant to know all addresses A2, B2 and C2, but not know which belongs to whom. That is, A shouldn't know whether B2 belongs to B and C2 belongs to C or vice versa.

To achieve this, the following protocol can be used:
1. A creates two keypairs, (priv_B, pub_B) and (priv_C, pub_C), for a one-way, commutative encryption technique.
2. A sends pub_B to B and pub_C to C.
3. B encrypts B2 with pub_B to obtain E_B(B2), and sends E_B(B2) to C. C encrypts C2 with pub_C to obtain E_C(C2), and sends E_C(C2) to B.
4. B encrypts E_C(C2) with pub_B to obtain E_BC(C2). C encrypts E_B(B2) with pub_C to obtain E_BC(B2), and sends E_BC(B2) to B.
5. B sends E_BC(C2) and E_BC(B2) to A in a random order.
6. The cryptosystem is commutative, so A can't deduce which of E_BC(C2), E_BC(B2) was encrypted with pub_B first and then pub_C. But she can use priv_B and priv_C to decrypt both messages and obtain B2 and C2.
7. A sends A2, B2 and C2, in a random order, to B and C.

A only receives encrypted versions of B2 and C2 in a random order so can't deduce which belongs to whom. B knows that E_C(C2) is C's encrypted address, but he doesn't know pub_C so he cannot verify which of A2, C2 belongs to C.

Now that A, B and C know A2, B2 and C2, they can construct a transaction with 3 inputs and 3 outputs and sign it. For an outside observer, A2 is mixed with 3 addresses A1, B1 and C1. Even if B is compromised, all that is revealed is that A2 corresponds to either A1 or C1 (branching factor 2), rather than that it definitely corresponds to A1.

However, if B and C collude or are both compromised, the relation between A2 and A1 does leak. This seems unlikely, but it still means we may be interested in transactions with more mixing power.

Step 3: n-party mixing transaction

For an n-party mixing transaction, n peers need to exchange destination address information without any other knowledge given about which address belongs to whom. I don't know of an efficient cryptographic protocol to achieve this, though it can be done with generic secure multi-party computation. Alternatively, an external anonymization network could be used for each user to broadcast his address without revealing the source.

For an n-party transaction where each user has an input of X BTC and an output of X BTC, compromising m participants only means that each of the n-m uncompromised destination addresses is known to be associated with one of the n-m uncompromised source addresses (branching factor n-m).

Having a mixing transaction with many parties also allows to be more flexible with the input/output amounts. The inputs could be chosen to be any value among some standard values (e.g. powers of 2), and each user can have several inputs with 1 output. Since there are multiple inputs of each size, it should be impossible to deduce the relations between inputs and outputs. If the input amounts are generic, it will be possible, information-theoretically, to match inputs to outputs by finding which inputs sum to a given output amount; however, depending on the specifics, it might be intractable computationally (this is related to the subset sum problem).

One weakness with large mixing transactions is availability. Since all parties need to collaborate to complete the transaction, unavailability of either one of them will create extra work for reorganizing the transaction. This is especially true if one of the supposedly interested parties is in fact a DoS attacker planning to quit and blow up the deal. If he makes sure to be part of every large mixing transaction, he can make it hard to have any such transaction go through.

Good idea, this could bring more casual anonymity to bitcoin.

Nice and interesting.  I like how so many centralized problems can be solved with strong decentralized encryption.  It will take time but I imagine decentralized solutions will eventually be developed for most issues.

Full MPC is not practical yet, but the 2-party system is actually very strong.  When repeated, it has identical security to a system that makes very large mixing transactions.

Something like this:

An anonymous (e.g. TOR) rendezvous point is established for people to connect and randomly pair off with each other.   Those paired up write a 2-input, 2-output TX to fresh addresses.  They do this repeatedly pairing off each time.  Note a special kind of service is needed for this where clients keep no long-term identity.  If you wanted to use, say, IRC, you would need to disconnect and reconnect with a new name to the IRC server after each mix step.

The hard part is, as usual, sybil attacks.  I.e., I could connect 10,000 clients to that rendezvous point and whenever you paired off, you would be paired with me.  I could then record all the input-output mappings.

Preventing the above requires some kind of proof-of-work.  One option is CAPTCHA: you generate a captcha and are paired with the first person to solve it.  Unfortunately, this means you need to be on-line to mix coins.  Another is computational, like an easier mining problem, but if cost is negligible to honest people, it will still be not-to-bad to dishonest ones.

But once you can say something like "at least 20% of the time, I'm paired with someone honest who's not recording the input-output mappings", you can get the same security guarantee as you would with full MPC.  The most intuitive way to see this is just to imagine the distribution of mappings seen by an attacker.  To begin, they know who you are.  But each mixing step between honest parties adds entropy to this distribution.  From some basic random walk theory, if there are N total participants, and a p fraction of them  are honest, then after O((log N)/p) random pairings, the distribution of input-output addresses is going to be very close to uniformly random over all pN honest participants (the same distribution as if all N made a single transaction via MPC).

tl;dr 2-party system is just as strong as N-party if repeated O(log N) times and an attacker controls only a constant fraction of participants.


[0] Message Index