I'm curious what the best way is to coinjoin my dollar cost averaging UTXOs. I have about 0.15 BTC now in UTXOs after a few years of averaging, but they are all from Coinbase and of varying sizes. What is the most private way to coinjoin them? Do I send them all at once to Wasabi wallet? Then transfer it to cold storage from there?
Coinbase already knows how much BTC you have and which addresses those coins are in, so if you send all your coins at once, no new information is revealed to them. If you wanted to add privacy from the perspective of others watching the blockchain besides Coinbase, you could gradually send your UTXOs one at a time and do 1-2 coinjoin rounds in between deposits.
And what if I only want UTXOs that are larger than 0.02 BTC in my cold storage?
You can send specific-sized amounts in a coinjoin using Wasabi's RPC interface:
https://docs.wasabiwallet.io/using-wasabi/RPC.html#payincoinjoin

How do we avoid small UTXOs?
The creation of small UTXOs in coinjoins is what allows you to avoid creating change outputs. These small UTXOs won't keep getting split and piling up over time, though; they will get consolidated back into bigger UTXOs as you continue to coinjoin.
By the way, what algorithm actually determines the output amounts, and on whose side (coordinator or user) are they formed?
With Joinmarket, the coordinator is the taker and they choose the output amounts. Makers can still set their own personal minimums and ignore takers offering small coinjoins.
With WabiSabi, users choose their own output amounts. Coordinators can set their own minimums and maximums (at the start of the round).
If the amounts are formed on the user side, how does it happen that different users end up with the same outputs for non-standard amounts? If someone has already looked into this question, is there maybe a link to the right place in the code on GitHub?
Yes, getting clients to match amounts (despite never communicating directly with each other) is the main research problem that had to be solved for WabiSabi to be an effective coinjoin method. Here is the description of the solution from the mailing list post:
## Output Selection
The coordinator collects all input registrations and forwards them to all users. At this point, all clients know all inputs of this transaction. The goal now is to reach a Schelling point among users on which output denominations to choose, so that the anonset size of each denomination is sufficiently large.
First of all, it's a good idea to limit the denominations that the client will register. Some simulations confirmed that numbers with low Hamming weight are efficient, thus clients generate a list of standard denominations which are: powers of two; powers of three; two times powers of three; powers of ten; two times powers of ten; and five times powers of ten. However, some of those denominations which are very close to each other are removed, more so for larger values. Notice that this list of standard denominations is the same across all rounds; it does not depend on specific inputs.
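To make this concrete, here is a minimal Python sketch of such a standard denomination list. The cap and the near-duplicate pruning rule here are illustrative assumptions, not the actual Wasabi client code (which is written in C#):

```python
# Sketch of the standard denomination list: powers of 2, powers of 3,
# 2x powers of 3, powers of 10, 2x powers of 10, and 5x powers of 10.
# The cap and the 2% near-duplicate pruning are illustrative assumptions.

def standard_denominations(cap=100_000_000):  # cap in sats, assumed for the example
    denoms = set()
    for base, multipliers in ((2, (1,)), (3, (1, 2)), (10, (1, 2, 5))):
        power = 1
        while power <= cap:
            denoms.update(m * power for m in multipliers if m * power <= cap)
            power *= base
    # Remove denominations that are very close to a larger kept one
    # (a simplification of the "more so for larger values" rule).
    kept = []
    for d in sorted(denoms, reverse=True):
        if not kept or d < kept[-1] * 0.98:
            kept.append(d)
    return sorted(kept)

print(standard_denominations(100))
# [1, 2, 3, 4, 5, 6, 8, 9, 10, 16, 18, 20, 27, 32, 50, 54, 64, 81, 100]
```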
We can further narrow the list of potential denominations that the client chooses, specifically for every round. This is a further Schelling point of which denominations the client prefers to choose. It is done with a deterministic frequency table, based on the inputs of the proposed transaction.
Take each input and greedily decompose it into the standard denominations, meaning every input has precisely one decomposition. [45 decomposes greedily into 32+10+3] Count the occurrences of every standard denomination in a frequency table. All standard denominations that have a count of 2 or larger are considered likely denominations.
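As a toy illustration of this deterministic step, greedy decomposition and the frequency count could look like the following Python sketch (DENOMS reuses the toy list from the sketch above; again, this is not the actual client code):

```python
from collections import Counter

# Toy list from the previous sketch: standard_denominations(100).
DENOMS = [1, 2, 3, 4, 5, 6, 8, 9, 10, 16, 18, 20, 27, 32, 50, 54, 64, 81, 100]

def greedy_decompose(amount, denoms=DENOMS):
    """Repeatedly take the largest denomination that still fits."""
    parts = []
    for d in sorted(denoms, reverse=True):
        while amount >= d:
            parts.append(d)
            amount -= d
    return parts

def likely_denominations(inputs, denoms=DENOMS, min_count=2):
    # (The real algorithm also excludes the largest input from this count,
    # as the post notes below.)
    freq = Counter()
    for value in inputs:
        freq.update(greedy_decompose(value, denoms))
    return {d for d, n in freq.items() if n >= min_count}

print(greedy_decompose(45))                 # [32, 10, 3], as in the example above
print(likely_denominations([45, 45, 13]))   # {32, 10, 3}
```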
Notice that currently we remove the largest input from this frequency table calculation. This is so that the whale does not mix alone by himself. Also notice that individual inputs, and not input sums, are decomposed. This is because we found that generating the frequency table based on one input at a time leads to a more accurate Schelling point, which increases the anonset count of the finally chosen denominations. Finally, notice that we only calculate one single decomposition for each input, the greedy one, but we could also calculate multiple different [or all possible] decompositions for each input, thus generating a larger frequency table and more likely denominations.
Whereas the frequency table should be deterministic as a Schelling point, the actual user's input sum must not be deterministically decomposed, otherwise an adversary who knows the input sum would find out which denominations the client chose. [but not which of the equal outputs he got]
The client takes his input sum [minus fees] and brute-force decomposes it into all possible groups of the likely denominations [those with a high count in this round's frequency table]. We found that in most cases, even with this reduced list of likely denominations, any input sum can be decomposed into up to eight outputs. [Sometimes the wealthiest user gets a non-standard change amount] However, each decomposition has some small amount of sats left over, which is not put into an output value, but instead pays miner fees.
Sort this list of all possible output groups ascending by leftover amount, and remove those groups whose leftover amount is 1.3x larger than that of the best option. Further, remove a group if its largest denomination is similar to that of another group. [So far everything is deterministic, given all coinjoin inputs and the user's input sum]
Out of this shorter list of output amount groups, shuffle and pick one at random. These are the non-deterministic denominations which will be registered as the actual coinjoin outputs. If there were no shuffle and the amount group with the lowest loss were always selected, users would save sats. But arguably having this randomness here increases privacy sufficiently to justify the slight increase in leftover amount cost.
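A rough Python sketch of this selection step, under toy assumptions: the enumeration is naive, "1.3x larger" is read as leftover greater than 1.3 times the best leftover, and the "similar largest denomination" rule is simplified to keeping one group per largest denomination. The real client prunes this search and differs in the details:

```python
import random
from itertools import combinations_with_replacement

def candidate_groups(target, likely_denoms, max_outputs=8):
    """All multisets of likely denominations (up to max_outputs) that fit under target."""
    groups = []
    for r in range(1, max_outputs + 1):
        for combo in combinations_with_replacement(sorted(likely_denoms), r):
            total = sum(combo)
            if total <= target:
                groups.append((target - total, combo))  # leftover is paid as miner fee
    return groups

def choose_outputs(target, likely_denoms, max_outputs=8):
    groups = sorted(candidate_groups(target, likely_denoms, max_outputs))
    best_leftover = groups[0][0]
    seen_largest, finalists = set(), []
    for leftover, combo in groups:
        if leftover > 1.3 * best_leftover:   # drop groups much worse than the best
            continue
        if max(combo) in seen_largest:       # keep one group per largest denomination
            continue
        seen_largest.add(max(combo))
        finalists.append(combo)
    # Random pick instead of always taking the cheapest group: costs a few sats
    # more on average, but makes the chosen decomposition unpredictable.
    return random.choice(finalists)

# Example: decompose an input sum of 45 into toy likely denominations.
print(choose_outputs(45, {3, 10, 16, 20, 32}))  # e.g. (3, 10, 32)
```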
Again, while choosing their own outputs, clients do not know which outputs other clients registered. If a client had this information, it could possibly increase the quality of its own output registration substantially.
Notice there are different decomposition strategies for the frequency table [greedy] and the input sum [brute-force all]. Maybe having the same decomposition strategy in both places would lead to better results.
Notice further that there is no rank ordering of the possible denominations based on some ambiguity score or entropy score. Rather, the choice is random, and in some cases this might result in suboptimal outcomes.
Here are some results of our simulation of the current algorithm:
| Metric | 50 inputs, 15 users | 300 inputs, 70 users |
| --- | --- | --- |
| Median output count | 98 | 442 |
| Median change count | 4 | 0.5 |
| Median change percent | 3.2 | 0.3 |
| Median out anonsets | 3.5 | 9.6 |
| Median leftovers | 481 | 394 |
Thank you for your consideration to review!
Skol
Max