Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: Meni Rosenfeld on November 27, 2012, 06:04:55 PM



Title: How to detect the change output?
Post by: Meni Rosenfeld on November 27, 2012, 06:04:55 PM
Given a transaction, what methods exist to detect which of the outputs is most likely to be the change?

How do different clients choose the order of the outputs? Assuming no order information can be used, what kind of other heuristics can be used?

We can assume that no custom anonymization techniques are used. Among other things it means that there will be at most 1 change output.

Looking at some transactions I've made, it looks like Bitcoin-qt always puts the change first, and blockchain.info MyWallet always puts the change last. Do they consistently do so? What about other clients?

Some suggested heuristics:

1. In sendmany, if the outputs have simple integer ratios between them, the transaction is likely to be some proportional distribution, and the one output which isn't an integer multiple of the GCD is the change. Example: In this transaction (http://blockchain.info/tx-index/34601186/f69173bf19362f7aff12081e41f119fd30b4e43bb39f837c1274e67bedf4915d), all outputs are a multiple of 0.01648442, except for the change.

2. Change outputs are likely to have more decimal places (or be otherwise more complicated). Example: In this transaction (http://blockchain.info/tx-index/34600764/0c75cbf6ddfe5596da171936db8b20151b76d9b48681f334c1b1b792ee247fe8), 114 is the real transfer, while 7.527 is the difference which has 3 decimal places since that's what the input had.

3. (Assuming no anonymity-targeted coin selection): The change must be lower in value than all inputs (otherwise that input would not have been selected). This is especially useful for transactions where multiple coins are combined.

4. Transactions involving addresses known to belong to some service can have their own rules to be deduced on a per-service basis.

Any other suggestions?


Title: Re: How to detect the change output?
Post by: Stephen Gornick on November 27, 2012, 08:49:21 PM
Any other suggestions?

The Bitcoin-Qt client will pull a new address from the key pool for change.  Thus any amount sent to an address that already has been used in a transaction is unlikely to be change.   Of course, with other clients, particularly Blockchain.info, this isn't necessarily the case.

There are some clues from anecdotes as well.  For over-the-counter trades, quite often upon receipt of funds the recipient will turn around and spend those funds that were received as soon as there has been a confirmation.

This alone doesn't help much but because the coins chosen by Bitcoin-Qt client will first try to use coins for a transaction where all coins selected already have at least six confirmations, then if a transaction spends a coin with fewer than six confirmations it was only done so because there were insufficient funds in the wallet of coins with six or more confirmations.

So take a transaction A with outputs X and Y.
Then another transaction, (transaction B), spends X after 1 confirmation.   More often than not, it would seem, the argument could be made that looking at both transactions together, that Y was the change in transaction A.   The odds are much less likely to be that X was the change but that change was then used in another transaction from the same wallet shortly thereafter.   It might even be likely that in transaction B, if there is more than one output that the smaller output is the change.  A person that spends funds shortly after receiving them will generally be spending all of the funds received or at least a large chunk of it.


Title: Re: How to detect the change output?
Post by: jgarzik on November 27, 2012, 08:51:40 PM
Given a transaction, what methods exist to detect which of the outputs is most likely to be the change?

It is intentionally randomized to make it harder to guess :)



Title: Re: How to detect the change output?
Post by: kjj on November 27, 2012, 09:18:34 PM
3. (Assuming no anonymity-targeted coin selection): The change must be lower in value than all inputs (otherwise that input would not have been selected). This is especially useful for transactions where multiple coins are combined.

This is a decent assumption most of the time, but once every few weeks, someone pops on IRC complaining that the coin selector overselected.  Most recent was, if I recall, something like 0.5 + 0.2 + 0.3 + 0.01 -> 1.0 + 0.01

On the topic of tracking in general, I would just say, "Welcome to the arms race".  The raw transaction API lets crazy people like me do silly things like create all transactions with 2 to 5 outputs of nearly equal size.  The next step will be a public mix-sender.


Title: Re: How to detect the change output?
Post by: etotheipi on November 27, 2012, 10:35:32 PM
Given a transaction, what methods exist to detect which of the outputs is most likely to be the change?

How do different clients choose the order of the outputs? Assuming no order information can be used, what kind of other heuristics can be used?


I spent a lot of time implementing coin selection in Armory.  A few of your assumptions will be incorrect if the tx was created with Armory.  Here's how Armory creates transactions:

(1) Coin-selection-evaluation method.   This is a method that scores a coin-selection based on various factors:  How many unique addresses are linked on the input side, the size of the transaction, the required fee, and... the output anonymity of the transaction (explained below).  All factors are scored between 0 (unfavorable) and 1 (favorable).

(2) I generate a dozen deterministic coin selection methods that accommodate different, specific wallet structures.  Those are added to the list. Then I generate some semi-random solutions (deterministic solutions like above, but with some perturbation).  

(3) Score all the solutions in part (2) with the eval method in part (1).  Default weights are hard-coded into armoryengine.py, but the user could modify them.  It defines how important each factor is:  most important is not using zero-conf.  Next is whether the transaction can be free (not unrelated to zero-conf).  Then number of input addresses linked.  Then tx priority, tx size, and output anonymity.  However, the user is free to make output anonymity most important (I do!  but only for fun, not because I actually care :)).

(4) A new address is retrieved from the wallet, a change TxOut is added to the recipient list, then the recipient list is shuffled.  Transaction is signed and broadcast.

The parts you care about:  

(1)  If the transaction is small relative to the wallet balance, it a couple of the deterministic solutions are approx 2x the output value and 3x, so that change and recip are approximately the same or at least same order of magnitude.  It may help clean up dust, it may help anonymity (but no guarantee for either).  But when the tx amount is small relative to the wallet balance, you will frequently get equal or reversed size outputs, in random order.  This violates your assumption that the change address is always smaller.  Even if Armory didn't do it, a wallet with one very large TxOut, must have a large change output if you want to send only a small amount to someone.  So even without Armory's technique, you'll see that assumption violated with any client.

(2) Recipients are always shuffled, though only with python RNG (don't need cryptographic RNG for this :))

(3) The fun part of Armory's "output anonymity" scoring is that it actually takes into account the number of significant digits in both change output and recipient.  Even though (recip, change) = (12.4, 11.8320385) are the same order of magnitude, it's pretty obvious which one is change which is output.  As such, I actually count the number of significant digits, and give the selection a higher score if it's closer, like (12.4, 11.83).  It gets an extra bonus -- a score actually above 1.0 -- if it finds a solution in which the change address has less sigfigs than the recip:  (12.4, 10.0).  i.e. deceptively-sized outputs is favored.  

How often does (3) trigger?  I don't know.  I've tested the evaluation code, though, and it definitely scores those solutions higher, but I don't know how often such deceptive solutions are ever found/included.  But if it ever happens, then your assumption about sig figs would be violated, as well.


Title: Re: How to detect the change output?
Post by: keystroke on November 29, 2012, 04:51:50 AM
Given a transaction, what methods exist to detect which of the outputs is most likely to be the change?

How do different clients choose the order of the outputs? Assuming no order information can be used, what kind of other heuristics can be used?


I spent a lot of time implementing coin selection in Armory.  A few of your assumptions will be incorrect if the tx was created with Armory.  Here's how Armory creates transactions:

(1) Coin-selection-evaluation method.   This is a method that scores a coin-selection based on various factors:  How many unique addresses are linked on the input side, the size of the transaction, the required fee, and... the output anonymity of the transaction (explained below).  All factors are scored between 0 (unfavorable) and 1 (favorable).

(2) I generate a dozen deterministic coin selection methods that accommodate different, specific wallet structures.  Those are added to the list. Then I generate some semi-random solutions (deterministic solutions like above, but with some perturbation).  

(3) Score all the solutions in part (2) with the eval method in part (1).  Default weights are hard-coded into armoryengine.py, but the user could modify them.  It defines how important each factor is:  most important is not using zero-conf.  Next is whether the transaction can be free (not unrelated to zero-conf).  Then number of input addresses linked.  Then tx priority, tx size, and output anonymity.  However, the user is free to make output anonymity most important (I do!  but only for fun, not because I actually care :)).

(4) A new address is retrieved from the wallet, a change TxOut is added to the recipient list, then the recipient list is shuffled.  Transaction is signed and broadcast.

The parts you care about:  

(1)  If the transaction is small relative to the wallet balance, it a couple of the deterministic solutions are approx 2x the output value and 3x, so that change and recip are approximately the same or at least same order of magnitude.  It may help clean up dust, it may help anonymity (but no guarantee for either).  But when the tx amount is small relative to the wallet balance, you will frequently get equal or reversed size outputs, in random order.  This violates your assumption that the change address is always smaller.  Even if Armory didn't do it, a wallet with one very large TxOut, must have a large change output if you want to send only a small amount to someone.  So even without Armory's technique, you'll see that assumption violated with any client.

(2) Recipients are always shuffled, though only with python RNG (don't need cryptographic RNG for this :))

(3) The fun part of Armory's "output anonymity" scoring is that it actually takes into account the number of significant digits in both change output and recipient.  Even though (recip, change) = (12.4, 11.8320385) are the same order of magnitude, it's pretty obvious which one is change which is output.  As such, I actually count the number of significant digits, and give the selection a higher score if it's closer, like (12.4, 11.83).  It gets an extra bonus -- a score actually above 1.0 -- if it finds a solution in which the change address has less sigfigs than the recip:  (12.4, 10.0).  i.e. deceptively-sized outputs is favored.  

How often does (3) trigger?  I don't know.  I've tested the evaluation code, though, and it definitely scores those solutions higher, but I don't know how often such deceptive solutions are ever found/included.  But if it ever happens, then your assumption about sig figs would be violated, as well.


Awesome.


Title: Re: How to detect the change output?
Post by: etotheipi on December 02, 2012, 05:05:56 AM
By the way, I just implemented coin-control in Armory (not released yet, but soon).  And I discovered a good use for it:  cleaning up BTC crumbles in my wallet.  For instance, I just select all addresses that have balances with more than 2 decimal places, then I create a transaction sending their combined balance:  the integer part goes back to me, the non-integer part gets donated to another project.   Then my wallet is a ton cleaner (its balance is 28.15 instead of 28.37302811).  I just started doing this with each of my wallets and it made me think of this thread:

(1) The recipient of my transaction is being sent an output with 8 decimal places, the change has zero decimal places -- the exact opposite of what you would expect of a normal transaction
(2) Future transactions will be much cleaner in terms of decimal places, because when I pay someone 10.1 BTC, my change is 18.05 BTC, which is fairly indistinguishable.  
(3) Not that it matters to determining change, but I think it's good for the network (albeit, very mildly) to be sweeping up the crumbs like that.

I suspect, when I do release coin control, there will be other users doing the same thing, further complicating anyone's efforts to determine what is the correct change address. 


Title: Re: How to detect the change output?
Post by: casascius on December 02, 2012, 05:14:33 AM
For instance, I just select all addresses that have balances with more than 2 decimal places, then I create a transaction sending their combined balance:  the integer part goes back to me, the non-integer part gets donated to another project.

Funny I'm not the only one who does this.  I normally work mostly with whole bitcoins (since my lowest physical coin is 1 BTC anyway), and then I use coin crumbles to make paper banknote wallets worth 0.1 BTC and carry them around to leave in tip jars etc.

Anyway, one useful (but less useful in a deterministic sense) heuristic when following the blockchain is to try and correlate behavior as well.  If I notice consistent behavior following a chain one way, it's probably the same person.  This is what led to me providing what was later determined to be a critical clue as to Goat's lost 400 BTC a while back: I notice that if I followed the chain a certain way I could see a pattern of a sender who frequently sent bitcoins in amounts like 3.995 1.995 8.995 which led me to suggest that it was a service that charged a 0.005 BTC transaction fee and took it out of the transaction amount.


Title: Re: How to detect the change output?
Post by: Hal on December 14, 2012, 01:43:57 AM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:

// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));

The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.


Title: Re: How to detect the change output?
Post by: MoonShadow on December 14, 2012, 02:00:06 AM
I wonder if having a choice in the automatic selection methods would be beneficial as well.  Consider this scenario.

Say I have a full desktop client, as well as a thin hardware client (think bitcoincard) that syncs with the desktop client on a regular basis.  So I want my desktop client to automaticly create simple transactions that the thin hardware client can use easily, and preferablely in combinations that don't have a change address at all.  (Thin hardware clients most likely would have only one, or only a few, addresses to work with, change addresses would be trivial to identify if they go back to the sending address)

So what I would want my desktop client to do is to regularly produce transactions wherein I have two change transactions; one for the thin client, in the useful denominations, and one with the true change back to the desktop client.  One time, the thin client might get a .01 transaction, another time it might get .02, and another .05; up through .1|.2|.5|1|2|5 so that it would have a selction of transactions that would permit it to combine a spending transaction to any exact amount, or almost any exact amount.  A full client could also use this same technique to, occasionally, produce a changeless transaction as well.  I'm not sure if this would be useful or not, but I would guess that it might complicate 'taint' tracking systems.


Title: Re: How to detect the change output?
Post by: Gavin Andresen on December 14, 2012, 02:30:16 AM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw...
Ay carumba, how did we not notice that for over two years?

I introduced that bug with the 'sendmany' command two years ago (commit b9d1ed85). This is why programmers should not be trusted to test their own code (I probably carefully tested to make sure the change position looked random when I send to more than one destination, and never tested the degenerate send-to-one case; sigh).



Title: Re: How to detect the change output?
Post by: keystroke on December 14, 2012, 07:26:04 AM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw...
Ay carumba, how did we not notice that for over two years?

I introduced that bug with the 'sendmany' command two years ago (commit b9d1ed85). This is why programmers should not be trusted to test their own code (I probably carefully tested to make sure the change position looked random when I send to more than one destination, and never tested the degenerate send-to-one case; sigh).



Oh wow, so the change was always the first output if you send bitcoins to 1 address? Awesome find. But yea.. how did we miss it.


Title: Re: How to detect the change output?
Post by: BkkCoins on December 14, 2012, 08:45:00 AM
The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.
Even with many outputs it would always not be the last one. Would it be useful as well to randomly split change into more than one address when above some value? To further complicate predictability.


Title: Re: How to detect the change output?
Post by: Meni Rosenfeld on December 14, 2012, 08:52:12 AM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw...
Ay carumba, how did we not notice that for over two years?
Maybe because all this time nobody made an analysis of the Bitcoin transaction graph which prompted an investigation into ways to determine common ownership of addresses :). So you can thank Adi and Dorit for that.

Also: http://dilbert.com/strips/comic/2001-10-25/. Many of us have looked at transactions in Blockexplorer, assuming change should be in a random location, but without paying special attention to this, there's hardly any smoking gun saying "This isn't random".


Title: Re: How to detect the change output?
Post by: elux on December 14, 2012, 11:07:56 AM
I think Hal's donation wallet address has a flaw. The problem is that balance() returned 0.

I added one bitcoin, but balance()+1 doesn't seem like enough.

Details here:
http://blockchain.info/address/1HALookLvrqrn2kxHhXCnwDgknD3RXifhH
http://blockchain.info/address/157i5gK7iN4bNAN39Ahuoiq6Tx5TaQukTE

I think it should be size()+1.

Seeing this made me really happy.


Title: Re: How to detect the change output?
Post by: molecular on December 14, 2012, 11:44:52 AM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:

// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));

The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.

Holy moly! That means we can get a much better approximation of transaction volume ;)


Title: Re: How to detect the change output?
Post by: gmaxwell on December 14, 2012, 11:51:26 AM
Maybe because all this time nobody made an analysis of the Bitcoin transaction graph which prompted an investigation into ways to determine common ownership of addresses :). So you can thank Adi and Dorit for that.
More like: the fact that their analysis didn't turn that up further confirms how shallow and ineffective the research was.  If only there were real research happening it might have been discovered more quickly!  (Of course, many other people did do things with the transaction graph without noticing this too— see for example blockchain.info's taint tracking service. So they were hardly alone in not noticing it)


Title: Re: How to detect the change output?
Post by: Sergio_Demian_Lerner on December 14, 2012, 01:35:58 PM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:

// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));

The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.

Excellent catch Hal!!!!.
It's clear that is very important that programmers/researchers of the community, apart from the core developers, review the code.

Some time ago I added to the Weaknesses page of the Bitcoin wiki a section "Security Vulnerabilities and bugs" (https://en.bitcoin.it/wiki/Weaknesses#Security_Vulnerabilities_and_bugs) regarding the impact of bugs. I will update to reflect this one.



Title: Re: How to detect the change output?
Post by: SgtSpike on December 14, 2012, 04:34:02 PM
When will we have a fix for this?


Title: Re: How to detect the change output?
Post by: Rassah on December 14, 2012, 06:31:24 PM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw...
Ay carumba, how did we not notice that for over two years?


Because we're still in Beta  ;D


Title: Re: How to detect the change output?
Post by: auzaar on December 14, 2012, 08:46:26 PM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:

// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));

The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.

long live the open source!


Title: Re: How to detect the change output?
Post by: molecular on December 14, 2012, 10:57:42 PM
When will we have a fix for this?

you could just add "+1" after "size()" and recompile. maybe do some testing before using it productively.


Title: Re: How to detect the change output?
Post by: SgtSpike on December 14, 2012, 11:03:04 PM
When will we have a fix for this?

you could just add "+1" after "size()" and recompile. maybe do some testing before using it productively.
I don't compile. Did it once, never planning to do it again. Biggest headache I've ever experienced.  So, I am asking about a general installer release.

Personally, I don't really care about this anonymous side of things and hiding my true balance or associated addresses.  I don't have anything to hide with regards to my transactions.  But I know a lot of people do care about it, so I am sort of asking on everyone's behalf when we can expect an updated client to be released.


Title: Re: How to detect the change output?
Post by: BkkCoins on December 14, 2012, 11:18:41 PM
I think it's also worthwhile pointing out that now there is a "known" history of change/payment values. It should be easier to develop algorithms that can select future change values even after the bug is fixed, which a much higher confidence level.


Title: Re: How to detect the change output?
Post by: Peter Todd on December 14, 2012, 11:43:47 PM
I think Hal's donation wallet address has a flaw. The problem is that balance() returned 0.

I added one bitcoin, but balance()+1 doesn't seem like enough.

Details here: http://blockchain.info/address/1HALookLvrqrn2kxHhXCnwDgknD3RXifhH

From Hal's sig:

Quote
Hal 157i5gK7iN4bNAN39Ahuoiq6Tx5TaQukTE

So Hal, which one is your current Bitcoin address?


Also, it's funny how every tx to 1HALook is obviously affected by this bug.


Title: Re: How to detect the change output?
Post by: Hal on December 15, 2012, 12:06:44 AM
Actually the OP observed that the software always puts the change first. I was working on a separate project suggested by Mike Hearn, using the new secure "trusted computing" modes on modern cpus to enforce spending limits even when infested with malware. I'm using Jon McCune's flicker software from sourceforge. The secure mode signs transactions and enforces limits on spends, but to know the transaction amount it has to know which is the change output.So I was just looking at this code yesterday and scratching my head (figuratively). I'll post more on this project when it's further along.

As far as my donation address, I created that years ago using Gavin's original vanitygen patch. I don't even know if I still have the key. It might be on a computer I don't use any more. I've substituted one of my offline savings wallet addresses. I shouldn't lose that.I'm not hurting for bitcoins; I started running the first day (and gave up after a few weeks because of the fan noise). Still, it's the thought that counts - thanks!


Title: Re: How to detect the change output?
Post by: Peter Todd on December 15, 2012, 12:22:34 AM
Yes, I saw that project, I'm curious to try it out myself, although I'll have to upgrade my computer first.

It's interesting that you found this bug by looking at the code. I mean, the results of it have been sitting in plain sight for, well, anyone to see. Heck, I'm kicking myself for not realizing it, let alone how Gavin and the other devs must feel. It's remarkable that Meni was the first person to notice and bring it to the attention of the forum after all this time, and even then it took your post for what he actually said to sink in.

Re: donations, I sent both you and the OP a BTC after applying your patch. Fortunately for Meni it took two tries for it to be obvious that the patch worked...


Title: Re: How to detect the change output?
Post by: Hal on December 15, 2012, 03:46:46 AM
I felt bad about losing that donation address, particularly after elux very generously donated a whole bitcoin. I found a wallet on a backup that had the key, and so I was able to retrieve the funds. Again, thanks very much!


Title: Re: How to detect the change output?
Post by: Sergio_Demian_Lerner on December 21, 2012, 03:11:02 PM
I'm proud of myself  :)  because I saw this kind of bug coming. I posted this line on November 6, 2012 in my blog (bitslog):

"The (pseudo) anonymization property of Bitcoin is the easiest property to be lost while modifying the code, since the related code is not encapsulated in a certain class or file, but spread all over the code."



Title: Re: How to detect the change output?
Post by: keystroke on December 21, 2012, 04:10:01 PM
I'm proud of myself  :)  because I see this kind of bug coming. I posted this line on November 6, 2012 in my blog (bitslog):

"The (pseudo) anonymization property of Bitcoin is the easiest property to be lost while modifying the code, since the related code is not encapsulated in a certain class or file, but spread all over the code."


Good call Sergio :)


Title: Re: How to detect the change output?
Post by: molecular on January 13, 2013, 05:20:41 PM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:

// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));

The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.

Exploiting this bug I made a chart showing transaction volume of output #1 only: [CHART] Bitcoin Actual Transaction Volume? (using "change is never last" bug) (https://bitcointalk.org/index.php?topic=136289.msg1451700#msg1451700)


Title: Re: How to detect the change output?
Post by: zebedee on February 01, 2013, 02:02:05 PM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:

// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));

The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.

Exploiting this bug I made a chart showing transaction volume of output #1 only: [CHART] Bitcoin Actual Transaction Volume? (using "change is never last" bug) (https://bitcointalk.org/index.php?topic=136289.msg1451700#msg1451700)

Electrum had the opposite behaviour though - it always sent the change as the last output (and as currently exists can only do single-destination transactions, so the change was always the second output if there was change).  So guessing change still means guessing what the client used to send the transaction was, which helps anonymity a bit from the date electrum was released.

So I can't agree with you that your results are meaningful.

The most recent 1.6.2 release of Electrum randomizes the change output.

Also, I know some people deliberately send transactions with excess precision to the real output to make the change output have less precision and look like it isn't change.

I really think it's near-impossible to reliably guess the change output if it's not a 0.01xxx amount.


Title: Re: How to detect the change output?
Post by: molecular on February 01, 2013, 03:12:03 PM
bitcoin-qt tries to randomize the position of the change output, but I believe the code has a flaw:

// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));

The problem is that size() is one in the common case of one payee, so GetRandInt will always return 0.The change ends up in the first output.

I think it should be size()+1.

Exploiting this bug I made a chart showing transaction volume of output #1 only: [CHART] Bitcoin Actual Transaction Volume? (using "change is never last" bug) (https://bitcointalk.org/index.php?topic=136289.msg1451700#msg1451700)

Electrum had the opposite behaviour though - it always sent the change as the last output (and as currently exists can only do single-destination transactions, so the change was always the second output if there was change).  So guessing change still means guessing what the client used to send the transaction was, which helps anonymity a bit from the date electrum was released.

So I can't agree with you that your results are meaningful.

The most recent 1.6.2 release of Electrum randomizes the change output.

Also, I know some people deliberately send transactions with excess precision to the real output to make the change output have less precision and look like it isn't change.

I really think it's near-impossible to reliably guess the change output if it's not a 0.01xxx amount.

Thanks for pointing out the electrum behavior.

There could be many useful heuristics which, especially when taken together, might yield good results. I'm not going to attempt any such thing and just agree with you that it can't reliably be done.