Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: hayek on November 28, 2013, 02:53:38 AM



Title: How does a site like Blockchain.info know which outputs are change?
Post by: hayek on November 28, 2013, 02:53:38 AM
I can't find anything on the wiki about transaction change.

I know what it is but how does it appear any different than other outputs?


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: jl2012 on November 28, 2013, 03:32:01 AM
Quote
I can't find anything on the wiki about transaction change.

I know what it is but how does it appear any different than other outputs?

That's just guesstimation.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: maaku on November 28, 2013, 03:35:07 AM
It makes a wild-ass guess.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: ivroer on November 28, 2013, 03:42:12 AM
Quote
That's just guesstimation

Quote
It makes a wild-ass guess.

Yup, +1 to these 2. It doesn't know; it's a guess.

I imagine there's some guessing logic like, have any of the output addresses been seen on the blockchain before? Yes? It might be the actual "spend"... there might be more logic depending on the ratio of amounts to each address.
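A minimal sketch of that kind of check, purely as an illustration of the guess described above and not Blockchain.info's actual code. The function name and the `seen_addresses` set (addresses that already appeared in earlier transactions) are assumptions made up for the example.

```python
from typing import List, Set, Tuple

Output = Tuple[str, int]  # (address, value in satoshis)


def split_by_address_history(
    outputs: List[Output], seen_addresses: Set[str]
) -> Tuple[List[Output], List[Output]]:
    """Split outputs into (likely spend, likely change).

    Assumption: an address seen before is more likely the payee,
    and a never-seen address is more likely fresh change.
    """
    likely_spend = [o for o in outputs if o[0] in seen_addresses]
    likely_change = [o for o in outputs if o[0] not in seen_addresses]
    return likely_spend, likely_change
```

If both outputs are new, or both were seen before, this tells you nothing, which is one reason the result stays a guess.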

But it definitely does not get it correct every time; I've seen plenty of transactions (of my own) where it has estimated incorrectly.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: gmaxwell on November 28, 2013, 06:29:59 AM
Like a lot of other things on BC.i that are just guesses, it's right often enough to confuse people.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: maaku on November 28, 2013, 06:39:05 AM
If it were me, I'd do prime decomposition on the amounts, calculate their relative magnitude, a boolean value indicating whether they'd been seen before, etc., label a number of training examples, and have a support vector machine generate a classifier.

There's a million other ways you can do it and get decent results. Doesn't stop it from being a WAG though.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: scintill on November 28, 2013, 06:54:40 AM
Quote
If it were me, I'd do prime decomposition on the amounts, calculate their relative magnitude, a boolean value indicating whether they'd been seen before, etc., label a number of training examples, and have a support vector machine generate a classifier.

There's a million other ways you can do it and get decent results. Doesn't stop it from being a WAG though.

In a typical two-output tx created before ~2013-01-30 (https://github.com/bitcoin/bitcoin/commit/ac7b8ea0864e925b0f5cf487be9acdf4a5d0c487), there's a good chance the first output is the change address (https://github.com/bitcoin/bitcoin/pull/2120).  Maybe even longer, depending on how long until the fix was widely deployed.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: maaku on November 28, 2013, 07:42:23 AM
Bitcoin-Qt is not the only wallet application...


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: Sukrim on November 28, 2013, 02:58:18 PM
Also if there are e.g. a 3 BTC input and a 3 BTC input to a 4 BTC output and a 1 BTC output, the change is likely the 1 BTC, since there would have been no real need to combine the inputs otherwise.

Still, it often guesses wrong; maybe there is some research potential in there somehow?


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: dserrano5 on November 28, 2013, 03:23:37 PM
Plus, from what I've seen, bc.info doesn't bother with transactions having more than two outputs; the estimated amount is always the whole amount in the tx.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: TooDumbForBitcoin on November 28, 2013, 04:33:39 PM
Quote
Also if there are e.g. a 3 BTC input and a 3 BTC input to a 4 BTC output and a 1 BTC output, the change is likely the 1 BTC, since there would have been no real need to combine the inputs otherwise.


That's some big fees, or I'm toodumbforbitcoin.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: drawingthesun on November 28, 2013, 04:37:49 PM
Blockchain.info is very misleading.

The estimated transaction volume is trite, pure utter guesswork.

Someone could buy a coffee and it could show up as a $100,000,000 transaction.

Also the IP address stuff is crap too, so misleading. The IP is the node that relays the transaction to the blockchain node and in no way represents where the actual transaction originated from.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: TooDumbForBitcoin on November 28, 2013, 04:57:16 PM
Not to mention their 650W/Gh/s electricity consumption nonsense.  Every now and then the MSM picks that up.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: moderate on November 28, 2013, 06:38:11 PM
Quote
If it were me, I'd do prime decomposition on the amounts, calculate their relative magnitude, a boolean value indicating whether they'd been seen before, etc., label a number of training examples, and have a support vector machine generate a classifier.

I'm so glad it is not you; that kind of thing is exactly what someone fascinated with machine learning would go for. There are thousands and thousands of crap papers where guys blindly go after machine learning (and it is almost always SVM) without even considering other methods, reporting results close to 100% accuracy and other metrics, only to find out that they don't even know how to set up training/testing sets, nor have a clue about the features they are using.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: piuk on November 28, 2013, 07:12:21 PM
The logic is pretty simple:

- Remove all outputs matching any input addresses.
- If the transaction has one input, take the smallest output.
- If a transaction has more than two inputs and exactly two outputs, take the output with a value closest to the total input value.
- If a transaction has more than two outputs, return the value of the smallest output.
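A rough sketch of the rules listed above, written as read from the list and not taken from the site's source. The function name and the `(address, value)` tuple layout are made up for illustration. Two assumptions: output counts are taken after the matching-address outputs are removed, and cases the list does not cover (for example two inputs and two outputs) return `None`.

```python
from typing import List, Optional, Tuple

TxIO = Tuple[str, int]  # (address, value in satoshis)


def pick_output(inputs: List[TxIO], outputs: List[TxIO]) -> Optional[TxIO]:
    """Apply the listed rules and return the selected output, if any."""
    # Rule 1: remove all outputs matching any input address.
    input_addrs = {addr for addr, _ in inputs}
    candidates = [o for o in outputs if o[0] not in input_addrs]
    if not candidates:
        return None

    total_in = sum(value for _, value in inputs)

    # Rule 2: one input -> take the smallest output.
    if len(inputs) == 1:
        return min(candidates, key=lambda o: o[1])

    # Rule 3: more than two inputs and exactly two outputs ->
    # take the output whose value is closest to the total input value.
    if len(inputs) > 2 and len(candidates) == 2:
        return min(candidates, key=lambda o: abs(total_in - o[1]))

    # Rule 4: more than two outputs -> take the smallest output.
    if len(candidates) > 2:
        return min(candidates, key=lambda o: o[1])

    # Not covered by the listed rules (assumption: give no answer).
    return None
```

For example, `pick_output([("a", 5)], [("b", 4), ("c", 1)])` selects `("c", 1)` under the one-input rule.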

Anyone is welcome to suggest improvements.

If you were really determined the accuracy could be improved by analysing the taint of the inputs used in the next transaction.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: maaku on November 28, 2013, 08:33:04 PM
Quote
If it were me, I'd do prime decomposition on the amounts, calculate their relative magnitude, a boolean value indicating whether they'd been seen before, etc., label a number of training examples, and have a support vector machine generate a classifier.

I'm so glad it is not you; that kind of thing is exactly what someone fascinated with machine learning would go for. There are thousands and thousands of crap papers where guys blindly go after machine learning (and it is almost always SVM) without even considering other methods, reporting results close to 100% accuracy and other metrics, only to find out that they don't even know how to set up training/testing sets, nor have a clue about the features they are using.

Yes, because when faced with a classic machine learning problem, the tried and true techniques of machine learning are not what you'd want to use.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: scintill on November 28, 2013, 11:57:11 PM
Quote
Bitcoin-Qt is not the only wallet application...

Sure, I meant the probability skews a bit.  In practice maybe it doesn't help much.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: Sukrim on November 29, 2013, 03:09:29 AM
Quote
Also if there are e.g. a 3 BTC input and a 3 BTC input to a 4 BTC output and a 1 BTC output, the change is likely the 1 BTC, since there would have been no real need to combine the inputs otherwise.


That's some big fees, or I'm toodumbforbitcoin.
Yeah, I meant 2+3 BTC inputs, not 3+3... ;)
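With the corrected numbers (2 BTC + 3 BTC in, 4 BTC + 1 BTC out), the idea can be sketched roughly like this. It is only an illustration of the reasoning, not what any site actually runs: the function name is made up, values are in satoshis, and fees are ignored. An output that could have been paid without the smallest input is flagged as probable change, since a wallet would have had no need to add that input just to pay it.

```python
from typing import List


def likely_change_indices(input_values: List[int], output_values: List[int]) -> List[int]:
    """Return indices of outputs that did not require every input.

    Assumption: fees are ignored and the wallet only adds inputs it needs.
    """
    if len(input_values) < 2:
        # With a single input there is nothing "unnecessary" to reason about.
        return []
    # Total available if the smallest input had been left out.
    without_smallest = sum(input_values) - min(input_values)
    return [i for i, value in enumerate(output_values) if value <= without_smallest]


BTC = 100_000_000  # satoshis per bitcoin
print(likely_change_indices([2 * BTC, 3 * BTC], [4 * BTC, 1 * BTC]))  # [1]
```

The 4 BTC output needs both inputs, so only the 1 BTC output is flagged. If every output is flagged, or none is, the heuristic gives no answer.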


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: Remember remember the 5th of November on November 29, 2013, 03:44:11 AM
Quote
Blockchain.info is very misleading.

The estimated transaction volume is trite, pure utter guesswork.

Someone could buy a coffee and it could show up as a $100,000,000 transaction.

Also the IP address stuff is crap too, so misleading. The IP is the node that relays the transaction to the blockchain node and in no way represents where the actual transaction originated from.
Really? I actually tracked down a guy who mined on top of the genesis block (orphans, duh) using the IP address on the site, and he confirmed it was him.


Title: Re: How does a site like Blockchain.info know which outputs are change?
Post by: Peter Todd on November 29, 2013, 07:50:23 AM
Quote
Blockchain.info is very misleading.

The estimated transaction volume is trite, pure utter guesswork.

Someone could buy a coffee and it could show up as a $100,000,000 transaction.

Also the IP address stuff is crap too, so misleading. The IP is the node that relays the transaction to the blockchain node and in no way represents where the actual transaction originated from.
Really? I actually tracked down a guy who mined on top of the genesis block (orphans, duh) using the IP address on the site, and he confirmed it was him.

That's a special case because no other node would have relayed those blocks; in the general case the IP addresses are bullshit.