Bitcoin Forum
Author Topic: Dragnet: A Method for Tagging Bitcoin Addresses of Exchanges  (Read 245 times)
zsrem (OP)
Newbie
*
Offline Offline

Activity: 6
Merit: 49


View Profile
March 14, 2020, 01:31:28 PM
Merited by ABCbits (30), suchmoon (4), AB de Royse777 (2), aliashraf (2), o_e_l_e_o (1), PrimeNumber7 (1), NotATether (1), Heisenberg_Hunter (1)
 #1

https://www.techrxiv.org/articles/Dragnet_A_Method_for_Tagging_Bitcoin_Addresses_of_Exchanges/11852739

We are a group of developers and data analysts. This paper is recent work by our team. It explains how to identify the Bitcoin addresses of exchanges. The final results are shown on this website (https://chain.info/). Suggestions are welcome.

AB de Royse777
Legendary
*
Offline Offline

Activity: 2478
Merit: 3895


Hire Bitcointalk Camp. Manager @ r7promotions.com


View Profile WWW
March 14, 2020, 01:44:01 PM
 #2

Quote
To solve the problem of information asymmetry between users and exchanges, we propose a method for tagging Bitcoin addresses of exchanges. Through vertical, forward, and backward address mining, the method can utilize only one or several addresses of an exchange to find out all its addresses and distinguish different address types: deposit wallet, hot wallet, and cold wallet. Then the balance and transfers of the exchange can be further obtained through these addresses, helping users understand the real Bitcoin holdings of the exchange.
https://www.techrxiv.org/articles/Dragnet_A_Method_for_Tagging_Bitcoin_Addresses_of_Exchanges/11852739

Interesting idea, and I think this is a realistic move. These exchanges are really faking their volumes and creating confusion for their users.


Are these real data or just some samples?

Quote
Maybe an English version will give it more exposure?

Anyway, what are you looking for?

zsrem (OP)
Newbie
*
Offline Offline

Activity: 6
Merit: 49


View Profile
March 14, 2020, 02:01:43 PM
 #3

Real Data.

The website changes language according to your browser settings. It doesn't work, does it?

I wonder if there is any suggestion for the algorithm we use.
AB de Royse777
Legendary
*
Offline Offline

Activity: 2478
Merit: 3895


Hire Bitcointalk Camp. Manager @ r7promotions.com


View Profile WWW
March 14, 2020, 02:07:08 PM
 #4

Real Data.

Awesome and worrying. This means anything that happens to these exchanges could create a big disaster in the market. We really need a culture of not relying on centralized exchanges. We need to focus more on P2P exchanges.

Quote
The website changes language according to your browser settings. It doesn't work, does it?
I had to change the browser language manually. It was not automated.

Quote
I wonder if there is any suggestion for the algorithm we use.
I'm not much of a tech guy, but I think if your data is accurate then you are doing a great job.

baro77
Member
**
Offline Offline

Activity: 90
Merit: 91


View Profile WWW
March 14, 2020, 04:15:26 PM
 #5

Cool...
Just to note that the Vertical Mining Heuristic should be very effective with batch-transactions-enabled exchanges, like recently Coinbase...

baro77
Member
**
Offline Offline

Activity: 90
Merit: 91


View Profile WWW
March 14, 2020, 04:22:56 PM
 #6

Only Putonghua on your website for me as well... Hope it doesn't depend on localization services (wouldn't be a fair trade-off IMHO, and in fact mine are disabled)


Real Data.

The website changes language according to your browser settings. It doesn't work, does it?

I wonder if there is any suggestion for the algorithm we use.
zsrem (OP)
Newbie
*
Offline Offline

Activity: 6
Merit: 49


View Profile
March 14, 2020, 05:20:37 PM
 #7

Only Putonghua on your website for me as well... Hope it doesn't depend on localization services (wouldn't be a fair trade-off IMHO, and in fact mine are disabled)

It's a bug. We'll fix it.
Now you can choose the language manually, at the bottom of the page.
AB de Royse777
Legendary
*
Offline Offline

Activity: 2478
Merit: 3895


Hire Bitcointalk Camp. Manager @ r7promotions.com


View Profile WWW
March 14, 2020, 06:04:15 PM
 #8

Now you can choose the language manually, at the bottom of the page.
Good addition. It was very quick!

Was it there already and I missed it in the first place? :-P

The site really looks very nice now.

PrimeNumber7
Copper Member
Legendary
*
Offline Offline

Activity: 1624
Merit: 1899

Amazon Prime Member #7


View Profile
March 14, 2020, 08:46:54 PM
Merited by ABCbits (3), AB de Royse777 (2), vapourminer (1), Heisenberg_Hunter (1), aliashraf (1)
 #9

Take a look at this thread. Some forum members were creating CoinJoin transactions with each other. Review how the Wasabi Wallet works; it combines inputs from various users to obfuscate the relationship between each user's inputs and outputs. At one point, the now-defunct exchange Mt Gox allowed users to 'import' their private keys into their accounts. Mt Gox would then use those keys to create transactions that spent any unspent outputs spendable with them, and those transactions would also spend unspent outputs from other customers' private keys and some of Mt Gox's own coins. With the advent of the Lightning Network, many people will sign transactions that have other people's inputs.

The point of all this is that a single transaction that connects two addresses together is not necessarily enough to link two businesses together, absent additional evidence.
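
To make the concern concrete, here is a minimal sketch of common-input clustering. It assumes the simplest possible heuristic (all input addresses of one transaction belong to one owner), which may or may not match what the paper actually does. The `UnionFind` class, the `cluster_by_common_inputs` function, and the addresses are all hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure over address strings."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def cluster_by_common_inputs(transactions):
    """Group addresses that appear together as inputs of any transaction."""
    uf = UnionFind()
    for tx in transactions:
        inputs = tx["inputs"]
        for addr in inputs:
            uf.find(addr)  # register every input address
        for addr in inputs[1:]:
            uf.union(inputs[0], addr)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical data: two unrelated exchanges, plus one CoinJoin-style transaction
# that spends one input from each of them.
txs = [
    {"inputs": ["exA_1", "exA_2"]},
    {"inputs": ["exB_1", "exB_2"]},
]
coinjoin = {"inputs": ["exA_2", "exB_1"]}

print(cluster_by_common_inputs(txs))               # two separate clusters
print(cluster_by_common_inputs(txs + [coinjoin]))  # one merged cluster
```

With only the two ordinary transactions, the addresses fall into two clusters. Adding the single joint transaction merges them into one, which is exactly the false link described above.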

Also, a classification model that is accurate 96% of the time (it is unclear how you are measuring accuracy) has very high accuracy. My first reaction to that high of claimed accuracy is that you might have data leakage. I can't point to the source without looking at your specific steps to train your model, which understandably may not be something you want to share.
aliashraf
Legendary
*
Offline Offline

Activity: 1456
Merit: 1174

Always remember the cause!


View Profile WWW
March 14, 2020, 09:55:01 PM
Merited by AB de Royse777 (2), ABCbits (1)
 #10

The point of all this is that a single transaction that connects two addresses together is not necessarily enough to link two businesses together, absent additional evidence.
The paper addresses this issue:
Quote
Although someone would use the CoinJoin method [9] to combine UTXOs from multiple senders into a single transaction to make it more challenging to determine the relationship between input and output addresses, we detect this method has not been adopted by the exchange so far.


Also, a classification model that is accurate 96% of the time (it is unclear how you are measuring accuracy) has very high accuracy. My first reaction to that high of claimed accuracy is that you might have data leakage. I can't point to the source without looking at your specific steps to train your model, which understandably may not be something you want to share.
I suppose they are presenting a model more than software. So far, the model seems to me to be as solid as a good heuristic-based data mining model could be. The implementation is not open, which is not good news, so the results presented are highly suspicious.

For example, consider a conspiracy theory to be true: A shady exchange (such as Bittrex) with very low liquidity and a high incentive to put itself in the top 10 list by faking high volumes of trade, as a part of its scam, hires a team of technical writers who publish an acceptable analysis model and fake privately generated results in favor of the exchange.
zsrem (OP)
Newbie
*
Offline Offline

Activity: 6
Merit: 49


View Profile
March 15, 2020, 01:16:39 AM
Merited by AB de Royse777 (2), ABCbits (1)
 #11

The CoinJoin problem is really hard to solve. But as far as we can see, there are few or no CoinJoin transactions at exchanges nowadays, especially big exchanges.
This paper has been peer-reviewed by some experts. They also raised the issue of CoinJoin, but I cannot find a way to solve it.
PrimeNumber7
Copper Member
Legendary
*
Offline Offline

Activity: 1624
Merit: 1899

Amazon Prime Member #7


View Profile
March 15, 2020, 04:22:16 AM
Merited by AB de Royse777 (2), ABCbits (1)
 #12

The point of all this is that a single transaction that connects two addresses together is not necessarily enough to link two businesses together, absent additional evidence.
The paper addresses this issue:
Quote
Although someone would use the CoinJoin method [9] to combine UTXOs from multiple senders into a single transaction to make it more challenging to determine the relationship between input and output addresses, we detect this method has not been adopted by the exchange so far.
I don't think this is a valid assumption. A CJ transaction can consist of two inputs, each from different entities. In 2013, many exchanges were not as professional as they are today, and were dealing with much less customer money.

The OP appears to be interested in weeding out exchanges with fake volume. An exchange with fake volume could possibly pay a whale to conduct a small number of Coin Join transactions to evade detection of their fake volume.

Also, a classification model that is accurate 96% of the time (it is unclear how you are measuring accuracy) has very high accuracy. My first reaction to that high of claimed accuracy is that you might have data leakage. I can't point to the source without looking at your specific steps to train your model, which understandably may not be something you want to share.
I suppose they are presenting a model more than software. So far, the model seems to me to be as solid as a good heuristic-based data mining model could be. The implementation is not open, which is not good news, so the results presented are highly suspicious.

For example, consider a conspiracy theory to be true: A shady exchange (such as Bittrex) with very low liquidity and a high incentive to put itself in the top 10 list by faking high volumes of trade, as a part of its scam, hires a team of technical writers who publish an acceptable analysis model and fake privately generated results in favor of the exchange.

I would recommend that you learn about machine learning. The Wikipedia article will tell you about ML but will be insufficient for you to be able to speak to it coherently.

The CoinJoin problem is really hard to solve. But as far as we can see, there are few or no CoinJoin transactions at exchanges nowadays, especially big exchanges.
This paper has been peer-reviewed by some experts. They also raised the issue of CoinJoin, but I cannot find a way to solve it.
You are correct, the CJ problem is difficult to solve programmatically. You could rule out transactions whose number of unique input addresses is above a threshold and whose number of unique output addresses is above a threshold. This would not catch Bob and Alice's exchange (which is faking volume) broadcasting a single CJ transaction with two inputs and two outputs.
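
A rough sketch of that threshold filter, for illustration only. The function name, the transaction layout, and the threshold values are my own assumptions, not anything from the paper.

```python
def looks_like_coinjoin(tx, input_threshold=4, output_threshold=4):
    """Flag a transaction whose unique input AND unique output address counts
    are both above their thresholds (threshold values are arbitrary here)."""
    unique_inputs = len(set(tx["inputs"]))
    unique_outputs = len(set(tx["outputs"]))
    return unique_inputs > input_threshold and unique_outputs > output_threshold


# Hypothetical transactions.
big_join = {
    "inputs": [f"in_{i}" for i in range(6)],
    "outputs": [f"out_{i}" for i in range(6)],
}
two_party_join = {
    "inputs": ["bob_1", "exchange_1"],
    "outputs": ["bob_2", "exchange_2"],
}

print(looks_like_coinjoin(big_join))        # flagged and excluded from clustering
print(looks_like_coinjoin(two_party_join))  # not flagged, so it still links the two owners
```

The second call shows the gap: a two-input, two-output joint transaction stays under any sensible threshold.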

Once you find address clusters, you could remove a percentage of transactions associated with each address cluster in your data set, and re-run your cluster analysis. If a high enough percentage of addresses are no longer part of the cluster with transactions excluded, you can either flag the address cluster for closer analysis separately, or you can do something such as looping through each transaction that connects the cluster, and each loop removes a single transaction and adds the previous transaction back in (each loop assumes exactly one transaction is removed). If enough loops produce distinct, large clusters, then you may have a 'hidden' CJ transaction.

Obviously the above would be computationally very expensive: re-running the cluster analysis once per removed transaction is roughly O(n^2) in the number of transactions. You might be able to make different assumptions that would make your function more efficient.
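
Here is a minimal sketch of the one-transaction-at-a-time loop, assuming clusters are built from shared inputs only. The helper names, the `min_size` cutoff, and the sample data are hypothetical, and a real data set would need something far more efficient than this brute-force version.

```python
def clusters(transactions):
    """Connected components of addresses linked by appearing as inputs together."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for tx in transactions:
        inputs = tx["inputs"]
        for addr in inputs:
            find(addr)  # register every input address
        for addr in inputs[1:]:
            parent[find(addr)] = find(inputs[0])
    groups = {}
    for addr in list(parent):
        groups.setdefault(find(addr), set()).add(addr)
    return list(groups.values())


def suspicious_links(transactions, min_size=3):
    """Return ids of transactions whose removal leaves at least two clusters of
    min_size or more addresses among the clusters touching the removed inputs."""
    flagged = []
    for i, tx in enumerate(transactions):
        rest = transactions[:i] + transactions[i + 1:]  # exactly one tx removed
        touched = set(tx["inputs"])
        parts = [c for c in clusters(rest) if c & touched]
        if sum(len(c) >= min_size for c in parts) >= 2:
            flagged.append(tx["id"])
    return flagged


# Hypothetical data: two three-address groups joined by a single transaction "cj".
txs = [
    {"id": "a1", "inputs": ["A1", "A2"]},
    {"id": "a2", "inputs": ["A2", "A3"]},
    {"id": "b1", "inputs": ["B1", "B2"]},
    {"id": "b2", "inputs": ["B2", "B3"]},
    {"id": "cj", "inputs": ["A3", "B1"]},
]

print(suspicious_links(txs))
```

Each pass re-clusters everything with one transaction left out, so the cost grows roughly quadratically. Note that with a smaller `min_size`, ordinary bridging transactions such as `a2` would be flagged too, so the cutoff matters.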
aliashraf
Legendary
*
Offline Offline

Activity: 1456
Merit: 1174

Always remember the cause!


View Profile WWW
March 15, 2020, 06:25:15 AM
Merited by ABCbits (2), AB de Royse777 (2)
 #13

The point of all this is that a single transaction that connects two addresses together is not necessarily enough to link two businesses together, absent additional evidence.
The paper addresses this issue:
Quote
Although someone would use the CoinJoin method [9] to combine UTXOs from multiple senders into a single transaction to make it more challenging to determine the relationship between input and output addresses, we detect this method has not been adopted by the exchange so far.
I don't think this is a valid assumption. A CJ transaction can consist of two inputs, each from different entities. In 2013, many exchanges were not as professional as they are today, and were dealing with much less customer money.

The OP appears to be interested in weeding out exchanges with fake volume. An exchange with fake volume could possibly pay a whale to conduct a small number of Coin Join transactions to evade detection of their fake volume.
It is not that simple. Imagine that we have this model approved and standardized, with many watchdogs using the basic idea. An exchange confident enough about its volume might decide to let the analyzers do their job and provide the info which puts it in the top list. A shady exchange cannot change anything by using CoinJoin, because of what CoinJoin does: hiding assets. The incentive goes the opposite way.

Also, a classification model that is accurate 96% of the time (it is unclear how you are measuring accuracy) has very high accuracy. My first reaction to that high of claimed accuracy is that you might have data leakage. I can't point to the source without looking at your specific steps to train your model, which understandably may not be something you want to share.
I suppose they are presenting a model more than software. So far, the model seems to me to be as solid as a good heuristic-based data mining model could be. The implementation is not open, which is not good news, so the results presented are highly suspicious.

For example, consider a conspiracy theory to be true: A shady exchange (such as Bittrex) with very low liquidity and a high incentive to put itself in the top 10 list by faking high volumes of trade, as a part of its scam, hires a team of technical writers who publish an acceptable analysis model and fake privately generated results in favor of the exchange.

I would recommend that you learn about machine learning.
Thank you for the recommendation and the Wikipedia page you linked. :D
It is not how it works in technical discussions tho. You got deep knowledge in ML? Good for you! But for now, the only serious objection you've made to the article is about the possibility of the model being coinjoin-attacked by exchanges, making void one of the basic heuristic assumptions of the proposed model. Well, I'm not convinced, and nobody would be, because there is no sign of that and no incentive for that.
zsrem (OP)
Newbie
*
Offline Offline

Activity: 6
Merit: 49


View Profile
March 15, 2020, 12:36:44 PM
Merited by vapourminer (1), LFC_Bitcoin (1), ABCbits (1), aliashraf (1)
 #14

the model being coinjoin-attacked by exchanges

As far as I can see, no exchange has used CoinJoin so far. Most exchanges use the hot-cold-deposit wallet model because it's safe and reliable. The hot and cold wallets have obvious features. The cold wallets hold a lot of Bitcoin, so they always show up in the rich list (https://chain.info/richlist). A cold wallet only has one-to-one transactions with the hot wallet, and the hot wallet has a huge number of transactions.

I manually checked the gathered hot and cold wallets of every exchange. They all look fine.

But there is also an exception we found later. Some exchanges use a changing address as the hot wallet: the hot wallet changes its address every time it generates a transaction.
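
The hot and cold wallet features described above could be sketched roughly like this. It is only an illustration: the thresholds, field names, and addresses are made up and are not the values used in the paper, and the changing-address exception is not handled.

```python
def label_wallets(wallets, min_cold_balance=1000.0, min_hot_tx_count=10_000):
    """Label each address 'hot', 'cold', or 'unknown' from simple per-address stats.

    wallets maps address -> {"balance_btc": float, "tx_count": int,
                             "counterparties": set of addresses}.
    Both thresholds are arbitrary illustrative values.
    """
    # Hot wallets: a huge number of transactions.
    hot = {addr for addr, w in wallets.items() if w["tx_count"] >= min_hot_tx_count}
    labels = {}
    for addr, w in wallets.items():
        if addr in hot:
            labels[addr] = "hot"
        elif (w["balance_btc"] >= min_cold_balance
              and w["counterparties"]
              and w["counterparties"] <= hot):
            # Cold wallets: large balance, and they only transact with hot wallets.
            labels[addr] = "cold"
        else:
            labels[addr] = "unknown"
    return labels


# Hypothetical per-address statistics.
wallets = {
    "hot_1": {"balance_btc": 250.0, "tx_count": 50_000,
              "counterparties": {"cold_1", "user_1", "user_2"}},
    "cold_1": {"balance_btc": 40_000.0, "tx_count": 12,
               "counterparties": {"hot_1"}},
    "user_1": {"balance_btc": 0.5, "tx_count": 3,
               "counterparties": {"hot_1"}},
}

print(label_wallets(wallets))
```

A hot wallet that rotates its address after every transaction would never reach the transaction-count threshold on any single address, so it would be missed by this sketch.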