Bitcoin Forum
Topic: Transaction propagation speed - experimental results
r.willis (OP)
March 28, 2013, 07:22:46 PM
 #1

I was curious how fast transactions actually propagate through the network.
My idea: if I connect to every node and listen to inv messages, I can estimate it (a node should send an inv once it has received and verified a tx, and only once). So I did exactly that, except that I ended up connected not to all nodes but to a fair share of them (~1900 active connections at the end of the experiment). I logged every inv message with a timestamp. After several hours, I processed the resulting 2.5 GB text file with Python: I recorded the time I first saw each transaction hash, and each time I later got an inv for the same hash I calculated the time difference, binned it (1-second bins), and summed over all transactions.
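The processing described above can be sketched roughly as follows. This is only an illustration, not the script that was actually used: the record layout (a timestamp in seconds plus a tx hash, already parsed from the log and in arrival order) and the name delay_histogram are assumptions.

```python
from collections import Counter

def delay_histogram(records, bin_seconds=1.0):
    """Bin the delay between the first inv for a tx and every later inv for it.

    records: iterable of (timestamp_seconds, tx_hash) in arrival order
    (hypothetical layout; the real log format is not given in the thread).
    """
    first_seen = {}        # tx_hash -> time of the first inv seen for it
    histogram = Counter()  # bin index -> number of later invs in that bin
    for timestamp, tx_hash in records:
        if tx_hash not in first_seen:
            first_seen[tx_hash] = timestamp
            continue
        delay = timestamp - first_seen[tx_hash]
        histogram[int(delay // bin_seconds)] += 1
    return histogram

# Made-up sample data, only to show the call.
sample = [
    (0.0, "aa"), (0.4, "aa"), (1.2, "aa"),
    (2.0, "bb"), (2.3, "bb"), (5.9, "aa"),
]
hist = delay_histogram(sample)
```

With 1-second bins, the bin index is simply the whole number of seconds elapsed since the first announcement, which matches the x axis of the plot.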
I ended up with the following distribution (x is in seconds):

What can I say? Certainly, propagation speed is not that great. The "tails" are quite long. There is a fair amount of retransmits (I believe the second mode is the result of retransmits).
Please discuss.
Zeilap
March 28, 2013, 08:09:35 PM
 #2

You need to exclude any duplicates from the same peer. If I tell you about a transaction 3 seconds after it's first announced, and then later on I send you an inv message with a whole bunch of transactions from my memory pool, including the one I already told you about, you can't count me telling you twice. Clearly I knew about that transaction less than 3 seconds after it was announced, and that's what you're interested in.
SgtSpike
March 28, 2013, 08:12:35 PM
 #3

Interesting analysis.  Rather than creating transactions to test, you simply test the first and last time you hear about any given transaction.

The problem, as Zeilap pointed out, is duplicates.  Not only might node A send the same message to you twice, but it might send the same message to peer B twice.  And peer B might not have chosen to send it to you the first time, for whatever reason, but send it to you the second time.  Then it appears that it took X number of seconds to reach peer B, when it is not, in fact, true.
r.willis (OP)
March 28, 2013, 08:15:16 PM
 #4

As I understand it, peers should not announce the same tx twice (except the original sender, which can retransmit after a set amount of time). I can try to exclude duplicates, but I doubt it will change the results much.
markm
March 28, 2013, 08:19:14 PM
 #5

You could use two nodes: one that listens to as many peers as possible, to try to figure out when a transaction first appeared, and another, with few connections, that waits to hear about the transactions directly itself.

Or not just two: have many topologically widespread, few-connection nodes and average how long it takes for them all to hear about a transaction compared to when the node connected to "almost everyone" first heard about it.
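A rough sketch of that averaging, assuming each node's observations have already been reduced to a dict of tx hash -> first-seen time on a common clock. The dict layout and the name mean_lag are made up for illustration, and clock synchronisation between nodes is simply assumed here.

```python
from statistics import mean

def mean_lag(reference_first_seen, sparse_first_seen_list):
    """Per tx, average how long the few-connection nodes lag the reference node.

    reference_first_seen: {tx_hash: time} from the node connected to
    "almost everyone". sparse_first_seen_list: one such dict per
    few-connection node. Both layouts are assumptions.
    """
    lags = {}
    for tx_hash, t_ref in reference_first_seen.items():
        delays = [
            node[tx_hash] - t_ref
            for node in sparse_first_seen_list
            if tx_hash in node  # skip nodes that never heard about this tx
        ]
        if delays:
            lags[tx_hash] = mean(delays)
    return lags

# Made-up sample data.
reference = {"aa": 10.0, "bb": 20.0}
sparse_nodes = [{"aa": 11.0, "bb": 23.0}, {"aa": 13.0}]
lags = mean_lag(reference, sparse_nodes)
```

Transactions that a sparse node never saw are skipped rather than counted as an infinite delay; that choice alone would bias the average downward, so it should be reported alongside the result.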

-MarkM-

r.willis (OP)
March 28, 2013, 08:29:26 PM
 #6

Or you can use only one, as I did. The node in question did not do any relaying itself, because that would skew the results.
Zeilap
March 28, 2013, 08:41:58 PM
 #7

Quote
As I understand it, peers should not announce the same tx twice (except the original sender, which can retransmit after a set amount of time). I can try to exclude duplicates, but I doubt it will change the results much.
Depends on the size of their memory pool. Default size is 1000, so at several transactions per second, this equates to a few minutes.
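To put numbers on "a few minutes": taking the 1000-entry figure above at face value, the time for the pool to turn over is just size divided by arrival rate. The rates below are assumed values for "several transactions per second", not measurements.

```python
def pool_turnover_seconds(pool_size, tx_per_second):
    """Seconds until pool_size new transactions have arrived at a steady rate."""
    return pool_size / tx_per_second

# 1000 is the figure quoted in the post; the rates are assumptions.
for rate in (3, 5, 10):
    minutes = pool_turnover_seconds(1000, rate) / 60
    print(f"{rate} tx/s -> {minutes:.1f} min")
```

So anything from roughly two to six minutes, depending on what "several" means.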
Zeilap
March 28, 2013, 08:52:34 PM
 #8

Quote
... topologically widespread ...
This doesn't work - you have no idea what the whole network looks like unless you connect to every relaying peer, so you don't know if your nodes are widespread or not. You'd have to repeat the experiment thousands of times to account for sometimes being very close to each other and other times very far apart.
Come-from-Beyond
March 28, 2013, 09:04:22 PM
 #9

Why is there a peak at 600s? Coincidence?
r.willis (OP)
March 28, 2013, 09:18:28 PM
 #10

It's possibly retransmits. I was told the Satoshi client retransmits at random intervals, so this could be some other client/software re-sending them after a 10-minute timeout.
Syke
March 28, 2013, 10:09:32 PM
 #11

Quote
What can I say? Certainly, propagation speed is not that great. The "tails" are quite long. There is a fair amount of retransmits (I believe the second mode is the result of retransmits).

Cool graph. Can you add some marks for some interesting percentages? Like where in the graph does 90% land?
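Given the binned counts, a mark like that could be computed along these lines. This is a sketch only: the histogram is assumed to be a dict of bin index -> count, percentile_bin is a made-up name, and the numbers are invented rather than taken from the plot.

```python
def percentile_bin(histogram, fraction):
    """Smallest bin index at which the cumulative count reaches `fraction`."""
    total = sum(histogram.values())
    if total == 0:
        raise ValueError("empty histogram")
    running = 0
    for bin_index in sorted(histogram):
        running += histogram[bin_index]
        if running >= fraction * total:
            return bin_index
    return max(histogram)  # guard against rounding when fraction is 1.0

# Invented counts (bin index = seconds), NOT the data from the experiment.
example = {0: 50, 1: 30, 2: 12, 5: 4, 600: 4}
p90 = percentile_bin(example, 0.90)
p99 = percentile_bin(example, 0.99)
```

A long tail shows up directly as a large gap between the 90% and 99% marks.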

Blowfeld
March 29, 2013, 12:55:43 AM
Last edit: March 29, 2013, 01:07:56 AM by Blowfeld
 #12

Quote
Please discuss.
Very cool graph!  But I'm not sure if it tells much of a tale.  There's no hint as to how many distinct Tx packets you are seeing.

I don't claim to be an expert on the Bitcoin protocol.  I only know what I've observed.

With respect to the long tail, I think that could happen if a node was offline when the packet originally traversed the network.  Later, when the node rejoins the network, it receives the Tx packet from one of its peers.  I think it retransmits that packet to each peer it subsequently connects with.

I think this may also occur as connections between peers flap up and down.  Doesn't each new "up" connection result in an exchange of some queued Tx data?

You said you had 1900 connections "at the end of the experiment".  I would expect a connection you made 20 minutes into the experiment to send you the transaction 20 minutes (or more) after the start of the experiment.  So, did you track the time between when you last connected to a peer and when the peer next sent you the Tx packet?

A useful overlay would be the time between when you first see a packet and the time when it enters the blockchain.  I think the slope of your long tail is partially dependent on how quickly packets enter the blockchain.
r.willis (OP)
March 29, 2013, 06:51:25 AM
 #13

Reprocessed plot (counted each (tx_id, peer) pair only once):
http://img404.imageshack.us/img404/6438/propr.png
As I suspected, there is almost no change (some minor peaks are gone).
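For reference, counting each (tx_id, peer) pair only once amounts to a filter like the one below, applied before computing delays. The record layout (timestamp, peer, tx hash) and the function name are assumptions for illustration.

```python
def drop_repeat_announcements(records):
    """Yield only the first inv for each (tx_hash, peer) pair.

    records: iterable of (timestamp, peer, tx_hash) in arrival order
    (hypothetical layout).
    """
    seen_pairs = set()
    for timestamp, peer, tx_hash in records:
        pair = (tx_hash, peer)
        if pair in seen_pairs:
            continue  # this peer already announced this tx
        seen_pairs.add(pair)
        yield timestamp, peer, tx_hash

# Made-up sample: peer "p1" announces "aa" twice.
sample_records = [
    (0.0, "p1", "aa"),
    (0.5, "p2", "aa"),
    (300.0, "p1", "aa"),
    (301.0, "p1", "bb"),
]
deduped = list(drop_repeat_announcements(sample_records))
```

Note that this only removes repeats from the same peer; a retransmit relayed through a peer that had not announced the tx before still gets counted.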
Quote
You said you had 1900 connections "at the end of the experiment".  I would expect a connection you made 20 minutes into the experiment to send you the transaction 20 minutes (or more) after the start of the experiment.  So, did you track the time between when you last connected to a peer and when the peer next sent you the Tx packet?
I doubt that anything is queued and pushed after connection. I have never seen this happen: after connecting, you receive invs at a steady rate. An inv broadcast is triggered by successful validation of a tx.
Quote
A useful overlay would be the time between when you first see a packet and the time when it enters the blockchain.
Well, once I have done the block download and parsing, I can update the graphs.
gmaxwell
March 29, 2013, 06:57:35 AM
 #14

Quote
Reprocessed plot (counted each (tx_id, peer) pair only once):
One thing you want to do is get a list of transactions across several days (it doesn't have to come from connecting to a great number of nodes) and then collect your big run. Then exclude all of those 'seen before my window started' transactions. That should get rid of a large number of the initial retransmissions. I'm not sure how you can exclude subsequent retransmissions reliably, however.


r.willis (OP)
March 29, 2013, 07:04:25 AM
 #15

If you do it with a small number of peers, your results will be overly optimistic (you will consistently underestimate propagation times).
gmaxwell
March 29, 2013, 08:56:01 AM
 #16

Quote
If you do it with a small number of peers, your results will be overly optimistic (you will consistently underestimate propagation times).
Apparently I wasn't very clear. You should monitor for a long time in order to build a list of old transactions which you will then exclude from your subsequent analysis interval so that your data is less polluted by retransmissions. In making this exclusion list it's not important that you have many peers.

Otherwise you end up with a long tail of broken, invalid, never confirming transactions, double spends, etc. that arrive forever and look like unboundedly long propagation in your analysis. You'll still have many but I believe you can probably exclude most with 24-48 hours of exclusion collection.
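In code, the exclusion step might look something like this. It is a sketch under assumptions: records are (timestamp, peer, tx hash) tuples, window_start marks where the analysis interval begins, and split_by_window is a made-up name.

```python
def split_by_window(records, window_start):
    """Return (excluded_hashes, analysis_records).

    Every tx first heard of before window_start goes on the exclusion
    list; only records inside the window for other txs are kept.
    """
    records = list(records)
    excluded = {tx for ts, _peer, tx in records if ts < window_start}
    analysis = [
        (ts, peer, tx)
        for ts, peer, tx in records
        if ts >= window_start and tx not in excluded
    ]
    return excluded, analysis

# Made-up sample: "old" was seen during the warm-up and is retransmitted later.
log = [
    (10.0, "p1", "old"),
    (100.0, "p2", "new"),
    (101.0, "p3", "new"),
    (150.0, "p2", "old"),
]
excluded, analysis = split_by_window(log, window_start=50.0)
```

The warm-up records can come from a node with only a handful of peers, since all that matters is that the hash was seen at all.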

r.willis (OP)
March 29, 2013, 09:07:20 AM
 #17

I see your point. Yes, prior observations would help to clear up the picture.