Bitcoin Forum
November 13, 2024, 07:07:36 PM *
News: Check out the artwork 1Dq created to commemorate this forum's 15th anniversary
 
   Home   Help Search Login Register More  
Pages: « 1 [2]  All
  Print  
Author Topic: Let's analyze fee rate vs confirmation time! (~4m tx data inside)  (Read 1476 times)
animalspirit
Newbie
*
Offline Offline

Activity: 21
Merit: 0


View Profile
January 17, 2017, 09:50:18 AM
Last edit: January 17, 2017, 06:35:46 PM by animalspirit
 #21

I was curious how much revenue is coming from fees from the transactions which miners include in only the next one or two blocks.,

One problem with using OP's data (confirmation_times.csv) for this is that it uses a timestamp, instead of block #.  So there's no real way to know how many confirmations it took for a transaction to get mined.  But using the average of 1 block = 10 minutes, I should be able to get kind of close.  

What information this provides us is that miners receive ~%75 of their revenue from transactions they include within the first two blocks (well, within the first 20 minutes, to be technically accurate).

https://i.imgur.com/26SmxdY.png

Generated using the following Python3 source for use in a Jupyter notebook.

Code:
import pandas as pd
from plotly import graph_objs as go
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode()   # Jupyter notebook
data = []
daily_threshold = 10   # Ignore days w/less than N btc in fees.
filename = 'confirmation_times.csv'
col_names = ['conf_time', 'first_confirmed', 'fee', 'size']
labels = ['0+ minutes', '10+ minutes', '20+ minutes', '30+ minutes', '1+ hours']
bins = [0, 1*10*60, 2*10*60, 3*10*60, 6*10*60, 9999*10*60]
df = pd.read_csv('./{}'.format(filename), usecols=col_names, parse_dates=['first_confirmed'])
df['fee'] = df['fee'] / 100000000  # Use BTCs, not Satoshis.
df['date'] = pd.to_datetime(df['first_confirmed']).dt.date
df['time'] = pd.cut(df['conf_time'], bins, labels=labels)
grouped = df.groupby(['date', 'time'])
df2 = grouped['fee'].sum().to_frame().rename(columns = lambda x: x + '_sum')
s = df2.unstack()['fee_sum'].sum(1).ge(daily_threshold)
plot_df = df2.loc[s.index[s].tolist()]['fee_sum'].unstack()
for time_bin in list(plot_df):
    data.append(
        go.Bar(
            x=['{:%Y-%m-%d}'.format(dt) for dt in list(plot_df[time_bin].index)],
            y=plot_df[time_bin].tolist(),
            name=time_bin))
layout = go.Layout(
    barmode='stack', title='Bitcoin Mining Fee Revenue By Confirmation Speed',
    xaxis=go.XAxis(title='Date', type='category'),
    yaxis=go.YAxis(title='Total Fees (BTC)'))
fig=go.Figure(data=data, layout=layout)
iplot(fig, filename='stacked-bar')
weex (OP)
Legendary
*
Offline Offline

Activity: 1102
Merit: 1014



View Profile
January 19, 2017, 08:32:34 AM
 #22

I was curious how much revenue is coming from fees from the transactions which miners include in only the next one or two blocks.,

One problem with using OP's data (confirmation_times.csv) for this is that it uses a timestamp, instead of block #.  So there's no real way to know how many confirmations it took for a transaction to get mined.  But using the average of 1 block = 10 minutes, I should be able to get kind of close.  

What information this provides us is that miners receive ~%75 of their revenue from transactions they include within the first two blocks (well, within the first 20 minutes, to be technically accurate).


Very cool! Note that the latest data file does have # of blocks to confirm for each transaction which may simplify your script. That file is at http://www.filedropper.com/conftimes

I'll probably post another this weekend and will look to hook this up to http://bitcoinexchangerate.org/fees to generate the best fee estimates possible.
weex (OP)
Legendary
*
Offline Offline

Activity: 1102
Merit: 1014



View Profile
January 22, 2017, 04:27:17 AM
 #23

Another data file: http://www.filedropper.com/txdb2tar
gmaxwell
Staff
Legendary
*
Offline Offline

Activity: 4270
Merit: 8805



View Profile WWW
January 22, 2017, 06:04:18 AM
 #24

While not terribly useful, this scatter plot of # of blocks to confirm
I think thats exactly the opposite... fee rate vs time is not very useful:  Time just adds in the noise of a possion process with no influence by fees.

If you have a good estimator based on block count you can simply multiply it's predictions by the exponential distribution of interblock intervals and get a good estimator of time... but to take data that has been smeared by random block times and construct good estimates is hard due to all the injected noise.

In effect, your time data is heavily biased by random correlations with higher and lower fee intervals with luckier or less lucky block finding.

This remains true so long as people aren't turning hashrate on and off based on fees-- and so far as I know it, no one is today.

An interesting chart is a grid over n-blocks-wait and fee-rate,  then for each cell set a value of what percentage of transactions paying at least that fee rate were confirmed in at that number or fewer blocks.

How does your data handle transaction replacement?   How do you compute feerates for CPFP transactions?   One possibility is to only consider transactions which are not dependent on unconfirmed transactions and which have no children;  and similarly do not consider replacement or replaced transactions.
weex (OP)
Legendary
*
Offline Offline

Activity: 1102
Merit: 1014



View Profile
January 22, 2017, 08:49:24 AM
 #25

While not terribly useful, this scatter plot of # of blocks to confirm
I think thats exactly the opposite... fee rate vs time is not very useful:  Time just adds in the noise of a possion process with no influence by fees.

If you have a good estimator based on block count you can simply multiply it's predictions by the exponential distribution of interblock intervals and get a good estimator of time... but to take data that has been smeared by random block times and construct good estimates is hard due to all the injected noise.

In effect, your time data is heavily biased by random correlations with higher and lower fee intervals with luckier or less lucky block finding.

This remains true so long as people aren't turning hashrate on and off based on fees-- and so far as I know it, no one is today.

An interesting chart is a grid over n-blocks-wait and fee-rate,  then for each cell set a value of what percentage of transactions paying at least that fee rate were confirmed in at that number or fewer blocks.

How does your data handle transaction replacement?   How do you compute feerates for CPFP transactions?   One possibility is to only consider transactions which are not dependent on unconfirmed transactions and which have no children;  and similarly do not consider replacement or replaced transactions.

I understand blocks to confirm is more useful but just looking at that graph, I could see key information wasn't visible in 2d black and white. That's what I meant by that graph not being terribly useful. I like the grid chart idea.

A question that comes to mind with blocks is, has anyone done analysis to detect patterns in luck or transaction inclusion week by week? I suppose not many people are turning miners on and off due to weather, variable energy pricing over the day but I'd be interested to look into it.

The data doesn't take into account CPFP or dependencies within unconfirmed transactions at this point but it was a thought I had.

I've only started collecting some of this data and want to see what people throw on the wall in terms of analysis, then iterate on the data that's being collected to learn more. I feel like diversity in fee estimation strategies is important to defend against some entity trying to game those strategies.

animalspirit
Newbie
*
Offline Offline

Activity: 21
Merit: 0


View Profile
February 26, 2017, 08:43:42 AM
 #26


That link does not produce a download.   Would you happen to be have a recent update?
weex (OP)
Legendary
*
Offline Offline

Activity: 1102
Merit: 1014



View Profile
February 26, 2017, 08:25:56 PM
 #27


That link does not produce a download.   Would you happen to be have a recent update?
I've been focused on Rein and had a performance issue that caused the script to fall behind under heavy transaction volume.

Will see what's possible but note that the software is available and if you have a full node, you too can do it or hire someone to do it.
nyanhtet
Sr. Member
****
Offline Offline

Activity: 476
Merit: 250


Bawga


View Profile
February 26, 2017, 08:37:41 PM
 #28

bitcoin = miners rich scheme

  Develop Custom Decentralized Blockchain Applications in JavaScript with LISK!         
  Website | Blog | Reddit | Chat - Be part of the decentralized application movement!
  Lisk Delegate - Communitypool: 80% reward sharing delegate
Transparency Report
Pages: « 1 [2]  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!