organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
December 24, 2012, 05:51:39 AM Last edit: January 11, 2013, 10:36:37 AM by organofcorti |
|
0. Introduction
It could be suggested that the reward halving continues to produce a significant effect outside of the direct USD/BTC comparison, and certainly that's what I intuit. I suspect only a doubling of the MTGOX US$/BTC price or a halving of the network hashrate will bring the model back on track, but time will tell.

1. Models and datasets:
The model datasets have been collected into one paste to save time. Model estimates have likewise been aggregated.
- Forecast and canary model analysis
- All datasets
- All estimates

2. Results
Model.f1, the one-week forecast, recovered to within the 95% confidence interval for last week's forecast of this week. I think this shows the importance of including the network hashrate lag variables. Somewhat as expected, the Canary model is still modelling a hashrate much higher than the current network hashrate, which is outside that model's 95% confidence interval for the third week in a row.

http://organofcorti.blogspot.com/2012/12/weekly-network-forecast-24th-december.html
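To make the kind of model discussed here concrete, below is a minimal sketch (in Python, not the author's actual code) of a lagged log-linear regression of the sort described in this thread: weekly average network hashrate regressed on its own one-week lag and on lagged MTGOX USD/BTC price. The file name and column names are hypothetical placeholders.

Code:
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Assumed layout: one row per week with weekly average hashrate and price.
weekly = pd.read_csv("weekly_averages.csv")

df = pd.DataFrame({"log_H": np.log(weekly["hashrate"]),
                   "log_p": np.log(weekly["price"])})
df["lag1_log_H"] = df["log_H"].shift(1)   # previous week's log hashrate
df["lag1_log_p"] = df["log_p"].shift(1)   # previous week's log price
df["lag4_log_p"] = df["log_p"].shift(4)   # log price four weeks back
df = df.dropna()

# Ordinary least squares on the lagged terms, in the spirit of the forecast models here.
fit = smf.ols("log_H ~ lag1_log_H + lag1_log_p + lag4_log_p", data=df).fit()
print(fit.params)            # fitted coefficients
print(fit.conf_int(0.05))    # 95% confidence intervals for the coefficients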
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 05, 2013, 07:59:40 AM |
|
This week's update, report from the blog: http://organofcorti.blogspot.com.au/2013/01/weekly-network-forecast-31st-december.html

Weekly network forecast 31st December 2012

0. Introduction
I apologise for the delay in posting this update. The real world intervened in a most pleasant manner. I've added some more error information this week to make it easier to compare the forecast and actual network hashrates. The forecast models are already recovering, with all predictions within the estimated error range. It seems the reward halving has so far had only a temporary effect on the forecast models' accuracy.

1. Models and datasets:
The model datasets have been collected into one paste to save time. Model estimates have likewise been aggregated.
- Forecast and canary model analysis
- All datasets
- All estimates

2. Results
- Canary model (current hashrate estimate based only on current price and previous network hashrates): This model's error has recovered to 4% of the actual weekly average network hashrate this week. The Canary model was outside the expected range for only three weeks after the reward halving, so it seems to remain a good indicator of the onset of changes other than the MTGOX BTCUSD price and previous network hashrate averages.
- Model.f1 (one week forecast): Model.f1's error recovered to 5% of the weekly average network hashrate, almost as low an average error as in the weeks leading up to the reward halving. Hopefully the model will remain useful until the ASIC hashrates are added.
- Models f2, f3 and f4: errors are high and, although within the estimated 95% confidence interval for the error, these models are not yet useful for long-range forecasts. (A small sketch of the weekly error check appears below this list.)
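As a rough illustration of the weekly error check described above (not the author's actual analysis), this sketch computes each model's forecast error as a percentage of the actual weekly average hashrate and flags whether it sits inside an assumed 95% interval. The hashrate values and interval widths are hypothetical placeholders.

Code:
def percent_error(forecast, actual):
    """Forecast error as a percentage of the actual weekly average hashrate."""
    return 100.0 * abs(forecast - actual) / actual

actual_ghps = 24_800.0                                        # hypothetical weekly average, GH/s
forecasts_ghps = {"Canary": 25_800.0, "Model.f1": 23_600.0}   # hypothetical model outputs
ci_percent = {"Canary": 10.0, "Model.f1": 13.0}               # assumed 95% error bounds, percent

for name, f in forecasts_ghps.items():
    err = percent_error(f, actual_ghps)
    within = err <= ci_percent[name]
    print(f"{name}: {err:.1f}% error, within 95% CI: {within}")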
Donations help give me the time to analyse bitcoin mining related issues and write these posts. If you enjoy them or find them helpful, please consider a small bitcoin donation: 12QxPHEuxDrs7mCyGSx1iVSozTwtquDB3r
Thanks to the following for use of their data:
blockexplorer.com: 1Cvvr8AsCfbbVQ2xoWiFD1Gb2VRbGsEf28
blockchain.info
molecular: 1MoLECHau3nb9zV78uKraNdcZsJ2REwxuL
|
|
|
|
420
|
|
January 06, 2013, 01:15:55 AM |
|
Estimate without ASIC introduction... right?
|
Donations: 1JVhKjUKSjBd7fPXQJsBs5P3Yphk38AqPr - TIPS the hacks, the hacks, secure your bits!
|
|
|
|
Raize
Donator
Legendary
Offline
Activity: 1419
Merit: 1015
|
|
January 07, 2013, 06:03:36 PM |
|
Regardless of whether or not ASICs drop and change the way modelling estimates are done, someone needs to be doing this so those of us both trading and mining can have some sort of foresight. This is impressive work, organofcorti. Thanks for doing this.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 07, 2013, 10:23:10 PM |
|
Regardless of whether or not ASICs drop and change the way modelling estimates are done, someone needs to be doing this so those of us both trading and mining can have some sort of foresight. This is impressive work, organofcorti. Thanks for doing this.
I'm glad you find it useful. I thought the simplicity and accuracy of the models were interesting enough to follow up with a weekly update, but I wasn't sure anyone would be able to use the data, given how the errors can increase as the length of the forecast increases.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 08, 2013, 08:10:31 AM Last edit: January 09, 2013, 06:26:59 AM by organofcorti |
|
This week's update, report from the blog: http://organofcorti.blogspot.com/2013/01/weekly-network-forecast-7th-january-2013.html

Note: The charts are now taking up so much room I'm no longer posting them here; you can see them in all their glory at the blog.

Weekly network forecast 7th January 2013

0. Introduction
If you're happy with < 10% error in your forecasts, this week's errors were acceptably useful. Since (barring the changes due to the reward halving) accuracy has generally been good, it's possible to use the weekly hashrate forecasts to estimate the date of a retarget and a forecast of the retarget difficulty. For example, based on this week's model f1, f2, f3 and f4 weekly hashrate forecasts, we can forecast that the retarget at block 217728 will occur on 21st January and that difficulty will change to ~3657934 (a sketch of this calculation follows the results below). So for this week I've included:
- A table of current retarget date and difficulty estimates
- A table of previous retarget date and difficulty estimates, and the actual retarget dates and difficulties as a comparison
- A chart of the last twenty six weeks of retarget date and difficulty estimates, actual retarget dates and actual difficulties.
It should be noted that each retarget will often be forecast by two consecutive weekly forecasts, hence the multiple points per retarget on the new chart. Please post a comment if it's not clear.

1. Models and datasets:
The model datasets have been collected into one paste to save time. Model estimates have likewise been aggregated.
- Forecast and canary model analysis
- All datasets
- All estimates
- Difficulty data and estimates

2. Results
- Canary model (current hashrate estimate based only on current price and previous network hashrates): This model's error has recovered to 4% of the actual weekly average network hashrate this week. The Canary model was outside the expected range for only three weeks after the reward halving, so it seems to remain a good indicator of the onset of changes other than the MTGOX BTCUSD price and previous network hashrate averages.
- Model.f1 (one week forecast): Model.f1's error recovered to 5% of the weekly average network hashrate, almost as low an average error as in the weeks leading up to the reward halving. Hopefully the model will remain useful until the ASIC hashrates are added.
- Models f2, f3 and f4: errors are high and, although within the estimated 95% confidence interval for the error, these models are not yet useful for long-range forecasts.
- The large negative difficulty change after the block reward halving was (of course) not predicted and stands out as a clear error. However the new estimates look on target, and I think an estimate of Difficulty ~ 3.6 million after the next retarget is reasonable.
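As mentioned in the introduction, here is a rough sketch of how a weekly hashrate forecast can be turned into a retarget date and difficulty estimate. It assumes the usual relationships (difficulty 1 corresponds to about 2^32 expected hashes per block, and the retarget aims for a 600-second block time); the specific input numbers are hypothetical placeholders rather than the figures used in the post.

Code:
from datetime import datetime, timedelta

HASHES_AT_DIFF_1 = 2 ** 32      # expected hashes per block at difficulty 1
TARGET_BLOCK_TIME = 600         # seconds per block the retarget aims for

def retarget_estimate(current_difficulty, blocks_remaining, forecast_hashrate_hps, now):
    """Estimate the retarget date and next difficulty from a hashrate forecast."""
    secs_per_block = current_difficulty * HASHES_AT_DIFF_1 / forecast_hashrate_hps
    retarget_date = now + timedelta(seconds=blocks_remaining * secs_per_block)
    # If blocks are found at the forecast rate, the next difficulty settles at
    # whatever value makes the expected block time equal to 600 seconds.
    next_difficulty = forecast_hashrate_hps * TARGET_BLOCK_TIME / HASHES_AT_DIFF_1
    return retarget_date, next_difficulty

# Hypothetical inputs: current difficulty, blocks left until block 217728,
# forecast average hashrate in hashes per second, and today's date.
date, diff = retarget_estimate(3_250_000, 900, 25e12, datetime(2013, 1, 14))
print(date.date(), round(diff))

With the actual weekly forecast hashrates, block height and current difficulty plugged in, a calculation of this kind presumably gives estimates like the ~3657934 figure quoted above.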
|
|
|
|
qbits
|
|
January 11, 2013, 07:13:40 AM |
|
Below is the original short range forecast post. Long range forecast posts start here.
- Model 0: log(H) ~ 1.74 + 0.94*lag1(log(H)) + 0.21*lag1(log(p)) - 0.14*lag4(log(p)). This will be within +/- 15.1% of the actual network hashrate with 95% confidence.
You are modeling/training your forecast function using the same data that you use to verify its forecast accuracy. This is not a proper way to create a data model, as all you get is a function that is able to accurately model past price movements, and you have no information on its accuracy. The proper way is to split the historical pricing data into at least two sets:
- a learning set, from which you deduce your function parameters. You could use Jan-Jun 2012 price data, for example.
- a test set, which you use to verify the function's accuracy, that is, how successfully it models price data. You could use Jul-Dec 2012, for example.
- possibly more test sets, 2011 data for example.
If you were to do what I suggested, you would find that your model is not very good at modeling price data and hence a very bad predictor/forecaster of future data. This is not your fault; it's just that you have taken on a very, very difficult problem to solve...
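For what it's worth, here is a minimal sketch of the split being suggested in this post, assuming the same hypothetical weekly data layout as in the earlier sketch: fit the lag model on a Jan-Jun 2012 learning window and measure error only on a held-out Jul-Dec 2012 test window.

Code:
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

weekly = pd.read_csv("weekly_averages.csv", parse_dates=["week"])   # assumed layout
df = pd.DataFrame({"week": weekly["week"],
                   "log_H": np.log(weekly["hashrate"]),
                   "log_p": np.log(weekly["price"])})
df["lag1_log_H"] = df["log_H"].shift(1)
df["lag1_log_p"] = df["log_p"].shift(1)
df["lag4_log_p"] = df["log_p"].shift(4)
df = df.dropna()

# Learning set: Jan-Jun 2012.  Test set: Jul-Dec 2012.
train = df[(df["week"] >= "2012-01-01") & (df["week"] < "2012-07-01")]
test = df[(df["week"] >= "2012-07-01") & (df["week"] < "2013-01-01")]

fit = smf.ols("log_H ~ lag1_log_H + lag1_log_p + lag4_log_p", data=train).fit()

# Out-of-sample error on the test window, as a percentage of actual hashrate.
pred = np.exp(fit.predict(test))
actual = np.exp(test["log_H"])
pct_err = 100 * np.abs(pred - actual) / actual
print(f"median out-of-sample error: {pct_err.median():.1f}%")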
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 11, 2013, 07:41:56 AM |
|
Below is the original short range forecast post. Long range forecast posts start here.
- Model 0: log(H) ~ 1.74 + 0.94*lag1(log(H)) + 0.21*lag1(log(p)) - 0.14*lag4(log(p)). This will be within +/- 15.1% of the actual network hashrate with 95% confidence.
You are modeling/training your forecast function using the same data that you use to verify its forecast accuracy. This is not a proper way to create a data model, as all you get is a function that is able to accurately model past price movements, and you have no information on its accuracy. The proper way is to split the historical pricing data into at least two sets:
- a learning set, from which you deduce your function parameters. You could use Jan-Jun 2012 price data, for example.
- a test set, which you use to verify the function's accuracy, that is, how successfully it models price data. You could use Jul-Dec 2012, for example.
- possibly more test sets, 2011 data for example.

Thanks for your feedback. I've always been aware that I might have been overfitting. I couldn't split the data - there's just so little data to be had. Also, various points in time have had slightly differing auto- and cross-correlations, and I wanted to be able to produce a simple linear model that could account for all the data, especially since I'm only using two variables out of all the variables that can affect the network hashrate from time to time. Instead I decided to use all the data I had, fitting a linear model with a minimum of coefficients and lagged variables. Since then I have applied the model weekly: imagine the initial post as the "training" phase and my current posts as the "test set". So far the model has been predicting future network hashrate - and future network mining difficulty - far better than I expected.

Also, it seems you're assuming I'm modelling price. I'm not modelling price data - I'm using lagged historical price data (and lagged historical network hashrate data) to provide a forecast of the network hashrate. There is no modelling of price at all, just a 1 to 4 week forecast of the network hashrate, with confidence intervals for the error (which I'll be the first to admit are quite large for the 3 and 4 week forecasts).

If you were to do what I suggested, you would find that your model is not very good at modeling price data and hence a very bad predictor/forecaster of future data.
This is not your fault, it's just that you have taken on a very very difficult problem to solve...
However, as I mentioned, I'm not modelling price data. I've used genetic algorithms with the ADF test to do that before with some success - but over time it was not enough to beat the bid/ask spread, so I'll leave that to the finance geeks. If you look at my weekly update posts, you'll see I'm assessing the models as I go - not changing them, just assessing them. So far they have performed as expected - within the 95% CI for error, except when the reward halving occurred (a variable I cannot account for in a simple linear/lag function).

If I haven't explained this well, take a look at the long range forecast post:
http://organofcorti.blogspot.com/2012/12/104-long-range-forecasts-of-network.html
and the most recent update post:
http://organofcorti.blogspot.com/2013/01/weekly-network-forecast-7th-january-2013.html
I would be interested to read what you think.
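For readers following along, here is a worked sketch of how the quoted Model 0 produces a one-week-ahead forecast: plug the lagged log hashrate and log prices into the fitted equation and exponentiate. The coefficients and the +/- 15.1% band come from the quoted post; the input values below are hypothetical placeholders, and the intercept ties the formula to whatever units the model was originally fitted in.

Code:
import math

def model0_forecast(lag1_hashrate, lag1_price, lag4_price):
    """One-week-ahead hashrate forecast from the quoted Model 0 coefficients."""
    log_H = (1.74
             + 0.94 * math.log(lag1_hashrate)
             + 0.21 * math.log(lag1_price)
             - 0.14 * math.log(lag4_price))
    return math.exp(log_H)

point = model0_forecast(lag1_hashrate=2.5e13,   # placeholder: hashes per second last week
                        lag1_price=13.5,        # placeholder: USD/BTC last week
                        lag4_price=13.0)        # placeholder: USD/BTC four weeks ago
# The quoted post gives a +/- 15.1% band at 95% confidence around the point forecast.
low, high = point * (1 - 0.151), point * (1 + 0.151)
print(f"forecast: {point:.3e} H/s, 95% band roughly {low:.3e} to {high:.3e}")

Applied each week to the newest lagged values without refitting, this is essentially the weekly out-of-sample assessment described above.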
|
|
|
|
qbits
|
|
January 11, 2013, 11:08:27 PM |
|
First, let me correct what I said in the earlier post when I spoke of price history data: I meant difficulty history data. Indeed, you are modeling difficulty and not price; I must have confused the two. I read your report and found this:
http://2.bp.blogspot.com/-YgYQ-OX3S5E/UOvLIj9-6SI/AAAAAAAAEoQ/drCPQ70jf7U/s1600/DifficultyForecast.2013-01-08.png
As soon as you try to forecast data any significant distance into the future, your model is way off the real data. Anyway, splitting data into learning and test sets is a must. That's the only way to test the validity of the model with any confidence. Another suggestion would be to try other modelling techniques, such as neural networks. You can find information about all of these in books covering machine learning...
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 12, 2013, 02:00:14 PM |
|
As mentioned in the post, I expected the model to fail when the block reward halving occurred. In the chart to which you link, I assume you refer to the inability of the network hashrate models to predict difficulty retargets correctly? I think it's done a rather good job. The only prediction that has been significantly different from the actual difficulty retarget is the one immediately after the reward halving. The other difficulty predictions (after making the model in November) have been better than I expected, since I'm not modelling difficulty directly and can't provide a confidence interval for it.

Anyway splitting data into learning and test sets is a must. That's the only way to test the validity of the model with any confidence.
Can you provide a source for this assertion? I've found no books on ARIMA modelling that make such a claim - that training and testing sets are necessary for forecasts of auto- and cross-correlative models. These sets might be necessary when using symbolic regression and (sometimes) when using genetic algorithms to model time series data, but it's not mentioned in the ARIMA texts I've read.

Do keep in mind that in (for example) a univariate series of one hundred weekly average hashrate data points, we have an unknown number of known independent variables (historical price and hashrate data) and an unknown number of unknown independent variables. Some of those unknown variables have been / will be rare, but have had / will have a significant effect on the network hashrate which cannot be accounted for by the model. Splitting the data into two smaller sets risks those unknown variables having a significant effect on the accuracy of the model.

Another suggestion would be to try other modelling techniques such as neural networks. You can find information about all these in books covering Machine Learning...
That could be fun, but I'm not trying to provide the most accurate forecast possible - I'm using the simplest possible method to achieve an aim, explaining how it's done, and hopefully interesting some readers in trying it for themselves - or going one step further. Anyone with a basic level of math and coding skills should be able to replicate my work and at the same time know what they're doing. If my work encourages someone to develop a truly accurate forecast, I'll be very happy. In the meantime, I provide 95% confidence intervals for the hashrate forecasts based on historical data, and so far the forecasts have exceeded the confidence interval only when an unknown variable (the reward halving) had an effect on the network. I'm surprised by the model's rapid recovery after the block halving. We'll just have to see how it goes. Care to make a wager?
|
|
|
|
qbits
|
|
January 12, 2013, 02:27:45 PM |
|
Can you provide a source for this assertion? I've found no books on ARIMA modelling that make such a claim - that training and testing sets are necessary for forecasts of auto- and cross-correlative models. These sets might be necessary when using symbolic regression and (sometimes) when using genetic algorithms to model time series data, but it's not mentioned in the ARIMA texts I've read.

In the meantime, I provide 95% confidence intervals for the hashrate forecasts based on historical data, and so far the forecasts have exceeded the confidence interval only when an unknown variable (the reward halving) had an effect on the network. I'm surprised by the model's rapid recovery after the block halving. We'll just have to see how it goes. Care to make a wager?

ad 1. Start here: http://en.wikipedia.org/wiki/Cross-validation_(statistics). Father Google will provide further references and book recommendations.
ad 2. How far into the future are you forecasting?
ad 3. Sure! I'll bet you 1 BTC.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 14, 2013, 06:27:21 AM |
|
No mention that it's the only way to test the validity of a model with confidence, just that it is one method.

ad 2. How far into the future are you forecasting?
Model.f1 forecasts 1 week ahead.
Model.f2 forecasts 2 weeks ahead.
Model.f3 forecasts 3 weeks ahead.
Model.f4 forecasts 4 weeks ahead.
The Canary model attempts to detect the effects of unknown variables.

ad 3. Sure! I'll bet you 1 BTC.

OK - I'll bet 1 BTC that for the next 5 weeks the actual weekly average network hashrate will be within 13% of the Model.f1 prediction, unless the Canary model indicates an external (non-hashrate or non-price) influence. Now, who will be the third-party escrow?
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 14, 2013, 08:50:28 AM |
|
Note: The charts are now taking up so much room I'm no longer posting them here; you can see them in all their glory at the blog.

Weekly network forecast 14th January 2013

0. Introduction
The Canary model error for this week turns out to be more than the 95% confidence interval for error, and well outside the 95% confidence interval for the network hashrate estimate. This possibly implies the effect of an external influence, or is more likely due to the 3.2% increase in price coupled with an unexpected 7.7% decrease in the weekly average network hashrate.
- If there is some sort of external influence which has caused some proportion of miners to switch off suddenly, then the model recovery will likely be slow, as it was after the block reward halving.
- If the drop in hashrate with an increase in price is a random event, models should recover to within the expected range next week.
- If the last two weeks were anomalous and the block reward halving is having a continued effect on the network, then model error will continue to be outside the 95% confidence interval for error for an unknown period of time.
|
|
|
|
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
January 18, 2013, 03:42:24 AM |
|
I'm guessing that the 20 TH/s brought by Avalon will have the network hash rate up to 40 TH/s in a week or two, since anyone who has one will be mining with it.
|
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
March 25, 2013, 10:29:50 AM |
|
|
|
|
|
Dalkore
Legendary
Offline
Activity: 1330
Merit: 1026
Mining since 2010 & Hosting since 2012
|
|
March 25, 2013, 04:39:19 PM |
|
were they right
Were who right? If you mean Dalkore, then no, he wasn't right. My estimates were much closer.

Yes, please take his estimates. I was just giving commentary. I follow his work and he is really working on refining his numbers.
|
Hosting: Low as $60.00 per KW - LinkTransaction List: jayson3 +5 - ColdHardMetal +3 - Nolo +2 - CoinHoarder +1 - Elxiliath +1 - tymm0 +1 - Johnniewalker +1 - Oscer +1 - Davidj411 +1 - BitCoiner2012 +1 - dstruct2k +1 - Philj +1 - camolist +1 - exahash +1 - Littleshop +1 - Severian +1 - DebitMe +1 - lepenguin +1 - StringTheory +1 - amagimetals +1 - jcoin200 +1 - serp +1 - klintay +1 - -droid- +1 - FlutterPie +1
|
|
|
creativex
|
|
March 26, 2013, 02:32:47 PM |
|
Fantastic work organofcorti. Thanks for the update.
Tip inbound.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
March 26, 2013, 02:49:38 PM |
|
Fantastic work organofcorti. Thanks for the update.
Tip inbound.
Thank you very much, creativex. And as a special "thank you" treat, you get:
|
|
|
|
Rawted
|
|
March 27, 2013, 06:07:37 AM |
|
I'm late to the party, but this is fantastic work.
|
|
|
|
|