organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
December 24, 2012, 05:51:39 AM Last edit: January 11, 2013, 10:36:37 AM by organofcorti |
|
0. Introduction
It could be suggested that the reward halving continues to produce a significant effect outside of the direct USD/BTC comparison, and certainly that's what I intuit. I suspect only a doubling of the MTGOX US$/BTC price or a halving of the network hashrate will bring the model back on track, but time will tell.

1. Models and datasets:
The model datasets have been collected into one paste to save time. Model estimates have likewise been aggregated.
- Forecast and canary model analysis
- All datasets
- All estimates

2. Results
Model.f1, the one-week forecast, recovered to within the 95% confidence interval for last week's forecast of this week. I think this shows the importance of including the network hashrate lag variables. Somewhat as expected, the Canary model is still modelling a hashrate much higher than the current network hashrate, which is outside that model's 95% confidence interval for the third week in a row.

http://organofcorti.blogspot.com/2012/12/weekly-network-forecast-24th-december.html
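To make the kind of model discussed here concrete, below is a minimal sketch (in Python, not the author's actual code) of a lagged log-linear regression of the sort described in this thread: weekly average network hashrate regressed on its own one-week lag and on lagged MTGOX USD/BTC price. The file name and column names are hypothetical placeholders.

Code:
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Assumed layout: one row per week with weekly average hashrate and price.
weekly = pd.read_csv("weekly_averages.csv")

df = pd.DataFrame({"log_H": np.log(weekly["hashrate"]),
                   "log_p": np.log(weekly["price"])})
df["lag1_log_H"] = df["log_H"].shift(1)   # previous week's log hashrate
df["lag1_log_p"] = df["log_p"].shift(1)   # previous week's log price
df["lag4_log_p"] = df["log_p"].shift(4)   # log price four weeks back
df = df.dropna()

# Ordinary least squares on the lagged terms, in the spirit of the forecast models here.
fit = smf.ols("log_H ~ lag1_log_H + lag1_log_p + lag4_log_p", data=df).fit()
print(fit.params)            # fitted coefficients
print(fit.conf_int(0.05))    # 95% confidence intervals for the coefficients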
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 05, 2013, 07:59:40 AM |
|
This week's update, report from the blog: http://organofcorti.blogspot.com.au/2013/01/weekly-network-forecast-31st-december.html

Weekly network forecast 31st December 2012

0. Introduction
I apologise for the delay in posting this update. The real world intervened in a most pleasant manner. I've added some more error information this week to make it easier to compare the forecast and actual network hashrates. The forecast models are already recovering, with all predictions within the estimated error range. It seems the reward halving has so far had only a temporary effect on the forecast models' accuracy.

1. Models and datasets:
The model datasets have been collected into one paste to save time. Model estimates have likewise been aggregated.
- Forecast and canary model analysis
- All datasets
- All estimates

2. Results
- Canary model (current hashrate estimate based only on current price and previous network hashrates): This model's error has recovered to 4% of the actual weekly average network hashrate this week. The Canary model was outside the expected range for only three weeks after the reward halving, so it seems to remain a good indicator of the onset of changes other than the MTGOX BTCUSD price and previous network hashrate averages.
- Model.f1 (one week forecast): Model.f1's error recovered to 5% of the weekly average network hashrate, almost as low an average error as in the weeks leading up to the reward halving. Hopefully the model will remain useful until the ASIC hashrates are added.
- Models f2, f3 and f4: errors are high and, although within the estimated 95% confidence interval for the error, these models are not yet useful for long-range forecasts. (A small sketch of the weekly error check appears below this list.)
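As a rough illustration of the weekly error check described above (not the author's actual analysis), this sketch computes each model's forecast error as a percentage of the actual weekly average hashrate and flags whether it sits inside an assumed 95% interval. The hashrate values and interval widths are hypothetical placeholders.

Code:
def percent_error(forecast, actual):
    """Forecast error as a percentage of the actual weekly average hashrate."""
    return 100.0 * abs(forecast - actual) / actual

actual_ghps = 24_800.0                                        # hypothetical weekly average, GH/s
forecasts_ghps = {"Canary": 25_800.0, "Model.f1": 23_600.0}   # hypothetical model outputs
ci_percent = {"Canary": 10.0, "Model.f1": 13.0}               # assumed 95% error bounds, percent

for name, f in forecasts_ghps.items():
    err = percent_error(f, actual_ghps)
    within = err <= ci_percent[name]
    print(f"{name}: {err:.1f}% error, within 95% CI: {within}")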
Donations help give me the time to analyse bitcoin mining related issues and write these posts. If you enjoy them or find them helpful, please consider a small bitcoin donation: 12QxPHEuxDrs7mCyGSx1iVSozTwtquDB3r
Thanks to the following for use of their data:
blockexplorer.com: 1Cvvr8AsCfbbVQ2xoWiFD1Gb2VRbGsEf28
blockchain.info
molecular: 1MoLECHau3nb9zV78uKraNdcZsJ2REwxuL
|
|
|
|
420
|
|
January 06, 2013, 01:15:55 AM |
|
Estimate without ASIC introduction... right?
|
Donations: 1JVhKjUKSjBd7fPXQJsBs5P3Yphk38AqPr - TIPS the hacks, the hacks, secure your bits!
|
|
|
|
Raize
Donator
Legendary
Offline
Activity: 1419
Merit: 1015
|
|
January 07, 2013, 06:03:36 PM |
|
Regardless of whether or not ASICs drop and change the way modelling estimates are done, someone needs to be doing this so those of us both trading and mining can have some sort of foresight. This is impressive work, organofcorti. Thanks for doing this.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 07, 2013, 10:23:10 PM |
|
Regardless of whether or not ASICs drop and change the way modelling estimates are done, someone needs to be doing this so those of us both trading and mining can have some sort of foresight. This is impressive work, organofcorti. Thanks for doing this.
I'm glad you find it useful. I thought the simplicity and accuracy of the models were interesting enough to follow up with a weekly update, but I wasn't sure anyone would be able to use the data, given how the errors can increase as the length of the forecast increases.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 08, 2013, 08:10:31 AM Last edit: January 09, 2013, 06:26:59 AM by organofcorti |
|
This week's update, report from the blog: http://organofcorti.blogspot.com/2013/01/weekly-network-forecast-7th-january-2013.html

Note: The charts are now taking up so much room I'm no longer posting them here; you can see them in all their glory at the blog.

Weekly network forecast 7th January 2013

0. Introduction
If you're happy with < 10% error in your forecasts, this week's errors were acceptably useful. Since (barring the changes due to the reward halving) accuracy has generally been good, it's possible to use the weekly hashrate forecasts to estimate the date of a retarget and a forecast of the retarget difficulty. For example, based on this week's model f1, f2, f3 and f4 weekly hashrate forecasts, we can forecast that the retarget at block 217728 will occur on 21st January and that difficulty will change to ~3657934 (a sketch of this calculation follows the results below). So for this week I've included:
- A table of current retarget date and difficulty estimates
- A table of previous retarget date and difficulty estimates, and the actual retarget dates and difficulties as a comparison
- A chart of the last twenty six weeks of retarget date and difficulty estimates, actual retarget dates and actual difficulties.
It should be noted that each retarget will often be forecast by two consecutive weekly forecasts, hence the multiple points per retarget on the new chart. Please post a comment if it's not clear.

1. Models and datasets:
The model datasets have been collected into one paste to save time. Model estimates have likewise been aggregated.
- Forecast and canary model analysis
- All datasets
- All estimates
- Difficulty data and estimates

2. Results
- Canary model (current hashrate estimate based only on current price and previous network hashrates): This model's error has recovered to 4% of the actual weekly average network hashrate this week. The Canary model was outside the expected range for only three weeks after the reward halving, so it seems to remain a good indicator of the onset of changes other than the MTGOX BTCUSD price and previous network hashrate averages.
- Model.f1 (one week forecast): Model.f1's error recovered to 5% of the weekly average network hashrate, almost as low an average error as in the weeks leading up to the reward halving. Hopefully the model will remain useful until the ASIC hashrates are added.
- Models f2, f3 and f4: errors are high and, although within the estimated 95% confidence interval for the error, these models are not yet useful for long-range forecasts.
- The large negative difficulty change after the block reward halving was (of course) not predicted and stands out as a clear error. However the new estimates look on target, and I think an estimate of Difficulty ~ 3.6 million after the next retarget is reasonable.
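As mentioned in the introduction, here is a rough sketch of how a weekly hashrate forecast can be turned into a retarget date and difficulty estimate. It assumes the usual relationships (difficulty 1 corresponds to about 2^32 expected hashes per block, and the retarget aims for a 600-second block time); the specific input numbers are hypothetical placeholders rather than the figures used in the post.

Code:
from datetime import datetime, timedelta

HASHES_AT_DIFF_1 = 2 ** 32      # expected hashes per block at difficulty 1
TARGET_BLOCK_TIME = 600         # seconds per block the retarget aims for

def retarget_estimate(current_difficulty, blocks_remaining, forecast_hashrate_hps, now):
    """Estimate the retarget date and next difficulty from a hashrate forecast."""
    secs_per_block = current_difficulty * HASHES_AT_DIFF_1 / forecast_hashrate_hps
    retarget_date = now + timedelta(seconds=blocks_remaining * secs_per_block)
    # If blocks are found at the forecast rate, the next difficulty settles at
    # whatever value makes the expected block time equal to 600 seconds.
    next_difficulty = forecast_hashrate_hps * TARGET_BLOCK_TIME / HASHES_AT_DIFF_1
    return retarget_date, next_difficulty

# Hypothetical inputs: current difficulty, blocks left until block 217728,
# forecast average hashrate in hashes per second, and today's date.
date, diff = retarget_estimate(3_250_000, 900, 25e12, datetime(2013, 1, 14))
print(date.date(), round(diff))

With the actual weekly forecast hashrates, block height and current difficulty plugged in, a calculation of this kind presumably gives estimates like the ~3657934 figure quoted above.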
|
|
|
|
qbits
|
|
January 11, 2013, 07:13:40 AM |
|
Below is the original short range forecast post. Long range forecast posts start here.
- Model 0: log(H) ~ 1.74 + 0.94*lag1(log(H)) + 0.21*lag1(log(p)) - 0.14*lag4(log(p)). This will be within +/- 15.1% of the actual network hashrate with 95% confidence.
You are modeling/training your forecast function using the same data that you use to verify its forecast accuracy. This is not a proper way to create a data model, as all you get is a function that is able to accurately model past price movements, and you have no information on its accuracy. The proper way is to split the historical pricing data into at least two sets:
- a learning set, from which you deduce your function parameters. You could use Jan-Jun 2012 price data, for example.
- a test set, which you use to verify the function's accuracy, that is, how successfully it models price data. You could use Jul-Dec 2012, for example.
- possibly more test sets, 2011 data for example.
If you were to do what I suggested, you would find that your model is not very good at modeling price data and hence a very bad predictor/forecaster of future data. This is not your fault; it's just that you have taken on a very, very difficult problem to solve...
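For what it's worth, here is a minimal sketch of the split being suggested in this post, assuming the same hypothetical weekly data layout as in the earlier sketch: fit the lag model on a Jan-Jun 2012 learning window and measure error only on a held-out Jul-Dec 2012 test window.

Code:
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

weekly = pd.read_csv("weekly_averages.csv", parse_dates=["week"])   # assumed layout
df = pd.DataFrame({"week": weekly["week"],
                   "log_H": np.log(weekly["hashrate"]),
                   "log_p": np.log(weekly["price"])})
df["lag1_log_H"] = df["log_H"].shift(1)
df["lag1_log_p"] = df["log_p"].shift(1)
df["lag4_log_p"] = df["log_p"].shift(4)
df = df.dropna()

# Learning set: Jan-Jun 2012.  Test set: Jul-Dec 2012.
train = df[(df["week"] >= "2012-01-01") & (df["week"] < "2012-07-01")]
test = df[(df["week"] >= "2012-07-01") & (df["week"] < "2013-01-01")]

fit = smf.ols("log_H ~ lag1_log_H + lag1_log_p + lag4_log_p", data=train).fit()

# Out-of-sample error on the test window, as a percentage of actual hashrate.
pred = np.exp(fit.predict(test))
actual = np.exp(test["log_H"])
pct_err = 100 * np.abs(pred - actual) / actual
print(f"median out-of-sample error: {pct_err.median():.1f}%")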
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 11, 2013, 07:41:56 AM |
|
Below is the original short range forecast post. Long range forecast posts start here.
- Model 0: log(H) ~ 1.74 + 0.94*lag1(log(H)) + 0.21*lag1(log(p)) - 0.14*lag4(log(p)). This will be within +/- 15.1% of the actual network hashrate with 95% confidence.
You are modeling/training your forecast function using the same data that you use to verify its forecast accuracy. This is not a proper way to create a data model, as all you get is a function that is able to accurately model past price movements, and you have no information on its accuracy. The proper way is to split the historical pricing data into at least two sets:
- a learning set, from which you deduce your function parameters. You could use Jan-Jun 2012 price data, for example.
- a test set, which you use to verify the function's accuracy, that is, how successfully it models price data. You could use Jul-Dec 2012, for example.
- possibly more test sets, 2011 data for example.

Thanks for your feedback. I've always been aware that I might have been overfitting. I couldn't split the data - there's just so little data to be had. Also, various points in time have had slightly differing auto- and cross-correlations, and I wanted to be able to produce a simple linear model that could account for all the data, especially since I'm only using two variables out of all the variables that can affect the network hashrate from time to time. Instead I decided to use all the data I had, fitting a linear model with a minimum of coefficients and lagged variables. Since then I have applied the model weekly: imagine the initial post as the "training" phase and my current posts as the "test set". So far the model has been predicting future network hashrate - and future network mining difficulty - far better than I expected.

Also, it seems you're assuming I'm modelling price. I'm not modelling price data - I'm using lagged historical price data (and lagged historical network hashrate data) to provide a forecast of the network hashrate. There is no modelling of price at all, just a 1 to 4 week forecast of the network hashrate, with confidence intervals for the error (which I'll be the first to admit are quite large for the 3 and 4 week forecasts).

If you were to do what I suggested, you would find that your model is not very good at modeling price data and hence a very bad predictor/forecaster of future data.
This is not your fault, it's just that you have taken on a very very difficult problem to solve...
However, as I mentioned, I'm not modelling price data. I've used genetic algorithms with the ADF test to do that before with some success - but over time it was not enough to beat the bid/ask spread, so I'll leave that to the finance geeks. If you look at my weekly update posts, you'll see I'm assessing the models as I go - not changing them, just assessing them. So far they have performed as expected - within the 95% CI for error, except when the reward halving occurred (a variable I cannot account for in a simple linear/lag function).

If I haven't explained this well, take a look at the long range forecast post:
http://organofcorti.blogspot.com/2012/12/104-long-range-forecasts-of-network.html
and the most recent update post:
http://organofcorti.blogspot.com/2013/01/weekly-network-forecast-7th-january-2013.html
I would be interested to read what you think.
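For readers following along, here is a worked sketch of how the quoted Model 0 produces a one-week-ahead forecast: plug the lagged log hashrate and log prices into the fitted equation and exponentiate. The coefficients and the +/- 15.1% band come from the quoted post; the input values below are hypothetical placeholders, and the intercept ties the formula to whatever units the model was originally fitted in.

Code:
import math

def model0_forecast(lag1_hashrate, lag1_price, lag4_price):
    """One-week-ahead hashrate forecast from the quoted Model 0 coefficients."""
    log_H = (1.74
             + 0.94 * math.log(lag1_hashrate)
             + 0.21 * math.log(lag1_price)
             - 0.14 * math.log(lag4_price))
    return math.exp(log_H)

point = model0_forecast(lag1_hashrate=2.5e13,   # placeholder: hashes per second last week
                        lag1_price=13.5,        # placeholder: USD/BTC last week
                        lag4_price=13.0)        # placeholder: USD/BTC four weeks ago
# The quoted post gives a +/- 15.1% band at 95% confidence around the point forecast.
low, high = point * (1 - 0.151), point * (1 + 0.151)
print(f"forecast: {point:.3e} H/s, 95% band roughly {low:.3e} to {high:.3e}")

Applied each week to the newest lagged values without refitting, this is essentially the weekly out-of-sample assessment described above.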
|
|
|
|
qbits
|
|
January 11, 2013, 11:08:27 PM |
|
First, let me correct what I said in the earlier post when I spoke of price history data: I meant difficulty history data. Indeed, you are modeling difficulty and not price; I must have confused the two. I read your report and found this:
http://2.bp.blogspot.com/-YgYQ-OX3S5E/UOvLIj9-6SI/AAAAAAAAEoQ/drCPQ70jf7U/s1600/DifficultyForecast.2013-01-08.png
As soon as you try to forecast data any significant distance into the future, your model is way off the real data. Anyway, splitting data into learning and test sets is a must. That's the only way to test the validity of the model with any confidence. Another suggestion would be to try other modelling techniques, such as neural networks. You can find information about all of these in books covering machine learning...
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 12, 2013, 02:00:14 PM |
|
As mentioned in the post, I expected the model to fail when the block reward halving occurred. In the chart to which you link, I assume you refer to the inability of the network hashrate models to predict difficulty retargets correctly? I think it's done a rather good job. The only prediction that has been significantly different from the actual difficulty retarget is the one immediately after the reward halving. The other difficulty predictions (after making the model in November) have been better than I expected, since I'm not modelling difficulty directly and can't provide a confidence interval for it.

Anyway splitting data into learning and test sets is a must. That's the only way to test the validity of the model with any confidence.
Can you provide a source for this assertion? I've found no books on ARIMA modelling that make such a claim - that training and testing sets are necessary for forecasts of auto- and cross-correlative models. These sets might be necessary when using symbolic regression and (sometimes) when using genetic algorithms to model time series data, but it's not mentioned in the ARIMA texts I've read.

Do keep in mind that in (for example) a univariate series of one hundred weekly average hashrate data points, we have an unknown number of known independent variables (historical price and hashrate data) and an unknown number of unknown independent variables. Some of those unknown variables have been / will be rare, but have had / will have a significant effect on the network hashrate which cannot be accounted for by the model. Splitting the data into two smaller sets risks those unknown variables having a significant effect on the accuracy of the model.

Another suggestion would be to try other modelling techniques such as neural networks. You can find information about all these in books covering Machine Learning...
That could be fun, but I'm not trying to provide the most accurate forecast possible - I'm using the simplest possible method to achieve an aim, explaining how it's done, and hopefully interesting some readers in trying it for themselves - or going one step further. Anyone with a basic level of math and coding skills should be able to replicate my work and at the same time know what they're doing. If my work encourages someone to develop a truly accurate forecast, I'll be very happy. In the meantime, I provide 95% confidence intervals for the hashrate forecasts based on historical data, and so far the forecasts have exceeded the confidence interval only when an unknown variable (the reward halving) had an effect on the network. I'm surprised by the model's rapid recovery after the block halving. We'll just have to see how it goes. Care to make a wager?
|
|
|
|
qbits
|
|
January 12, 2013, 02:27:45 PM |
|
Can you provide a source for this assertion? I've found no books on ARIMA modelling that make such a claim - that training and testing sets are necessary for forecasts of auto- and cross-correlative models. These sets might be necessary when using symbolic regression and (sometimes) when using genetic algorithms to model time series data, but it's not mentioned in the ARIMA texts I've read.

In the meantime, I provide 95% confidence intervals for the hashrate forecasts based on historical data, and so far the forecasts have exceeded the confidence interval only when an unknown variable (the reward halving) had an effect on the network. I'm surprised by the model's rapid recovery after the block halving. We'll just have to see how it goes. Care to make a wager?

ad 1. Start here: http://en.wikipedia.org/wiki/Cross-validation_(statistics). Father Google will provide further references and book recommendations.
ad 2. How far into the future are you forecasting?
ad 3. Sure! I'll bet you 1 BTC.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 14, 2013, 06:27:21 AM |
|
No mention that it's the only way to test the validity of a model with confidence, just that it is one method.

ad 2. How far into the future are you forecasting?
Model.f1 forecasts 1 week ahead.
Model.f2 forecasts 2 weeks ahead.
Model.f3 forecasts 3 weeks ahead.
Model.f4 forecasts 4 weeks ahead.
The Canary model attempts to detect the effects of unknown variables.

ad 3. Sure! I'll bet you 1 BTC.

OK - I'll bet 1 BTC that for the next 5 weeks the actual weekly average network hashrate will be within 13% of the Model.f1 prediction, unless the Canary model indicates an external (non-hashrate or non-price) influence. Now, who will be the third-party escrow?
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
January 14, 2013, 08:50:28 AM |
|
Note: The charts are now taking up so much room I'm no longer posting them here; you can see them in all their glory at the blog.

Weekly network forecast 14th January 2013

0. Introduction
The Canary model error for this week turns out to be more than the 95% confidence interval for error, and well outside the 95% confidence interval for the network hashrate estimate. This possibly implies the effect of an external influence, or is more likely due to the 3.2% increase in price coupled with an unexpected 7.7% decrease in the weekly average network hashrate.
- If there is some sort of external influence which has caused some proportion of miners to switch off suddenly, then the model recovery will likely be slow, as it was after the block reward halving.
- If the drop in hashrate with an increase in price is a random event, models should recover to within the expected range next week.
- If the last two weeks were anomalous and the block reward halving is having a continued effect on the network, then model error will continue to be outside the 95% confidence interval for error for an unknown period of time.
|
|
|
|
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
January 18, 2013, 03:42:24 AM |
|
I'm guessing that the 20 TH/s brought by Avalon will have the network hash rate up to 40 TH/s in a week or two, since anyone who has one will be mining with it.
|
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
March 25, 2013, 10:29:50 AM |
|
|
|
|
|
Dalkore
Legendary
Offline
Activity: 1330
Merit: 1026
Mining since 2010 & Hosting since 2012
|
|
March 25, 2013, 04:39:19 PM |
|
were they right
Were who right? If you mean Dalkore, then no, he wasn't right. My estimates were much closer.

Yes, please take his estimates. I was just giving commentary. I follow his work and he is really working on refining his numbers.
|
Hosting: Low as $60.00 per KW - LinkTransaction List: jayson3 +5 - ColdHardMetal +3 - Nolo +2 - CoinHoarder +1 - Elxiliath +1 - tymm0 +1 - Johnniewalker +1 - Oscer +1 - Davidj411 +1 - BitCoiner2012 +1 - dstruct2k +1 - Philj +1 - camolist +1 - exahash +1 - Littleshop +1 - Severian +1 - DebitMe +1 - lepenguin +1 - StringTheory +1 - amagimetals +1 - jcoin200 +1 - serp +1 - klintay +1 - -droid- +1 - FlutterPie +1
|
|
|
creativex
|
|
March 26, 2013, 02:32:47 PM |
|
Fantastic work organofcorti. Thanks for the update.
Tip inbound.
|
|
|
|
organofcorti (OP)
Donator
Legendary
Offline
Activity: 2058
Merit: 1007
Poor impulse control.
|
|
March 26, 2013, 02:49:38 PM |
|
Fantastic work organofcorti. Thanks for the update.
Tip inbound.
Thank you very much, creativex. And as a special "thank you" treat, you get:
|
|
|
|
Rawted
|
|
March 27, 2013, 06:07:37 AM |
|
I'm late to the party, but this is fantastic work.
|
|
|
|
|