nitrous (OP)
|
|
July 11, 2013, 06:21:41 PM Last edit: July 12, 2013, 02:36:56 PM by nitrous |
|
For some reason Bitbucket didn't replace the version with the new one I uploaded, and now it's bugging out so I can't upload the new version. For now I'm going to host it on Dropbox until I can make sure Bitbucket is working properly. Here's the new version. To verify it, you should see some red text at the top of the export window on Windows. The red text says that I've disabled the progress bar on Windows, because I believe it is responsible for most of these crashes. Hopefully I can re-enable it at some point in the future, but for now try these links. I am really sorry about all this hassle, and thank you for your patience. I really hope this version does actually work.

Mac - https://dl.dropboxusercontent.com/u/1760961/MTT/v1.0.3/Binary%20(Mac).zip
Windows - https://dl.dropboxusercontent.com/u/1760961/MTT/v1.0.3/Binary%20(Windows).zip

Bitbucket seems to be working again now, so you should be able to download from here; the links above are still available if not, though.

If you are doing a large export and would like a progress bar, try this version and please tell me if it works for you:

Windows - https://dl.dropboxusercontent.com/u/1760961/MTT/v1.0.2/Binary%20(Windows).zip

(The latest Mac version already includes the progress bar.)
|
|
|
|
Epoch
Legendary
Offline
Activity: 922
Merit: 1003
|
|
July 11, 2013, 07:28:06 PM |
|
I just tried the latest Windows binary (without status bar) and can confirm it has successfully exported 3600-second candles. I'll try some other values as well and report if I encounter any errors. Win7/64.
|
|
|
|
nitrous (OP)
|
|
July 11, 2013, 07:38:36 PM |
|
I just tried the latest Windows binary (without status bar) and can confirm it has successfully exported 3600-second candles. I'll try some other values as well and report if I encounter any errors. Win7/64.
Yes, finally! I'm tentatively hopeful that this success will continue. I'll be exploring possible solutions to the progress bar problem, as exports from a full dump can take a very long time; in the meantime, you can check the file size of the export to see that it is in fact increasing, although that won't tell you how much longer there is to go.
|
|
|
|
Diabolicus
Member
Offline
Activity: 90
Merit: 10
|
|
July 12, 2013, 10:14:16 AM |
|
I will check thoroughly later, but it seems to work. Great job!
|
|
|
|
Diabolicus
Member
Offline
Activity: 90
Merit: 10
|
|
July 12, 2013, 05:53:59 PM |
|
Not working on Win7 64-bit. Want me to test it with Aero or something else disabled? Cannot test 32-bit before Monday, I'm not at the office any more. However, this version is right now exporting 15min candles since 2011 like a boss (Win7, 64-bit).
|
|
|
|
nitrous (OP)
|
|
July 12, 2013, 06:02:45 PM |
|
Not working on Win7 64-bit. Cannot test 32-bit before Monday, I'm not at the office any more.
OK, I'll have to think of another way to get the progress bar working.
However, this version is right now exporting 15min candles since 2011 like a boss
Awesome! Thanks Epoch and Diabolicus for your patience and prompt testing
|
|
|
|
2weiX
Legendary
Offline
Activity: 2058
Merit: 1005
this space intentionally left blank
|
|
July 12, 2013, 08:47:42 PM |
|
trying this out ASAP. might have some feature requests, happy to donate.
Great, let me know and I'll see what I can do.
err.. a dummy's guide to step-by-stepping to create a dump, first^^
|
|
|
|
nitrous (OP)
|
|
July 12, 2013, 09:46:15 PM |
|
trying this out ASAP. might have some feature requests, happy to donate.
Great, let me know and I'll see what I can do.
err.. a dummy's guide to step-by-stepping to create a dump, first^^
Yeah, sorry I haven't got around to doing the readme yet. Hopefully I won't be busy tomorrow and I'll get on it.
|
|
|
|
nitrous (OP)
|
|
July 13, 2013, 04:17:02 PM Last edit: July 13, 2013, 06:48:58 PM by nitrous |
|
Ok, I've now written an in-depth readme that should explain how to do everything possible with the tool. There are screenshots for Mac, but the Windows interface is exactly the same, so that shouldn't cause any problems. You can download it here: https://bitbucket.org/nitrous/bq/downloads/Readme.pdf

I have also updated the landing page readme.md with a quickstart guide.
|
|
|
|
100x
Jr. Member
Offline
Activity: 30
Merit: 501
Seek the truth
|
|
July 16, 2013, 11:08:38 PM |
|
So I am teaching myself how to use PostgreSQL (among other things), and I thought "what better way could there be than finally getting around to messing with the MtGox historical trade data?" (I had been wanting to analyze this data for some time). After searching around for a while, I came upon this thread. Thank you so much for this awesome tool! You have saved me much time/effort. It was very easy to start a dump, and the green bar is currently moving to the right, so it looks like it is working!

Just curious: the easiest way to fill in the missing data (as the BigQuery db is not yet being updated regularly) would be to fetch trades with IDs after the ID of the last trade in the dump, yes? I was thinking of writing a small Python script using the API commands as specified in this post: https://bitcointalk.org/index.php?topic=178336.msg2067244#msg2067244
|
|
|
|
nitrous (OP)
|
|
July 16, 2013, 11:40:28 PM |
|
Just curious: the easiest way to fill in the missing data (as the BigQuery db is not yet being updated regularly) would be to create a small script that uses the MtGox API to fetch trades with IDs after the ID of the last trade in the dump, yes?

Glad you've found it useful. Yes, that's right. Here's an example of one script that's already been created - http://cahier2.ww7.be/bitcoinmirror/phantomcircuit/. It uses a different format though, and doesn't store all the data (I did, however, port this to an export format). You'll need to determine which field in the API maps to which field in the BQ database, and fill in some fields that aren't included in the API (Bid_User_Rest_App__, for example).

There are some discrepancies with the API data - the missing fields, data being split into individual currencies, currencies occasionally being silently added, gaps in the API data, etc. Whilst this shouldn't be too much of a problem for your actual data processing, it could cause problems if you ever want to update from BQ again. I suggest that you instead insert the data from the API into a new table (say, dump_live). You can then create a view from a union between dump and dump_live, delete the obsolete index on dump, and index the view (Money_Trade__ ASC, [Primary] DESC, Currency__ ASC). This will allow you to still access all the data fairly fast, but without corrupting the BQ dump. If you don't mind an extra 200MB, you can leave the index on dump - my tool will recreate this index if it doesn't exist, so it might be easier and more efficient to just leave it. There's a sketch of this layout below.

I've actually given thought to putting this in the official tool, hence the insights above (I'm not sure when I'll have enough time to do that, though, as it would mean changing quite a lot of the tool), so if you want any help with your script let me know, especially as I've also written documentation on the MtGox API v2, so I can help you map the API data to the BQ schema and get around the data jumps
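A minimal sketch of that layout, assuming a PostgreSQL database called mtgox and the lowercase column names from 100x's schema (all names here are illustrative, not part of the tool). One caveat: plain views can't be indexed in PostgreSQL, so this version indexes both base tables instead; the planner can still use those indexes through the UNION ALL view:

Code:
# Illustrative sketch only -- database, table, and index names are assumptions.
import psycopg2

conn = psycopg2.connect(dbname='mtgox')
cur = conn.cursor()

# API-sourced trades go in their own table with the same shape as the BQ
# dump, so the dump itself is never modified.
cur.execute("CREATE TABLE IF NOT EXISTS dump_live (LIKE dump)")

# Mirror the BQ-style index on the live table; a plain view can't be
# indexed, but queries through the view use the base-table indexes.
cur.execute("""CREATE INDEX dump_live_idx
               ON dump_live (money_trade__ ASC, single_market DESC, currency__ ASC)""")

# One view over both tables gives queries the full trade history.
cur.execute("""CREATE OR REPLACE VIEW all_trades AS
               SELECT * FROM dump
               UNION ALL
               SELECT * FROM dump_live""")

conn.commit()
conn.close()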
|
|
|
|
Diabolicus
Member
Offline
Activity: 90
Merit: 10
|
|
July 17, 2013, 07:43:04 AM |
|
Any idea when the trade data will be updated? It would be interesting to backtest trade strategies against the bear market after May 23rd.
|
|
|
|
nitrous (OP)
|
|
July 17, 2013, 04:10:08 PM |
|
Any idea when the trade data will be updated? It would be interesting to backtest trade strategies against the bear market after May 23rd.
I tried again today but I didn't get an answer from MagicalTux. Tomorrow I'll try to contact him earlier, hopefully during Japanese business hours.
|
|
|
|
aspeer
Member
Offline
Activity: 102
Merit: 10
|
|
July 18, 2013, 04:41:43 AM |
|
very nice tool for running scenarios! thanks!
|
|
|
|
dlasher
|
|
July 27, 2013, 12:16:17 AM |
|
Trying, but unsuccessful so far. Windows version (on win7-x64) crashes about 50% through. Linux version (on debian7) crashes after entering auth code with:

Code:
No handlers could be found for logger "oauth2client.util"
Traceback (most recent call last):
  File "app.py", line 81, in __call__
    return apply(self.func, args)
  File "/usr/src/bq/bq.py", line 169, in complete
    credential = flow.step2_exchange(code, http)
  File "/usr/local/lib/python2.7/dist-packages/google_api_python_client-1.1-py2.7.egg/oauth2client/util.py", line 128, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/google_api_python_client-1.1-py2.7.egg/oauth2client/client.py", line 1283, in step2_exchange
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 1570, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 1317, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 1252, in _conn_request
    conn.connect()
  File "/usr/local/lib/python2.7/dist-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 1021, in connect
    self.disable_ssl_certificate_validation, self.ca_certs)
  File "/usr/local/lib/python2.7/dist-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 80, in _ssl_wrap_socket
    cert_reqs=cert_reqs, ca_certs=ca_certs)
  File "/usr/lib/python2.7/ssl.py", line 381, in wrap_socket
    ciphers=ciphers)
  File "/usr/lib/python2.7/ssl.py", line 141, in __init__
    ciphers)
SSLError: [Errno 185090050] _ssl.c:340: error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib
Ideas?
|
|
|
|
nitrous (OP)
|
|
July 27, 2013, 01:21:06 PM |
|
Trying, but unsuccessful so far. Windows version (on win7-x64) crashes about 50% through. Linux version (on debian7) crashes after entering auth code with: [...] SSLError: [Errno 185090050] _ssl.c:340: error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib
Ideas?
Hi dlasher, I haven't actually tried it on Linux, but I can see why you get the error. The app currently expects httplib2/cacerts.txt to be installed locally to app.py (as in the Windows and Mac binaries), and it goes wrong when the library is actually installed to Python's dist-packages. I've just pushed an update that should address this issue and use the installed cacerts.txt if the local copy can't be found.

As for the Windows problem, was that while updating or exporting? Was there any indication as to what went wrong? Possibly a log file? I know I'm not catching every exception yet, which could cause an unexpected problem, but under normal conditions the only problems should come from Tkinter being buggy with multithreaded code.
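A hypothetical sketch of the kind of fallback described above (this is not the tool's actual code; the file layout is an assumption): look for a cacerts.txt bundled next to app.py first, and fall back to the copy shipped inside the installed httplib2 package:

Code:
# Hypothetical sketch of the cacerts.txt fallback; names/paths are assumptions.
import os
import httplib2

def find_cacerts():
    # 1) local copy, as bundled alongside app.py in the Windows/Mac binaries
    local = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                         'httplib2', 'cacerts.txt')
    if os.path.exists(local):
        return local
    # 2) fall back to the certificates shipped with the installed httplib2
    return os.path.join(os.path.dirname(httplib2.__file__), 'cacerts.txt')

http = httplib2.Http(ca_certs=find_cacerts())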
|
|
|
|
id10tothe9
|
|
August 15, 2013, 09:21:40 PM Last edit: August 15, 2013, 09:41:34 PM by id10tothe9 |
|
I tried it on Ubuntu; after authentication I get "Unexpected Exception" with a long message (can't be copied from the window), at the end of which it says: IOError: [Errno 13] Permission denied: '/home/username/.config/mtgox-trades-tool/creds.dat'. Not sure what to make of that. Edit: ok, kinda a Linux n00b. Got it working now..
|
|
|
|
100x
Jr. Member
Offline
Activity: 30
Merit: 501
Seek the truth
|
|
August 16, 2013, 11:58:28 AM |
|
Glad you've found it useful. Yes, that's right. [...] I've also written documentation on the MtGox API v2, so I can help you map the API data to the BQ schema and get around the data jumps

I ended up putting my trade analysis work on hold for a while, but I wanted to stop by and mention that I was able to get the remainder of the data using the API and add it to my db just fine. Thanks again for all your help.

I was curious about the gaps in the API data that you mentioned - what type of gaps exactly are you talking about? I did some extremely basic verification and recreated a few candles for a few random days, and I got the same result as bitcoincharts.com (after filtering to USD trades properly).

In case anyone is interested, here is the Python code I used to query the remainder of the MtGox data. You have to do it in blocks of 1000 trades, so it is essentially just a nice wrapper on a long series of API calls:

Code:
import urllib2
import json
import csv
import time
import datetime

API_FETCH_URL = 'https://data.mtgox.com/api/2/BTCUSD/money/trades/fetch?since='

# mtgox API varname: postgres DB varname (from BigQuery dump)
VAR_MAP = {
    'price_currency': 'currency__',
    'trade_type': 'type',
    'price_int': 'price',
    'item': 'item',
    'primary': 'single_market',
    'tid': 'money_trade__',
    'amount_int': 'amount',
    'date': 'date',
    'properties': 'properties'
}
VARS = [var for var in VAR_MAP.keys()]

def process_date(timestamp):
    return datetime.datetime.utcfromtimestamp(timestamp).isoformat(' ')

def fetch_trades(last_id=None, outfile='', limit=False, noise=False):
    counter = 0

    with open(outfile, 'wb') as f:
        # create CSV writer object
        writer = csv.writer(f)

        # add header of varnames at top of file
        writer.writerow([VAR_MAP[var] for var in VARS])

        while True:
            # pause for 2 seconds between API calls to prevent getting banned by anti-DDOS
            time.sleep(2)

            # fetch trades after the most recent trade id, using mtgox API
            page_data = urllib2.urlopen(API_FETCH_URL + last_id)
            # read response from urlopen GET request
            json_response = page_data.read()
            # decode JSON data into python dictionary
            response = json.loads(json_response)

            if response['result'] == 'success' and len(response['data']) > 0:
                trades = response['data']

                if noise:
                    print 'Batch %04d -- ?since= %s, num trades: %d' % (counter + 1, last_id, len(trades))

                # write each trade as a separate line, using only trade values for the vars in the list VARS
                # for date, convert from timestamp into ISO str format (UTC) to match date column in postgres DB
                writer.writerows([[trade[var] if var != 'date' else process_date(trade[var])
                                   for var in VARS] for trade in trades])

                # set last_id to the tid of the last trade, so we can fetch next batch of trades
                last_id = trades[-1]['tid']

                counter = counter + 1
                # if limit parameter is in use, then only do as many batches as specified
                if limit != False and counter >= limit:
                    break
            else:
                print '\n**********'
                if response['result'] == 'success':
                    print 'SCRIPT END -- last trade reached'
                else:
                    print 'SCRIPT END -- API call failed'
                print '**********'
                break

if __name__ == '__main__':
    username = 'name'
    filename = 'mtgox_recent_trades.csv'
    # id of last trade from BigQuery dump
    last_dump_trade_id = '1369319572499520'
    outfile = 'c:\\users\\%s\\documents\\data\\mtgox\\%s' % (username, filename)

    fetch_trades(last_id=last_dump_trade_id, outfile=outfile, noise=True)
|
|
|
|
nitrous (OP)
|
|
August 16, 2013, 12:22:20 PM |
|
I ended up putting my trade analysis work on hold for a while, but I wanted to stop by and mention that I was able to get the remainder of the data using the API and add it to my db just fine. Thanks again for all your help.
No problem. Thanks - I'm sure many people here will find that useful until MtGox gets around to finishing the bq database. For anyone still interested in this tool, I'm going to post an update on the situation immediately after this post.
I was curious about the gaps in the API data that you mentioned - what type of gaps exactly are you talking about? I did some extremely basic verification and recreated a few candles for a few random days, and I got the same result as bitcoincharts.com (after filtering to USD trades properly).
For USD I wouldn't expect any gaps. Essentially, as well as the 1000-trade limit in the API, there are also other limits, such as a time limit of something like 86400 seconds. What this means is that if no trades happen in a 24-hour period, your script will break down: no data will be returned, so you will never update your last_id. The best-known gap is for USD across the tid transition (see my documentation on this here), between tids 218868 and 1309108565842636. I believe this is the only USD gap, and since it was back in 2011 it is covered by the bq database and is not a problem. For other, less popular currencies, however, I have found that there are many gaps, some quite long, and without regular bq updates these are liable to break a script. Of course, if you have an up-to-date database and can run your script regularly, at least once per day, this will not be a problem, as you will catch all trades as they come in. Alternatively, you could manually advance the last_id if you haven't caught up to the current time yet (see the sketch below), but you need to confirm the limit is indeed 86400s first.
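To make that concrete, here is a hypothetical helper for 100x's script above. It is not part of the tool, and it assumes both the unconfirmed 86400-second window and the fact that post-2011 tids are microsecond Unix timestamps (so one day is 86400 * 1000000 in tid units):

Code:
# Hypothetical fallback for the fetch loop in 100x's script: if a fetch
# returns no trades (e.g. a quiet currency with nothing in the API's ~86400s
# window -- a limit that still needs confirming), advance last_id by one day
# instead of stopping.
import time

DAY_IN_TID_UNITS = 86400 * 1000000  # tids are microsecond Unix timestamps

def advance_last_id(last_id):
    """Skip over an empty window, but never run past the present."""
    candidate = int(last_id) + DAY_IN_TID_UNITS
    now_tid = int(time.time() * 1000000)
    if candidate >= now_tid:
        return None  # caught up to the present; genuinely no more trades
    return str(candidate)

# In fetch_trades(), instead of breaking when response['data'] is empty:
#     last_id = advance_last_id(last_id)
#     if last_id is None:
#         break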
|
|
|
|
nitrous (OP)
|
|
August 16, 2013, 12:29:39 PM |
|
I haven't posted an update about this yet because I haven't got a definitive update from MagicalTux that bq is even still happening. At the moment, with US legal issues, litecoin, etc., MagicalTux isn't really focusing on bq at all, but hopefully he will finish it eventually. I don't anticipate this being anytime soon, though, and he probably needs to be reminded about bq occasionally (and shown that there is demand for it).

Unfortunately, I don't have much time to work on this at the moment, and I'm going to university in a few weeks, so if anyone still wants regular data then perhaps someone with Python experience might consider forking my project and adding in 100x's script? My idea was to use two tables in the database - one for BQ data, the other for API data. You could then create a view into a union of both these tables, and delete API data as BQ replaces it (a sketch of that cleanup follows below). Remember that the API doesn't provide all the fields that BQ does, and you have to access each currency individually with the API. If you don't need those other fields, though, and only need a few select currencies, you could then use this hybrid system to do live exports as well (as Loozik requested).
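A minimal sketch of the "delete API data as BQ replaces it" step, under the same assumed dump/dump_live layout as the earlier sketch (table and column names are assumptions, not part of the tool):

Code:
# Hypothetical cleanup for the two-table layout sketched earlier: after a
# fresh BQ update lands in 'dump', drop any API-sourced rows in 'dump_live'
# that the dump now covers, so the union view stays free of duplicates.
import psycopg2

conn = psycopg2.connect(dbname='mtgox')
cur = conn.cursor()

# Everything up to the newest BQ trade id is now authoritative in 'dump'.
cur.execute("""DELETE FROM dump_live
               WHERE money_trade__ <= (SELECT max(money_trade__) FROM dump)""")

conn.commit()
conn.close()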
|
|
|
|
|