nitrous (OP)
|
 |
May 29, 2013, 06:41:52 AM Last edit: May 29, 2013, 06:53:27 AM by nitrous |
|
Hi there, As I've mentioned before, the extra load caused by people downloading the entire MtGox trade history from the API is unsustainable, and recently I published my own dump to reduce the load on the servers. However, since the 23rd of May, MagicalTux has uploaded a full dump to Google's BigQuery service. This database contains trades for all currencies and contains all information available by the API (and more). The service isn't yet fully integrated, however soon it will start being updated automatically every 10mins-1hr. Until then, you can still access all data before the 23rd of May, and it is the recommended method. To gain access to it, follow these instructions: - First, you need to sign up for Google's BigQuery - see here for instructions on doing so.
- Now that you are signed up, go to the BigQuery console: https://bigquery.cloud.google.com
- Now each time you go to the console, you will need to add the MtGox data set -- click on the drop down array next to API Project on the lefthand sidebar.
- Go to 'Switch to project' > 'Display project...'
- In the resulting popup box (Add Project), enter the Project ID, 'mt-gox'.
- In the sidebar, mt-gox:mtgox should appear. Click on it to reveal the tables, which currently consists of 'trades'.
Now you can perform SQL queries on this using [mt-gox:mtgox.trades] for the table name, e.g.: select * from [mt-gox:mtgox.trades] where Currency__='GBP'; The fields mean the following: Field | Description | Type | Money_Trade__ | This is the tid value, a unique id for the order. The first trades used (almost) sequential integers up to 218868, the majority of trades however came after this (~6 million) and use a Unix microstamp instead. | int | Primary | Whether the trade happened in the currency specified (see Currency__). Some trades may appear on multiple tickers, so to ignore duplicates you may want to `select where primary is true'. | bool | Currency__ | The symbol of the currency against which BTC was bought or sold. | str | Bid_User_Rest_App__ | App ID of the bidding party (the person buying BTC)* | str | Ask_User_Rest_App__ | App ID of the asking party (the person selling BTC)* | str | Type | bid or ask (bid means to buy BTC by selling the specified currency, ask means to buy the specified currency by selling BTC) | str | Properties | A comma separated list of properties of the order, e.g. limit,market,mixed_currency | str | Item | The thing being bought or sold, always BTC. | str | Amount | The volume of the item (BTC) being bought/sold in satoshis (1 BTC = 100,000,000 satoshis). | int | Price | The price of each item in the currency specified. For most currencies, this is given in units of 0.00001 of the currency, so $12.3456 would be 1,234,560. Some currencies, currently JPY and SEK use units of 0.001 instead. | int | Date | The date of the trade in ISO format. | str | * These fields represent the app ID when an app is being used to make the trade. Currently, these are null for all but 4% of trades. There are two fields because there are two parties to every trade, one who is buying the BTC (the bidder) and another who is selling it (the asker). For most of these 4% app trades, only one of the parties is an app, however there are also ~3500 trades for which both parties are apps and both of these fields are non-null.I plan on writing a python program soon to download a dump of this data straight from Google to a local sqlite3 database. Once you are able to obtain this data, feel free to then keep it up-to-date using the API methods (see here for info from my own documentation to do this using the v2 api). MagicalTux has confirmed that this usage is permissible, as the load on the API is minimised this way. Thanks, nitrous
|
|
|
|
MagicalTux
VIP
Hero Member
Offline
Activity: 608
Merit: 501
-
|
 |
May 29, 2013, 06:52:56 AM |
|
Just for info, automatic update of these data is scheduled to start this week or next week
|
|
|
|
whydifficult
|
 |
May 29, 2013, 08:02:57 AM |
|
This is really sweet, thanks!
|
Gekko a nodejs bitcoin trading bot! Realtime Bitcoin Globe - visualizing all transactions and blocks Tip jar (BTC): 1KyQdQ9ctjCrGjGRCWSBhPKcj5omy4gv5S
|
|
|
Loozik
Sr. Member
  
Offline
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
|
 |
May 29, 2013, 08:29:25 PM |
|
Date | The date of the trade in ISO format. | str |
Great  because my charting app does not accept Unixtime. Okay, we have Date like ''2013-05-28''. Isn't Time like ''hhmmss'' missing?
|
|
|
|
nitrous (OP)
|
 |
May 29, 2013, 08:53:42 PM |
|
Date | The date of the trade in ISO format. | str |
Great  because my charting app does not accept Unixtime. Okay, we have Date like ''2013-05-28''. Isn't Time like ''hhmmss'' missing? The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss". Here's a sample of the database: Money_Trade__ | Primary | Currency__ | Bid_User_Rest_App__ | Ask_User_Rest_App__ | Type | Properties | Item | Amount | Price | Date | | 1363132336180895 | true | USD | e7b6384e-3981-42e7-b3b7-7719c4dc74a5 | null | bid | market | BTC | 22276676 | 4435000 | 2013-03-12 23:52:16 | | 1363132369821802 | true | USD | null | null | bid | limit | BTC | 8174066 | 4435000 | 2013-03-12 23:52:49 | | 1363132372700876 | true | USD | null | e75e9f56-4b37-4440-9b10-b56fc6a47ee1 | ask | limit | BTC | 64250873 | 4434351 | 2013-03-12 23:52:52 | | 1363132372786101 | true | USD | null | e75e9f56-4b37-4440-9b10-b56fc6a47ee1 | ask | limit | BTC | 8174066 | 4433910 | 2013-03-12 23:52:52 | | 1363132386679941 | true | USD | null | e75e9f56-4b37-4440-9b10-b56fc6a47ee1 | ask | limit | BTC | 160000000 | 4414717 | 2013-03-12 23:53:06 | |
|
|
|
|
Loozik
Sr. Member
  
Offline
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
|
 |
May 29, 2013, 09:10:04 PM |
|
The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".
Thank you very much for the information. Is it possible (if it's not too late) to separate Date from Time? Most widely used trading and charting applications for retail traders (Multicharts or Ninjatrader and others) will not be able to use the data in the format where date and time are merged. 
|
|
|
|
nitrous (OP)
|
 |
May 29, 2013, 10:02:48 PM |
|
The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".
Thank you very much for the information. Is it possible (if it's not too late) to separate Date from Time? Most widely used trading and charting applications for retail traders (Multicharts or Ninjatrader and others) will not be able to use the data in the format where date and time are merged. Does the application not allow any way of doing this directly? If not, you have a few different options: - You can ask MagicalTux and see if he'll consider it for when he begins regular updates (although that would be 1 or 2 weeks away before you see any actual changes).
- When I am able to write a program to download the entire database, I will release the code. If you have any programming experience, it should be relatively simple to then modify the data to your needs before input into your application. If not, I may be able to convert it to any format you need, however I'm not too sure of the logistics of then giving you the database.
- You could try doing a SQL query along the lines of:
SELECT LEFT(Date,10),RIGHT(Date,8) FROM [mt-gox:mtgox.trades] LIMIT 10; N.B. though, that BigQuery may not let you download all the data at once using this method.[/li]
|
|
|
|
kodo
Newbie
Offline
Activity: 42
Merit: 0
|
 |
May 29, 2013, 10:18:08 PM |
|
Thanks, this is great info!
|
|
|
|
Loozik
Sr. Member
  
Offline
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
|
 |
May 29, 2013, 10:20:47 PM |
|
Does the application not allow any way of doing this directly? Unfortunately not. - You can ask MagicalTux and see if he'll consider it for when he begins regular updates (although that would be 1 or 2 weeks away before you see any actual changes).
I wouldn't have the courage to bother him  - When I am able to write a program to download the entire database, I will release the code. If you have any programming experience, it should be relatively simple to then modify the data to your needs before input into your application. If not, I may be able to convert it to any format you need, however I'm not too sure of the logistics of then giving you the database.
I have no programming experience. Out of curiosity, would it be possible somehow get the data into a CSV file in my computer that would be updated real-time? I would not like to bother you with this. I only want to know if this is feasible. - You could try doing a SQL query along the lines of:
SELECT LEFT(Date,10),RIGHT(Date,8) FROM [mt-gox:mtgox.trades] LIMIT 10; Anything more complicated than Word or Excel overwhelms my circuits  [/list]
|
|
|
|
Loozik
Sr. Member
  
Offline
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
|
 |
May 29, 2013, 11:00:25 PM |
|
The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss". Just a thought: could 2 additional (to already existing Date) columns / fields to collect be added to the API: Date_Only and Time_Only (derived from Date)? Would this be something that requires a lot of work?
|
|
|
|
|
nitrous (OP)
|
 |
May 29, 2013, 11:50:53 PM |
|
The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss". Just a thought: could 2 additional (to already existing Date) columns / fields to collect be added to the API: Date_Only and Time_Only (derived from Date)? Would this be something that requires a lot of work? I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users. On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant. I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?
|
|
|
|
Loozik
Sr. Member
  
Offline
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
|
 |
May 30, 2013, 12:03:03 AM |
|
I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users. Fantastic  On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant. This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect. I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?
I do not know how much would creating this application cost. I hope crowd financing might work (with a little help from MtGox). Yes, I am using Windows.
|
|
|
|
nitrous (OP)
|
 |
May 30, 2013, 12:10:39 AM |
|
I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users. Fantastic  On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant. This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect. I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?
I do not know how much would creating this application cost. I hope crowd financing might work (with a little help from MtGox). Yes, I am using Windows. Well I'm already creating the underlying mechanism of the app for my own personal use, and I am more than happy to do the little bit of extra work to make it cross-platform for free. Of course I always welcome donations  but most of the bitcoin work I do/am doing is open-source. Hopefully I can release a self-contained python app which won't need you to already have installed python so that it's as intuitive as possible to use.
|
|
|
|
Loozik
Sr. Member
  
Offline
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
|
 |
May 30, 2013, 12:24:56 AM |
|
Well I'm already creating the underlying mechanism of the app for my own personal use, and I am more than happy to do the little bit of extra work to make it cross-platform for free.
So I spotted this thread right on time. Of course I always welcome donations  but most of the bitcoin work I do/am doing is open-source. I have no problems with donations. Actually I got interested in Bitcoin because I wanted to make a donation (to a philosopher). But I like programmers too. Hopefully I can release a self-contained python app which won't need you to already have installed python so that it's as intuitive as possible to use.
Fantastic 
|
|
|
|
dexX7
Legendary
Offline
Activity: 1106
Merit: 1026
|
 |
May 30, 2013, 02:24:04 PM Last edit: May 30, 2013, 04:02:56 PM by dexX7 |
|
Wow, the query speed is impressive. Great! Thank you for sharing. Edit: This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect. The data given is already enough for all those purposes. What data format do you need? I'm happy to help. @nitrous: Currently the date has the data type string. I suggest to change it to timestamp. String operations are slow, take much more storage and something like DAYOFWEEK(date) doesn't work.
|
|
|
|
Loozik
Sr. Member
  
Offline
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
|
 |
May 30, 2013, 05:34:18 PM |
|
What data format do you need? I'm happy to help.  1. At the moment I only need BTCUSD trade tick data (no need for Bid and / or Ask data). For a CSV file (I will later import this data into my application) I need: Date,Time,Price,Volume (in Satoshis) yyyy-mm-dd,hh:mm:ss,65000000000 2. In future I will need / want / beg for a mechanism / application that will update the CSV file with missing historical data and will update real-time data from MtGox. I will later map this CSV file (ASCII mapping) with my charting application to dynamically connect the constantly updating BTCUSD file to this charting application.
|
|
|
|
whydifficult
|
 |
May 30, 2013, 08:29:18 PM |
|
The query language for BigQuery is pretty similar to SQL. You can just ask bigQuery to return the date data in the format you want: For everyone who needs timestamp instead of a date string: SELECT PARSE_UTC_USEC(date) as timestamp FROM [mt-gox:mtgox.trades] WHERE price = 2122000 LIMIT 1 This will return: Row timestamp 1 1359647088000000 For everyone who needs the date and the time data in separated columns: SELECT DATE(TIMESTAMP(date)) as date, TIME(TIMESTAMP(date)) as time FROM [mt-gox:mtgox.trades] WHERE price = 2122000 LIMIT 1 This will return: Row date time 1 2013-01-31 15:44:48 Edit, here is a list of stuff you can do related to date/time.
|
Gekko a nodejs bitcoin trading bot! Realtime Bitcoin Globe - visualizing all transactions and blocks Tip jar (BTC): 1KyQdQ9ctjCrGjGRCWSBhPKcj5omy4gv5S
|
|
|
dexX7
Legendary
Offline
Activity: 1106
Merit: 1026
|
 |
May 30, 2013, 11:22:48 PM |
|
1. At the moment I only need BTCUSD trade tick data (no need for Bid and / or Ask data). For a CSV file (I will later import this data into my application) I need: Date,Time,Price,Volume (in Satoshis) yyyy-mm-dd,hh:mm:ss,65000000000 Step 1: Go to the sign up website ( picture) https://bigquery.cloud.google.com/ Step 2: Create project.. ( picture) Step 3: Enable "BigQuery API" ( picture) Step 4: If you agree, accept the ToS ( picture) Step 5: Compose a query ( picture) SELECT DATE(TIMESTAMP(date)) as date, TIME(TIMESTAMP(date)) as time, price/100000 as price, amount FROM [mt-gox:mtgox.trades] WHERE Currency__ = "USD" AND date < "2013-05-19 17:30:00" ORDER BY date DESC, time DESC LIMIT 16000; Step 6: Edit the query for your needs and after processing you should be able to download the data as csv. date,time,price,amount 2013-05-19,17:29:46,120.88,18501496 2013-05-19,17:29:45,120.83,84148500 2013-05-19,17:29:42,120.83,50000000 2013-05-19,17:28:57,120.12501,1138895 ... If you wish, delete the row with the date. This will return the newst transactions (or the newst before the 23rd of may). The maximum number of results is limited to 16000, otherwise it would contain too many rows for direct download, but you can repeat step 5.
|
|
|
|
nitrous (OP)
|
 |
May 31, 2013, 12:24:42 AM |
|
Wow, the query speed is impressive. Great! Thank you for sharing. Edit: This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect. The data given is already enough for all those purposes. What data format do you need? I'm happy to help. @nitrous: Currently the date has the data type string. I suggest to change it to timestamp. String operations are slow, take much more storage and something like DAYOFWEEK(date) doesn't work. I should make it clear that I am not the maintainer of this database, nor am I an employee of MtGox, however MagicalTux at MtGox is.
|
|
|
|
|