Bitcoin Forum
April 25, 2024, 03:59:35 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1] 2 »  All
  Print  
Author Topic: New method to pull MtGox trade data  (Read 18523 times)
nitrous (OP)
Sr. Member
****
Offline Offline

Activity: 246
Merit: 250


View Profile
May 29, 2013, 06:41:52 AM
Last edit: May 29, 2013, 06:53:27 AM by nitrous
 #1

Hi there,

As I've mentioned before, the extra load caused by people downloading the entire MtGox trade history from the API is unsustainable, and recently I published my own dump to reduce the load on the servers. However, since the 23rd of May, MagicalTux has uploaded a full dump to Google's BigQuery service. This database contains trades for all currencies and contains all information available by the API (and more). The service isn't yet fully integrated, however soon it will start being updated automatically every 10mins-1hr. Until then, you can still access all data before the 23rd of May, and it is the recommended method.

To gain access to it, follow these instructions:

  • First, you need to sign up for Google's BigQuery - see here for instructions on doing so.
  • Now that you are signed up, go to the BigQuery console: https://bigquery.cloud.google.com
  • Now each time you go to the console, you will need to add the MtGox data set -- click on the drop down array next to API Project on the lefthand sidebar.
  • Go to 'Switch to project' > 'Display project...'
  • In the resulting popup box (Add Project), enter the Project ID, 'mt-gox'.
  • In the sidebar, mt-gox:mtgox should appear. Click on it to reveal the tables, which currently consists of 'trades'.

Now you can perform SQL queries on this using [mt-gox:mtgox.trades] for the table name, e.g.:
Code:
select * from [mt-gox:mtgox.trades] where Currency__='GBP';

The fields mean the following:
FieldDescriptionType
Money_Trade__This is the tid value, a unique id for the order. The first trades used (almost) sequential integers up to 218868, the majority of trades however came after this (~6 million) and use a Unix microstamp instead.int
PrimaryWhether the trade happened in the currency specified (see Currency__). Some trades may appear on multiple tickers, so to ignore duplicates you may want to `select where primary is true'.bool
Currency__The symbol of the currency against which BTC was bought or sold.str
Bid_User_Rest_App__App ID of the bidding party (the person buying BTC)*str
Ask_User_Rest_App__App ID of the asking party (the person selling BTC)*str
Typebid or ask (bid means to buy BTC by selling the specified currency, ask means to buy the specified currency by selling BTC)str
PropertiesA comma separated list of properties of the order, e.g. limit,market,mixed_currencystr
ItemThe thing being bought or sold, always BTC.str
AmountThe volume of the item (BTC) being bought/sold in satoshis (1 BTC = 100,000,000 satoshis).int
PriceThe price of each item in the currency specified. For most currencies, this is given in units of 0.00001 of the currency, so $12.3456 would be 1,234,560. Some currencies, currently JPY and SEK use units of 0.001 instead.int
DateThe date of the trade in ISO format.str

* These fields represent the app ID when an app is being used to make the trade. Currently, these are null for all but 4% of trades. There are two fields because there are two parties to every trade, one who is buying the BTC (the bidder) and another who is selling it (the asker). For most of these 4% app trades, only one of the parties is an app, however there are also ~3500 trades for which both parties are apps and both of these fields are non-null.

I plan on writing a python program soon to download a dump of this data straight from Google to a local sqlite3 database. Once you are able to obtain this data, feel free to then keep it up-to-date using the API methods (see here for info from my own documentation to do this using the v2 api). MagicalTux has confirmed that this usage is permissible, as the load on the API is minimised this way.

Thanks,
nitrous

Donations: 1Q2EN7TzJ6z82xvmQrRoQoMf3Tf4rMCyvL
MtGox API v2 Unofficial Documentation: https://bitbucket.org/nitrous/mtgox-api/overview
MtGox API v2 Unofficial Documentation Forum Thread: https://bitcointalk.org/index.php?topic=164404.0
1714060775
Hero Member
*
Offline Offline

Posts: 1714060775

View Profile Personal Message (Offline)

Ignore
1714060775
Reply with quote  #2

1714060775
Report to moderator
1714060775
Hero Member
*
Offline Offline

Posts: 1714060775

View Profile Personal Message (Offline)

Ignore
1714060775
Reply with quote  #2

1714060775
Report to moderator
1714060775
Hero Member
*
Offline Offline

Posts: 1714060775

View Profile Personal Message (Offline)

Ignore
1714060775
Reply with quote  #2

1714060775
Report to moderator
The forum was founded in 2009 by Satoshi and Sirius. It replaced a SourceForge forum.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
MagicalTux
VIP
Hero Member
*
Offline Offline

Activity: 608
Merit: 501


-


View Profile
May 29, 2013, 06:52:56 AM
 #2

Just for info, automatic update of these data is scheduled to start this week or next week
whydifficult
Sr. Member
****
Offline Offline

Activity: 287
Merit: 250



View Profile WWW
May 29, 2013, 08:02:57 AM
 #3

This is really sweet, thanks!

Gekko a nodejs bitcoin trading bot!
Realtime Bitcoin Globe - visualizing all transactions and blocks
Tip jar (BTC): 1KyQdQ9ctjCrGjGRCWSBhPKcj5omy4gv5S
Loozik
Sr. Member
****
Offline Offline

Activity: 378
Merit: 250


Born to chew bubble gum and kick ass


View Profile
May 29, 2013, 08:29:25 PM
 #4


DateThe date of the trade in ISO format.str


Great  Grin because my charting app does not accept Unixtime.

Okay, we have Date like ''2013-05-28''. Isn't Time like ''hhmmss'' missing?
nitrous (OP)
Sr. Member
****
Offline Offline

Activity: 246
Merit: 250


View Profile
May 29, 2013, 08:53:42 PM
 #5


DateThe date of the trade in ISO format.str


Great  Grin because my charting app does not accept Unixtime.

Okay, we have Date like ''2013-05-28''. Isn't Time like ''hhmmss'' missing?

The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".
Here's a sample of the database:

Money_Trade__PrimaryCurrency__Bid_User_Rest_App__Ask_User_Rest_App__TypePropertiesItemAmountPriceDate
1363132336180895trueUSDe7b6384e-3981-42e7-b3b7-7719c4dc74a5nullbidmarketBTC2227667644350002013-03-12 23:52:16
1363132369821802trueUSDnullnullbidlimitBTC817406644350002013-03-12 23:52:49
1363132372700876trueUSDnulle75e9f56-4b37-4440-9b10-b56fc6a47ee1asklimitBTC6425087344343512013-03-12 23:52:52
1363132372786101trueUSDnulle75e9f56-4b37-4440-9b10-b56fc6a47ee1asklimitBTC817406644339102013-03-12 23:52:52
1363132386679941trueUSDnulle75e9f56-4b37-4440-9b10-b56fc6a47ee1asklimitBTC16000000044147172013-03-12 23:53:06

Donations: 1Q2EN7TzJ6z82xvmQrRoQoMf3Tf4rMCyvL
MtGox API v2 Unofficial Documentation: https://bitbucket.org/nitrous/mtgox-api/overview
MtGox API v2 Unofficial Documentation Forum Thread: https://bitcointalk.org/index.php?topic=164404.0
Loozik
Sr. Member
****
Offline Offline

Activity: 378
Merit: 250


Born to chew bubble gum and kick ass


View Profile
May 29, 2013, 09:10:04 PM
 #6

The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

Thank you very much for the information.

Is it possible (if it's not too late) to separate Date from Time? Most widely used trading and charting applications for retail traders (Multicharts or Ninjatrader and others) will not be able to use the data in the format where date and time are merged.

nitrous (OP)
Sr. Member
****
Offline Offline

Activity: 246
Merit: 250


View Profile
May 29, 2013, 10:02:48 PM
 #7

The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

Thank you very much for the information.

Is it possible (if it's not too late) to separate Date from Time? Most widely used trading and charting applications for retail traders (Multicharts or Ninjatrader and others) will not be able to use the data in the format where date and time are merged.

Does the application not allow any way of doing this directly?

If not, you have a few different options:

  • You can ask MagicalTux and see if he'll consider it for when he begins regular updates (although that would be 1 or 2 weeks away before you see any actual changes).
  • When I am able to write a program to download the entire database, I will release the code. If you have any programming experience, it should be relatively simple to then modify the data to your needs before input into your application. If not, I may be able to convert it to any format you need, however I'm not too sure of the logistics of then giving you the database.
  • You could try doing a SQL query along the lines of:
Code:
SELECT LEFT(Date,10),RIGHT(Date,8) FROM [mt-gox:mtgox.trades] LIMIT 10;
    N.B. though, that BigQuery may not let you download all the data at once using this method.[/li]

Donations: 1Q2EN7TzJ6z82xvmQrRoQoMf3Tf4rMCyvL
MtGox API v2 Unofficial Documentation: https://bitbucket.org/nitrous/mtgox-api/overview
MtGox API v2 Unofficial Documentation Forum Thread: https://bitcointalk.org/index.php?topic=164404.0
kodo
Newbie
*
Offline Offline

Activity: 42
Merit: 0



View Profile
May 29, 2013, 10:18:08 PM
 #8

Thanks, this is great info!
Loozik
Sr. Member
****
Offline Offline

Activity: 378
Merit: 250


Born to chew bubble gum and kick ass


View Profile
May 29, 2013, 10:20:47 PM
 #9

    Does the application not allow any way of doing this directly?

    Unfortunately not.

    • You can ask MagicalTux and see if he'll consider it for when he begins regular updates (although that would be 1 or 2 weeks away before you see any actual changes).

    I wouldn't have the courage to bother him  Smiley

    • When I am able to write a program to download the entire database, I will release the code. If you have any programming experience, it should be relatively simple to then modify the data to your needs before input into your application. If not, I may be able to convert it to any format you need, however I'm not too sure of the logistics of then giving you the database.
    I have no programming experience. Out of curiosity, would it be possible somehow get the data into a CSV file in my computer that would be updated real-time? I would not like to bother you with this. I only want to know if this is feasible.

    • You could try doing a SQL query along the lines of:
    Code:
    SELECT LEFT(Date,10),RIGHT(Date,8) FROM [mt-gox:mtgox.trades] LIMIT 10;
      Anything more complicated than Word or Excel overwhelms my circuits  Cheesy[/list]
      Loozik
      Sr. Member
      ****
      Offline Offline

      Activity: 378
      Merit: 250


      Born to chew bubble gum and kick ass


      View Profile
      May 29, 2013, 11:00:25 PM
       #10

      The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

      Just a thought: could 2 additional (to already existing Date) columns / fields to collect be added to the API: Date_Only and Time_Only (derived from Date)? Would this be something that requires a lot of work?
      Loozik
      Sr. Member
      ****
      Offline Offline

      Activity: 378
      Merit: 250


      Born to chew bubble gum and kick ass


      View Profile
      May 29, 2013, 11:11:43 PM
       #11

      Just look, there is an application called Sierra Chart (I think some trials were made with this app to accept MtGox data) that has Date and Time separated for intraday data in CSV format http://www.sierrachart.com/index.php?l=doc/doc_TextCSVDataFormat.html#ExampleIntraday
      nitrous (OP)
      Sr. Member
      ****
      Offline Offline

      Activity: 246
      Merit: 250


      View Profile
      May 29, 2013, 11:50:53 PM
       #12

      The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

      Just a thought: could 2 additional (to already existing Date) columns / fields to collect be added to the API: Date_Only and Time_Only (derived from Date)? Would this be something that requires a lot of work?

      I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users. On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant.

      I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?

      Donations: 1Q2EN7TzJ6z82xvmQrRoQoMf3Tf4rMCyvL
      MtGox API v2 Unofficial Documentation: https://bitbucket.org/nitrous/mtgox-api/overview
      MtGox API v2 Unofficial Documentation Forum Thread: https://bitcointalk.org/index.php?topic=164404.0
      Loozik
      Sr. Member
      ****
      Offline Offline

      Activity: 378
      Merit: 250


      Born to chew bubble gum and kick ass


      View Profile
      May 30, 2013, 12:03:03 AM
       #13

      I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users.

      Fantastic  Smiley

      On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant.

      This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect.

      I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?

      I do not know how much would creating this application cost. I hope crowd financing might work (with a little help from MtGox). Yes, I am using Windows.
      nitrous (OP)
      Sr. Member
      ****
      Offline Offline

      Activity: 246
      Merit: 250


      View Profile
      May 30, 2013, 12:10:39 AM
       #14

      I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users.

      Fantastic  Smiley

      On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant.

      This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect.

      I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?

      I do not know how much would creating this application cost. I hope crowd financing might work (with a little help from MtGox). Yes, I am using Windows.

      Well I'm already creating the underlying mechanism of the app for my own personal use, and I am more than happy to do the little bit of extra work to make it cross-platform for free. Of course I always welcome donations Tongue but most of the bitcoin work I do/am doing is open-source. Hopefully I can release a self-contained python app which won't need you to already have installed python so that it's as intuitive as possible to use.

      Donations: 1Q2EN7TzJ6z82xvmQrRoQoMf3Tf4rMCyvL
      MtGox API v2 Unofficial Documentation: https://bitbucket.org/nitrous/mtgox-api/overview
      MtGox API v2 Unofficial Documentation Forum Thread: https://bitcointalk.org/index.php?topic=164404.0
      Loozik
      Sr. Member
      ****
      Offline Offline

      Activity: 378
      Merit: 250


      Born to chew bubble gum and kick ass


      View Profile
      May 30, 2013, 12:24:56 AM
       #15

      Well I'm already creating the underlying mechanism of the app for my own personal use, and I am more than happy to do the little bit of extra work to make it cross-platform for free.
      So I spotted this thread right on time.

      Of course I always welcome donations Tongue but most of the bitcoin work I do/am doing is open-source.
      I have no problems with donations. Actually I got interested in Bitcoin because I wanted to make a donation (to a philosopher). But I like programmers too.

      Hopefully I can release a self-contained python app which won't need you to already have installed python so that it's as intuitive as possible to use.

      Fantastic Smiley
      dexX7
      Legendary
      *
      Offline Offline

      Activity: 1106
      Merit: 1024



      View Profile WWW
      May 30, 2013, 02:24:04 PM
      Last edit: May 30, 2013, 04:02:56 PM by dexX7
       #16

      Wow, the query speed is impressive. Great! Thank you for sharing.

      Edit:

      This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect.

      The data given is already enough for all those purposes. What data format do you need? I'm happy to help. Smiley


      @nitrous: Currently the date has the data type string. I suggest to change it to timestamp. String operations are slow, take much more storage and something like DAYOFWEEK(date) doesn't work.

      Loozik
      Sr. Member
      ****
      Offline Offline

      Activity: 378
      Merit: 250


      Born to chew bubble gum and kick ass


      View Profile
      May 30, 2013, 05:34:18 PM
       #17

      What data format do you need? I'm happy to help. Smiley

      1. At the moment I only need BTCUSD trade tick data (no need for Bid and / or Ask data). For a CSV file (I will later import this data into my application) I need:
      Date,Time,Price,Volume (in Satoshis)
      yyyy-mm-dd,hh:mm:ss,65000000000

      2. In future I will need / want / beg for a mechanism / application that will update the CSV file with missing historical data and will update real-time data from MtGox. I will later map this CSV file (ASCII mapping) with my charting application to dynamically connect the constantly updating BTCUSD file to this charting application.
      whydifficult
      Sr. Member
      ****
      Offline Offline

      Activity: 287
      Merit: 250



      View Profile WWW
      May 30, 2013, 08:29:18 PM
       #18

      The query language for BigQuery is pretty similar to SQL. You can just ask bigQuery to return the date data in the format you want:

      For everyone who needs timestamp instead of a date string:

      Code:
      SELECT PARSE_UTC_USEC(date) as timestamp FROM [mt-gox:mtgox.trades] WHERE price = 2122000 LIMIT 1

      This will return:

      Code:
      Row	timestamp	 
      1 1359647088000000

      For everyone who needs the date and the time data in separated columns:

      Code:
      SELECT DATE(TIMESTAMP(date)) as date, TIME(TIMESTAMP(date)) as time FROM [mt-gox:mtgox.trades] WHERE price = 2122000 LIMIT 1

      This will return:

      Code:
      Row	date	time	 
      1 2013-01-31 15:44:48

      Edit, here is a list of stuff you can do related to date/time.

      Gekko a nodejs bitcoin trading bot!
      Realtime Bitcoin Globe - visualizing all transactions and blocks
      Tip jar (BTC): 1KyQdQ9ctjCrGjGRCWSBhPKcj5omy4gv5S
      dexX7
      Legendary
      *
      Offline Offline

      Activity: 1106
      Merit: 1024



      View Profile WWW
      May 30, 2013, 11:22:48 PM
       #19

      1. At the moment I only need BTCUSD trade tick data (no need for Bid and / or Ask data). For a CSV file (I will later import this data into my application) I need:
      Date,Time,Price,Volume (in Satoshis)
      yyyy-mm-dd,hh:mm:ss,65000000000

      Step 1: Go to the sign up website (picture)

      Code:
      https://bigquery.cloud.google.com/

      Step 2: Create project.. (picture)

      Step 3: Enable "BigQuery API" (picture)

      Step 4: If you agree, accept the ToS (picture)

      Step 5: Compose a query (picture)

      Code:
      SELECT DATE(TIMESTAMP(date)) as date, TIME(TIMESTAMP(date)) as time, price/100000 as price, amount
      FROM [mt-gox:mtgox.trades]
      WHERE Currency__ = "USD"
      AND date < "2013-05-19 17:30:00"
      ORDER BY date DESC, time DESC LIMIT 16000;

      Step 6: Edit the query for your needs and after processing you should be able to download the data as csv.

      Code:
      date,time,price,amount
      2013-05-19,17:29:46,120.88,18501496
      2013-05-19,17:29:45,120.83,84148500
      2013-05-19,17:29:42,120.83,50000000
      2013-05-19,17:28:57,120.12501,1138895
      ...

      If you wish, delete the row with the date. This will return the newst transactions (or the newst before the 23rd of may).

      The maximum number of results is limited to 16000, otherwise it would contain too many rows for direct download, but you can repeat step 5.

      nitrous (OP)
      Sr. Member
      ****
      Offline Offline

      Activity: 246
      Merit: 250


      View Profile
      May 31, 2013, 12:24:42 AM
       #20

      Wow, the query speed is impressive. Great! Thank you for sharing.

      Edit:

      This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect.

      The data given is already enough for all those purposes. What data format do you need? I'm happy to help. Smiley


      @nitrous: Currently the date has the data type string. I suggest to change it to timestamp. String operations are slow, take much more storage and something like DAYOFWEEK(date) doesn't work.

      I should make it clear that I am not the maintainer of this database, nor am I an employee of MtGox, however MagicalTux at MtGox is.

      Donations: 1Q2EN7TzJ6z82xvmQrRoQoMf3Tf4rMCyvL
      MtGox API v2 Unofficial Documentation: https://bitbucket.org/nitrous/mtgox-api/overview
      MtGox API v2 Unofficial Documentation Forum Thread: https://bitcointalk.org/index.php?topic=164404.0
      Pages: [1] 2 »  All
        Print  
       
      Jump to:  

      Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!