Bitcoin Forum

Economy => Speculation => Topic started by: molecular on February 22, 2013, 04:48:33 PM



Title: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 22, 2013, 04:48:33 PM
Due to popular demand, here's a little intro on how to get mtGox trade data (all trades) into a sql database and run some queries on the data:

Depending on your system you might need a lot of stuff I wont list here (like postgres database server, wget/curl, ...). You'll just run into it when you need it.

Firstly we need to wget or curl the csv from bitcoincharts.com ("#>" means you should enter this in a shell):

Code:
#> curl "http://api.bitcoincharts.com/v1/csv/mtgoxUSD.csv" > trades.csv

This will write a file "trades.csv" containing all mtgox trades.

now create a database ("postgres=#" means enter this in psql shell):

Code:
#> psql -U postgres
postgres=# create database mtgox;
CREATE DATABASE

postgres=# \connect mtgox;
You are now connected to database "mtgox" as user "postgres".

mtgox=# \q
#>

now you have created a database called "mtgox" and you can open a psql shell to it directly next time:

we'll create a table for the trades and import the data:

Code:
#> psql -U postgres mtgox
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(16,8), volume numeric(16,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE

mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;

mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178

mtgox=#

done. now you can run queries:

Code:
mtgox=# select count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades;
   cnt   |      vol_btc      |           vol_usd
---------+-------------------+------------------------------
 3563178 | 39113649.25693204 | 293136750.814254195366000000   cnt   | vol_btc  |  vol_usd

have fun and please, paste your queries here if you have cool ones.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: gimme_bottles on February 22, 2013, 04:50:49 PM
thanks  ;D

i did not know bitcoincharts offers all trade data in csv, great service.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 22, 2013, 04:57:52 PM
thanks  ;D

i did not know bitcoincharts offers all trade data in csv, great service.

took me a while to find that, too. Actually, I think I talked to tcatm and he told me.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 22, 2013, 04:59:21 PM
oh, forgot to update the timestamp:

Code:
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';

will fix in OP


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 22, 2013, 05:01:48 PM
which leads me to the first "cool query":

Code:
mtgox=# select extract(year from t) as y, count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades group by y order by y;
  y   |   cnt   |      vol_btc      |           vol_usd
------+---------+-------------------+------------------------------
 2010 |    8757 |  2666237.77200000 |    433663.023527280000000000
 2011 | 1279900 | 13525744.44429303 |  86008826.163108379280100000
 2012 | 1970995 | 20228998.28350803 | 152645161.420049849453300000
 2013 |  303526 |  2692668.75713098 |  54049100.207568686632600000

result: 2012-volume (in USD) was almost twice the 2011-volume ;)


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: steamboat on February 22, 2013, 07:51:55 PM
which leads me to the first "cool query":

Code:
mtgox=# select extract(year from t) as y, count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades group by y order by y;
  y   |   cnt   | vol_btc  |  vol_usd
------+---------+----------+-----------
 2010 |    8757 |  2666295 |        45
 2011 | 1279900 | 13493417 |  85888519
 2012 | 1970995 | 20187125 | 152856841
 2013 |  303526 |  2683242 |  53871199

result: 2012-volume (in USD) was almost twice the 2011-volume ;)


and according to the current trend, less than 2013 volume in btc, but more than twice the volume in USD


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: notme on February 22, 2013, 10:16:06 PM
Code:
mtgox=# select count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades;
   cnt   | vol_btc  |  vol_usd
---------+----------+-----------
 3563178 | 39030079 | 292616604
(1 row)

Weighted average price = 292616604/39030079 = $7.50/BTC


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 22, 2013, 10:21:27 PM
Code:
mtgox=# select count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades;
   cnt   | vol_btc  |  vol_usd
---------+----------+-----------
 3563178 | 39030079 | 292616604
(1 row)

Weighted average price = 292616604/39030079 = $7.50/BTC

good idea, vwap by year:

Code:
mtgox=# select extract(year from t) as y, count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd, sum(volume*price) / sum(volume) as vwap from trades group by y order by y;
  y   |   cnt   |      vol_btc      |           vol_usd            |          vwap
------+---------+-------------------+------------------------------+------------------------
 2010 |    8757 |  2666237.77200000 |    433663.023527280000000000 | 0.16264979368362200219
 2011 | 1279900 | 13525744.44429303 |  86008826.163108379280100000 |   6.358897768425487151
 2012 | 1970995 | 20228998.28350803 | 152645161.420049849453300000 |   7.545858637226536507
 2013 |  303526 |  2692668.75713098 |  54049100.207568686632600000 |  20.072688133076431703


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on February 23, 2013, 01:42:00 AM
Gox should made no less than  $1 million last year, it seems.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on February 23, 2013, 01:44:25 AM
Sorry, I think there is something wrong with the 2010 volume in USD...


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: 🏰 TradeFortress 🏰 on February 23, 2013, 01:47:31 AM
Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on February 23, 2013, 01:53:36 AM
Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: bb113 on February 23, 2013, 03:15:57 AM
Due to popular demand, here's a little intro on how to get mtGox trade data (all trades) into a sql database and run some queries on the data:


Its not all trades. Its close because the time interval is 1 second. To get all trades you need to get the data from:

t=0
https://mtgox.com/api/1/BTCUSD/trades?since=t

which will give you this:

https://mtgox.com/api/1/BTCUSD/trades?since=0


find the largest timestamp ( "date":1280801871 in this case) in that data, then change the t variable and make another query like:

t=max(date)+1
https://mtgox.com/api/1/BTCUSD/trades?since=t

which gives you:

https://mtgox.com/api/1/BTCUSD/trades?since=1280801872


then repeat



Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 23, 2013, 10:49:10 AM
Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible

yes, you're correct. something wrong... might also affect the other years... checking.

EDIT: oh goddamnit, I accidentally used an old script for this. The datatypes are wrong, everything was imported as integer with 0 decimal places.

I'm sorry. I will fix this... will take a while.

EDIT2 corrected OP. you can fix things by doing this:

Code:
mtgox=# drop table trades;
DROP TABLE
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(32,10), volume numeric(32,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE
mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178

EDIT3: fixed my other 2 posts containing queries. The values for the years >=2011 changed only "slightly".

EDIT4: anyone knowledgable with postgres have a suggestion what datatype to best use for the monetary values?


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on February 23, 2013, 10:59:22 AM
Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible

yes, you're correct. something wrong... might also affect the other years... checking.

EDIT: oh goddamnit, I accidentally used an old script for this. The datatypes are wrong, everything was imported as integer with 0 decimal places.

I'm sorry. I will fix this... will take a while.

EDIT2 corrected OP. you can fix things by doing this:

Code:
mtgox=# drop table trades;
DROP TABLE
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(32,10), volume numeric(32,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE
mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178

EDIT3: fixed my other 2 posts containing queries. The values for the years >=2011 changed only "slightly".

EDIT4: anyone knowledgable with postgres have a suggestion what datatype to best use for the monetary values?

Thanks, I guessed this would be a floating point problem. :)


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: piramida on February 23, 2013, 11:03:28 AM
Thanks, useful. How do you fetch updates, or you recreate the full table every time?


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 23, 2013, 11:24:03 AM
Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on February 23, 2013, 11:28:34 AM
Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



How is the size of the CSV file? Also I don't recommend people updating their databases too frequently, bitcoinchart is already under quite a bit of load and went 503 from time to time.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 23, 2013, 11:31:52 AM
Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



How is the size of the CSV file? Also I don't recommend people updating their databases too frequently, bitcoinchart is already under quite a bit of load and went 503 from time to time.

142 MB

tcatm recently told me at ccc it wasn't a problem for the server. I'm all for figuring out a way to do updates, though.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on February 23, 2013, 11:36:55 AM
Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



How is the size of the CSV file? Also I don't recommend people updating their databases too frequently, bitcoinchart is already under quite a bit of load and went 503 from time to time.

142 MB

tcatm recently told me at ccc it wasn't a problem for the server. I'm all for figuring out a way to do updates, though.


Molecular what are the programs you used for trading? Also give me your donation address please?


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 24, 2013, 03:58:25 PM
Molecular what are the programs you used for trading? Also give me your donation address please?

used? you mean traidor.py (https://bitcointalk.org/index.php?topic=25451.0)?

1JANa7gQ2VE7Wkv3o4917ECy8NWYtUx5F5


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: piramida on February 24, 2013, 04:30:40 PM

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



Ok so thats easy then, as "copy from" would append data to the table without touching existing rows.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: sgbett on February 24, 2013, 11:01:49 PM
EDIT4: anyone knowledgable with postgres have a suggestion what datatype to best use for the monetary values?

NUMERIC(16,8) should do it




Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 25, 2013, 06:44:11 AM
EDIT4: anyone knowledgable with postgres have a suggestion what datatype to best use for the monetary values?

NUMERIC(16,8) should do it




In case postgres uses that datatype for aggregate functions, 8 places before the decimal point will be enough.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: sgbett on February 25, 2013, 12:02:06 PM
the params in that datatype represent scale and precision, scale is the total number of digits (ie including decimals) precision is the number of decimals

with the current bitcoin spec 16,8 is enough to store the largest theoretical possible transaction amount of 21million, and the smallest value of one satoshi.

The datatype scale and precision only affect the stored value, aggregate functions operate correctly disregarding scale and precision of the datatype (within the limits of the architecture).

eg the max value that can be stored in the field is 99999999.99999999,if you sum this with 0.00000001 then the result is 100000000 (ie no overflow occurs)


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on February 25, 2013, 09:45:33 PM
the params in that datatype represent scale and precision, scale is the total number of digits (ie including decimals) precision is the number of decimals

with the current bitcoin spec 16,8 is enough to store the largest theoretical possible transaction amount of 21million, and the smallest value of one satoshi.

The datatype scale and precision only affect the stored value, aggregate functions operate correctly disregarding scale and precision of the datatype (within the limits of the architecture).

eg the max value that can be stored in the field is 99999999.99999999,if you sum this with 0.00000001 then the result is 100000000 (ie no overflow occurs)

Good to know. Thanks. Changed the op to use 16,8


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on March 07, 2013, 06:59:13 AM
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on March 07, 2013, 07:33:21 AM
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

Yeah, I'm generally interested. Just need a couple of quiet hours and the muse to kiss me. Been thinking about this.

this http://statistics.ecdsa.org/ would tell us the info if it was up-to-date. Maybe we should pester ThomasV (?).


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on March 07, 2013, 07:48:50 AM
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

Yeah, I'm generally interested. Just need a couple of quiet hours and the muse to kiss me. Been thinking about this.

this http://statistics.ecdsa.org/ would tell us the info if it was up-to-date. Maybe we should pester ThomasV (?).


Yeah, I have been using this, it's great but a bit outdated.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on May 09, 2013, 06:13:16 PM
EDIT: ignore this post, see next post

tcatm has limited the trade api. max 20000 trades can be downloaded per request.

here's a bash script that updates local trades table

Code:
#!/bin/bash

db="mtgox"
user="postgres"
symbol="mtgoxUSD"
PSQL="psql -q -t -U ${user} ${db}"
DL="curl -s"

function sql() {
  rc=$(echo "$1" | $PSQL)
}

function extend() {
  sql "drop table if exists import;"
  sql "create table import (id serial, unixtime int, price numeric(32,10), volume numeric(32,8), type smallint);"
  
  len=$(( 4 * 60 * 60 ))
  while true; do
    sql "select max(unixtime) from trades;"
    start=$rc
    start_human=$(date -d "@$start")
    echo "--- $start_human ----------------------------------------------------------"
    end=$(( $start + $len ))
    echo "start ($start) + len ($len) = end ($end)"
    
    # download
    $DL "http://bitcoincharts.com/t/trades.csv?symbol=$symbol&start=$start&end=$end" > trades.csv
    end_file=$(head -n 1 trades.csv | cut -d , -f 1)
    start_file=$(tail -n 1 trades.csv | cut -d , -f 1)
    echo "start_file ($start_file) - start ($start) = $(( $start_file - $start ))"
    echo "end_file ($end_file) - end ($end) = $(( $end_file - $end ))"
    if [ $start_file -eq $start ]; then
      echo start times match, updating trades table
      # put into import table
      sql "delete from import;"
      sql "\copy import(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;"
      sql "delete from trades where unixtime >= $start;"
      sql "insert into trades (unixtime, t, price, volume) select unixtime, TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second', price, volume from import order by id desc;"
      if [ $start_file -eq $end_file ]; then
        echo "end detected, sleeping for 10 minutes,...."
        sleep 10m;
      fi
    else
      echo "start_file != start, exiting, check code"
      exit 1
    fi
  done
}

extend

it's quite fresh (use accordingly), currently in process of syncing my table. It uses 4-hour blocks.

dont know if it quits nicely yet when done ;)

hoping someone can use it.

EDIT: added sync-end-detection. will sleep 10 minutes, then continue to sync.

EDIT2:

hmm, it stops prematurely. I think something is weird with the bitcoincharts api:

neither this: http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&start=1365709116&end=1365710916
nor this: http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&end=1365710916

deliver any data (except one trade in the first case) while I think they should.

while this: http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&start=1365709116

delivers the most recent trade data (20000 most recent trades) ignoring "start" (at least that is consistent with the api docs).

the api docs: http://bitcoincharts.com/about/markets-api/ say I should only use the "end" parameter. That doesn't work, though (http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&end=1365710916 doesn't deliver)

ideas?


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on November 02, 2013, 08:02:30 AM
api url has change to http://api.bitcoincharts.com/v1/csv/mtgoxUSD.csv, fixed op, ignore previous post.


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: molecular on November 02, 2013, 08:05:09 AM
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

this thread could be interesting for you: https://bitcointalk.org/index.php?topic=316297.0


Title: Re: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries
Post by: oakpacific on November 03, 2013, 02:19:51 PM
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

this thread could be interesting for you: https://bitcointalk.org/index.php?topic=316297.0

Yeah, that's cool, thanks.