Bitcoin Forum
November 01, 2024, 07:07:02 AM *
News: Bitcoin Pumpkin Carving Contest
 
   Home   Help Search Login Register More  
Pages: [1] 2 »  All
  Print  
Author Topic: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries  (Read 5739 times)
molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 22, 2013, 04:48:33 PM
Last edit: November 02, 2013, 08:02:14 AM by molecular
 #1

Due to popular demand, here's a little intro on how to get mtGox trade data (all trades) into a sql database and run some queries on the data:

Depending on your system you might need a lot of stuff I wont list here (like postgres database server, wget/curl, ...). You'll just run into it when you need it.

Firstly we need to wget or curl the csv from bitcoincharts.com ("#>" means you should enter this in a shell):

Code:
#> curl "http://api.bitcoincharts.com/v1/csv/mtgoxUSD.csv" > trades.csv

This will write a file "trades.csv" containing all mtgox trades.

now create a database ("postgres=#" means enter this in psql shell):

Code:
#> psql -U postgres
postgres=# create database mtgox;
CREATE DATABASE

postgres=# \connect mtgox;
You are now connected to database "mtgox" as user "postgres".

mtgox=# \q
#>

now you have created a database called "mtgox" and you can open a psql shell to it directly next time:

we'll create a table for the trades and import the data:

Code:
#> psql -U postgres mtgox
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(16,8), volume numeric(16,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE

mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;

mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178

mtgox=#

done. now you can run queries:

Code:
mtgox=# select count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades;
   cnt   |      vol_btc      |           vol_usd
---------+-------------------+------------------------------
 3563178 | 39113649.25693204 | 293136750.814254195366000000   cnt   | vol_btc  |  vol_usd

have fun and please, paste your queries here if you have cool ones.

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
gimme_bottles
Sr. Member
****
Offline Offline

Activity: 316
Merit: 250



View Profile
February 22, 2013, 04:50:49 PM
 #2

thanks  Grin

i did not know bitcoincharts offers all trade data in csv, great service.
molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 22, 2013, 04:57:52 PM
 #3

thanks  Grin

i did not know bitcoincharts offers all trade data in csv, great service.

took me a while to find that, too. Actually, I think I talked to tcatm and he told me.

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 22, 2013, 04:59:21 PM
 #4

oh, forgot to update the timestamp:

Code:
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';

will fix in OP

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 22, 2013, 05:01:48 PM
Last edit: February 23, 2013, 10:56:37 AM by molecular
 #5

which leads me to the first "cool query":

Code:
mtgox=# select extract(year from t) as y, count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades group by y order by y;
  y   |   cnt   |      vol_btc      |           vol_usd
------+---------+-------------------+------------------------------
 2010 |    8757 |  2666237.77200000 |    433663.023527280000000000
 2011 | 1279900 | 13525744.44429303 |  86008826.163108379280100000
 2012 | 1970995 | 20228998.28350803 | 152645161.420049849453300000
 2013 |  303526 |  2692668.75713098 |  54049100.207568686632600000

result: 2012-volume (in USD) was almost twice the 2011-volume Wink

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
steamboat
Hero Member
*****
Offline Offline

Activity: 648
Merit: 500


View Profile
February 22, 2013, 07:51:55 PM
 #6

which leads me to the first "cool query":

Code:
mtgox=# select extract(year from t) as y, count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades group by y order by y;
  y   |   cnt   | vol_btc  |  vol_usd
------+---------+----------+-----------
 2010 |    8757 |  2666295 |        45
 2011 | 1279900 | 13493417 |  85888519
 2012 | 1970995 | 20187125 | 152856841
 2013 |  303526 |  2683242 |  53871199

result: 2012-volume (in USD) was almost twice the 2011-volume Wink


and according to the current trend, less than 2013 volume in btc, but more than twice the volume in USD

ASIC miners available for purchase

Those who serve best, profit most.
notme
Legendary
*
Offline Offline

Activity: 1904
Merit: 1002


View Profile
February 22, 2013, 10:16:06 PM
 #7

Code:
mtgox=# select count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades;
   cnt   | vol_btc  |  vol_usd
---------+----------+-----------
 3563178 | 39030079 | 292616604
(1 row)

Weighted average price = 292616604/39030079 = $7.50/BTC

https://www.bitcoin.org/bitcoin.pdf
While no idea is perfect, some ideas are useful.
molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 22, 2013, 10:21:27 PM
Last edit: February 23, 2013, 10:57:34 AM by molecular
 #8

Code:
mtgox=# select count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd from trades;
   cnt   | vol_btc  |  vol_usd
---------+----------+-----------
 3563178 | 39030079 | 292616604
(1 row)

Weighted average price = 292616604/39030079 = $7.50/BTC

good idea, vwap by year:

Code:
mtgox=# select extract(year from t) as y, count(*) as cnt, sum(volume) as vol_btc, sum(volume*price) as vol_usd, sum(volume*price) / sum(volume) as vwap from trades group by y order by y;
  y   |   cnt   |      vol_btc      |           vol_usd            |          vwap
------+---------+-------------------+------------------------------+------------------------
 2010 |    8757 |  2666237.77200000 |    433663.023527280000000000 | 0.16264979368362200219
 2011 | 1279900 | 13525744.44429303 |  86008826.163108379280100000 |   6.358897768425487151
 2012 | 1970995 | 20228998.28350803 | 152645161.420049849453300000 |   7.545858637226536507
 2013 |  303526 |  2692668.75713098 |  54049100.207568686632600000 |  20.072688133076431703

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
oakpacific
Hero Member
*****
Offline Offline

Activity: 784
Merit: 1000


View Profile
February 23, 2013, 01:42:00 AM
 #9

Gox should made no less than  $1 million last year, it seems.

https://tlsnotary.org/ Fraud proofing decentralized fiat-Bitcoin trading.
oakpacific
Hero Member
*****
Offline Offline

Activity: 784
Merit: 1000


View Profile
February 23, 2013, 01:44:25 AM
 #10

Sorry, I think there is something wrong with the 2010 volume in USD...

https://tlsnotary.org/ Fraud proofing decentralized fiat-Bitcoin trading.
🏰 TradeFortress 🏰
Bitcoin Veteran
VIP
Legendary
*
Offline Offline

Activity: 1316
Merit: 1043

👻


View Profile
February 23, 2013, 01:47:31 AM
 #11

Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?
oakpacific
Hero Member
*****
Offline Offline

Activity: 784
Merit: 1000


View Profile
February 23, 2013, 01:53:36 AM
 #12

Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible

https://tlsnotary.org/ Fraud proofing decentralized fiat-Bitcoin trading.
bb113
Hero Member
*****
Offline Offline

Activity: 728
Merit: 500


View Profile
February 23, 2013, 03:15:57 AM
 #13

Due to popular demand, here's a little intro on how to get mtGox trade data (all trades) into a sql database and run some queries on the data:


Its not all trades. Its close because the time interval is 1 second. To get all trades you need to get the data from:

t=0
https://mtgox.com/api/1/BTCUSD/trades?since=t

which will give you this:

https://mtgox.com/api/1/BTCUSD/trades?since=0


find the largest timestamp ( "date":1280801871 in this case) in that data, then change the t variable and make another query like:

t=max(date)+1
https://mtgox.com/api/1/BTCUSD/trades?since=t

which gives you:

https://mtgox.com/api/1/BTCUSD/trades?since=1280801872


then repeat

molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 23, 2013, 10:49:10 AM
 #14

Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible

yes, you're correct. something wrong... might also affect the other years... checking.

EDIT: oh goddamnit, I accidentally used an old script for this. The datatypes are wrong, everything was imported as integer with 0 decimal places.

I'm sorry. I will fix this... will take a while.

EDIT2 corrected OP. you can fix things by doing this:

Code:
mtgox=# drop table trades;
DROP TABLE
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(32,10), volume numeric(32,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE
mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178

EDIT3: fixed my other 2 posts containing queries. The values for the years >=2011 changed only "slightly".

EDIT4: anyone knowledgable with postgres have a suggestion what datatype to best use for the monetary values?

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
oakpacific
Hero Member
*****
Offline Offline

Activity: 784
Merit: 1000


View Profile
February 23, 2013, 10:59:22 AM
 #15

Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible

yes, you're correct. something wrong... might also affect the other years... checking.

EDIT: oh goddamnit, I accidentally used an old script for this. The datatypes are wrong, everything was imported as integer with 0 decimal places.

I'm sorry. I will fix this... will take a while.

EDIT2 corrected OP. you can fix things by doing this:

Code:
mtgox=# drop table trades;
DROP TABLE
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(32,10), volume numeric(32,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE
mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178

EDIT3: fixed my other 2 posts containing queries. The values for the years >=2011 changed only "slightly".

EDIT4: anyone knowledgable with postgres have a suggestion what datatype to best use for the monetary values?

Thanks, I guessed this would be a floating point problem. Smiley

https://tlsnotary.org/ Fraud proofing decentralized fiat-Bitcoin trading.
piramida
Legendary
*
Offline Offline

Activity: 1176
Merit: 1010


Borsche


View Profile
February 23, 2013, 11:03:28 AM
 #16

Thanks, useful. How do you fetch updates, or you recreate the full table every time?

i am satoshi
molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 23, 2013, 11:24:03 AM
 #17

Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low Wink

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)


PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
oakpacific
Hero Member
*****
Offline Offline

Activity: 784
Merit: 1000


View Profile
February 23, 2013, 11:28:34 AM
 #18

Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low Wink

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



How is the size of the CSV file? Also I don't recommend people updating their databases too frequently, bitcoinchart is already under quite a bit of load and went 503 from time to time.

https://tlsnotary.org/ Fraud proofing decentralized fiat-Bitcoin trading.
molecular (OP)
Donator
Legendary
*
Offline Offline

Activity: 2772
Merit: 1019



View Profile
February 23, 2013, 11:31:52 AM
 #19

Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low Wink

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



How is the size of the CSV file? Also I don't recommend people updating their databases too frequently, bitcoinchart is already under quite a bit of load and went 503 from time to time.

142 MB

tcatm recently told me at ccc it wasn't a problem for the server. I'm all for figuring out a way to do updates, though.

PGP key molecular F9B70769 fingerprint 9CDD C0D3 20F8 279F 6BE0  3F39 FC49 2362 F9B7 0769
oakpacific
Hero Member
*****
Offline Offline

Activity: 784
Merit: 1000


View Profile
February 23, 2013, 11:36:55 AM
 #20

Thanks, useful. How do you fetch updates, or you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low Wink

The bitcoinchart.com api takes a start-time (unix time), that should be passed as "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (wether or not it overwrites data or how that can be configured)



How is the size of the CSV file? Also I don't recommend people updating their databases too frequently, bitcoinchart is already under quite a bit of load and went 503 from time to time.

142 MB

tcatm recently told me at ccc it wasn't a problem for the server. I'm all for figuring out a way to do updates, though.


Molecular what are the programs you used for trading? Also give me your donation address please?

https://tlsnotary.org/ Fraud proofing decentralized fiat-Bitcoin trading.
Pages: [1] 2 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!