Bitcoin Forum
December 25, 2024, 04:13:49 AM *
Author Topic: Python client  (Read 36711 times)
Cdecker (OP)
Hero Member
*****
Offline Offline

Activity: 489
Merit: 505



View Profile WWW
July 04, 2010, 11:22:50 AM
Last edit: July 04, 2010, 11:57:30 AM by Cdecker
 #1

I was wondering whether anyone here might be interested in creating an alternative client for the project. It would not only make it easier for some people to get going with BitCoin but also make it easier to write customized clients and take load off the main development team.

For this to work we'd have to know exactly how the protocol is designed. I could reverse engineer it but I probably won't be able to explain everything Smiley

BTW: Python is not my first choice when it comes to programming languages but for this I think it'd be the most flexible.

Edit: just thought of it: Google App Engine supports Python so there's another application area Cheesy

Want to see what developers are chatting about? http://bitcoinstats.com/irc/bitcoin-dev/logs/
Bitcoin-OTC Rating
Gavin Andresen
Legendary
*
Offline Offline

Activity: 1652
Merit: 2311


Chief Scientist


View Profile WWW
July 04, 2010, 12:50:26 PM
Last edit: July 15, 2010, 01:58:21 PM by gavinandresen
 #2

I've started reverse-engineering and documenting the wallet and block databases, and have written some Python code that deserializes many of the Bitcoin data structures.  I was going to let it ferment a little more before announcing, but Mr. Google will surely find and index it soon, and it should be a good head start for anybody who wants to start a Python bitcoin client.

Open source, MIT license, at:
  https://code.google.com/p/bitcointools/
  MOVED TO git:  http://github.com/gavinandresen/bitcointools


How often do you get the chance to work on a potentially world-changing project?
andrew
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
July 26, 2010, 03:49:50 PM
 #3

Are you going to be working on a full client yourself?  Grin
Gavin Andresen
Legendary
*
Offline Offline

Activity: 1652
Merit: 2311


Chief Scientist


View Profile WWW
July 26, 2010, 05:16:43 PM
 #4

Are you going to be working on a full client yourself?  Grin
I've been tempted... but no, I'm more interested in doing web-based Bitcoin apps, and extending the existing C++ implementation's JSON-RPC API is a whole lot less work than re-implementing the whole shebang.

mtve
Newbie
*
Offline Offline

Activity: 9
Merit: 0


View Profile WWW
July 26, 2010, 05:35:47 PM
 #5

Hi!

I had an almost-working pure Perl client, but since I heard alt clients are not welcome, I abandoned that project.

--
mtve
NewLibertyStandard
Sr. Member
****
Offline Offline

Activity: 252
Merit: 268



View Profile WWW
July 26, 2010, 10:37:43 PM
 #6

Hi!

I had an almost-working pure Perl client, but since I heard alt clients are not welcome, I abandoned that project.

--
mtve
Where did you hear that?

Treazant: A Fullever Rewarding Bitcoin - Backup Your Wallet TODAY to Double Your Money! - Dual Currency Donation Address: 1Dnvwj3hAGSwFPMnkJZvi3KnaqksRPa74p
mtve
Newbie
*
Offline Offline

Activity: 9
Merit: 0


View Profile WWW
July 27, 2010, 04:41:30 PM
 #7

http://bitcointalk.org/index.php?topic=221.0 etc
Cdecker (OP)
Hero Member
*****
Offline Offline

Activity: 489
Merit: 505



View Profile WWW
July 27, 2010, 07:21:30 PM
 #8

I strongly believe that a system like BitCoin is only destined to survive for a long time if the protocol is standardized and split from the Mainline client development.

A minimalistic, but clean, Python implementation of the client could be used as a reference for future clients that could then be custom tailored to the needs of its users.

Cdecker (OP)
Hero Member
*****
Offline Offline

Activity: 489
Merit: 505



View Profile WWW
July 27, 2010, 10:12:07 PM
Last edit: July 28, 2010, 01:47:12 AM by Cdecker
 #9

I'm starting to reverse engineer the protocol, but my C++ is incredibly rusty ;)
As far as I understand it, the protocol is based on simple messages, each beginning with 16 fixed bytes followed by a variable argument list. Each message starts with the sequence {0xf9,0xbe,0xb4,0xd9}, followed by the command itself, which is 12 bytes (0x00 padded). Depending on the type of the message we have to parse some arguments. Thus far I've analyzed the "version" command, whose goal is to exchange some initial information (both parties send and receive a version message):

version
  • {0xf9,0xbe,0xb4,0xd9}
  • "version" (0x00 padded)
  • 4 byte message size
  • 4 byte checksum
  • 8 byte nLocalServices (always 1 if !fClient; no idea what that means either)
  • 8 byte timestamp (remember to use network byte order)
  • Two addresses which I was unable to recognize (~44 bytes)
  • 8 byte nLocalHostNonce (needed for a handshake, if I'm not mistaken)
  • A subversion string ".0" in my case
The addresses are exchanged only if we are not proxied; this way the nodes can learn their external address.

Anyone else interested in byte-guessing? :D

lachesis
Full Member
***
Offline Offline

Activity: 210
Merit: 105


View Profile
July 28, 2010, 01:16:58 AM
 #10

According to main.cpp:1761 (svn r113):
Code:
    // Message format
    //  (4) message start
    //  (12) command
    //  (4) size
    //  (4) checksum
    //  (x) data
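
For anyone following along in Python, the layout in that comment can be sketched with the struct module. This is only a sketch under assumptions: it treats the size and checksum fields as little-endian unsigned ints and assumes the checksum field is present (a 24-byte header). parse_header and the constant names are made up for illustration and are not from the Bitcoin source.

```python
import struct

# Message start sequence quoted earlier in the thread.
MAGIC = bytes([0xF9, 0xBE, 0xB4, 0xD9])
# start (4), command (12), size (4), checksum (4); little-endian is an assumption.
HEADER = struct.Struct("<4s12sII")


def parse_header(data):
    """Split a 24-byte header into (command, payload_size, checksum)."""
    if len(data) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    start, command, size, checksum = HEADER.unpack_from(data)
    if start != MAGIC:
        raise ValueError("bad message start")
    # The command is ASCII, padded to 12 bytes with 0x00.
    return command.rstrip(b"\x00").decode("ascii"), size, checksum
```

The (x) data bytes would then be whatever follows the header, up to the size it reports.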

Bitcoin Calculator | Scallion | GPG Key | WoT Rating | 1QGacAtYA7E8V3BAiM7sgvLg7PZHk5WnYc
Cdecker (OP)
Hero Member
*****
Offline Offline

Activity: 489
Merit: 505



View Profile WWW
July 28, 2010, 01:49:03 AM
 #11

That sounds reasonable :-)
Any idea what checksum algorithm this is? And is the header (4 bytes and message type) also considered in the checksum and message size?

Edit: Just checked: the size is the number of bytes following the size field itself; I guess the checksum covers that range as well.

lachesis
Full Member
***
Offline Offline

Activity: 210
Merit: 105


View Profile
July 28, 2010, 02:02:21 AM
Last edit: July 28, 2010, 03:12:45 AM by lachesis
 #12

That sounds reasonable :-)
Any idea what Checksum algorithm this is? And is the header (4bytes and message type) also considered in the checksum and message size?
From my research, the header is _not_ considered in the size. It cannot be considered in the checksum, since the checksum would have to be known to calculate the checksum.

Here's the relevant code. Some comments added. I don't know what exactly the Hash() function does yet.
Code:
        // Set the size
        unsigned int nSize = vSend.size() - nMessageStart; // vSend is the message buffer, of type CDataStream (look in serialize.h)
        // essentially, this says "size of whole message" - "starting position of data chunk"
        // therefore, nMessageStart = size_of(header).
        memcpy((char*)&vSend[nHeaderStart] + offsetof(CMessageHeader, nMessageSize), &nSize, sizeof(nSize)); // writes nSize into the right spot in the message.

        // Set the checksum
        if (vSend.GetVersion() >= 209)
        {
            uint256 hash = Hash(vSend.begin() + nMessageStart, vSend.end()); // Take a hash of the message data
            unsigned int nChecksum = 0;
            memcpy(&nChecksum, &hash, sizeof(nChecksum));
            assert(nMessageStart - nHeaderStart >= offsetof(CMessageHeader, nChecksum) + sizeof(nChecksum));
            memcpy((char*)&vSend[nHeaderStart] + offsetof(CMessageHeader, nChecksum), &nChecksum, sizeof(nChecksum)); // Put that hash in the right spot.
        }
It's from net.h:703-716.
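
A possible Python equivalent of the checksum step, as a sketch only. It assumes Hash() is SHA-256 applied twice, which is what the Hash() template in the Bitcoin source computes but which this post had not yet confirmed. The memcpy in the C++ copies the first four bytes of that hash into the header. payload_checksum is a hypothetical name.

```python
import hashlib


def payload_checksum(payload):
    """First four bytes of SHA-256(SHA-256(payload)), as they would appear on the wire."""
    first_pass = hashlib.sha256(payload).digest()
    return hashlib.sha256(first_pass).digest()[:4]
```

Only the message data goes into the hash, matching vSend.begin() + nMessageStart in the C++ above; the header is left out.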

The problem I'm having is parsing the version command's data. Version _should_ be made of these things serialized, as per net.h:580:
Code:
PushMessage("version", VERSION, nLocalServices, nTime, addrYou, addrMe,
                    nLocalHostNonce, string(pszSubVer), nBestHeight);

However, VERSION = 304 according to my client, and here's the hex dump of the data I'm getting:
Code:
eswanson@eswanson-laptop:~$ python read.py 
"version" 87b (checksum: 304)
0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x9F 0x92 0x4F 0x4C 0x00 0x00 0x00 0x00
0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0xFF 0xFF 0x7F 0x00 0x00 0x01
0xB5 0x2C 0x01 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0xFF 0xFF 0x41 0x1B
0xC2 0xBA 0x20 0x8D 0xDF 0x51 0xFA 0x2A
0x26 0x52 0x6C 0x67 0x02 0x2E 0x30 0x44
0x14 0x01 0x00
You know, I just realized: the checksum is 304... So my header parsing code must be doing something wrong. The header _is_ supposed to be 24 bytes, right?

--EDIT--
Alright, I've been playing with the "version" message the client sends automatically when you connect to it. First of all, as much as I respect Satoshi, I really don't like this format. Something like protocol buffers would have been much neater and more efficient.

Secondly, I suspect that this message doesn't have a checksum because the sender can't be sure whether the client it is talking to has a version >= 209 until it gets a verack or version message. I'll play with that next, but I think the checksum is omitted initially to deal with older clients.

Here's my analysis of the components of the data section of this message:
Code:
VERSION - int (4b)
nLocalServices - uint64 (8b)
nTime - uint64 (8b)
addrYou - struct (26b) - more below
addrMe - struct (26b) - more below
nLocalHostNonce - uint64 (8b)
pszSubVer - variable length string
nBestHeight - int (4b)

Now for the address struct, here's the IMPLEMENT_SERIALIZE block:
Code:
IMPLEMENT_SERIALIZE
    (
        if (nType & SER_DISK)
        {
            READWRITE(nVersion);
            READWRITE(nTime);
        }
        READWRITE(nServices);
        READWRITE(FLATDATA(pchReserved)); // for IPv6
        READWRITE(ip);
        READWRITE(port);
    )
from net.h:222-233.

From what I can tell, this indicates that the over the wire format is:
Code:
nServices - uint64 (8b)
pchReserved - (12b)
ip - uint (4b)
port - unsigned short (2b)
Interestingly, the IP and port are in big-endian (network) byte order, while everything in the main struct is in little-endian (host) byte order.
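
The 26-byte layout above could be decoded roughly like this. It is a sketch that assumes nServices is little-endian and the IP and port are big-endian, as observed in the dumps; parse_address is a hypothetical helper, not something from the client.

```python
import socket
import struct


def parse_address(data, offset=0):
    """Decode one 26-byte address struct into (services, ip, port)."""
    if len(data) - offset < 26:
        raise ValueError("address struct needs 26 bytes")
    services, = struct.unpack_from("<Q", data, offset)     # nServices, little-endian
    # Bytes offset+8 .. offset+20 are pchReserved (12 bytes) and are skipped here.
    ip = socket.inet_ntoa(bytes(data[offset + 20:offset + 24]))
    port, = struct.unpack_from(">H", data, offset + 24)    # network byte order
    return services, ip, port
```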

NewLibertyStandard
Sr. Member
****
Offline Offline

Activity: 252
Merit: 268



View Profile WWW
July 28, 2010, 08:47:14 AM
 #13

Well, I think clients in other languages would be very valuable. I would particularly enjoy looking through a Java client. There may be some unknown danger in connecting a different client to the production Bitcoin network, but you can always just code it to only connect to the test network and include a note in the source code requesting that people not connect to the production network. I'm sure that if it were very thoroughly tested and received plenty of attention, eventually no one would have a problem with it connecting to the production network.

Cdecker (OP)
Hero Member
*****
Offline Offline

Activity: 489
Merit: 505



View Profile WWW
July 28, 2010, 02:43:07 PM
 #14

From what I can tell, this indicates that the over the wire format is:
Code:
nServices - uint64 (8b)
pchReserved - (12b)
ip - uint (4b)
port - unsigned short (2b)
    Interestingly, the IP and Port are in Big-Endian / Network Byte Order, while everything in the main struct is in Little-Endian / Host Byte Order
Great, I was able to reconstruct the IP address from my dump and it turns out correctly. So just to sum it up:

version
  • {0xf9,0xbe,0xb4,0xd9}
  • "version" (0x00 padded)
  • 4 byte message size
  • 4 byte checksum
  • 8 byte nLocalServices (always 1 if !fClient; no idea what that means either)
  • 8 byte timestamp (remember to use network byte order)
  • addrYou (the address this node sees you under):
    • nServices - uint64 (8b), still cryptic, don't know the meaning yet
    • pchReserved - (12b): some reserved space, apparently for later IPv6
    • ip - uint (4b)
    • port - unsigned short (2b)
  • addrMe (the address this node thinks it has):
    • nServices - uint64 (8b), still cryptic, don't know the meaning yet
    • pchReserved - (12b): some reserved space, apparently for later IPv6
    • ip - uint (4b)
    • port - unsigned short (2b)
  • 8 byte nLocalHostNonce (needed for a handshake, if I'm not mistaken)
  • A subversion string ".0" in my case
  • nBestHeight - int (4b): appears to be the last block number
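
Putting the fields after the header together, a parser for the data section of a "version" message might look like the sketch below. The assumptions: all integers are little-endian except the IP and port inside each address, the first 4 bytes of the data are the VERSION int from the PushMessage call, and the subversion string has a single length byte in front of it. parse_version and the dictionary keys are illustrative names only.

```python
import socket
import struct


def parse_version(payload):
    """Decode the data section of a "version" message into a dict."""
    version, services, timestamp = struct.unpack_from("<iQQ", payload, 0)
    pos = 20
    addresses = []
    for _ in range(2):  # addrYou first, then addrMe
        addr_services, = struct.unpack_from("<Q", payload, pos)
        ip = socket.inet_ntoa(bytes(payload[pos + 20:pos + 24]))
        port, = struct.unpack_from(">H", payload, pos + 24)
        addresses.append((ip, port, addr_services))
        pos += 26
    nonce, = struct.unpack_from("<Q", payload, pos)
    pos += 8
    sub_len = payload[pos]  # assumed: one length byte, then the string
    sub_ver = bytes(payload[pos + 1:pos + 1 + sub_len]).decode("ascii")
    pos += 1 + sub_len
    best_height, = struct.unpack_from("<i", payload, pos)
    return {
        "version": version,
        "services": services,
        "time": timestamp,
        "addr_you": addresses[0],
        "addr_me": addresses[1],
        "nonce": nonce,
        "sub_ver": sub_ver,
        "best_height": best_height,
    }
```

The header (and the checksum, if one is present) has to be stripped before calling this; only the message data is passed in.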

lachesis
Full Member
***
Offline Offline

Activity: 210
Merit: 105


View Profile
July 28, 2010, 07:19:33 PM
Last edit: July 28, 2010, 10:33:09 PM by lachesis
 #15

--EDIT--
Here's my (sloppy) Python script to read and parse the initial version message. Run it with no arguments to connect to localhost or one argument to connect to the computer at that IP / hostname.
http://www.alloscomp.com/bitcoin/version.pys
I also decided to host the code on GitHub and create a Google Code page. I'll be happy to give commit access / read-write privs to the Google Code page to anyone who's interested.
Sample output:
Code:
"version" 87b (checksum: 0)
Version: 304
nLocalServices: 1
nTime: 1280347633
addrYou: 127.0.0.1:57461 (nServices: 1)
addrMe: 65.27.194.186:8333 (nServices: 1)
nLocalHostNonce: 2584997320055052487
vSubStr: ".0"
nBestHeight: 70852

0x30 0x01 0x00 0x00 0x01 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0xF1 0x8D 0x50 0x4C
0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0xFF 0xFF
0x7F 0x00 0x00 0x01 0xE0 0x75 0x01 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xFF 0xFF 0x41 0x1B 0xC2 0xBA 0x20 0x8D
0xC7 0xD8 0x37 0xDF 0x5D 0xC1 0xDF 0x23
0x02 0x2E 0x30 0xC4 0x14 0x01 0x00
--/EDIT--

Pretty close.
  • 4 byte checksum
Absent for version messages. I think it will be included for every message _after_ version though.

  • 8 byte timestamp (remember to use network byte order)
This is in host byte order afaict:
Code:
0x7E 0x81 0x50 0x4C  0x00 0x00 0x00 0x00
  • A subversion string ".0" in my case
I'm getting some weird stuff here: my string is 3 bytes long and starts with 0x02:
"0x02 0x2E 0x30" ("\02.0"). It's also not 0x00 terminated. On an older client with no subversion string, it's just one 0x00.

Oh, never mind: it's one byte of length followed by the string.
So 0x02: ".0". On the older client, the string's length is 0. That's much better than I anticipated.

Everything else in your summary looks good!

Finally, it looks like the checksum is only calculated when message size > 0. That's wise. The checksum also doesn't appear to be sent for "version" messages even when one client knows the other is new enough to accept the checksum.

Upon connecting to one another, both clients send a "version" message. Each then sends "verack" upon receiving the other's version message. Then one sends an "inv" message that appears to contain a list of transaction IDs that node knows about. The other node then queues the transactions it doesn't know about in "mapAskFor" and issues a getdata request (or several; it splits them when their size > 1000) for those transactions.
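
To take part in that exchange, a Python client would also have to frame its own messages. Here is a rough sketch of the sending side. Whether a given message carries the checksum is left to the caller, since the rules seen so far (none for "version", otherwise only for new enough peers) are still guesses. It also assumes a little-endian size field and double SHA-256 for the checksum. frame_message is a made-up name.

```python
import hashlib
import struct

# Message start sequence quoted earlier in the thread.
MAGIC = bytes([0xF9, 0xBE, 0xB4, 0xD9])


def frame_message(command, payload=b"", with_checksum=True):
    """Prefix payload with start bytes, padded command, size and optional checksum."""
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command longer than 12 bytes")
    header = MAGIC + name.ljust(12, b"\x00") + struct.pack("<I", len(payload))
    if with_checksum:
        # Assumed: first 4 bytes of SHA-256 applied twice to the payload.
        header += hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return header + payload
```

For the initial handshake this would be called as frame_message("version", data, with_checksum=False), which gives a 20-byte header followed by the data.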

    Cdecker (OP)
    Hero Member
    *****
    Offline Offline

    Activity: 489
    Merit: 505



    View Profile WWW
    July 29, 2010, 10:56:25 AM
     #16

    --EDIT--
    Here's my (sloppy) Python script to read and parse the initial version message. Run it with no arguments to connect to localhost or one argument to connect to the computer at that IP / hostname.
    http://www.alloscomp.com/bitcoin/version.pys
    I also decided to host the code on GitHub and create a Google Code page. I'll be happy to give commit access / read-write privs to the Google Code page to anyone who's interested.
    The wiki on Google Code or GitHub looks like a very nice place to document our efforts. Could I get access to it?

    lachesis
    Full Member
    ***
    Offline Offline

    Activity: 210
    Merit: 105


    View Profile
    July 29, 2010, 10:37:56 PM
     #17

    The wiki on Google Code or GitHub looks like a very nice place to document our efforts. Could I get access to it?
    Sure. I sent you a PM. Anyone who wants to be part of the project just drop me a PM with your Google Account / Gmail email address.

    Cdecker (OP)
    Hero Member
    *****
    Offline Offline

    Activity: 489
    Merit: 505



    View Profile WWW
    July 31, 2010, 02:47:45 PM
     #18

    The wiki on Google Code or GitHub looks like a very nice place to document our efforts. Could I get access to it?
    Sure. I sent you a PM. Anyone who wants to be part of the project just drop me a PM with your Google Account / Gmail email address.
    Thanks mate

    Cdecker (OP)
    Hero Member
    *****
    Offline Offline

    Activity: 489
    Merit: 505



    View Profile WWW
    July 31, 2010, 02:51:33 PM
    Last edit: July 31, 2010, 03:02:40 PM by Cdecker
     #19

    I started the Wiki page here http://code.google.com/p/pybitcoin/wiki/BitcoinProtocol

    So far we have the version message pretty much worked out, except for nLocalServices, nLocalHostNonce and nBestHeight; I can't fathom what they are used for.

    As for the other messages I have identified these so far:

    • version
    • verack
    • getdata
    • addr
    • inv
    • getaddr
    • block
    • ping
    • reply
    • subscribe
    • sub-cancel
    • publish
    • pub-cancel

    lachesis
    Full Member
    ***
    Offline Offline

    Activity: 210
    Merit: 105


    View Profile
    August 03, 2010, 02:18:29 AM
     #20

    Nice work! I made some small changes.

    So far we have the version message pretty worked out, except the nLocalServices, nLocalHostNonce and the nBestHeight, which I can't fathom what they are used for.
    nBestHeight is the number of blocks minus one, which means it's probably the index of the last block the host knows about. That's so the nodes can choose who should ask for blocks after the initial version message exchange. I don't know what the rest of those are, except that nLocalHostNonce is used (at least partially) to make sure the client isn't connecting to itself and nLocalServices appears to always be 1.

    As for the other messages I have identified these so far:
    Don't forget "tx", used to insert a transaction.
