I have spent much of the past two months developing all the utilities I needed to write BTC software in Python. As I quickly found out, Python is dirt slow, so in order to process the blockchain I wrote a super-optimized C++ layer which is pulled into Python through SWIG. The end result has been fantastic! And here I release it into the wild under the GNU Affero General Public License:
PyBtcEngine on GitHub by etotheipi
To be clear:
This is a computational back-end that could be used as a base for Python-based BTC software. It does not include any networking code at all. I have done my best to make this code "usable," meaning well-formatted code and lots of comments. Unfortunately, Bitcoin is f***ing complicated, and so there's only so much one can do to make the code easier to comprehend in this situation. Also, I'm not a software engineer: I'm a mathematician with an engineering background and generally write algorithms to go into software, not the software itself. I'm sure many people will disagree with some of my software design (such as some obvious places I should've used inheritance but chose not to for the sake of simplicity). Oh well. It's OSS now, so clone it and refactor it all you want.
Library Details:
Be aware that the current implementation holds everything in memory, and so it takes up about 1.2 GB of RAM right now. I plan to improve this in the future, but my computer has 8 GB so I'm not in any hurry to make it more lightweight. On the other hand, because of this (and my painstakingly careful memory management), it is ridiculously fast. Here are the timings, measured on a single thread of an AMD Phenom X4 840 CPU with 8 GB of 1333 MHz DDR3:
- Read blockchain into RAM: 5s
- Scan blockchain, collect headers/txs: 10s
- Organize and find longest chain: 0.5s
- Verify blkfile integrity (merkle roots and leading zeros on header hashes): 2.5s
- Get balances/ledger for a set of addresses/wallets, from scratch: 5s
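For readers unfamiliar with what the integrity check in the list above involves, here is a minimal pure-Python sketch of the idea. It is not the code from PyBtcEngine (which does this in C++); the function names are made up for illustration, and it assumes transaction hashes are given in Bitcoin's internal byte order and that a header is the standard 80 bytes with the merkle root at bytes 36 to 68.

```python
import hashlib


def hash256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256 (the "Hash256" of the status list)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes) -> bytes:
    """Fold a list of 32-byte tx hashes (internal byte order) into one root."""
    level = list(tx_hashes)
    if not level:
        raise ValueError("a block holds at least one transaction")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: the last hash is paired with itself
        level = [hash256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def leading_zero_bytes(header_hash: bytes) -> int:
    """Count zero bytes at the front of a hash as it is normally displayed."""
    displayed = header_hash[::-1]  # block hashes are shown byte-reversed
    return len(displayed) - len(displayed.lstrip(b"\x00"))


def header_matches_txs(header: bytes, tx_hashes) -> bool:
    """True if the merkle root stored in an 80-byte header matches the tx list."""
    if len(header) != 80:
        raise ValueError("block headers are 80 bytes")
    return header[36:68] == merkle_root(tx_hashes)
```

A full check would run `header_matches_txs` and `leading_zero_bytes(hash256(header))` over every block in the file; doing that in plain Python is exactly the kind of loop that pushed me to C++.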
Yes, you can use this tool to scan your entire blockchain file, verify there are no errors in it, find the longest chain and find all transactions for a given address (or set of addresses) in under 30s! Yes, all 600 MB of blockchain. One of the reasons it's so fast is that I copy all data once into a single bulk 600 MB chunk of RAM, and then everything else is pointers to locations in this chunk, and frequently pointers to these pointers. It may be complicated, but no one can say it isn't fast!
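To illustrate the "one big chunk plus pointers" idea, here is a rough Python analogue using `memoryview` slices, which refer to the original buffer instead of copying it. This is only a sketch of the concept, not how the C++ layer is written: the function name is hypothetical, and the record layout (a 4-byte little-endian length followed by the payload) is a simplification I'm assuming for the example.

```python
import struct


def index_records(blob: bytes):
    """Return zero-copy views of each length-prefixed record inside blob.

    Assumed layout (illustrative only): 4-byte little-endian size, then payload.
    """
    view = memoryview(blob)  # wraps the buffer without copying it
    records = []
    pos = 0
    while pos + 4 <= len(view):
        (size,) = struct.unpack_from("<I", view, pos)
        start, end = pos + 4, pos + 4 + size
        if end > len(view):
            break  # truncated trailing record: stop indexing
        records.append(view[start:end])  # a slice of a view is still a view
        pos = end
    return records
```

Each entry in the returned list plays the role of a pointer into the bulk buffer: the bytes are read once, and everything built on top only stores offsets into them.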
There is some complexity in the fact that the C++ code does not have any bignum or ECDSA capabilities, while the Python library doesn't have any of the C++ utilities for scanning/maintaining the blockchain. So, the libraries must be combined to get a full backend, but I have a ton of example/unit-testing code which should highlight how it is used. In particular, three files contain examples of nearly every available method:
- (C++) BlockUtilsTest.cpp
- (Python) unittest.py
- (Together) testswig.py
Below, I have copied the "STATUS" section of the README, which shows the current capabilities:
* STATUS: Last Updated - 01 Oct, 2011
* Legend:
* _ not implemented
* . implemented but not tested
* + implemented and partially tested
* X implemented and tested
*
* C++ Python
* -----------------------------------------------------------------
* (01) Ser/Unser Block Objects X X
* (02) Hash160/Hash256 X X
* (03) Difficulty calcs X X
* (04) Address Generation X
* (05) Address Verify/Manip X
* -----------------------------------------------------------------
* (06) BlkHeaders read/scan/org X X
* (07) BlkHeaders reorgs X
* (08) Blockchain read/scan/org X
* (09) Blockchain reorgs +
* (10) Blockchain verify integrity X
* -----------------------------------------------------------------
* (11) NonStd Tx Detection + +
* (12) Script pprint X X
* (13) Script evaluation +
* (14) Script OP_CHECKSIG X
* (15) ECDSA Sign/Verify X
* -----------------------------------------------------------------
* (16) Address/Wallet tracking X X
* (17) Scan blkchain for Tx X
* (18) Scan blkchain for NonStd X
* (19) Reorg w/ double-spend .
* (20) Add new blockdata real-time .
* -----------------------------------------------------------------
* (21) Networking/Sockets
* (22) Blockchain download
* (23) Tx construction
* (24) Tx broadcast
* (25) Tx fee detect/calc/handle
As a whole, PyBtcEngine can perform all the ECDSA operations, evaluate most scripts, detect non-std scripts, and should handle blockchain reorganizations gracefully upon adding blockdata (though the reorg code hasn't been tested much). The last important thing for me to add before I start using it, instead of just developing it, is transaction construction and signing. All the pieces are there; I just need to write the code to collect available TxOuts from the C++ wallet, construct the tx packet, and apply the Python ECDSA code to sign it.
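As a rough idea of the first of those steps, collecting TxOuts to fund a payment, here is a minimal sketch. None of this exists in PyBtcEngine yet: `UnspentTxOut` and `select_txouts` are hypothetical names, the largest-first strategy is just one simple choice, and fees are ignored entirely.

```python
from typing import List, NamedTuple, Tuple


class UnspentTxOut(NamedTuple):
    """Hypothetical record for a spendable output; not a PyBtcEngine type."""
    tx_hash_hex: str
    index: int
    value: int  # amount in satoshis


def select_txouts(unspent: List[UnspentTxOut], target: int) -> Tuple[List[UnspentTxOut], int]:
    """Pick outputs largest-first until target is covered; return (chosen, change)."""
    if target <= 0:
        raise ValueError("target must be positive")
    chosen = []
    total = 0
    for out in sorted(unspent, key=lambda o: o.value, reverse=True):
        if total >= target:
            break
        chosen.append(out)
        total += out.value
    if total < target:
        raise ValueError("insufficient funds")
    return chosen, total - target
```

The chosen outputs would become the inputs of the new tx, with the leftover amount sent back to one of your own addresses as change before signing.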
License:
I naively picked the GNU Affero General Public License (AGPL) for this project. I have never developed OSS before, and thus I'm a newbie to the world of OSS licenses. I picked the AGPL because it seems to be more appropriate for networking-related code, and because it contains copyleft: all derivative works of this library must also be OSS with copyleft. If you believe I don't fully understand the licensing I have selected, please let me know.