Bitcoin Forum
1  Bitcoin / Development & Technical Discussion / Bitcoin block validator from scratch in C++ is now working, what's next? on: July 29, 2021, 03:31:54 PM
Hello,

I have been experimenting with bitcoin blocks in my spare time, parsing and validating them with my small C++ library. I can now parse all blocks, including the most recent ones, and validate a basic pay-to-public-key script using a small VM with a few opcodes that I wrote for script validation. The following test case shows how it works:
https://github.com/vimrull/vimrull/blob/main/tests/vm/testscript.cpp#L16

Now I am thinking about implementing some more opcodes so I can validate most of the blocks in the bitcoin blockchain, and then implementing all of them so I can validate the entire blockchain from genesis to the most recent block. It will still not be complete, as I have not implemented validations such as double-spend detection and basic syntactical validation. At this moment, even though the code I wrote is very ugly (I was trying to learn modern C++ at the same time), I am confident I can learn the rest of the things that come along (I have some rudimentary experience working as a security engineer on a well-known production system).

Now, being a newbie to bitcoin (I am sure I do not know what I do not know), I am not sure how much further I should learn before I can contribute to the main bitcoin code (if that is at all possible with my skill set). Any suggestions on what else I should try or experiment with before looking into contributing to the main bitcoin code?

Also, where can I find a list of small/easy tasks that need to be done, and how is development coordinated, i.e. how are contributions accepted or rejected?

I looked at https://github.com/bitcoin/bips/blob/master/bip-0325.mediawiki and it seems like something I can work on (there seems to be some similarity with segwit). It may be a good one, as it is targeted at a test network and carries a lower risk of messing things up than the main network.

I am not 100% sure whether Dev & Tech is the right group for this. Please suggest a more appropriate group, or move the post there, if it does not belong here.
2  Bitcoin / Development & Technical Discussion / Re: Verifying OP_CHECKSIG in a non-segwit block on: July 02, 2021, 09:13:43 AM
Keep in mind that things are a bit more complicated than that when serializing a transaction for signing...

Ha ha - absolutely. It was a beginner's-high moment when the signature validation worked for the first time :). It has been a long time (since school) since I last did a project like this.

Thanks for the sighash type and code_separator comment. I'll delay the transaction generation part until I am able to do the validation of most common types of signature and the script interpreter engine. But, when I am back with tx generation, I'll be able to refer back to that.

So, with this the test case is now complete with only two raw blocks and without any hard coded values (except for sanity checks): https://github.com/vimrull/vimrull/blob/b2dcb47ec109c021e67d9088c1e7eca69335f521/tests/vm/testscript.cpp#L35



3  Bitcoin / Development & Technical Discussion / Re: Verifying OP_CHECKSIG in a non-segwit block on: July 02, 2021, 04:20:19 AM
Serialization is the same, but some parts of the transaction must be modified before being hashed. ... you replace the signature script of the transaction with the pubkey script of the output that is being spent

Perfect - thanks for the pointer. Basically, right after the previous transaction hash and index goes the pubkey script from the old block's transaction:

Code:
01000000 # version
01       # one input
c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704  # tx hash from block 9
00000000 # input index 0
43       # script length
# push opcode 0x41 (65 bytes), the public key, then OP_CHECKSIG (0xac)
410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909
a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac
ffffffff # sequence
02       # we have 2 outputs
00ca9a3b00000000 # 10 coins
43               # length of script
# first output - sending 10 coins to a new address
4104ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414e7
aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84cac
00286bee00000000 # 40 coins
43               # length of script
# second output - sending 40 coins to address of coin origin
410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e
0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac
00000000 # lock time
01000000 # hash type


This makes sense - before the transaction signature is generated we do not have it, and putting the signature inside the transaction that is being signed and then hashing it would create an impossible chicken-and-egg scenario.

This seems to solve the validation problem for most non-segwit signatures, with a slightly altered version for later payment scripts that use a bitcoin address (a hash of the public key) instead of the public key itself. I still have to understand multi-sig and segwit signature validation. The good thing is that, as a side effect of this exercise, I now know how to generate a raw transaction.
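To make the substitution concrete, here is a minimal sketch of building the bytes that get double-SHA256 hashed for a SIGHASH_ALL signature. The Tx, TxIn and TxOut structs and the function names are hypothetical (they are not the types in my repository), counts and script lengths are assumed to fit in a single length byte, and OP_CODESEPARATOR handling and the other hash types are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical minimal transaction model, just enough for this sketch.
struct TxIn  { Bytes prev_txid; std::uint32_t prev_index; Bytes script_sig; std::uint32_t sequence; };
struct TxOut { std::uint64_t value; Bytes pubkey_script; };
struct Tx    { std::uint32_t version; std::vector<TxIn> inputs; std::vector<TxOut> outputs; std::uint32_t lock_time; };

// Append an integer in little-endian order using `size` bytes.
void put_le(Bytes& out, std::uint64_t v, int size) {
    for (int i = 0; i < size; ++i) out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Append a length-prefixed script. Assumption: the length fits in one byte (< 0xfd).
void put_script(Bytes& out, const Bytes& script) {
    assert(script.size() < 0xfd);
    out.push_back(static_cast<std::uint8_t>(script.size()));
    out.insert(out.end(), script.begin(), script.end());
}

// Serialize `tx` for signing input `index` with SIGHASH_ALL: the signed input
// carries the pubkey script of the output it spends, every other input gets an
// empty script, and the 4-byte hash type is appended at the end.
Bytes sighash_all_preimage(const Tx& tx, std::size_t index, const Bytes& prev_pubkey_script) {
    assert(index < tx.inputs.size() && tx.inputs.size() < 0xfd && tx.outputs.size() < 0xfd);
    Bytes out;
    put_le(out, tx.version, 4);
    out.push_back(static_cast<std::uint8_t>(tx.inputs.size()));
    for (std::size_t i = 0; i < tx.inputs.size(); ++i) {
        const TxIn& in = tx.inputs[i];
        out.insert(out.end(), in.prev_txid.begin(), in.prev_txid.end());
        put_le(out, in.prev_index, 4);
        put_script(out, i == index ? prev_pubkey_script : Bytes{});
        put_le(out, in.sequence, 4);
    }
    out.push_back(static_cast<std::uint8_t>(tx.outputs.size()));
    for (const TxOut& o : tx.outputs) {
        put_le(out, o.value, 8);
        put_script(out, o.pubkey_script);
    }
    put_le(out, tx.lock_time, 4);
    put_le(out, 1, 4); // hash type 1 = SIGHASH_ALL
    return out;
}
```

The result still has to be hashed twice with SHA-256 before it is handed to the ECDSA verification call, and the hash type byte at the end of the signature in the block has to be stripped off first.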

[If you see this, you are not pooya87 and have spendable merit, please send some for the previous reply - helped me a lot and I do not have sMerit to signal that to the system]
4  Bitcoin / Development & Technical Discussion / Verifying OP_CHECKSIG in a non-segwit block on: June 30, 2021, 08:55:47 PM
I am trying to validate a transaction from block 170. It has two outputs, 10 + 40 = 50 bitcoins in total. The output being spent (txo) comes from block 9.

I have loaded both blocks from hex data downloaded from the blockchain.info site:

https://blockchain.info/rawblock/000000008d9dc510f23c2657fc4f67bea30078cc05a90eb89e84cc475c080805?format=hex
https://blockchain.info/rawblock/00000000d1145790a8694403d4063f323d499e655c83426834d4ce2f8dd4a2ee?format=hex

I am able to load them using some code I wrote and validate most of the parameters of the blocks. But when I try to validate the ECDSA signature with openssl functions using this code: https://github.com/vimrull/vimrull/blob/b2dcb47ec109c021e67d9088c1e7eca69335f521/tests/crypto/test_open_ssl.cpp#L5  I am unable to validate it. To validate, I load the block, separate out the transaction and double hash it. Then I take the public key and the signature from the block and call the openssl verify method. Here is my test case: https://github.com/vimrull/vimrull/blob/b2dcb47ec109c021e67d9088c1e7eca69335f521/tests/vm/testscript.cpp#L35

When that did not work, I took the raw transaction from https://en.bitcoin.it/wiki/OP_CHECKSIG and used it with the public key and signature from the block, and it validates successfully.
 
Transaction that can be validated (TXV):

Code:

0100000001c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd37040000000043410411db9
3e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b
64f9d4c03f999b8643f656b412a3acffffffff0200ca9a3b00000000434104ae1a62fe09c5f51b13905f07f06b99a2f7159
b2225f374cd378d71302fa28414e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84cac002
86bee0000000043410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb
84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac0000000001000000


The double SHA-256 of the above data is 7a05c6145f10101e9d6325494245adf1297d80f8f38d4d576d57cdba220bcb19, which I cannot find on the blockchain.info site. But this hash, public key and signature combination is valid according to openssl.

Here is the transaction from the actual block (TXB):

Code:

0100000001c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704000000004847304402204
e45e16932b8af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07de4860a4acdd1290
9d831cc56cbbac4622082221a8768d1d0901ffffffff0200ca9a3b00000000434104ae1a62fe09c5f51b13905f07f06b99a2
f7159b2225f374cd378d71302fa28414e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84cac
00286bee0000000043410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddf
b84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac00000000


The byte-reversed double hash of this transaction is f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16, which can be found on the blockchain.info site, but if I take the signature and public key found in block 170, that hash cannot be validated.
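For reference, the reversal mentioned here is byte by byte, not character by character. A small sketch of it on a hex string (reverse_hex_bytes is just an illustrative name):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reverse a hex string byte by byte, i.e. two hex characters at a time.
// This converts between the raw digest order and the order shown by block explorers.
std::string reverse_hex_bytes(const std::string& hex) {
    assert(hex.size() % 2 == 0);
    std::string out;
    out.reserve(hex.size());
    for (std::size_t i = hex.size(); i >= 2; i -= 2) {
        out.append(hex, i - 2, 2); // copy the byte that ends at position i
    }
    return out;
}
```

Applying it twice gives back the original string, so the same helper works in both directions.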

I am guessing that the serialization of the transaction during validation is a little different from what is in the raw transaction (or blockchain.info has the wrong block data, which is unlikely). In my code I serialize the transaction so that it matches the binary format of the block. But the raw transaction value that works cannot be found in the actual block.

So, I am wondering how to get or generate the data TXV when I have the data TXB from the block.

Explaining this seems to be hard, but the test case may make it a bit clearer. At the HEAD of the repository the test case is passing because I have just hardcoded the raw tx in the code.

I would appreciate it if you took the time to point out any issue you see with my test case approach. As a newbie I am probably missing something very obvious.

And please excuse my code. I am trying to learn modern C++ as well while trying this project.
5  Bitcoin / Development & Technical Discussion / Re: Bitcoin block validator from scratch - bitcoin learning plan on: June 14, 2021, 05:50:33 PM
First of all you can't verify blocks like this (at random position in chain), it is a blockchain and has to be verified like a chain meaning you verify block 0 then 1 then 2.... and you can't verify block n without having already verified the previous n blocks.
....
As for SegWit you should know a couple of things:

....
P.S. Also try to think of transaction, scripts, hashes,... as a stream of byte instead of a JSON, hex, etc. that would simplify a lot of things for you such as confusing the zeros in a block hash.

Yes, of course - it's a linked list that can be traversed from the tail and validated from the head! I am thinking about how to keep an updated list of transactions in memory to reduce the number of passes through the entire chain. Creating a map of the entire chain first, by reading only the block headers on a first pass, and then starting from the beginning seems good enough, and this is what I plan to do (I'll need 300GB+ of disk space, but that's OK for now). I think I'll just make it work first and then optimize performance. And thanks for the SegWit steps.
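A rough sketch of that ordering pass, assuming the first pass has already collected each block's hash and previous-block hash. HeaderInfo and order_from_genesis are hypothetical names, and the sketch assumes a single chain with no forks or orphan blocks among the headers.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical header summary collected during the first pass over the block files.
struct HeaderInfo {
    std::string hash;       // this block's hash
    std::string prev_hash;  // hash of the parent block
};

// Order blocks from genesis forward by following prev_hash links.
// Assumption: the headers form a single chain without forks.
std::vector<std::string> order_from_genesis(const std::vector<HeaderInfo>& headers,
                                            const std::string& genesis_hash) {
    assert(!genesis_hash.empty());
    // Map each parent hash to the child that builds on it.
    std::unordered_map<std::string, std::string> child_of;
    for (const HeaderInfo& h : headers) {
        child_of[h.prev_hash] = h.hash;
    }
    std::vector<std::string> ordered{genesis_hash};
    auto it = child_of.find(genesis_hash);
    // The size check stops the walk if malformed data ever forms a cycle.
    while (it != child_of.end() && ordered.size() <= headers.size()) {
        ordered.push_back(it->second);
        it = child_of.find(it->second);
    }
    return ordered;
}
```

With the order known, the second pass can validate transactions block by block from the head without searching for the next block each time.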

I'm not expert, but i know there's bug on multi-signature verification (before SegWit). It's solved by adding OP_0, so you might want to take note when you implement script validation for multi-signature.
See https://github.com/bitcoin/bips/blob/master/bip-0147.mediawiki.

That's very good information. I have no other option but to take the existing chain as the source of truth and organize the validator so that it does not claim anything in it to be invalid, while still making sure no rule is skipped, so it will be able to catch real bad blocks.
6  Bitcoin / Development & Technical Discussion / Re: Bitcoin block validator from scratch - bitcoin learning plan on: June 13, 2021, 11:37:14 PM
Keep in mind that re-implementations of the bitcoin protocol or block validator, other than Bitcoin Core, are not safe to use in production because they might have overlooked already resolved security vulnerabilities, and should only be used for learning purposes. This also applies to projects created from scratch.

Absolutely!

Segwit data is stored in the witness_len and witness_data fields of the bitcoin transaction and appears after the txout field. witness_data has a bunch of fields which you must validate according to https://bitcoincore.org/en/segwit_wallet_dev/ section "Transaction Serialization".

Also a segwit-ready block verifier must be able to recognize blocks with two transaction ID hashing, the first kinds is the legacy txid without the witness data, the second is the hash with the witness data and this kind of txID must not be recognized before the first block that signalled for segwit support. Also the older transaction ID hashing must not be recognized after Segwit's activation timeout due to rules in the BIPs responsible for these kind of activations.

This is one of the pitfalls of making your own validator, you have to take into account all the new transaction and block versions and systematically recognize or unrecognize them at specific block heights.

Thanks. I think I have more or less got the structure by using a bit of a hack. I look at the byte where the input count would normally be: if it is zero, it is a segwit transaction, otherwise it is in the old format. It's ugly, but it is encapsulated in a function that returns a std::pair which I tie to is_segwit and input_count.
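A slightly stricter version of that check, as a sketch: in the segwit serialization the 4-byte version is followed by a marker byte 0x00 and a flag byte 0x01, and a legacy transaction can never have a zero input count at that position. detect_segwit is a hypothetical name, and unlike my function it returns the offset of the input count rather than the count itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Inspect the bytes that follow the 4-byte version field of a serialized transaction.
// Returns {is_segwit, offset of the input-count field from the start of the transaction}.
std::pair<bool, std::size_t> detect_segwit(const std::vector<std::uint8_t>& tx) {
    assert(tx.size() >= 6);
    // Segwit serialization: marker 0x00 followed by flag 0x01.
    const bool is_segwit = tx[4] == 0x00 && tx[5] == 0x01;
    return {is_segwit, is_segwit ? std::size_t{6} : std::size_t{4}};
}
```

Checking the flag byte as well as the marker guards against treating a malformed legacy transaction with zero inputs as segwit.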

But the data inside the witness blob is still opaque to me. I have dumped it into a vector<char> field, put it back as is while serializing, and called it done. I'll have to look into the documentation when validating a transaction input that has witness data / txid. From a first look at the document you shared, it seems like the witness blob for each input is composed of a compactSize integer followed by that many transaction ids. Now I'll have to understand how to validate a transaction that contains a witness-txid.
7  Bitcoin / Development & Technical Discussion / Bitcoin block validator from scratch - bitcoin learning plan on: June 13, 2021, 06:29:56 PM
I am trying to learn bitcoin by creating a small library to validate the existing blocks from scratch using C++. I have made some progress and am working to understand the rest of the system. This post is not a question, but the forum seems to have extremely experienced people on the topic, so I am sharing here to get some advice and corrections along the way as I move forward. This is my first post here; if it breaks any forum rule, please let me know so I can correct it.

So far, I am able to load and parse the 2048 blocks that I have downloaded. They consist of the first 1024 blocks (ending at block 00000000edfa5bfffd21cc8ce76e46b79dc00196e61cdc62fd595316136f8a83) and another 1024 blocks from last week (ending at block 0000000000000000000d06cb8554f862f69825a7994dab6161ec0970e35f463e). Given the above two block ids, I can traverse backwards through the 2048 blocks, hitting the genesis block in the first iteration and the block 1024 blocks older in the second iteration. I have verified that each number from the second block is being parsed correctly (compared against the JSON data from blockchain.info for the same block).

The MerkleRoot calculation was a bit tricky (I completely missed the double hash, was doing a single hash and scratched my head for a few hours), but it seems to be working now. And by reversing the next_block string I can find the next block id, load it and repeat. This was easier: I just looked at the value dumped in hex and realized the zeroes are at the end :).
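For anyone following along, the pairing logic of the Merkle tree can be sketched like this. The hash_pair argument is a stand-in for double SHA-256 over the concatenation of the two child hashes (it is passed in so the sketch does not depend on a crypto library), and merkle_root is an illustrative name rather than my library's function.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Hash = std::string; // raw hash bytes held in a string, for brevity

// Compute the Merkle root of `level` (the transaction hashes, in block order).
// `hash_pair(left, right)` is assumed to return the double SHA-256 of left || right.
Hash merkle_root(std::vector<Hash> level,
                 const std::function<Hash(const Hash&, const Hash&)>& hash_pair) {
    assert(!level.empty());
    while (level.size() > 1) {
        // An odd level is padded by duplicating its last hash.
        if (level.size() % 2 == 1) level.push_back(level.back());
        std::vector<Hash> next;
        for (std::size_t i = 0; i < level.size(); i += 2) {
            next.push_back(hash_pair(level[i], level[i + 1]));
        }
        level = std::move(next);
    }
    return level.front();
}
```

A block with a single transaction has that transaction's hash as its Merkle root, which is why the loop is skipped when only one hash is present.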

SegWit was another trouble point. I had a hard time finding a beginner-level document that explains how it is stored. At the moment, here is my logic (a simplified version, as I have merged two functions here):

Code:
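// Reads the witness stack of one input: a compactSize item count,
// then each item as a compactSize length followed by that many bytes.
// read_var_int_hex and read_hex are my own stream helpers.
std::vector<std::vector<char>> witness_list;
auto witness_count = read_var_int_hex(block_stream);
for (gsl::index i = 0; i < witness_count; i++)
{
    auto witness_len = read_var_int_hex(block_stream);
    std::vector<char> witness(witness_len); // sized before reading into it
    if (witness_len > 0)
    {
        read_hex(block_stream, witness_len, witness.data());
    }
    witness_list.push_back(std::move(witness)); // empty items are kept too
}

return witness_list;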

This seems to be working for 1024 blocks from last week. Please let me know if it looks correct or not.
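The variable-length integer that both the count and the lengths rely on can be sketched as follows. This version reads from a byte vector rather than from a hex stream like my read_var_int_hex helper, and read_compact_size is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a Bitcoin compactSize integer from `data` starting at `pos` and advance `pos`.
// Values below 0xfd are stored in one byte; 0xfd, 0xfe and 0xff announce a
// little-endian value of 2, 4 or 8 bytes respectively.
std::uint64_t read_compact_size(const std::vector<std::uint8_t>& data, std::size_t& pos) {
    assert(pos < data.size());
    const std::uint8_t first = data[pos++];
    if (first < 0xfd) return first;
    std::size_t width = 8;
    if (first == 0xfd) width = 2;
    else if (first == 0xfe) width = 4;
    assert(pos + width <= data.size());
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint64_t>(data[pos + i]) << (8 * i);
    }
    pos += width;
    return value;
}
```

A full validator would also reject non-minimal encodings and oversized counts; this sketch only decodes.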

With this, I think syntactical validation is now complete. Given a block serialized in a hex file, I can tell whether a block header, a transaction or an entire block has the expected values in the right places.

Next I want to move to the next phase and validate the logical rules. I have found some rules here: https://en.bitcoin.it/wiki/Protocol_rules

I plan to start by validating the block difficulty, then signature validation and then rest of the rules (script validation left for last step as I will need a small VM for that).
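As a starting point for the difficulty step, here is a sketch of expanding the compact "bits" field of a block header into a 256-bit target and comparing a block hash against it. U256, target_from_bits and meets_target are hypothetical names, the hash is assumed to be in the byte order block explorers display (most significant byte first), and the sign bit and exponents below 3 are not handled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// 256-bit number stored big-endian: index 0 is the most significant byte.
using U256 = std::array<std::uint8_t, 32>;

// Expand compact bits: target = mantissa * 256^(exponent - 3).
// Assumption: 3 <= exponent <= 32 and the sign bit (0x00800000) is not set.
U256 target_from_bits(std::uint32_t bits) {
    const unsigned exponent = bits >> 24;
    const std::uint32_t mantissa = bits & 0x007fffff;
    assert(exponent >= 3 && exponent <= 32);
    U256 target{};
    for (unsigned i = 0; i < 3; ++i) {
        const unsigned byte_from_lsb = exponent - 3 + i; // 0 = least significant byte
        target[31 - byte_from_lsb] = static_cast<std::uint8_t>(mantissa >> (8 * i));
    }
    return target;
}

// A block hash satisfies the proof-of-work rule when it is <= the target.
// std::array compares lexicographically, which matches numeric order for big-endian bytes.
bool meets_target(const U256& hash_be, std::uint32_t bits) {
    return hash_be <= target_from_bits(bits);
}
```

For the early blocks the header carries bits = 0x1d00ffff, which expands to a target with four leading zero bytes followed by ff ff. Checking that the bits value itself is the one the retargeting rule demands is a separate step.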

I am not sure how up to date those rules are. I can always read the source code of Bitcoin Core, but I want to do it my way first instead of looking into it, as the code seems bigger than my attention span. One good thing about this process is that I will probably never forget the structure now, as I struggled through each data structure. And with the JSON file from blockchain.info, it is relatively straightforward to catch errors.

Once most of the logic is validated, I plan to create a small VM to execute the script code. It's a stack-based VM with limited types of operations and no jump instructions, so I am hoping it won't be that difficult.
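A first cut of such a VM could look like the sketch below. It only knows direct data pushes, OP_DUP, OP_EQUAL and OP_CHECKSIG, and the actual signature check is delegated to a callback (SigChecker is a hypothetical type) so that the sketch stays independent of any crypto library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
// Hypothetical callback: returns true when `sig` is a valid signature for `pubkey`.
using SigChecker = std::function<bool(const Bytes& sig, const Bytes& pubkey)>;

enum : std::uint8_t { OP_DUP = 0x76, OP_EQUAL = 0x87, OP_CHECKSIG = 0xac };

// Execute `script` against `stack`. Returns false on malformed scripts,
// stack underflow or any opcode this sketch does not support.
bool run_script(const Bytes& script, std::vector<Bytes>& stack, const SigChecker& check_sig) {
    assert(static_cast<bool>(check_sig));
    std::size_t pc = 0;
    while (pc < script.size()) {
        const std::uint8_t op = script[pc++];
        if (op >= 0x01 && op <= 0x4b) { // direct push of the next `op` bytes
            if (pc + op > script.size()) return false;
            const auto first = script.begin() + static_cast<std::ptrdiff_t>(pc);
            stack.emplace_back(first, first + op);
            pc += op;
        } else if (op == OP_DUP) {
            if (stack.empty()) return false;
            const Bytes copy = stack.back();
            stack.push_back(copy);
        } else if (op == OP_EQUAL || op == OP_CHECKSIG) {
            if (stack.size() < 2) return false;
            const Bytes top = stack.back(); stack.pop_back();
            const Bytes second = stack.back(); stack.pop_back();
            // For OP_CHECKSIG the public key is on top and the signature below it.
            const bool ok = (op == OP_EQUAL) ? (top == second) : check_sig(second, top);
            stack.push_back(ok ? Bytes{1} : Bytes{});
        } else {
            return false;
        }
    }
    return true;
}
```

For a pay-to-public-key output, the input's signature script is run first and the output's pubkey script is then run on the same stack; the spend is accepted when the script completes and the top stack item is non-empty and non-zero.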

I do not plan to implement the networking protocol of bitcoin. I am just assuming the blocks are ready to be parsed and validated starting from the genesis block. On that note, I am thinking about how to order the blocks efficiently without a double pass, i.e. without traversing them all to find the next blocks and then coming back to validate the transactions.