Bitcoin Forum
Topic: Block #225430 chain fork dataset available
jgarzik (OP)
March 14, 2013, 09:24:32 PM
Last edit: March 16, 2013, 12:42:36 AM by jgarzik
 #1

For diagnostic purposes, here is a blockchain dataset built by a 0.7.2 bitcoind w/ db 4.8 + "-detachdb":

     http://gtf.org/garzik/bitcoin/chain-db48-h225429.tar.bz2

It contains blockchain + index, up to height 225429, making it easy to reproduce an injection of too-large blocks at the precise juncture where the recent chain fork occurred.

Byte size: 5776366736 (5.3G)
MD5: f26deaaf05197bcbc73d33fed2443db3
SHA1: 743d1eaac3b590e996a22e707288fd9a21aa4c63
SHA256: 4dfd766c7cdfa346ad10e648900476dfc590605f78a78dff0c2608131c0f6c46


Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
deepceleron
March 14, 2013, 10:09:39 PM
 #2

Better would be the forked blockchain up to block 225453 (or another importable set of the bad blocks 225430-225453), so that it includes the bad block and can be fed to different client versions to replicate the BDB freakout. We all have the blockchain up to 225429, but the bad chain went "poof" upon reorg.
Gavin Andresen
March 14, 2013, 11:20:15 PM
 #3

The first part of the chain that got orphaned, starting at block 225,430, is here:

  http://skypaint.com/bitcoin/fork08.dat

The first three blocks in the 0.7-compatible chain, starting at block 225,430, are here:

  http://skypaint.com/bitcoin/fork07.dat

How often do you get the chance to work on a potentially world-changing project?
jgarzik (OP)
March 15, 2013, 06:42:22 PM
 #4

Hold off on using this dataset.  Due to a local linking mistake, it was built with BDB 5.x.

Rebuilding the dataset with BDB 4.8 will complete in a few hours.


jgarzik (OP)
March 16, 2013, 12:43:02 AM
 #5

Quote from: jgarzik
Hold off on using this dataset.  Due to a local linking mistake, it was built with BDB 5.x.

Issue fixed.  Dataset updated.  OP updated with new hashes and byte size.


commonancestor
March 17, 2013, 10:14:38 AM
 #6

Devs,

1. Thanks for getting us over this glitch safely.

2. It is beyond belief that the validity of a block could be decided by implementation-specific matters such as BerkeleyDB record locking.
And, of course, there is no specification of what a valid block is. The code is the specification? So make no changes to the code and we won't have forks?
Please take the code at some point and write a specification of what a valid block is. Then change the code as you like and test whether it still conforms to the spec.
You may also find that the block validity rules are too weird and could be refactored.
As Mike Hearn says, money could be lost here.
Peter Todd
March 17, 2013, 08:15:09 PM
 #7

Quote from: commonancestor
2. It is beyond belief that the validity of a block could be decided by implementation-specific matters such as BerkeleyDB record locking.
And, of course, there is no specification of what a valid block is. The code is the specification? So make no changes to the code and we won't have forks?
Please take the code at some point and write a specification of what a valid block is. Then change the code as you like and test whether it still conforms to the spec.
You may also find that the block validity rules are too weird and could be refactored.
As Mike Hearn says, money could be lost here.


Specifications aren't magic; they're just words on paper. I can put anything into a specification, but it doesn't magically make code actually follow the spec. I can also take the specification and write tests, but again, the tests don't magically make the code follow the specification.

Before commenting further on the topic you need to read the Bitcoin source code yourself. If you can't read it, you have no business commenting on software development anyway. If you can, you'll find that while it isn't perfect and could use some refactoring, all in all understanding the intent of the different parts is fairly easy, and thus the code itself acts as a perfectly good specification.




commonancestor
March 17, 2013, 10:46:51 PM
 #8

Quote from: Peter Todd
Specifications aren't magic; they're just words on paper. I can put anything into a specification, but it doesn't magically make code actually follow the spec. I can also take the specification and write tests, but again, the tests don't magically make the code follow the specification.

Before commenting further on the topic you need to read the Bitcoin source code yourself. If you can't read it, you have no business commenting on software development anyway. If you can, you'll find that while it isn't perfect and could use some refactoring, all in all understanding the intent of the different parts is fairly easy, and thus the code itself acts as a perfectly good specification.

You are right, I should indeed read the code. A protocol defined in code rather than in a specification has pros (no maintenance effort and, in theory, no ambiguity) and cons (difficult to read and understand, difficult to write other implementations, including new versions of the same program). The blockchain fork happened because the devs forgot that Berkeley DB was part of the protocol. Even without reading the code, I find this bit messy.
Peter Todd
March 17, 2013, 11:50:56 PM
 #9

Quote from: commonancestor
You are right, I should indeed read the code. A protocol defined in code rather than in a specification has pros (no maintenance effort and, in theory, no ambiguity) and cons (difficult to read and understand, difficult to write other implementations, including new versions of the same program). The blockchain fork happened because the devs forgot that Berkeley DB was part of the protocol. Even without reading the code, I find this bit messy.

"no ambiguity" <- that's exactly what failed. In v0.7, db.h, there is the following line:

Code:
class CTxDB : public CDB

That means: create a CTxDB class that extends the CDB class. CDB in turn wraps an external library, Berkeley DB. What's underneath CDB? What version? What does it do? All this stuff is ambiguous. Yet simply "including every external library" in the specification doesn't work either; how far back do you go? While not an issue now, with really large blocks even subtle things like performance differences between hardware implementations can cause forks, even with identical software.
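
To make the hidden-dependency point concrete, here is a toy C++ sketch. Everything in it is hypothetical: ToyStore, ToyBlock, accept_block and the lock numbers are invented for illustration and are not Bitcoin or Berkeley DB APIs. It only shows how a storage limit that no written rule mentions can end up deciding whether a block is accepted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a storage backend with a lock budget.
// In this toy, the budget is a deployment detail, not a protocol rule.
struct ToyStore {
    std::size_t max_locks;

    // A commit succeeds only if the records it touches fit in the budget.
    bool commit(std::size_t records_touched) const {
        return records_touched <= max_locks;
    }
};

// Hypothetical block: only the number of records it touches matters here.
struct ToyBlock {
    std::size_t records_touched;
};

// The rule as actually implemented: the written-down checks AND whatever
// the storage layer happens to do.
bool accept_block(const ToyBlock& block, const ToyStore& store) {
    const bool protocol_ok = true;  // assume every documented rule passed
    return protocol_ok && store.commit(block.records_touched);
}

// Usage sketch: identical block, identical rules, different outcome.
inline void demo_divergence() {
    const ToyStore small_budget{10000};    // made-up numbers
    const ToyStore large_budget{1000000};
    const ToyBlock big_block{50000};

    assert(!accept_block(big_block, small_budget));  // this node rejects
    assert(accept_block(big_block, large_budget));   // this node accepts
}
```

Two nodes that agree on every documented rule still disagree on the block, which is the shape of the ambiguity described above.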


Believe me, the developers understand the importance of the problem very well. As an example, Pieter Wuille and others have been working to prevent OpenSSL differences from causing a fork, with IsCanonicalSignature() and similar checks. I don't happen to agree with Gavin on everything, maybe not even on most things, but I can agree that he has been taking his role in pushing testing and stability very seriously since he was hired by the Bitcoin Foundation, and for that matter, even further back than that.
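
For readers wondering what a check in the spirit of IsCanonicalSignature() might look like, below is a rough C++ sketch of a strict DER encoding test for an ECDSA signature followed by a one-byte hash type. The function name is_strict_der is hypothetical and this is not the actual IsCanonicalSignature() body. The idea is to accept exactly one byte encoding per signature, so that validity does not depend on how leniently a particular OpenSSL build parses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expected layout:
//   0x30 [total-len] 0x02 [R-len] [R] 0x02 [S-len] [S] [hashtype]
// Hypothetical helper, not taken from the Bitcoin source.
bool is_strict_der(const std::vector<unsigned char>& sig) {
    // Shortest possible encoding is 9 bytes, longest is 73.
    if (sig.size() < 9 || sig.size() > 73) return false;
    // Compound-structure tag, then a length covering everything except
    // the tag byte, the length byte itself and the trailing hash type.
    if (sig[0] != 0x30) return false;
    if (static_cast<std::size_t>(sig[1]) != sig.size() - 3) return false;

    const std::size_t lenR = sig[3];
    if (5 + lenR >= sig.size()) return false;  // S length byte must exist
    const std::size_t lenS = sig[5 + lenR];
    if (lenR + lenS + 7 != sig.size()) return false;

    // R: integer tag, non-empty, non-negative, no unnecessary zero padding.
    if (sig[2] != 0x02) return false;
    if (lenR == 0) return false;
    if (sig[4] & 0x80) return false;
    if (lenR > 1 && sig[4] == 0x00 && !(sig[5] & 0x80)) return false;

    // S: same rules as R.
    if (sig[lenR + 4] != 0x02) return false;
    if (lenS == 0) return false;
    if (sig[lenR + 6] & 0x80) return false;
    if (lenS > 1 && sig[lenR + 6] == 0x00 && !(sig[lenR + 7] & 0x80)) return false;

    return true;
}

// Usage sketch with tiny made-up signatures (R = 1, S = 1, hashtype = 1).
inline void demo_strict_der() {
    // Minimal encoding: accepted.
    assert(is_strict_der({0x30, 0x06, 0x02, 0x01, 0x01, 0x02, 0x01, 0x01, 0x01}));
    // Same R with a needless leading zero byte: rejected.
    assert(!is_strict_der({0x30, 0x07, 0x02, 0x02, 0x00, 0x01, 0x02, 0x01, 0x01, 0x01}));
    // R with the high bit set (would read as negative): rejected.
    assert(!is_strict_der({0x30, 0x06, 0x02, 0x01, 0x80, 0x02, 0x01, 0x01, 0x01}));
}
```

A lenient parser might accept all three byte strings above as signatures; a strict check leaves only the first, so every implementation reaches the same verdict.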

There aren't easy solutions to the specification problem, and I really think that writing yet another specification in addition to the imperfect one we already have is currently a waste of limited manpower. It might always be a waste of manpower - Bitcoin is in uncharted computer science territory with its extremely strict requirement for consensus.
