Title: Reorganization of code around classes Post by: Steve on April 14, 2011, 06:36:09 AM I've published a work in progress to reorganize the sources here: https://github.com/gasteve/bitcoin
Here is the README file included in that commit: --------------- This commit is a reorganization of the bitcoin source code (the actual behavior of the code has not been changed). The main intent of this commit is to solicit input and review of the organization. Much work remains to finish the reorganization, get it easily building across all supported platforms and to document it. This readme file is a brief overview of the work thus far and what remains to be done. Classes are now organized into their own source code files. For each class there is a header file that declares the class, a header file that has any inline or template definitions, and a source file for the method implementations. For a class named "CFoo" (class names always begin with a letter "C" by convention), the corresponding source files would be named: CFoo.h - header with the class declaration CFoo-inl.h - header with inline and template methods (if any) CFoo.cpp - method implementations The primary header file will include the inline header file at the end (if there is one). The header files use the typical #ifndef/#define pattern to use the pre-processor to ensure any given class header is only processed once by the compiler. In the header files, where possible, forward declarations of referenced classes are used rather than including the other class' header file. A policy of one class per *.h, *-inl.h, *.cpp is followed with the exception of exception classes (currently there are two, bignum_error and key_error). The exception classes follow the main class declaration in the file of the class that uses them. In a couple of cases, this results in rather small files that one might be tempted to fold into another class' files. In the interest of consistency, even these small classes have been put into their own set of files (again, exception classes are the only exception to this rule). While the classes have been separated into their own files and should be pretty clean, the non object oriented source files and headers are still a bit of a mess and in need of cleanup. Header file include statements need to be pruned. Autotools have been introduced in the build process in order to simplify the configuration and make files and provide more automatic dependency management (i.e. include file dependencies are automatically tracked such that touching any given header file would only force a recompile of exactly those source files that include it, directly or indirectly). Autotools should also enable one set of configuration and make files to be used across platforms, but this is not yet the case. The use of autotools will also enable source code distributions to be built (make dist) and enable packaging for popular package management tools (rpm, ports, apt, etc). At this point, this code has only been compiled on Mac OSX. To successfully compile, you will need to follow the instructions for building from the main branch to compile all of the dependencies (i.e. BDB, wxWidgets, etc) and then follow the usual autotools approach to building. There is a script in this directory for rebuilding the configure script and make files (autogen.sh). After running that script, then run "./configure; make" ("make install" isn't supported yet). Both the daemon and full GUI have been successfully compiled. Remaining work (in rough priority order): 1. Sync up with the head of the master bitcoin source code branch (this code is based on 0.3.20.2) 2. Organize the non class based sources by grouping related functions together and relocated source files into subdirectories (likely subdirs would be peer, gui, cli, wallet, miner, and common) <--- I could use suggestions on which functions/classes belong in which of these logical packages 3. Clean up the non class based sources and headers and prune header file includes 4. Eliminate the use of -DGUI for building the GUI 5. Clean up the comments & copyright statements, etc 6. Make it compile cleanly across all platforms Title: Re: Reorganization of code around classes Post by: HostFat on April 14, 2011, 07:49:05 AM Nice work! :D
Title: Re: Reorganization of code around classes Post by: Steve on April 15, 2011, 04:56:06 AM I've merged this work with the latest commit on the master branch...more info is in the README:
https://github.com/gasteve/bitcoin Title: Re: Reorganization of code around classes Post by: xf2_org on April 15, 2011, 05:52:39 AM I don't want to speak for any of the other dev team members, but I tend to prioritize massive re-orgs like this below just about any other change. They create a lot of noise, making sifting through history more difficult (even though git helps with this). They break existing patches. And in the language-specific arena, the two-files-per-class (.h, .cpp) creates an absolute explosion of tiny files, which is frankly a pain in the ass. Title: Re: Reorganization of code around classes Post by: Pieter Wuille on April 15, 2011, 06:42:41 AM Nice work.
I'm in favor of a reorganization, especially adding some compilation units that do not include all .h files. Some .cpp files are very large, and include large amounts of code in header files that are included everywhere, slowing down compilation a lot. Furthermore, main.cpp and main.h contain the largest part of the internal logic, without clear separation, though they contain several classes and functions which surely don't all depend on one another. I'm not sure a separate file for each and every class is the way to go, though. Ideally we have some layers and clear rules about which can access which (for example, i'm not sure rpc.cpp should interact with db.h-defined classes directly). In that sense, your work here is a step in the right direction, by at least separating the classes from eachother. If we can agree on how to organize them in separate directories, and maybe split the remaining functions as well, we can be there. Then we can still decide whether we want all separate files, or simply one .cpp/.h per layer. Do you plan to keep your branch synced with git's main for a while? Title: Re: Reorganization of code around classes Post by: ribuck on April 15, 2011, 06:59:21 AM It's my experience that software re-organizations rarely deliver as much benefit as was anticipated. Sure, some things are cleaned up, but cruft inevitably increases elsewhere.
Having said that, I don't mind what refactoring is done provided every change is totally screened against the introduction of possible security holes. No-one should mess lightly with security-critical code. Title: Re: Reorganization of code around classes Post by: xf2_org on April 15, 2011, 07:41:04 AM There is general agreement that cleanup is warranted, but I would rather that future changes drive the cleanups. It is a simple fact that we will not know what the codebase should look like, after client mode and other planned enhancements are all complete, until we actually implement those things. Thus, we cannot know if we are creating more work for ourselves, with a "pie in the sky" grand reorg. Toss into the mix that everyone's idea of a perfect codebase is different. IMNSHO, the Linux kernel way is best: evolution, not revolution. Perform cleanups as other changes are performed, in a particular area of code. Real Life will inform our cleanup better than any Grand Plan. Title: Re: Reorganization of code around classes Post by: ribuck on April 15, 2011, 09:52:52 AM I totally agree with xf2_org.
Plus, this has the advantage that functionality continues to improve, instead of everyone devoting their energies to arguing which kind of refactoring is the best, and the project ending up in refactoring paralysis. Title: Re: Reorganization of code around classes Post by: Steve on April 15, 2011, 12:10:47 PM Nice work. I'm in favor of a reorganization, especially adding some compilation units that do not include all .h files. Some .cpp files are very large, and include large amounts of code in header files that are included everywhere, slowing down compilation a lot. Furthermore, main.cpp and main.h contain the largest part of the internal logic, without clear separation, though they contain several classes and functions which surely don't all depend on one another. I'm not sure a separate file for each and every class is the way to go, though. Ideally we have some layers and clear rules about which can access which (for example, i'm not sure rpc.cpp should interact with db.h-defined classes directly). In that sense, your work here is a step in the right direction, by at least separating the classes from eachother. If we can agree on how to organize them in separate directories, and maybe split the remaining functions as well, we can be there. Then we can still decide whether we want all separate files, or simply one .cpp/.h per layer. Do you plan to keep your branch synced with git's main for a while? Yes, I will keep it synced forever. I actually needed this cleanup in order to make the code more readily reusable and reorganized for other things I would like to do (i.e. build a native Mac client, add unit tests, be able to build separate libraries and/or executables for the peer and wallet, etc). In response some of the other comments, this is not actually a refactoring at all and it's not some Grand Plan...it's just a minor bit of re-organization (and adoption of autotools). I do understand the unfortunate side effects of making history harder to follow, but if you know where the re-org happen, you understand the nature of the re-org, then it's not that difficult to follow (btw, I was pretty amazed at how git-merge made it very easy to bring it up to date with the latest commit). Regarding a separate class per file, I was tempted to combine some of the smaller ones together, but I decided I would favor consistency over the desire to have fewer files (at least for now). Also, the files don't really bother me (and many dev tools make the number of files nearly irrelevant). I am interested in separating things into these packages: peer, wallet, miner, and common (for those things that all the others need) ...I will put the files into those subdirs within src. To do that, I need to decide on which classes belong in each of those packages and which of the loose functions belong in each of those packages. If anyone would like to offer a suggestion on that, I'm interested to hear it. I think I will do this first, then get autotools building on all platforms (well, the platforms I have access to anyway...OSX, Ubuntu and Windows XP). Title: Re: Reorganization of code around classes Post by: jaromil on April 17, 2011, 01:24:45 PM re all,
i was introduced to this project by genjix before hearing of Steve's effort i've also started hacking on this https://github.com/bitcoin/bitcoin/pull/162 the branch is in the shape i wanted to get it by now, just testing the build with WX on Linux while I'm writing, then will need to fill in the gaps testing build on OSX and Win, but should be fairly easy using the current autotools setup. i think organizing the code according to GNU standards will help a lot more coders to get involved and maintainers to coordinate contributions, also usually coders that respect such standards are very good at what they write, at documenting it and testing it. i also plan to contribute more autotools knowledge, debian packaging, python bindings, code separation in libraries (i'm using libtool since that facilitates the making of test units later) and in general its a pleasure to contribute to this project which i like and admire, being myself a C++ coder i find its code just a wee messy but all in all something good we can build more on. please consider here i'm not advocating the pull of my branch as it is. if you wish i can rebase it differently, you can cherry-pick commits or i can correct bugs reported. ultimately i'd love to merge my proposal with that of Steve and move on to realize some more ideas (mostly connected to cleanup in this early phase) ciao p.s. Not sure about you, but I'd really prefer a development mailinglist.... do not expect me to be very responsive on this web forum, sry. Title: Re: Reorganization of code around classes Post by: Gavin Andresen on April 20, 2011, 07:30:57 PM I don't want to speak for any of the other dev team members, but I tend to prioritize massive re-orgs like this below just about any other change. They create a lot of noise, making sifting through history more difficult (even though git helps with this). They break existing patches. And in the language-specific arena, the two-files-per-class (.h, .cpp) creates an absolute explosion of tiny files, which is frankly a pain in the ass. After putting together the 0.3.21 release candidate today, I'm thinking after 0.3.21 is out the door it might be a good time to do a major source tree re-org. I like the idea of following the GNU directory layout standard, and it would make my job easier if source was in src/, readmes/etc were in doc/, scripts to automate the Windows build were in build/, etc. PS: jaromil is the second person I've heard who says they'd prefer a development mailing list. I don't care one way or another, but it would be easy to create one using SourceForge's mailing list feature. What do others think? Title: Re: Reorganization of code around classes Post by: xf2_org on April 20, 2011, 08:39:58 PM PS: jaromil is the second person I've heard who says they'd prefer a development mailing list. I don't care one way or another, but it would be easy to create one using SourceForge's mailing list feature. What do others think? Coming from the Linux kernel, I certainly prefer mailing lists, particularly ones that avoid Reply-To munging (http://marc.merlins.org/netrants/reply-to-harmful.html) :) AFAIK, most major open source projects communicate via email, not forums? Title: Re: Reorganization of code around classes Post by: Steve on April 21, 2011, 05:19:55 AM After putting together the 0.3.21 release candidate today, I'm thinking after 0.3.21 is out the door it might be a good time to do a major source tree re-org. I like the idea of following the GNU directory layout standard, and it would make my job easier if source was in src/, readmes/etc were in doc/, scripts to automate the Windows build were in build/, etc.
PS: jaromil is the second person I've heard who says they'd prefer a development mailing list. I don't care one way or another, but it would be easy to create one using SourceForge's mailing list feature. What do others think? [/quote] I'd vote for a mailing list...in other open source projects I've worked with, it's worked well. As for a re-org, let me know if I can help. I think a simple approach of separating the classes out into *.h, *-inl.h, *.cpp files is a good start. Some people have objected on the grounds that it creates a lot of files (in reality, less than 150), but I think it's a clear and simple rule and allows people to easily reuse any of the classes in any combination for custom projects they might work on. If you don't separate the classes like that, then what do you do? Five classes per file? Ten? I think a clear and simple rule in this regard overrides the concern over a proliferation of files (and actually, I like being able to see all the classes when doing an "ls" in the directory). Title: Re: Reorganization of code around classes Post by: alkor on April 21, 2011, 06:18:01 PM Steve, I'm looking forward to your native Mac client, since right now an official mac client is missing for the latest release.
Title: Re: Reorganization of code around classes Post by: Steve on April 21, 2011, 08:32:58 PM Steve, I'm looking forward to your native Mac client, since right now an official mac client is missing for the latest release. Actually, I've got another project that bubbled up ahead of the native Mac client in priority (just something I think will have broader appeal and is more urgently needed). I'm not prepared to talk about it just yet other than to say I will be using this branch I've created for it. I still would like a native Mac client though. |