Bitcoin Forum
April 20, 2024, 05:58:15 AM *
News: Latest Bitcoin Core release: 26.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 [3] 4 5 6 7 8 9 10 11 »  All
  Print  
Author Topic: cbitcoin - Bitcoin implementation in C. Currently in development.  (Read 20250 times)
Haplo
Full Member
***
Offline Offline

Activity: 168
Merit: 100



View Profile
May 17, 2012, 08:33:41 PM
 #41

Eh, I can understand not wanting to include any extra libraries, but I'm not so sure about GC having any effect on portability (except maybe on card-readers, which is hard to predict).

I see what you're trying to do with removing some unnecessary refcounts, which has been done (probably more effectively) in automatic reference counting systems, but I wouldn't recommend trying to do it manually. Basically, the case you listed (passing control of an object completely to another object with zero effect on count) is rarer than the more common case where an object is passed temporarily to an object which quickly uses and releases it with no effect on refcount (ie the original owner maintains control). If you want to cut out code like that, I'd suggest doing it on an expendable copy of your code later, and just counting everything for now. You're more likely to break stuff otherwise, which defeats the point of manual refcounting.


Btw, refcounting isn't only inefficient for many dead objects because of extra unnecessary counts; tracing is just that much more effective/efficient on dead objects. Age-oriented GC exploits that by having the young generation collected by a tracer and the old generation automatically refcounted. Personally I wouldn't go for all that complexity for a bitcoin library. It's not like you're decoding video for playback or anything, so a stop-the-world, copying tracing collector would be pretty efficient and work just fine, if you decide to go that direction.

I'm So Meta, Even This Acronym
1713592695
Hero Member
*
Offline Offline

Posts: 1713592695

View Profile Personal Message (Offline)

Ignore
1713592695
Reply with quote  #2

1713592695
Report to moderator
1713592695
Hero Member
*
Offline Offline

Posts: 1713592695

View Profile Personal Message (Offline)

Ignore
1713592695
Reply with quote  #2

1713592695
Report to moderator
1713592695
Hero Member
*
Offline Offline

Posts: 1713592695

View Profile Personal Message (Offline)

Ignore
1713592695
Reply with quote  #2

1713592695
Report to moderator
If you see garbage posts (off-topic, trolling, spam, no point, etc.), use the "report to moderator" links. All reports are investigated, though you will rarely be contacted about your reports.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1713592695
Hero Member
*
Offline Offline

Posts: 1713592695

View Profile Personal Message (Offline)

Ignore
1713592695
Reply with quote  #2

1713592695
Report to moderator
1713592695
Hero Member
*
Offline Offline

Posts: 1713592695

View Profile Personal Message (Offline)

Ignore
1713592695
Reply with quote  #2

1713592695
Report to moderator
1713592695
Hero Member
*
Offline Offline

Posts: 1713592695

View Profile Personal Message (Offline)

Ignore
1713592695
Reply with quote  #2

1713592695
Report to moderator
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 17, 2012, 11:04:10 PM
 #42

Tracing GC would not be good for portability with embedded devices right?
I really can't see how reference counting is a major performance issue at all since the retain/release calls are in-between a lot of other code which is much more intensive.

Haplo
Full Member
***
Offline Offline

Activity: 168
Merit: 100



View Profile
May 18, 2012, 12:08:13 AM
 #43

Tracing GC would not be good for portability with embedded devices right?

It depends on if your embedded application has actual real-time requirements from bitcoin. I don't see bitcoin having any such requirements since most of what it does is networking and/or churning away at validating the blockchain, updating current balances and so on. Very little of it is actual user-interaction, which might be taken care of by a separate gui layer with different memory management.

If you actually need real-time performance for any reason, you'd have to use either something which has no pauses, ie refcounting with a freelist/standard C alloc/dealloc, or a concurrent GC. Concurrent GCs perform worse overall than stop-the-world, but they have minimal pauses so that things like real-time video chat or a phone's normal calling routines aren't choppy.

I know that they don't use garbage collection in the iOS obj-c framework, but I don't know what other phone developers do. I know that it's possible to compile GC for an embedded system, but there may be other reasons against it's use. Theoretically it shouldn't break things though, since the OS kernel being preemptible and using a realtime scheduler are the main mechanisms for enabling real-time performance. I would ask someone who actually has some experience with programming for phones. They would know better than I do.

I really can't see how reference counting is a major performance issue at all since the retain/release calls are in-between a lot of other code which is much more intensive.

Memory management is one of the biggest overheads incurred by any and every program. It also performs no useful work in regards to a program's 'business' code. As a result, every single instruction of MM overhead is significant WRT overall performance. Just to give you the gist, experiments with region-based memory management and compile time GC in mercury showed speedups of 100% or more for some applications, and no less than 33% speedup for any, while using 66% less memory (or less) than with standard GC. Of course, that was a declarative language with strong-static typing, modes, determinism, and scope declarations, but I think it should give you a good idea of what to expect with changes to memory management in general.

I'm So Meta, Even This Acronym
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 18, 2012, 01:24:21 AM
 #44

Well what garbage collecting library would you recommend for C (This? http://www.hpl.hp.com/personal/Hans_Boehm/gc/)? Also does the mercury programming language use memory like C does? If it used the stack where memory management doesn't matter, then it could be an OK comparison. Remember only dynamically allocated memory needs to be managed.
Haplo
Full Member
***
Offline Offline

Activity: 168
Merit: 100



View Profile
May 18, 2012, 01:43:29 AM
Last edit: May 18, 2012, 02:10:40 AM by Haplo
 #45

Well what garbage collecting library would you recommend for C (This? http://www.hpl.hp.com/personal/Hans_Boehm/gc/)? Also does the mercury programming language use memory like C does? If it used the stack where memory management doesn't matter, then it could be an OK comparison. Remember only dynamically allocated memory needs to be managed.

Declarative languages are strange compared to imperative languages like C. By default, prolog (and its derivatives like mercury) use what they call a WAM, which basically means everything is allocated to the stack. However, mercury also has the option to use a Boehm collector, and does allocate things to the heap. I don't know everything about how that works in mercury, and unfortunately as awesome a language as it is, it isn't well documented (and I haven't gotten around to digging through the installed examples yet).

Also, declarative languages using a WAM can generally end up with poor memory usage due to the FILO nature of stacks, and thus started using garbage collectors to collect memory on the stack o.0.

I'm not sure if there are collectors available for C besides the boehm. I did, however, find a nifty article on using the boehm library and getting good mileage from it: from linuxjournal.

EDIT: And they get better performance out of the Boehm than they do from malloc/dealloc.. which really surprises me. The only consideration is that you'll want to make sure to disable any compiler optimizations that mess with pointers in your build. The Boehm library is also pretty configurable, so you can make it do whatever you need it to do (ie for embedded systems).

I'm So Meta, Even This Acronym
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 18, 2012, 04:14:27 AM
 #46

Thanks for the feedback, it's made me think a bit. I'm thinking garbage collection may not even be required such that I can make the library without reference counting or tracing GC. I could make one version with GC and one not where objects just have to be freed at the right time.

I'll think more about this later... Should be asleep but I seem to have insomnia. Sad
jgarzik
Legendary
*
qt
Offline Offline

Activity: 1596
Merit: 1091


View Profile
May 18, 2012, 05:09:41 AM
 #47

Some of this is putting the cart before the horse Smiley

You cannot know what is the best allocator until you have an idea of the lifetime and usage pattern of the allocated objects...


Jeff Garzik, Bloq CEO, former bitcoin core dev team; opinions are my own.
Visit bloq.com / metronome.io
Donations / tip jar: 1BrufViLKnSWtuWGkryPsKsxonV2NQ7Tcj
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 18, 2012, 05:44:31 AM
 #48

Well this is the issue, what can be figured out now and what has to be fixed later. ;-) If referencing counting can be rejected early on then it makes coding easier; no need to completely re-implement memory management at a later stage.

I'd prefer to try and get it right to begin with but of-course if I'm wrong to begin with then it will just have to be changed.
Haplo
Full Member
***
Offline Offline

Activity: 168
Merit: 100



View Profile
May 19, 2012, 06:41:09 AM
 #49

Well this is the issue, what can be figured out now and what has to be fixed later. ;-) If referencing counting can be rejected early on then it makes coding easier; no need to completely re-implement memory management at a later stage.

I'd prefer to try and get it right to begin with but of-course if I'm wrong to begin with then it will just have to be changed.

Well, reference counting does have one advantage. After reading through that link I sent you I realized that for C you still have to set your pointers to null in order to tell the GC that your pointer is dead. In terms of assuring correctness you'd still need reference counting to avoid dangling pointers x.x.

Weak types = no automatic GC. I'd suggest another language but most languages suck Tongue.

I'm So Meta, Even This Acronym
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 19, 2012, 07:51:36 PM
 #50

No, you wont need reference counting to get rif of dangling pointers, you simply replace calls to "release" with a NULL assignment.

Also I guess the CG will also collect garbage when pointers go out of scope. Not sure if that is true but it makes sense so.

Maybe tomorrow or the next day I might do a test with ref counting vs tracing CG vs manual malloc/free placement. I might do a test where a loop creates CBAddress objects with random data until it finds an address beginning with particular characters. That would test a mixture of object creation/destruction and algorithm execution which may be a fair test even though the test would never be a proper representation of the final library.

But I'm not doing any more today.
Haplo
Full Member
***
Offline Offline

Activity: 168
Merit: 100



View Profile
May 20, 2012, 12:06:31 AM
 #51

No, you wont need reference counting to get rif of dangling pointers, you simply replace calls to "release" with a NULL assignment.

Also I guess the CG will also collect garbage when pointers go out of scope. Not sure if that is true but it makes sense so.

Maybe tomorrow or the next day I might do a test with ref counting vs tracing CG vs manual malloc/free placement. I might do a test where a loop creates CBAddress objects with random data until it finds an address beginning with particular characters. That would test a mixture of object creation/destruction and algorithm execution which may be a fair test even though the test would never be a proper representation of the final library.

But I'm not doing any more today.

Ah I think I get it now. As long as you always copy pointers when passing values into functions, then you get the same effect as refcounting. I would definitely like to see a test showing that that has lower overhead than dealloc/refcounting o.0. It looks to me like you'd be doing the same work as with refcounting, except without any explicit calls to dealloc.

I don't know about scope, but letting pointers fall out of scope is bad form anyway Tongue.

I'm So Meta, Even This Acronym
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 20, 2012, 11:36:32 AM
 #52

A tracing GC looks through the stack finding pointers to memory and if that memory is an object which the GC is responsible for it recursively does the same think until it finds no more reachable objects. The rest of the objects are collected. So all you need to ensure is that when you are done with a reference you point the reference somewhere else. This can indeed be done by pointing to NULL but is not always necessary as you may reuse pointers for deprecate objects where you reassign pointers to new objects. And as I've said a reference can go out of scope. No need to assign NULL to a pointer that becomes unreachable afterwards.

 
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 20, 2012, 10:37:34 PM
Last edit: May 21, 2012, 12:09:20 AM by MatthewLM
 #53

Perhaps using the garbage collector would be nice since I'm now struggling with a memory leak made by CBCreateAddressVT. The virtual function table should get freed from everything I see. :-( Once again I've forgotten to pass a pointer by reference (ie. pointer to a pointer). :-( Easy enough to fix.

No idea how to install libgc on OSX. Sad
Haplo
Full Member
***
Offline Offline

Activity: 168
Merit: 100



View Profile
May 21, 2012, 12:53:35 AM
 #54

As I said, C is not the best language for memory management x.x.

I'm So Meta, Even This Acronym
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 21, 2012, 02:29:41 AM
Last edit: May 22, 2012, 11:40:10 PM by MatthewLM
 #55

Well it doesn't matter that I cannot do any test. While it would be interesting Boehms GC may be insecure and reference counting is not too much of a headache using the right tools; it didn't take me long to fix the memory leaks. So now I'll move onto just testing the virtual functions vs ordinary functions before moving on to the script.

I was able to test removing the redundant virtual functions and it gives a speedup of about 3% which is not worth the change. Now that I did that I'll move onto the script.

Edit: Just have the arithmetic and cryptography codes to implement in the script interpreter. If anyone wants to help, the script interpreter certainly does need scrutinising.
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 23, 2012, 10:49:07 PM
 #56

The script interpreter is implemented except for the signature checking codes. If anyone wants to help, the interpreters should be scrutinised and in particular checked against the bitcoin protocol to ensure it operates correctly. This is it:

https://github.com/MatthewLM/cbitcoin/tree/master/cbitcoin/src/structures/CBScript

The interpreter is here:

https://github.com/MatthewLM/cbitcoin/blob/master/cbitcoin/src/structures/CBScript/CBScript.c#L105

All documentation is in the header file.

The tests are in this file:

https://github.com/MatthewLM/cbitcoin/blob/master/cbitcoin/test/testCBScript.c

Next I'll implement the transaction structures before moving on to signature checking. Then I can implement the transaction validation and move onto blocks.
Gavin Andresen
Legendary
*
qt
Offline Offline

Activity: 1652
Merit: 2216


Chief Scientist


View Profile WWW
May 24, 2012, 03:42:16 PM
 #57

Your unit tests look like they'll miss lots of edge cases.

I've been working on cross-implementation unit tests for Script, and am actually working on more tests today (and am working on a testnet reset that will embed the "should validate" tests into the testnet block chain).

JSON format tests are here:
  https://github.com/bitcoin/bitcoin/tree/master/src/test/data

(read by https://github.com/bitcoin/bitcoin/blob/master/src/test/script_tests.cpp )

How often do you get the chance to work on a potentially world-changing project?
realnowhereman
Hero Member
*****
Offline Offline

Activity: 504
Merit: 502



View Profile
May 24, 2012, 04:01:34 PM
 #58

Your unit tests look like they'll miss lots of edge cases.

I've been working on cross-implementation unit tests for Script, and am actually working on more tests today (and am working on a testnet reset that will embed the "should validate" tests into the testnet block chain).

JSON format tests are here:
  https://github.com/bitcoin/bitcoin/tree/master/src/test/data

(read by https://github.com/bitcoin/bitcoin/blob/master/src/test/script_tests.cpp )

I'm not familiar enough with Satoshi client internals to easily understand what script_tests is doing.  Could you confirm my understanding of the format of the two JSON files?

Obviously it's an array of arrays, but some of the inner arrays have two elements and some have three.  I guess the first two elements are the two component scripts?  Which one is which?  Is the third element simply an error message to be issued if that test fails?  What is considered a fail -- i.e. is the script meant to always reduce to one answer and that answer should be TRUE (or FALSE depending on which file is used)?

1AAZ4xBHbiCr96nsZJ8jtPkSzsg1CqhwDa
MatthewLM (OP)
Legendary
*
Offline Offline

Activity: 1190
Merit: 1004


View Profile
May 24, 2012, 05:17:42 PM
 #59

Thanks Gavin. I do need to implement more tests, I will do this at a later time, first I need to implement the transaction structures properly and OP_CHECKSIG. I'll have to use some of the tests you posted!
kokjo
Legendary
*
Offline Offline

Activity: 1050
Merit: 1000

You are WRONG!


View Profile
May 24, 2012, 07:45:06 PM
 #60

sup

"The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts." -Bertrand Russell
Pages: « 1 2 [3] 4 5 6 7 8 9 10 11 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!