Bitcoin Forum
Topic: Thoughts on type safety and crypto RNGs
grau
December 24, 2014, 10:02:10 AM
Last edit: December 24, 2014, 10:13:59 AM by grau
 #21

Quote
It's no surprise that easier languages attract even less skilled programmers who make more mistakes, but it's foolish to think that giving skilled programmers a tool other than a footgun is going to result in more mistakes. I think the unfortunate thing - maybe the root cause of this problem in the industry - is you definitely do need to teach programmers C at some point in their education so they understand how computers actually work. For that matter we need to teach them assembler too. The problem is C is nice enough to actually use - even the nicest machine architectures aren't - and people trained that way tend to reach for that footgun over and over again in the rest of their careers when really the language should be put on a shelf and only brought out to solve highly specialized tasks - just like assembler.

"Easy" applies to e.g. Python, but is unlikely to be the motivation for those who turn to Haskell or Scala. Rather, it is skilled programmers who turn to functional languages, after they have shot themselves in the foot often enough to reconsider what they stand on.

Quote
Equally, how many computer science graduates finish their education with a good understanding of the fact that a programming language is fundamentally a user interface layer between them and machine code?

Unfortunately, many of those who do get that think they are better than the compiler and its runtime. Some might really be better, maybe even consistently, but staying ahead of compiler and runtime development is getting harder, and their advantage is getting less and less likely.
gmaxwell
December 24, 2014, 08:23:37 PM
 #22

Quote
I'm mainly concerned about whether or not using C(++) with manual memory management is acceptable practice. Screwing up manual memory management exposes you to the king of all implicit details: what garbage happens to be in memory at that very moment.

We mostly do not use manual memory management in Bitcoin Core. Virtually all use of delete is in explicit destructors; most things are just RAII. I looked a while back and I think I found only something like three or four uses of delete outside of destructors, and I assume those cases will all be changed the next time they're touched (examples include the wallet encryption, which hasn't been touched in years).

(I thought it was really weird that Mike brought up manual memory management; I see I made an error in not correcting him.)
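
The pattern described here, with delete confined to destructors and everything else relying on RAII, can be sketched roughly as follows. This is only an illustration: the class ScopedBuffer and the helper sum_filled are hypothetical names and are not taken from Bitcoin Core.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical buffer class (not from Bitcoin Core): the only delete lives
// in the destructor, so callers never free memory themselves.
class ScopedBuffer {
public:
    explicit ScopedBuffer(std::size_t size)
        : data_(new unsigned char[size]()), size_(size) {}

    // The single place where the memory is released.
    ~ScopedBuffer() { delete[] data_; }

    // Copying is disabled so two objects can never free the same block.
    ScopedBuffer(const ScopedBuffer&) = delete;
    ScopedBuffer& operator=(const ScopedBuffer&) = delete;

    unsigned char* data() { return data_; }
    std::size_t size() const { return size_; }

private:
    unsigned char* data_;
    std::size_t size_;
};

// Fills a temporary buffer with `value` and returns the sum of its bytes.
// The buffer is released on every exit path, including exceptions.
unsigned sum_filled(std::size_t size, unsigned char value) {
    ScopedBuffer buf(size);
    unsigned total = 0;
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf.data()[i] = value;
        total += buf.data()[i];
    }
    return total;  // ~ScopedBuffer runs here and frees the memory
}
```

Calling sum_filled(4, 2) allocates, uses and frees the buffer without the caller ever writing delete, which is the property the post is pointing at.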
Peter Todd
December 24, 2014, 09:02:05 PM
 #23

Quote from: gmaxwell
Quote
I'm mainly concerned about whether or not using C(++) with manual memory management is acceptable practice. Screwing up manual memory management exposes you to the king of all implicit details: what garbage happens to be in memory at that very moment.

We mostly do not use manual memory management in Bitcoin Core. Virtually all use of delete is in explicit destructors; most things are just RAII. I looked a while back and I think I found only something like three or four uses of delete outside of destructors, and I assume those cases will all be changed the next time they're touched (examples include the wallet encryption, which hasn't been touched in years).

(I thought it was really weird that Mike brought up manual memory management; I see I made an error in not correcting him.)


Huh? I quite clearly give Bitcoin Core as an example of C++ done right, precisely because it uses a safe subset of the language that is a higher-level language via the abstractions used. You brought up C, which just isn't a safe language to write code in.

I read Mike's post as pointing out that knowing how to use C++ correctly - what subset to use - is something that does take skill. It's notable that we've changed a few things in Bitcoin Core to, for instance, use pointers where we didn't before, gradually decreasing the safety of the system by using parts of the language beyond that safe subset.

rsvoter
December 24, 2014, 09:25:13 PM
 #24

Quote from: grau
Quote
It's no surprise that easier languages attract even less skilled programmers who make more mistakes, but it's foolish to think that giving skilled programmers a tool other than a footgun is going to result in more mistakes. I think the unfortunate thing - maybe the root cause of this problem in the industry - is you definitely do need to teach programmers C at some point in their education so they understand how computers actually work. For that matter we need to teach them assembler too. The problem is C is nice enough to actually use - even the nicest machine architectures aren't - and people trained that way tend to reach for that footgun over and over again in the rest of their careers when really the language should be put on a shelf and only brought out to solve highly specialized tasks - just like assembler.

"Easy" applies to e.g. Python, but is unlikely to be the motivation for those who turn to Haskell or Scala. Rather, it is skilled programmers who turn to functional languages, after they have shot themselves in the foot often enough to reconsider what they stand on.

Quote
Equally, how many computer science graduates finish their education with a good understanding of the fact that a programming language is fundamentally a user interface layer between them and machine code?

Unfortunately, many of those who do get that think they are better than the compiler and its runtime. Some might really be better, maybe even consistently, but staying ahead of compiler and runtime development is getting harder, and their advantage is getting less and less likely.

Couldn't have said this better myself.
grau
December 25, 2014, 09:45:58 AM
 #25

Quote from: Peter Todd
Huh? I quite clearly give Bitcoin Core as an example of C++ done right, precisely because it uses a safe subset of the language that is a higher-level language via the abstractions used.

Yes, that "safe" subset of C++ is emulating a simple and restricted reference counting runtime by hand. Certainly doable. Apple, for example, is successful in forcing reference counting on application-level programmers on iOS, although Objective-C gives nice support for that pattern.

Continuing that line of thought, you could define a "safe" subset of C using only the stack, or maybe even do functional programming with assembler macros. Runtimes and compilers do no magic, therefore talented programmers can emulate any of their features in any language. It requires skill, and it can be fun.

Bitcoin is not the first program that moves around billions of dollars in value; it is just a new one. Most programs I wrote moved more value than Bitcoin's market capitalization, therefore I know that once you deal with other people's money, it gets difficult to argue for not using the help that technology offers.
gmaxwell
December 25, 2014, 10:26:13 AM
Last edit: December 25, 2014, 10:37:23 AM by gmaxwell
 #26

Quote from: grau
Yes, that "safe" subset of C++ is emulating a simple and restricted reference counting runtime by hand. Certainly doable.

That isn't the case. Yes, reference counting is one tool in that box, but it has considerable costs. Most things are handled by RAII. And then there is unique_ptr... "By hand" is also perhaps misleading, in that, for better or worse, the developer doesn't see the machinery under the hood any more than they see bounds checking in Java.
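
To make the distinction concrete, here is a small sketch contrasting sole ownership with counted ownership. The Node type and both function names are made up for illustration; only std::unique_ptr and std::shared_ptr are real library facilities.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Node { int value; };

// Sole ownership: there is no counter to maintain. Moving transfers
// ownership and leaves the source pointer empty.
bool unique_transfer_empties_source() {
    std::unique_ptr<Node> a(new Node{42});
    std::unique_ptr<Node> b = std::move(a);
    return a == nullptr && b != nullptr && b->value == 42;
}

// Shared ownership: each copy updates a reference count at run time,
// which is the cost referred to above.
long shared_count_after_copy() {
    std::shared_ptr<Node> a = std::make_shared<Node>(Node{42});
    std::shared_ptr<Node> b = a;  // second owner of the same Node
    return a.use_count();
}
```

A unique_ptr cannot be copied at all, so the "count" is enforced at compile time by the type system rather than tracked while the program runs.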
grau
December 25, 2014, 11:04:51 AM
 #27

Quote from: gmaxwell
Quote from: grau
Yes, that "safe" subset of C++ is emulating a simple and restricted reference counting runtime by hand. Certainly doable.

That isn't the case. Yes, reference counting is one tool in that box, but it has considerable costs. Most things are handled by RAII. And then there is unique_ptr... "By hand" is also perhaps misleading, in that, for better or worse, the developer doesn't see the machinery under the hood any more than they see bounds checking in Java.

RAII and unique_ptr implement a reference counting store where the (implicit) use count can be 1 or 0, right?
"By hand" does not mean copy and paste. The solutions you use are likely the best attainable in the C++ environment.

My point is that better support for program correctness is available elsewhere and should be used where permissible. When it is permissible is open for discussion, but I do not buy that the answer would be never.

I even suspect that a language and runtime that more safely excludes side effects and implicit inputs, and that aids reasoning about the correctness of algorithms, is a better choice even for consensus definition.
Mike Hearn (OP)
December 27, 2014, 08:36:06 PM
 #28

Quote from: gmaxwell
(I thought it was really weird that Mike brought up manual memory management; I see I made an error in not correcting him.)

Sigh. You know I have the deepest respect for you Gregory, but this is not the first time I've had the feeling you're commenting on things I've written without having read them closely :(

I said:

Quote
The main concern with Core is not that the code is insecure today, but what happens in the years to come

I know how the code is currently written. I first read it a few months after it was released, remember ;) But being concerned about an extremely common class of errors is hardly weird. Multiple people on this thread have brought it up.

My experience of working on several large C++ server codebases at Google is that it's quite possible to write robust code ... for a while. When you have a single thread, everything is written by one guy and all data is request scoped, the tools C++ provides can work very well.

But eventually one of the following happens:

1) Someone introduces multi-threading for better scalability, resource management, use of a blocking library etc, and accidentally writes code that races
2) Someone refactors code written by someone else and uninitialised data creeps in
3) Someone starts using a third party library that isn't written in the same way and requires manual heap management (like OpenSSL)
4) Someone profiles and decides to reduce the amount of copying that is going on

More generally: things change, teams change and software gets more complicated. Because nothing in C++ forbids manual memory management and some things require it, eventually it ends up being used. And some time after that, someone makes a mistake.
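
One way to contain case 3, sketched under assumptions: the C-style functions below (legacy_ctx_new, legacy_ctx_free) are hypothetical stand-ins defined only for this example and are not OpenSSL calls. The idea is to hand the library's own free function to unique_ptr as a deleter, so the manual heap management stays behind one type alias.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Stand-in for a C-style library that hands out heap objects the caller
// must free. These names are hypothetical; they are not OpenSSL functions.
struct legacy_ctx { int rounds; };
int g_live_contexts = 0;  // demonstration-only leak counter

legacy_ctx* legacy_ctx_new(int rounds) {
    legacy_ctx* ctx = static_cast<legacy_ctx*>(std::malloc(sizeof(legacy_ctx)));
    if (ctx != nullptr) {
        ctx->rounds = rounds;
        ++g_live_contexts;
    }
    return ctx;
}

void legacy_ctx_free(legacy_ctx* ctx) {
    if (ctx != nullptr) {
        --g_live_contexts;
        std::free(ctx);
    }
}

// Deleter that routes destruction to the library's own free function.
struct LegacyCtxDeleter {
    void operator()(legacy_ctx* ctx) const { legacy_ctx_free(ctx); }
};
using LegacyCtxPtr = std::unique_ptr<legacy_ctx, LegacyCtxDeleter>;

// Uses a context and returns. The early returns show that cleanup does
// not depend on reaching any particular line.
int rounds_or_default(int rounds) {
    LegacyCtxPtr ctx(legacy_ctx_new(rounds));
    if (!ctx) return -1;             // allocation failed, nothing to free
    if (ctx->rounds <= 0) return 1;  // early exit, context is still freed
    return ctx->rounds;
}
```

This does not remove the risk described above; someone can still call the raw functions directly. It only narrows the places where a mistake can be made.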

We can't magically convert Bitcoin Core to a safer language with a stricter type system. We can anticipate that mistakes will happen, and try to put in place systems to automatically catch and handle them.

Quote from: petertodd
You might be interested to find out that Python is actually moving towards static types; the language recently added support for specifying function argument types in the syntax. How the types are actually checked is undefined in the language itself - you can use third-party modules to impose your desired rules.

This sounds somewhat like the Checker framework. It is a pluggable type system for Java. I'd like to see it adapted for Scala and Kotlin too. It has a number of very practical type systems that catch practical errors like mixing up seconds and milliseconds and other unit mismatches.
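
For comparison, the same kind of unit-mismatch check can be had in C++ from the standard std::chrono duration types. This is not the Checker framework, just a sketch of the same idea in a different language, and the function names are invented for the example.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical timeout helper. The parameter type carries the unit, so a
// bare integer such as wait_budget_ms(5) does not compile.
long long wait_budget_ms(std::chrono::milliseconds timeout) {
    return timeout.count();
}

// Seconds convert to milliseconds implicitly because no precision is lost.
long long five_seconds_in_ms() {
    return wait_budget_ms(std::chrono::seconds(5));
}

// The lossy direction has to be spelled out with duration_cast,
// which truncates toward zero.
long long truncated_seconds(std::chrono::milliseconds ms) {
    return std::chrono::duration_cast<std::chrono::seconds>(ms).count();
}
```

Passing seconds where milliseconds are expected is converted correctly instead of being silently misread, and passing a plain number is rejected by the compiler.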
moni3z
December 27, 2014, 08:58:13 PM
Last edit: December 28, 2014, 01:03:21 AM by moni3z
 #29

Every OS has a proper method for obtaining keystream (/dev/urandom): https://news.ycombinator.com/item?id=8049739. The problem arises if you or somebody else chroots the application and forgets to make the CSPRNG device available inside the chroot, so obtaining random bytes directly from the kernel is a good idea.

Another problem is people making browser client-side JS wallets without containing them inside a browser add-on: http://matasano.com/articles/javascript-cryptography/
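
A rough sketch of what "directly from the kernel" could look like, assuming Linux: getrandom(2) is a system call and needs no device node, so it keeps working inside a chroot. The function name fill_random is made up for this example, and the /dev/urandom fallback is only there for systems without the call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#if defined(__linux__)
#include <sys/random.h>  // getrandom(2), Linux-specific
#endif

// Fills buf with len bytes from the kernel CSPRNG. Returns false on failure
// so the caller can refuse to generate keys instead of using weak input.
bool fill_random(unsigned char* buf, std::size_t len) {
#if defined(__linux__)
    std::size_t done = 0;
    while (done < len) {
        // A system call needs no device file, so a chroot does not matter.
        ssize_t n = getrandom(buf + done, len - done, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            break;                         // e.g. call not available
        }
        done += static_cast<std::size_t>(n);
    }
    if (done == len) return true;
#endif
    // Fallback: read the device, which may be missing inside a chroot.
    std::FILE* f = std::fopen("/dev/urandom", "rb");
    if (f == nullptr) return false;
    std::size_t got = std::fread(buf, 1, len, f);
    std::fclose(f);
    return got == len;
}
```

Other operating systems expose their own interfaces for this; the sketch only covers the Linux call and the device-file path.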