Thoughts on type safety and crypto RNGs

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Thoughts on type safety and crypto RNGs

December 24, 2014, 10:02:10 AM
Last edit: December 24, 2014, 10:13:59 AM by grau

#21

Quote from: Peter Todd on December 24, 2014, 07:37:19 AM

It's no surprise that easier languages attract even less skilled programmers who make more mistakes, but it's foolish to think that giving skilled programmers a tool other than a footgun is going to result in more mistakes. I think the unfortunate thing - maybe the root cause of this problem in the industry - is you definitely do need to teach programmers C at some point in their education so they understand how computers actually work. For that matter we need to teach them assembler too. The problem is C is nice enough to actually use - even the nicest machine architectures aren't - and people trained that way tend to reach for that footgun over and over again in the rest of their careers when really the language should be put on a shelf and only brought out to solve highly specialized tasks - just like assembler.

"Easy" applies to e.g. Python, but is unlikely to be the motivation for those who turn to Haskell or Scala. It is rather skilled programmers who turn to functional languages after they have shot themselves in the foot often enough to reconsider what they stand on.

Quote from: Peter Todd on December 24, 2014, 07:37:19 AM

Equally, how many computer science graduates finish their education with a good understanding of the fact that a programming language is fundamentally a user interface layer between them and machine code?

Unfortunately, many of those who get that think that they are better than the compiler and its runtime. Some might really be better, maybe even consistently, but staying ahead of compiler and runtime development is getting harder and their advantage is less and less likely.

gmaxwell

Moderator
Legendary

Offline

Activity: 4158
Merit: 8382

Re: Thoughts on type safety and crypto RNGs

December 24, 2014, 08:23:37 PM

#22

Quote from: Peter Todd on December 24, 2014, 07:37:19 AM

I'm mainly concerned about whether or not using C(++) with manual memory management is acceptable practice. Screwing up manual memory management exposes you to the king of all implicit details: what garbage happens to be in memory at that very moment.

We mostly do not use manual memory management in Bitcoin Core. Virtually all use of delete is in explicit destructors; most things are just RAII. I looked a while back and I think I found only something like three or four instances of delete used outside of destructors, and I assume those cases will all be changed the next time they're touched (examples include the wallet encryption, which hasn't been touched in years).

(I thought it was really weird that Mike brought up manual memory management; I see I made an error in not correcting him.)
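To make the RAII point concrete, here is a minimal sketch of the pattern described above: the only `delete` lives in a destructor, and callers never free anything themselves. `KeyBuffer`, `live_buffers` and `use_buffers` are hypothetical names invented for this illustration and are not taken from Bitcoin Core.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical counter, used only to observe object lifetimes in this sketch.
static int live_buffers = 0;

// Hypothetical resource type: the only delete is inside the destructor.
class KeyBuffer {
public:
    explicit KeyBuffer(std::size_t n)
        : data_(new unsigned char[n]()), size_(n) { ++live_buffers; }
    ~KeyBuffer() { delete[] data_; --live_buffers; }

    // Copying is disabled so two objects can never free the same memory.
    KeyBuffer(const KeyBuffer&) = delete;
    KeyBuffer& operator=(const KeyBuffer&) = delete;

    std::size_t size() const { return size_; }

private:
    unsigned char* data_;
    std::size_t size_;
};

// Returns how many buffers were alive while the scope was active.
int use_buffers() {
    KeyBuffer a(32);
    std::vector<unsigned char> b(64);  // standard containers are RAII types too
    if (b.size() > a.size()) {
        return live_buffers;  // early return: a's destructor still runs
    }
    return live_buffers;
}
```

Calling `use_buffers()` observes one live buffer inside the scope, and `live_buffers` is back to zero once the call returns, without any cleanup code at the call site.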

Peter Todd

Legendary

Offline

Activity: 1120
Merit: 1150

Re: Thoughts on type safety and crypto RNGs

December 24, 2014, 09:02:05 PM

#23

Quote from: gmaxwell on December 24, 2014, 08:23:37 PM

Quote from: Peter Todd on December 24, 2014, 07:37:19 AM

I'm mainly concerned about whether or not using C(++) with manual memory management is acceptable practice. Screwing up manual memory management exposes you to the king of all implicit details: what garbage happens to be in memory at that very moment.

We mostly do not use manual memory management in Bitcoin Core. Virtually all use of delete is in explicit destructors; most things are just RAII. I looked a while back and I think I found only something like three or four instances of delete used outside of destructors, and I assume those cases will all be changed the next time they're touched (examples include the wallet encryption, which hasn't been touched in years).

(I thought it was really weird that Mike brought up manual memory management; I see I made an error in not correcting him.)

Huh? I quite clearly give Bitcoin Core as an example of C++ done right, precisely because it uses a safe subset of the language that is a higher-level language via the abstractions used. You brought up C, which just isn't a safe language to write code in.

I read Mike's post as pointing out that knowing how to use C++ correctly - what subset to use - is something that does take skill. It's notable that we've changed a few things in Bitcoin Core to, for instance, use pointers where we didn't before, gradually decreasing the safety of the system by using parts of the language beyond that safe subset.

BTC: 1FCYd7j4CThTMzts78rh6iQJLBRGPW9fWv PGP: 7FAB114267E4FA04

rsvoter

Newbie

Offline

Activity: 28
Merit: 0

Re: Thoughts on type safety and crypto RNGs

December 24, 2014, 09:25:13 PM

#24

Quote from: grau on December 24, 2014, 10:02:10 AM

Quote from: Peter Todd on December 24, 2014, 07:37:19 AM

It's no surprise that easier languages attract even less skilled programmers who make more mistakes, but it's foolish to think that giving skilled programmers a tool other than a footgun is going to result in more mistakes. I think the unfortunate thing - maybe the root cause of this problem in the industry - is you definitely do need to teach programmers C at some point in their education so they understand how computers actually work. For that matter we need to teach them assembler too. The problem is C is nice enough to actually use - even the nicest machine architectures aren't - and people trained that way tend to reach for that footgun over and over again in the rest of their careers when really the language should be put on a shelf and only brought out to solve highly specialized tasks - just like assembler.

"Easy" applies to e.g. Python, but is unlikely to be the motivation for those who turn to Haskell or Scala. It is rather skilled programmers who turn to functional languages after they have shot themselves in the foot often enough to reconsider what they stand on.

Quote from: Peter Todd on December 24, 2014, 07:37:19 AM

Equally, how many computer science graduates finish their education with a good understanding of the fact that a programming language is fundamentally a user interface layer between them and machine code?

Unfortunately, many of those who get that think that they are better than the compiler and its runtime. Some might really be better, maybe even consistently, but staying ahead of compiler and runtime development is getting harder and their advantage is less and less likely.

Couldn't have said this better myself.

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Thoughts on type safety and crypto RNGs

December 25, 2014, 09:45:58 AM

#25

Quote from: Peter Todd on December 24, 2014, 09:02:05 PM

Huh? I quite clearly give Bitcoin Core as an example of C++ done right, precisely because it uses a safe subset of the language that is a higher-level language via the abstractions used.

Yes, that "safe" subset of C++ is emulating a simple and restricted reference counting runtime by hand. Certainly doable. Apple, for example, has been successful in forcing reference counting on application-level programmers on iOS, although Objective-C gives nice support for that pattern.

Continuing that line of thought, you could define a "safe" subset of C using only the stack, or maybe even do functional programming with assembler macros. Runtimes and compilers do no magic, so talented programmers can emulate any of their features in any language. It requires skill and it can be fun.

Bitcoin is not the first program that moves around billions of dollars in value; it is just a new one. Most programs I wrote moved more value than Bitcoin's market capitalization, so I know that once you deal with other people's money it gets difficult to argue against using the help that technology offers.

gmaxwell

Moderator
Legendary

Offline

Activity: 4158
Merit: 8382

Re: Thoughts on type safety and crypto RNGs

December 25, 2014, 10:26:13 AM
Last edit: December 25, 2014, 10:37:23 AM by gmaxwell

#26

Quote from: grau on December 25, 2014, 09:45:58 AM

Yes, that "safe" subset of C++ is emulating a simple and restricted reference counting runtime by hand. Certainly doable.

That isn't the case. Yes, reference counting is one tool in that box, but it has considerable costs. Most things are handled by RAII. And then there is unique_ptr... "By hand" is also perhaps misleading... in that, for better or worse, the developers themselves don't see the machinery under the hood any more than they see bounds checking in Java.
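A small sketch of the distinction, using only the standard library: `std::unique_ptr` transfers a single ownership and keeps no counter, while `std::shared_ptr` maintains an actual reference count that is updated on every copy. The `Tx` struct and both function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload type for the illustration.
struct Tx {
    int version = 1;
};

// unique_ptr: exactly one owner at a time. Ownership is moved, not counted.
bool unique_ownership_moves() {
    std::unique_ptr<Tx> a(new Tx());
    std::unique_ptr<Tx> b = std::move(a);  // a gives up the object
    return a == nullptr && b != nullptr;   // the Tx is freed when b goes out of scope
}

// shared_ptr: a real reference count, incremented on each copy.
long shared_count_after_copy() {
    std::shared_ptr<Tx> p = std::make_shared<Tx>();
    std::shared_ptr<Tx> q = p;  // the count goes from 1 to 2
    return p.use_count();
}
```

Here `unique_ownership_moves()` returns `true` and `shared_count_after_copy()` returns 2. The count that `shared_ptr` maintains is where the extra cost comes from; `unique_ptr` has nothing comparable to update.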

grau

Hero Member

Offline

Activity: 836
Merit: 1021

bits of proof

Re: Thoughts on type safety and crypto RNGs

December 25, 2014, 11:04:51 AM

#27

Quote from: gmaxwell on December 25, 2014, 10:26:13 AM

Quote from: grau on December 25, 2014, 09:45:58 AM

Yes, that "safe" subset of C++ is emulating a simple and restricted reference counting runtime by hand. Certainly doable.

That isn't the case. Yes, reference counting is one tool in that box, but it has considerable costs. Most things are handled by RAII. And then there is unique_ptr... "By hand" is also perhaps misleading... in that, for better or worse, the developers themselves don't see the machinery under the hood any more than they see bounds checking in Java.

RAII and unique_ptr implement a reference counting store where the (implicit) use count can be 1 or 0, right?
"By hand" does not mean copy and paste. The solutions you use are likely the best attainable in the C++ environment.

My point is that better support for program correctness is available elsewhere and should be used where permissible. When it is permissible is open for discussion, but I do not buy that the answer would be never.

I even suspect that a language and runtime that more safely excludes side effects and implicit inputs, and that aids reasoning about the correctness of algorithms, is a better choice even for consensus definition.

Mike Hearn (OP)

Legendary

Offline

Activity: 1526
Merit: 1129

Re: Thoughts on type safety and crypto RNGs

December 27, 2014, 08:36:06 PM

#28

Quote from: gmaxwell on December 24, 2014, 08:23:37 PM

(I thought it was really weird that Mike brought up manual memory management; I see I made an error in not correcting him.)

Sigh. You know I have the deepest respect for you, Gregory, but this is not the first time I get the feeling you're commenting on things I've written without having read them closely :(

I said:

Quote

The main concern with Core is not that the code is insecure today, but what happens in the years to come

I know how the code is currently written. I first read it a few months after it was released, remember ;)

But being concerned about an extremely common class of errors is hardly weird. Multiple people on this thread have brought it up.

My experience of working on several large C++ server codebases at Google is that it's quite possible to write robust code ... for a while. When you have a single thread, everything is written by one guy and all data is request scoped, the tools C++ provides can work very well.

But eventually one of the following happens:

1) Someone introduces multi-threading for better scalability, resource management, use of a blocking library etc, and accidentally writes code that races
2) Someone refactors code written by someone else and uninitialised data creeps in
3) Someone starts using a third party library that isn't written in the same way and requires manual heap management (like OpenSSL)
4) Someone profiles and decides to reduce the amount of copying that is going on

More generally: things change, teams change and software gets more complicated. Because nothing in C++ forbids manual memory management and some things require it, eventually it ends up being used. And some time after that, someone makes a mistake.
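As an illustration of item 2 only, here is a minimal sketch of one mitigation available in C++11: default member initializers, so that a constructor added during a later refactor cannot leave a field holding whatever was in memory. `PeerStats` and its fields are hypothetical names made up for this example.

```cpp
#include <cassert>

// Hypothetical struct. Each field has a default member initializer, so any
// constructor that does not mention a field still leaves it with a known value.
struct PeerStats {
    int misbehavior = 0;
    bool whitelisted = false;
    long last_seen = 0;

    PeerStats() = default;

    // A constructor added later that sets only one field:
    // the other fields keep their defaults instead of being garbage.
    explicit PeerStats(bool w) : whitelisted(w) {}
};

// Small self-check showing the state each constructor produces.
void check_defaults() {
    PeerStats a;
    assert(a.misbehavior == 0 && !a.whitelisted && a.last_seen == 0);

    PeerStats b(true);
    assert(b.whitelisted && b.misbehavior == 0 && b.last_seen == 0);
}
```

This covers only one of the four failure modes in the list; nothing in the language forces a team to use it, which is the point being made above.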

We can't magically convert Bitcoin Core to a safer language with a stricter type system. We can anticipate that mistakes will happen, and try to put in place systems to automatically catch and handle them.

Quote from: petertodd

You might be interested to find out that Python is actually moving towards static types; the language recently added support for specifying function argument types in the syntax. How the types are actually checked is undefined in the language itself - you can use third-party modules to impose your desired rules.

This sounds somewhat like the Checker framework. It is a pluggable type system for Java. I'd like to see it adapted for Scala and Kotlin too. It has a number of very practical type systems that catch practical errors like mixing up seconds and milliseconds and other unit mismatches.
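For comparison, a rough C++ analogue of that kind of unit checking (this is not the Checker framework, just the standard `<chrono>` library): durations carry their unit in the type, so mixing up seconds and milliseconds either converts correctly or fails to compile. `timeout_ticks` and `two_seconds_in_ms` are hypothetical functions for this sketch.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical API: the parameter type states that the unit is milliseconds.
long long timeout_ticks(std::chrono::milliseconds timeout) {
    return timeout.count();
}

long long two_seconds_in_ms() {
    std::chrono::seconds s(2);
    // seconds -> milliseconds is lossless, so the conversion is implicit
    // and the value is scaled by the library.
    return timeout_ticks(s);
}

// Both of the following are rejected at compile time:
//   timeout_ticks(2);                                            // a bare int has no unit
//   std::chrono::seconds t = std::chrono::milliseconds(1500);    // lossy, needs duration_cast
```

`two_seconds_in_ms()` returns 2000, so the caller cannot accidentally pass a raw "2" and have it interpreted as two milliseconds.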

moni3z

Hero Member

Offline

Activity: 899
Merit: 1002

Re: Thoughts on type safety and crypto RNGs

December 27, 2014, 08:58:13 PM
Last edit: December 28, 2014, 01:03:21 AM by moni3z

#29

Every OS has a proper method for obtaining keystream (/dev/urandom): https://news.ycombinator.com/item?id=8049739. The problem arises if you or somebody else chroots the application and forgets to make the userspace CSPRNG available inside it, so obtaining randomness directly from the kernel is a good idea.

Another problem is people making browser client-side JS wallets without containing them inside a browser add-on: http://matasano.com/articles/javascript-cryptography/
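A minimal sketch of the failure mode, assuming a Unix-like system: read from /dev/urandom and refuse to continue if the device is missing (for example inside a chroot) rather than falling back to something weaker. `os_random_bytes` is a hypothetical helper name. A system call such as Linux's getrandom(2) avoids the dependency on the device file altogether; it is not used here in order to keep the sketch portable.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns n bytes from the OS random device,
// or throws if the device cannot be opened or read in full.
std::vector<unsigned char> os_random_bytes(std::size_t n) {
    std::ifstream urandom("/dev/urandom", std::ios::in | std::ios::binary);
    if (!urandom) {
        // Typical cause: a chroot without /dev/urandom. Fail loudly, never guess.
        throw std::runtime_error("cannot open /dev/urandom");
    }

    std::vector<unsigned char> out(n);
    urandom.read(reinterpret_cast<char*>(out.data()),
                 static_cast<std::streamsize>(n));
    if (urandom.gcount() != static_cast<std::streamsize>(n)) {
        throw std::runtime_error("short read from /dev/urandom");
    }
    return out;
}
```

The design choice worth noting is that every error path throws: code that silently substitutes a weaker source when the device is missing is how a chroot turns into predictable keys.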