Bitcoin Forum
Topic: Thoughts on type safety and crypto RNGs
Mike Hearn (OP)
December 11, 2014, 01:06:39 PM
 #1

I wrote an article about some of the failures in wallet randomness we've seen in the past 12 months:

  https://medium.com/@octskyward/type-safety-and-rngs-40e3ec71ab3a

It's a 6 minute read, but the tl;dr summary is:

1) Find ways to make the type systems you are working with stronger, either through better tools or better languages

2) Try and get entropy as directly from the kernel as possible, bypassing userspace RNGs

I should practice what I preach - bitcoinj could be upgraded to use the Checker Framework for stricter type checking, and we currently only bypass the userspace RNG when Android is detected. I'll be looking at ways to make things stricter and more direct next year.
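
As a rough illustration of point 2, here is a minimal Python sketch (not bitcoinj code) that takes key material straight from the operating system's entropy source through `os.urandom` instead of a userspace PRNG. The function name `new_secret_bytes` and the 32-byte default are assumptions made for the example.

```python
import os


def new_secret_bytes(length: int = 32) -> bytes:
    """Return `length` random bytes obtained from the operating system.

    `new_secret_bytes` is a hypothetical helper name used for illustration.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    # os.urandom asks the OS for random bytes rather than running a
    # generator seeded inside this process.
    data = os.urandom(length)
    # Defensive check: fail loudly instead of returning a short key.
    if len(data) != length:
        raise RuntimeError("short read from the OS entropy source")
    return data
```

Failing loudly on anything unexpected is a deliberate choice here: a key generator that quietly falls back to a weaker source is the kind of failure the article describes.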
bcearl
December 11, 2014, 01:46:13 PM
 #2

You should not do crypto in JS or Java in the first place. In those languages, you do not have control over memory management. For example, in JS you have no control over how and where the browser stores your secret data (keys etc.). There is no way to enforce the physical deletion of private data.

hexafraction
December 11, 2014, 09:48:09 PM
 #3

Quote
You should not do crypto in JS or Java in the first place. In those languages, you do not have control over memory management. For example, in JS you have no control over how and where the browser stores your secret data (keys etc.). There is no way to enforce the physical deletion of private data.

Java allows very specific off-heap allocation on OpenJDK's VM, that allows for crypto data to live in a specific place in memory without fear of being copied by an eager GC, and to be erased from memory before deallocation. Netty also has some specific buffer types that are zero-copy for performance, that are useful even in non-network applications.
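
The "erase before deallocation" idea can be sketched in Python as a loose analogue. This is not the OpenJDK off-heap mechanism described above, and it is best effort only: the immutable `bytes` object passed to the constructor, and any copies the runtime makes, are outside its control. The class name `SecretBuffer` is hypothetical.

```python
class SecretBuffer:
    """Hypothetical holder that keeps key bytes in a mutable bytearray."""

    def __init__(self, data: bytes):
        # A bytearray can be overwritten in place, unlike bytes.
        self._buf = bytearray(data)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Wipe on the way out, even if the body raised.
        self.wipe()
        return False

    def view(self) -> memoryview:
        """Expose the buffer without copying it."""
        return memoryview(self._buf)

    def wipe(self) -> None:
        """Overwrite every byte of the buffer with zero."""
        for i in range(len(self._buf)):
            self._buf[i] = 0
```

Used as `with SecretBuffer(key) as s: ...`, the buffer is zeroed when the block exits. It only narrows the window in which the bytes are readable; it does not answer the objection that other copies may exist.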

grau
December 20, 2014, 09:21:04 PM
 #4

1) Find ways to make the type systems you are working with stronger, either through better tools or better languages

+1

Unfortunately, most crypto developers still rely on their perceived superior programming skills instead of using modern languages.

Most exploits arise from programming errors in low level weakly typed languages and not from those exotic "timing" and "memory" attacks that they use to justify their ancient tool set.
gmaxwell
December 20, 2014, 09:54:57 PM
 #5

in low level
Except the issues with poor cryptographic security Mike is talking about have only been observed-- so far-- in tools written in Java, Javascript, and Python in our ecosystem. None of these are low level languages.
grau
December 21, 2014, 10:40:15 AM
Last edit: December 21, 2014, 11:39:42 AM by grau
 #6

in low level
Except the issues with poor cryptographic security Mike is talking about have only been observed-- so far-- in tools written in Java, Javascript, and Python in our ecosystem. None of these are low level languages.

You are right that I should have said "type unsafe" rather than "low level", since there are high level unsafe languages, like Javascript or Python. Java is more type safe and Scala is even better, and that is what Mike said.

Added:
Not using the compile time checks that type safety gives is pure arrogance.

BTW what about the Heartbleed bug in SSL, was it not in Bitcoin core?

Unfortunately you only use your intelligence to pinpoint inaccuracy in my sentences.
gmaxwell
December 21, 2014, 08:10:06 PM
 #7

Quote
BTW what about the Heartbleed bug in SSL, was it not in Bitcoin core?
It was an issue in OpenSSL (bitcoind doesn't expose SSL to the public in a default, or even sane, configuration at least).  Every other language also depends on system libraries too. So the language Bitcoin core was written in was irrelevant in this example.

Quote
Unfortunately you only use your intelligence to pinpoint inaccuracy in my sentences.
I'm sorry you feel that I'm nitpicking, but I'm not trying to.

So far our experience in this space is that there is more irresponsible and broken software written in higher level languages; there have been virtually no issues in this space from cryptographic weaknesses (or even conventional software security) in Bitcoin applications written in C / C++. I agree that sounds somewhat paradoxical... but it's not that shocking: the security of these systems depends on the finest details of the behaviour of each part of the software and their interactions, and when your system obscures the details some extra work is required to review through the indirection. This somewhat offsets the gains. In cryptographic (and especially consensus) systems it's much harder to "fail safe" and a much wider spectrum of unexpected behaviour is actually bad and exploitable. Languages like Java make some kinds of erroneous software "more safe" when the software is incorrect, but making software more correct is still something that is largely not reaching production industrial software development yet (languages with dependent types and facilities for formal analysis seem like they _may_ result in more correct software).

There is no replacement for hard work and many view higher level languages as an escape from drudgery, so there may be some language selection bias from the attitude of the authors that has nothing to do with the language itself.  In any case, I think your barb was misplaced, at least in this thread: We've seen bad RNG behaviour from Java software several times, and not just in system libraries. (And not just RNG safety, also things like attempts at full node code being shattered by underlying crypto libraries bubbling up null pointer exceptions that cause false block rejections which would have created forks if it were widely used).

(I do agree though that using untyped languages is basically suicide for any, even moderately large, system where correctness matters.)
grau
December 22, 2014, 08:17:05 AM
 #8

BTW what about the Heartbleed bug in SSL, was it not in Bitcoin core?
It was an issue in OpenSSL (bitcoind doesn't expose SSL to the public in a default, or even sane, configuration at least).  Every other language also depends on system libraries too. So the language Bitcoin core was written in was irrelevant in this example.

That bug in that library was exemplary for the potentially disastrous consequences of the weak memory model present in C and C++. It did not put Bitcoin at risk, but it likely would have if the payment protocol had been in core already. The argument that the bug was in a library is weak and applies to the RNG problem we saw with Java on Android too. We have seen a very similar bad RNG problem in Debian Linux too, written in C. Errors like those are not language specific; the consequence of the Heartbleed bug, however, was. The bug itself would not have been such a disaster were it not paired with a weak memory model.

Bitcoin core cannot change its technology as it would likely result in a hard fork between its older and newer versions. We can't touch Satoshi's bugs and should one of the used libraries blurp up or even store some junk, chances are good that those "features" have to be preserved.

On a side chain however the technology is not set in stone. Whatever features, even bugs, another tool and library set displays there define the consensus of that side chain.

I am using Java and more recently Scala not just because they relieve me from some drudgery, but because they do help me to create more robust and correct programs. Ignoring major advances of computer science should be well justified. I see good reasons to stick with the tool set for Bitcoin core, but not for what is built around it. Higher level interfaces and new side chains need not use the same hammer for all nails.

Mike gave good hints for the selection of new hammers, and that's what I applauded.

Peter Todd
December 22, 2014, 03:34:17 PM
 #9

Quote
So far our experience in this space is that there is more irresponsible and broken software written in higher level languages; there have been virtually no issues in this space from cryptographic weaknesses (or even conventional software security) in Bitcoin applications written in C / C++.

That's an incredibly bold statement given that there's almost no-one writing Bitcoin applications in C / C++ with the exception of Bitcoin Core itself. Equally the demographics of people writing the tiny amount of C / C++ code out there is very different than the demographics writing in more modern languages.

Fact is right now we just can't say anything about what approach is better based solely on where the most bugs have been found; we can say other industries have consistently been moving away from C and to a lesser extent C++ due to the difficulty of writing secure code in those languages.

You're also conflating two separate problems. It may turn out that writing consensus-critical code in other languages is harder, but that's a very different problem than writing secure code in the more general sense. Equally it may turn out that better approaches to writing consensus-critical code are more important than what language you choose to write it in. But right now we just don't know.

Mike Hearn (OP)
December 22, 2014, 06:39:22 PM
 #10

I would say that we've got very lucky with respect to Bitcoin Core:  Satoshi was a very careful developer who knew C++ very well and maximised use of its features to increase safety. The developers who followed him are also very skilled, know C++ very well and know how to avoid the worst traps.

The main concern with Core is not that the code is insecure today, but what happens in the years to come. Will the people who follow Gavin, Pieter, Gregory etc be as good? What about alt coins? What if a refactoring or multi-threading of some performance bottleneck introduces a double free? Anyway, not much we can do about this except try and make the environment as safe as possible. I've made some suggestions on how to do this in the past (auto restart on crash, use Boehm GC) and normally Gregory likes to point out possible downsides :) but I'm not super comfortable relying on "don't make mistakes" as a policy over the long run.

WRT RNG issues in Java, I'm not aware of any beyond the Android bugs, which were very severe but didn't have anything to do with Java as a language or platform. If there have been issues in Java SE I don't recall hearing about them. Bypassing in-process RNGs is still a good idea though.

Quote
You should not do crypto in JS or Java in the first place. In those languages, you do not have control over memory management. For example, in JS you have no control over how and where the browser stores your secret data (keys etc.). There is no way to enforce the physical deletion of private data.

It's also true of C (e.g. AES keys can persist in XMM registers for a long time after use). Although hexafraction is right that on HotSpot you can do manual heap allocations, it doesn't matter much. If an attacker has complete access to your address space then this is so close to "game over" that it hardly makes any odds whether there are multiple copies in RAM. Even if the password isn't lying around, they can just wait until it is. I'm not a big fan of spending time trying to "clean" address spaces of passwords or keys.

Note that for core crypto, it's looking more and more like long term everything will have to be done in assembly anyway. Pain.
gmaxwell
December 22, 2014, 08:09:22 PM
Last edit: December 22, 2014, 08:46:24 PM by gmaxwell
 #11

I've made some suggestions on how to do this in the past (auto restart on crash, use Boehm GC)
Our process is not "don't make mistakes". Bitcoin Core largely uses a safer subset of C++ that structurally prevents certain kinds of errors (assuming the subset is followed; we don't have any mechanical enforcement).  I don't believe anyone writing or reviewing code for the project would describe its primary safety strategy as "don't make mistakes", not with the level of review and the general avoidance of riskier techniques.

Even equipped with automatic theorem provers that could reason about cryptographic constructs, no language or language facility can free you from having to avoid errors (though avoiding errors is much more than "just don't make them").

Things like "restart on crash" can be quite dangerous, because they let an attacker try their attack over and over, or keep the software running (and mining / authoring irreversible transactions) on a failing system. In most cases if we know that something that the software hasn't accounted for has happened just being shut down is better. If doing this results in a DOS attack, ... DOS attacks against the network are bad, but they're preferable to less recoverable outcomes. I think if anything we'd be likely to go the other way: On a "can never happen" indication of corruption, write out a "your_system_appears_busted_and_bitcoin_wont_run_until_you_test_it_and_remove_this_file.txt" that gets checked for at startup.
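
The marker-file idea in the last sentence could look roughly like the Python sketch below. Nothing here comes from Bitcoin Core: the helper names `record_corruption` and `check_startup_allowed` and the shortened marker name are assumptions for illustration.

```python
import os

# Hypothetical, shortened marker name; the post suggests a longer one.
MARKER_NAME = "system_appears_busted.txt"


def record_corruption(datadir: str, reason: str) -> str:
    """Write the marker file on a 'can never happen' condition."""
    path = os.path.join(datadir, MARKER_NAME)
    with open(path, "w", encoding="utf-8") as f:
        f.write(reason + "\n")
        f.flush()
        # Push the marker to disk before the process goes down.
        os.fsync(f.fileno())
    return path


def check_startup_allowed(datadir: str) -> None:
    """Refuse to start while the marker file is present."""
    path = os.path.join(datadir, MARKER_NAME)
    if os.path.exists(path):
        raise SystemExit(
            f"refusing to start: test your system, then remove {path}"
        )
```

The point of the design is that recovery needs a human decision (deleting the file), so an attacker cannot simply retry against an automatically restarted process.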

Quote
WRT RNG issues in Java
There has been Java bitcoin software with such problems, e.g. a vanity-generator that generated predictable keys, altcoin software that failed in various ways, and bouncycastle causing inconsistency in node software by throwing surprise null pointer exceptions on weird inputs. I wasn't saying that there was any language issue there, but pointing out that even using the most confined language you can find will not prevent people from writing unsound cryptographic software. (And perhaps even making things worse, if the protection against idiotic mistakes makes people forget that they're playing with fire.)

You're also conflating two separate problems. It may turn out that writing consensus-critical code in other languages is harder, but that's a very different problem than writing secure code in the more general sense.
Actually no, you're catching the point I'm making but missing it.  Cryptographic systems in general have the property that you live or die based on implicit details. Cryptographic consensus makes the matter worse only in that a larger class of surprises turn out to be fatal security vulnerabilities. It's quite possible, and has been observed in practice, to end up with exploitable systems because some buried/abstracted behaviour is different than you expected. A common example is propagating errors up to the far side when authentication fails and leaking data about the failure, allowing incremental recovery of secret data.  Another example is implicit padding behaviour leaking information about keys (there is an example of this in Bitcoin core: OpenSSL's symmetric crypto routines had implicit padding behaviour that made the wallet encryption faster to crack than had been intended).
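
To make the error-propagation example concrete, here is a small Python sketch of one common mitigation: compare authentication tags with `hmac.compare_digest` and report every failure through a single, detail-free error. The names `AuthenticationError`, `compute_tag` and `open_message` are hypothetical, and HMAC-SHA256 is simply an assumed choice for the example.

```python
import hashlib
import hmac


class AuthenticationError(Exception):
    """One error for every authentication failure, carrying no detail."""


def compute_tag(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 tag over the message (assumed construction)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def open_message(key: bytes, message: bytes, tag: bytes) -> bytes:
    """Return the message only if the tag verifies."""
    expected = compute_tag(key, message)
    # compare_digest is designed so that timing does not reveal where
    # the two values first differ.
    if not hmac.compare_digest(expected, tag):
        # Same exception and same text for wrong length, wrong bytes, etc.
        raise AuthenticationError("authentication failed")
    return message
```

This covers only the comparison and the error text. Anything else the surrounding layers do differently on failure (logging, retries, response size) can still leak, which is the "implicit details" point above.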

I'm certainly a fan of smarter tools that make software safer (I'm conceptually a big fan of Rust, for example). But what I'm seeing deployed out in the wider world is that more actual deployed weak cryptography software is resulting from reasons unrelated to language.  This doesn't necessarily mean anything about non-cryptographic software. And some of it is probably just an attitude correlation; you don't get far in C if you're not willing to pay attention to details. So we might expect other languages to be denser in sloppy approaches. But that doesn't suggest that someone equally attentive might not do better, generally, in something with better properties. (I guess this is basically your demographic correlation).  So I'm certainly not disagreeing with these points; but I am disagreeing with the magic bullet thinking which is provably untrue: Writing in FooLang will absolutely not make your programs safe for people to use. It _may_ be helpful, indeed, but it is neither necessary nor sufficient, as demonstrated by the software deployed in the field.
2112
December 22, 2014, 11:08:36 PM
 #12

Equally the demographics of people writing the tiny amount of C / C++ code out there is very different than the demographics writing in more modern languages.
Maybe it is true where you live. Where I live C++ enjoys resurgence in the form of superset/subset language SystemC, where certain things about the programs can be proven.

Likewise, gmaxwell posted here information about new research where a specific C subset (targeting specific TinyRAM architecture) can be used to produce machine-verifiable proofs. AFAIK this is still a long-shot option for Bitcoin, not something usable currently.

My comment here pertains to the consensus-critical code in the dichotomy you've mentioned later.
 

grau
December 23, 2014, 10:36:11 PM
 #13

So I'm certainly not disagreeing with these points; but I am disagreeing with the magic bullet thinking which is provably untrue: Writing in FooLang will absolutely not make your programs safe for people to use. It _may_ be helpful, indeed, but it is neither necessary nor sufficient, as demonstrated by the software deployed in the field.

Neither Mike nor myself advertized a language as a magic bullet that makes programs safe.

You however seem to believe in superior powers of maintainers that outweigh the advances of languages and runtime environments of the last decades.

I'd say you play a more dangerous game than us.
gmaxwell
December 24, 2014, 01:50:07 AM
 #14

So I'm certainly not disagreeing with these points; but I am disagreeing with the magic bullet thinking which is provably untrue: Writing in FooLang will absolutely not make your programs safe for people to use. It _may_ be helpful, indeed, but it is neither necessary nor sufficient, as demonstrated by the software deployed in the field.

Neither Mike nor myself advertized a language as a magic bullet that makes programs safe.

You however seem to believe in superior powers of maintainers that outweigh the advances of languages and runtime environments of the last decades.

I'd say you play a more dangerous game than us.

You wrote, "Most exploits arise from programming errors in low level weakly typed languages". I pointed out that in our space we've observed the opposite: There have been more serious cryptographic weaknesses in software written in very high level languages like Python, JavaScript, PHP, Java, etc. That's all.  Please tone down the personal insults. You're very close to earning an ignore button press from me. I have scrupulously avoided besmirching your skills-- or even saying that I think your preferred tools are not _good_, only that people using them suffer errors too-- but in every response you make you attack my competence.
Sergio_Demian_Lerner
December 24, 2014, 02:31:14 AM
 #15

All coders make mistakes. In every language, in every library. Formal verification methods are generally too expensive. That's why peer review and audits exist: to detect those errors. And the more auditors, the better.

C++ code is generally more concise because of the greater versatility of its grammar (e.g. overloaded operators), but it is not as easy to understand for anyone but the programmer. C++ is very powerful, but can more easily hide information from the auditor. However, the programmer has greater control regarding timing side-channels and secret leakage.

Java code is generally more explicit and descriptive. It forces you to do things that make the auditor's work simpler, such as class-file separation.
Obviously you can program C++ as if it were Java, but that's not how C++ libraries are built, nor how C++ programmers have learned. Nobody changes a language's standard semantics.

Dynamically-typed languages are the worst, because you cannot fully understand the consequences of a function without looking at every existing function call to see the argument types (and sometimes you cannot infer those without going deeper in the call tree!)

One example I remember now is Python's strong pseudo-random generator seeding function. If you call the seeding function with a BigInt, it uses the BigInt as the seed, but if you call it with a hexadecimal or binary string (and I've seen this), it performs a 32-bit hash of the string and then seeds the generator with a 32-bit number. This is allowed because a 32-bit hash is a default for every object. You can write Python that does not make use of dynamic typing, but that requires checking the type of every argument received, which nobody does.
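
A minimal sketch of the "check the type of every argument" approach for this case: convert the hexadecimal string to an integer explicitly, so the result does not depend on how a given Python version treats string seeds (newer versions handle them differently from the behaviour described above). The helper name `seeded_rng_from_hex` is an assumption. Note that the `random` module is not meant for key generation in any case; the sketch only illustrates the typing hazard.

```python
import random


def seeded_rng_from_hex(seed_hex: str) -> random.Random:
    """Build a deterministic generator from a hexadecimal seed string.

    Hypothetical helper: rejects anything that is not a str, and parses
    the hex digits itself rather than handing the string to seed().
    """
    if not isinstance(seed_hex, str):
        raise TypeError("seed_hex must be a hexadecimal string")
    # int(..., 16) raises ValueError on non-hex input and keeps every
    # bit of the seed, however long the string is.
    seed = int(seed_hex, 16)
    return random.Random(seed)
```

For example, `seeded_rng_from_hex("ff")` yields the same stream as `random.Random(255)`, while passing `bytes` raises `TypeError` instead of silently taking some other code path.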

I would prefer that low-level crypto code (key management, prng, signature, encryption, authentication) is written in C/C++ (e.g. Sipa's secp256k1 library in Bitcoin) and every other layer is written in a more modern statically typed language, such as Java. For most projects, that probably means that 90% of the code would be in Java and 10% would be in C/C++ (and that would probably be crypto library code).
The 90% Java code would be more secure not because Java code is more secure per se, but because it would be easier to audit. The 10% would be harder, but since it would be small you would be able to double the audit time for that part.
 
At the end, you get a more secure system having used the same audit or peer review time.
2112
December 24, 2014, 03:04:17 AM
Last edit: December 24, 2014, 04:38:28 AM by 2112
 #16

Quote
I would prefer that low-level crypto code (key management, prng, signature, encryption, authentication) is written in C/C++ (e.g. Sipa's secp256k1 library in Bitcoin) and every other layer is written in a more modern statically typed language, such as Java.
I disagree that such a combination would be safer and easier to audit. Java and C++ runtimes are very hard to properly interface, especially in the exception handling and threading aspects. So the purported audit would not only involve auditing the code of the Bitcoin core but also auditing a large portion of the Java runtime.

One could make one or two restrictions in the mixed architecture you're proposing:

1) C/C++ code are only "leaves" on the call tree, i.e. only Java calls C++, C++ never calls Java.

2) "Java" is understood to mean not "validated standard conforming Java" but "subset of Java supported by the gcj ahead-of-time compiler" matched with the gcc/g++ used for the C/C++ code.

otherwise the mixed-language program will have a large minefield in the inter-language interface layer.

Edit:

Historical note: if "Java" would mean "Microsoft Visual J++" with J/Direct instead of JNI as an inter-language layer that could also work relatively smoothly. Those things are of historical interest only although there is at least one vendor in Russia that still maintains a Java toolchain that is unofficially compatible with the historical code: http://www.excelsior-usa.com/ .

grau
December 24, 2014, 05:27:18 AM
 #17

So I'm certainly not disagreeing with these points; but I am disagreeing with the magic bullet thinking which is provably untrue: Writing in FooLang will absolutely not make your programs safe for people to use. It _may_ be helpful, indeed, but it is neither necessary nor sufficient, as demonstrated by the software deployed in the field.

Neither Mike nor myself advertized a language as a magic bullet that makes programs safe.

You however seem to believe in superior powers of maintainers that outweigh the advances of languages and runtime environments of the last decades.

I'd say you play a more dangerous game than us.

You wrote, "Most exploits arise from programming errors in low level weakly typed languages". I pointed out that in our space we've observed the opposite: There have been more serious cryptographic weaknesses in software written in very high level languages like Python, JavaScript, PHP, Java, etc. That's all.  Please tone down the personal insults. You're very close to earning an ignore button press from me. I have scrupulously avoided besmirching your skills-- or even saying that I think your preferred tools are not _good_, only that people using them suffer errors too-- but in every response you make you attack my competence.

If you define your space as Bitcoin core, then yes, it shows very high quality, maintained by remarkable talents of which you are one.
No doubt about that. I had no intention of accusing you of incompetence.

The model that has been successful with Bitcoin core has however failed so many times elsewhere that it fills libraries with dos and don'ts of pointer arithmetic, the anatomy of buffer overflows and zero delimited string exploits. I know Bitcoin core developers carefully avoid those sources, but that still did not protect against a bug in OpenSSL. That bug was not cryptographic in nature; it exposed the memory of the process as a consequence of a missing array bounds check in the C/C++ runtime. Sure there are arguments for not having those checks at run-time, but those arguments work especially well with languages that check more at compile time, such that runtime violations are less probable.

While exceptional care can be successful, as we observe, it is hard to scale and sustain. This is why the software industry has been moving away from C/C++. It retained relevance in certain areas just like any good technology.

We need orders of magnitude more code and developers than Bitcoin core has to build this economy, therefore it is sane to take any attainable help to sustain quality. I believe that type safe and functional languages and modern runtime environments do help. I do not think you doubt this, so please calm down too. I am not attacking you personally, I just doubt the extensibility of your successful model to all projects that use Bitcoin or its innovations.









Peter Todd
December 24, 2014, 07:37:19 AM
Last edit: December 24, 2014, 07:48:33 AM by Peter Todd
 #18

Quote
Actually no, you're catching the point I'm making but missing it.  Cryptographic systems in general have the property that you live or die based on implicit details. Cryptographic consensus makes the matter worse only in that a larger class of surprises turn out to be fatal security vulnerabilities. It's quite possible, and has been observed in practice, to end up with exploitable systems because some buried/abstracted behaviour is different than you expected. A common example is propagating errors up to the far side when authentication fails and leaking data about the failure, allowing incremental recovery of secret data.  Another example is implicit padding behaviour leaking information about keys (there is an example of this in Bitcoin core: OpenSSL's symmetric crypto routines had implicit padding behaviour that made the wallet encryption faster to crack than had been intended).

I'm mainly concerned about whether or not using C(++) with manual memory management is acceptable practice. Screwing up manual memory management exposes you to the king of all implicit details: what garbage happens to be in memory at that very moment.

Given that we have at least C++ available which can insulate you from manual memory management(1), there's just no excuse to be writing code that way anymore by default. Equally writing C++ in a way that exposes you to that class of errors is generally unacceptable.

Bitcoin itself is a perfect example, where some simple "don't be an idiot" development practices have resulted in a whole class of errors having never been an issue for us, letting development focus on the remaining types of errors.

1) Where manual memory management == things that can cause memory corruption and invalid accesses. There are of course other meanings of the term that refer to practices where memory is still "managed" manually at some level, e.g. allocation, but corruption and invalid accesses are not possible.

I'm certainly a fan of smarter tools that make software safer (I'm conceptually a big fan of Rust, for example). But what I'm seeing deployed out in the wider world is that more actual deployed weak cryptography software is resulting from reasons unrelated to language.  This doesn't necessarily mean anything about non-cryptographic software. And some of it is probably just an attitude correlation; you don't get far in C if you're not willing to pay attention to details. So we might expect other languages to be denser in sloppy approaches. But that doesn't suggest that someone equally attentive might not do better, generally, in something with better properties. (I guess this is basically your demographic correlation).  So I'm certainly not disagreeing with these points; but I am disagreeing with the magic bullet thinking which is provably untrue: Writing in FooLang will absolutely not make your programs safe for people to use. It _may_ be helpful, indeed, but it is neither necessary nor sufficient, as demonstrated by the software deployed in the field.

And since when did I say anything about "magic bullets"? I'm talking about acceptable bare minimum practices. Over and over again we've seen that doing manual memory management requires Herculean efforts to get right, yet people do get far enough in C(++) to cause serious problems doing it.

It's no surprise that easier languages attract even less skilled programmers who make more mistakes, but it's foolish to think that giving skilled programmers a tool other than a footgun is going to result in more mistakes. I think the unfortunate thing - maybe the root cause of this problem in the industry - is you definitely do need to teach programmers C at some point in their education so they understand how computers actually work. For that matter we need to teach them assembler too. The problem is C is nice enough to actually use - even the nicest machine architectures aren't - and people trained that way tend to reach for that footgun over and over again in the rest of their careers when really the language should be put on a shelf and only brought out to solve highly specialized tasks - just like assembler.

Equally, how many computer science graduates finish their education with a good understanding of the fact that a programming language is fundamentally a user interface layer between them and machine code?

Peter Todd
December 24, 2014, 07:46:35 AM
 #19

Equally the demographics of people writing the tiny amount of C / C++ code out there is very different than the demographics writing in more modern languages.
Maybe it is true where you live. Where I live C++ enjoys resurgence in the form of superset/subset language SystemC, where certain things about the programs can be proven.

I'm referring specifically to the demographics of people writing code for Bitcoin-related applications.

Likewise, gmaxwell posted here information about new research where a specific C subset (targeting specific TinyRAM architecture) can be used to produce machine-verifiable proofs. AFAIK this is still a long-shot option for Bitcoin, not something usable currently.

My comment here pertains to the consensus-critical code in the dichotomy you've mentioned later.

C with machine-verifiable proofs has nothing to do with the type of C programming I'm criticizing; neither does SystemC. Those types of environments are so far removed from the vanilla and unsafe C(++) programming that gets people into trouble that you might as well call them different languages in all but name.

Peter Todd
December 24, 2014, 07:57:58 AM
 #20

Quote
Dynamically-typed languages are the worst, because you cannot fully understand the consequences of a function without looking at every existing function call to see the argument types (and sometimes you cannot infer those without going deeper in the call tree!)

You might be interested to find out that Python is actually moving towards static types; the language recently added support for specifying function argument types in the syntax. How the types are actually checked is undefined in the language itself - you can use third-party modules to impose your desired rules. IIRC next major version, 3.6 (?) will be including a module with one approach to actually enforcing those argument types as a part of the standard library. Similarly class attributes have syntax support for specifying types, and again you can already use third-party modules to enforce those rules.

I wouldn't be surprised if the "sweet spot" for most tasks is a language much like Python with the ability to specify type information as well as the ability to easily enforce 100% usage of that technique in important code, while still giving programmers the option of writing quick-n-dirty untyped code where desired. And of course, with a bit of type information writing compilers that produce reasonably fast code becomes fairly easy - Cython does that already for Python without too much fuss.
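
As a sketch of what enforcing annotated argument types can look like, the following uses only `typing.get_type_hints` and `inspect.signature`. The decorator `enforce_types` and the function `scale_amount` are hypothetical; this is not the standard-library module mentioned above, and it only checks annotations that are plain classes.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_types(func):
    """Hypothetical decorator: check arguments against plain-class hints."""
    hints = get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; generic hints are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def scale_amount(satoshis: int, factor: int) -> int:
    """Multiply an integer amount by an integer factor."""
    return satoshis * factor
```

Without the decorator, `scale_amount("2", 3)` would quietly return the string `"222"`; with it, the call raises `TypeError`. A static checker run before deployment catches the same mistake without any runtime cost.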
