Computational Complexity Classes: P, NC, NP, E, #P, ⊕P

P, NC, NP, E, #P, and ⊕P are some of the computational complexity classes, which are delimited by 1) the type of result, 2) the machine model employed for the computation, and 3) an upper bound on resource (e.g. time, measured in algorithmic steps) consumption.
| Class | Result type | Machine model | Resource consumption |
|-------|-------------|---------------|-----------------------|
| P | boolean | deterministic | polynomial time |
| NC | boolean | deterministic, parallel processing | polylogarithmic time (polynomially many processors) |
| NP | boolean | nondeterministic | polynomial time |
| EXP or E | boolean | deterministic | exponential time |
| #P | count | nondeterministic | polynomial time |
| ⊕P | count % 2 | nondeterministic | polynomial time |
where % means modulo and boolean is a binary (two possible values) result, e.g. "yes or no", "true or false", "1 or 0".

Determinism

In this section we embark on a fantastical journey to scratch the surface of a globally (over the entire space of computer science and beyond) holistically integrated overview of the deep theoretical insights I've accumulated, in my three decades of (mostly autodidact) programming experience, about programming language theory combined with theoretical physics. Much of this I've learned in an accelerated fashion in erratic autodidact bursts of study since roughly 2010, when I first started to dive into Haskell and functional programming. That, by the way, sent my mind into a twirl (with many months of frustration grasping strange beasts such as Monads, State transformers, category theory, etc.) and turned my former C, C++, OOP notion of programming upside down and inside out. Next on my theoretical journey will hopefully be the unification of the
Holy Trinity (hey, I am getting old, so coding Godspiring seems apt). To that practical end, I have recently been employing inversion-of-control ("don't call me, I will call you") more liberally, even writing my own superior Async library that leverages ES6 (JavaScript) generators instead of the async/await coming in ES7.
One of the ways to think about the concept of entropy is the minimum possible size that some data can be losslessly compressed to, e.g. the text 'ababab' can be compressed to '3ab', where the first character is the repeated count. This is the Kolmogorov complexity (cf. Nick Szabo's point that computing the complexity is an NP problem). Thus I suggest conceptualizing entropy in this context as a measurement of the number of unique configurations in some data, such that the repetitive configurations can be discarded via compression without losing any information. Repetition is not additional Shannon information, and thus doesn't increase the complexity.
Figure 2: the deceptive appearance of what some laymen would interpret as complexity is actually low entropy due to its repeating structure
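To make the 'ababab' example concrete, here is a minimal sketch in JavaScript of that compression idea; the function name compress and its count-then-unit output format are just conventions chosen for this illustration, not a real compressor.

```js
// A minimal sketch of the idea behind the 'ababab' → '3ab' example:
// find the shortest repeating unit and emit its repeat count followed by the unit.
// Only exact whole-string repetitions are handled; real compressors are far more general.
function compress(text) {
  for (let unitLength = 1; unitLength <= text.length / 2; unitLength++) {
    if (text.length % unitLength !== 0) continue;
    const unit = text.slice(0, unitLength);
    if (unit.repeat(text.length / unitLength) === text) {
      return String(text.length / unitLength) + unit; // e.g. '3ab'
    }
  }
  return '1' + text; // no repetition found: the data is already all "unique configurations"
}

console.log(compress('ababab')); // '3ab': low entropy despite the apparent structure
console.log(compress('abcdef')); // '1abcdef': no repetition to discard
```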
Algorithms produce a result (output) which is the interaction of the program's source code (i.e. the description of the logical steps of the algorithm) and the input state (variables). Thus the source code and the input variables each separately have their own Shannon information content, i.e. a close analog of entropy. Only in perfectly dependently typed programming languages are both sets of entropy statically unified, i.e. where the statically compiled source code is a complete specification of the combined entropy. Otherwise the entropy of the source code is dynamically combined at invocation of the algorithm with the entropy of each variant of input state. The combined entropy determines the output of the algorithm. Determinism means that this combined entropy for all possible inputs is cumulatively finite for finite invocations (state transformations) of the algorithm. Since the “unique configurations” are finite, it is possible to precompute (i.e. determine) the interaction of all of them for all possible inputs and predetermine whether the algorithm will terminate or not terminate for each possible input state. Determinism doesn't mean the algorithm will terminate (e.g. we can write an algorithm that gets stuck in an endless loop every time it is run, thus being entirely predetermined to not terminate), but rather determinism (thus finite entropy) means it is possible to precompute whether the algorithm terminates for each possible input state. The formal definition of a deterministic algorithm is a transformation function from some input state to some output state (and the “transformation function” is the source code).
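For illustration, a minimal sketch in JavaScript of that "transformation function" view; the particular state shape (a counter) is an arbitrary assumption for the example.

```js
// A deterministic algorithm viewed as a pure transformation from input state to output state.
// The output depends only on the input, so the entire input → output table could in
// principle be precomputed for every possible input state.
const transform = (state) => ({ ...state, count: state.count + 1 });

console.log(transform({ count: 0 })); // { count: 1 }
console.log(transform({ count: 0 })); // { count: 1 }: same input, same output, every time
```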
In contrast, a nondeterministic algorithm when combined with all input possibilities has unbounded entropy and thus has an uncountable number of “unique configurations”. Unbounded means the algorithm may or may not have infinite entropy, but it is impossible to count the limit of the entropy until the algorithm has terminated, if it terminates (which can't be known until the algorithm terminates, i.e. non-termination is never known). Thus the entropy can't be quantified a priori.
The thing is, Turing machines don't have an infinite amount of memory. There's never a time where an infinite amount of symbols are written to the tape. Instead, a Turing[-complete] machine has “unbounded” memory, meaning that you can keep getting more sources of memory when you need it. Computers are like this. You can add RAM, or USB sticks, or hard drives, or network storage.
Note there are scenarios where the unbounded entropy derives entirely from the source code, e.g. expanding the infinite, non-repeating sequence of digits of an irrational number such as π. The input entropy can even interact to make the unbounded entropy of the source code finite, e.g. where the input variable is the finite count of the number of digits of π to output. Yet there are also cases where the input entropy interacts to retain the unbounded entropy of the source code, e.g. where the input variable is the count of the number of consecutive 0 digits to find in the digits of π (which can't be proven to terminate or not).
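A sketch of that last example in JavaScript, using Gibbons' unbounded spigot algorithm as one possible way to stream the digits of π; whether findZeroRun returns for a given count is exactly the question that can't be answered a priori.

```js
// Stream the decimal digits of π (3, 1, 4, 1, 5, ...) with Gibbons' unbounded
// spigot algorithm, using BigInt so the state never overflows.
function* piDigits() {
  // q, r, t encode a linear fractional transformation; k and l walk the series terms;
  // digit is the next candidate digit.
  let q = 1n, r = 0n, t = 1n, k = 1n, digit = 3n, l = 3n;
  while (true) {
    if (4n * q + r - t < digit * t) {
      yield Number(digit);                      // the candidate digit is confirmed
      [q, r, digit] =
        [10n * q, 10n * (r - digit * t), (10n * (3n * q + r)) / t - 10n * digit];
    } else {
      [q, r, t, k, digit, l] =                  // consume one more series term
        [q * k, (2n * q + r) * l, t * l, k + 1n,
         (q * (7n * k + 2n) + r * l) / (t * l), l + 2n];
    }
  }
}

// Search for the first run of `count` consecutive 0 digits. Small counts return quickly;
// for large counts nobody can prove in advance whether this loop ever terminates.
function findZeroRun(count) {
  let run = 0, position = 0;
  for (const digit of piDigits()) {
    position++;
    run = digit === 0 ? run + 1 : 0;
    if (run === count) return position - count + 1; // position of the start of the run
  }
}

console.log(findZeroRun(2)); // terminates quickly; findZeroRun(1000) may never return
```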
Tangentially note that in a perfectly dependently typed programming language such as Agda or Epigram, the statically compiled type(s) of the input state entirely determine this combined entropy of the source code and the input variation; thus such programs are always deterministic (regardless of the dynamic input state of each invocation) due to specifying all the possible dependencies of variables in the input types the compiler enforces. Since this static typing enumerates and enforces a finite number of input dependencies, by definition the entropy can't be unbounded. However, even perfectly dependently typed algorithms become unbounded in entropy when they are composed as modules into derivative algorithms, because the permutations of cross-dependencies between modules are impossible to model in a closed form due to Russell's Paradox; which even Philip Wadler (the creator of the Expression Problem and one of the creators of Haskell) seemed to not address in his comment about modularity that immediately followed mine. The venerable Robert Harper also weighed in on the fact that dynamically typed languages are actually undertyped statically typed languages, which belies the point that dependently typing modularity is impossible. Typing is not a panacea. Thus the algorithm for finding a number of consecutive 0 digits in π can't be implemented in a perfectly dependently typed programming language. Real world program semantics naturally contain module composition, and thus an exponential explosion of dependent types, and thus can't realistically be perfectly dependently typed. This is the reason a statically typed compiled program can still contain bugs. (The theoretical physics explanation is that in life, time is not reversible, so the forward dependencies of relativism due to the Butterfly Effect exponentially explode. This is why I have theorized that the only way to model the fundamental matter of the universe is in the higher dimensionality of the chaos frequency domain.)
It is impossible to compute the interaction of uncountable (unquantifiable) entropy. An a priori unknown can't be precomputed. Thus it is impossible to precompute whether a nondeterministic algorithm will terminate or not. The only way to find out is to run the algorithm on some particular input state and wait for a result. For some types of algorithms the unbounded entropy is so uniformly distributed that it is impossible to discern whether the algorithm is progressing towards termination, e.g. finding the consecutive number of 0 digits in π. In that case, you could be waiting forever to find out if the algorithm will terminate. In the programs that we use in everyday life, which are in fact written in Turing-complete (i.e. for a nondeterministic machine) programming languages, the unbounded entropy is sufficiently non-uniformly distributed such that the nondeterminism manifests as what programmers lovingly named “bugs” (e.g. the infamous Windoze Blue Screen of Death), which are sometimes (but not deterministically so) discerned from random behavior and thus identified as errors incongruent with the expected functionality.
The machine (or programming language) model must match the determinism requirements of the algorithm. A deterministic Turing machine only becomes nondeterministic, and is referred to as Turing-complete, iff it has the unbounded (in space and time) capability to compute all possible configurations of computation. That is equivalent to allowing unbounded (tail optimized or not) recursion, because for example unbounded recursion (or its non-functional programming equivalent, unbounded looping) is required to process input of unbounded size. Unbounded input size is one variant of unbounded input entropy, but not the only variant. Yet in all cases of unbounded entropy, the unbounded recursion of Turing-completeness is required. This fact will be applied to understanding the NP complexity class.
Figure 6: depiction of conceptualized unbounded recursion using mirrors that face (thus reflect or "invoke") each other
Unbounded recursion enables unbounded composition, which (is the antithesis of declarative programming and) includes complex cases of functions that invoke functions which invoke other functions, which at some point recursively invoke one of those functions already pushed on the stack (or do the non-functional programming equivalent, which is unbounded looping). Unbounded composition is unbounded entropy.
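A minimal sketch in JavaScript of the point about unbounded recursion and unbounded input; the list-summing function is an arbitrary illustration.

```js
// Unbounded recursion: the depth of the call chain is bounded only by the size of the
// input, and the input itself is unbounded, so no fixed bound can be placed on the
// entropy this function may have to process.
function sum(list) {
  return list.length === 0 ? 0 : list[0] + sum(list.slice(1));
}

console.log(sum([1, 2, 3])); // 6: one recursive call per element, however many there are
```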
The tradeoff for this generality (to be able to compute any computation) is that a Turing-complete machine model can't generally determine if it will terminate. This is known as the Halting problem, which even has a proof written in the style of Dr. Seuss which employs unbounded recursion to disprove the possibility of the existence of a program which could compute whether a program halts. Another way of visualizing the undecidability of the Halting problem is to note that if a program is designed so it can accept input of unbounded size, then it can't be proven whether it will halt or not halt. More abstractly, undecidability results from any rule of containment that contains everything that can contain itself (cf. the last two paragraphs of the linked answer), because this unbounded recursion opens the generality to the universe of possibilities where the entropy is unbounded (per the Second Law of Thermodynamics).
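The unbounded-recursion trick in that proof can be sketched in a few lines of JavaScript; halts is the hypothetical decider whose existence the proof assumes only in order to refute it, not a real function.

```js
// Suppose a decider existed: halts(program, input) returns true iff program(input) halts.
// No such function can exist; this stub only stands in for the proof's assumption.
function halts(program, input) {
  throw new Error('impossible to implement for all programs (that is the theorem)');
}

// The diagonal program from the proof: do the opposite of whatever the decider predicts.
function paradox(program) {
  if (halts(program, program)) {
    while (true) {}  // loop forever precisely when the decider claims we halt
  }
  // ...and halt precisely when the decider claims we loop forever
}

// If halts were real, paradox(paradox) would halt iff halts(paradox, paradox) returned
// false, contradicting the decider's answer either way; so no such decider can exist.
```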
There is a class of algorithms where the input entropy is not unbounded, yet it increases as an exponential function of some parameter(s), requiring e.g. exponential computation time. For example, although computing a cryptographic hash function is deterministic and polynomial time, finding the preimage of its output requires (on average) searching half of the 2^N possibilities, where N is the number of bits of the hash function. For example, 2^128 ≈ 3.402823669×10^38 is an unfathomably large number of “unique configurations”. I explained more about the bit security of hash functions in my recent white paper on Winternitz hash-based signatures.
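A sketch of that asymmetry using Node's built-in crypto module; SHA-256 and the numeric-string search space are merely assumptions chosen for illustration.

```js
const crypto = require('crypto');

// Computing the hash is deterministic and cheap (polynomial time in the input size).
function hash(message) {
  return crypto.createHash('sha256').update(message).digest('hex');
}

// Inverting it by brute force means searching on the order of 2^N candidates, where N is
// the digest size in bits. This toy search over numeric strings is capped so it actually
// returns; it illustrates the exponential search space rather than seriously attacking it.
function findPreimage(targetDigest, maxAttempts = 1e6) {
  for (let i = 0; i < maxAttempts; i++) {
    const candidate = String(i);
    if (hash(candidate) === targetDigest) return candidate;
  }
  return null; // gave up: the full search space has on the order of 2^256 configurations
}

const digest = hash('42');
console.log(findPreimage(digest)); // '42': found only because the preimage was tiny
```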
Note randomization should not be conflated with nondeterminism, because randomized or probabilistic algorithms are not necessarily nondeterministic algorithms: if they don't require unbounded entropy (i.e. unbounded recursion), it may be possible to prove whether they terminate or not. The unpredictability of nondeterministic algorithms is due to the inability to enumerate all the possible permutations of the algorithm's entropy. The unpredictability of a randomized algorithm is due to decoupling the control of all of the variable entropy from the input entropy. Randomization makes some of the variable entropy equiprobable (or gives it some other statistical distribution), but it does not require making the entropy uncountable. Sources of statistically controlled randomization could be deterministic (e.g. a PRNG with a chosen seed) or nondeterministic (e.g. /dev/random).
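For example, a deterministic randomization source versus a nondeterministic one, sketched in Node; the xorshift generator is one arbitrary PRNG chosen for brevity.

```js
const crypto = require('crypto');

// Deterministic randomization: a seeded PRNG (32-bit xorshift). The seed fixes the
// entire sequence, so termination properties of any algorithm using it stay provable.
function makeXorshift(seed) {
  let state = (seed >>> 0) || 1;
  return function next() {
    state ^= state << 13; state >>>= 0;
    state ^= state >>> 17;
    state ^= state << 5;  state >>>= 0;
    return state / 0x100000000; // roughly uniform in [0, 1)
  };
}

const rng = makeXorshift(12345);
console.log(rng(), rng()); // identical on every run: the entropy is fully controlled by the seed

// Nondeterministic randomization: entropy gathered by the operating system
// (the analogue of reading /dev/random), different on every run.
console.log(crypto.randomBytes(4).readUInt32BE(0) / 0x100000000);
```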
The misnamed nondeterministic finite automaton (NFA) is not nondeterministic, because the state transformation of each invocation of the algorithm is deterministic (and thus NFAs are not Turing-complete and can't simulate Turing-complete algorithms), and even the eventual state of many successive NFA transformations is deterministic because the entropy is finite, since every NFA can be converted to a deterministic finite automaton (DFA), albeit such a conversion incurs an exponential combinatorial explosion of explicitly enumerated state. NFAs are more accurately characterized as a compression mechanism where the eventual entropy is not combinatorially expanded, and not the misnomer of nondeterminism.
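A small sketch of that point in JavaScript: the NFA below (an arbitrary example that accepts strings over {a, b} ending in "ab") is simulated deterministically by tracking the set of states it could be in, which is the same idea as the subset construction that converts any NFA to a DFA.

```js
// A tiny NFA: states 0 (start), 1 (just saw 'a'), 2 (accept). The transition on 'a'
// from state 0 goes to both 0 and 1 — that branching is the so-called "nondeterminism".
const nfa = {
  start: 0,
  accept: new Set([2]),
  delta: {
    0: { a: [0, 1], b: [0] },
    1: { b: [2] },
    2: {},
  },
};

// Simulating the NFA by tracking the *set* of reachable states is itself a deterministic
// algorithm; a DFA built by the subset construction would enumerate up to 2^n such sets.
function nfaAccepts(nfa, input) {
  let current = new Set([nfa.start]);
  for (const symbol of input) {
    const next = new Set();
    for (const state of current) {
      for (const target of (nfa.delta[state][symbol] || [])) next.add(target);
    }
    current = next;
  }
  return [...current].some(state => nfa.accept.has(state));
}

console.log(nfaAccepts(nfa, 'aab')); // true
console.log(nfaAccepts(nfa, 'aba')); // false
```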
Polynomial time

Polynomial time requires the computation time of the algorithm to be a function of some variable(s) where all exponents are constants. Cobham's thesis is the rule of thumb that polynomial time algorithms are in the asymptotic case not intractable to compute, but remember from Computational Complexity Classes: Overview that such asymptotic intuitions are not absolute.
P

P includes all (boolean result, a.k.a. 'decision') algorithms which can be run on a deterministic (non-Turing-complete) machine in polynomial time. In other words, those algorithms which are known to terminate in polynomial time. Deterministic algorithms (and deterministic Turing machines, i.e. not Turing-complete) require a bounded entropy, meaning that the inputs and computations of the program have an upper bound on (i.e. maximum) size and time. Another way of stating this is that it must be decidable (provable) whether the program terminates. Programs that accept input of unbounded size or perform an unbounded loop (recursion) are not deterministic.

Note that since P is deterministic, we can prove all algorithms in class P will terminate, and thus we don't have to include the type of non-termination in (a disjunction of) P's boolean output type.
So the significance of the P class of algorithms is:
- Provably terminates for all possible inputs.
- Polynomial time often implies the computation is feasible (tractable); see the sketch below.
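A minimal sketch of an algorithm that is evidently in class P; the particular decision problem (does a list of digits contain a 0?) is arbitrary.

```js
// A decision (boolean result) algorithm in P: the loop runs at most digits.length times,
// a bound fixed by the input size, so termination is provable for every possible input
// and the running time is (trivially) polynomial.
function containsZero(digits) {
  for (let i = 0; i < digits.length; i++) {
    if (digits[i] === 0) return true;
  }
  return false;
}

console.log(containsZero([3, 1, 4, 1, 5])); // false
console.log(containsZero([1, 0, 1]));       // true
```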
NP

NP includes all (boolean result) algorithms for which the computed results can be deterministically verified, on a nondeterministic (Turing-complete) machine, in polynomial time (and polynomial space or size for the proof being verified). The point is that the algorithm may require a Turing-complete machine, thus it is undecidable (i.e. unknowable a priori) whether the algorithm will terminate. In the cases where the algorithm terminates, the verification will also terminate; thus for verification the algorithm behaves deterministically and could even be run on a deterministic (not Turing-complete) machine. Also the verification may even run a different algorithm than the generation of the proof.
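The canonical illustration of this split between proving and verifying is boolean satisfiability; a minimal sketch follows, where the CNF encoding as arrays of signed variable indices is just a convention chosen for this example.

```js
// Finding a satisfying assignment for a CNF formula may be intractable, but verifying a
// claimed assignment (the "proof") is a simple polynomial-time pass over the clauses.
// A clause is an array of literals: positive n means variable n must be true,
// negative n means variable n must be false.
function verifyAssignment(clauses, assignment) {
  return clauses.every(clause =>
    clause.some(literal =>
      literal > 0 ? assignment[literal] : !assignment[-literal]));
}

// (x1 ∨ ¬x2) ∧ (x2 ∨ x3)
const formula = [[1, -2], [2, 3]];
console.log(verifyAssignment(formula, { 1: true, 2: false, 3: true }));  // true
console.log(verifyAssignment(formula, { 1: false, 2: true, 3: false })); // false
```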
So the significance of the NP class of algorithms is:
- Verification of the proof is an algorithm in P, so all the significance of P applies to the verification.
- NP algorithms occur in so many natural problems.
- The significance of verification computed in P coupled with the ability to prove NP computation.
- The undecidability and/or intractability of algorithms in NP can be employed as an advantage.
An example helps explain the significance of the last two bullet points. Note that NP means deterministic, polynomial time verification of proofs of nondeterministic, polynomial time computation. An analogous example of proving computation applies to a more complex class: instead of NP, the complexity class EXP (or E) coupled with P verification, i.e. deterministic, polynomial time verification of proofs of deterministic, exponential time computation. The example is a cryptographic hash function (which btw approximates infinite Kolmogorov complexity in the Random Oracle model[1]), which requires EXP to find a preimage (analogously intractable to precompute as is the nondeterminism in NP, due to the intractability of precomputing all the entropy of the hash function); and the algorithm is in P for verifying (the proof of computation from EXP of) a commitment to a preimage. Hash commitment schemes will be crucial for understanding Zerocash.
In simpler words, the point is that proof verification can be tractable (because the algorithm is in P) while computing (i.e. generating) the proof can be intractable (because the algorithm is in NP or EXP) when the adversarial (cheating) prover doesn't know the hash preimage (i.e. the private key). So complexity classes such as NP (or EXP coupled with P verification), which incorporate P for verification and an intractable complexity class for proving, are the conceptual complexity class paradigm for public key cryptography.
Mathematically I am stating that if public key = hash(private key) where hash() is an algorithm in P but hash⁻¹() is an algorithm in EXP, then the algorithm for verification that public key = hash(private key) is in P thus computation is theoretically tractable. But the algorithm for computing private key = hash⁻¹(public key) is in EXP thus theoretically intractable.
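A sketch of that statement in Node, again with SHA-256 standing in (by assumption) for the one-way hash; it shows only the complexity asymmetry and is not a real signature or commitment scheme.

```js
const crypto = require('crypto');

const hash = (data) => crypto.createHash('sha256').update(data).digest('hex');

const privateKey = crypto.randomBytes(32); // the secret preimage
const publicKey = hash(privateKey);        // published commitment: public key = hash(private key)

// Verification is in P: one hash invocation plus a comparison.
function verify(claimedPrivateKey, publicKey) {
  return hash(claimedPrivateKey) === publicKey;
}

console.log(verify(privateKey, publicKey)); // true, and cheap to check

// Recovering privateKey from publicKey alone (private key = hash⁻¹(public key)) would mean
// searching on the order of 2^256 preimages, i.e. the EXP-style computation described as
// theoretically intractable above.
```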
Note the intractability of NP can be due to the cost of not being able to discern when to terminate an algorithm. Thus for some hypothetical public key cryptography algorithm in NP, unless you already know the private key for which the algorithm will terminate, finding an input state which maps to termination is potentially intractable. When NP algorithms do terminate, they do so in polynomial time, so the problem of intractability is not due to intractable running time of the termination case as it is for EXP. Nevertheless both NP and EXP can be intractable to compute, which can be employed as an advantage to enable public key cryptography. For example, information-theoretic security (such as for Pedersen commitments) is due to insufficient information, which is equivalent to unbounded entropy because breaking the security is only plausible for an omniscient algorithm, and thus is an example of a problem that is analogous to those in the NP class (except Pedersen commitments aren't boolean). Again remember from Computational Complexity Classes: Overview that such asymptotic intuitions about intractability are not absolute. Even theoretical intractability due to unbounded entropy for NP can fail under real world considerations, which thus must be very carefully studied and considered in cryptography.
...please check back again later, as I am not yet done composing this post...