Superseded; please see
this thread, which gives a metric that doesn't rely on tau or FO4 times (for which conflicting numbers are often reported in the public literature). It also has easier-to-handle units: since the numerator involves meters, expressing it in nanometers lets very small numbers be pronounced as "XYZ hash-nanometers per second".
As more and more announcements about bitcoin-specific chips come out, it would be useful to have a metric that compares the quality of the underlying design. I recommend "hash-gate[delays] per square lambda" as the metric.
This factors out the contribution of capital to the end product, since capital can overwhelm the actual quality of the IP and give misleading projections of its future potential. A 28nm mask set costs at least 1000 times as much as a 350nm mask set, but migrating a design from 350nm to 28nm is not going to give you anywhere near 1000 times as much hashpower. This might not matter for immediate purchasing decisions -- MH/$ and MH/J matter more for that -- but it tells you how much "headroom" a given design has to improve simply by throwing more money at it and using a more-expensive IC process. Or, alternatively, how much of its performance is due to money having been thrown at it. That is important for investors (and the line between preorder-customers and investors is a bit blurry these days with all the recent announcements…)
There are two important parameters for any chip manufacturing process, and both are reasonably easy to find and not kept secret:
1. Tau, the "time constant" of the process. This is the time it takes a symmetric inverter to drive a wire high or low when the wire is loaded with four identical inverters. This is also called the FO4 delay or the normalized gate delay, and it is measured in seconds (per rise or fall). You can usually find this number for a given process by googling around for a while. This site seems to list quite a few of them. Circuit designers will often measure critical paths in gate delays, as in "the critical path through a 64-bit Naffziger-Ling domino adder is 7 gate delays".
2. Lambda, the feature size of the process. Lambda is half the width of the narrowest transistor gate, and you can always place transistors on at least a half-gate-width grid (sometimes finer but not always). A 90nm process has lambda=45nm. The most popular process-independent way of measuring circuit area is in terms of square lambdas. Not everything shrinks by the same amount when a new process comes out, but it's reasonably close.
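As a sanity check on the mm-to-lambda conversion, here is a minimal sketch (variable names are mine) using the conservative Spartan-6 die-size estimate from the table below:

```python
# Area of a ~300 mm^2 die in square lambdas, assuming a 45nm process,
# so lambda = half the feature size = 22.5nm.
lam = 22.5e-9                    # lambda, in meters
area = 300e-6                    # 300 mm^2, in m^2 (1 mm^2 = 1e-6 m^2)
area_in_lambda2 = area / lam**2  # roughly 600*10^9 square lambdas
```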
Given these two numbers, take the overall throughput of the device (hashes/sec), multiply by the gate delay time for the process, and divide by the chip area measured in square lambdas. This number can then be used to project the performance of the same design on another process, under the huge assumption that the layout won't have to change radically. That assumption is almost always false, but if the design is ported with the same level of skill and the same amount of time as the original layout, the projection is unlikely to be wrong by a factor of two or more. So I would consider this metric useful for projecting the results of porting a design to within roughly a factor of two. That might sound bad, but at the moment we don't have anything better. It also gives you an idea of how efficiently you're utilizing the transistors; once I get the numbers I'm looking forward to seeing how huge the divergence is between CPUs/GPUs/FPGAs/ASICs.
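The calculation above can be sketched as follows (a minimal illustration; the function and variable names are mine, and given the ambiguity in published tau values noted later in this post, the result for a given device may differ from the table depending on which tau convention is used):

```python
def eta(hashes_per_sec, tau_sec, area_mm2, lambda_nm):
    """Process-normalized hashpower: throughput times gate delay,
    divided by die area in square lambdas.

    hashes_per_sec: device throughput
    tau_sec:        process gate delay (FO4), in seconds
    area_mm2:       die area, in mm^2
    lambda_nm:      process lambda (half the minimum feature size), in nm
    """
    lam = lambda_nm * 1e-9                        # lambda in meters
    area_lambda2 = (area_mm2 * 1e-6) / lam**2     # die area in square lambdas
    return hashes_per_sec * tau_sec / area_lambda2

# Example inputs taken from the Ztex row of the table: 210 MH/s on a
# ~300 mm^2 Spartan-6 die, 45nm process (tau = 4.5ps, lambda = 22.5nm).
ztex_eta = eta(210e6, 4.5e-12, 300, 22.5)
```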
I propose to denote this metric by the greek letter $\eta$, from which the latin letter "H" arose. "H" is for hashpower, of course. Here is a table of some existing designs and their $\eta$-factor (I will update this periodically):
| Design | MH/s | Device | Process node, $\tau$, $\lambda$ | Area in mm^2 | Area in $\lambda^2$ | $\eta$ |
|---|---|---|---|---|---|---|
| GPU miner | ?? | GPU | ?, ?, ? | ? | ? | ? |
| Ztex | 210 | Spartan-6 | 45nm, 4.5ps, 22.5nm | 300 | ~600*10^9 | 7.981*10^-15 |
| Tricone | 255 | Spartan-6 | 45nm, 4.5ps, 22.5nm | 300 | ~600*10^9 | 9.69*10^-15 |
| Bitfury | 300 | Spartan-6 | 45nm, 4.5ps, 22.5nm | 300 | ~600*10^9 | 11.4*10^-15 |
| BFL Single | 832 | 2x EP3SL150F780 | 65nm, 9.3ps, 32.5nm | ? | ? | ? |
| BFL SC | 1500 | Custom | ?, ?, ? | ? | ? | ? |
| Avalon | ? | Custom | ?, ?, ? | ? | ? | ? |
| Block Eruptor | ? | Custom | ?, ?, ? | ? | ? | ? |
| bASIC | ? | Custom | ?, ?, ? | ? | ? | ? |
| Reclaimer | ? | Custom | ?, ?, ? | ? | ? | ? |
If anybody can add information to the table, please post below. Getting die sizes can be difficult; I know the Spartan-6 die size above is a conservative estimate (it definitely isn't any bigger or it wouldn't fit in the csg484).
It would probably make sense to renormalize all of the numbers relative to 1*10^-15, since that's the ballpark for FPGA/GPU hashers and things are only going to go up from here. Unfortunately there isn't a simple way to make that part of the units ("femtohash-gates per square lambda"? fHG/L^2? yuck.)
The tau numbers I'm using above are from the Europractice site. They seem to be exactly half of the numbers I'm used to working with; I haven't figured out yet why that is. But I'd prefer to use those for now since they give you a single source for a wide variety of processes.
This metric does not take power consumption into account in any way. I believe there ought to be a separate process-independent metric for that.
I'm going to be offline for the next five days, so I may not be able to respond to postings in this thread immediately.