UPDATED 21-Jul-2013: added column showing delivery/verification status. "Verified" means by an independent third party. "Delivered" means at least a few have been sold in arm's-length transactions (i.e. not special favors to developers or reviewers).
UPDATED: 22-Jun-2013 changed BFL numbers from post-tapeout claim (7.5GH/s) to actual measurement (4GH/s).
UPDATED: 21-Jun-2013 added Bitfury figures, 65nm corrected to 55nm (and fixed an arithmetic error).
Known Figures

Design | Hashrate | Device | Process node, λ | Area | η (H*pm/s) | status |
Bitfury 55nm | 2 GH/s | Custom | 55nm, 27.5nm | 14.44 mm2 | 2,880.45 | verified |
Avalon | 275 MH/s | Custom | 110nm, 55nm | 16.13 mm2 | 2,836.52 | verified, delivered |
BFL SC | 4.0 GH/s | Custom | 65nm, 32.5nm | 56.25 mm2 | 2,441.11 | verified, delivered |
Bitfury Spartan-6 | 300 MH/s | Spartan-6 | 45nm, 22.5nm | 120 mm2 | 28.47 | delivered |
Tricone | 255 MH/s | Spartan-6 | 45nm, 22.5nm | 120 mm2 | 24.20 | verified, delivered |
Ztex | 210 MH/s | Spartan-6 | 45nm, 22.5nm | 120 mm2 | 19.75 | verified, delivered |
BFL_MiniRig_1Card | 1.388 GH/s | 2 x Altera Arria II EP2AGX260 | 40nm, 20nm | 2 x 306.25 mm2 | 18.14 | verified, delivered |
ATI 5870 | 393 MH/s | Evergreen | 40nm, 20nm | 334 mm2 | 9.39 | verified, delivered |
BFL_Single | 832 MH/s | 2x EP3SL150F780 | 65nm, 32.5nm | ? | ? | verified, delivered |
Block Eruptor | ? | Custom | ?, ? | ? | conflicting data | announced |
Reclaimer | ? | Custom | ?, ? | ? | ? | announced |
I will list a chip in the table above when we have all of the following data:
- Hashrate, either in a claim from the manufacturer or a measurement by a third party
- Die size, either in an unambiguous claim by the manufacturer or a die photo from a third party
- Process node in an unambiguous claim by the manufacturer
- A plausible date by which independent verification will be possible.
Summary

As more and more announcements about bitcoin-specific chips come out, it would be useful to have a metric that compares the quality of the underlying design.

I recommend "hash-meters per second" as that metric. It is calculated by dividing the hashrate (in H/s) by the die area in square meters and then multiplying by the cube of the process's feature size in meters (half of the process node's "name", so a 90nm process has a 45nm feature size). If you use hash-picometers instead of hash-meters you wind up with reasonable-sized numbers.

Current GPUs and FPGAs get 8-24 H*pm/s; the three ASICs we have numbers for have η-factors around 2,400-2,800 H*pm/s -- roughly 100 times more efficient use of silicon than FPGAs and GPUs.
Migrating a design from one process to another by direct scaling --
when possible -- will not change this metric. Therefore it gives you a good idea of how the "rising tide" of semiconductor process technology will lift the various "boats".
Details

Process-invariant metrics factor out the contribution of capital to the end product, since the expenditure of capital can overwhelm the quality of the actual IP and give misleading projections of its future potential. A 28nm mask set costs at least 1000 times as much as a 350nm mask set, but migrating a design from 350nm to 28nm is not going to give you anywhere near 1000 times as much hashpower.
This metric probably does not matter for immediate end-user purchasing decisions -- MH/$ and MH/J matter more for that -- but for investors, designers, and long-range planning purposes it gives a better idea of how much "headroom" a given design has to improve
simply by throwing more money at it and using a more-expensive IC process. Alternatively, this can be seen as a measure of
how much of its performance is due to money having been thrown at it. That is important for investors -- and the line between presale-customers and investors is a bit blurry these days with all the recent announcements.
As semiconductor processes become more advanced, two important things happen:
1. The transistors get smaller (area).
2. The time required for transistors to turn on gets shorter (speed).
Area

Generally #1 (area) is indicated by the process name. For example, in a 90nm process the smallest transistor gates are 90nm long.
Chip designers refer to
half of this length (i.e. 45nm on a 90nm process) as the feature size. The feature size is half of a gate length because you can always place transistors on a grid whose squares are at least half the length of the smallest gate. Usually you get an even finer grid than that, but it's not universally guaranteed.
Therefore, to get a process-independent measure of the size of a circuit, measure the circuit's area (units: square meters) and divide that by the square of the feature size to get a unitless quantity. Well, almost unitless. Technically the units for a process's feature size are "meters per lambda" rather than meters, so the normalized area really comes out in square lambda, and the units for the final metric work out to (hash-meters) per (second*lambda-cubed).
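To make the normalization concrete, here is a quick Python sketch of the area-to-square-lambda conversion (die-size figure taken from the table above):

```
# Normalizing a die area to square lambda, using the 120 mm^2 Spartan-6
# estimate from the table above (45nm process, so lambda = 22.5nm).
area_m2 = 120e-6
feature_size = 22.5e-9                 # meters per lambda
print(area_m2 / feature_size ** 2)     # ~2.37e11 square lambda
```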
Speed

Semiconductor processes are also characterized by a measure called "tau", which is the RC time constant of the process. This is the time it takes a symmetric inverter to drive a wire high or low, assuming the wire has no load.

The raw tau factor ignores the load presented by wires and other gates, so some designers prefer to use the FO4, or normalized gate delay, instead. FO4 is the same measurement, except that each gate drives four copies of itself.
Unfortunately the tau and FO4 numbers can be hard to come by, and they frequently get mixed up with each other (one is listed where the other ought to be). Also, there is a bit of "wiggle room" in exactly how the RC circuit or loading is done, so it's common to see inconsistent numbers cited by different sources for the same process. Because of this, using tau or FO4 directly in a competitive metric is a bad idea: people will fight over which tau or FO4 numbers to use. A
previous proposal used gate delays as part of the metric, but I no longer recommend that metric since if it were to gain popularity it would inevitably lead to people playing games with the tau/FO4 numbers, picking and choosing whichever number cast their favorite product in the best light.
Fortunately, there is a fix. All we need here is a
relative comparison of two circuits. It turns out that both tau and FO4 scale more or less linearly with the gate length (and therefore with the feature size). So instead of converting hashes/sec into hashes/tau or hashes/FO4 we can use the feature size as a proxy for the gate delay time and
multiply the measure of hashes/sec by the feature size instead of multiplying by the tau/FO4 time.
The resulting number will be totally meaningless as an absolute quantity, but the ratio of this metric for two different circuits will still give the ratio of their performance on equivalent processes.
Formula

So the formula is:
(hashrate / area_in_square_lambda) * gate_switching_time
The units for this number are "hashes per square lambda" (or simply "hashes", if you treat lambda as dimensionless).
However remember that we're using feature_size (measured in meters per lambda) as a proxy for gate_switching_time since there is less wiggle room in how feature_size is measured and the two values tend to scale proportionally. This substitution gives us:
(hashrate / area_in_square_lambda) * feature_size
Since area_in_square_lambda is (area_in_square_meters / feature_size^2) we can substitute to get:

(hashrate / (area_in_square_meters / feature_size^2)) * feature_size
which is equivalent to

((hashrate * feature_size^2) / area_in_square_meters) * feature_size

collecting the occurrences of feature_size gives us:

(hashrate * feature_size^3) / area_in_square_meters

or alternatively:

(hashrate / area_in_square_meters) * feature_size^3
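For concreteness, here is a minimal Python sketch of the final formula (the function and argument names are mine, chosen arbitrarily):

```
def eta(hashrate_hps, die_area_m2, process_node_m):
    # Process-invariant hashing metric, in hash-meters per second (H*m/s).
    #   hashrate_hps:   device throughput in hashes per second
    #   die_area_m2:    die area in square meters
    #   process_node_m: the process "name" in meters, e.g. 90e-9 for a 90nm process
    feature_size = process_node_m / 2   # half the minimum gate length
    return (hashrate_hps / die_area_m2) * feature_size ** 3
```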
Example

The Bitfury hasher gets 300 MH/s:

300*10^6 H/s

It runs on a Spartan-6, which has a 300 mm^2 (i.e. 300*10^-6 m^2) die. Dividing the hashrate by the area in square meters gives:

1*10^12 H/(s*m^2)

This is why the Bitfury hasher is a convenient example -- by coincidence its hashrate in H/s happens to equal its die area in square millimeters, which makes the numbers simpler.

Multiplying the number above by the feature_size (22.5*10^-9 m) cubed (11390.625*10^-27 m^3) gives:

11390.625*10^-15 H*m/s

which is:

11.390625*10^-12 H*m/s

The SI prefix for 10^-12 is "pico", so the Bitfury hasher gets:

11.390 H*pm/s

(The table at the top of this post uses a tighter 120 mm^2 die-size estimate for the Spartan-6, which is why it lists a higher η for the same hasher.)
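Plugging the same numbers into the eta() sketch from the Formula section reproduces the result:

```
# Bitfury hasher on a Spartan-6, with the 300 mm^2 die-size estimate used in this example
print(eta(300e6, 300e-6, 45e-9) * 1e12)   # ~11.39 H*pm/s
```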
Summary

To compute the metric, take the overall throughput of the device (hashes/sec), divide by the chip area measured in square meters, and multiply by the cube of the process's feature size.

Shortcut: take the hashrate in gigahashes per second, divide by the area in mm^2, and multiply by the cube of the feature size (half the minimum gate length) in nanometers; a quick spot-check against the table is shown below.

This number can then be used to project the performance of the same design under the
huge assumption that the layout won't have to be changed radically.
This assumption is almost always false, but assuming the design is ported with the same level of skill and same amount of time as the original layout, it's unlikely to be wrong by a factor of two or more. So I would consider this metric to be useful for projecting the results of porting a design up to roughly a factor of 2x. That might sound bad, but at the moment we don't have anything better. It also gives you an idea of how efficiently you're utilizing the transistors; once I get the numbers I'm looking forward to seeing how huge the divergence is between CPUs/GPUs/FPGAs/ASICs.
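The shortcut works because the unit conversions (10^9 for GH, 10^-6 m^2 per mm^2, 10^-27 m^3 per nm^3) collapse to an overall factor of exactly 10^-12, so the result comes out directly in H*pm/s. Here is a quick Python spot-check against a few rows of the table at the top of this post (a sketch only, using the numbers as listed there):

```
def eta_pm(ghps, area_mm2, feature_nm):
    # Shortcut form: GH/s divided by mm^2, times the feature size in nm cubed,
    # gives the metric directly in hash-picometers per second (H*pm/s).
    return ghps / area_mm2 * feature_nm ** 3

print(eta_pm(2.0,   14.44, 27.5))   # Bitfury 55nm      -> ~2880
print(eta_pm(0.275, 16.13, 55.0))   # Avalon            -> ~2837
print(eta_pm(4.0,   56.25, 32.5))   # BFL SC            -> ~2441
print(eta_pm(0.300, 120.0, 22.5))   # Bitfury Spartan-6 -> ~28.5
```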
I propose to denote this metric by the Greek letter η, from which the Latin letter "H" arose. "H" is for hashpower, of course. The table at the top of this post lists some existing designs and their η-factor (I will update it periodically).
This metric does not take power consumption into account in any way. I believe there ought to be a separate process-independent metric for that.
If anybody can add information to the table, please post below. Getting die sizes can be difficult; I know the Spartan-6 die size above is a conservative estimate (it definitely isn't any bigger or it wouldn't fit in the csg484).