DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
January 27, 2012, 11:05:40 PM |
|
Someone who knows more than I about the process would have to expand on it a bit.

One of the largest advantages of structured ASICs is reduced power consumption. The increased NRE cost eats into any per-unit cost savings, especially for "small" (10K unit) runs. Even on larger runs (25K, 50K, 100K units) the per-unit cost is lower, but it isn't anything like 70% lower than an equivalent FPGA.

sASIC - 10K unit run
------------------------
* significantly reduced power consumption
* not a significant reduction in overall cost

Last Gen FPGA
------------------------
* will consume significantly more power than an equivalent 40nm FPGA
* could be sold at clearance to eliminate inventory (65nm is now 2 generations old)

BFL "mystery" chip
-------------------------
* significantly INCREASED power consumption
* significantly reduced cost (relative to current gen FPGA)

But, the hardcopy process has several options that can reduce costs. I briefly read up on one of the offerings that did not include some 'screen layers'(term?) or some such that were quite a bit cheaper than a full custom hardcopy were.

All hardcopy (and all sASICs) are "custom".

FPGA
- fixed logic units
- fixed interconnects

sASIC
- fixed logic units
- CUSTOM interconnects

ASIC
- CUSTOM logic units
- CUSTOM interconnects

FPGAs "waste" a lot of transistors. They form an interconnect mesh between nearby LUTs (look-up tables, the basic logic units). This mesh is what the ISE software uses to build the routing, and it gives FPGAs their flexibility. However, that flexibility has a cost. Every transistor you didn't use (and you may only use 10% of potential interconnects) costs you money and power.

An sASIC uses the same fixed grid of LUTs. This allows a design that works on an FPGA to also work on an equivalent sASIC. However, there is no programmable routing. All the LUTs are "islands". The fab makes a mask of YOUR individual custom routing, creates a routing layer, and then combines that with the standardized, mass-produced logic layer.
More complex routing requires more layers, and that means more cost.
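The NRE point above can be made concrete with a little arithmetic. The sketch below spreads a one-time NRE charge over the run size; every dollar figure in it is a made-up assumption chosen only to show the shape of the curve, not a real Altera or foundry quote, and `per_unit_cost` is a hypothetical helper.

```python
def per_unit_cost(nre, unit_cost, volume):
    """Cost per chip once the one-time NRE charge is spread over the whole run."""
    return nre / volume + unit_cost

# Hypothetical figures (assumptions, not real quotes).
FPGA_UNIT = 100.0       # assumed FPGA price per chip, no NRE
SASIC_NRE = 500_000.0   # assumed one-time sASIC NRE charge
SASIC_UNIT = 40.0       # assumed sASIC price per chip

for volume in (10_000, 25_000, 50_000, 100_000):
    sasic = per_unit_cost(SASIC_NRE, SASIC_UNIT, volume)
    saving = 1 - sasic / FPGA_UNIT
    print(f"{volume:>7} units: sASIC ${sasic:.2f} vs FPGA ${FPGA_UNIT:.2f} "
          f"({saving:.0%} saving)")
```

With these assumed numbers the saving grows with volume but flattens out, because the per-chip price sets a floor that the NRE amortization cannot go below.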
|
|
|
|
sadpandatech
|
|
January 27, 2012, 11:34:19 PM Last edit: January 27, 2012, 11:45:37 PM by sadpandatech |
|
Someone who knows more than I about the process would have to expand on it a bit.

One of the largest advantages of structured ASICs is reduced power consumption. The increased NRE cost eats into any per-unit cost savings, especially for "small" (10K unit) runs. Even on larger runs (25K, 50K, 100K units) the per-unit cost is lower, but it isn't anything like 70% lower than an equivalent FPGA.

*a very easy to follow explanation of the differences between fpga, sASIC and ASIC was here*

Thank you for taking the time to lay that out like that. It makes perfect sense.

On that HardCopy cost, I still wanna believe Altera was offering some kind of sASIC production that allowed for much smaller batches at a reduced cost. Which, if so, kinda leaves me a bit lost on just how that would work for older gens, if I'm following the manu correctly. Once they move on to a new gen with whoever their main manu is, where would they have builds for older gens done at?

Bah, I need to go back to school. I managed to squeeze in 2 years before I was married. Since then, though, I've been lucky if I could get in 4-8 cred hours a year on the next 2 year piece of paper.. >.<
|
If you're not excited by the idea of being an early adopter 'now', then you should come back in three or four years and either tell us "Told you it'd never work!" or join what should, by then, be a much more stable and easier-to-use system. - GA
It is being worked on by smart people. -DamienBlack
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
January 27, 2012, 11:39:25 PM |
|
Thank you for taking the time to lay that out like that. It makes perfect sense. On that HardCopy cost, I still wanna believe Altera was offering some kind of sASIC production that allowed for much smaller batches at a reduced cost. Which, if so, kinda leaves me a bit lost on just how that would work for older gens, if I'm following the manu correctly. Once they move on to a new gen with whoever their main manu is, where would they have builds for older gens done at?

I don't know, but they may have some fab capacity at older gens, or they may just be fabless (AMD has no fabs anymore) and pay a foundry to do the fabrication. Nvidia (and now AMD) do that. They are intellectual property companies with no manufacturing assets. Sometimes that can bite you in the ass. Just ask AMD about the 7970 missing Christmas delivery because their partner couldn't get 28nm working in time.
|
|
|
|
yrral
Member
Offline
Activity: 79
Merit: 10
|
|
January 28, 2012, 03:27:09 AM |
|
Could it somehow be that they're using the pcb as an interconnect and using the 2 fpgas as one? So they only need to implement 1/2 the sha circuitry on each?
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
January 28, 2012, 03:33:43 AM |
|
Could it somehow be that they're using the pcb as an interconnect and using the 2 fpgas as one? So they only need to implement 1/2 the sha circuitry on each?
Unlikely, because a single hash is trivially easy. Splitting work between two "nodes" only makes sense if one node can't handle it in a timely manner. It is never more efficient; there is always inter-node overhead. So building dependent parallel solutions is something you do when you have no choice. A supercomputer, for example, can't be built with a single petaflop chip. Thus you have no choice but to accept the overhead and build it with 1,000 teraflop chips. If a single petaflop chip existed you would just use that, because it would be more efficient.
|
|
|
|
RandyFolds
|
|
January 28, 2012, 03:48:56 AM Last edit: January 28, 2012, 04:03:17 AM by RandyFolds |
|
It is never more efficient; there is always inter-node overhead.
Unless of course you are talking about BFL's rig box. e: too stoned, removed extra 'of course'
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
January 28, 2012, 03:50:01 AM |
|
It is never more efficient; there is always inter-node overhead.
Unless of course you are talking about BFL's rig box, of course.

Yeah. Wonder when they will revise those stats down?
|
|
|
|
yrral
Member
Offline
Activity: 79
Merit: 10
|
|
January 28, 2012, 03:58:30 AM |
|
Could it somehow be that they're using the pcb as an interconnect and using the 2 fpgas as one? So they only need to implement 1/2 the sha circuitry on each?
Unlikely, because a single hash is trivially easy. Splitting work between two "nodes" only makes sense if one node can't handle it in a timely manner. It is never more efficient; there is always inter-node overhead. So building dependent parallel solutions is something you do when you have no choice. A supercomputer, for example, can't be built with a single petaflop chip. Thus you have no choice but to accept the overhead and build it with 1,000 teraflop chips. If a single petaflop chip existed you would just use that, because it would be more efficient.

Well, from what I understand the issue is that the intra-chip routing takes a lot of the FPGA circuitry away from what could be used to hash. So if you could somehow use the PCB to route, you would kind of have a ghetto sASIC with 2 logic units (the FPGAs) and the interconnect (PCB). Also, if the cross-chip communication (through the PCB) were pipelined, there should be no overhead from the cross-chip communication, correct?
|
|
|
|
Inspector 2211
|
|
January 28, 2012, 05:19:55 AM |
|
Could it somehow be that they're using the pcb as an interconnect and using the 2 fpgas as one? So they only need to implement 1/2 the sha circuitry on each?
Unlikely, because a single hash is trivially easy. Splitting work between two "nodes" only makes sense if one node can't handle it in a timely manner. It is never more efficient; there is always inter-node overhead. So building dependent parallel solutions is something you do when you have no choice. A supercomputer, for example, can't be built with a single petaflop chip. Thus you have no choice but to accept the overhead and build it with 1,000 teraflop chips. If a single petaflop chip existed you would just use that, because it would be more efficient.

Well, from what I understand the issue is that the intra-chip routing takes a lot of the FPGA circuitry away from what could be used to hash. So if you could somehow use the PCB to route, you would kind of have a ghetto sASIC with 2 logic units (the FPGAs) and the interconnect (PCB). Also, if the cross-chip communication (through the PCB) were pipelined, there should be no overhead from the cross-chip communication, correct?

Yes and no.
Yes, it's a compelling idea. FPGA prices rise super-linearly, i.e. an FPGA with twice the gates typically costs more than twice the dollar amount.
No, it's not a practical idea, because SHA-256 means 256 bits wide, so you need to route 256 signals from FPGA 1 to FPGA 2. While this is possible, think 10-layer PCB. Think $$$.
|
|
|
|
rph
|
|
January 28, 2012, 08:17:11 AM |
|
Yes, it's a compelling idea. FPGA prices rise super-linearly, i.e. an FPGA with twice the gates typically costs more than twice the dollar amount.
That is true at the very high end (Virtex-7), but not true for low-cost/high-yield parts like Spartan-6. The 6s150 has the lowest cost per LUT in the family. -rph
|
|
|
|
makomk
|
|
January 28, 2012, 11:39:57 AM |
|
I don't think there are any FPGAs that run at 600 MHz. More likely they are using a "larger" chip. Spartan 6-150 is used because it takes ~150K LUTs to fit a complete unrolled double bitcoin hash logic. Thus 1 hash per clock running at 200 MHz = 200 MH/s.
If their FPGAs have enough LUTs to fit 2 complete unrolled hashers each, then the board would do 4 hashes per clock: 4 x 200 MHz = 800 MH/s.

Almost, except that Spartan-6 LUTs aren't equivalent to the LUTs used in other FPGAs: about half of them are useless for Bitcoin mining because they don't have any adders. Expect somewhere in the ballpark of half that many LUTs for a completely unrolled miner on a more suitable FPGA.

Sure, but first they need to ship. If they have collected 1000 orders by the day they ship and they have only released pictures with sanded-off ICs, they have achieved their delay. If they'd put the pictures online for everybody to see what kind of units they are using, their competition could go to work right away.

That's not the main obstacle to competition though. The main obstacle seems to be how on earth to get the FPGAs at a low enough price.
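The hashes-per-clock arithmetic quoted in this post can be sketched in a few lines. This assumes fully unrolled hashers that each complete one double-SHA-256 per clock, and a board with two FPGAs as discussed in the thread; `hash_rate_mhs` is a hypothetical helper name, not anything from BFL.

```python
def hash_rate_mhs(clock_mhz, hashers_per_chip, chips=1):
    """Throughput in MH/s when every unrolled hasher finishes one hash per clock."""
    return clock_mhz * hashers_per_chip * chips

# One unrolled hasher on one chip at 200 MHz.
print(hash_rate_mhs(200, 1))

# Two unrolled hashers on each of two chips at 200 MHz (4 hashes per clock).
print(hash_rate_mhs(200, 2, chips=2))
```

The model ignores clock derating from routing congestion, which is why real boards tend to land below the ideal figure.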
|
Quad XC6SLX150 Board: 860 MHash/s or so. SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
|
|
|
bulanula
|
|
January 28, 2012, 12:25:03 PM |
|
I think all of you are so concerned with BFL technology because you are located in China and wanna counterfeit and replicate their design to sell for cheaper.

If they got such a great deal on 65nm FPGAs, why not sell them straight away at normal market price?
|
|
|
|
Costia
Newbie
Offline
Activity: 28
Merit: 0
|
|
January 28, 2012, 12:31:01 PM |
|
I think all of you are so concerned with BFL technology because you are located in China and wanna counterfeit and replicate their design to sell for cheaper. If they got such a great deal on 65nm FPGAs, why not sell them straight away at normal market price?

More profit. The reason it was cheap is probably that nobody wants it anymore except miners.
|
|
|
|
bulanula
|
|
January 28, 2012, 12:32:22 PM |
|
I think all of you are so concerned with BFL technology because you are located in China and wanna counterfeit and replicate their design to sell for cheaper. If they got such a great deal on 65nm FPGAs, why not sell them straight away at normal market price?

More profit. The reason it was cheap is probably that nobody wants it anymore except miners.

Yeah, I kinda figured that one out, but the insistence of people on here who want to copy BFL's design still amazes me.
|
|
|
|
sadpandatech
|
|
January 28, 2012, 02:38:34 PM |
|
I think all of you are so concerned with BFL technology because you are located in China and wanna counterfeit and replicate their design to sell for cheaper. If they got such a great deal on 65nm FPGAs, why not sell them straight away at normal market price?

More profit. The reason it was cheap is probably that nobody wants it anymore except miners.

Yeah, I kinda figured that one out, but the insistence of people on here who want to copy BFL's design still amazes me.

I can't speak for anyone else, but it should be apparent from what I have said here and in other posts that I would not be capable of doing so. And most of the guys around here who can are more than capable of designing their own. I think the catch is going to be that even if we could reverse it, and it was more prudent to do so than to build our own, we would not find the same discount on the chips they are using.

And your idea is actually not bad. Design the PCB and core and then just sell that and/or the chips and skip all the manu costs.
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
January 28, 2012, 03:14:05 PM |
|
And your idea is actually not bad. Design the PCB and Core and then just sell that and/or the chips and skip all the manu costs.
This idea comes up a lot, but I don't think people realize how difficult it is to do manually and how trivially cheap it is to do in bulk via an automated assembly house. For a board that size, if you had 1000 boards built at once, you are talking $10, maybe $15, per board, and that includes stuff like X-ray verification of solder joints. Assembly houses will often source all the "mundane" components. So you just give them the boards and the FPGAs, and 2-4 weeks later you have 1000 finished units.

If you gave people the parts, the failure rate on a dual FPGA board is going to be 10% or higher. Of course the company (any company) will get the negative word of mouth from all the people who destroyed their $600 toys. Trying to do assembly by hand to save money would be like buying a car in parts and assembling it by hand without the proper tools.
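A quick expected-cost comparison using only the figures in this post ($10 to $15 per board at an assembly house, a 10% hand-assembly failure rate, $600 boards). It is a rough sketch under two loudly stated assumptions: a failed hand-built board is a total write-off, and assembly-house failures are negligible. `expected_loss_per_kit` is a hypothetical helper.

```python
def expected_loss_per_kit(board_value, failure_rate):
    """Average dollars lost per kit, assuming a failed board is a total write-off."""
    return board_value * failure_rate

BOARD_VALUE = 600.0                  # "$600 toys" from the post
HAND_FAILURE_RATE = 0.10             # lower bound of "10% or higher"
ASSEMBLY_COST_RANGE = (10.0, 15.0)   # "$10 maybe $15 per board" at 1000 boards

loss = expected_loss_per_kit(BOARD_VALUE, HAND_FAILURE_RATE)
low, high = ASSEMBLY_COST_RANGE
print(f"Expected loss per hand-assembled kit: ${loss:.2f}")
print(f"Assembly house cost per board: ${low:.2f} to ${high:.2f}")
print(f"Hand assembly costs at least {loss / high:.1f}x more in expectation")
```

Even at the lower-bound failure rate, the expected loss from hand assembly is several times the quoted assembly-house price, before counting tools or the builder's time.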
|
|
|
|
sadpandatech
|
|
January 28, 2012, 03:19:48 PM |
|
And your idea is actually not bad. Design the PCB and Core and then just sell that and/or the chips and skip all the manu costs.
If you gave people the parts, the failure rate on a dual FPGA board is going to be 10% or higher. Of course the company (any company) will get the negative word of mouth from all the people who destroyed their $600 toys. Trying to do assembly by hand to save money would be like buying a car in parts and assembling it by hand without the proper tools.

That was not the customer base I had in mind. I was thinking more along the lines of small businesses looking to fab 1k+ for resale. That was not at all clear from my post though.
|
|
|
|
xaxik
Newbie
Offline
Activity: 10
Merit: 0
|
|
January 29, 2012, 02:25:42 PM Last edit: January 29, 2012, 03:25:45 PM by xaxik |
|
Part of my EP3SL150 design: https://i.imgur.com/FXfnN.png

Part of the BFL sandpapered "bitcoin processor" design: https://i.imgur.com/ejws0.png

The BFL "bitcoin processor" is an Altera Stratix III of unknown size.

The EP3SL150 FPGA configuration bitstream is around 5MB long. On their board is 8MB of flash. The EP3SL150 uncompressed bitstream size is 47 Mbit. For the EP3SL200 and 260 it is 93 Mbit, which is 11.6 MB. That doesn't fit into 8MB of flash uncompressed, but it may fit with compression. So my guess is that they use the EP3SL200 or 260. It is definitely possible to put 3 fully unrolled pipelines into the 260 (at 150MH/s each) and, with some tuning, gain the advertised speed.

To me, the case is closed. BFL just monetizes their access to chips at prices unavailable to anyone else. Now, why don't they just resell these $4k+ chips on the market??

-x

Btw, thanks for this image:
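The flash-size reasoning above is simple unit conversion, sketched below with the bitstream sizes quoted in the post. It assumes decimal units (1 MB = 8 Mbit) and does not model Altera's actual compression; the ratio printed is only the minimum a compressor would need to achieve.

```python
FLASH_MB = 8.0  # flash size on the BFL board, per the post

# Uncompressed bitstream sizes in Mbit, as quoted in the post.
BITSTREAM_MBIT = {"EP3SL150": 47, "EP3SL200/260": 93}

def mbit_to_mbyte(mbit):
    """Convert megabits to megabytes (decimal units assumed)."""
    return mbit / 8

for part, mbit in BITSTREAM_MBIT.items():
    size_mb = mbit_to_mbyte(mbit)
    if size_mb <= FLASH_MB:
        print(f"{part}: {size_mb:.1f} MB fits in {FLASH_MB:.0f} MB flash uncompressed")
    else:
        ratio = size_mb / FLASH_MB
        print(f"{part}: {size_mb:.1f} MB needs at least {ratio:.2f}:1 compression "
              f"to fit in {FLASH_MB:.0f} MB flash")
```

An 8 MB flash is oversized for the EP3SL150 and too small for an uncompressed EP3SL200/260 image, which is what points the guess toward the larger parts with compression enabled.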
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
January 29, 2012, 04:45:19 PM |
|
Solid analysis. Which makes all the cloak and dagger and (proprietary blend of FPGA and ASIC technology) so stupid. The "magic" is the low-priced last gen FPGA. Hell, BFL could provide a detailed schematic and nobody could copy them. Well, they could, except the boards would cost $5K to $10K each. Had BFL been a little more open about the tech, the issues they are having, and the solutions to rectify them, they could have built some real trust with the community.
|
|
|
|
legolouman
|
|
January 29, 2012, 05:13:08 PM |
|
I don't think they got factory left overs. I think they scooped these up from an abandoned project. That would explain why the dies have no info.
|
If you love me, you'd give me a Satoshi! BTC - 1MSzGKh5znbrcEF2qTrtrWBm4ydH5eT49f LTC - LYeJrmYQQvt6gRQxrDz66XTwtkdodx9udz
|
|
|
|