Bitcoin Forum
Author Topic: SHA256d IC design question  (Read 951 times)
investorpgroovy (OP)
Newbie
January 02, 2018, 02:18:58 AM
 #1

For any ASIC/FPGA designers: I would love to have an offline discussion about this.

What is the possibility that someone has verified Verilog/VHDL for a SHA256d ASIC from a current or defunct company (BF or BM28NM, for example) that they would be willing to sell? Does anyone know if the 16nm and 10nm designs use the same logic as the 28nm? It seems to me that is the case.

Is anyone trying to complete an improved ASIC at this point?



the_electronrancher
Jr. Member
January 02, 2018, 07:00:58 PM
 #2

Do you have the money to tape one out?

The entire field of ASICs has a very narrow bloodline starting with grandpa Icarus and his kids.  There are improvements along the way, but you can see his heritage in all the children once you get to know them a bit.

Most generational changes amount to stuffing in more hashcores and making manufacturability/convenience tweaks.  There are only so many ways to unroll a loop, but you need a couple of experienced layout guys to place it by hand - you wouldn't want to run automated place and route (P&R) on the whole thing, it would perform poorly.
investorpgroovy (OP)
Newbie
January 03, 2018, 06:45:35 AM
Last edit: January 03, 2018, 07:00:02 AM by investorpgroovy
 #3

This is purely exploratory, but financing the tapeout/production is the easy part. It's the R&D that worries me.

It had occurred to me to start with Icarus, but in my experience adapting an FPGA design to an ASIC is pretty risky compared to starting with a verified design and just moving it to a new process, especially with the time factor. Honestly, I don't know what IP was used in those FPGA boards that's not synthesisable or not available to license.

I am guessing from your comment that you already know what I am thinking. I kinda figured that there had been a number of improvements since Icarus to get the hashrate per watt up. My initial thought was that if I could get my hands on a relatively recent iteration that was already verified, it would derisk a project like this.

That being said, I haven't reviewed the designs of any of the ASICs or FPGAs. If the IP is out there and it's a matter of just getting a few layout guys, well, that's pretty damn interesting.
the_electronrancher
Jr. Member
January 03, 2018, 09:33:01 PM
 #4

If you have deep pockets, you can get Chipworks to reverse engineer any chip you want.  Start with a larger geometry to save yourself some bucks.

But adapting Verilog from FPGA to ASIC is not difficult.  I can assure you that many digital designs are prototyped on an FPGA and then synthesized into an ASIC.

I would say that other than AsicBoost, the architecture changes have been small in the last few generations - it's an unrolled loop, pipelined to give one result per clock.  There's really only one answer there, so I would expect everyone's design to be very similar for this important part.
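
The "unrolled loop, pipelined to give one result per clock" structure can be modelled in software. The sketch below is a behavioural Python model, not anyone's actual design. It assumes one SHA-256 round per pipeline stage (64 stages), carries a 16-word message-schedule window along with the working state, and only handles single pre-padded blocks starting from the standard initial hash values. The names `one_round`, `run_pipeline` and `pad_single_block` are made up for illustration. The constants are derived from the first primes as FIPS 180-4 defines them.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def icbrt(n):
    """Largest integer x with x**3 <= n."""
    x = round(n ** (1 / 3))
    while x ** 3 > n:
        x -= 1
    while (x + 1) ** 3 <= n:
        x += 1
    return x


# FIPS 180-4: K comes from cube roots, IV from square roots, of the first primes.
K = [icbrt(p << 96) & MASK for p in first_primes(64)]
IV = [math.isqrt(p << 64) & MASK for p in first_primes(8)]


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def one_round(state, w, t):
    """Combinational logic of pipeline stage t: one SHA-256 round."""
    a, b, c, d, e, f, g, h = state
    if t < 16:
        wt = w[t]                      # first 16 rounds use the block words
    else:                              # later rounds extend the schedule
        s0 = rotr(w[1], 7) ^ rotr(w[1], 18) ^ (w[1] >> 3)
        s1 = rotr(w[14], 17) ^ rotr(w[14], 19) ^ (w[14] >> 10)
        wt = (w[0] + s0 + w[9] + s1) & MASK
        w = w[1:] + [wt]               # slide the 16-word window
    t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
          + ((e & f) ^ (~e & g)) + K[t] + wt) & MASK
    t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
          + ((a & b) ^ (a & c) ^ (b & c))) & MASK
    return [(t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g], w


def pad_single_block(msg):
    """SHA-256 padding for messages short enough to fit one 64-byte block."""
    assert len(msg) <= 55
    return msg + b"\x80" + bytes(55 - len(msg)) + struct.pack(">Q", 8 * len(msg))


def run_pipeline(blocks):
    """Clock a 64-stage pipeline; returns (digests, clock cycles used)."""
    feed = [(list(IV), list(struct.unpack(">16L", b))) for b in blocks]
    regs = [None] * 64                 # regs[t] = output register of stage t
    digests, cycles = [], 0
    while len(digests) < len(blocks):
        cycles += 1
        # Update back to front so every item advances exactly one stage.
        for t in range(63, 0, -1):
            regs[t] = one_round(*regs[t - 1], t) if regs[t - 1] else None
        regs[0] = one_round(*feed.pop(0), 0) if feed else None
        if regs[63]:                   # a finished hash leaves the back end
            out = [(s + i) & MASK for s, i in zip(regs[63][0], IV)]
            digests.append(struct.pack(">8L", *out))
    return digests, cycles


msgs = [b"abc", b"", b"bitcoin"]
digests, cycles = run_pipeline([pad_single_block(m) for m in msgs])
print(cycles, digests == [hashlib.sha256(m).digest() for m in msgs])
```

In this model three inputs take 64 + 3 - 1 = 66 clocks: 64 to fill the pipe, then one finished hash per clock. Real hash cores presumably differ in many details (register placement, schedule datapath, constant folding for the fixed parts of a block header), so treat this only as a picture of the dataflow.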



investorpgroovy (OP)
Newbie
January 04, 2018, 07:44:35 AM
 #5

Sure, everyone uses FPGAs for design.

To clarify a bit more: the issue I would worry about with going from an FPGA is that I don't know how it was designed. If we started with Icarus and, for example, they designed it using the "free" IP that Xilinx offers you, you start adding to the development time, not to mention timing issues and whatnot.

The last project where I acquired an FPGA design with the plan to convert it quickly to an ASIC ended up being a nightmare: it went a year beyond schedule and had crazy licensing fees (it was a complex design with 4 ARM cores, 3 levels of on-die cache and so on). It was successful in the end, but it wasn't easy.

Here is a guide to the type of issues I am talking about: http://www.onsemi.com/pub/Collateral/HBD872-D.PDF

So, I am not a designer myself and I suspect these designs are very simple. Let me ask: are you basically saying these chips are so simple that I don't need to worry about CPU core/memory timing type issues and 3rd party licensing?
investorpgroovy (OP)
Newbie
January 04, 2018, 07:50:37 AM
 #6

I mean, I am just generally exploring to try and find the lowest cost and shortest time to market. I'm not a huge fan of hiring someone to reverse engineer something if the other option is just to start with Icarus.
the_electronrancher
Jr. Member
January 04, 2018, 02:20:37 PM
 #7

I didn't say simple, I said there are not many unique ways to make an optimized SHA core that gives the ideal performance of one hash per clock once pipelined. In my opinion, only one.

You're probably going to want to license the PLL from the foundry, but the logic would be built from gates.  Some company built a hashcore they licensed out back in the 0.13 µm days; I'm not sure of the name, but it doesn't seem worth the money.

PM me the part number of that ASIC project that you mentioned, I'd like to take a look.

Entropy-uc
Hero Member
January 04, 2018, 07:21:20 PM
 #8

I believe the Hashfast design ended up in the hands of their silicon integrator, which did the design work in the first place. Similarly, Terrahash went bankrupt, so their design IP ended up in play and is probably held by someone who assigns it a modest value.  BFL - I don't know what became of them at all, but they had a design.

The catch is that all of these designs were laid out using VHDL to standard cell libraries.

Bitfury clearly demonstrated that laying out at the transistor level gave a massive advantage in power efficiency.  His 55 nm chip performed better than the 28 nm generation.  It was also buggy as hell.

I don't think it's feasible to deliver a power competitive design with standard cells.  You will need to start with transistor level design of an unrolled hashing core.  From there, it's likely there are power optimizations that are possible.  The design will likely need to be optimized thermally as well, to limit hot spots.

Delivering a working SHA256 hash core isn't all that hard.  Being competitive from a power efficiency standpoint will be difficult.  I doubt it's practical to expect you will be within 20% of Bitmain on a given node until your 3rd or 4th generation.

Good luck, but I think you will find your money would be better invested convincing a major player like AMD or NVIDIA to develop a solution.
investorpgroovy (OP)
Newbie
January 05, 2018, 03:47:53 AM
 #9

So I had assumed that all these guys were using standard IP libraries. If what you're saying is true, then it's obvious why no one has picked up the IP from a defunct firm and run with it.

Based on your comment I found a few details... seems like you have a good point.

The BFL at 28nm was, I guess, 400 GH/s at 0.27 J/GH or 1600 GH/s at 0.76 J/GH, but the layout size was massive compared to Bitmain/Bitfury. It seems like, in addition to using standard libraries, they had a significantly different design methodology.

The Bitfury at 28nm was at around 0.2 J/GH and supposedly the 16nm is 0.1 J/GH.
The Bitmain 1385 is listed at 0.18 J/GH in 16nm at the slowest speed (21 GH/s).
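
For anyone comparing these figures: energy per gigahash multiplied by hashrate gives power draw, since J/GH times GH/s is J/s. The small sketch below just runs that arithmetic on the numbers quoted in this post. The function name `power_watts` and the labels are illustrative, and the inputs are the poster's recollections rather than datasheet values.

```python
def power_watts(hashrate_ghs: float, joules_per_gh: float) -> float:
    """J/GH multiplied by GH/s gives J/s, i.e. watts."""
    return hashrate_ghs * joules_per_gh


# (label, GH/s, J/GH) as quoted in the post above.
quoted = [
    ("BFL 28nm, lower figure", 400, 0.27),
    ("BFL 28nm, higher figure", 1600, 0.76),
    ("Bitmain 1385 16nm, slowest speed", 21, 0.18),
]

for label, ghs, j_per_gh in quoted:
    print(f"{label}: {power_watts(ghs, j_per_gh):.2f} W")
```

That works out to roughly 108 W and 1216 W for the two BFL figures and about 3.78 W for the Bitmain part at its slowest speed.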

Interestingly, Global Foundries (formerly the AMD fab) fabbed the BFL device.

I think Jensen (CEO) of Nvidia already has plans to build specialized mining "GPUs" for ether.
Entropy-uc
Hero Member
January 05, 2018, 05:57:37 AM
Last edit: January 05, 2018, 07:21:19 AM by Entropy-uc
 #10

So far the publicly stated intention has been to offer mining GPUs that don't have video outputs.  The sole purpose is to prevent miners from dumping their used GPU gear onto the market when the inevitable crash comes and they can't mine profitably.

Global Foundries operates on a standard contract fab model so it's not really surprising that they built the BFL devices.

I don't think the transistor level design requirement is that big of a barrier.  Bitfury did it by himself on a kitchen table over the course of a year.  The problem is you won't find a design house willing to work that way.  They have their tool sets and their work flows and they aren't going to diverge from them.  So you will need to buy your own set of design tools and find a team of borderline Asperger's cases to do the transistor design.

Somebody should really fund a professor to do the design work under an open hardware license.  Once the transistor design for SHA256 is done you just have to bring that into the fab's design tools and optimize for placement.  Conductor losses are becoming dominant at these process nodes so that is where the biggest optimizations will be found.
the_electronrancher
Jr. Member
January 05, 2018, 06:47:44 PM
 #11

Borderline Asperger's, lol.

I'd like to learn a little more about this transistor level implementation. I'm having a hard time picturing what could reasonably be exploded or minimized in the hash core.  XOR?  It's just flops and wiring otherwise. I would be surprised if the flop was exploded, but maybe - if you have any links to check out, it would be an interesting read.
NotFuzzyWarm
Legendary
January 05, 2018, 08:03:14 PM
 #12

It is mainly all about efficient layout of the signal paths between the cores and comms. As the Cray supercomputers proved decades ago, using very short and direct pathways with minimal reliance on multiple layers has a very dramatic effect on speed and power consumption. Standard foundry IP blocks only care about function, not optimum I/O speed between the blocks.

investorpgroovy (OP)
Newbie
January 07, 2018, 07:28:02 AM
 #13

I would never settle for just borderline aspergers on my engineering teams when I can hire full on aspies instead.

My main objective was to try and figure out if there was a way to get something to market quickly enough to challenge the incumbent players. The thing that bothers me specifically about Bitmain is that they unfairly mine with the parts, driving up difficulty, and then release them into the market.

I started out in DRAM, so I know that once you get to the point where you need to do transistor level layout, it's a lot harder to jump into a new market essentially from scratch. It's also become clear to me that there is no shortcut in terms of buying the IP from a defunct company, as everything has advanced so much since anyone with a reasonably fast chip has been in the market. A better focus would be "breaking" a different algo.


That being said, it's beyond my scope of knowledge, but I wonder if there is a more efficient way to go about hashing fundamentally.

Bitmain's BM1382 calculates 63 hashes per clock cycle (Hz) and the BM1384 calculates 55 hashes per clock cycle.
BitFury's BF756C55 is claimed to have 756 cores for about 11.6 hashes per clock cycle.
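
To put "hashes per clock" in context: chip hashrate is that figure multiplied by the clock frequency, and dividing the core count by it shows how many clocks each core spends per hash. The sketch below runs that arithmetic on the figures quoted in this post. The 250 MHz clock is a made-up placeholder (no clock speeds are given in the thread), and the function names are illustrative.

```python
def hashrate_ghs(hashes_per_clock: float, clock_hz: float) -> float:
    """Throughput in GH/s for a chip finishing `hashes_per_clock` per cycle."""
    return hashes_per_clock * clock_hz / 1e9


def clocks_per_hash_per_core(cores: int, hashes_per_clock: float) -> float:
    """Cycles one core needs per hash, given the chip-level throughput."""
    return cores / hashes_per_clock


ASSUMED_CLOCK_HZ = 250e6  # hypothetical clock, NOT a datasheet value

for name, hpc in [("BM1382", 63), ("BM1384", 55), ("BF756C55", 11.6)]:
    rate = hashrate_ghs(hpc, ASSUMED_CLOCK_HZ)
    print(f"{name}: {rate:.2f} GH/s at the assumed clock")

ratio = clocks_per_hash_per_core(756, 11.6)
print(f"BF756C55: {ratio:.1f} clocks per hash per core")
```

The BitFury ratio comes out at about 65 clocks per hash per core. One possible reading, which is an inference from the arithmetic and not something stated in the thread, is that those cores iterate over the rounds instead of producing a full hash every clock.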
the_electronrancher
Jr. Member
January 07, 2018, 04:10:39 PM
 #14

Hashes per cycle means number of hash cores.  Each core is an unrolled SHA engine, so you continuously feed data into the front end, finished hashes come out the back end, and you check each result to see what difficulty a particular nonce achieved.

There is a delay in filling the engine, but once it's full each core gives one hash per clock, as each clock starts a new hash on the front end and spits out a finished one on the back end.

So you naturally want to stuff in as many copies of the engine as your little power supply lines can handle.  :)
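
The "feed nonces in, check what comes out" loop looks roughly like this in software. This is a minimal sketch using Python's `hashlib`, not how any particular chip does it. It assumes the usual Bitcoin conventions (80-byte header with the 32-bit nonce as the last field, digest compared as a little-endian integer against a target), and it uses an all-zero placeholder header and a deliberately easy target so that it finishes instantly. `scan_nonces`, `DUMMY_HEADER` and `EASY_TARGET` are names invented for the example.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the 'SHA256d' in the thread title."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def scan_nonces(header76: bytes, target: int, start: int, count: int):
    """Try `count` nonces; return (nonce, hash_value) pairs at or below target."""
    assert len(header76) == 76
    hits = []
    for nonce in range(start, start + count):
        # The 32-bit nonce is the last field of the 80-byte block header.
        header = header76 + struct.pack("<L", nonce)
        # The digest is compared as a little-endian 256-bit integer.
        value = int.from_bytes(sha256d(header), "little")
        if value <= target:
            hits.append((nonce, value))
    return hits


# Placeholder inputs: an all-zero header prefix and an easy target whose
# top 8 bits are zero, so roughly 1 hash in 256 qualifies.
DUMMY_HEADER = bytes(76)
EASY_TARGET = (1 << 248) - 1
hits = scan_nonces(DUMMY_HEADER, EASY_TARGET, 0, 2000)
print(len(hits), "of 2000 nonces met the easy target")
```

A hardware engine does the same comparison on every hash leaving the pipeline and only reports the nonces that pass, which is why the interface to the chip can be so narrow.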
Entropy-uc
Hero Member
January 07, 2018, 06:48:32 PM
Last edit: January 07, 2018, 08:31:08 PM by Entropy-uc
 #15

Quote from: investorpgroovy on January 07, 2018, 07:28:02 AM
I would never settle for just borderline aspergers on my engineering teams when I can hire full on aspies instead.

Actually you want engineers with a balance if it's any sort of a team.  With full on cases they will only be effective with a strong manager who can command respect on a technical level.  There's an amusing article on Medium from a few months back talking about an example of this; it was something like 'We fired our best programmer and it was the smartest thing I ever did'.

Semi design isn't my area of expertise.  But as an outsider I don't understand why it wouldn't be feasible to have the transistor level layout done in a platform independent way.  That would then allow design debug to be completed at a low cost node for less than $1 M; then you could focus on building at the expensive node with confidence that there won't be a catastrophe.  If that approach is feasible I don't really see why the whole thing couldn't be done in an open source fashion.  An open sourced transistor layout for SHA-256 would seriously break open the whole competitive oligopoly that exists now.  There are plenty of folks with double-digit millions from bitcoin at this point who would see the benefit, so fund raising should be feasible.

I don't think there's much promise in pursuing other algorithms.  Basically ether's algo is the only one without existing silicon and a chance to survive long term.  I guarantee you there are people working on it already.

 
the_electronrancher
Jr. Member
January 07, 2018, 07:58:07 PM
 #16

Your idea about starting at a larger node is a good one, you would certainly want to debug on a cheap process.

Quote from: Entropy-uc on January 07, 2018, 06:48:32 PM
An open sourced transistor layout for SHA-256 would seriously break open the whole competitive oligopoly that exists now.  There are plenty of folks with double digits millions from bitcoin at this point that would see the benefit, so fund raising should be feasible.

This I think is the tough part.  Bitcoin has changed from a cool open-source environment to ultra-greed mode.  Those who have the ability to do this design certainly aren't going to want to do it for free and then see some other Chinese or Russian shop take the design, kill them on manufacturing cost so the original project creators get pushed out of business, and become the next Bitmain on the originators' backs.

Entropy-uc
Hero Member
January 07, 2018, 08:34:58 PM
 #17

That's the whole point of doing the transistor design as open hardware.  You eliminate the biggest barrier to entry by putting the transistor layout into the public domain.  There still would only be a handful of folks that would go to masks at 10 nm or lower nodes, but they would be forced to keep pricing competitive because there are dozens of entities capable of entering the market.

I am sure you could find faculty that would find this a worthwhile project, and you could easily fund a few spins at an 8 inch 64 nm fab for under $1M.  That's 50 BTC.  I paid that much as a bounty to fix my FPGA supplier's garbage code back in 2012!
the_electronrancher
Jr. Member
January 07, 2018, 11:34:29 PM
 #18

Well, if you still have 50BTC you want to throw at it I have a couple of layout guys who will moonlight doing it.  Do the first tapeout at MOSIS, then buy a mask set once it's verified.  At that point, if you want to open source the GDS you're free to do so.
Entropy-uc
Hero Member
January 08, 2018, 04:12:30 AM
 #19

It would take a lot more than that.  Form a 501(c), build a project and test plan and publish a budget.  Identify a qualified team that's committed to moving forward if the needed resources are available, and the key milestones where funds are needed.

With that it really wouldn't be hard to raise the funds.  Whether it's via angel investors for a startup, or a Kickstarter approach with a public domain solution as the end point, would be up to you.
the_electronrancher
Jr. Member
January 08, 2018, 08:38:10 PM
 #20

Are you by any chance a marketing guy?