Bitcoin Forum
May 10, 2024, 08:56:11 PM *
561  Local / Miners / Re: Bitfury on: May 25, 2012, 04:39:15 PM
Better to just publish the specs for free; we can assemble it in a big cabinet and bolt on the fans ourselves )
Or sell boards, as basically everyone does.

That's not really the issue, as I already wrote... The bigger questions are VAT and other protective-tariff nonsense: for Americans and the Chinese the device will come out cheaper, no matter what, than for Russia, Ukraine, or EU countries. Noticeably cheaper, by more than the margin on the firmware itself or on assembly.

There is an option to put racks in Iceland; it may also be a way for someone to export their rigs (provided, of course, that your country refunds VAT on export). Iceland refunds VAT to encourage colocation development. That's an instant 20% off the hardware cost.

Iceland: the datacell.com data center; contact person: Andreas Fink, Skype: andreasfink

As a baseline, electricity costs about $0.04 ... if you negotiate, say, $0.10 - $0.11 to start with and gradually come down toward the cost price of electricity, it will be worthwhile... far more profitable than paying all sorts of VAT.

And certainly more profitable than penny-pinching on things like building a case or a rack.
562  Bitcoin / Hardware / Re: FPGA development board "Lancelot" - official discussion thread. on: May 25, 2012, 03:43:42 PM
So the real contest is time; 28nm vs ASICs.

no, i mean:

130nm ASICs will fuck 28 nm FPGAs to shit.  :D

Haha, yes I know they will, but if the 28nms come out before the ASICs they will at least have a chance of entering the market :D

no, i think 28nm FPGAs will never have chance.

too many things will happen in 2013 and 2014.

These things are what I'm excited to see. I need to get some sleep now!

About the chances of 28 nm FPGAs... I've roughly estimated what an FPGA design translates to as an ASIC. For example, if my design were translated, it would come to approximately 8.7 million transistors. That is comparable to a Pentium II design, so what we have with a Spartan-6 at 0.045 um (45 nm) is what you could get if you squeezed hard into a 0.35 um ASIC.

But squeezing a design that hard into an ASIC would be difficult, as many errors will happen along the way. It is LIKELY that ASIC builders would start with a fairly simple VHDL design, which would give approximately 3 times worse performance, and then gradually improve the technology, with about 3-4 months per iteration. I think ngzhang understands well what design mistakes mean when they reach ASIC production: it works in simulators but not in hardware.

So, back to the ASIC vs FPGA issue: I suppose that a 45 nm FPGA of the Spartan-6 class is like a 0.18 um ASIC.
Then a 28 nm Artix-7 FPGA could be like a 0.112 um ASIC (if simply scaled, but I suppose it is closer to 90 nm, because it has a CARRY CHAIN IN EVERY SLICE, AND I HAVE A ROUND DESIGN THAT USES THIS FACT AND IS 20% smaller; Spartan-6 is really bad with its Slice X stuff).
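
The scaling argument above can be put into a few lines. This is only a sketch: it assumes a single linear factor maps an FPGA process node to its "equivalent" ASIC node, which is a simplification of the post's reasoning, and the function name is made up for illustration.

```python
# Rule of thumb from the post: a 45 nm FPGA behaves like a 0.18 um (180 nm) ASIC.
# Assumption: the same linear factor applies to other FPGA nodes.
FPGA_NM = 45
ASIC_EQUIVALENT_NM = 180
SCALE = ASIC_EQUIVALENT_NM / FPGA_NM  # 4.0


def asic_equivalent_nm(fpga_nm, scale=SCALE):
    """Return the ASIC node (in nm) that a given FPGA node is assumed to match."""
    return fpga_nm * scale


print(asic_equivalent_nm(28))  # 112.0 nm, i.e. the 0.112 um figure in the post
```

The post then argues that Artix-7 should land closer to 90 nm than this plain scaling suggests, because of its per-slice carry chains.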

Then an interesting thing about FPGA prices: they will fall if volumes are large. This is why I insist on making FPGA-based products better, with better pricing, to make them at least competitive against ASICs.

Also, the chip production cost for a vendor like Xilinx or Altera is not much more than the silicon cost.... So producing a Spartan-6 or a Virtex-7 does not make much difference in raw material / work cost. If they wanted, they could sell, say, a 6.8 billion transistor chip for $60-70 rather than $1k-$2k for specific needs, and they would still earn profits. This is a huge risk for ASIC builders. Such a chip would indeed be very powerful and would definitely blow away a low-end 90 nm ASIC solution. And this is what could happen: Xilinx or Altera will just lower prices for some specific application of their chip to take a share of it. But of course this will only happen if sales are more or less significant, say if all together we reach levels of 10k chips per month.

So there's no "cheap and secure" entry into the ASIC world. Those who go with 90 nm will still compete with FPGAs. And it is only a question of how FPGA sales and production are organized, i.e. whether FPGA device vendors would have expenses so high that they could not resist such an ASIC.

The killer solution, however, would be to get a 28 nm chip with a SIMPLE design on the first order. It is doable, I believe, for about $4 - $6 mio. But I doubt that someone would invest that today. Some day it will happen, of course. I have already requested quotes from multiple companies about ASICs while doing the FPGA-based design, and typically a 90 nm chip with an investment of about $500k could blow away Spartans, but would find it hard to compete against 28 nm.

So, please comment? Is this just a hobby for you, so that you would not want to stand head-to-head with upcoming ASICs, or not?

Why do you think that 28 nm would not compete?

You have probably worked the most on bitstream design to date as well. I'm interested to hear your point of view: where did I make a mistake?




563  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 25, 2012, 02:10:20 AM
Also hole in center - but I cannot fit there round exactly as below - lacking 2 slices and it is difficult to make workaround for that,
I have one more question/suggestion. I understand that you're using LUT-RAM to store the FIPS-180-x constants. I also guess that each hashing cell in your design has a dedicated private copy of that ROM. Have you tried pairing (or quad-ing) the neighboring cells to share a single ROM storage? If the cell clocks are in sync the neighbors could share those without paying too much cost in switch/route overhead.

I'm just asking, I'm not trying to question your skills or anything like that. I'm not involved in Bitcoin mining and I read this board for intellectual stimulation only.

Why not wait until full disclosure? It would be interesting anyway... I think you cannot even imagine the scale of these optimizations, as you are asking the questions that I started to ask when I began this journey :-) I would even describe the storyline of how the different decisions were made and, if I have enough time for that, arrange it so that you could try to guess what was done next, without disclosing it before you read the next page. Just be around to remind me to do this.
564  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 25, 2012, 02:01:23 AM
I can't disprove you, and didn't intend to.  I was just asking what your price point was, should the community prove both willing and able to hit that price point.  I consider competition a good thing, but it's still replication of work, and if it can reasonably be avoided, that benefits all concerned.  Replication of work can't always reasonably be avoided.  I'm personally not interested in an fpga solution, even though I would consider it technically superior.  I'm personally interested in an asic solution, because I can use the waste heat to warm my home in winter, so I believe that basement mining operations will always be a factor, because professional mining operations not only have to deal with the overhead of dumping heat, they also have to outperform the basement miners to such a degree as to overcome the economic advantages of co-generation.

Well, an ASIC is much more expensive. If you get a simple ASIC that just works, with 100% success probability at an affordable level, that would be a rather slow 90 nm or 130 nm chip; it would be like what 28 nm chips could give at their best performance with a special price from FPGA manufacturers. Say Altera or Xilinx decided to sell their chips for less but in larger quantities for mining: then the ASIC project would end in failure. And it would take a long time to develop as well.

If you want to make a really tricky and fast ASIC, then there could be epic failures, and the risks are very high. I would like to design such an ASIC, but only if backed by investments of about $1 - $1.2 mio and with the possibility to order a multiwafer run several times, with runs being thrown away as "bad" results.... But when I talk about those conditions, people say that I am completely crazy. Yet it is true: without doing actual experiments you will not get TOP results, as by going the proven way you will of course get proven results... Or even simpler things like eASIC or Altera HardCopy.

However, a really fast ASIC will outperform, but a lot of money could simply be thrown away on it, so there will be gradually improving and competing ASIC solutions. And this will be a risky hop... Look how difficult the hop from GPU to FPGA is, even though FPGAs are superior... Still, most miners use GPUs, and I still see much room to promote GPU mining when these cards are installed at home. With ASICs it will be even more difficult. So for ASICs to succeed, difficulty AND the exchange rate should rise to a level where the whole mining market increases in size. Currently about $1 mio per month is mined; if the exchange rate does not rise, it will be about $0.5 mio per month... For a low-end ASIC it would be a tough game, especially when you consider that an optimized ASIC or discounted 28 nm chips would knock it out of the game even before it recovers its NRE costs.

Also, ASICs must be available to interested parties, because I myself am interested in there being no single party in bitcoin that owns the network... And if that happens, I'll work really hard to prevent this condition. Also, I see that FPGAs can be competitive with ASICs, while GPUs would have a much bigger gap in performance, so an ASIC takeover of GPUs is much more likely than an ASIC takeover of FPGAs.

+++ for mining co-generation, that's really nice. I can now build a pool and heat it in summer :-) Without mining I would not waste heat on that :-) Now I have more than enough useful heat. However, many still think that it is a crazy way.
565  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 25, 2012, 01:40:14 AM
BTW re power Molex: Remember, Molex chain has a maximum of 5 amps/60 watts of 12v, and 5 amps/25 watts of 5v. That is not per plug, and a lot of chains have 3 or 4 plugs on it. PCI-E 6 is 2 amps/75 watts of 12v, PCI-E 8 is 4 amps/150w of 12v, P4 is 16 amps/192 watts of 12v, EPS12v is 32 amps/384w of 12v.

If you're putting 6 or 8 FPGAs on a board inside the rig, you're going to want to be using PCI-E plugs not Molex.

6 A per pin limit for Molex from the datasheet, if I remember correctly, and about 100 mating cycles guaranteed before the tin degrades. So this should not cause big problems. 4 pins are used: 2 pins for +, 2 pins for -, powered at 12 V.
The only problem is how to make an efficient down-conversion from 12 V to 1.2 V. I used a TPS40090-based one, and its COP and size are not very nice, but it works without any problems.

Molex's peripheral cable uses AMP 1-480424-0, AMP 60619-1, AMP 1-480426-0, and AMP 60620-1 for the socket housing, socket, pin housing, and pins respectively. This is rated for 13 amps. However, these are commonly wired with 18 AWG wiring. Putting 13 amps through this will melt the wiring, let alone 13 amps on every single plug in the chain. And although it has 4 pins, only one set are for 12v, the other set is for 5v.

1.5 sq. mm wires. ( AWG 15 ).


Also, I hope you're using enterprise rated PSUs. You don't need redundant, but the largest non-redundant PSUs you can typically buy are in the 850w to 900w range. Not quite the 1100w you're aiming for. They do make 600-700w 2U non-redundant PSUs though, so you can always double up.

For the current big bf-110 we used an RSP-3000-12: 200 A @ 12 V. The boards, however, consume only ~160 A. So there's reserve for overclocking or for a grid under-voltage condition. I expect to put something nice and reliable into the box as well. They're not cheap, but it is crazy to use a cheap PSU to power such expensive equipment.
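
To put a number on that reserve, here is a small sketch using only the figures quoted above (200 A rated, ~160 A drawn, 12 V rail). The function name is made up for illustration, and the load is treated as an exact 160 A, which is an approximation.

```python
def psu_headroom(rated_amps, load_amps, volts=12.0):
    """Return (rated watts, load watts, unused fraction of the rated current)."""
    rated_watts = rated_amps * volts
    load_watts = load_amps * volts
    reserve = (rated_amps - load_amps) / rated_amps
    return rated_watts, load_watts, reserve


rated_w, load_w, reserve = psu_headroom(200, 160)
print(f"rated {rated_w:.0f} W, load {load_w:.0f} W, reserve {reserve:.0%}")
```

So roughly a fifth of the supply's rated current is left over for overclocking or for a sagging grid.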

Quote
Well, I am now using 3-phase 0.4 kV L-L at 33 amps when the chiller is running and at 22 amps when it is not. I think that if someone wants to install many of these things at home, he should invest in a cable connected directly to the transformer so as not to disrupt power distribution for other houses. How much would it cost me if I asked a utility company in America for, say, 50 kW of power to my household?

I have installed PSUs with power factor correction (PFC), and that is important, as otherwise power losses would be significant.

About DCs I reasoned this way: 350 W per 1 U x 4 = 1.4 kW.... Otherwise how would they set up 4 x 1 U @ 350 W?

In the US we use 240v single phase for electric ovens, dryers, electric water heaters, and larger air conditioners. It's provided as two hots and a neutral, with the two hots being 120v lines that are 180 degrees out of phase (so peak + on one is peak - on the other). It seems that Europe uses 3 phase 400v for the same reasons.

If we're limited to 4 120v 20a circuits per rack (which is what I've been told is a common limit, and you have to pay extra for the other two), and each 4U uses 9a (= 1100 watts) and we place an empty U between each (for airflow/cooling and easy maintenance reasons), that's 40u used in a 42u rack, or 8800w.
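
The rack budget in this paragraph can be checked with a short sketch. All inputs are the figures quoted above; the function and parameter names are made up for illustration, and the rule that a unit must sit entirely on one circuit is an assumption.

```python
def rack_plan(rack_u=42, unit_u=4, gap_u=1, unit_watts=1100,
              unit_amps=9, circuits=4, circuit_amps=20):
    """Estimate how many 4U boxes fit in a rack by space and by circuit capacity."""
    units_by_space = rack_u // (unit_u + gap_u)
    # Assumption: each box is fed from a single circuit, so count whole boxes per circuit.
    units_by_power = circuits * (circuit_amps // unit_amps)
    units = min(units_by_space, units_by_power)
    return {
        "units": units,
        "rack_u_used": units * (unit_u + gap_u),
        "watts": units * unit_watts,
    }


print(rack_plan())  # {'units': 8, 'rack_u_used': 40, 'watts': 8800}
```

With these numbers space and power happen to give the same limit: eight boxes, two per 20 A circuit.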

You shouldn't include your chiller, because a DC provides its own cooling. So, 22 amps of 400v is 8800w.

So, either your math is wrong or you misstated something, because we're coming out to the same power usage.

I have not misstated. 0.4 kV L-L means about 230 V phase-to-neutral as fed from the transformer. By the time it gets to me, due to cable losses, it drops to the standard 215 - 220 V phase-to-neutral. And then
22 x 218 x 3 = 14.4 kW approx.
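
The calculation above as a sketch, for anyone who wants to plug in their own numbers. It assumes a balanced three-phase load and a power factor of 1.0 (the post mentions PFC supplies but gives no figure); the function name is made up for illustration.

```python
import math


def three_phase_watts(phase_to_neutral_volts, amps_per_phase, power_factor=1.0):
    """Real power of a balanced three-phase load measured phase-to-neutral."""
    return 3 * phase_to_neutral_volts * amps_per_phase * power_factor


# 0.4 kV line-to-line corresponds to 400 / sqrt(3) volts phase-to-neutral.
print(round(400 / math.sqrt(3), 1))   # 230.9
print(three_phase_watts(218, 22))     # 14388.0 W, i.e. about 14.4 kW
```

Multiplying 22 A by 400 V gives 8800 W only if the line-to-line voltage is treated as a single-phase supply, which is where the two estimates diverge.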

I am not using the 4U design right now; the 4U design is still being planned... Look at the photos on www.bitfury.org: those are the bf-110 units used for mining now!

566  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 25, 2012, 01:26:06 AM
I understand that developing such a product requires huge investments and risks, but this community is rabidly pro open source.  You might succeed at this for a time, but someone is going to replicate it and release it to the wild eventually.  I'm sure that you understand that, so the goal is to get as much profit from it as you can before the open source developers catch up.  May I offer an alternative to all this?  If you have an idea of how much revenue you expect to earn, let us know.  You can proceed with your licensing method however you see fit, but if the community can raise the funds, would you accept a collective donation in order to release the bitstream open source?

Well, here's the plan: finish the design of a smaller device (4U or 2U sized; we are debating that with shalab.si) by 20th-30th June. To accomplish that, I actually need the spec ready and the circuit in an approved state by 6th-15th June. So if there is no constructive discussion here about the device by 1st June, I'll go and finish it alone. As far as I understand, for many people this is just a nice hobby... But I am the kind of person who dislikes things half-done and half-working... So I've already invested half a year full time plus many nights in it, and all my unspent cash as well, and I expect others who would like to deliver hashing power to take a more serious approach too. About open source: if someone did this "as a hobby", investing about 10 hours per week, he would need approximately 40 weeks to use all the clues I have already given to make the design. By that time I would be glad to know that person, because his skill would be great; I would be even more excited if someone made it reach 350-370 Mh/s without raising the VCCINT voltage.  But I consider the chance that someone replicates it within the next 33-50 days AND PUBLISHES IT AS OPEN SOURCE to be low. By that time _WE_ will have everything ready for deployment. And yes, I love competition, as it forces you to do your best and stay one step ahead.

Meanwhile, I am still negotiating over the design: who will assemble the boards and where, who will assemble and deliver the devices, and who will sell and service them worldwide. Then we'll start production. I aim at a 2-3 month lead time, but during the first period I plan to reinvest all income into device production to lower lead times and make an interesting offer: basically, you pay more if you get the device quickly. So if someone would like to install 10 devices, he could start testing the first one tomorrow.

During the production period I expect to ramp up from 500 chips per month to 1000-1200 chips per month delivered, within a 6-month frame. That is about 4500 - 5000 chips delivered. But it could be better if additional financing came into play, which would allow using other equity to finance production of bitcoin farms instead of just paying in cash. Actually, I would like to push Spartans hard and set the target higher, at 10k chips, but that would be difficult to reach.
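
As a rough check of the 4500 - 5000 figure, here is a sketch that assumes the ramp is linear month over month. The post does not say what shape the ramp has, so that is purely an assumption, and the function name is made up.

```python
def linear_ramp_total(start_per_month, end_per_month, months=6):
    """Total chips delivered if monthly output rises linearly from start to end."""
    step = (end_per_month - start_per_month) / (months - 1)
    return sum(start_per_month + step * m for m in range(months))


print(linear_ramp_total(500, 1000))  # 4500.0
print(linear_ramp_total(500, 1200))  # 5100.0
```

Under that assumption the totals land close to the range quoted above.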

So the licensing expectations are $100k - $200k for chip licensing and $25k-$50k for bitstream hashing power improvements within the current offer.

Then, when Spartans are no longer hot, all of the existing solution will go open source, and the main production line will switch to Artix, so the price per Mh/s will drop once more and the profit margin for manufacturing will be about the difference between a DIY Spartan-6 and an Artix-7 solution.

So what is the problem with open source? I am pro open source, but I really doubt that donations can raise such amounts for a project as unpopular as a Spartan-6 miner. If it were a mass product with, say, $1 - $10 donations, it would work, however. That is the problem I see: the amounts are high compared to the share of users interested in it within the current bitcoin community. Board manufacturers are unlikely to jump in. Miners possibly, but that depends on how much they would invest in the disclosure and how many chips they would then get. And I believe many miners would just hate this initiative, because they would prefer that things not change, that they still get the 50 BTC coinbase, etc. etc.

But if I am wrong, then I expect you to prove it. Let's develop a procedure for how fundraising will be performed, how much effort is needed to advertise it, and who will oversee the funds (say, what happens if only 30% of the target amount is raised, as the funds should then be returned). As of today I am not bound by any agreements about the use of this bitstream, but I think it would be a fair and necessary step on my part to use the raised money to make a buy offer to all current co-owners of the current equipment, because an open-sourced bitstream could cause a sharp rise in difficulty. Also, this should be done relatively quickly, within a 4-week time frame, because when the design is finished I will definitely enter into binding agreements with different parties and will not be able to decide alone whether this should go open source or not, and it could be difficult to reach consensus. But anyway, this will be an interesting case: if it works, then the best bitstreams will definitely become available for many chips, and nice competition among designers will arise.
567  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 24, 2012, 11:08:06 PM
JTAG might work but I don't think you're supposed to use JTAG that way. If you can make it work and not damage the hardware, go ahead, it's your product, all of us will just treat it like a black box. BTW, BFL minirigs are using SATA connectors for their serial communication between boards, and I've seen other designs repurpose SATA connectors internally or Infiniband connectors externally.

Well, yep. JTAG could devastate things.... So it is better to leave only bitstream programming and communication with the chips there.

BTW re power Molex: Remember, Molex chain has a maximum of 5 amps/60 watts of 12v, and 5 amps/25 watts of 5v. That is not per plug, and a lot of chains have 3 or 4 plugs on it. PCI-E 6 is 2 amps/75 watts of 12v, PCI-E 8 is 4 amps/150w of 12v, P4 is 16 amps/192 watts of 12v, EPS12v is 32 amps/384w of 12v.

If you're putting 6 or 8 FPGAs on a board inside the rig, you're going to want to be using PCI-E plugs not Molex.

6 A per pin limit for Molex from the datasheet, if I remember correctly, and about 100 mating cycles guaranteed before the tin degrades. So this should not cause big problems. 4 pins are used: 2 pins for +, 2 pins for -, powered at 12 V.
The only problem is how to make an efficient down-conversion from 12 V to 1.2 V. I used a TPS40090-based one, and its COP and size are not very nice, but it works without any problems.

You're probably only going to see GPIO on ARM boards meant for embedded usage, otherwise you're looking at a custom build of a standard USB->serial controller. Raspberry Pis are like $25 and have enough power to run cgminer, so if you can get them in bulk it seems they'll do what you want.

Problem is you might run out of GPIO pins because you'd need one to every controller (one per 4?).

Just whatever you do, don't put USB cables in the case. They easily get disconnected during transit and sometimes just unplug themselves due to case vibration or high pressure air cooling.

That's it! I felt that USB is good only when you connect a single-board device to your home PC :-)

Quote
Quote
200 Mh/s bitstream - would produce 18 Gh/s - that is 71% of a Mini-Rig, so the product price could be $10,859.
250 Mh/s bitstream - would produce 22.5 Gh/s - that is 89.2% of a Mini-Rig, so the product price could be $13,643.
300 Mh/s bitstream - would produce 27 Gh/s - that is 107% of a Mini-Rig, so the product price could be $16,365.
325 Mh/s bitstream - would produce 29.2 Gh/s - that is 116% of a Mini-Rig, so the product price could be $17,722.
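
A sketch of how a table like this can be reproduced. Two inputs are assumptions: the 90-chip count is implied by 18 Gh/s at 200 Mh/s per chip, and the Mini-Rig hash rate of 25.2 Gh/s is back-computed from the percentages rather than stated here. The $15,295 Mini-Rig price is the one mentioned elsewhere in this thread. Names are made up for illustration.

```python
CHIPS = 90            # assumption: implied by 18 Gh/s at 200 Mh/s per chip
MINIRIG_GHS = 25.2    # assumption: back-computed from the percentages in the quote
MINIRIG_USD = 15295   # Mini-Rig price mentioned elsewhere in this thread


def rig_estimate(mhs_per_chip):
    """Return (box Gh/s, share of a Mini-Rig, price at the same $ per Gh/s)."""
    ghs = mhs_per_chip * CHIPS / 1000
    share = ghs / MINIRIG_GHS
    return ghs, share, share * MINIRIG_USD


for mhs in (200, 250, 300, 325):
    ghs, share, usd = rig_estimate(mhs)
    print(f"{mhs} Mh/s -> {ghs:.2f} Gh/s, {share:.1%} of Mini-Rig, ~${usd:,.0f}")
```

The prices this prints differ from the quoted ones by up to about a percent, apparently because the quote rounded the percentage before multiplying.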

Remember, if you're overclocking these, provide some way to underclock+undervolt these to extend the life of these once diff goes too high in 3-4 years. This is a big feature that a lot of people are asking for.

These are not overclocked; overclocking could give +20 - 25%, but would consume more power and possibly damage the chips permanently.

Wait, you're doing 325 mh per Spartan 6 SLX150? Holy crap man. Is that just back of the napkin math, or do you have a working bitstream that can do that?

Not working yet... The bitstream requires approximately 300 hours of additional complex work to make that work. If you look carefully, there are approximately 1200 unused slices in the left part of the design, plus approximately 40 DSP48s and 80 BRAMs :-) My estimate is that I can definitely fit 6 rounds there; 8 sha256 cores that work at the same clock rate as the rest of the design would be quite difficult, however. There are also about 550 slices left in the top area, again enough to fit 2 sha256 cores. Then there is the hole in the center, but I cannot fit a round there exactly like the ones below: it lacks 2 slices, and it is difficult to work around that, but maybe I can put a special smaller round there that would run at 2/3 of the main clock. So it would certainly be 90/65 x clock speed compared to the current 82/65 x clock. If there is a request to implement it (as I stated in the license, I would jump in at approximately 5000 chips at the $5 per chip level), I will start doing that work. The 90/82 speedup is a minimal estimate; it could actually become 92.6 / 82, but I don't know if I could really get that... It depends on whether BRAMs are used or not. If BRAMs are used, then 8 additional cores seem feasible, or at least 7 cores... If, however, they do not work even at lower clocks, or need an additional set of registers after them, then I'll have to go back to the concept without BRAMs, and that will give 6 rounds. I've estimated this part already; it is very quick... The more or less difficult part is writing correct code: I believe it would take about 60 hours for the two different models, DSP-BRAM and DSP-noBRAM, and about 40 hours for the top rounds modifications... And then the most interesting game starts: fitting it into the chip :-) That would require lots of design modification, manual separation of control wires, and checking how the routing of these new cores would intersect with the routing of the big design.... This will take all of the remaining time...
Maybe, if I'm lucky, it will be done in another 100 hours, but from previous experience that is not how it goes: you only think that you are "near", while in reality you have not covered half the road.... That's like estimating the distance to an oasis in the desert :-)
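
The speedup ratios mentioned above work out as follows. This sketch only divides the quoted figures (82, 90 and 92.6 over a common 65); it makes no claim about what those numbers count inside the design.

```python
from fractions import Fraction

DENOMINATOR = 65
current = Fraction(82, DENOMINATOR)
minimal = Fraction(90, DENOMINATOR)
optimistic = Fraction(926, 10 * DENOMINATOR)   # 92.6 / 65

# The common denominator cancels, leaving 90/82 and 92.6/82.
print(f"minimal speedup:    {float(minimal / current):.4f}")
print(f"optimistic speedup: {float(optimistic / current):.4f}")
```

So the extra work would buy somewhere between roughly 10% and 13% more hash rate at the same clock.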

Still, consider some method to undervolt them. At some point within the next 2-3 years for most users, the cost of electricity will exceed the value of the coins, but undervolting will catch them up for at least another year of usage.

I actually think that a programmable voltage in the range 1.15 V .. 1.45 V would be good. If someone decides to take a risk with the Spartans to get higher returns, that's OK; their devices would give +20%. If one decides to go for lower electricity consumption, then switch to 1.15 V and lower clocks. It just adds a challenge for the PSU: it should still be efficient across the whole range of power consumption.

Quote
Also, another thing, make sure the total power usage for the box fits so an integer number of these fits on a 120v 20a  (the most common circuit in DCs, you often get two of these per rack). ~1000 watts each would be fine if you intend on putting two on a circuit.

Well, one 4U box would consume about 1.3 - 1.5 kW; that's 10.8 - 12.5 A @ 120 V, which seems bad.
With 11-12 boards, however, it would be no more than 10 A @ 120 V.
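
The conversion behind those figures, as a sketch. It ignores power factor and PSU efficiency, which is an assumption, and the function names are made up for illustration.

```python
def amps_at(watts, volts=120.0):
    """Current drawn from the wall for a given power, ignoring power factor."""
    return watts / volts


def max_watts(amps_limit, volts=120.0):
    """Largest load that stays within a given current limit."""
    return amps_limit * volts


print(round(amps_at(1300), 1), round(amps_at(1500), 1))  # 10.8 12.5
print(max_watts(10))                                     # 1200.0
```

So keeping a box under 10 A at 120 V means staying at or below about 1.2 kW.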

Well, on typical hardware, you shouldn't exceed 10 amps on 120v. That gives you a 1200 watt continuous PSU. Two of those will just barely fit on an enterprise 120v 20a line in a DC and hopefully not trip it, or one will fit on a household 120v 15a line (note: household lines _suck_, never drive them at >12a 24/7).

There ARE servers that require 208/240v service in DCs, usually some nearline data warehousing server that you shove 40 drives in, but a lot of DCs don't offer this or require a special order for it, and no house in America would typically have that service outside of an electric oven range or dryer socket.

Well, I am now using 3-phase 0.4 kV L-L at 33 amps when the chiller is running and at 22 amps when it is not. I think that if someone wants to install many of these things at home, he should invest in a cable connected directly to the transformer so as not to disrupt power distribution for other houses. How much would it cost me if I asked a utility company in America for, say, 50 kW of power to my household?

I have installed PSUs with power factor correction (PFC), and that is important, as otherwise power losses would be significant.

About DCs I reasoned this way: 350 W per 1 U x 4 = 1.4 kW.... Otherwise how would they set up 4 x 1 U @ 350 W?
568  Bitcoin / Hardware / Re: Algorithmically placed FPGA miner: 245MH/s/chip and still rising on: May 24, 2012, 09:07:27 PM
About "there are no long lines": I've already commented, but I will try to draw where exactly the epic fail for the parallel expander is....

say computing w0+w1 and feeding to w9:

                                        ---+---------------------------------
                                   ---+---------------------------------
                              ---+--------------------------------
                          ---+-------------------------------
                     ---+------------------------------
                ---+-----------------------------
           ---+----------------------------
      ---+----------------------------
 ---+---------------------------
w0 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13 w14 w15 w16

How many wires? The biggest cross-section just for that? 9x32 bits :-)
The same happens when pushing w9 to w16... and w14 to w16...
Too lazy to calculate, but it is near a 512-bit cross-section...

I've thought about this - it actually prevented me from falling asleep yesterday night.

A fully unrolled, pipelined miner would use 125 single columns, or maybe 125 double columns. The latter case leads to the dreaded U-turn.
The "current" sixteen w values, 32 bits each, would be percolated down the 125 columns as well. At 4 slices per 32 bits,
that's 64 slices.
I see this, for instance, from ZTEX's source code.

Now, in order to retrieve w[i-16], 32 VERTICAL wires are needed.
In order to retrieve w[i-15], another 32 VERTICAL wires are needed, but, as you correctly point out, w[i-16] and w[i-15] can be combined, so only 32 wires are needed. w[i-7] can be added to the sum of w[i-16] and w[i-15], so still only 32 vertical wires are needed. Likewise for w[i-2].

When I said that in a fully unrolled miner no long lines are needed, I meant no HORIZONTAL long lines.
For instance, in ZTEX's Verilog code, references are made to the current stage and to the prior stage only.

OK, I don't know off the top of my head how many vertical wires are available in a Spartan-6, but I just tried to make the case that only 32 are needed. If 32 are not available in a single column, then two columns per stage have to be used, which leads to the dreaded U-turn.
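
For readers following the w[i-16], w[i-15], w[i-7], w[i-2] discussion, here is the SHA-256 message schedule from FIPS 180-4 written as plain software. It is a reference sketch of the recurrence only and says nothing about how either poster places or routes it on the FPGA; the function names are made up.

```python
MASK32 = 0xFFFFFFFF


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def small_sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def expand_schedule(block_words):
    """Expand 16 input words into the 64-word schedule W[0..63]."""
    w = list(block_words)
    for i in range(16, 64):
        # Each new word depends on words 2, 7, 15 and 16 positions back,
        # which is why the wiring question in this thread arises at all.
        w.append((small_sigma1(w[i - 2]) + w[i - 7]
                  + small_sigma0(w[i - 15]) + w[i - 16]) & MASK32)
    return w


print(hex(expand_schedule([0, 1] + [0] * 14)[16]))  # 0x2004000
```

In a fully unrolled pipeline each stage carries a sliding window of 16 such words, and the four taps above are the only ones that have to reach back across stages.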

Well, the interconnect runs between switches. There are 2 slices per switch (slice L/M and slice X), so the chip is about 70-72 switches wide (including BRAM/DSP) and 192 switches tall, which means the horizontal quad interconnect is 2.5 times larger than the vertical one... So it is wise to use the horizontal interconnect for the W round... I tried such designs; however, they start consuming switches when you want to make a U-turn, because you have to add some registers there, etc... painful and tough...
569  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 24, 2012, 07:51:32 PM
Second, I was surprised by the reaction to the comparison with Butterfly Labs under 'Estimated price'. People, that $90k includes 20% VAT, which is not paid in the US, for example, and assumes it would have a single hot air outlet and that's all... but to get a Mini-Rig we would pay +20% customs VAT, so it would be not $15,295 but $18,354 + shipping costs,

Remember, US customers don't pay VAT, so you've probably quoted a EU price that is useless for US customers. If you can get BFL Minirig prices on Spartan 6 hardware, BFL is screwed.

Well, if we count on bulk purchases by miners directly through the assembly house without any intermediary, then BFL is definitely blown away, as the BitFury-110 was built for less money than BFL asked, considering that we pay VAT. And it is not possible to bring that many BFL units through customs and declare them as personal items that nobody cares about.

Quote
Third, 300 Mh/s is not the limit of this bitstream; it can fit even more rounds, if you consider that almost all DSPs in the left part of the design are unused, and there is some free space in the topmost part. So it could definitely give at least 8% better performance; it would then cost us $0.57 per Mh/s, which is even less than the BFL Mini-Rig.

Someone said in another thread that the sea of hashers design will have three types of cores, ones near DSPs, one near BRAM, ones near neither.

There are already three types, though they have more or less in common. One difficult piece is the 28-bit adder in the middle, as the slices are cut there for PLL/DCM modules. It was tricky to implement, but it is done.

Quote
I've been asked by email and Skype about smaller editions. In my opinion the best solution would be a standard 4U chassis, 0.5 meter long, with 14-15 boards of 6 chips each (that is 84 - 90 chips), as in shalab.si's original ideas, but with a slightly different layout to prevent the chips from overheating.

That gives people mostly what they want.

Yep, and my initiative is to make things more or less compatible. Why invent multiple fancy form factors with fancy fans, when supporting them would be a nightmare for the miner? Mining is not so profitable that you can afford to play with the equipment for long days and nights. Unfortunately, instead of getting exact numbers / requests / discussion about the future design here, we have a more general discussion. Every designer still wants to invent his own kind of wheel. I understand this, of course, even in myself, because every developer thinks that he understands the details better, which is actually true... But when it comes to maintenance, it is better to have simpler things, as THE MOST RELIABLE THING IS THE THING THAT DOES NOT EXIST.... So there's a tradeoff between creativity and the will to produce maintainable and reliable devices.

Quote
Within this chassis a single Intel Atom D525 motherboard is installed. Boards with 6 Spartans can even do without microcontrollers and flash; everything could be programmed directly via the LPT port. The bandwidth required to communicate with each chip is quite low, about 300 bps, so with all chips it would be about 27 kbps. Bitstream loading over the LPT port will be slow, however. For a smaller scale RS-485 is overkill. Why bother with this instead of using USB? Simply because a flash chip or a flashed controller adds the cost of the controller plus the cost of programming and testing it, and when something has to be updated you have to reprogram every controller, which raises the service cost. I would like to say that the current design of the BitFury rack, where the controller only translates RS-485 to the SPI bus of the Spartan and back, requires almost zero maintenance.

Use SATA plugs for the actual connector, but normal serial over it. SATA has 7 pins and is enterprise ready. The cost of serial is the complex plug, not the actual design. You could use a very tiny FPGA for the controller on each board to interface with the serial using GPIO pins or something.

SATA plugs! Thanks, nice idea! 4-inch cables are available.... This is as simple as putting a SATA footprint of the proper size on the PCB! $1.67 x 16 = $26.7 for all of the connectivity. Just connect each board to the next one. Simply install them and connect. That is about 1.6 m of wire in total.

http://www.satacables.com/

About the controller - I've mentioned only 500 bps per chip, so for 90 chips it would be at most 45 kbps in and out...
Information about the protocol is available in the initial post. So the pinout could be as follows:

PIN1 - SCK
PIN2 - MOSI
PIN3 - MISO
PIN4 - GROUND
PIN5 - RESET
PIN6 - PROGDATA
PIN7 - PROGSCK

So it would be possible to (a) upload the bitstream, (b) reset all chips, (c) send/receive work.
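The bandwidth figures quoted for this bus are easy to sanity-check. Below is a minimal sketch; the per-chip rates (300 and 500 bps) and the chip count are from the posts, while the 115200 bps link speed is only my assumption of a typical slow serial rate for comparison, not something the design specifies.

```python
# Back-of-the-envelope link budget for the shared serial bus.
# Per-chip rates and chip count come from the posts; ASSUMED_LINK_BPS is a
# hypothetical serial speed used only for comparison.

ASSUMED_LINK_BPS = 115_200


def aggregate_bps(chips: int, bps_per_chip: int) -> int:
    """Total bit rate if every chip uses its full per-chip rate."""
    return chips * bps_per_chip


def fits_link(chips: int, bps_per_chip: int, link_bps: int = ASSUMED_LINK_BPS) -> bool:
    """True if the aggregate traffic fits into the assumed link speed."""
    return aggregate_bps(chips, bps_per_chip) <= link_bps


if __name__ == "__main__":
    for rate in (300, 500):
        total = aggregate_bps(90, rate)
        print(f"{rate} bps/chip x 90 chips = {total / 1000:.0f} kbps, "
              f"fits assumed link: {fits_link(90, rate)}")
```

Either way the job traffic is tiny; it is the bitstream upload, not the work data, that decides how fast the bus needs to be.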

OR - ANOTHER POSSIBILITY IS TO USE JTAG (but I don't actually know how well it works over such long chains, like 90 chips, even when TCK and TMS are transmitted through buffers).

PIN1 - TMS
PIN2 - TCK
PIN3 - TDI
PIN4 - GROUND
PIN5 - UNUSED
PIN6 - RST
PIN7 - TDO

Please note that connector keying is not important, as improper insertion of the cable won't fry anything. It would also be perfectly possible to transmit jobs over JTAG. JTAG also seems nicer because the AES key can be programmed into the board right through this connector.

So a single SATA connector and a single Molex power connector. Are there any cons of JTAG vs 2 SPIs?

Quote
The cost of such a chassis with power supply and Intel Atom motherboard could vary in the $400 - $600 range. The cost of Spartan-6 chips purchased in bulk (WITHOUT VAT) would vary in the $70 - $95 range, depending on shipping location and the quantity ordered. Cost of other components (using numbers from our current design):

No. Use something smaller and lower power to run this, not some shitty Atom board. Use a Pi or that new $50 Via x86 board http://www.geek.com/articles/chips/via-launch-a-49-android-pc-20120522/

Nice board. Are there any other boards that support, for example, GPIO, SPI or JTAG output without any additional converters, so that I could run the software on Linux? That would be excellent for this project. Possibly with 2 cores, so that one core could drive GPIO in real time while the other core communicates with the world. But that is not necessary, as SPI tolerates lags!

Quote
200 Mh/s bitstream - would produce 18 Gh/s - that is 71% of a Mini-Rig, so the product price could be $10'859.
250 Mh/s bitstream - would produce 22.5 Gh/s - that is 89,2% of a Mini-Rig, so the product price could be $13'643.
300 Mh/s bitstream - would produce 27 Gh/s - that is 107% of a Mini-Rig, so the product price could be $16'365.
325 Mh/s bitstream - would produce 29.2 Gh/s - that is 116% of a Mini-Rig, so the product price could be $17'722.

Remember, if you're overclocking these, provide some way to underclock+undervolt these to extend the life of these once diff goes too high in 3-4 years. This is a big feature that a lot of people are asking for.

These are not overclocked. Overclocking could give +20 - 25%, but it would consume more power and possibly damage the chips permanently.

Also, another thing, make sure the total power usage for the box fits so an integer number of these fits on a 120v 20a  (the most common circuit in DCs, you often get two of these per rack). ~1000 watts each would be fine if you intend on putting two on a circuit.

Well, one 4U box would consume about 1.3 - 1.5 kW, that's 10,8 - 12,5 Amps @ 120 V, which seems to be bad.
With 11-12 boards, however, it would be not more than 10 Amps @ 120 V.
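The arithmetic behind these numbers can be sketched as follows. The 1.3 - 1.5 kW draw, the 120 V / 20 A circuit and the board counts are from the discussion; that power scales linearly with the number of boards (taking 1.5 kW for a 15-board box) is my assumption, and no breaker derating is applied.

```python
import math

# Assumptions (not from the posts): a 15-board box draws 1.5 kW and power
# scales linearly with board count; the breaker can be loaded to its rating.
FULL_BOX_WATTS = 1500.0
FULL_BOX_BOARDS = 15


def amps(watts: float, volts: float = 120.0) -> float:
    """Current drawn by a load of `watts` at `volts`."""
    return watts / volts


def box_watts(boards: int) -> float:
    """Estimated box power under the linear-scaling assumption."""
    return FULL_BOX_WATTS * boards / FULL_BOX_BOARDS


def boxes_per_circuit(watts: float, breaker_amps: float = 20.0,
                      volts: float = 120.0) -> int:
    """How many whole boxes fit on one circuit."""
    return math.floor(breaker_amps * volts / watts)


if __name__ == "__main__":
    for boards in (11, 12, 15):
        w = box_watts(boards)
        print(f"{boards} boards: {w:.0f} W, {amps(w):.1f} A, "
              f"{boxes_per_circuit(w)} box(es) per 20 A circuit")
```

Under these assumptions a 12-board box is the largest that still lets two boxes share one 20 A circuit, which is the point of the request above.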


570  Local / Miners / Re: bitfury, lick your lips ;) on: May 24, 2012, 01:35:39 PM
Yesterday I tried to find out where the production is located... did I understand correctly that it is Ukraine?
You are very emotional, probably from the deep south Wink but even there people don't do business like this.
If this is your design, take the trouble to present it properly, so that everything is clear to everyone and no silly questions arise. If you are aiming to sell it only to Westerners, then your appearance here and this wonderful post are not quite understandable.

Best wishes to you from the Taurida Governorate ))))))

As for emotions, well... usually not... but the discussion in the thread really made me smile. It reminded me of collective farmers discussing how the farm chairman has stolen everything and everything is bad, rather than how to get the thing done, and so on. That is why I replied with a sense of humor.

We have no production of our own. We ordered from a supplier and contract electronics manufacturer. The order itself went roughly like this: 1.5 months negotiating prices (a week or two at most had been promised) and delivery terms. 1.5 months collecting money for the production batch. And another 3 months of manufacturing (here too, 7-8 weeks had been promised). Plus every stage of the process came with a good dose of headache. The only thing we did ourselves was the actual assembly of the racks.

As for the goal of selling it only to Westerners - there is no such goal. But there are also no reliable partners who would want to get into the hassle of production, given the accompanying risks, for a small margin. The people here who do manufacturing prefer to produce something like electronic meters with a 300% margin rather than complex devices like this with a 10-20% margin.

Next, selling raises legal issues. If this is equipment made to order for an individual or a legal entity and assembled for you under contract, that is one thing. If it is goods for sale, that is another (it involves product documentation). For example, if the BitFury rack were made into a full-fledged product, like a Cisco router, with the ability to export it officially, then it would come to just about $80k without additional cooling. To bring that down, more racks have to be built so that the cost of arranging the paperwork is spread out. With the chillers there are also nuances due to different regulation: a refrigeration unit assembled in Ukraine will not be easy to export either; a finished product is a problem, a semi-finished one is no problem. We have absolutely no desire to play at smuggling: the income here is low, while the prison terms under the law are quite concrete. Furthermore, on export this equipment would require additional paperwork because of the nuances related to cryptography and the possible dual use of such a product. That is yet another additional cost: justifying to the authorities and to Xilinx (they keep an eye on this too) that the chips will not go where they should not and will not be used, for example, in devices like GSM decryptors, etc. Therefore we do NOT SELL equipment.. and nowhere on the site does it say that we SELL anything, since this is decided individually in each case, depending on the legislation of the country where the equipment is assembled and operated.

Therefore, for Ukraine and Russia, assembly at a local contract electronics manufacturer is more relevant than, say, import or export of finished products. Roughly: a batch of customers gathers who ORDER printed circuit boards FOR THEIR OWN NEEDS against specific technical documentation, and the batch is produced. The customers understand that what they hold is not a product that they can then freely advertise and sell, but a thing for internal use (of course they can advertise and sell at their own risk, but that is their problem). Although if anyone wants to invest seriously in releasing exactly such a product, we will consider options for cooperation.

Now about Russia - I do not know any contract manufacturers there. In Ukraine there are several production options, but the volume needs to be 250 chips per month MINIMUM for anyone to want to chase down all the production issues. And the cost of this "chasing" will be about another $25-$30 on top per chip, plus possibly some additional costs for solving the next round of "problems".

To make it clearer: with an order like this, it is we (you) who need the contract manufacturer more than the other way round, so you will have to squat and say "ku" (have you seen the film Kin-dza-dza?). Plus there may be schedule slips too.
Further: for Ukraine there is already an agreement for the supply of chips at a special price, but we have no right to export and resell them, only inside finished equipment, and exporting equipment from Ukraine makes no particular sense, since the VAT lost would be refunded on export only partially, plus significant effort would be needed for the product paperwork. For Russia there is no such agreement, so the first batch will have to be prepared at about 650-700 chips or more. If you ask for 100 chips, you will be quoted around $220 per chip, and nobody will really bother with it.

As for an excursion to the racks - not a problem in principle, but what is the point of the excursion? From a distance it "mesmerizes" with its scale and unusualness, and then after 15 minutes it gets boring... Well, they hum, they compute... As for the question "how much of a headache is it?" - I can say without a visit that it is a serious headache. The racks themselves no longer need maintenance, but all of this is installed outside the city, and out-of-town 3G internet in the Kiev region is a complete pain three times over, so the result is accordingly: the main headache now is how to aggregate several unstable 3G links into one, so that work is guaranteed to get out.

Next, regarding what a more acceptable device would look like - I have opened a thread in English. The target is a 4U chassis that can be filled with small compute boards. The board should be unified. It is also possible to put USB on the board, but as an extra option: if needed, you solder an LQFP controller chip and a connector onto it by hand. Building rigs on threaded rods, or in some other way, with a USB bus is not the right approach... Cobbling together home-made cases is not right either, since proper cooling is a separate engineering task that requires a careful approach. With a standard case (for Ukraine, say, one made by inpc with an Intel Atom) the rig can then be developed further.
Moreover, other boards could possibly be installed there too. The board form factor needs to be decided; presumably 180 x 160 mm will be enough. Leave room in the case for standard 230 x 160 mm boards mounted vertically. Mounts for the boards. Next, board height (i.e. how many fit in the case): here it is possible to (a) adapt to a 25 mm height, or (b) put a special mounting element in the lower part of the case so that boards of different heights can be installed. (c) Power supplies: to avoid problems, proper certified units, Mean Well for example.

Well, let's see how it goes.... so far it is going unsatisfactorily... everyone wants to make their own version of the board... and that is bad, because if there were a unified board, a special tester could be made for it to check that the boards work before "shipping". Otherwise it will turn out like ours: 3 chips out of 360 are dead.
571  Local / Miners / Re: bitfury, lick your lips ;) on: May 24, 2012, 02:04:31 AM
on top of the 90k price, transport costs will add 20% of the cost + customs clearance 30% + customization + 30%, and it already comes out at around 200k
plus your own 20 kW transformer substation + a diesel generator, that's another 500k at least...

Well, damn, I am amazed... the degradation is plain to see... Westerners somehow reason more correctly... what does the text say? An approximate price, right? And what does the price depend on, eh? Well, of course you are right in a way - in principle it is customary to pull the price out of thin air.... not enough for summer tyres for the BMW... that won't do... must urgently steal it somewhere... otherwise all is lost )))) and here you are making up all sorts of fairy tales about 30% customization and other crap... next you'll say I stole all your gas )))) for good measure )))) something has to be said, after all )))) as the saying goes, when you have a fountain )))))

a chip, duties included, will come to about $110-$120 in Moscow at a batch of 650 chips
the board itself will come to about $0.5-$0.55 per Mh/s - precisely because of all the duties and the markups of your assemblers.

next - building a rig with its peripherals (power supplies, cooling and so on) is from $5k to $15k, depending on how, by whom and from what it is made, and where it is installed. You can put it somewhere in the permafrost and cooling will be cheaper. That is another $0.05 - $0.15 per Mh/s on top.

next - the bitstream license - presumably $20 - $25 per chip. But volume is needed. Nobody will travel to program fewer than 500 chips. But this is tentative.

I foresee how this will develop further - production will stall before it has really started (someone will collect the money and put it into MMM)... the crap will start bubbling... and people will go on hammering at MMM (trying to recoup their losses)... as in the film PiraMMMida - "Russia is lost, soon we will all be mooing"...

Best wishes to you from Kievan Rus ))))))
572  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 24, 2012, 01:23:32 AM
Would it be possible to have some sort of client/server setup for the bitstreams? Basically have some way to pay for a license and run a properly designed miner that pulls down the bitstream from the server in a secure fashion. I guess in the end, once the user has the bitstream loaded on to the fpga they could simply unload it and store it locally? I'd pay money monthly for a bitstream that dramatically increased my hashing power if it meant keeping that bitstream secure. Would it be easier to have a kickstarter fund started up where miners and fpga manufacturers could pay to opensource the bitstream?

Unloading bitstream is impossible....

About remote activation - it is quite possible. There are actually several possibilities. One is that the bitstream reads the Device DNA code (the chip's serial number), encrypts it and transmits it to a server; the server sends back an activation code, the bitstream checks it, and if the device is on the white list it proceeds with execution. But this is only moderate protection, as the bitstream could be reverse-engineered and the wires going to Device DNA could be replaced with a connection to the logic fabric.
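To make the flow concrete, here is a minimal sketch of such a white-list activation on the server side. It is only an illustration under my own assumptions: it uses an HMAC over the Device DNA and a nonce instead of the encryption mentioned above, and the secret, the DNA value and all function names are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical values for illustration only; none of them come from the post.
SHARED_SECRET = b"example-secret-embedded-in-bitstream"
WHITELIST = {0x01A2B3C4D5E6F7}  # Device DNA values of licensed chips


def activation_code(device_dna: int, nonce: bytes) -> bytes:
    """Code that both the server and the bitstream can compute."""
    msg = device_dna.to_bytes(8, "big") + nonce
    return hmac.new(SHARED_SECRET, msg, hashlib.sha256).digest()


def server_activate(device_dna: int, nonce: bytes) -> Optional[bytes]:
    """Server side: answer only for devices on the white list."""
    if device_dna not in WHITELIST:
        return None
    return activation_code(device_dna, nonce)


def device_accepts(device_dna: int, nonce: bytes, code: Optional[bytes]) -> bool:
    """Device side: proceed only if the code matches its own DNA and nonce."""
    if code is None:
        return False
    return hmac.compare_digest(code, activation_code(device_dna, nonce))
```

The weakness described in the post applies to this sketch as well: whoever can patch the bitstream can feed the logic a white-listed DNA value or bypass the check entirely.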

More complex protection would require 2-stage loading: a first-stage bitstream that is auto-generated and mainly generates "special" kinds of public-private key pairs with embedded proof of work (like calculating it for 4 seconds in a real Spartan). It would not work in simulators, and it would be difficult to reverse-engineer because the code for the calculation is generated. Then comes DPR (dynamic partial reconfiguration) to load the first protected bitstream.

This bitstream could also be "rented" - say, working only with nTime less than a specified value.

The pity about everything I have written here is that it takes quite a lot of effort compared to fusing an AES key.

It is bad that the chip manufacturer implemented only AES, because if they had implemented some public-private key infrastructure with a Xilinx certificate in silicon, it would be much simpler.

I would just take the public key you supplied plus the Xilinx signature for it, check it against the Xilinx certificate and be done; I could deliver the bitstream. This would work like a charm, but there's no such feature.

Even e-fuse gives less protection than SRAM + battery for the AES key.

Just compare the attack cost... You pay $20 and get a thing that costs, say, $50k+, so you could then pay an additional $20k just to break it. I understand that you are interested; believe me, when there is no more heat around Spartan I will disclose the bitstream and the ways it was achieved, because it can be valuable for learning.

EDIT: https://bitcointalk.org/index.php?topic=49971.msg918095#msg918095 - claims that the bitstream format is available under NDA... So as long as you can reverse-engineer the bitstream and patch it, protection won't work, and it would be just lost time. Protection has to be assisted by hardware.
573  Bitcoin / Hardware / Re: Algorithmically placed FPGA miner: 245MH/s/chip and still rising on: May 24, 2012, 12:13:56 AM
In a completely unrolled design, there are no long lines.
The start vector is fed in on the left side, then the calculations percolate down to the right, and at the right a "matching" circuit determines if a "golden nonce" was found. There is no feedback from the right side to the left side.
Thus, while I do think that Bitfury's approach is EASIER (as one only has to worry about a few hundred wires and their associated delays, and not tens of thousands), I fail to see why it is inherently faster. I don't think it is inherently faster.
Maybe the Xilinx router goofs up wires that would be short and local and sends them the long way like a crooked cab driver an out-of-town tourist. But, to reiterate, a fully unrolled miner does not involve a feedback from the right side to the left side.

TheSeven said it correctly - Spartan routing resources are ugly. No handy BENTQUADs etc.... plus 50% of the slices being SLICEX adds to the problems. With Artix my highest expectation is 2x Spartan.... but I am afraid to make such predictions, because I've heard that on 28 nm chips there are even more problems with power distribution..... I do not want to cause trouble again, like having an estimate of 500 Mh/s per chip, then a target of 400 Mh/s, and finishing with 300 Mh/s.

About "there are no long lines" - I've already commented, but I will try to draw where exactly the epic fail for the parallel expander is....

say computing w0+w1 and feeding to w9:

                                        ---+---------------------------------
                                   ---+---------------------------------
                              ---+--------------------------------
                          ---+-------------------------------
                     ---+------------------------------
                ---+-----------------------------
           ---+----------------------------
      ---+----------------------------
 ---+---------------------------
w0 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13 w14 w15 w16

How many wires? The biggest cross-section just for that? 9x32 bits :-)
The same happens when pushing w9 to w16... and w14 to w16...
Too lazy to calculate - but it is nearly a 512-bit cross-section...

And in Spartan-6 it is difficult to pass more than a 256-bit cross-section over a long distance within a height of 8 slices (there are 32 QUAD routes per switch, so 256 bits would use up the QUAD routes in the horizontal direction for a height of 8 slices).

Then what happens is that it falls back to DOUBLE routes and spreads wide outside of your round expander area, slowing down the interconnect for other parts of the design....
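For readers who do not have the SHA-256 message schedule in their head, this is the recurrence the drawing refers to: each new word w16 is built from w0, w1, w9 and w14, so a fully parallel expander has to carry a sliding window of 16 words alongside the rounds. A minimal software sketch (standard SHA-256 from FIPS 180-4, nothing specific to the bitstream; function names are my own):

```python
MASK = 0xFFFFFFFF  # SHA-256 works on 32-bit words


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def sigma0(x: int) -> int:
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def sigma1(x: int) -> int:
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def expand_schedule(block):
    """Expand 16 input words into the 64-word message schedule."""
    w = list(block)
    for t in range(16, 64):
        # w[t] depends on words 16, 15, 7 and 2 positions back,
        # i.e. w16 = w0 + sigma0(w1) + w9 + sigma1(w14).
        w.append((sigma1(w[t - 2]) + w[t - 7] + sigma0(w[t - 15]) + w[t - 16]) & MASK)
    return w


TAPS = (16, 15, 7, 2)          # distances back from the word being computed
WINDOW_BITS = max(TAPS) * 32   # sliding window a pipelined expander must carry

if __name__ == "__main__":
    print("window carried between stages:", WINDOW_BITS, "bits")
```

The 16-word window is where the "nearly 512 bits" figure comes from; in software it is just a list, in the fabric every bit of it is a wire that has to get past the rounds.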

I started with that :-( Plus it is a question how such a design would survive the reality that SHA-256 is a VERY TOUGH TEST for bit error rates. Even small, infrequent errors are amplified by the avalanche effect through the rounds.

With unrolled rounds, however, it is true - no problem there - it works like a charm... An unrolled design is also more compact than a rolled one.... and a rolled design within 240 slices is very difficult... even 248 would be easier, as in 240 I had to fight for each register and reuse parts of the logic to do other things.... In my design the rounds only look similar; in reality there are 3 kinds of rounds with special cases, and they are different.

PS. You answered before I wrote my post... Anyway, I think this will be helpful for those who try parallel rounds... With ASICs it will make the same mess BTW Smiley lots of wires for the round expander Smiley + lots of clock problems.

PPS. So getting a fast and dense parallel design is a tough task - that's why I respect this work!

574  Bitcoin / Hardware / Re: Algorithmically placed FPGA miner: 245MH/s/chip and still rising on: May 23, 2012, 11:53:36 PM
This is most likely due to the Spartan6's awful long distance routing fabric,
This isn't Spartan's fault. This is a property of any modern FPGA: most of the delay and energy loss occurs in the routing fabric. So the easiest way to speed up the design is to minimize the demand on routing resources.

I was always perplexed why everyone here was focusing on unrolling the combinatorial logic. After gaining some experience with the currently available EDA tool suites for FPGA it became obvious: they make the place and route of repetitive designs very difficult.

The "sea of tight hashers" approach will probably be also beneficial for the future ASIC designs, although not by such a wide margin.

Does anyone know if bitfury's design stores the SHA-256 constants in BRAMs or has them spread over through the SLICEs?

You have all the clues... Switch on your head and just guess from the data you have - the screenshot from PlanAhead - I certify that it is the correct one... Try placing some BRAM and watch your timings... Why would you ask then?

With the routing fabric it is the same... Open FPGA Editor and start placing routes manually; understand how QUAD, DOUBLE and SINGLE routes work within Spartan, what a switch-to-switch hop costs, what a switch-to-logic entry costs, etc. It is interesting, believe me :-) The biggest pity with them, however, is that the P&R tool is far from ideal, and the fewer routing resources are left, the worse the design it produces. In SHA-256 the round expander kills routing, as taking w[0], w[1] and w[9] requires a lot of routing, because you are basically pulling data from N rounds behind... so you basically put in either an SRL or a BRAM to do that... and that is near the end of the game... However, if you work really hard on it, Spartan has barely enough resources to route these parallel rounds, provided you find the right placement scheme to use the vertical and horizontal interconnect more adequately. Also, interconnect works in one direction only, so if rounds are placed in a smart way you'll get more efficient use of routing resources (i.e. A,B <---> C,D, where A --> C and B <--- D are interconnected and placed into the same regions).

So I really respect the author's work of fitting 1.5 parallel rounds into a Spartan-6 - it is tough and very nice work. And Spartan is probably showing its bad temper in error rates. With rolled rounds you get only single-round failures; with unrolled rounds, if some part of the chip fails more frequently than another, you get higher performance degradation. In my experience during debug runs, it starts to degrade from the central slices towards the peripheral ones when you raise the clock. It would be interesting indeed to see whether the design's real performance matches the performance that the tools display.

Finally, I would say that implementing an FPGA design is mostly about placement and routing... Do not even start trying it if you are not prepared to spend weeks figuring all of these things out, or else stick to simple designs where your clock is 2-3 times below the chip's maximum... designs @ 50 - 100 MHz would be easy....
575  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 23, 2012, 10:10:48 PM
My 2 cents:
The pricing is pretty much OK, but the AES key programming could turn into a logistical nightmare.
In a few days, EldenTyrell will reveal his offer and if it is competitively priced, but does not involve sending boards back and
forth for AES key implantation, it'll probably be the offer I'll pursue.

Well, sending boards back and forth is the bad case, and it will not work that way. The better case is to program them at the place where they are assembled. But that requires a high level of trust in whoever does the programming. It is also cheaper to assemble in higher volumes. So if there are 1, 2 or 3 points of PCB assembly with a good flow of chips, it is not a problem to get a trusted person who will turn the board on, program the AES key and be done. I am ready to solve that so it happens without any delay. But not for 20 points of assembly (soldering of PCBs, not building the complete product) with 10 chips per month each.
Anyway, to get good prices you have to make volume; this is why we started with racks and not with small units - it was much cheaper that way.

EDITED: Or how else could it be protected? Selling it unprotected for $20 is like uploading it to an FTP server. Say you will not pirate it, but one of 100 users certainly would, and they'd even think they did the right thing.

EDITED: Another possibility that was offered: if chips are supplied to different board vendors through the same channel, then we could program the chips at some point during their initial delivery. This could probably be the best option for smaller purchases, but it depends.
576  Bitcoin / Hardware / Re: BitFury Design, Licensing, Mass production on: May 23, 2012, 10:05:51 PM
I'm interested, but I'm not understanding something.  What is the "bitstream" in this context, and what is it for?

This topic is intended more for developers of FPGA boards. I suppose they will join soon. For better understanding I will explain the whole life cycle of these devices:

Xilinx (www.xilinx.com) produces programmable logic chips such as the XC6SLX150. The key point is that they are "programmable": the bitstream is the configuration that sets up the wires and logic cells inside the chip, so it is not like the usual program of a GPU miner etc. There is no CPU in an FPGA by default, although an FPGA designer can implement many CPUs. Depending on the effort spent, there can be many different implementations of the same algorithm within an FPGA. The most difficult part is the LAYOUT within the chip to get high performance, because unlike in programming you have to deal with wire and logic latency, design synchronization, etc. So to get a nice product you should definitely have the best bitstream available for the FPGA. This point is solved - THE BITSTREAM IS QUITE COMPETITIVE.

Then the PCB (printed circuit board) comes into play. It provides connectivity and power for the FPGA chips and carries many cheaper components, but their sole role is to give jobs to the FPGA, download results and keep the FPGA and its user happy. The PCB itself is manufactured in specialized factories. There are also multiple assembly plants around the globe that would gladly take a bare PCB and solder the components onto it. At this point you have a working PCB.

Then comes the assembly of PCBs into devices - basically putting boards into a chassis, adding a PSU, configuring.

Then comes delivery to the end user, and also servicing within the warranty period if required.

So for the rest of the points - making PCBs, assembling them - I am looking for partners who would best deliver these devices to their customers.

So if you are in and would like to buy such devices, you can (a) tell someone near you who already has a Spartan-6 solution that you are aware of this and would like to buy NNN {put your numbers here} devices from them; (b) if there's nobody nearby, you could soon contact shalab.si, for example, and import modules from Slovenia, etc.

With the help of this topic I hope to find those who are interested in making and selling this stuff, so that you can go to someone local and just order it. To make things more interesting for developers, I would like to ask you to give numbers - how many Mh/s you want. The target price, I think, is $0.60 per Mh/s without VAT (I suppose everyone knows well what Mh/s is, and the only possible confusion is with these fancy bitstream MHz numbers).
577  Bitcoin / Hardware / BitFury Design, Licensing, Mass production on: May 23, 2012, 08:24:45 PM
Dear BitCoiners, BitCoin talk users!

First of all I would like to thank the other developers of Spartan-6 based designs, especially those who contacted me early - shalab.si, the fpgamining.com team, Greg - and all those who understand how hard this work is. I would also like to thank all those who supported me during the development of these mining racks and invested money, so that I did not fall into the "preorder" trap of offering more Mh/s than the hardware could actually deliver.

I am actually in love with bitcoins, because this is exactly the thing I lacked in 2006 and tried hard to invent... Unfortunately I was not smart enough to come up with a proof-of-work based blockchain. And this solution is actually useful in a wider range of applications than money transfer itself, as it can limit or remove the human factor in distributed database consistency. It will be interesting to see how this evolves.

Second, I was surprised by the reaction to the comparison of the 'Estimated price' with ButterFly Labs. People, that $90k is with 20% VAT, which is not paid in the US for example, and it assumes a single hot air outlet and nothing more... But to get a Mini-Rig we would pay +20% VAT at customs, so it would be not $15'295 but $18'354 + shipment costs, that is $0,728 per Mh/s. We've spent about $0,50 per Mh/s on board production and about $0,12 per Mh/s on installation, totalling $0,62 per Mh/s. Count the bitstream offer (it comes later in this topic) and it would be $0,69 per Mh/s. SO IT IS $0,03 LESS THAN THE BFL Mini-Rig, and it has really been working for almost 2 months, so we are now sure that it does not fry chips etc. We have not managed to fry a chip even when feeding it 1.5 V VCCINT core voltage. But the price would definitely be higher (near the estimated production cost) if other issues were counted, like climate control, getting a specialized area for the racks, and getting power (as it consumes 20 kW together with the chiller). All of these costs were already paid, because it is installed in the basement of a household with an already existing heat pump.

Third, 300 Mh/s is not the limit of this bitstream. It can fit even more rounds, considering that almost all DSPs in the left part of the design are unused and there is some free space in the topmost part. So it could definitely give at least 8% better performance; it would then cost us $0,57 per Mh/s, which is even less than the BFL Mini-Rig.
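The per-Mh/s comparison in the second and third points can be re-checked with a few lines. A minimal sketch: all input figures are from the post, shipment costs are left out, and the variable names are mine.

```python
VAT = 0.20                 # customs VAT rate quoted in the post
BFL_PRICE_USD = 15_295     # BFL Mini-Rig list price
BFL_MHS = 25_200           # 25,2 Gh/s expressed in Mh/s

BOARDS_USD_PER_MHS = 0.50  # our board production cost
INSTALL_USD_PER_MHS = 0.12 # our installation cost


def usd_per_mhs(price_usd: float, mhs: float) -> float:
    """Price of one Mh/s of hashing power."""
    return price_usd / mhs


bfl_landed_usd = BFL_PRICE_USD * (1 + VAT)               # Mini-Rig price with VAT
bfl_usd_per_mhs = usd_per_mhs(bfl_landed_usd, BFL_MHS)   # about 0.728
own_usd_per_mhs = BOARDS_USD_PER_MHS + INSTALL_USD_PER_MHS
own_after_speedup = own_usd_per_mhs / 1.08               # same hardware, 8% more Mh/s

if __name__ == "__main__":
    print(f"BFL with VAT: ${bfl_landed_usd:,.0f}, ${bfl_usd_per_mhs:.3f} per Mh/s")
    print(f"BitFury rack: ${own_usd_per_mhs:.2f} per Mh/s, "
          f"${own_after_speedup:.2f} with an 8% faster bitstream")
```

Note that the 8% gain lowers the cost per Mh/s only because the hardware cost stays fixed while the hash rate rises.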

I've been asked by email and Skype about smaller editions. In my opinion the best solution would be a standard 4U chassis, 0.5 m long, holding 14-15 boards with 6 chips each (that is 84 - 90 chips), like shalab.si's original idea but with a slightly different layout to keep the chips from overheating. Within this chassis a single Intel Atom D525 motherboard is installed. Boards with 6 Spartans can even do without microcontrollers and flash; everything could be programmed directly via the LPT port. The bandwidth required to communicate with each chip is quite low, about 300 bps, so with all chips it would be about 27 kbps. Bitstream loading over the LPT port will be slow, however. For a smaller scale RS-485 is overkill. Why bother with this instead of using USB? Simply because a flash chip or a flashed controller adds the cost of the controller plus the cost of programming and testing it, and when something has to be updated you have to reprogram every controller, which raises the service cost. I would like to say that the current design of the BitFury rack, where the controller only translates RS-485 to the SPI bus of the Spartan and back, requires almost zero maintenance.

Cost of such chassis with power supply and Intel ATOM motherboard could vary in $400 - $600 range. Cost of Spartan6 chips when purchased in bulk quantities (WITHOUT VAT) would vary in $70 - $95 range, depending on shipment location and quantity of chips ordered. Cost of other components (using numbers from our current design):

motherboard 4-layer $111 ($24,5 PCB and $8,53 soldering), daugthercard 6-layer $18.39 ($2,2 PCB and $5,5 soldering). these are costs with components with VAT and connectors. If connectors will be removed and VAT will be removed, then cost would be like $77,7 for motherboard for 6 chips and daugthercard $12,83 . Totalling $154,68 per board .

So a server with 14-15 boards could cost in the range of $8'445 to $11'470. More likely the actual manufacturing cost plus labor will be somewhere around $10'000 to $10'500. Say it has 90 chips. Let me point out how important the Mh/s figure is:
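
The cost range can be reproduced from the component figures, taking each board as one motherboard plus 6 daughtercards plus 6 chips and adding the chassis once. A rough Python sketch (function and constant names are only for illustration):

```python
# Rough build-cost estimate for one 4U server, without VAT.
# All prices are the figures quoted in the post; names are illustrative only.

MOTHERBOARD = 77.70      # per board, no connectors, no VAT
DAUGHTERCARD = 12.83     # per chip carrier, no connectors, no VAT
CHIPS_PER_BOARD = 6


def board_cost(chip_price: float) -> float:
    """One motherboard, 6 daughtercards and 6 Spartan-6 chips."""
    return MOTHERBOARD + CHIPS_PER_BOARD * (DAUGHTERCARD + chip_price)


def server_cost(boards: int, chip_price: float, chassis: float) -> float:
    """Chassis (with PSU and Atom motherboard) plus all populated boards."""
    return chassis + boards * board_cost(chip_price)


if __name__ == "__main__":
    low = server_cost(boards=14, chip_price=70.0, chassis=400.0)
    high = server_cost(boards=15, chip_price=95.0, chassis=600.0)
    print(f"low estimate:  ${low:,.2f}")
    print(f"high estimate: ${high:,.2f}")
```

The low case is 14 boards with $70 chips in a $400 chassis, the high case 15 boards with $95 chips in a $600 chassis; these give about $8'445 and $11'470.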

The BFL Mini-Rig, a quite comparable product, is promised to be sold at $15'295 for 25.2 Gh/s - that is $0.607 per Mh/s.

200 Mh/s bitstream - would produce 18 Gh/s - comparable to 71% of a Mini-Rig, so the product price could be $10'859.
250 Mh/s bitstream - would produce 22.5 Gh/s - comparable to 89.2% of a Mini-Rig, so the product price could be $13'643.
300 Mh/s bitstream - would produce 27 Gh/s - comparable to 107% of a Mini-Rig, so the product price could be $16'365.
325 Mh/s bitstream - would produce 29.2 Gh/s - comparable to 116% of a Mini-Rig, so the product price could be $17'722.

So at 200 Mh/s there is almost no difference between the bulk order cost and the product price. At 250 Mh/s there is $3'143 income (taking the cost as $10'500). At 300 Mh/s there is $5'865 income, and at 325 Mh/s $7'222. An additional 50 Mh/s per chip (300 instead of 250) gives an 86% income increase if prices are set at the BFL level of $0.607 per Mh/s. An additional 75 Mh/s per chip gives +129% income.
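
The pricing comparison can be sketched in a few lines of Python. This is only a sketch: it uses exact ratios, so the results differ slightly from the figures in the post, which come from rounded percentages, and it assumes 90 chips and the $10'500 build cost. Names are illustrative.

```python
# Price a 90-chip server at BFL's price per Mh/s and estimate the margin.
# Mini-Rig price/hashrate, chip count and build cost are from the post;
# function names are illustrative. Exact ratios are used, so values differ
# slightly from the rounded figures in the post.

MINI_RIG_PRICE = 15295.0       # USD
MINI_RIG_MHS = 25200.0         # 25.2 Gh/s
CHIPS = 90
BUILD_COST = 10500.0           # assumed manufacturing cost plus work

USD_PER_MHS = MINI_RIG_PRICE / MINI_RIG_MHS   # about 0.607


def product_price(mhs_per_chip: float, chips: int = CHIPS) -> float:
    """Price of the server if sold at the Mini-Rig's USD per Mh/s."""
    return mhs_per_chip * chips * USD_PER_MHS


def income(mhs_per_chip: float, build_cost: float = BUILD_COST) -> float:
    """Product price minus build cost."""
    return product_price(mhs_per_chip) - build_cost


if __name__ == "__main__":
    base = income(250)
    for rate in (200, 250, 300, 325):
        gain = (income(rate) / base - 1) * 100
        print(f"{rate} Mh/s: price ${product_price(rate):,.0f}, "
              f"income ${income(rate):,.0f} ({gain:+.0f}% vs 250 Mh/s)")
```

The point it illustrates: the build cost is fixed, so every extra Mh/s per chip goes almost entirely into income.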

Calculating these costs, together with the costs of handling sales, manufacturing etc., leads me to the following licensing targets for about 1000 chips installed per month:

$20 - $25 per chip (depending on chip price and costs incurred by AES-key programming) for the current bitstream and
$5 - $7.5 for a future upgrade (separately), which can be opted into once such an update is actually done.

From our side the most important point is bitstream protection. This incurs the cost of AES-key programming: sending a trusted person to the assembly plant from time to time, powering on the boards and fusing the chips with the AES key. For simplicity it would be great to have the 6 Spartans on a single board tied to one JTAG line via buffers. Then the encrypted bitstream could be made available without any additional protection. Of course, if quantities are small and the location is distant, it would be difficult to carry out the programming. We are already planning to program chips with the AES key in Hong Kong, and we have good access to the EU because we are located in Ukraine. Existing boards could be upgraded as well.

Also, here is why I insist that the 4U design is one of the best sizes:
1) It would consume 1.3 - 1.5 kW per 4U, which fits into an envelope of about 350 W of heat per 1U, and that is not difficult to handle in either a datacenter or a homebrew setup;
2) The chassis itself could be sold with minimal margin, so people could build up mining power within a nice chassis step by step;
3) When the local price of electricity becomes unaffordable, people can send what they built to Iceland for a special $0.04 per kWh setup, so compatibility with datacenters is a nice feature;
4) It gives a $1'800-$1'900 "entry-level" price and the possibility of board-by-board upgrades;
5) A "special single board" can be available as well, with USB, and it would be a nice option that it can also be put into a server. Basically, to lower costs I expect to have boards with pads for the USB-related components, which simply will not be soldered on items that go into servers;
6) It would also be nice to have pads for some DRAM on board for other purposes. Again, this would not be soldered at all initially, but later, if boards are re-used for other tasks, such DRAM capability and at least 1G Ethernet external connectivity would make a difference for long-term product life. At the PCB design stage it adds only the NRE cost of designing the PCB itself, and no cost at manufacturing.
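
To put the power figures from points 1 and 3 in numbers, here is a small Python sketch. The 1.3 - 1.5 kW per 4U and $0.04 per kWh are from the list; continuous operation over a 30-day month is an assumption, and the names are illustrative.

```python
# Heat per rack unit and monthly electricity cost for one 4U chassis.
# 1.3-1.5 kW per chassis and USD 0.04 per kWh are from the list;
# 24/7 operation over a 30-day month is an assumption.

RACK_UNITS = 4
HOURS_PER_MONTH = 24 * 30      # assumed 30-day month, always on


def watts_per_unit(chassis_kw: float, units: int = RACK_UNITS) -> float:
    """Average heat load per 1U in watts."""
    return chassis_kw * 1000 / units


def monthly_energy_cost(chassis_kw: float, usd_per_kwh: float) -> float:
    """Electricity bill for one chassis running the whole month."""
    return chassis_kw * HOURS_PER_MONTH * usd_per_kwh


if __name__ == "__main__":
    for kw in (1.3, 1.5):
        print(f"{kw} kW: {watts_per_unit(kw):.0f} W per 1U, "
              f"${monthly_energy_cost(kw, 0.04):.2f} per month at $0.04/kWh")
```

With these inputs the load comes to 325 - 375 W per 1U, and electricity is roughly $37 - $43 per chassis per month at the Iceland rate.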

As for where the equipment will be placed: in small quantities, probably in homes, but for when it gets larger I have already found a location and discussed this issue with Andreas Fink (Skype: andreasfink) from datacell.com:

[5/22/2012 10:24:36 AM] Andreas Fink: Our datacenter is designed for 20kW cooling per rack
[5/22/2012 10:24:37 AM] bitfury.org: but if air inlet drops 10 degrees Celsius, performance can be increased. and optimal temperature is about 0 degrees Celsius if we use industrial chips. that gives 20% performance increase.
[5/22/2012 10:24:44 AM] Andreas Fink: we use hot aisle concept
[5/22/2012 10:25:17 AM] bitfury.org: buyers of these devices actually concentrate on pure performance and cost of electricity, cooling etc. many of them installing such installations right at home, because it is cheaper.
[5/22/2012 10:25:27 AM] bitfury.org: you can google for "bitcoin mining"
[5/22/2012 10:25:40 AM] Andreas Fink: Well we operate our datacenter in Iceland. energy is cheap there and cooling is easy.
[5/22/2012 10:27:05 AM] Andreas Fink: We could build a custom datacenter just for that if there's enough demand.
[5/22/2012 10:28:05 AM] Andreas Fink: what is the potential of your installation if it produces bitcoins?
[5/22/2012 10:29:38 AM] bitfury.org: very difficult to say, because we have just started to announce our solution.
[5/22/2012 10:29:54 AM] Andreas Fink: a rough estimate to get an idea.
[5/22/2012 10:29:54 AM] bitfury.org: current bitcoin network hashpower is 10'000 Gh/s, single bitfury produces 110 Gh/s
[5/22/2012 10:30:03 AM] bitfury.org: so estimation is about 30-40 racks
[5/22/2012 10:30:28 AM] bitfury.org: that's potential for FPGA bitcoin mining.
...
[5/22/2012 11:37:42 AM] Andreas Fink: the electricity company charges like 3-4 eurocents per kWh
...

It depends: if we manage to start with 4U units working with them, without using expensive diesel/UPS-backed power and climate control equipment, it would be the cheapest colocation for mining available in the world.

[5/22/2012 11:02:04 AM] Andreas Fink: http://www.sgi.com/products/data_center/ice_cube_air/
[5/22/2012 11:02:17 AM] Andreas Fink: put them in such a shelter, we provide power and internet and put it into the green field

I suppose the shelter would have to be designed as well, since the SGI one could be expensive, but generally you get the point. And that is why I am sticking with a rackmount design. It will be a big pain to move devices that are not datacenter-compliant, for example putting a Mini-Rig there. I suppose this is the moment when BFL buyers should start thinking twice about what they will do with their rigs once income is about the same as the cost of the power consumed. For myself that problem is solved - mining equipment is an integral part of house heating now. For cold countries this can be a solution as well: integrating house / DHW heating with mining. Today you will probably say that it is absolutely crazy, but 5 years from now things will change, and those who make use of the excess heat will save on electricity bills, while those who pay extra for air conditioning will lose, as they will not be able to compete.

So finally, I would welcome everyone interested to join efforts. First, design a chassis and PCB with more or less interchangeable parts: for example a with-USB and a without-USB version, a with-flash and a without-flash version, etc., so the actual board can be customized to the needs of the partner who provides the product, while in general they are all nearly the same and fit inside a 4U chassis. This way we can save on design work and build a product in which customers can be confident that, even if one supplier stops selling boards, they can order elsewhere, possibly at higher cost, and still finish their 4U box. More important are the various metal pieces such as board holders, which would depend on the supplier of the chassis.

I am going to provide a draft of the 4U box soon, as I see it, in 3D and with airflow calculations. For those who asked about the current interface, I am attaching to this message the .vhd source code of the bitstream side, the current .ucf file (but please note it can change: we used the 1 mm pitch FGG484C, and it could possibly be better to use CSG with 0.8 mm pitch; the only important point is that communication enters at the BOTTOM of the chip), and the communication part of the dsPIC33F firmware. As you will probably see, the same communication can be performed over the LPT port without bothering to produce additional conversion boards.

Here are links to download:
http://www.bitfury.org/bfdetails/sha_top.ucf
http://www.bitfury.org/bfdetails/sha_top.vhd
http://www.bitfury.org/bfdetails/shaspi.vhd
http://www.bitfury.org/bfdetails/jobs.c

Meanwhile: we used FGG484 because we thought we would put more bypass capacitors below the chip for overclocking purposes, as with CSG it would be difficult to place that many capacitors. However, we aimed for 320 MHz, which produces too many errors; at 240 MHz it would be fine to place less capacitance and also to use an industrial CSG chip, as this would remove any possible overheating problems. CSG chips have about 1.5 times (!) lower junction-to-case thermal resistance according to the datasheet. However, it would be great to have this confirmed by someone who has used CSG chips - I saw that ZTEX used them.

So, how shall we proceed? I suppose the best way will be to:
1) Write to this topic if you are in (possibly after getting in touch on Skype: bitfury.org);
2) State your estimated quantities per month and assembly plant location (the place where chips can be powered on);
3) State the final destination where you are going to ship these boards for further packaging;
4) State the number of chips you already use, because if there are already many, then rework on an IR station could possibly cost significantly less than buying new chips;
5) State your requirements for the board, specifying the way of Spartan programming and currently used interfaces (i.e. USB, with the USB chip name), so we can arrive at a common specification for the board interfaces and the chips that shall be on board. For overclockers it would probably be nice to have the ability to set voltage.

The next iteration will then be to approve the specification, do the PCB schematic design, then the layout, and proceed in a similar way with the chassis. Hopefully there are people who like working 24 hours + night :)

Kind regards,
Mr. V // Skype: BitFury.org

PS. Special note for those who know me personally. I dislike the indexing of personal information by Google etc., so please do not write here "oh I know this guy"... Write it on Skype in personal communication (there it will be visible at least only to special services and not to any random Johnny Hacker who mines information and would like to get some cash), not on public forums or public web pages. And I discourage everyone from doing so with their own information. Imagine that your name can be entered into Google not only by your friends and partners, but also by your foes, and by those whom you do not even know...

PPS. I've also uploaded a stats file, http://www.bitfury.org/bfdetails/bitfury_status.txt, with the current number of working cores; you can see that some cores are not working, the error rates, and so on. There was no special handling of them: we just put the cores "as is" into the motherboards, and there are no visible defects. We have not bothered replacing these cores. This is just for those who say that it is not a real thing :)

EDITED - I also forgot part of the conversation with Andreas Fink about Iceland: VAT there is 25.5%, but for colocated servers it is refunded. So it is like having 0% VAT, compared to 18-20% VAT in EU countries.