Bitcoin Forum
December 05, 2016, 02:47:50 AM *
Topic: Algorithmically placed FPGA miner: 255MH/s/chip, supports all known boards
Inspector 2211
March 10, 2012, 05:23:58 PM
 #141

Quote from: Inspector 2211
BFL Single, watch out below.

What makes you think this cannot similarly be applied to the Single (even after a hardware modification)?

1. Consensus on this forum is that the BFL Single uses Altera FPGAs of an unknown type (Stratix?), and one would first want to determine
    the exact FPGA being used before speculating whether DSP blocks could be used to a similarly beneficial effect.
    Without knowing the exact FPGA make/model, it's far too premature to state that DSP blocks could be used there;
    maybe that particular FPGA make/model does not even have DSP blocks.

or

2. Maybe they are already using this trick, maybe that's their secret sauce which allows them to reach 830 MH/s with but two
    FPGAs.

Just my 2 cents.
bulanula
March 10, 2012, 05:25:51 PM
 #142

Quote from: Inspector 2211
BFL Single, watch out below.

What makes you think this cannot similarly be applied to the Single (even after a hardware modification)?

1. Consensus on this forum is that the BFL Single uses Altera FPGAs of an unknown type (Stratix?), and one would first want to determine
    the exact FPGA being used before speculating whether DSP blocks could be used to a similarly beneficial effect.
    Without knowing the exact FPGA make/model, it's far too premature to state that DSP blocks could be used there;
    maybe that particular FPGA make/model does not even have DSP blocks.

or

2. Maybe they are already using this trick, maybe that's their secret sauce which allows them to reach 830 MH/s with but two
    FPGAs.


Just my 2 cents.

That is what I meant. It seems this guy has found a way to speed up the hashrate using DSPs, so what is so hard to understand, Turbor and gigavps?

I was asking why BFL couldn't also do this "trick", and it is a valid question indeed. One bitstream or FPGA "trick" could likely be applied to a range of different FPGA hardware, because the basic operating principles are the same for all FPGAs.

I'm no expert, but I understand (reasonably well) how FPGAs work, and this DSP trick allows you to do 3 loops of SHA256 in the same chip (the cheap Spartan 6 ones) that previously only allowed us to do 2 loops.
Inspector 2211
March 10, 2012, 05:45:38 PM
 #143

Quote from: bulanula
That is what I meant. It seems this guy has found a way to speed up the hashrate using DSPs, so what is so hard to understand, Turbor and gigavps?

He doesn't use DSPs throughout (because there are not enough DSPs to go around), but only at the most critical spots, i.e. where the adders feed into longlines. That's the brilliance of it. That's the design idea I had completely missed before.

Quote from: bulanula
I was asking why BFL couldn't also do this "trick", and it is a valid question indeed. One bitstream or FPGA "trick" could likely be applied to a range of different FPGA hardware, because the basic operating principles are the same for all FPGAs.

Agreed.

Quote from: bulanula
I'm no expert, but I understand (reasonably well) how FPGAs work, and this DSP trick allows you to do 3 loops of SHA256 in the same chip (the cheap Spartan 6 ones) that previously only allowed us to do 2 loops.

No, using this DSP trick has nothing to do with being able to squeeze three SHA-256 instances into an FPGA.
You can do that with the plain old steam-powered ripple carry adders.
Using DSPs in a few strategic places, however, ensures that the critical path (a deadly combination of two 32-bit adder stages and one longline path) stays well below 5 ns, whereas with ripple carry adders alone it can barely achieve 5 ns.
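
The 5 ns figure maps directly onto clock frequency, because the clock period cannot be shorter than the slowest register-to-register path. A minimal sketch of that conversion follows; the function name is made up for illustration, and only the 5 ns and 4 ns values come from this thread.

```python
def max_clock_mhz(critical_path_ns: float) -> float:
    """Highest clock frequency (in MHz) that a given critical-path delay (in ns) allows."""
    if critical_path_ns <= 0:
        raise ValueError("critical path delay must be positive")
    # A period of 1 ns corresponds to 1000 MHz.
    return 1000.0 / critical_path_ns


if __name__ == "__main__":
    for delay_ns in (5.0, 4.0):
        print(f"{delay_ns} ns critical path -> {max_clock_mhz(delay_ns):.0f} MHz")
```

So shaving the critical path from 5 ns to 4 ns would be the difference between a 200 MHz and a 250 MHz clock, assuming nothing else in the design limits it first.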

BFL-Engineer
March 10, 2012, 05:45:44 PM
 #144

Quote from: Inspector 2211
  Number of DSP48A1s:                           30 out of     180   16%
Aha! Interesting. When uncle Moshe (Gavrielov) gives you DSPs, make DSPeade. ;)

Thank you for providing an important puzzle piece on how Dr. Tyrell does it.

The multiplier in the DSP48-block is not needed in SHA-256, hence what he obviously uses is the 18-bit adder
BCOUT = B + D.
He uses 30 DSP blocks, 10 per red / green / blue SHA-256 instance.
For a 32-bit adder, two 18-bit adders BCOUT=B+D are needed.
Thus, he can implement five 32-bit adders per SHA instance.

So, why not just use [slow] 32-bit ripple adders everywhere, rather than a few [very fast] DSP adders in some places?

The answer is, IMHO, that he uses the fast DSP adders only where they feed into longlines.
Were he to use normal ripple adders where he feeds into longlines, the aggregate delay would limit
the design to a 5 ns clock cycle.
Using the fast DSP adders will allow this design, when properly fine-tuned, to march into 4 ns clock cycle
territory, for a hash rate of approximately 125 MH/s per SHA-256 instance, or approximately 375 MH/s per Spartan6-150.

BFL Single, watch out below.
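
The arithmetic in the quoted post can be sketched as follows. This only illustrates the post's reasoning, not necessarily how the actual design uses its DSP blocks. Splitting the 32-bit word into two 16-bit halves is an assumption (any split that fits an 18-bit adder would do), and the function and constant names are made up.

```python
MASK16 = 0xFFFF

def add32_from_halves(a: int, b: int) -> int:
    """32-bit modular add built from two narrow adds with a carry passed between them."""
    lo = (a & MASK16) + (b & MASK16)          # low halves, result may be 17 bits wide
    carry = lo >> 16                          # carry out of the low half
    hi = ((a >> 16) & MASK16) + ((b >> 16) & MASK16) + carry
    return ((hi & MASK16) << 16) | (lo & MASK16)   # carry out of the high half is dropped


DSP_BLOCKS_USED = 30        # from the quoted resource report
SHA_INSTANCES = 3           # red / green / blue instances
DSP_PER_32BIT_ADDER = 2     # two narrow adders per 32-bit add, per the post

dsp_per_instance = DSP_BLOCKS_USED // SHA_INSTANCES
adders_per_instance = dsp_per_instance // DSP_PER_32BIT_ADDER

MHS_PER_INSTANCE = 125      # the post's estimate at a 4 ns clock cycle
mhs_per_chip = MHS_PER_INSTANCE * SHA_INSTANCES

if __name__ == "__main__":
    print(dsp_per_instance, adders_per_instance, mhs_per_chip)
    print(hex(add32_from_halves(0xFFFFFFFF, 1)))
```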



I remember nghzang mentioned that going to 200MHz on these chips was not recommended (the chips got so hot), and he gave
out a bitstream with a "Use at your own risk" warning. Three loops on the same chip suggests a far greater number of
registers is being used. Since each stage's toggle rate approaches 50% (the idea behind digest functions is that their toggle rate
must approach 50% in each stage to be effective, and that is the case in SHA256), I wonder how hot the chips will get at high
frequencies, approaching 180MHz or 190MHz...
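
The roughly 50% toggle rate can be checked in software, at least for the final digest. The sketch below hashes consecutive nonces and counts how many output bits flip from one hash to the next. It only observes the output of the double SHA-256, not the internal pipeline registers of any FPGA design, and the all-zero 76-byte header prefix is a placeholder rather than real block data.

```python
import hashlib
import struct

def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for Bitcoin block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mean_output_toggle_rate(samples: int = 2000) -> float:
    """Average fraction of the 256 digest bits that flip between consecutive nonces."""
    header_prefix = bytes(76)  # placeholder: all-zero header fields before the nonce
    prev = double_sha256(header_prefix + struct.pack("<I", 0))
    flipped = 0
    for nonce in range(1, samples + 1):
        cur = double_sha256(header_prefix + struct.pack("<I", nonce))
        diff = int.from_bytes(prev, "big") ^ int.from_bytes(cur, "big")
        flipped += bin(diff).count("1")   # number of bits that changed
        prev = cur
    return flipped / (samples * 256)

if __name__ == "__main__":
    print(f"mean toggle rate: {mean_output_toggle_rate():.3f}")
```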


Good Luck,

BF Labs Inc.  www.butterflylabs.com   -  Bitcoin Mining Hardware
Inspector 2211
March 10, 2012, 05:58:47 PM
 #145

Quote from: BFL-Engineer
I remember nghzang mentioned that going to 200MHz on these chips was not recommended (the chips got so hot), and he gave
out a bitstream with a "Use at your own risk" warning. Three loops on the same chip suggests a far greater number of
registers is being used. Since each stage's toggle rate approaches 50% (the idea behind digest functions is that their toggle rate
must approach 50% in each stage to be effective, and that is the case in SHA256), I wonder how hot the chips will get at high
frequencies, approaching 180MHz or 190MHz...

Hahaha - funny that you mention it.
You guys have found that out the hard way, haven't you?
(By the way, I have a total of 12 BFL singles on order, so I'm not anti-BFL at all.)

But you are correct and you raise a valid point.

At 200 MH/s, Dr. Tyrell's design dissipates about 8 W, so it's fair to assume that it dissipates about 12 W at 300 MH/s, which is
probably stretching the limits of a tiny 20mm x 20mm plastic chip like that. I mean, you can mount a big cooler on it,
but there is still a thermal resistance from the FPGA die to the cooler.
Maybe these devices have to be run inside a freezer to successfully achieve a consistent hash rate of 300 MH/s and beyond.
Time will tell.
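
Going from 8 W to 12 W assumes that power grows linearly with the clock (and hence hash) rate at a fixed core voltage, ignoring static power. A rough sketch of that estimate, plus the die temperature it implies, is below. The thermal resistance and ambient temperature are hypothetical placeholder values, not datasheet figures, and the function names are made up.

```python
def scaled_power_w(p_ref_w: float, rate_ref: float, rate_new: float) -> float:
    """Linear power scaling with rate (assumes constant voltage, no static power)."""
    return p_ref_w * rate_new / rate_ref

def junction_temp_c(ambient_c: float, power_w: float, theta_c_per_w: float) -> float:
    """Die temperature from ambient, dissipated power and die-to-ambient thermal resistance."""
    return ambient_c + power_w * theta_c_per_w

if __name__ == "__main__":
    p300 = scaled_power_w(8.0, 200.0, 300.0)
    # 4.0 C/W and 25 C ambient are assumed values chosen only to show the calculation.
    print(f"{p300:.1f} W -> die at about {junction_temp_c(25.0, p300, 4.0):.0f} C")
```

The thermal-resistance term is the point of the post: a bigger cooler lowers it, but the die-to-package part of it stays.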
Wandering Albatross
March 10, 2012, 08:54:32 PM
 #146

Quote from: Inspector 2211
Maybe these devices have to be run inside a freezer to successfully achieve a consistent hash rate of 300 MH/s and beyond.

Maybe a peltier device would suffice.

BTC: 1JgPAC8RVeh7RXqzmeL8xt3fvYahRXL3fP
kakobrekla
March 10, 2012, 08:56:05 PM
 #147

Quote from: Inspector 2211
Maybe these devices have to be run inside a freezer to successfully achieve a consistent hash rate of 300 MH/s and beyond.

Maybe a peltier device would suffice.

Yeah cause those are free and need no power to run!

DeepBit
March 10, 2012, 09:22:16 PM
 #148

Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)
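
The "200% efficiency heater" remark follows from an energy balance: the hot side has to shed both the heat pumped away from the chip and the electrical power fed into the module. A small sketch, assuming a coefficient of performance (COP) of 1.0, which is an illustrative value and not a measured one; the function names are made up.

```python
def peltier_electrical_w(heat_pumped_w: float, cop: float) -> float:
    """Electrical input needed to pump a given heat flow at a given COP."""
    if cop <= 0:
        raise ValueError("COP must be positive")
    return heat_pumped_w / cop

def peltier_hot_side_w(heat_pumped_w: float, cop: float) -> float:
    """Heat rejected on the hot side: pumped heat plus electrical input."""
    return heat_pumped_w + peltier_electrical_w(heat_pumped_w, cop)

def heater_efficiency_pct(heat_pumped_w: float, cop: float) -> float:
    """Hot-side heat as a percentage of the electrical input."""
    electrical = peltier_electrical_w(heat_pumped_w, cop)
    return 100.0 * peltier_hot_side_w(heat_pumped_w, cop) / electrical

if __name__ == "__main__":
    # Pumping 12 W off a chip at an assumed COP of 1.0
    print(peltier_hot_side_w(12.0, 1.0), heater_efficiency_pct(12.0, 1.0))
```

With a COP slightly below 1, the hot side gives off a little under twice the electrical input, which matches "almost 200%".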

Welcome to my bitcoin mining pool: https://deepbit.net ~ 3600 GH/s, Both payment schemes, instant payout, no invalid blocks !
Coming soon: ICBIT Trading platform
Wandering Albatross
March 10, 2012, 10:31:06 PM
 #149

Quote from: DeepBit
Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)

How would a phase-change heat pump work for cooling a chip? Is there an off-the-shelf device? Or would it be DIY?

For the Peltier, I realize power is needed and heat is also produced, but if your goal is to cool the chip it would work for that, and Peltier makers target this application with their products, e.g. micropelt.

Perhaps a better route to efficiency is to generate your own cheaper power and to produce less heat by running miners at a lower frequency.

TheSeven
March 10, 2012, 10:48:35 PM
 #150

Quote from: DeepBit
Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)

Quote from: Wandering Albatross
How would a phase-change heat pump work for cooling a chip? Is there an off-the-shelf device? Or would it be DIY?

For the Peltier, I realize power is needed and heat is also produced, but if your goal is to cool the chip it would work for that, and Peltier makers target this application with their products, e.g. micropelt.

Perhaps a better route to efficiency is to generate your own cheaper power and to produce less heat by running miners at a lower frequency.

What about just using a BFL single instead of wasting half the electricity on cooling?

My tip jar: 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY
DeepBit
March 10, 2012, 10:49:04 PM
 #151

Quote from: DeepBit
Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)

Quote from: Wandering Albatross
How would a phase-change heat pump work for cooling a chip? Is there an off-the-shelf device? Or would it be DIY?

Any good solution for that will be some kind of DIY; otherwise it will be either unsuitable or expensive.

For phase-change I would use two loops: one with ordinary liquid passing through many special waterblocks, and a second with freon for cooling the first loop (if we need lower-than-ambient temps, of course).

Gomeler
March 11, 2012, 12:11:38 AM
 #152

Very interesting thread, and I can actually chime in on this conversation about using a phase-change system to remove heat. If you're talking about a traditional single-stage gas system like in your refrigerator, then forget about it. The piping required plus the MINUSCULE load for each cold-head would make this cost-prohibitive. Your best bet would be to repurpose a mini-fridge, use a proper condenser, throw in a TXV for refrigerant metering, use something like r134a or n-butane/iso-butane, and aim for evaporator temperatures in the 0-20 Celsius range. Then stick with your dinky little heatsinks and fans and don't worry about having to mill expensive evaporators for such a small heatload.

Mini-fridges or even something like a deep-chest freezer would be the perfect insulated box to work with. The issue is that such systems are designed to remove the heat from a load that doesn't generate additional heat. Without modification you will kill a freezer/fridge. That's where the replacement condenser and TXV come into play. Make the compressor happy and you'll have shockingly low compressor loads and could very well run these FPGAs at astonishing speeds.

That all being said, compressed gases are fun but can easily explode in your face with dire consequences if you aren't careful. There are plenty of forums out there for amateur refrigeration. Take a gander at some of the things people have made and consider the tool costs. My own set of tools and gases would buy a number of FPGAs and likely make me more money in the process :D

eldentyrell
March 20, 2012, 01:59:09 AM
 #153

 Number of DSP48A1s:                           30 out of     180   16%
Aha! Interesting. When uncle Moshe (Gavrielov) gives you DSPs, make DSPeade. ;)

This isn't my "secret sauce", but it is unique to my design.  When I run out of SRL16s in the places where I need them, I use the DSP48s as 16-bit-wide, 6-bit-deep FIFOs.  Useful trick.

The printing press heralded the end of the Dark Ages and made the Enlightenment possible, but it took another three centuries before any country managed to put freedom of the press beyond the reach of legislators.  So it may take a while before cryptocurrencies are free of the AML-NSA-KYC surveillance plague.
eldentyrell
March 20, 2012, 02:00:55 AM
 #154

The multiplier in the DSP48-block is not needed in SHA-256, hence what he obviously uses is the 18-bit adder
BCOUT = B + D.

Nah; I use the DSP48s as big fat FIFOs; they have lots of registers inside and if you configure them right everything's a no-op.

rjk
March 20, 2012, 02:01:58 AM
 #155

The multiplier in the DSP48-block is not needed in SHA-256, hence what he obviously uses is the 18-bit adder
BCOUT = B + D.

Nah; I use the DSP48s as big fat FIFOs; they have lots of registers inside and if you configure them right everything's a no-op.

Interesting use case; so essentially no added latency using them this way?

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
eldentyrell
March 20, 2012, 02:02:32 AM
 #156

Quote from: Inspector 2211
BFL Single, watch out below.
What makes you think this cannot similarly be applied to the Single (even after a hardware modification)?

The BFL single definitely isn't a Spartan 6.

BTW, I will offer a 10BTC bounty to anybody who posts the JTAG IDCODE readout from the BFL single -- merely to satisfy my curiosity.  There was a JTAG header on the last PCB I saw them post.
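
For anyone who does get a readout: a JTAG IDCODE is a 32-bit value whose fields are fixed by IEEE 1149.1 (bit 0 is always 1, bits 11..1 are the JEDEC manufacturer ID, bits 27..12 the part number, bits 31..28 the version). A minimal decoding sketch follows; the function name is made up, the manufacturer table lists only two vendors, and the example value is purely illustrative, not a readout from any BFL board.

```python
# JEDEC manufacturer IDs as they appear in bits 11..1 of an IDCODE (partial table)
KNOWN_MANUFACTURERS = {0x049: "Xilinx", 0x06E: "Altera"}

def decode_idcode(idcode: int) -> dict:
    """Split a 32-bit JTAG IDCODE into its IEEE 1149.1 fields."""
    if not 0 <= idcode <= 0xFFFFFFFF:
        raise ValueError("IDCODE must fit in 32 bits")
    if idcode & 1 == 0:
        raise ValueError("bit 0 of a valid IDCODE is always 1")
    manufacturer_id = (idcode >> 1) & 0x7FF
    return {
        "version": (idcode >> 28) & 0xF,
        "part_number": (idcode >> 12) & 0xFFFF,
        "manufacturer_id": manufacturer_id,
        "manufacturer": KNOWN_MANUFACTURERS.get(manufacturer_id, "unknown"),
    }

if __name__ == "__main__":
    # Illustrative value only
    print(decode_idcode(0x0401D093))
```

The manufacturer field alone would settle the Altera-versus-Xilinx question; identifying the exact device would need the vendor's part-number tables.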

eldentyrell
March 20, 2012, 02:08:09 AM
 #157

Is there a way to port it to Ztex or other FPGA boards?

Yes.

eldentyrell
March 20, 2012, 02:09:34 AM
 #158

Sorry for falling off the radar there.  Real life, quality time with git bisect, and some voltage drop issues on my own boards conspired to slow things down the last week or so.

c_k
March 20, 2012, 05:42:02 AM
 #159

Ooh, new speed update! What do we have to do to encourage you to make this available for everyone to use?

Even if only as binary blobs initially.

X6500s would be awesome with this ;)

catfish
March 20, 2012, 01:00:36 PM
 #160

I can see Dr Tyrell's concerns about getting paid resulting in this bitstream being a big tease (and never released or sold).

If Dr Tyrell had a large capital sum to invest in his own bitcoin mining operation then I don't see why this thread exists at all, since he'd keep it to himself and his large array of LX150s. After all, with a lot of FPGA units and a roughly 1.5x advantage over the open-source bitstream (which, on the Ztex units at least, is still competitive for those with high electricity costs), all you need is an assembly design and appropriate cooling - and you have a steady, respectable income stream with negligible running costs.

I would understand if there was a credible way of selling the bitstream to FPGA miners without any risk of piracy, too. I am an aspiring small-size miner (9 Ghash GPU capacity, being replaced in a few weeks by 5 Ghash of FPGAs, with the plan to decommission / sell all GPU rigs, and continue scaling up the FPGAs) - and FWIW I'm a happy Ztex customer. Hence I will soon have over 25 LX150 boards and it'd be trivial to use my business-case spreadsheet to work out how much I'd be prepared to pay for a bitstream that converts my 208 MH/s units into 300-odd MH/s units.

However, unless there are miners out there with tens of thousands of LX150-based boards already running, I can't see any single miner paying enough to 'compensate' for the number of hours development involved, assuming Dr Tyrell is looking at private-sector consultancy rates and how much he'd have been paid if he had developed this bitstream for a private client.

As it stands, I'd obviously benefit from a 300 MH/s bitstream for the LX150 so I'd be interested in 'buying' it - but since there is already a 208 MH/s bitstream for *free*, the value of the incremental speed hike (and concomitant increase in mining income) is necessarily limited. Even if I was prepared to pay a full year's worth of *all* mining income attributed to the increased FPGA hashing speed, it wouldn't be in 'professional consultancy' ballpark figures.
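
The ceiling described above is easy to put into a spreadsheet-style calculation. The sketch below uses the 208 MH/s and 300 MH/s figures and the 25-board count from this post. The income per MH/s per year is a hypothetical placeholder (in whatever currency you like), and the function names are made up.

```python
def extra_hashrate_mhs(boards: int, old_mhs: float, new_mhs: float) -> float:
    """Additional hash rate gained across all boards from the faster bitstream."""
    return boards * (new_mhs - old_mhs)

def price_ceiling(boards: int, old_mhs: float, new_mhs: float,
                  income_per_mhs_year: float, years: float = 1.0) -> float:
    """Most a miner could pay before the faster bitstream stops paying for itself."""
    return extra_hashrate_mhs(boards, old_mhs, new_mhs) * income_per_mhs_year * years

if __name__ == "__main__":
    gain = 300.0 / 208.0 - 1.0
    print(f"relative speed gain: {gain:.1%}")
    print(f"extra hash rate: {extra_hashrate_mhs(25, 208.0, 300.0):.0f} MH/s")
    # 0.5 per MH/s per year is an assumed figure for illustration only
    print(f"price ceiling: {price_ceiling(25, 208.0, 300.0, 0.5):.0f}")
```

Only the 92 MH/s per board on top of the free bitstream counts toward the price, which is why the ceiling stays small for any single miner.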


The question is whether Dr Tyrell wants to make some money from this, or not at all. Right now it's a bit of a tease, but because it shows that three instances per LX150 *is* possible, it's probable that the open-source effort will, eventually, squeeze enough hints from the thread to build their own version. Of course, this just slows down the open-source process by a few months, during which time Dr Tyrell will have had time to further optimise his design... but once there is an open-source bitstream running 3 instances (at around 300 MH/s), the window for Dr Tyrell to make money from his bitstream closes.

There's no way that cautious FPGA miners will send their hundreds-of-boards investment to someone for encrypted bitstreams to be loaded out of their sight. If a unit fails, and needs re-programming, then it'd need to be sent back to Dr Tyrell and we're not all in the USA. If the only way to prevent 'theft' of the bitstream would be to lock the FPGA so it can't be used for other purposes (or reprogrammed) then, again, no miner would agree to this (since standard FPGA boards still have value if Bitcoin crashes, and can be resold to their original market - whereas hardcoded Bitcoin-specific devices cannot).


Hence I'm a bit confused by this thread. I'm willing to pay for a 300 MH/s bitstream, but the price can't be much more than the actual difference in income between a *free* 208 MH/s bitstream and Dr Tyrell's. If the assumption is that the first sale of unencrypted code will result in piracy, then Dr Tyrell needs ONE big customer... and how many FPGA miners have scaled up to thousands of units yet? This big customer also has to be found *before* the open-source effort finds out how to replicate the design.

So - is there a 'group buy' process in place yet? I'm happy to club together with other multi-FPGA miners to get a *significantly* faster bitstream... but if the open-source guys figure out the 'tricks' then I'll just use that... certainly this thread has given the chances of Dr Tyrell's work actually being 'independently discovered' a big boost, due to the information he's supplying.

...so I give in to the rhythm, the click click clack
I'm too wasted to fight back...


BTC: 1A7HvdGGDie3P5nDpiskG8JxXT33Yu6Gct