Bitcoin Forum
Author Topic: Algorithmically placed FPGA miner: 255MH/s/chip, supports all known boards  (Read 119415 times)
Inspector 2211
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250



View Profile
March 10, 2012, 05:23:58 PM
 #141

Quote from: Inspector 2211
BFL Single, watch out below.

What makes you think this cannot similarly be applied to the Single (even after a hardware modification)?

1. The consensus on this forum is that the BFL Single uses Altera FPGAs of an unknown type (Stratix?), and one would first want to determine
    the exact FPGA being used before speculating whether DSP blocks could be used to a similarly beneficial effect.
    Without knowing the exact FPGA make/model, it's premature to state that DSP blocks could be used there;
    maybe that particular FPGA make/model does not even have DSP blocks.

or

2. Maybe they are already using this trick, maybe that's their secret sauce which allows them to reach 830 MH/s with but two
    FPGAs.

Just my 2 cents.

bulanula
Hero Member
*****
Offline Offline

Activity: 518
Merit: 500



View Profile
March 10, 2012, 05:25:51 PM
 #142

Quote from: Inspector 2211
BFL Single, watch out below.

What makes you think this cannot similarly be applied to the Single (even after a hardware modification)?

1. The consensus on this forum is that the BFL Single uses Altera FPGAs of an unknown type (Stratix?), and one would first want to determine
    the exact FPGA being used before speculating whether DSP blocks could be used to a similarly beneficial effect.
    Without knowing the exact FPGA make/model, it's premature to state that DSP blocks could be used there;
    maybe that particular FPGA make/model does not even have DSP blocks.

or

2. Maybe they are already using this trick, maybe that's their secret sauce which allows them to reach 830 MH/s with but two
    FPGAs.


Just my 2 cents.

That is what I meant. It seems this guy has found a way to speed up the hash rate using DSPs, so what is so hard to understand, Turbor and gigavps?

I was asking why BFL couldn't also use this "trick", which is a valid question. A bitstream or FPGA "trick" could likely be applied across a range of different FPGA hardware, because the basic operating principles are the same for all FPGAs.

I'm no expert, but I understand (reasonably well) how FPGAs work, and this DSP trick allows you to do 3 loops of SHA-256 in the same chip (the cheap Spartan 6 ones) that previously only allowed us to do 2 loops.
Inspector 2211
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250



View Profile
March 10, 2012, 05:45:38 PM
 #143

That is what I meant. It seems this guy has found a way to speed up the hash rate using DSPs, so what is so hard to understand, Turbor and gigavps?

He doesn't use DSPs throughout (because there are not enough DSPs to go around), but only at the most critical spots, i.e. where the adders feed into longlines. That's the brilliance of it. That's the design idea I had completely missed before.

I was asking why BFL couldn't also use this "trick", which is a valid question. A bitstream or FPGA "trick" could likely be applied across a range of different FPGA hardware, because the basic operating principles are the same for all FPGAs.

Agreed.

I'm no expert, but I understand (reasonably well) how FPGAs work, and this DSP trick allows you to do 3 loops of SHA-256 in the same chip (the cheap Spartan 6 ones) that previously only allowed us to do 2 loops.

No, using this DSP trick has nothing to do with being able to squeeze three SHA-256 instances into an FPGA.
You can do that with plain old steam-powered ripple-carry adders.
Using DSPs in a few strategic places, however, ensures that the critical path (a deadly combination of two 32-bit adder stages and one longline path) stays well below 5 ns, whereas with ripple-carry adders it can barely achieve 5 ns.
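
The timing argument above comes down to simple arithmetic: the clock period can be no shorter than the slowest register-to-register path. A minimal sketch, where the per-stage delays are hypothetical placeholders (not measured Spartan-6 figures), chosen only so that the ripple-carry path lands at the 5 ns mentioned above:

```python
def max_clock_mhz(path_delays_ns):
    """Return the highest clock frequency (MHz) a critical path allows.

    path_delays_ns: delays (in ns) of the elements on the path,
    e.g. two adder stages followed by one long routing line.
    """
    period_ns = sum(path_delays_ns)
    return 1000.0 / period_ns


# Hypothetical delays in ns: [adder stage, adder stage, longline].
ripple_path = [1.8, 1.8, 1.4]  # assumed slow ripple-carry adders
fast_path = [1.2, 1.2, 1.4]    # assumed faster adders, same longline

print(f"ripple-carry path: {max_clock_mhz(ripple_path):.0f} MHz")
print(f"faster-adder path: {max_clock_mhz(fast_path):.0f} MHz")
```

Only the adder delays differ between the two lists; the longline delay is fixed, which is why shaving the adders matters most where they feed a longline.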


BFL-Engineer
Full Member
***
Offline Offline

Activity: 227
Merit: 100



View Profile WWW
March 10, 2012, 05:45:44 PM
 #144

Quote from: Inspector 2211
  Number of DSP48A1s:                           30 out of     180   16%
Aha! Interesting. When uncle Moshe (Gavrielov) gives you DSPs, make DSPeade. ;)

Thank you for providing an important puzzle piece on how Dr. Tyrell does it.

The multiplier in the DSP48 block is not needed in SHA-256, hence what he obviously uses is the 18-bit adder
BCOUT = B + D.
He uses 30 DSP blocks, 10 per red / green / blue SHA-256 instance.
For a 32-bit adder, two 18-bit adders (BCOUT = B + D) are needed.
Thus, he can implement five 32-bit adders per SHA-256 instance.

So, why not just use [slow] 32-bit ripple adders everywhere, and use a few [very fast] DSP adders in some places?

The answer is, IMHO, that he uses the fast DSP adders only where they feed into longlines.
Were he to use normal ripple adders where he feeds into longlines, the aggregate delay would limit
the design to a 5 ns clock cycle.
Using the fast DSP adders will allow this design, when properly fine-tuned, to march into 4 ns clock cycle
territory, for approximately 125 MH/s per SHA-256 instance, or approximately 375 MH/s per Spartan6-150.

BFL Single, watch out below.
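
The DSP arithmetic in the quoted post can be checked with a few lines. This is only a sketch of that reasoning, not of the actual design: `add32_split` is a hypothetical helper that builds a 32-bit add from two narrower adds with a carry between them (16-bit halves here for simplicity, where the post assumes the DSP48's 18-bit adder), and the DSP counts are the ones quoted above.

```python
MASK32 = 0xFFFFFFFF


def add32_split(a, b):
    """Add two 32-bit words modulo 2**32 using two 16-bit partial adds."""
    a &= MASK32
    b &= MASK32
    lo = (a & 0xFFFF) + (b & 0xFFFF)      # low half, may overflow into bit 16
    carry = lo >> 16                      # carry out of the low half
    hi = (a >> 16) + (b >> 16) + carry    # high half plus carry in
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


# Figures taken from the post: 30 DSP blocks, 3 SHA-256 instances,
# 2 narrow adders per 32-bit adder, 4 ns target clock period.
dsp_total = 30
instances = 3
dsp_per_adder = 2
adders_per_instance = dsp_total // instances // dsp_per_adder
clock_mhz = 1000 / 4

print(adders_per_instance, "32-bit adders per instance")
print(clock_mhz, "MHz at a 4 ns period")
```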



I remember nghzang mentioned that running the chips at 200 MHz was not recommended (the chips got too hot), and he gave
out a bitstream with a "Use at your own risk" warning. Three loops on the same chip suggest that a far greater number of
registers is being used. Since the toggle rate of each stage approaches 50% (the idea behind digest functions is that their toggle rate
must approach 50% in each stage to be effective, and that is the case in SHA-256), I wonder how hot the chips will get at high
frequencies, approaching 180 MHz or 190 MHz...
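
The 50% figure can be illustrated in software, with a caveat: the sketch below measures how many bits of the final double-SHA-256 digest flip between consecutive nonces, which is only a proxy for the toggle rate of the internal pipeline registers on an FPGA. The 76-byte all-zero prefix is a made-up stand-in for a block header, and `toggle_rate` is a hypothetical helper name.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Bitcoin-style hash: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def toggle_rate(header_prefix: bytes, nonces: int) -> float:
    """Fraction of digest bits that flip between consecutive nonces."""
    prev = None
    flipped = 0
    total = 0
    for nonce in range(nonces):
        digest = double_sha256(header_prefix + struct.pack("<I", nonce))
        if prev is not None:
            diff = int.from_bytes(prev, "big") ^ int.from_bytes(digest, "big")
            flipped += bin(diff).count("1")
            total += 256  # a SHA-256 digest has 256 bits
        prev = digest
    return flipped / total


if __name__ == "__main__":
    rate = toggle_rate(b"\x00" * 76, 2000)
    print(f"average toggle rate: {rate:.3f}")
```

A rate close to 0.5 means roughly half of the bits change on every step, which is the worst case for dynamic power in a register.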


Good Luck,

BF Labs Inc.  www.butterflylabs.com   -  Bitcoin Mining Hardware
Inspector 2211
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250



View Profile
March 10, 2012, 05:58:47 PM
Last edit: March 10, 2012, 11:41:39 PM by Inspector 2211
 #145

Quote from: BFL-Engineer
I remember nghzang mentioned that running the chips at 200 MHz was not recommended (the chips got too hot), and he gave
out a bitstream with a "Use at your own risk" warning. Three loops on the same chip suggest that a far greater number of
registers is being used. Since the toggle rate of each stage approaches 50% (the idea behind digest functions is that their toggle rate
must approach 50% in each stage to be effective, and that is the case in SHA-256), I wonder how hot the chips will get at high
frequencies, approaching 180 MHz or 190 MHz...

Hahaha - funny that you mention it.
You guys have found that out the hard way, haven't you?
(By the way, I have a total of 12 BFL singles on order, so I'm not anti-BFL at all.)

But you are correct and you raise a valid point.

At 200 MH/s, Dr. Tyrell's design dissipates about 8 W, so it's fair to assume that it dissipates about 12 W at 300 MH/s (scaling linearly with clock rate), which is
probably stretching the limits of a tiny 20 mm x 20 mm plastic-packaged chip like that. I mean, you can mount a big cooler on it,
but there is still a thermal resistance from the FPGA die to the cooler.
Maybe these devices have to be run inside a freezer to successfully achieve a consistent hash rate of 300 MH/s and beyond.
Time will tell.
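
The estimate above is a linear extrapolation, and the cooling worry follows from the usual thermal-resistance relation. A rough sketch, assuming dynamic power scales linearly with hash rate at a fixed core voltage; the ambient temperature and the die-to-air thermal resistance below are hypothetical values picked for illustration, not datasheet figures:

```python
def scaled_power_w(p_ref_w, rate_ref, rate_new):
    """Linearly scale power from a reference hash rate to a new one."""
    return p_ref_w * rate_new / rate_ref


def junction_temp_c(ambient_c, power_w, theta_ja_c_per_w):
    """Die temperature: ambient plus power times thermal resistance."""
    return ambient_c + power_w * theta_ja_c_per_w


p_300 = scaled_power_w(8.0, 200, 300)  # 8 W at 200 MH/s, from the post
print(f"estimated power at 300 MH/s: {p_300:.1f} W")

# Hypothetical: 25 C ambient and 5 C/W from die to air with a cooler fitted.
print(f"estimated die temperature: {junction_temp_c(25.0, p_300, 5.0):.0f} C")
```

Lowering the ambient temperature (the freezer idea) shifts the whole result down, while a bigger cooler only reduces the thermal-resistance term.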

Wandering Albatross
Member
**
Offline Offline

Activity: 70
Merit: 10



View Profile
March 10, 2012, 08:54:32 PM
 #146

Quote from: Inspector 2211
Maybe these devices have to be run inside a freezer to successfully achieve a consistent hash rate of 300 MH/s and beyond.

Maybe a peltier device would suffice.

BTC: 1JgPAC8RVeh7RXqzmeL8xt3fvYahRXL3fP
kakobrekla
Hero Member
*****
Offline Offline

Activity: 714
Merit: 500


Psi laju, karavani prolaze.


View Profile
March 10, 2012, 08:56:05 PM
 #147

Quote from: Inspector 2211
Maybe these devices have to be run inside a freezer to successfully achieve a consistent hash rate of 300 MH/s and beyond.

Maybe a peltier device would suffice.

Yeah cause those are free and need no power to run!

DeepBit
Donator
Hero Member
*
Offline Offline

Activity: 532
Merit: 501


We have cookies


View Profile WWW
March 10, 2012, 09:22:16 PM
 #148

Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)
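
The "200% efficiency heater" remark follows from an energy balance: the hot side of a Peltier module has to get rid of the heat it pumps plus the electrical power it consumes. A small sketch, assuming a coefficient of performance (COP) of 1.0, which is an illustrative value and not a figure for any particular module:

```python
def peltier_hot_side_w(heat_pumped_w, cop):
    """Return (electrical input, heat rejected on the hot side) in watts.

    cop is the coefficient of performance: heat pumped / electrical input.
    """
    electrical_w = heat_pumped_w / cop
    return electrical_w, heat_pumped_w + electrical_w


# Pumping 12 W away from a chip with an assumed COP of 1.0.
electrical, rejected = peltier_hot_side_w(12.0, 1.0)
print(f"electrical input: {electrical:.0f} W, hot side must shed: {rejected:.0f} W")
print(f"heat out per watt in: {rejected / electrical:.0%}")
```

With a COP of 1.0 the module doubles the heat that the final heatsink has to handle; a lower COP makes it worse.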

Welcome to my bitcoin mining pool: https://deepbit.net ~ 3600 GH/s, Both payment schemes, instant payout, no invalid blocks !
Coming soon: ICBIT Trading platform
Wandering Albatross
Member
**
Offline Offline

Activity: 70
Merit: 10



View Profile
March 10, 2012, 10:31:06 PM
 #149

Quote from: DeepBit
Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)

How would a phase-change heat pump work for cooling a chip? Is there an off-the-shelf device? Or would it be DIY?

For the Peltier, I realize that power is needed and heat is also produced, but if your goal is to cool the chip it would work for that, and Peltier makers target this application with their products, e.g. micropelt.

Perhaps a better route to efficiency is to generate your own cheaper power and to create less heat by running the miners at a lower frequency.

TheSeven
Hero Member
*****
Offline Offline

Activity: 504
Merit: 500


FPGA Mining LLC


View Profile WWW
March 10, 2012, 10:48:35 PM
 #150

Quote from: DeepBit
Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)

How would a phase-change heat pump work for cooling a chip? Is there an off-the-shelf device? Or would it be DIY?

For the Peltier, I realize that power is needed and heat is also produced, but if your goal is to cool the chip it would work for that, and Peltier makers target this application with their products, e.g. micropelt.

Perhaps a better route to efficiency is to generate your own cheaper power and to create less heat by running the miners at a lower frequency.

What about just using a BFL Single instead of wasting half the electricity on cooling?

My tip jar: 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY
DeepBit
Donator
Hero Member
*
Offline Offline

Activity: 532
Merit: 501


We have cookies


View Profile WWW
March 10, 2012, 10:49:04 PM
 #151

Quote from: DeepBit
Peltiers are cool and handy, but for big mining operations phase-change heat pumps will be FAR more efficient.
Remember that Peltier modules consume a lot of current and act almost as 200% efficiency heaters on the other side :)
How would a phase-change heat pump work for cooling a chip? Is there an off-the-shelf device? Or would it be DIY?

Any good solution for that will be some kind of DIY; otherwise it will be either unsuitable or expensive.

For phase-change I would use two loops: one with ordinary liquid passing through many special waterblocks, and a second with freon for cooling the first loop (if we need below-ambient temps, of course).

Gomeler
Hero Member
*****
Offline Offline

Activity: 697
Merit: 500



View Profile
March 11, 2012, 12:11:38 AM
 #152

Very interesting thread, and I can actually chime in on this conversation about using a phase-change system to remove heat. If you're talking about a traditional single-stage gas system like the one in your refrigerator, then forget about it. The piping required plus the MINUSCULE load for each cold-head would make this cost-prohibitive. Your best bet would be to repurpose a mini-fridge, use a proper condenser, add a TXV for refrigerant metering, use something like R-134a or n-butane/isobutane, and aim for evaporator temperatures in the 0-20 Celsius range. Then stick with your dinky little heatsinks and fans and don't worry about having to mill expensive evaporators for such a small heat load.

Mini-fridges or even something like a deep-chest freezer would be the perfect insulated box to work with. The issue is that such systems are designed to remove the heat from a load that doesn't generate additional heat. Without modification you will kill a freezer/fridge. That's where the replacement condenser and TXV come into play. Make the compressor happy and you'll have shockingly low compressor loads and could very well run these FPGAs at astonishing speeds.

That all being said, compressed gases are fun but can easily explode in your face with dire consequences if you aren't careful. There are plenty of forums out there for amateur refrigeration. Take a gander at some of the things people have made and consider the tool costs. My own set of tools and gases would buy a number of FPGAs and likely make me more money in the process :D
eldentyrell (OP)
Donator
Legendary
*
Offline Offline

Activity: 980
Merit: 1004


felonious vagrancy, personified


View Profile WWW
March 20, 2012, 01:59:09 AM
Last edit: March 20, 2012, 02:37:17 AM by eldentyrell
 #153

 Number of DSP48A1s:                           30 out of     180   16%
Aha! Interesting. When uncle Moshe (Gavrielov) gives you DSPs, make DSPeade. ;)

This isn't my "secret sauce", but it is unique to my design.  When I run out of SRL16s in the places where I need them, I use the DSP48's as 32-bit-wide 16-bit-wide, 6-bit-deep FIFOs.  Useful trick.

The printing press heralded the end of the Dark Ages and made the Enlightenment possible, but it took another three centuries before any country managed to put freedom of the press beyond the reach of legislators.  So it may take a while before cryptocurrencies are free of the AML-NSA-KYC surveillance plague.
eldentyrell (OP)
Donator
Legendary
*
Offline Offline

Activity: 980
Merit: 1004


felonious vagrancy, personified


View Profile WWW
March 20, 2012, 02:00:55 AM
 #154

The multiplier in the DSP48-block is not needed in SHA-256, hence what he obviously uses is the 18-bit adder
BCOUT = B + D.

Nah; I use the DSP48s as big fat FIFOs; they have lots of registers inside and if you configure them right everything's a no-op.
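
As a behavioural illustration of the idea (not of the DSP48 configuration itself, which is not described here), a chain of registers that passes data through unchanged acts as a fixed-latency delay line. In this sketch `DelayLine` is a hypothetical model, and the width and depth passed to it are example values only:

```python
from collections import deque


class DelayLine:
    """Model of a register chain: data comes out `depth` clocks after it goes in."""

    def __init__(self, depth, width_bits):
        self.mask = (1 << width_bits) - 1
        # All registers start at zero; the oldest entry sits at index 0.
        self.regs = deque([0] * depth, maxlen=depth)

    def clock(self, value):
        """Advance one clock: return the oldest word and shift in a new one."""
        out = self.regs[0]
        self.regs.append(value & self.mask)  # maxlen drops the oldest entry
        return out


line = DelayLine(depth=3, width_bits=16)
print([line.clock(v) for v in [1, 2, 3, 4, 5]])
```

In this model the data is never modified, only delayed by a fixed number of clocks, which is the behaviour a pipeline needs when one operand has to wait for another.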

rjk
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250


1ngldh


View Profile
March 20, 2012, 02:01:58 AM
 #155

The multiplier in the DSP48-block is not needed in SHA-256, hence what he obviously uses is the 18-bit adder
BCOUT = B + D.

Nah; I use the DSP48s as big fat FIFOs; they have lots of registers inside and if you configure them right everything's a no-op.
Interesting use case; so essentially no added latency using them this way?

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
eldentyrell (OP)
Donator
Legendary
*
Offline Offline

Activity: 980
Merit: 1004


felonious vagrancy, personified


View Profile WWW
March 20, 2012, 02:02:32 AM
 #156

Quote from: Inspector 2211
BFL Single, watch out below.
What makes you think this cannot similarly be applied to the Single (even after a hardware modification)?

The BFL single definitely isn't a Spartan 6.

BTW, I will offer a 10BTC bounty to anybody who posts the JTAG IDCODE readout from the BFL single -- merely to satisfy my curiosity.  There was a JTAG header on the last PCB I saw them post.
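
For anyone who does read the IDCODE, the 32-bit word splits into standard IEEE 1149.1 fields, and the manufacturer field alone would settle the Xilinx-versus-Altera question. A minimal decoding sketch; `decode_idcode` is a hypothetical helper, the lookup table covers only the two vendors discussed in this thread, and the sample words at the bottom are illustrative inputs chosen for their manufacturer fields, not readouts from a BFL Single:

```python
# JEDEC manufacturer codes as they appear in bits 11..1 of a JTAG IDCODE.
JEDEC_MANUFACTURERS = {0x049: "Xilinx", 0x06E: "Altera"}


def decode_idcode(idcode):
    """Split a 32-bit JTAG IDCODE into version, part number and manufacturer."""
    if idcode & 1 != 1:
        raise ValueError("bit 0 of a valid IDCODE is always 1")
    manufacturer = (idcode >> 1) & 0x7FF   # bits 11..1
    part_number = (idcode >> 12) & 0xFFFF  # bits 27..12
    version = (idcode >> 28) & 0xF         # bits 31..28
    return {
        "version": version,
        "part_number": part_number,
        "manufacturer_id": manufacturer,
        "manufacturer": JEDEC_MANUFACTURERS.get(manufacturer, "unknown"),
    }


for word in (0x0401D093, 0x020F30DD):
    print(hex(word), decode_idcode(word))
```

The part-number field would then have to be looked up in the vendor's documentation to identify the exact device.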

eldentyrell (OP)
Donator
Legendary
*
Offline Offline

Activity: 980
Merit: 1004


felonious vagrancy, personified


View Profile WWW
March 20, 2012, 02:08:09 AM
 #157

Is there a way to port it to Ztex or other FPGA boards?

Yes.

eldentyrell (OP)
Donator
Legendary
*
Offline Offline

Activity: 980
Merit: 1004


felonious vagrancy, personified


View Profile WWW
March 20, 2012, 02:09:34 AM
 #158

Sorry for falling off the radar there.  Real life, quality time with git bisect, and some voltage drop issues on my own boards conspired to slow things down the last week or so.

c_k
Donator
Full Member
*
Offline Offline

Activity: 242
Merit: 100



View Profile
March 20, 2012, 05:42:02 AM
 #159

Ooh, new speed update! What do we have to do to encourage you to make this available for everyone to use?

Even if only as binary blobs initially.

X6500s would be awesome with this ;)

Energizer
Sr. Member
****
Offline Offline

Activity: 273
Merit: 250



View Profile
March 20, 2012, 01:49:25 PM
 #160

I totally agree with you, catfish! You can count me in on such a club! But first of all, Dr. Tyrell should tell us whether or not he is willing to sell his bitstream to us.

And in case he is not, we may then fund the open-source project to speed up its development!