Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013)

jonand

Newbie

Activity: 12
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 13, 2011, 09:44:45 PM

#461

Makomk's sha256-transform.v improved the verilog-xilinx port a bit. unroll=2 is now working on the sp605 board with at least a 63MHz hash_clk.

100 shares found and speed seems to stabilize somewhere in the 15Mh/s region.

Anyone else running LX45?

Keninishna

Hero Member

Activity: 556
Merit: 500

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 13, 2011, 10:05:07 PM

#462

I'm interested in getting into FPGAs, but where can I source a Spartan-6 LX150 cheaply? A comment on the hardware comparison site says:

Quote

3N 484-pin chip is ~$150, 0.67Mhash/$

NF6X

Member

Activity: 98
Merit: 10

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 13, 2011, 11:06:38 PM

#463

Quote from: Keninishna on August 13, 2011, 10:05:07 PM

I'm interested in getting into FPGAs, but where can I source a Spartan-6 LX150 cheaply? A comment on the hardware comparison site says:

Quote

3N 484-pin chip is ~$150, 0.67Mhash/$

That's about what they cost in single quantity, and as FPGAs of that size go, that is cheap. That price is just for the chip; it would need to be soldered onto a suitable board to be usable. It's in a 484-pin ball grid array package with 1mm ball pitch, which is a bit too advanced for most hobbyists to solder down themselves. Unfortunately, off-the-shelf development boards with LX150 chips on them are pretty expensive, partly because they support features of the device which add substantial cost to the board (such as having lots of layers in order to route out all 484 pins) that we don't need for the mining application. I don't have the link handy or remember the manufacturer's name at the moment, but the cheapest off-the-shelf LX150-based board that I've seen recently costs around $600-$700 if I recall correctly.

There's another thread in which folks are working on a fairly low-cost LX150-based board that's optimized for mining:

https://bitcointalk.org/index.php?topic=22426.0

NF6X

Member

Activity: 98
Merit: 10

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 01:29:15 AM

#464

I found the Spartan 6 LX150 boards that I mentioned previously:

http://www.hdl.co.jp/en/index.php/xilinx-series1/spartan-6.html

They start at around $700 for LX150 boards (less for smaller parts). I don't know of any cheaper off-the-shelf Spartan 6 LX150-based boards right now, but I'd love to hear about any.

iidx

Newbie

Activity: 35
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 02:57:42 AM

#465

Hi Guys,

I had a bunch of spare stuff lying around at work, so I whipped up a mining configuration as an exercise.

Supplies:
6 Xilinx ML605 cards (XC6VLX240T, Virtex 6)
PCIe switch development kit (an external board that connects to your PC and has a ton of PCIe slots)
TI USB to GPIO pod (for the ML605 power supplies)

Starting with the Xilinx Verilog port of the code, I found that I could fit 2 instances of the LOOP = 0 core. However, that wasn't enough for me. I figured that if I used the DSP48s in the Virtex 6, I could fit at least 1 more in there. With the 3 cores, I am using about 558 of the hardware multipliers. Sadly, there aren't enough left to fit a 4th core. I might be able to squeeze it in if each core used a few fewer hardware multipliers, but it would be tight. I ended up running the cores at 125 MHz, because that's the same speed my internal PCIe interface runs at. There is more headroom available, and 150 MHz is probably doable, but power would start to become a concern.

It turned out that the ML605's power supplies could supply enough power to run 3 cores, but the digital power managers were not set to allow the full rated current. I re-programmed the power managers to allow the rated current in order to get the 3-core version of my design working. The designs draw about 16-18 W, with 16-17 A on the VCCint rail (1 V). I had to supply additional cooling to make sure the power supplies didn't overheat (they have no cooling normally).

Next, I had to connect it to my PC for mining data. Now, of course, 6 serial ports weren't going to be the most elegant solution (and my PC actually had no serial ports). I used an off-the-shelf PCIe core in conjunction with the Xilinx hard IP to connect the 3 bitcoin cores to the PC. Sadly, the PCIe core is a licensed product, so I won't be able to share the source here.

The hardest part for me came last: I had to figure out what data to get from a mining pool, what to do with it, and how to get it into the card. I found some open source C# mining libraries (I need to credit the guy, but I don't have the code in front of me), modified them, and wrote a mining program to feed and poll all of the cards. It was a pain in the ass to finally get that working, but by analyzing a bunch of different mining software I figured it out.

But finally, my experiment is working @ 2250 Mhash/s and about 100 W! The cost is out of control, but since I had these cards lying around from other experiments, I figured I'd give it a shot.
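The 2250 Mhash/s figure lines up with the setup described above. Here is a rough sanity check in Python, assuming each fully unrolled (LOOP = 0) core completes one hash per clock cycle; the function name is made up for illustration:

```python
def farm_hashrate_mhs(cards: int, cores_per_card: int, clock_mhz: float,
                      hashes_per_clock: float = 1.0) -> float:
    """Aggregate hash rate in MH/s.

    Assumption: a fully unrolled core emits one hash per clock,
    so a core clocked at N MHz yields N MH/s.
    """
    return cards * cores_per_card * clock_mhz * hashes_per_clock


# 6 ML605 cards, 3 cores per card, 125 MHz core clock
total = farm_hashrate_mhs(cards=6, cores_per_card=3, clock_mhz=125)
print(total)  # 2250.0
```

The same formula suggests roughly 2700 MH/s if the 150 MHz clock mentioned above turned out to be reachable.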

I'd be happy to contribute the modifications that use the DSP48s, but I can't actually distribute the PCIe DMA/PIO engine since it's licensed (not from Xilinx). I'm happy to distribute the source for the software too, since I didn't really find a C#/.NET Windows version that suited my needs.

Questions and comments welcome!

newMeat1

Full Member

Activity: 210
Merit: 100

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 03:34:38 AM

#466

Can you humor me with a guess of how much this hardware would cost, iidx?

2250 Mhash/s- Wow! I wish I had that


NF6X

Member

Activity: 98
Merit: 10

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 04:25:42 AM

#467

That ML605 hack sounds delightful! I love it!

Another group at my company has custom emulation platforms with more than 50 (!!) Virtex 5 parts each. I wish I could spend some quality time with one of those and make it slave away in the Bitcoin mines, but sadly, that's not going to happen. I could get away with stuff like that when I was working for a little startup instead of a megacorp, but we couldn't afford toys like that back then.

iidx

Newbie

Activity: 35
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 04:37:59 AM
Last edit: August 14, 2011, 04:56:36 AM by iidx

#468

Quote from: newMeat1 on August 14, 2011, 03:34:38 AM

Can you humor me with a guess of how much this hardware would cost, iidx?

2250 Mhash/s- Wow! I wish I had that

It's pretty unreasonable. I think each of the cards was $2000 when I bought them about 6 months ago for a project. Xilinx has "generously" reduced the price to $1795 now... Not sure what the raw chip prices are, maybe $500 each in volume.

Quote

Another group at my company has custom emulation platforms with more than 50 (!!) Virtex 5 parts each. I wish I could spend some quality time with one of those and make it slave away in the Bitcoin mines, but sadly, that's not going to happen. I could get away with stuff like that when I was working for a little startup instead of a megacorp, but we couldn't afford toys like that back then.

Wow, 50 devices!! Maybe you can ask if you can do some performance testing for that group :D

I actually have another board that has a Virtex 5 on it (ML555) that I thought about also trying to use. However, it has a pretty small device and no heatsink, so it probably would be a bad idea and yield 125-150 MHash at best.

fpgaminer (OP)

Hero Member

Activity: 560
Merit: 517

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 05:07:33 AM

#469

Quote

Finally got around to coding some maximum clock speed improvements for users of smaller Cyclone III and IV devices - now available from my new partial-unroll-speed branch. Expected minimum device size and speed is roughly as follows:

More fantastic work, makomk!

*applause*

Quote

I've been playing around with the xilinx-verilog port in the github repo and can confirm that it works just fine on the Xilinx Spartan-6 XC6SLX9 microboard eval board from Avnet for $69.

Thank you for taking the time to share your experiences with all these mini eval boards, jonand. That's great information.

It's a shame that the LX9 microboard uses a 324-pin footprint. It would be neat to re-solder an LX150 onto it, but the LX150 doesn't come in a 324-pin package :/

Quote

But finally, my experiment is working @ 2250 Mhash/s and about 100 W! The cost is out of control, but since I had these cards lying around from other experiments, I figured I'd give it a shot.

*drools* For reference, an AMD 5850 only gets ~350 MH/s for ~150 W.
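To put that comparison in numbers, here is a quick sketch of the MH/s-per-watt arithmetic using only the approximate figures quoted in this thread (the helper name is illustrative, and both power numbers are rough estimates from the posts, not measurements of mine):

```python
def mhs_per_watt(mhs: float, watts: float) -> float:
    """Energy efficiency in MH/s per watt."""
    return mhs / watts


fpga = mhs_per_watt(2250, 100)  # iidx's 6-card ML605 setup
gpu = mhs_per_watt(350, 150)    # a single AMD 5850

# Efficiency of each, and the ratio between them
print(round(fpga, 2), round(gpu, 2), round(fpga / gpu, 1))  # 22.5 2.33 9.6
```

So on these figures the Virtex-6 setup is roughly an order of magnitude more power-efficient, even though its up-front cost is far higher.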

https://github.com/fpgaminer/Open-Source-FPGA-Bitcoin-Miner | Bitcoin Hardware Wallet

1NT4RyJMqtRuDRr6zHdXdKSpmX3SR5he6z

Keninishna

Hero Member

Activity: 556
Merit: 500

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 09:05:01 AM

#470

Quote from: NF6X on August 13, 2011, 11:06:38 PM

Quote from: Keninishna on August 13, 2011, 10:05:07 PM

I'm interested in getting into FPGAs, but where can I source a Spartan-6 LX150 cheaply? A comment on the hardware comparison site says:

Quote

3N 484-pin chip is ~$150, 0.67Mhash/$

How about this universal board? http://www.hdl.co.jp/en/index.php/accessories/zkb-054.html It appears to take a socketed version of the chip and costs only about $180. If the chip can be sourced for $150-170, the total would still be expensive at about $330, but cheaper than $700.

Silverpike

Newbie

Activity: 54
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 09:50:29 AM

#471

Quote from: Keninishna on August 14, 2011, 09:05:01 AM

You are a little off here. This is not a board that can be used to mount FPGAs. This board is designed specifically to aggregate the other FPGA dev boards this company sells onto one motherboard. It won't help for finding a cheap host for raw FPGA parts.

Keninishna

Hero Member

Activity: 556
Merit: 500

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 10:26:00 AM
Last edit: August 14, 2011, 11:05:42 AM by Keninishna

#472

Quote from: Silverpike on August 14, 2011, 09:50:29 AM

Quote from: Keninishna on August 14, 2011, 09:05:01 AM

I had a feeling it wasn't that easy. Definitely a challenge to make fpgas feasible bitcoin miners. I found a site that offers a board for about 500$ http://shop.ztex.de/product_info.php?products_id=64&language=en

Holy moly, take a look at this: 12x LX150s on one PCIe board. http://www.dinigroup.com/new/DNBFC_S12_PCIe.php

Also this page seems like a good reference http://www.fpga-faq.com/FPGA_Boards.shtml

ngzhang

Hero Member

Activity: 592
Merit: 501

We will stand and fight.

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 11:06:17 AM
Last edit: August 14, 2011, 11:44:00 AM by ngzhang

#473

Hi guys.
I've been working hard on the XC6SLX150-3N FPGA this week. I'm trying to design an FPGA computing unit for Bitcoin mining and some other related projects.
Based on the work in this thread:

https://bitcointalk.org/index.php?topic=22426.580
Modular FPGA Miner Hardware Design Development

I think that if we avoid expensive power modules (using discrete components instead, which is cheaper but makes design and testing much harder), find a way to buy FPGAs cheaply, design well for manufacturability, and have hundreds of people willing to buy it, etc.,
then I'm quite sure the daughterboard can be manufactured for $400, all costs included.

But the question is: how many MH/s will 2 XC6SLX150-3N chips finally give us? If we want FPGA mining to be a feasible choice, its MH/s per $ must be close to that of GPU mining.
One HD6870 (new HD5850s are hard to buy now) costs about $180 and provides 270 MH/s of hashing power, about 1.5 MH/s per $. I think at least 1 MH/s per $ is necessary for FPGA mining.
Can we optimize the XC6SLX150 to about 200 MH/s? Is it possible?

That means 2 fully pipelined cores running at 100 MHz per FPGA. Note that the XC6SLX150 has about 60% of the logic resources of the XC6VLX240T and 180 DSP48A1s (the XC6VLX240T has 768 DSP48E1s).

If the above performance is possible, I can make the $400 dual-XC6SLX150 board come true in 1-2 months.
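The cost targets above reduce to simple arithmetic. A small sketch, using only the prices and hash rates quoted in this post (function names are illustrative, and the $400 board cost is the hoped-for figure, not a confirmed one):

```python
def mhs_per_dollar(mhs: float, dollars: float) -> float:
    """Hash rate bought per dollar of hardware."""
    return mhs / dollars


def required_mhs_per_chip(target_mhs_per_dollar: float,
                          board_cost: float, chips: int) -> float:
    """Hash rate each FPGA must reach for the board to hit the target."""
    return target_mhs_per_dollar * board_cost / chips


print(mhs_per_dollar(270, 180))              # HD6870: 1.5 MH/s per $
print(required_mhs_per_chip(1.0, 400, 2))    # 200.0 MH/s per FPGA for 1 MH/s per $
print(mhs_per_dollar(2 * 100, 400))          # 0.5 MH/s per $ at 100 MH/s per FPGA
```

In other words, at 100 MH/s per chip the board lands at half the 1 MH/s per $ target, which is why the 200 MH/s question matters.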

iidx

Newbie

Activity: 35
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 08:00:50 PM

#474

I don't know the usage for a single unrolled core on an S6 LX150, but here's the usage from my design in the V6 240T using 2 cores. I'm guessing you'll want to use all 180 of those DSP48s to reduce the logic usage. The number of slices and LUTs used here would be close to your maximum capacity, and that is with 372 DSP48s in use (186 per core).

Code:

Device Utilization Summary:

Slice Logic Utilization:
  Number of Slice Registers:               101,697 out of 301,440   33%
    Number used as Flip Flops:              98,581
    Number used as Latches:                      1
    Number used as Latch-thrus:                  0
    Number used as AND/OR logics:            3,115
  Number of Slice LUTs:                     88,763 out of 150,720   58%
    Number used as logic:                   67,920 out of 150,720   45%
      Number using O6 output only:          32,057
      Number using O5 output only:           1,667
      Number using O5 and O6:               34,196
      Number used as ROM:                        0
    Number used as Memory:                   9,892 out of  58,400   16%
      Number used as Dual Port RAM:              0
      Number used as Single Port RAM:            0
      Number used as Shift Register:         9,892
        Number using O6 output only:         7,362
        Number using O5 output only:             0
        Number using O5 and O6:              2,530
    Number used exclusively as route-thrus: 10,951
      Number with same-slice register load: 10,889
      Number with same-slice carry load:        62
      Number with other load:                    0

Slice Logic Distribution:
  Number of occupied Slices:                27,898 out of  37,680   74%
  Number of LUT Flip Flop pairs used:      105,799
    Number with an unused Flip Flop:        28,962 out of 105,799   27%
    Number with an unused LUT:              17,036 out of 105,799   16%
    Number of fully used LUT-FF pairs:      59,801 out of 105,799   56%
    Number of slice register sites lost
      to control set restrictions:               0 out of 301,440    0%

Specific Feature Utilization:
  Number of RAMB36E1/FIFO36E1s:                 40 out of     416    9%
    Number using RAMB36E1 only:                 40
    Number using FIFO36E1 only:                  0
  Number of RAMB18E1/FIFO18E1s:                  0 out of     832    0%
  Number of BUFG/BUFGCTRLs:                      5 out of      32   15%
    Number used as BUFGs:                        5
    Number used as BUFGCTRLs:                    0
  Number of ILOGICE1/ISERDESE1s:                 0 out of     720    0%
  Number of OLOGICE1/OSERDESE1s:                 0 out of     720    0%
  Number of BSCANs:                              0 out of       4    0%
  Number of BUFHCEs:                             0 out of     144    0%
  Number of BUFIODQSs:                           0 out of      72    0%
  Number of BUFRs:                               0 out of      36    0%
  Number of CAPTUREs:                            0 out of       1    0%
  Number of DSP48E1s:                          372 out of     768   48%
  Number of EFUSE_USRs:                          0 out of       1    0%
  Number of FRAME_ECCs:                          0 out of       1    0%
  Number of GTXE1s:                              4 out of      20   20%
    Number of LOCed GTXE1s:                      4 out of       4  100%
  Number of IBUFDS_GTXE1s:                       1 out of      12    8%
  Number of ICAPs:                               0 out of       2    0%
  Number of IDELAYCTRLs:                         0 out of      18    0%
  Number of IODELAYE1s:                          0 out of     720    0%
  Number of MMCM_ADVs:                           1 out of      12    8%
  Number of PCIE_2_0s:                           1 out of       2   50%
    Number of LOCed PCIE_2_0s:                   1 out of       1  100%
  Number of STARTUPs:                            1 out of       1  100%
  Number of SYSMONs:                             0 out of       1    0%
  Number of TEMAC_SINGLEs:                       0 out of       4    0%

Silverpike

Newbie

Activity: 54
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 09:09:49 PM

#475

Quote from: ngzhang on August 14, 2011, 11:06:17 AM

But the question is: how many MH/s will 2 XC6SLX150-3N chips finally give us? If we want FPGA mining to be a feasible choice, its MH/s per $ must be close to that of GPU mining.
One HD6870 (new HD5850s are hard to buy now) costs about $180 and provides 270 MH/s of hashing power, about 1.5 MH/s per $. I think at least 1 MH/s per $ is necessary for FPGA mining.
Can we optimize the XC6SLX150 to about 200 MH/s? Is it possible?

200 MH/s is not possible on this part. Artforz was able to tweak his to approx. 118 MH/s, and that was with a well-optimized design with overclocking. This open-source design isn't terribly efficient, but it actually produces a very good result of approx. 100 MH/s on the Spartan LX150 if the design can be routed at 100 MHz.

200MH is simply way out of the question for an S6-LX150.

fpgaminer (OP)

Hero Member

Activity: 560
Merit: 517

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 10:46:29 PM

#476

Quote

200MH is simply way out of the question for an S6-LX150.

That won't stop me from trying :D

As far as I can tell with the poking around I've done so far, the current bottleneck on the S6-LX150 is the far dependencies caused by the W calculations. These references make it so that the rounds are not isolated, and so cannot be routed into a uniform chain. This forces ISE to do completely absurd routing, splattering the placement of a round's components across a good 1/4th of the chip. And that, obviously, leads to massive routing delays. On my last few compiles, the worst-case paths were >80% routing (8ns+ of routing, with 2ns of logic).

If W is buffered between each round as a 512-bit register, instead of chains of shift registers and BRAMs, then the rounds can be isolated, but ISE fails to Map such a design for reasons I have not yet nailed down. 512 bits * ~100 is quite a lot of registers.
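For anyone following along, here is a small Python model of the SHA-256 message schedule that shows where those far dependencies come from and why a 512-bit buffer per round is enough. This is only a software sketch of the standard W recurrence, not the Verilog in the repo; the function names are illustrative:

```python
MASK = 0xFFFFFFFF  # all arithmetic is on 32-bit words


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def sigma0(x: int) -> int:
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def sigma1(x: int) -> int:
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def expand_full(block_words):
    """Textbook schedule: W[t] reaches back 2, 7, 15 and 16 rounds."""
    w = list(block_words)
    for t in range(16, 64):
        w.append((sigma1(w[t - 2]) + w[t - 7]
                  + sigma0(w[t - 15]) + w[t - 16]) & MASK)
    return w


def expand_windowed(block_words):
    """Same schedule with only a 16-word (512-bit) window carried per round."""
    window = list(block_words)  # window[i] holds W[t + i]
    out = []
    for _ in range(64):
        out.append(window[0])   # W[t] is what the current round consumes
        nxt = (sigma1(window[14]) + window[9]
               + sigma0(window[1]) + window[0]) & MASK
        window = window[1:] + [nxt]  # shift by one word, append W[t + 16]
    return out


block = [(i * 0x01234567) & MASK for i in range(16)]
assert expand_full(block) == expand_windowed(block)
```

Because W[t] depends on words up to 16 rounds back, each round either reaches across many neighbouring rounds or carries the whole 16 x 32 = 512-bit window forward, which is the trade-off described above.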

If I, or someone else, can find a way to isolate the rounds and put them into a more consistent chain, then I highly suspect that both performance and area will improve considerably.

I may create a "fake" design that focuses specifically on the W calculations (without digester rounds), and see if I can somehow get them routed into a sensible structure (even if it requires manual placement).


Silverpike

Newbie

Activity: 54
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 10:58:57 PM

#477

Quote from: fpgaminer on August 14, 2011, 10:46:29 PM

Quote

200MH is simply way out of the question for an S6-LX150.

That won't stop me from trying :D

Good luck! ;)

Quote

My criticism of this design (your design?) is that there is too much pipelining. If you have ever taken a computer architecture class, pipelining can be a very serious impediment to having a high speed design. This sounds counter-intuitive, but the cost you pay for all those registers is very high. This can be mitigated quite a bit on FPGAs, since registers are part of each CLB (and in a sense they can come for "free" if you have enough combinatorial logic). This is certainly part of your routing problem.

Quote

If W is buffered between each round as a 512-bit register, instead of chains of shift registers and BRAMs, then the rounds can be isolated, but ISE fails to Map such a design for reasons I have not yet nailed down. 512 bits * ~100 is quite a lot of registers.

The roadblock to having a high density FPGA design (in this case) is not your routing issues. The logic you are using to compute the basic hashes is not optimal, and you have not spent any time trying to optimize for your critical path. I would suggest you concentrate your efforts in this domain (hint hint). Keep in mind that you are duplicating each round 128 times, so any logic savings per round is magnified by 128x.

fpgaminer (OP)

Hero Member

Activity: 560
Merit: 517

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 11:27:54 PM

#478

Quote

My criticism of this design (your design?) is that there is too much pipelining.

Thank you for the criticism. I really do appreciate the feedback, and I am by no means an expert.

My intuition is similar to yours, in that a more traditional serial design should achieve better utilization and performance on the Spartan-6 architecture. But it is very easy to underestimate the massive amount of optimizations that occur in the fully unrolled design that takes my current primary focus.

I have a functioning serial implementation, but so far my estimates for its total performance once put in parallel on the S6-LX150 are not exciting: something like 120 MH/s. It's in the back of my mind, and there is plenty more work to be done in optimizing and perfecting it, but it hasn't shown me enough promise to warrant being in my mental spotlight like the unrolled design.

Quote

The logic you are using to compute the basic hashes is not optimal, and you have not spent any time trying to optimize for your critical path.

The current critical path is approximately two 3-way 32-bit adders implemented as 16 total slices, thanks to the Spartan-6 fast carry look-ahead chains. Is there a means of optimizing that logic that I have missed?


iidx

Newbie

Activity: 35
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 11:40:18 PM

#479

Quote from: fpgaminer on August 14, 2011, 10:46:29 PM

Quote

200MH is simply way out of the question for an S6-LX150.

That won't stop me from trying :D

Yeah, it looks like a "giant snake" that traverses the chip :D

Quote

These are the adders that I tried to move into DSP48s, as they have dedicated carry paths to and from adjacent DSPs in a column. I didn't look at how to optimize the actual math/operations at all, though.

ngzhang

Hero Member

Activity: 592
Merit: 501

We will stand and fight.

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 15, 2011, 04:25:12 AM
Last edit: August 15, 2011, 05:06:59 AM by ngzhang

#480

Quote from: Silverpike on August 14, 2011, 10:58:57 PM

Quote from: fpgaminer on August 14, 2011, 10:46:29 PM

Quote

200MH is simply way out of the question for an S6-LX150.

That won't stop me from trying :D

Good luck! ;)

Besides pipelining, there is another way to improve performance in IC design: logic replication.
I haven't read the code yet (all my spare time goes into the dual-XC6SLX150 mining board design), but after reading this thread I wonder: if we are facing terrible routing problems, why not try another architecture?
One possibility is to implement an optimized rolled-up core, like a small calculation unit (perhaps better built with DSP48As) around a single 512-bit register (perhaps using LUTs as distributed RAM instead of registers), running at 200+ MHz (which is very possible) and taking about 64 clocks per hash. We could implement 100+ of them per chip.
This way, we could also reach a very high MH/s.
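A rough way to size this idea, as a Python sketch. The parameter names are illustrative, and it assumes that "64 clocks per hash" means 64 clocks per SHA-256 pass and that one Bitcoin hash needs two such passes (consistent with the 128 rounds mentioned earlier in the thread):

```python
def rolled_farm_mhs(cores: int, clock_mhz: float,
                    clocks_per_sha256: int = 64,
                    sha256_per_hash: int = 2) -> float:
    """Aggregate MH/s for many small rolled-up cores on one chip.

    Assumptions: each core needs clocks_per_sha256 cycles per SHA-256
    pass, and a Bitcoin hash costs sha256_per_hash passes per nonce.
    """
    return cores * clock_mhz / (clocks_per_sha256 * sha256_per_hash)


print(rolled_farm_mhs(100, 200))                      # 156.25
print(rolled_farm_mhs(100, 200, sha256_per_hash=1))   # 312.5
```

Whether 100+ such cores actually fit and route at 200 MHz is exactly the open question; the sketch only shows how sensitive the total is to the clocks-per-hash assumption.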

Of course, I'm not an expert; this is just for discussion.

EDIT1:
I found this:
http://www.heliontech.com/downloads/fast_hash_xilinx_datasheet.pdf#view=Fit
This commercial IP core uses 309 slices (the SLX150 has 23038 of them) and delivers a throughput of 977 Mbps.
If we use 80% of the slices of one SLX150, we can implement 60 of these cores, for a total throughput of about 58 Gbps, or about 228 MH/s.

So reaching 200 MH/s is very possible, isn't it?
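The estimate can be reproduced with a short Python sketch. The 228 MH/s figure corresponds to dividing the total throughput by 256 bits per hash. If the datasheet throughput is instead measured on input data (512-bit blocks) and each Bitcoin hash needs two SHA-256 passes, the divisor becomes 1024 bits; that is an assumption on my part, not something taken from the datasheet, and the function name is illustrative:

```python
def ip_core_estimate(total_slices: int = 23038,
                     usable_fraction: float = 0.8,
                     slices_per_core: int = 309,
                     core_mbps: float = 977.0,
                     bits_per_hash: int = 256):
    """Return (number of cores, estimated MH/s) for the replicated IP core."""
    cores = round(total_slices * usable_fraction / slices_per_core)
    total_mbps = cores * core_mbps
    return cores, total_mbps / bits_per_hash


cores, mhs = ip_core_estimate()
print(cores, round(mhs))  # 60 229

# Assumption: 2 passes x 512 input bits per Bitcoin hash
_, mhs_input_bits = ip_core_estimate(bits_per_hash=1024)
print(round(mhs_input_bits))  # 57
```

So the conclusion depends heavily on how the 977 Mbps figure is defined; it would be worth checking the datasheet before relying on the 200 MH/s number.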
