Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013)

mnemonix

April 20, 2013, 08:31:59 AM

#761

Thanks for the work you put into the miner!

I ported the Xilinx_VHDL miner to the ML605 dev board.

It was actually straightforward: I replaced the DCM with a newer Virtex-6 equivalent, wired the pins to RS-232 and clock, adjusted the baud rate, and it ran instantly.

It does 200 MHash/s and the device is about 85% used ...

kramble

April 20, 2013, 08:52:43 AM

#762

Quote from: AJRGale on April 20, 2013, 06:22:28 AM

So the question is, will any 150K gate fpga work with the full miner? or is there something I'm missing (EG: http://www.digilentinc.com/Products/Detail.cfm?NavPath=2,400,790&Prod=BASYS2 with 250K gates, slap on the full miner, and bam, 1hash a clock? )

NO!! Don't confuse gates with LEs (logic elements). Older FPGAs often quoted a gate count (such as the Spartan 3E with 250K gates that you linked to). Newer FPGAs use a Logic Element (or Logic Cell) count (and Google tells me there are 12 gates to an LE). So a Spartan 6 LX150 with 147,443 logic cells roughly equates to 1.7 million gates by my calculation (I can't find any direct quote for the actual figure, so take that as very approximate). You can see the Spartan family spec at http://www.xilinx.com/support/documentation/data_sheets/ds160.pdf
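
As a rough check of that arithmetic, here is a minimal Python sketch. The ratio of 12 gates per logic cell is the approximate figure quoted in the post, not an official Xilinx number, and the function name is made up for illustration.

```python
# Rough equivalent-gate estimate from a logic-cell count.
# GATES_PER_LOGIC_CELL is the approximate ratio quoted in the post,
# not a vendor specification.
GATES_PER_LOGIC_CELL = 12


def estimated_gates(logic_cells: int, gates_per_cell: int = GATES_PER_LOGIC_CELL) -> int:
    """Return a very approximate equivalent gate count."""
    return logic_cells * gates_per_cell


if __name__ == "__main__":
    lx150_cells = 147_443  # Spartan-6 LX150 logic cells, as quoted in the post
    millions = estimated_gates(lx150_cells) / 1e6
    print(f"LX150: roughly {millions:.2f} million gates")
```

The result lands a little above the 1.7 million quoted, which is well within the "very approximate" caveat.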

The board you linked to will be (almost) useless for mining. You need to look for a purpose-built Spartan LX150 based miner and use the firmware (bitstream) that comes with it (and even then the economics look pretty grim).

If you want to compile your own bitstream for the Spartan series, you can download free software from the Xilinx web site http://www.xilinx.com/products/design-tools/ise-design-suite/ise-webpack.htm but beware that it is limited to the smaller devices (LX75 maximum I think, but do your own due diligence). You need the full (very expensive) version to compile for the LX150.

Regards
Mark

Github https://github.com/kramble BLC BkRaMaRkw3NeyzsZ2zUgXsNLogVVkQ1iPV

AJRGale

April 20, 2013, 10:02:25 AM

#763

Quote from: kramble on April 20, 2013, 08:52:43 AM

Quote from: AJRGale on April 20, 2013, 06:22:28 AM

Ah, sorry for my newbishness, I've never played with one of these devices (blame the 2 companies for their heavy secretive efforts unless you buy their $5000 suite).
My mistake. So when a company quotes a "gates" number, I have to look for ALM, LE, slice etc. instead?

Basically I want to know what a full miner roll out fits on and how many LEs it takes; then I'll go to Digi-Key, look something up and go from there.

minernb

April 20, 2013, 10:56:36 PM

#764

Quote from: AJRGale on April 20, 2013, 10:02:25 AM

Basically I want to know what a full miner roll out fits on and how many LEs it takes; then I'll go to Digi-Key, look something up and go from there.

Hi,

The Altera DE1 has 18K LE.

The non-optimized version fits using the factor 4 in the roll(?), for a total of 16K LE used. I get 3.10 MH/s.
The makomk_mod version fits using factor 2 (but all work is rejected, I don't know why!). It reports 12 MH/s.
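
The two reported rates look consistent with a rolled core that needs 2^factor clocks per hash. A minimal sketch of that relationship follows; it assumes the "factor" is the CONFIG_LOOP_LOG2 setting mentioned elsewhere in the thread and that the board runs from a 50 MHz clock, neither of which is stated in the post.

```python
def rolled_hash_rate_mhs(clock_mhz: float, loop_log2: int) -> float:
    """Estimated MH/s for a core that takes 2**loop_log2 clocks per hash."""
    return clock_mhz / (2 ** loop_log2)


if __name__ == "__main__":
    ASSUMED_CLOCK_MHZ = 50.0  # assumption: not stated in the post
    for loop_log2 in (4, 2):
        rate = rolled_hash_rate_mhs(ASSUMED_CLOCK_MHZ, loop_log2)
        print(f"factor {loop_log2}: about {rate:.3f} MH/s")
```

Under those assumptions the estimates come out at 3.125 MH/s and 12.5 MH/s, close to the 3.10 MH/s and 12 MH/s reported.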

kramble

April 21, 2013, 08:30:24 AM
Last edit: April 21, 2013, 08:49:39 AM by kramble

#765

Quote from: minernb on April 20, 2013, 10:56:36 PM

The makomk_mod version fits using factor 2 (but all work is rejected, I don't know why!). It reports 12 MH/s.

I had the same problem with the DE0-Nano (22k LE), this was Makomk's response ...

Quote from: makomk on November 04, 2012, 02:40:30 PM

Quote from: kramble on November 01, 2012, 03:22:52 PM

I've now started looking at the code in the DE2_115_makomk_mod branch, but I've hit a problem. The code compiles fine at CONFIG_LOOP_LOG2=2, 3 and 4 but it's producing the wrong hashes (I'm just running at 40MHz for testing, not full blast) ... the mine.tcl script submits hashes to the pool, but they are all rejected!

Yeah, that branch doesn't work with CONFIG_LOOP_LOG2!=1. You probably want http://www.makomk.com/gitweb/?p=Open-Source-FPGA-Bitcoin-Miner.git;a=summary de0-nano-hax branch, projects/DE2_115_Unoptimized_Pipelined project. The voltage regulators are also indeed horribly inefficient on the DE0-nano.

I can't answer AJRGale's query about the LEs needed for a fully unrolled core as I haven't built anything larger than a one-sixth core which (just) fitted into 22k LE on an EP4CE22 on the Nano.

Regards
Mark

senseless

April 21, 2013, 10:40:32 AM
Last edit: April 21, 2013, 11:07:47 AM by senseless

#766

Quote from: fpgaminer on April 15, 2013, 05:40:10 AM

This is a DSP48E1 based design, and I have compiled and run it at 400MH/s.

Have you done any testing as to which adders provide the best increase in fmax? In order to get multiple cores in there, you're going to need to pick and choose which adders to replace with DSPs and which not to. I'm currently at 66% LUT usage with 99% memory LUT and 108% DSP usage with 2 unrolled cores (I had one core do even nonces while the other does odd nonces to make life easy). I've been slowly working down the number of DSPs utilized per core to make it fit. I'm thinking it might be possible to get 3 full cores on the A7 200.

Does the DSP performance increase compound? If I change one adder over to DSP utilization and it gives a 10% fmax increase... would changing additional adders down the chain affect that 10%? or will that one adder always give a 10% boost? I'm wondering if it will be possible to go through the adders one by one and calculate the increase in frequency for each one to find which adders would be the most effectively utilized under DSP48 blocks to get the best timing.

anomalies

April 22, 2013, 01:55:25 AM

#767

Hi, another question from a newb..

Have any of you guys heard of Parallella? http://www.parallella.org
What do you guys think about it?

AJRGale

April 22, 2013, 04:21:40 AM

#768

Quote from: anomalies on April 22, 2013, 01:55:25 AM

Hi, another question from a newb..

Have any of you guys heard of Parallella? http://www.parallella.org
What do you guys think about it?

Ahh yes, that, my friend, is a completely different ball game to FPGA.
I've been waiting for them to kick off; I want one to play with 64 threads per chip... mmmm

paszczakojad

April 24, 2013, 03:33:54 PM
Last edit: April 24, 2013, 06:49:38 PM by paszczakojad

#769

Quote from: senseless on April 21, 2013, 10:40:32 AM

Quote from: fpgaminer on April 15, 2013, 05:40:10 AM

This is a DSP48E1 based design, and I have compiled and run it at 400MH/s.

I compiled fpgaminer's DSP code on A7 200 and I got 356 MHz on -3 grade, 311 MHz on -2 grade and 262 MHz on -1. The -3 variant only exists in extended temperature version, so it's much more expensive - so the -2 is the best choice in my opinion.

The usage was 20% slice logic, 34% slice logic distribution and 92% DSP.

What were your results? I.e. what maximum clock do you get without DSPs?

Now I'm trying to replace some DSPs with the adder IP core. I think the best candidates are those that don't use the PCIN input (because they are simpler), like dsp_e, dsp_wp and dsp_t1p. When I replaced dsp_e with adder I got 302 MHz (-2 version), 23% logic, 37% distrib, 75% DSP. Then I replaced dsp_wp: 271 MHz, 24% logic, 38% distrib, 63% DSP. Compilation took over 5 hours, while it takes 30 min when using only DSPs. Then I replaced dsp_t1p and the compilation is taking ages (it hasn't completed yet).

The estimation is that DSP usage will be 49%, so theoretically I should be able to fit two such cores. Even if I have to lower the clock to, say, 200 MHz then total output would be 400 MH/s, which would be better than 311 MH/s with one DSP-only core.
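
The trade-off described above can be sketched in a few lines of Python. It assumes each fully unrolled core produces one hash per clock (which is what the 2 x 200 MHz = 400 MH/s figure implies) and treats DSP usage as the only limiting resource; the function names are illustrative only.

```python
def total_mhs(cores: int, clock_mhz: float, hashes_per_clock: float = 1.0) -> float:
    """Aggregate hash rate in MH/s for identical cores sharing one clock."""
    return cores * clock_mhz * hashes_per_clock


def fits_dsp_budget(dsp_percent_per_core: float, cores: int) -> bool:
    """True if the cores together need no more than 100% of the DSP blocks."""
    return dsp_percent_per_core * cores <= 100.0


if __name__ == "__main__":
    # Figures taken from the post: one DSP-only core at 311 MHz (-2 grade),
    # or two reduced-DSP cores at a hypothetical 200 MHz.
    print("one DSP-only core :", total_mhs(1, 311), "MH/s")
    print("two mixed cores   :", total_mhs(2, 200), "MH/s")
    print("two cores at 49% DSP each fit:", fits_dsp_budget(49, 2))
```

Other resources (slice logic, routing) would also have to fit, so this is only a first-pass check.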

fpgaminer (OP)

April 25, 2013, 12:14:13 AM

#770

Quote

When I replaced dsp_e with adder I got 302 MHz

I find it odd that your Fmax is dropping when you replace the DSPs with LUTs. You may want to fiddle around with Vivado's settings to make sure register retiming (or whatever Vivado calls it) is enabled. Alternatively, implement the adders as two stages of 16 bits each, since the DSPs that are being replaced are two-stage (or three-stage) anyway.

Also, for dsp_t1p, it would be best to replace both dsp_t1p and compressor_t1p with a single LUT adder, since the LUT fabric can implement 3 way additions just as efficiently as 2-way addition.

https://github.com/fpgaminer/Open-Source-FPGA-Bitcoin-Miner | Bitcoin Hardware Wallet

1NT4RyJMqtRuDRr6zHdXdKSpmX3SR5he6z

paszczakojad

April 25, 2013, 05:10:48 AM

#771

Quote from: fpgaminer on April 25, 2013, 12:14:13 AM

Quote

When I replaced dsp_e with adder I got 302 MHz

I used 2-stage adders, because DSP adders worked in 2 cycles and I didn't want to debug too much. IP core generator recommended 3 cycles for the best performance - I'll try that next.

After replacing dsp_e, dsp_wp and dsp_t1p I got 46% DSPs used - so it's enough to fit two cores.

Khertan

May 03, 2013, 06:30:39 PM

#772

I'm currently playing with the DE0 Nano code from Kramble.

And I have a question: you said that running it faster than 40 MHz could damage an unmodified DE0 Nano, and I didn't understand why.

According to the Quartus PowerPlay Power Analyzer, the design at 50 MHz uses only 328 mW; that's around 273 mA, right? It's supposed to support 500 mA, isn't it?

Did I miss something?

bitcoin:1Khertan7mpfbabM531QTsnDXBdK7sDYxL -- BitPurse Bitcoin Client for n9 : http://khertan.net/projects/bitpurse

kramble

May 03, 2013, 08:32:43 PM
Last edit: May 03, 2013, 09:12:41 PM by kramble

#773

Quote from: Khertan on May 03, 2013, 06:30:39 PM

No, I was just being conservative in case someone inexperienced just cranked it up to the max (and following the example of fpgaminer in his original readme). You can run it faster as long as you are happy the power supply will support it (I had a conversation with hardcore_fc a few months back about the regulators, it may be worth you looking back over it). I am currently running one board at 170MHz (with a hardwired external 1.2V core supply as described at www.makomk.com) and a second at 80MHz on a conventional 3.3V external supply.

You are correct that a USB supply will probably be limited to 500mA, but this is at 5 volts. I haven't played with the PowerPlay Analyzer, but I would expect that this is reporting the power at the 1.2V FPGA core rail. You have to account for the other devices on the DE0-Nano board too.
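
To make the units explicit: the 273mA figure comes from dividing the PowerPlay estimate by the 1.2V core voltage, whereas the USB limit is a current at 5V. Here is a minimal Python sketch of that arithmetic. It deliberately ignores regulator losses and the other devices on the board, since the thread does not quantify them, and the function names are illustrative.

```python
def current_ma(power_mw: float, rail_volts: float) -> float:
    """Current in mA drawn at a given rail voltage for a given power in mW."""
    return power_mw / rail_volts


def power_mw(rail_volts: float, current_limit_ma: float) -> float:
    """Power in mW available from a supply with the given voltage and current limit."""
    return rail_volts * current_limit_ma


if __name__ == "__main__":
    core_ma = current_ma(328, 1.2)    # PowerPlay estimate at the 1.2V core rail
    usb_budget = power_mw(5.0, 500)   # nominal USB port: 5V at 500mA
    print(f"core rail current: about {core_ma:.0f} mA at 1.2V")
    print(f"USB power budget : {usb_budget:.0f} mW at 5V")
```

The two currents are measured at different voltages, so they cannot be compared directly without knowing how the on-board regulators convert between them.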

I just dug out some notes I made of measurements with the 3.3V supply. 40MHz was 0.48A, 80MHz 0.85A, 100MHz 1.0A, 120MHz 1.2A and 140MHz 1.36A, so roughly 10mA per MHz. The regulators were getting very hot at the higher speeds (even though I was pointing a fan at the board), hence my caution at running the DE0-Nano at these sorts of speeds. The regulators themselves are overtemperature protected, but looking at the datasheet, this only kicks in at T(junction) of 175C, while the max operating temperature is 125C. It also quotes 85C/Watt junction-ambient assuming a big chunk of PCB copper dedicated to heatsinking, so you can work out roughly what they can practically support.
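
The "roughly 10mA per MHz" rule of thumb and the thermal limit can be checked with a short Python sketch. The least-squares fit uses only the five measurements quoted above; the 25C ambient temperature in the thermal estimate is an assumption, and the helper names are made up for illustration.

```python
# (clock in MHz, supply current in A) pairs measured on the 3.3V supply,
# as quoted in the post.
MEASUREMENTS = [(40, 0.48), (80, 0.85), (100, 1.0), (120, 1.2), (140, 1.36)]


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def max_dissipation_w(t_junction_max_c, t_ambient_c, theta_ja_c_per_w=85.0):
    """Largest regulator dissipation that keeps the junction at or below the limit."""
    return (t_junction_max_c - t_ambient_c) / theta_ja_c_per_w


if __name__ == "__main__":
    slope, intercept = fit_line(MEASUREMENTS)
    print(f"about {slope * 1000:.1f} mA per MHz plus {intercept:.2f} A static")
    # 25C ambient is an assumption; 125C and 85C/W come from the post.
    print(f"max dissipation: about {max_dissipation_w(125, 25):.2f} W per regulator")
```

The fitted slope comes out slightly under 10mA per MHz with a fixed offset on top, which agrees with the rule of thumb in the post.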

Given the tiny returns from mining on the Nano, my opinion was that it's not worth risking the boards at the higher speeds. I'm happy with my current setup (as described above) as nothing is getting above 60C, but it's your call on your own stuff.

[EDIT] I should add that I'm using a serial interface to communicate with the boards, rather than the quartus_stp JTAG USB cable, which is why I can get away with a 3.3V external supply. If you are using the USB for communication, then an external 3.3V supply won't work as it will pull current from the USB instead (there are a couple of blocking diodes so no harm should occur). You could use a 5V external supply to supplement the USB's 500mA, but then it's all getting a bit Heath Robinson, and the onboard regulators are under more heat stress at 5V than 3.3V. Oh, and the DE0-Nano manual says the minimum external supply is 3.6V (I just happened to have 3.3V to hand and it worked fine, but it's technically out of spec so YMMV).

Regards
Mark

fpgaminer (OP)

May 03, 2013, 10:20:06 PM

#774

I've been asked a few times about a mining script for the current KC705 firmware. I wrote a plugin for Modular Python Bitcoin Miner. Here's the message I sent to someone about it:

Quote

I uploaded the custom MPBM module, which is compatible with the current KC705 mining code, here:
https://mega.co.nz/#!Oh5HTDRB!C0RLYW4yZN8gbg38FfgLpzmKFcseOql3Xx1i_gXTfdM

You'll want to download a copy of MPBM's testing branch. Then extract the above archive into
Code:
modules/fpgamining
such that you end up with:

Code:
modules/fpgamining/kc705_uart/__init__.py
modules/fpgamining/kc705_uart/kc705uartworker.py

Once you start MPBM, you can add a KC705 Worker by opening the MPBM web interface (http://127.0.0.1:8832) and clicking the "Workers" button on the left. On Windows, I ran MPBM under Cygwin, and the "Port" ended up being /dev/com2 for me. The Baudrate is 115200.

~fpgaminer

I haven't had a chance to clean it up and put it on the repo yet.
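
For the extraction step described in the quoted message, a small Python check like the following can confirm the files ended up where MPBM expects them before starting it. This is only a sketch: the two relative paths are the ones listed above, while the helper name and the command-line handling are made up for illustration.

```python
from pathlib import Path

# Files the quoted message says should exist after extracting the archive.
EXPECTED_FILES = (
    "modules/fpgamining/kc705_uart/__init__.py",
    "modules/fpgamining/kc705_uart/kc705uartworker.py",
)


def missing_module_files(mpbm_root):
    """Return the expected module files that are not present under mpbm_root."""
    root = Path(mpbm_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    import sys

    # Pass the MPBM checkout directory as the first argument (defaults to ".").
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_module_files(root)
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("KC705 module files are in place.")
```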

gingernuts

May 04, 2013, 12:01:41 AM

#775

Looking at Digi-Key right now, for the chips you could actually buy today:

The Small Kintex XC7K160T is $230 ish in -1 grade and $280 ish in -2 grade
The Biggest Artix XC7A200T is $200 ish in -1 grade and $270 ish in -2 grade and both of these can be developed with the free Webpack software

The Kintex used on the KC705, XC7K325T is $1000 ish in the -1 grade, and $1500 odd in the -2 grade (They have a $1200 one, but not in stock), and needs a full Vivado/ISE license to play with - even if I were to buy a KC705 dev-kit, I can't see how the 325T device is going to be good bang for the buck...

Interestingly, in a Kintex -> Artix migration guide Xilinx seem to reckon that a -1 grade Kintex is 1.6x as fast as a -1 Artix, so while the 7A200T looks like a winner in terms of price and slices/DSP modules, I'm wondering whether the Kintex XC7K160 might not be the best value overall...

fpgaminer (OP)

May 05, 2013, 07:16:29 AM

#776

For those with a VC707 devkit (Virtex 7), I've done a blind port of the KC705_experimental project:

https://github.com/fpgaminer/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/VC707_experimental

Bitstream: https://mega.co.nz/#!7x4nkS4b!O2aEv0Khp541jwY8FIwpiUeYstoXAOSyMqUKxhBMwKY

Completely untested. Let me know if it works, or doesn't!

Khertan

May 05, 2013, 04:16:55 PM
Last edit: May 05, 2013, 07:54:52 PM by Khertan

#777

Quote from: kramble on May 03, 2013, 08:32:43 PM

Given the tiny returns from mining on the Nano, my opinion was that it's not worth risking the boards at the higher speeds. I'm happy with my current setup (as described above) as nothing is getting above 60C, but it's your call on your own stuff.

Regards
Mark

Thanks. Indeed, for bitcoin mining I won't risk burning my little Nano; I'm asking because I'm working on another project and I want to understand things so I don't burn it.

I'll try to monitor the USB power draw and the temperature.

At 40 MHz PowerPlay estimates 296mA ... for the FPGA only, of course. But I've played with the settings to reduce power usage compared with your original code / project settings.
So it looks like PowerPlay underestimates power usage.

Thanks a lot for your explanation.

xbaby

May 07, 2013, 06:07:36 AM

#778

I'm trying to compile "projects/X6000_ztex_comm4" myself for the device "xc6slx150, speed -3" under Xilinx ISE v13.4, using the code from GitHub without any modification.

Using the default compile options from "xilinx_fpgaminer.xise", with the design goal "Timing Performance", placement failed. After changing the goal to "Minimum Runtime", the project compiled successfully, but the timing constraints can't be met. From the PAR report, the clock speed is only 153 MHz (cycle 6.54 ns). Which optimization options are needed to achieve a clock speed above 190 MHz? Please help, thanks very much.

Code:

+-------------------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
|                               |   Period    |       Actual Period       |      Timing Errors        |      Paths Analyzed       |
|           Constraint          | Requirement |-------------+-------------|-------------+-------------|-------------+-------------|
|                               |             |   Direct    | Derivative  |   Direct    | Derivative  |   Direct    | Derivative  |
+-------------------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
|TS_CLK_100MHZ                  |     10.000ns|      9.689ns|     13.082ns|            0|          633|         1456|      3690036|
| TS_dynamic_clk_blk_clkfx      |      5.000ns|      6.541ns|          N/A|          633|            0|      3690036|            0|
+-------------------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+

Slice Logic Utilization:
  Number of Slice Registers:                84,129 out of 184,304   45%
    Number used as Flip Flops:              84,129
    Number used as Latches:                      0
    Number used as Latch-thrus:                  0
    Number used as AND/OR logics:                0
  Number of Slice LUTs:                     50,798 out of  92,152   55%
    Number used as logic:                   35,040 out of  92,152   38%
      Number using O6 output only:          15,507
      Number using O5 output only:             581
      Number using O5 and O6:               18,952
      Number used as ROM:                        0
    Number used as Memory:                   3,297 out of  21,680   15%
      Number used as Dual Port RAM:              0
      Number used as Single Port RAM:            0
      Number used as Shift Register:         3,297
        Number using O6 output only:           449
        Number using O5 output only:             0
        Number using O5 and O6:              2,848
    Number used exclusively as route-thrus: 12,461
      Number with same-slice register load: 12,036
      Number with same-slice carry load:       425
      Number with other load:                    0

Slice Logic Distribution:
  Number of occupied Slices:                15,049 out of  23,038   65%
  Nummber of MUXCYs used:                   22,144 out of  46,076   48%
  Number of LUT Flip Flop pairs used:       58,734
    Number with an unused Flip Flop:           959 out of  58,734    1%
    Number with an unused LUT:               7,936 out of  58,734   13%
    Number of fully used LUT-FF pairs:      49,839 out of  58,734   84%
    Number of slice register sites lost
      to control set restrictions:               0 out of 184,304    0%
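
Reading the timing table above: the TS_dynamic_clk_blk_clkfx constraint asks for a 5.000ns period but the achieved period is 6.541ns. A minimal Python sketch of the conversion between period and frequency, using only those two figures from the report (the function names are illustrative):

```python
def period_ns_to_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


def slack_ns(required_period_ns: float, actual_period_ns: float) -> float:
    """Positive slack means timing is met; negative means the path is too slow."""
    return required_period_ns - actual_period_ns


if __name__ == "__main__":
    required_ns = 5.000  # period requirement from the PAR report
    actual_ns = 6.541    # actual period from the PAR report
    print(f"requested: {period_ns_to_mhz(required_ns):.1f} MHz")
    print(f"achieved : {period_ns_to_mhz(actual_ns):.1f} MHz")
    print(f"slack    : {slack_ns(required_ns, actual_ns):.3f} ns")
```

Reaching the 190 MHz target would need a period of about 5.26ns, so the critical paths have to get more than a nanosecond faster.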

Khertan

May 07, 2013, 03:11:23 PM

#779

Quote from: kramble on May 03, 2013, 08:32:43 PM

I've tried to fit a 2 loop design with 32 hashers. It can fit in a DE0 Nano after some automagic Quartus area optimization, but with a far lower fmax (120 MHz).
It fits with only a hundred-odd LUTs free.

Unfortunately I messed things up: while converting to two loops I broke something in the cnt or feedback ...

kramble

May 07, 2013, 04:24:50 PM
Last edit: May 07, 2013, 05:38:43 PM by kramble

#780

Quote from: Khertan on May 07, 2013, 03:11:23 PM

I've tried to fit a 2 loop design with 32 hashers. It can fit in a DE0 Nano after some automagic Quartus area optimization, but with a far lower fmax (120 MHz).
It fits with only a hundred-odd LUTs free.

Unfortunately I messed things up: while converting to two loops I broke something in the cnt or feedback ...

I was not able to get the LOOP_LOG2=2 code to fit myself but makomk achieved 27.5MH/s on a Nano (https://bitcointalk.org/index.php?topic=74749.msg847182#msg847182 (EDIT updated to a better link)), so I guess that with some expert tweaking it does indeed work. I decided to go a different route and try to fit 22 hashers (which nicely gives 66 stages in three rounds, so just discarding the last two to give the 64 needed) using a variant of sha256_transform from makomk's github (since the makomk branch in the official distribution does not work unless LOOP_LOG2=1). It did take a fair bit of tinkering in the simulator to get the timing right (and I ended up discarding makomk's pipelining of the K values since it was too confusing, so there is an opportunity for some further gain by putting it back in).

Interestingly this 66 round core generalized quite well, as I was able to use it on an EP4CE10 as 6 rounds of 11 hashers and on an LX9 as 11 rounds of 6 hashers (rather disappointing utilization, but I'm even more of a novice at Xilinx ISE than I am at Quartus). Anyway this was just playing around for the sake of it rather than a serious attempt to build a miner on these devices, though I did construct one of each using TQFP devices built on breakout adapters, which are currently hashing away at the majestic rates of ~~12.7MH/s~~ 11.7MH/s (140MHz) and 5MH/s (110MHz) respectively.
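
The stage counts and hash rates in this post can be reproduced with a little arithmetic. The Python sketch below assumes that a single core is reused for both SHA-256 hashes of a block header, so one result takes twice as many clocks as there are rounds through the core. That is an inference from the quoted figures, not something the post states, and the function names are illustrative.

```python
SHA256_STAGES_NEEDED = 64


def total_stages(hashers: int, rounds: int) -> int:
    """Stages covered when data passes `rounds` times through `hashers` stages."""
    return hashers * rounds


def estimated_mhs(clock_mhz: float, rounds: int) -> float:
    """Estimated MH/s, assuming one core performs both SHA-256 hashes in turn."""
    clocks_per_hash = 2 * rounds  # assumption: two hashes share the same core
    return clock_mhz / clocks_per_hash


if __name__ == "__main__":
    # (label, hashers, rounds, clock in MHz) taken from the post
    configs = [("EP4CE10", 11, 6, 140), ("LX9", 6, 11, 110)]
    for label, hashers, rounds, clock in configs:
        stages = total_stages(hashers, rounds)
        spare = stages - SHA256_STAGES_NEEDED
        rate = estimated_mhs(clock, rounds)
        print(f"{label}: {stages} stages ({spare} discarded), about {rate:.1f} MH/s")
```

With that assumption the estimates match the 11.7MH/s and 5MH/s reported for the two boards.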

Best of luck
Mark
