Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013)

iidx

Newbie

Offline

Activity: 35
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 14, 2011, 11:40:18 PM

#481

Quote from: fpgaminer on August 14, 2011, 10:46:29 PM

Quote

200MH is simply way out of the question for an S6-LX150.

That won't stop me from trying ;D

As far as I can tell with the poking around I've done so far, the current bottleneck on the S6-LX150 is the far dependencies caused by the W calculations. These references make it so that the rounds are not isolated, and so cannot be routed into a uniform chain. This forces ISE to do completely absurd routing, splattering the placement of a round's components across a good 1/4th of the chip. And that, obviously, leads to massive routing delays. On my last few compiles, the worst-case paths were >80% routing (8ns+ of routing, with 2ns of logic).

Yeah, it looks like a "giant snake" that traverses the chip :D

Quote

The current critical path is approximately two 3-way 32-bit adders implemented as 16 total slices, thanks to the Spartan-6 fast carry look-ahead chains. Is there a means of optimizing that logic that I have missed?

These are the adders that I tried to move into DSP48s, as they have dedicated carry paths to and from adjacent DSPs in a column. I didn't look at how to optimize the actual math/operations, though.

ngzhang

Hero Member

Offline

Activity: 592
Merit: 501

We will stand and fight.

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 15, 2011, 04:25:12 AM
Last edit: August 15, 2011, 05:06:59 AM by ngzhang

#482

Quote from: Silverpike on August 14, 2011, 10:58:57 PM

Quote from: fpgaminer on August 14, 2011, 10:46:29 PM

Quote

200MH is simply way out of the question for an S6-LX150.

That won't stop me from trying ;D

Good luck! ;)

Besides pipelining, there is another way to increase performance in IC design: logic replication.
I haven't read the code yet (all my spare time is going into the dual XC6SLX150 mining board design), but after reading this thread: if we are facing terrible routing problems, why not try another architecture?
One possible approach is to implement a small, fully rolled-up core, a kind of calculation unit (perhaps better built on DSP48As) around a single 512-bit register (perhaps using LUTs as distributed RAM instead of registers), running at 200MHz+ (which seems very achievable) and taking about 64 clocks per hash. We could then implement 100+ of them per chip.
This way, we can also get a very high MH/s.

Certainly, I'm not an expert; this is just for discussion.

EDIT1:
I found this:
http://www.heliontech.com/downloads/fast_hash_xilinx_datasheet.pdf#view=Fit
This commercial IP core uses 309 slices (the SLX150 has 23038 of them) and delivers a throughput of 977Mbps.
If we use 80% of the slices of one SLX150, we can implement 60 of these cores, for a total throughput of 58Gbps, or about 228MH/s.

So reaching 200MH/s is very possible, isn't it?
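The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the post's own reasoning, written in Python: the variable names are made up here, and the 256-bits-per-hash divisor is the post's assumption, which later replies in the thread question.

```python
# Figures quoted in the post (slice counts and per-core throughput).
SLICES_LX150 = 23038      # slices on one SLX150, as stated in the post
SLICES_PER_CORE = 309     # slices used by the commercial IP core
CORE_MBPS = 977           # per-core throughput in Mbps
UTILIZATION = 0.80        # fraction of the chip assumed usable

# 18430.4 usable slices / 309 is about 59.6 cores, which the post rounds to 60.
cores = round(SLICES_LX150 * UTILIZATION / SLICES_PER_CORE)

# Aggregate throughput in Mbps (about 58 Gbps).
aggregate_mbps = cores * CORE_MBPS

# ASSUMPTION from the post: one "hash" consumes 256 bits of throughput.
mhash_per_s = aggregate_mbps / 256

print(cores, aggregate_mbps, round(mhash_per_s, 1))
```

Note that the whole result scales linearly with the bits-per-hash divisor, so the conclusion depends entirely on how the datasheet counts throughput.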

lame.duck

Legendary

Offline

Activity: 1270
Merit: 1000

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 15, 2011, 09:54:20 AM

#483

Quote from: ngzhang on August 15, 2011, 04:25:12 AM

EDIT1:
I found this:
http://www.heliontech.com/downloads/fast_hash_xilinx_datasheet.pdf#view=Fit
This commercial IP core uses 309 slices (the SLX150 has 23038 of them) and delivers a throughput of 977Mbps.
If we use 80% of the slices of one SLX150, we can implement 60 of these cores, for a total throughput of 58Gbps, or about 228MH/s.

So reaching 200MH/s is very possible, isn't it?

Hm, the datasheet states 126 MHz and 1 clock cycle per hashing round. I would interpret these numbers as giving us approx. 1 MHash/s for bitcoin hashing.

There are some papers on the net about SHA2 cores (McEnvoy and another one) that can run at 120 MHz on a quite old Virtex2 using about 1k LUTs/slices(?), reaching similar performance numbers. They use a pipelined design which needs ca. 68 cycles for a single SHA2 hash.

As for resource usage, there are only numbers for a single core, not for how the design scales up to an FPGA full of cores.

ngzhang

Hero Member

Offline

Activity: 592
Merit: 501

We will stand and fight.

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 15, 2011, 10:06:59 AM

#484

Quote from: lame.duck on August 15, 2011, 09:54:20 AM

Quote from: ngzhang on August 15, 2011, 04:25:12 AM

Apologies if I have misunderstood, but the datasheet says the IP core can run at 126MHz and provide a throughput of 977Mbps.

That means approx. 8 bits/clk.
It also means a bitcoin hashing rate of 3.8MH/s (one bitcoin hash is 256 bits of data, is that right?).

lame.duck

Legendary

Offline

Activity: 1270
Merit: 1000

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 15, 2011, 11:58:46 AM

#485

Quote from: ngzhang on August 15, 2011, 10:06:59 AM

Quote from: lame.duck on August 15, 2011, 09:54:20 AM

Apologies if I have misunderstood, but the datasheet says the IP core can run at 126MHz and provide a throughput of 977Mbps.

That means approx. 8 bits/clk.
It also means a bitcoin hashing rate of 3.8MH/s (one bitcoin hash is 256 bits of data, is that right?).

IMHO no: one bitcoin hash uses 2 'normal' SHA256 hashes. But that would give 1.9 MHash/s, which is still double what the MHz/64 = MHash/s assumption gives. I have no clue how the throughput is counted; my understanding so far was that it is the output data rate for hashing 64-byte chunks of input data. (If the input to be hashed is larger than 64 bytes, it is processed in 64-byte chunks, each compressed into the 256-bit state, so the output does not grow in size.)
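The competing readings of the datasheet figures can be laid side by side. The sketch below is plain arithmetic in Python on the two numbers quoted in the thread (126 MHz, 977 Mbps). The last interpretation, that throughput is counted in input bits over 512-bit SHA-256 blocks, is an assumption made here, not something the quoted posts confirm.

```python
CLOCK_HZ = 126e6          # clock rate quoted from the datasheet
THROUGHPUT_BPS = 977e6    # throughput quoted from the datasheet

# About 7.75 bits per clock, i.e. the "approx. 8 bit/clk" in the thread.
bits_per_clock = THROUGHPUT_BPS / CLOCK_HZ

def mhash(bits_per_sha, sha_per_bitcoin_hash):
    """MHash/s if each SHA-256 run consumes bits_per_sha bits of throughput."""
    return THROUGHPUT_BPS / bits_per_sha / sha_per_bitcoin_hash / 1e6

rate_256_single = mhash(256, 1)   # one 256-bit unit per hash: ~3.8
rate_256_double = mhash(256, 2)   # two SHA-256 runs per bitcoin hash: ~1.9
# ASSUMPTION: throughput counts 512-bit input blocks, two runs per hash.
rate_512_double = mhash(512, 2)   # ~0.95
# Rule of thumb from the thread: 64 clocks per SHA-256, two runs per hash.
rate_clock_rule = CLOCK_HZ / 64 / 2 / 1e6   # ~0.98

# Under the 512-bit assumption, one block takes about 66 clocks.
clocks_per_block = 512 / bits_per_clock

for name, value in [("256 bits, 1 SHA", rate_256_single),
                    ("256 bits, 2 SHA", rate_256_double),
                    ("512 bits, 2 SHA", rate_512_double),
                    ("MHz/64, 2 SHA", rate_clock_rule)]:
    print(f"{name}: {value:.2f} MH/s")
```

If the 512-bit reading holds, the datasheet figure and the "roughly 64 clocks per hash" rule agree to within a few percent, which would explain the factor of two that puzzled the poster.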

makomk

Hero Member

Offline

Activity: 686
Merit: 564

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 16, 2011, 02:00:59 PM

#486

Quote from: fpgaminer on August 14, 2011, 10:46:29 PM

Quote

200MH is simply way out of the question for an S6-LX150.

That won't stop me from trying ;D

I saw similar failures at one point. Try enabling register duplication for the Map stage and/or register rebalancing during synthesis. I think I can probably hit at least 140 MHz for 70 Mhash/s on SLX75 with two pipeline stages per round and both of those enabled, plus some other bits, but I need to fix some stuff and test the changes in simulation.

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS

lame.duck

Legendary

Offline

Activity: 1270
Merit: 1000

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 17, 2011, 01:43:24 PM

#487

Quote from: makomk on August 10, 2011, 06:10:10 PM

Finally got around to coding some maximum clock speed improvements for users of smaller Cyclone III and IV devices - now available from my new partial-unroll-speed branch. Expected minimum device size and speed is roughly as follows:

I've got so far:

EP3C25C6 135MHz
EP2C35C6 111(108)MHz
EP2C35C8 80MHz

for the 85 degree Celsius slow timing model, after playing with the options suggested by the 'timing optimization advisor'. One point was that rerunning the compile process a second time does not always give a better or equal result (with timing-driven options 'on'), so it could be wise to work with revisions, or make some other provision for keeping the best bitstream.

One idea I got from the numbers: for the cases addressed, would it perform better to use only one pipeline that computes both hashes alternately, at a lower resource count, even if the pipelines are not 100% equal?

Anoynomous

Newbie

Offline

Activity: 11
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 17, 2011, 09:57:44 PM

#488

Hi all,
I am having a little trouble here. I have some experience designing a SHA1 hash cracker on an FPGA, so this project caught my interest. When I downloaded the code and tried to compile it for the S6 LX150, it took about an hour just to synthesize the code, and then the software said I had overused my resources. So I wanted to know: where did I go wrong?

fpgaminer (OP)

Hero Member

Offline

Activity: 560
Merit: 517

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 17, 2011, 10:54:34 PM

#489

Quote

I am having a little trouble here. I have some experience designing a SHA1 hash cracker on an FPGA, so this project caught my interest. When I downloaded the code and tried to compile it for the S6 LX150, it took about an hour just to synthesize the code, and then the software said I had overused my resources. So I wanted to know: where did I go wrong?

Which project did you use?

For S6-LX150, this is probably the preferred project to start from:
https://github.com/progranism/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/LX150_makomk_Test
You'll want to adjust main_pll.v:98 to 5 for 50MHz, to make the compile easier and the firmware actually usable (assuming you have the S6-LX150T dev board) without cooling.

https://github.com/fpgaminer/Open-Source-FPGA-Bitcoin-Miner | Bitcoin Hardware Wallet

1NT4RyJMqtRuDRr6zHdXdKSpmX3SR5he6z

Anoynomous

Newbie

Offline

Activity: 11
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 18, 2011, 01:46:18 AM

#490

Quote from: fpgaminer on August 17, 2011, 10:54:34 PM

For S6-LX150, this is probably the preferred project to start from:
https://github.com/progranism/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/LX150_makomk_Test
You'll want to adjust main_pll.v:98 to 5 for 50MHz, to make the compile easier and the firmware actually usable (assuming you have the S6-LX150T dev board) without cooling.

Well, I had used LX150_test, and I don't have an LX150 dev board, so I think I will just share my ideas here.

The critical path in this circuit is "t1 = rx_state[`IDX(7)] + e1_w + ch_w + rx_w[31:0] + k".
But k and rx_w[31:0] can be calculated one loop ahead and added to rx_state[`IDX(7)] at the point below:

state_buf[`IDX(7)] <= rx_state[`IDX(6)];

The new code should look like this:
state_buf[`IDX(7)] <= rx_state[`IDX(6)] + rx_w[31:0] + k;
(where k and rx_w are those of the next loop)

This will reduce the adders to:
t1 = rx_state[`IDX(7)] + e1_w + ch_w;

This should improve clock speed, provided routing issues don't interfere.
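As a sanity check on the retiming idea above, here is a small software model in Python (not the project's Verilog). It folds the next round's K + W into the register slot that carries h, so each round's T1 needs only three terms, and then compares the digest against hashlib. The function names and the `hkw` variable are made up for this sketch, and the round constants are derived from prime roots rather than copied from the project source.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def primes(n):
    """First n primes by trial division."""
    found, c = [], 2
    while len(found) < n:
        if all(c % p for p in found):
            found.append(c)
        c += 1
    return found

def iroot(x, k):
    """Floor of the k-th root of x, by binary search on integers."""
    lo, hi = 0, 1 << (x.bit_length() // k + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

# SHA-256 constants: 32 fractional bits of cube / square roots of primes.
K = [iroot(p << 96, 3) & MASK for p in primes(64)]
H0 = [iroot(p << 64, 2) & MASK for p in primes(8)]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def e0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)

def e1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)

def s0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def s1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def ch(e, f, g):
    return ((e & f) ^ (~e & g)) & MASK

def maj(a, b, c):
    return (a & b) ^ (a & c) ^ (b & c)

def schedule(block):
    """Expand one 64-byte block into the 64 message-schedule words."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 64):
        w.append((s1(w[t - 2]) + w[t - 7] + s0(w[t - 15]) + w[t - 16]) & MASK)
    return w

def compress_reference(state, w):
    """Textbook rounds: T1 is a five-term sum."""
    a, b, c, d, e, f, g, h = state
    for t in range(64):
        t1 = (h + e1(e) + ch(e, f, g) + K[t] + w[t]) & MASK
        t2 = (e0(a) + maj(a, b, c)) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [a, b, c, d, e, f, g, h]

def compress_retimed(state, w):
    """Retimed rounds: the h slot already holds h + K + W, so T1 has three terms."""
    a, b, c, d, e, f, g, h = state
    hkw = (h + K[0] + w[0]) & MASK          # pre-added before the first round
    for t in range(64):
        t1 = (hkw + e1(e) + ch(e, f, g)) & MASK
        t2 = (e0(a) + maj(a, b, c)) & MASK
        # Next round's h is this round's g; fold in the next K + W now.
        # The last round stores plain g so the final state is correct.
        nxt = (K[t + 1] + w[t + 1]) & MASK if t < 63 else 0
        hkw = (g + nxt) & MASK
        a, b, c, d, e, f, g = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f
    return [a, b, c, d, e, f, g, hkw]

def sha256_one_block(msg, compress):
    """SHA-256 of a short message (under 56 bytes) using the given rounds."""
    assert len(msg) < 56
    block = msg + b"\x80" + b"\x00" * (55 - len(msg)) + struct.pack(">Q", 8 * len(msg))
    out = compress(H0, schedule(block))
    return b"".join(struct.pack(">I", (x + y) & MASK) for x, y in zip(H0, out))

expected = hashlib.sha256(b"abc").digest()
print(sha256_one_block(b"abc", compress_retimed) == expected)
```

The model only shows that the rearrangement is arithmetically equivalent. Whether it actually shortens the critical path on a given device depends on how the tools map and route the adders.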

Anoynomous

Newbie

Offline

Activity: 11
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 18, 2011, 02:02:19 AM

#491

If the above solution is applied, the calculation of new_w will be the new critical path:
new_w = s1_w + rx_w[319:288] + s0_w + rx_w[31:0];

Again, s0_w can be calculated a loop ahead and added to rx_w[31:0]. This way new_w is shortened to:

new_w = s1_w + rx_w[319:288] + rx_w[31:0];

This decreases the critical path and possibly increases the clock frequency.
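The same kind of check can be done for the message schedule. The Python sketch below models the 16-word window as a list (index 0 is the oldest word, matching rx_w[31:0]) and computes s0 plus the oldest word one loop ahead, so the per-loop sum has three terms instead of four. The names `schedule_retimed`, `win` and `pre` are illustrative only and do not come from the project source.

```python
MASK = 0xFFFFFFFF

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def s0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def s1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def schedule_reference(first16):
    """Textbook schedule: each new word is a four-term sum."""
    w = list(first16)
    for t in range(16, 64):
        w.append((s1(w[t - 2]) + w[t - 7] + s0(w[t - 15]) + w[t - 16]) & MASK)
    return w

def schedule_retimed(first16):
    """Same schedule, with s0(...) + oldest word computed one loop ahead."""
    win = list(first16)                      # win[0] is the oldest word
    pre = (s0(win[1]) + win[0]) & MASK       # primed for the first loop
    out = list(win)
    for _ in range(48):
        new_w = (s1(win[14]) + win[9] + pre) & MASK   # three terms only
        # After the shift, win[2] becomes win[1] and win[1] becomes win[0].
        pre = (s0(win[2]) + win[1]) & MASK
        win = win[1:] + [new_w]
        out.append(new_w)
    return out

# Arbitrary test input; any sixteen 32-bit words would do.
SAMPLE = [(i * 0x9E3779B1 + 1) & MASK for i in range(16)]
print(schedule_retimed(SAMPLE) == schedule_reference(SAMPLE))
```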

Can anybody tell me the percentage of LUTs utilized after synthesis? There may be a possibility of replacing the adder logic.

fpgaminer (OP)

Hero Member

Offline

Activity: 560
Merit: 517

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 18, 2011, 04:30:49 AM

#492

Quote

Well, I had used LX150_test, and I don't have an LX150 dev board, so I think I will just share my ideas here.

Oops, sorry, LX150_Test isn't really usable at the moment. I really need to add a useful README outlining all those different project variations ...

Thank you for contributing your idea!

Please take a look at the project variation I linked: https://github.com/progranism/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/LX150_makomk_Test

You will find that your idea, for the most part, has already been implemented in there. Specifically look around this line.

BUT: You did point something out that I think I missed. In the code I linked you'll see that the pre-calculated T1 value is stored in a separate register, not tx_state[7] as you listed in your example. On looking at my code, I believe you are correct; tx_state[7] is never used (except for the last round) so it could be removed or replaced with the partial calculation. Good catch, Anoynomous!

Not sure if the compiler catches this optimization automatically or not.

Quote

Again, s0_w can be calculated a loop ahead and added to rx_w[31:0]. This way new_w is shortened to:

Now that, I hadn't thought of. Another fantastic catch, Anoynomous!

Double check me on this:

Code:

tx_pre_w <= s0(rx_w[2]) + rx_w[1];     // Calculate the next round's s0 + the next round's w[0].
tx_new_w <= s1(rx_w[14]) + rx_w[9] + rx_pre_w;

Quote

If the above solution is applied, the calculation of new_w will be the new critical path...

The calculation of tx_state[0] is the current critical path:

Code:

t1 = rx_t1_part + e1_w + ch_w;
tx_state[0] <= t1 + e0_w + maj_w;

Which is actually pretty good, since it's implemented as only two adders.

https://github.com/fpgaminer/Open-Source-FPGA-Bitcoin-Miner | Bitcoin Hardware Wallet

1NT4RyJMqtRuDRr6zHdXdKSpmX3SR5he6z

Anoynomous

Newbie

Offline

Activity: 11
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 18, 2011, 05:26:10 AM

#493

Quote from: fpgaminer on August 18, 2011, 04:30:49 AM

Double check me on this:

Code:

tx_pre_w <= s0(rx_w[2]) + rx_w[1];     // Calculate the next round's s0 + the next round's w[0].
tx_new_w <= s1(rx_w[14]) + rx_w[9] + rx_pre_w;

Right. Though tx_pre_w can be stored in the w[0] slot that is passed on to the next loop, which saves a register.

makomk

Hero Member

Offline

Activity: 686
Merit: 564

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 18, 2011, 09:38:55 AM

#494

Quote from: fpgaminer on August 18, 2011, 04:30:49 AM

BUT: You did point something out that I think I missed. In the code I linked you'll see that the pre-calculated T1 value is stored in a separate register, not tx_state[7] as you listed in your example. On looking at my code, I believe you are correct; tx_state[7] is never used (except for the last round) so it could be removed or replaced with the partial calculation. Good catch, Anoynomous!

Not sure if the compiler catches this optimization automatically or not.

I'm reasonably sure Altera's compiler for Cyclone IV does because of the large decrease in resource usage. On Cyclone IV it uses less resources to store the partially pre-calculated T1 value than it does to store tx_state[`IDX(7)] because registering logic outputs is practically free but registering the output of another register ties up an entire LE per bit that can't be used for anything else. No idea if Xilinx's tools catch this though.

Quote from: Anoynomous on August 18, 2011, 05:26:10 AM

Quote from: fpgaminer on August 18, 2011, 04:30:49 AM

Double check me on this:

Code:

tx_pre_w <= s0(rx_w[2]) + rx_w[1];     // Calculate the next round's s0 + the next round's w[0].
tx_new_w <= s1(rx_w[14]) + rx_w[9] + rx_pre_w;

Right. Though tx_pre_w can be stored in the w[0] slot that is passed on to the next loop, which saves a register.

Oooh, cunning - nice one Anoynomous! Costs a register overall due to having to get rx_w[2] out of storage, but might be worthwhile. In theory could it be cheaper to do this with s1(rx_w[14]) + rx_w[9] instead?

Code:

tx_pre_w <= s1(rx_w[15]) + rx_w[10];     // Calculate the next round's s1 + the next round's w[9].
tx_new_w <= s0(rx_w[1]) + rx_w[0] + rx_pre_w;
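The alternative split proposed here can be checked the same way. This Python sketch models only the arithmetic of the two lines above: s1 and w[9] for the next round are summed one cycle early from window positions 15 and 10, and the remaining s0 term and w[0] are added in the current cycle. The list-based window and the names `win` and `pre` are assumptions of the sketch, not the project's code, and it says nothing about which variant is cheaper in registers.

```python
MASK = 0xFFFFFFFF

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def s0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def s1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def schedule_reference(first16):
    """Textbook schedule for comparison."""
    w = list(first16)
    for t in range(16, 64):
        w.append((s1(w[t - 2]) + w[t - 7] + s0(w[t - 15]) + w[t - 16]) & MASK)
    return w

def schedule_s1_ahead(first16):
    """Variant: pre-compute s1(...) + w[9] for the next round instead of s0."""
    win = list(first16)                       # win[0] oldest, win[15] newest
    pre = (s1(win[14]) + win[9]) & MASK       # primed for the first round
    out = list(win)
    for _ in range(48):
        new_w = (s0(win[1]) + win[0] + pre) & MASK
        # After the shift, win[15] becomes win[14] and win[10] becomes win[9].
        pre = (s1(win[15]) + win[10]) & MASK
        win = win[1:] + [new_w]
        out.append(new_w)
    return out

# Arbitrary test input; any sixteen 32-bit words would do.
SAMPLE = [(i * 0x85EBCA6B + 7) & MASK for i in range(16)]
print(schedule_s1_ahead(SAMPLE) == schedule_reference(SAMPLE))
```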

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS

mb300sd

Legendary

Offline

Activity: 1260
Merit: 1000

Drunk Posts

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 19, 2011, 12:17:28 AM

#495

Do you know if the code will run/fit on a Spartan XC2S30? I have a ProxMark3 (RFID hacking tool) that I've been playing with the FPGA on, and I'm wondering if it's capable of mining. I can deal with the ARM code to interface between the FPGA and USB.

1D7FJWRzeKa4SLmTznd3JpeNU13L1ErEco

ngzhang

Hero Member

Offline

Activity: 592
Merit: 501

We will stand and fight.

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 19, 2011, 03:20:07 AM

#496

Quote from: mb300sd on August 19, 2011, 12:17:28 AM

I'm very sad to say it is impossible...
The LX150 we use has approx. 150,000 logic cells, but the XC2S30 has fewer than 1,000 of them.
In addition, the logic cells in Spartan-6 are far more capable than those in Spartan-II.

rph

Full Member

Offline

Activity: 176
Merit: 100

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 20, 2011, 09:18:23 PM

#497

There are ways to get the critical path down to a single 2-input 32 bit adder.
If you think carefully about what you're building.

-rph

Ultra-Low-Cost DIY FPGA Miner: https://bitcointalk.org/index.php?topic=44891

newMeat1

Full Member

Offline

Activity: 210
Merit: 100

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 21, 2011, 03:16:52 AM

#498

I sure hope you're right!

Join CampBX

fpgaminer (OP)

Hero Member

Offline

Activity: 560
Merit: 517

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 21, 2011, 09:07:06 AM

#499

Quote

There are ways to get the critical path down to a single 2-input 32 bit adder.
If you think carefully about what you're building.

You want 3-input adders on 6 series Spartans, not 2-input. And yes, of course you can reduce the critical path to a single adder, but it requires an immense quantity of registers.

And before you suggest it, don't tell me to run the FPGA faster to avoid extra pipeline registers :P. Spartan-6 isn't designed to run faster than ~250MHz. The memory doesn't run faster than that, and I think even the DSPs top out at that level.

https://github.com/fpgaminer/Open-Source-FPGA-Bitcoin-Miner | Bitcoin Hardware Wallet

1NT4RyJMqtRuDRr6zHdXdKSpmX3SR5he6z

Venkatesh Srinivas

Newbie

Offline

Activity: 18
Merit: 0

Re: Official Open Source FPGA Bitcoin Miner (Spartan-6 Now Tops Performance per $!)

August 21, 2011, 02:13:52 PM

#500

For anyone who has run this design on the LX9 microboard, what sort of hashrate did you get? And how many slices were used (and at what unrolling level?).

Thanks,
-- vs
