Bitcoin Forum
November 08, 2024, 08:15:22 PM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 [21] 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 »
  Print  
Author Topic: Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013)  (Read 432941 times)
themike5000
Member
**
Offline Offline

Activity: 99
Merit: 10


View Profile
July 14, 2011, 06:55:58 PM
 #401

Is there any way to run the fpgaminer program on a pc without a quartus license?

Now there is:

https://github.com/teknohog/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/DE2_115_makomk_serial

Basically, my earlier serial port code for Xilinx, adapted into the DE2-115 project. As with the other projects, this runs at 50 MHz by default, but I chose the makomk version so that I could run it at 109 MHz, and it works great Cheesy

I also found a way to program the FPGA without Altera tools, using UrJTAG. There is a script included for this.

I'll also post .sof and .svf files for a quick start, as soon as the 50 MHz versions are ready.

By the way, one reason for this port was the problem of getting tclcurl working. The Quartus II software for Linux is 32-bit only, and while it runs on a 64-bit system, it is tricky to install a suitable library.

Thank you!  But unfortunately, my board does not have a serial connection Undecided

Vertcoin: VdHjU3L2dcHCR3uQmqpM6mf4LCvp2678wh
teknohog
Sr. Member
****
Offline Offline

Activity: 520
Merit: 253


555


View Profile WWW
July 14, 2011, 07:07:53 PM
 #402

Thank you!  But unfortunately, my board does not have a serial connection Undecided

Ah, I noticed this has already been discussed in some form. I think somebody also mentioned level converters, you could then use any general I/O pins for the serial port. For example, I've used an old Nokia phone cable for this, and then there are other simple circuits, for example using the MAX232.

(A lot of embedded systems have serial ports at this 3.3V level. I guess you could have a serial connection with such levels at both ends, without any conversion.)

world famous math art | masternodes are bad, mmmkay?
Every sha(sha(sha(sha()))), every ho-o-o-old, still shines
makomk
Hero Member
*****
Offline Offline

Activity: 686
Merit: 564


View Profile
July 14, 2011, 09:03:59 PM
 #403

By the way - and this is probably what I get for not reading data sheets - I only just realized that the Virtex-5 and Virtex-6 series don't suffer from the annoying "only half the CLBs have fast carry logic" restriction of Spartan-6 FPGAs. Guess that could partly explain why Spartan-6 FPGAs have so much trouble with this design...

It appears ISE has no trouble at all fitting a 150 MHash/sec design on the Kintex-7 XC7K70T either (as in, it reaches that clock without even trying hard) - though note that this is an untested tweaked design and obviously I don't have the FPGA to run it on anyway. Kintex-7 is supposed to basically be a much cheaper 28nm equivalent of the Virtex-6 series.

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
July 14, 2011, 10:07:50 PM
 #404

Quote
I also found a way to program the FPGA without Altera tools, using UrJTAG. There is a script included for this.
Cheesy Fantastic! I played with UrJTAG once, but it just barfed on me. Maybe I'll give it another try. Their disclaimer that it might damage the hardware is a bit disconcerting ... but I doubt it's likely.

Quote
Too good to be true?
Unless you have the code, or the physical device in your hand, you shouldn't believe it. This applies to every number someone quotes, including myself. It's like that quote you find in a lot of version control tutorials and guides. If it isn't commit-ed, it doesn't exist.

Quote
By the way - and this is probably what I get for not reading data sheets - I only just realized that the Virtex-5 and Virtex-6 series don't suffer from the annoying "only half the CLBs have fast carry logic" restriction of Spartan-6 FPGAs. Guess that could partly explain why Spartan-6 FPGAs have so much trouble with this design...
That might explain it. Since SHA-256 is almost entirely addition logic, then that means half the Spartan-6 chips are useless. But I couldn't even get 64 rounds fit on my LX150. Maybe the CLBs are ordered in such a way that it was forced to route around the useless ones, causing massive routing delays? If so, padding with registers between rounds might fix that (as others have suggested). If those useless CLBs can't be used for anything else, might as well pack 'em with a bunch of registers.

It could also be a combination of poor routing design, and having to route around the useless CLBs.

Quote
It appears ISE has no trouble at all fitting a 150 MHash/sec design on the Kintex-7 XC7K70T either (as in, it reaches that clock without even trying hard) - though note that this is an untested tweaked design and obviously I don't have the FPGA to run it on anyway. Kintex-7 is supposed to basically be a much cheaper 28nm equivalent of the Virtex-6 series.
A great first number! I'm very interested in the Kintex-7 series, and hope they are made available soon. I'll be checking their booth at CES next year for sure  Cheesy Think they'll sell me a devkit for Bitcoins?  Tongue

makomk
Hero Member
*****
Offline Offline

Activity: 686
Merit: 564


View Profile
July 14, 2011, 10:29:50 PM
Last edit: July 14, 2011, 10:45:02 PM by makomk
 #405

That might explain it. Since SHA-256 is almost entirely addition logic, then that means half the Spartan-6 chips are useless. But I couldn't even get 64 rounds fit on my LX150. Maybe the CLBs are ordered in such a way that it was forced to route around the useless ones, causing massive routing delays? If so, padding with registers between rounds might fix that (as others have suggested). If those useless CLBs can't be used for anything else, might as well pack 'em with a bunch of registers.

It could also be a combination of poor routing design, and having to route around the useless CLBs.
My personal suspicion is that support for this particular Spartan-6 misfeature in ISE is buggy and badly tested; I'm fairly sure that at one point I was trying to cram on more adders than there were even SLICEM/Ls it could fit them in and it tried to map them anyway without realising this was impossible. It also doesn't seem to report usage of this particular resource in any useful way. It's possible that their routing design is also bad too, of course. I recently upgraded to ISE 13.2 and its behaviour in this department seems to have changed quite a bit from ISE 13.1, though I'm not sure if this is entirely for the better...

Edit: Think I may have been wrong about this; it's a bit under the maximum number of SLICEM/Ls available in theory.

Also, have you tried disabling shift register inferrence or turning up the threshold for it? Those compete for the same resources as adders.

A great first number! I'm very interested in the Kintex-7 series, and hope they are made available soon. I'll be checking their booth at CES next year for sure  Cheesy Think they'll sell me a devkit for Bitcoins?  Tongue
Heheheheheheh. They certainly seem impressive, though I'll wait until someone's actually submitting shares with one, or at least until they're actually available for sale, before I actually believe it ;-).

Edit 2: 200 MHash/sec, though that increased total build time to over an hour. I have a feeling this isn't even pushing the tools slightly yet...

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
TheSeven
Hero Member
*****
Offline Offline

Activity: 504
Merit: 500


FPGA Mining LLC


View Profile WWW
July 14, 2011, 11:11:41 PM
 #406

Also, have you tried disabling shift register inferrence or turning up the threshold for it? Those compete for the same resources as adders.
That's my suspicion as well, however I failed to figure out how to configure this, didn't have time to search thoroughly. Do you have a hint?

My tip jar: 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY
makomk
Hero Member
*****
Offline Offline

Activity: 686
Merit: 564


View Profile
July 14, 2011, 11:27:31 PM
 #407

That's my suspicion as well, however I failed to figure out how to configure this, didn't have time to search thoroughly. Do you have a hint?
"HDL Options" section of the XST configuration, options "Shift register extraction" and "Shift register minimum size". The default is to create shift registers as short as 2 stages, which is a bit daft. (I assume you've already figured out this quirk of ISE, but for some reason you have to select fpgaminer_top in the hierarchy before it'll show you the list of process steps and let you configure them.)

This probably matters most for Spartan-6; other Xilinx FPGAs don't seem to have such annoying limitations on adders.

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
makomk
Hero Member
*****
Offline Offline

Activity: 686
Merit: 564


View Profile
July 16, 2011, 11:17:46 AM
 #408

By the way, anyone running with LOOP_LOG2=1, 2 or 3 might find the experimental partial-unroll-opt branch useful - it reduces the resource usage somewhat. Unfortunately it also breaks LOOP_LOG2=4 and greater and hasn't been tested on actual FPGAs.

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
July 17, 2011, 08:05:58 AM
 #409

July 17th, 2011 - Code Updates and Minor Cleanup
teknohog's Xilinx Verilog port on the public repo has been updated. teknohog's serial modifications to makomk's code have been added as a separate project. OrphanedGland's port to Stratix devices, using VHDL, has been merged into the public repo. To top it all off, I updated the project's main README.md file, to prominently include a list of contributors and their donation addresses, because they deserve recognition for their hard work. I will modify the first post in this thread to include the same list Smiley

As it wasn't mentioned before on the first post, I am mentioning here that makomk made improvements to my base Verilog code. These changes improved both the overall performance of the design, and its area consumption, allowing the design to fit on a smaller, cheaper EP4CE75 chip. Great work makomk!

fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
July 17, 2011, 08:07:38 AM
Last edit: July 19, 2011, 11:26:44 PM by fpgaminer
 #410

udif and makomk, I do not have donation addresses for you. If you would like your donation address added to the first post with the Contributors list, and in the github project's README.md, please contact me and let me know.

lame.duck
Legendary
*
Offline Offline

Activity: 1270
Merit: 1000


View Profile
July 17, 2011, 08:23:03 PM
 #411

By the way, anyone running with LOOP_LOG2=1, 2 or 3 might find the experimental partial-unroll-opt branch useful - it reduces the resource usage somewhat. Unfortunately it also breaks LOOP_LOG2=4 and greater and hasn't been tested on actual FPGAs.

Great work, on my EP3C25C6 Board the Device utilisation is reduced by approx 20% and the Fmax increases from 87MHz to 103Mhz but maybe there is still some room. I did some compile cycles i get different results and the difference is substantial. 10 Mhz  or so.
I could verify that it works, by erarning shares.

For the EP2C35C8 i am able to change LOOP_LOG2 from 3 to 2 with no change for Fmax. The test with the real hardware isn't done yet  but maybe tomorrow.

Is the toplevel compatible with the serial toplevel?

And going OT, could someone produce a bitstream for the EP1C80 for me?
makomk
Hero Member
*****
Offline Offline

Activity: 686
Merit: 564


View Profile
July 17, 2011, 08:59:47 PM
 #412

Great work, on my EP3C25C6 Board the Device utilisation is reduced by approx 20% and the Fmax increases from 87MHz to 103Mhz but maybe there is still some room. I did some compile cycles i get different results and the difference is substantial. 10 Mhz  or so.
I could verify that it works, by erarning shares.

For the EP2C35C8 i am able to change LOOP_LOG2 from 3 to 2 with no change for Fmax. The test with the real hardware isn't done yet  but maybe tomorrow.
Glad to hear a success story!

Is the toplevel compatible with the serial toplevel?
Unfortunately, some manual merging of changes is probably going to be required because both this and serial support modify fpgaminer_top.v. Shouldn't be too difficult to do in theory though, especially now that teknohog's merged his serial code with an older version of my modificatiosn.

And going OT, could someone produce a bitstream for the EP1C80 for me?
I wasn't even aware there was such a device...

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
lame.duck
Legendary
*
Offline Offline

Activity: 1270
Merit: 1000


View Profile
July 17, 2011, 09:15:11 PM
 #413

And going OT, could someone produce a bitstream for the EP1C80 for me?
I wasn't even aware there was such a device...

Err...  EP1S80F1508C6, unfortunalty altera issues only licenses for quartus that are sold by licensed distributirs so i have no possiblity to produce a bitstream myself. Sad
mlut
Newbie
*
Offline Offline

Activity: 6
Merit: 0


View Profile
July 19, 2011, 07:28:40 AM
 #414

Does someone got the current version on github (fpgaminer + makomk modifications) successfully running in an FPGA?
It compiles problemless in a EPS4GX230 device but creates me only Stales, no Shares.
The "original" fpgaminer version runs problemless ...
fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
July 19, 2011, 11:25:36 PM
 #415

Does someone got the current version on github (fpgaminer + makomk modifications) successfully running in an FPGA?
It compiles problemless in a EPS4GX230 device but creates me only Stales, no Shares.
The "original" fpgaminer version runs problemless ...
Are you using projects/DE2_115_makomk_mod/ with CONFIG_LOOP_LOG2 set to something other than 0? It must be set to 0 for the version on my public repo. The version on makomk's repo will work with other settings, but it basically just reverts to the unoptimized version if you do that so there's little point in using it unless CONFIG_LOOP_LOG2 is 0.

You also mentioned in your PM that you're using an "old" mining tcl script. How old? It was updated a few weeks ago to reflect changes in the code.

fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
July 20, 2011, 12:48:32 AM
 #416

Just a quick progress report. I dug back into my Spartan-6 LX150, updated its code to the version that can roll up, and used a CONFIG_LOOP_LOG2 setting of 5; I just wanted to get something compiled and working  Tongue And after waiting long enough, it did indeed churn out a valid result! So ... progress! I am going to clean up the project and get it into the public repo. From there I will crank down the LOOP_LOG2 setting as low as it will go, and begin adding extra pipelining to see if it will push further.

mlut
Newbie
*
Offline Offline

Activity: 6
Merit: 0


View Profile
July 20, 2011, 09:28:48 AM
 #417

Does someone got the current version on github (fpgaminer + makomk modifications) successfully running in an FPGA?
It compiles problemless in a EPS4GX230 device but creates me only Stales, no Shares.
The "original" fpgaminer version runs problemless ...
Are you using projects/DE2_115_makomk_mod/ with CONFIG_LOOP_LOG2 set to something other than 0? It must be set to 0 for the version on my public repo. The version on makomk's repo will work with other settings, but it basically just reverts to the unoptimized version if you do that so there's little point in using it unless CONFIG_LOOP_LOG2 is 0.

You also mentioned in your PM that you're using an "old" mining tcl script. How old? It was updated a few weeks ago to reflect changes in the code.

Correct, i used DE2_115_makomk_mod, changed the FPGA type and the clock pin, CONFIG_LOOP_LOG2 was 0.
Do you publish also a version where CONFIG_LOOP_LOG2 can be different from 0?
Because the virtual_wires are the same I though that I can still use the tcl scripts from your "original" design !?!?!? Is this not a good idea?
fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
July 21, 2011, 01:52:53 AM
 #418

Quote
Correct, i used DE2_115_makomk_mod, changed the FPGA type and the clock pin, CONFIG_LOOP_LOG2 was 0.
Did you set the correct voltage for the clock pin?
Is the clock 50MHz? If it is not 50MHz, you will have to adjust the sdc file, so Quartus knows the correct speed of the clock.

What dev board is it, by the way?

Quote
Because the virtual_wires are the same I though that I can still use the tcl scripts from your "original" design !?!?!? Is this not a good idea?
That is correct, the tcl mining script should work fine, as long as you're using the latest version: https://github.com/progranism/Open-Source-FPGA-Bitcoin-Miner/tree/master/scripts/mine

Quote
Do you publish also a version where CONFIG_LOOP_LOG2 can be different from 0?
The unoptimized version works with any setting of CONFIG_LOOP_LOG2. makomk's personal repository also has a version of his code that works with any setting of CONFIG_LOOP_LOG2, but as I pointed out there isn't much point to using it if CONFIG_LOOP_LOG2 isn't 0 because it basically reverts to the unoptimized version.

TheSeven
Hero Member
*****
Offline Offline

Activity: 504
Merit: 500


FPGA Mining LLC


View Profile WWW
July 21, 2011, 08:50:26 AM
 #419

Just a quick progress report. I dug back into my Spartan-6 LX150, updated its code to the version that can roll up, and used a CONFIG_LOOP_LOG2 setting of 5; I just wanted to get something compiled and working  Tongue And after waiting long enough, it did indeed churn out a valid result! So ... progress! I am going to clean up the project and get it into the public repo. From there I will crank down the LOOP_LOG2 setting as low as it will go, and begin adding extra pipelining to see if it will push further.

Just to let you know, I just managed to route a fully-unrolled doubly-pipelined miner at 50MHz on the LX150 overnight, so it's definitely doable. As I don't have access to an LX150 I haven't verified its correctness yet, but even if it has some bugs, I don't think fixing them would affect timing much. I might try to improve it during the weekend.

My tip jar: 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY
fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
July 21, 2011, 09:56:25 AM
 #420

Just to let you know, I just managed to route a fully-unrolled doubly-pipelined miner at 50MHz on the LX150 overnight, so it's definitely doable. As I don't have access to an LX150 I haven't verified its correctness yet, but even if it has some bugs, I don't think fixing them would affect timing much. I might try to improve it during the weekend.
Is the code available anywhere? I'd be happy to compile and run it on my board  Grin I'm guessing it's a modified version of your VHDL port?

I tried compiling a version of my code, LOOP_LOG2=1, and two extra pipeline registers with Register Balancing enabled. It couldn't even get past Mapping, with an area constraint error  Tongue

Pages: « 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 [21] 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 »
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!