Wow, this is such a coincidence! I was just browsing the forums tonight and stumbled upon this thread. I finally registered an account just to post here.
I've been working on an FPGA miner for the past few weeks! It's fully working*, currently running on my desk in front of me and generating some tasty shares.
I'll give an overview of my work:
Current Performance
Device: Altera Cyclone 3 C120 Dev Kit
Performance: 70Mhash/s
Power: 2.26W
Efficiency: 31.0 Mhash/W
It's written in Verilog, all crafted painstakingly by hand. There are two alternative designs.
One is a serial design composed of many SHA256 cores running in parallel, each core computing a hash in 64 cycles (2 cores needed for the full hash). Each full core (2 half cores) consumes about 2800 LEs.
The second design (currently running in front of me) is a pipelined version with one LOOOOONNNNGGGG chain of hashing stages running in parallel. That design computes 1 full hash every clock cycle. It runs at 70MHz right now. I haven't actually tried pushing it to its limit, so it may very well run much faster. I'm hoping for 100MHz.
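For anyone wondering where "2 cores for the full hash" comes from, here is a rough software sketch in Python (not the Verilog). It assumes the usual mining setup, which the post doesn't spell out: the hash is SHA-256 applied twice to an 80-byte header, and the first 64 bytes of that header stay fixed while the 4-byte nonce at the end is scanned. The function names and the dummy header are made up for illustration.

```python
import hashlib
import struct

def sha256d(data: bytes) -> bytes:
    """Reference version: SHA-256 of the SHA-256 digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def make_midstate(header76: bytes):
    """Hash state after the first 64 bytes, which do not depend on the nonce."""
    assert len(header76) == 76
    return hashlib.sha256(header76[:64])

def hash_with_midstate(midstate, tail12: bytes, nonce: int) -> bytes:
    """Finish the double hash for one nonce, reusing the saved state.

    Conceptually this is two 64-round compressions per nonce:
    one for the last 16 header bytes plus padding, and one for
    the second SHA-256 pass over the 32-byte digest.
    """
    h = midstate.copy()                          # reuse the shared prefix work
    h.update(tail12 + struct.pack("<I", nonce))  # assumed little-endian nonce
    return hashlib.sha256(h.digest()).digest()   # second SHA-256 pass

# Dummy 76-byte header prefix (everything except the nonce); not real block data.
header76 = bytes(range(76))
mid = make_midstate(header76)
tail12 = header76[64:]

for nonce in range(3):
    digest = hash_with_midstate(mid, tail12, nonce)
    print(nonce, digest.hex())
```

Under those assumptions, each nonce costs two compressions, which lines up with a full core being built from two half cores.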
These are my results after off-and-on work for a few weeks. I've actually put most of my efforts into the serial design, because the pipelined design takes at least an hour to synthesize each time. The serial design can currently fit 42 full cores into the C120, each running at 90MHz and computing a full hash every 64 cycles. That's about 59Mhash/s.
The latest revision of the pipelined design consumes 90,000 LEs, so it's pretty big. I'm working to cram it into <64,000 LEs so I can get two of them in one C120 chip and push their clocks to 100MHz, giving me a whopping 200Mhash/s.
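The throughput figures quoted so far follow from simple arithmetic, sketched here in Python. The function names are made up for illustration; the inputs (64 cycles per hash, 42 cores at 90MHz, about 2800 LEs per full core, 90,000 LEs for the pipeline) are the ones given in the post.

```python
CYCLES_PER_HASH = 64  # cycles a serial full core needs per hash, per the post

def serial_mhash(full_cores: int, clock_mhz: float) -> float:
    """Serial design: every full core finishes one hash per 64 cycles."""
    return full_cores * clock_mhz / CYCLES_PER_HASH

def pipelined_mhash(pipelines: int, clock_mhz: float) -> float:
    """Pipelined design: every pipeline finishes one hash per clock cycle."""
    return pipelines * clock_mhz

def les_per_mhash(total_les: float, mhash: float) -> float:
    """Rough area cost: logic elements spent per Mhash/s."""
    return total_les / mhash

serial_now = serial_mhash(42, 90)        # current serial build
pipe_now = pipelined_mhash(1, 70)        # current pipelined build
pipe_goal = pipelined_mhash(2, 100)      # hoped-for two pipelines at 100MHz

print(f"serial, 42 cores @ 90MHz: {serial_now:.1f} Mhash/s")
print(f"pipelined, 1 @ 70MHz:     {pipe_now:.1f} Mhash/s")
print(f"pipelined, 2 @ 100MHz:    {pipe_goal:.1f} Mhash/s")
print(f"serial LEs per Mhash/s:    {les_per_mhash(42 * 2800, serial_now):.0f}")
print(f"pipelined LEs per Mhash/s: {les_per_mhash(90_000, pipe_now):.0f}")
```

The last two lines are only a rough area comparison, since the 2800 LE figure is itself approximate.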
I haven't used the on-board power meter before, but if I'm reading it correctly the FPGA is currently using 2.26 Watts. That ... seems really low, but Altera's website verifies that that's actually above average for a C120, so I guess it's accurate. That's 31 Mhash/W, which is 1200% more efficient than the most efficient GPU listed on the Wiki. So efficient, it's basically free. Poor guy runs terribly hot though. I need to go put a fan on him...
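The efficiency number can be checked with a few lines of Python. This is only a sketch using the 70Mhash/s and 2.26W readings from the post; the helper names are made up. The GPU figure is not taken from the Wiki; it is just what the "1200% more" claim would imply, under two possible readings of that phrase.

```python
def mhash_per_watt(mhash_per_s: float, watts: float) -> float:
    """Hash rate divided by power draw."""
    return mhash_per_s / watts

def nanojoules_per_hash(mhash_per_s: float, watts: float) -> float:
    """Energy spent on a single hash, in nanojoules."""
    return watts / (mhash_per_s * 1e6) * 1e9

def implied_baseline(value: float, factor: float) -> float:
    """Baseline efficiency implied if `value` is `factor` times the baseline."""
    return value / factor

fpga = mhash_per_watt(70, 2.26)
print(f"FPGA efficiency: {fpga:.2f} Mhash/W")
print(f"Energy per hash: {nanojoules_per_hash(70, 2.26):.1f} nJ")

# "1200% more" read literally means 13x; read loosely it means 12x.
for factor in (13, 12):
    print(f"implied GPU at {factor}x: {implied_baseline(fpga, factor):.2f} Mhash/W")
```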
The only downside is that this board in particular, the C120, costs $1000. The same design will easily fit into the DE2-115 board (from Terasic), which only costs $600. I have one of those too, so I'll test on him later. You're not likely to pay off that $600 quickly, though, so I guess it isn't economical yet. A reduced version may run in the DE0-Nano board, which is $80, but obviously it won't have the same performance (about 25%).
All my efforts are put into optimizing every last bit of the design, so we'll see how far I push the poor FPGA. It already out-performs my GTX 285 card, and at a fraction of the power cost, so I'm happy.
And I'm only getting started.
Who wants to front the money to buy me a Stratix board and move this into HardCopy?
* By fully working, I really do mean it. It's happily submitting hashes to a pool. I was quite thrilled when my little baby submitted his first share.