Algorithmically placed FPGA miner: 255MH/s/chip, supports all known boards

Topic: Algorithmically placed FPGA miner: 255MH/s/chip, supports all known boards (Read 119437 times)

BTC-engineer

Sr. Member

Offline

Activity: 360
Merit: 250

Re: Algorithmically placed FPGA miner: 192MH/s and rising

March 09, 2012, 10:24:49 PM

#121

Quote from: eldentyrell on March 09, 2012, 01:26:17 AM

The design is very easy to forward-port to the Xilinx 7-series parts; I just haven't had a reason to do that yet. I've even backwards-ported it to older devices, but the effort/reward tradeoff there doesn't usually work out (it did this time only because I got the chips almost-for-free). It's also possible to port it to most SASIC platforms, but my "are you serious about this" threshold for exploring that is really really high (and only with people based in the USA since there would be contracts involved).

Congratulations also from me for the great progress in your hard work.

Interesting that you think your design could easily be forward-ported to the new Xilinx 28nm FPGAs. This surprises me a little bit, because I always thought your design was so highly Spartan-6 LX150 optimized/specific. How deeply have you already looked into the Artix architecture, and wouldn't you have to do a lot of work just to 'fill up' the bigger chip, independently of the slightly different architecture?

I'm playing with the idea of building an FPGA board with Artix FPGAs.
One of the first ones to come out will be the 352K version of the Artix, but it doesn't look like the first chips will be available in less than 6-8 months :-(

eldentyrell (OP)

Donator
Legendary

Offline

Activity: 980
Merit: 1004

felonious vagrancy, personified

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 09, 2012, 10:29:17 PM

#122

Quote from: rjk on March 09, 2012, 10:22:16 PM

Quote from: eldentyrell on March 09, 2012, 10:08:27 PM

I'll even let somebody bring their own board but I have to keep the board afterwards. I'll probably need a ztex board at some point so when I do the demo we'll probably have somebody who doesn't know me bring a ztex board and I'll buy it from them as part of the demo.

I'm not sure I understand this requirement. Are you somehow burning an irreversible encryption key into the chip first? Is there no way to undo that step?

Large Spartan chips like the 150 have a WRITE-ONLY nonvolatile register that can hold a bitstream decryption key. There is (supposedly) no way to read the key back from the register; all you can do is hand the device an encrypted bitstream and let it use the key to decrypt+load.

The device also has a unique identity register (DNA). Unfortunately it is utterly trivial to create a circuit that looks exactly like this unique identity register and then modify an unencrypted design to use that instead of the true DNA register. So, chip-specific designs must be encrypted.

The printing press heralded the end of the Dark Ages and made the Enlightenment possible, but it took another three centuries before any country managed to put freedom of the press beyond the reach of legislators. So it may take a while before cryptocurrencies are free of the AML-NSA-KYC surveillance plague.

Inspector 2211

Sr. Member

Offline

Activity: 448
Merit: 250

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 09, 2012, 10:29:48 PM

#123

Quote from: eldentyrell on March 09, 2012, 10:08:27 PM

Quote from: Wandering Albatross on March 09, 2012, 05:30:38 AM

Potential bidders for the IP are altera, xilinx, possibly others (like terasic, etc.) and the BTC FPGA community. I know very little about the fpga market but

The topology makes use of a few Xilinx-specific features, so it would require effort to port that. However, the geometry is very Xilinx-specific. Porting to Altera is as much work as porting to a SASIC platform like eASIC.

Quote from: Wandering Albatross on March 09, 2012, 05:30:38 AM

I'd guess that big players (altera,xilinx) wouldn't see BTC mining as a big enough market

Correct. This is still way below Xilinx's radar.

Quote from: Wandering Albatross on March 09, 2012, 05:30:38 AM

How do you convince anyone that what you have is legit? You'd have to let them see something under NDA? What if they say "no thanks" and go do it themselves based on what they saw.

When there is a need for me to convince people I will be happy to give live, in-person demos here in NorCal. I'll even let somebody bring their own board but I have to keep the board afterwards. I'll probably need a ztex board at some point so when I do the demo we'll probably have somebody who doesn't know me bring a ztex board and I'll buy it from them as part of the demo.

EldenTyrell, I'm here in the South Bay (with a home office in north-east San Jose and a business/mining office in Santa Clara next to Nvidia) and I have a ZTEX board and I can sell it to you for what I paid for it, or $50 less, or whatever we agree on.

In case you put your bitstream up on Kickstarter, I'll also make a low-to-mid 3-figure pledge for early access to a 240 MH/s or better bitstream. (Right now, it's running at 209 MH/s and I'm not really interested in paying for, say, 220 MH/s.)

TheSeven

Hero Member

Offline

Activity: 504
Merit: 500

FPGA Mining LLC

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 09, 2012, 10:30:06 PM

#124

Very interesting results... I'd like to see a bit more information though:

Where is the critical path, and how much could that be optimized? (Can you give a best-case estimate of the physical limits of achievable hashrate?)
How many pipeline stages does this design have, per core? Are the sha256 rounds doubly registered?
This looks pretty much crammed into the FPGA. If you provide this as a hardmacro, is there even sufficient room to easily add a PC interface to it?
As the developer of MPBM, and being someone who has done at least a little VHDL design and implemented a miner core, I do understand very well what order of magnitude of effort this is. Especially with this all-broken Xilinx toolchain. However, simple miner software can be written in basically no time (and that's how MPBM started months ago). But if you design something for flexibility like the new MPBM generation or cgminer, it'll take at least 10 times as long. May I ask how much time you have realistically spent on implementing and optimizing this FPGA design and the necessary tools to generate it?
Assuming the bitcoin FPGA community (and possibly some board vendors) would want you to optimize this design until you're hitting real roadblocks (300MH/s maybe?), and release everything that's necessary to regenerate and further improve it under an open source license, roughly how much money would we need?

My tip jar: 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY

eldentyrell (OP)

Donator
Legendary

Offline

Activity: 980
Merit: 1004

felonious vagrancy, personified

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 09, 2012, 10:33:54 PM
Last edit: March 09, 2012, 11:22:37 PM by eldentyrell

#125

Quote from: BTC-engineer on March 09, 2012, 10:24:49 PM

Interesting that you think your design could easily be forward-ported to the new Xilinx 28nm FPGAs.

Well, feature size isn't something you can detect using Verilog code...

Quote from: BTC-engineer on March 09, 2012, 10:24:49 PM

This surprises me a little bit, because I always thought your design was so highly Spartan-6 LX150 optimized/specific. How deeply have you already looked into the Artix architecture

Xilinx UG474 says that the 7-series slices (both M+L) are identical to the Virtex-6 slice, which is a strict superset of the Spartan-6 slice. I verified this by looking at the diagram. Then I opened up each of the Artix devices in fpga_editor to look at the geometry. That's about the extent of my investigation. Mostly stuff just switches faster, uses less power, more SLICEL's, and you get more routing -- but the routing is basically undocumented anyways.

I have to say I am baffled by the bizarre shape of the Artix fabric. One of their devices looks like a rectangle with a chunk hacked out of the right hand side and shoved over. WTF?

I do need the device to be at least 128 slices wide to get a "zero effort" port. So, Artix200 or higher. There's a huge hole in the middle of the Artix200, but (unlike the holes in the Spartan6) you get wires that run "over the top of" whatever circuitry is in the hole. And there are still more than 128 columns even after leaving out the hole.

If there is enough demand for Artix100 I may be able to re-arrange things to fit the narrower device -- we'll see. I'm hoping the Artix200 comes out very quickly after the 100; if so it should attract the bitcoin miners (unless something crazy happens it should be cheaper $/LUT than the 100).

Quote from: BTC-engineer on March 09, 2012, 10:24:49 PM

Artix, but it doesn't look like the first chips will be available in less than 6-8 months :-(

Yeah, I hear Xilinx's availability estimates are pretty much worthless.

BTCurious

Hero Member

Offline

Activity: 714
Merit: 504

^SEM img of Si wafer edge, scanned 2012-3-12.

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 09, 2012, 11:07:45 PM

#126

*notices the topic title*
Grats on your recent 10MH/s advancement

Bitcoin-OTC rating

kano

Legendary

Offline

Activity: 4592
Merit: 1851

Linux since 1997 RedHat 4

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 09, 2012, 11:11:24 PM
Last edit: March 09, 2012, 11:23:08 PM by kano

#127

Quote from: eldentyrell on March 09, 2012, 10:10:06 PM

Quote from: kano on March 09, 2012, 05:37:23 AM

[sarcasm]just make sure you don't use free miners like cgminer where many many hundreds of hours have been spent without the requirement of payment[/sarcasm]

Duh.

I wrote my own miner from scratch; it has longpoll and multipool support. Just ask Luke-Jr, who has graciously suffered through the pool side of the debugging process

I can tell you from first-hand experience that writing a miner requires about 1% of the effort I put into the HDL design. That's not an exaggeration; I kept a (very coarse) log of how I spent my time and it really does work out to about 100:1. I suspect ztex has had a similar experience.

I don't mean any disrespect to the authors of cgminer/mpbm/etc. They've done a great thing for the bitcoin mining community. But these things aren't even in the same league in terms of time commitment.

Yeah if you write a total piece of shit miner :P

Edit: So you wrote the fully optimised CL code yourself also without taking that from someone else?
And you worked out the 61 + 61 sha256 optimisation yourself also?
(and all the other optimisations in there) for the stream you've done here?

Pool: https://kano.is - low 0.5% fee PPLNS 3 Days - Most reliable Solo with ONLY 0.5% fee Bitcointalk thread: Forum
Discord support invite at https://kano.is/ Majority developer of the ckpool code - k for kano
The ONLY active original developer of cgminer. Original master git: https://github.com/kanoi/cgminer

eldentyrell (OP)

Donator
Legendary

Offline

Activity: 980
Merit: 1004

felonious vagrancy, personified

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 09, 2012, 11:51:11 PM

#128

Quote from: kano on March 09, 2012, 11:11:24 PM

I think I have created some confusion, and have inadvertently offended you (and others). Please accept my apologies.

Everything I wrote about "miners" was meant to refer only to the part of the code that runs on the CPU: fetching work from the pool and submitting shares. I did not mean to imply that writing the OpenCL code that runs on the GPU itself is easy or trivial! I know that is quite difficult, and no, I have never tried to write GPU hashing code.

Please understand that my response was in the context of what I interpreted (perhaps incorrectly) to be an accusation that any attempt to raise funds for my efforts would somehow be cheating the authors of cgminer/mpbm/etc. The point I was trying to make is that (1) I am not using any of this software; I wrote my own and (2) if somebody does modify cgminer to act as a front end to my bitstream they won't be using the part of cgminer that was hard to write -- they'll only be using the CPU part.

kakobrekla

Hero Member

Offline

Activity: 714
Merit: 500

The dogs bark, the caravans pass.

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 10, 2012, 12:25:22 AM

#129

Quote from: TheSeven on March 09, 2012, 10:30:06 PM

Assuming the bitcoin FPGA community (and possibly some board vendors) would want you to optimize this design until you're hitting real roadblocks (300MH/s maybe?), and release everything that's necessary to regenerate and further improve it under an open source license, roughly how much money would we need?

Has this been overlooked?

Bit4X.com | BitBet.us | #bitcoin-assets | OTC web of trust

TheSeven

Hero Member

Offline

Activity: 504
Merit: 500

FPGA Mining LLC

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 10, 2012, 12:34:34 AM

#130

Quote from: eldentyrell on March 09, 2012, 11:51:11 PM

I think I have created some confusion, and have inadvertently offended you (and others). Please accept my apologies.

I didn't feel offended, and I still don't. But I have the impression that the bitcoin community in general is very generous as far as donations are concerned.

It isn't so much the number of people, but rather the amounts of money some people have to spare...

Quote from: eldentyrell on March 09, 2012, 11:51:11 PM

Everything I wrote about "miners" was meant to refer only to the part of the code that runs on the CPU: fetching work from the pool and submitting shares. I did not mean to imply that writing the OpenCL code that runs on the GPU itself is easy or trivial! I know that is quite difficult, and no, I have never tried to write GPU hashing code.

Please understand that my response was in the context of what I interpreted (perhaps incorrectly) to be an accusation that any attempt to raise funds for my efforts would somehow be cheating the authors of cgminer/mpbm/etc. The point I was trying to make is that (1) I am not using any of this software; I wrote my own and (2) if somebody does modify cgminer to act as a front end to my bitstream they won't be using the part of cgminer that was hard to write -- they'll only be using the CPU part.

You apparently have no idea what kind of effort that is, as much as others have no idea how hard it is to optimize an FPGA design.
Writing good miner software isn't trivial either (MPBM is approaching 10000 lines of code, and there's no OpenCL involved at all).

To get back to my original question: Do you think that it might be possible to community fund your effort? I wouldn't put too much hope on the FPGA board vendors here (at the current production volumes those are also people who'll never earn any adequate profits for the time that they've spent designing, testing, fixing and organizing things).
So if we do some fundraising to pay you semi-adequately, would you agree to completely open source this project?
And we might need a ballpark number of what you would consider an adequate reward...

My tip jar: 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY

kano

Legendary

Offline

Activity: 4592
Merit: 1851

Linux since 1997 RedHat 4

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 10, 2012, 12:42:48 AM
Last edit: March 10, 2012, 02:17:05 AM by kano

#131

I did put that in sarcasm brackets for a reason

Simply coz you give the impression that it's a "get paid lots or no one will be allowed to ever see it."
If it's a "I wrote and did it all from scratch without any help from looking at anything anyone else has ever done" then I guess that MAY be justified ...

If you haven't looked at sha256() optimisations then you are somewhere in the ball-park of 5% slower than it could be.

The 2 simplest and most effective optimisations are:
(ignoring the midstate as being the real first sha256())
The first 3 of 64 stages in the 1st of the double sha256() are only needed to be done once per 2^32 hashes (per full nonce range)
The last 3.5 stages of the 2nd of the double sha256() are not required since you already know the answer at that point.
There are quite a few other optimisations of W calculations that are constant over a full nonce range
Then there are the partial calculations of some of the W that are constant over a full nonce range
Quite a few parts of the early stages of the 2nd double sha256() are reduced to fixed constants also.

Edit: some of that may not be FPGA related but some of it certainly also is.
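A minimal sketch of the midstate idea mentioned at the top of that list, written in Python rather than HDL or OpenCL. It assumes the standard 80-byte block header layout, where the first 64 bytes do not depend on the nonce, so their SHA-256 compression can be done once per work unit and reused for every nonce. The helper names (`compress`, `midstate`, `hash_with_midstate`) are made up for illustration and are not taken from cgminer or from any FPGA design in this thread. The round constants are derived from cube and square roots of primes instead of being typed in, to avoid transcription errors.

```python
import hashlib
import struct
from math import isqrt

MASK = 0xFFFFFFFF

def _primes(n):
    """Return the first n primes by trial division."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def _icbrt(x):
    """Exact integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((x.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

# SHA-256 constants: fractional bits of cube roots (K) and square roots (H0) of primes.
K = [_icbrt(p << 96) & MASK for p in _primes(64)]
H0 = [isqrt(p << 64) & MASK for p in _primes(8)]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    """One SHA-256 compression: 8-word state plus a 64-byte block gives a new state."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (_rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((_rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def midstate(header_first64):
    """Nonce-independent part: compress the first 64 header bytes once."""
    return compress(H0, header_first64)

def hash_with_midstate(mid, tail12, nonce):
    """Double SHA-256 of an 80-byte header, reusing a precomputed midstate."""
    # Second block of the first hash: 12 header bytes, nonce, then SHA-256 padding.
    block2 = tail12 + struct.pack("<I", nonce) + b"\x80" + bytes(39) + struct.pack(">Q", 640)
    first = struct.pack(">8I", *compress(mid, block2))
    # Second hash: the 32-byte digest plus padding fits in a single block.
    block3 = first + b"\x80" + bytes(23) + struct.pack(">Q", 256)
    return struct.pack(">8I", *compress(H0, block3))
```

Calling `midstate(header[:64])` once and then `hash_with_midstate(mid, header[64:76], nonce)` per nonce should give the same digest as hashing the full header twice with `hashlib.sha256`. In the second block the nonce sits in message word 3, so rounds 0 to 2 see only constant words; that is the reason the first three rounds there can also be hoisted out of the nonce loop, which this sketch does not do.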

PulsedMedia

Sr. Member

Offline

Activity: 402
Merit: 250

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 10, 2012, 02:09:12 AM

#132

Really cool work. From what I understand, this already offers around 30% more per cycle? That's simply awesome.
If I were a miner with any significant scale and investment in FPGAs, I would definitely throw some BTC in your direction, especially if that meant I got unlimited access to the bitstream.

http://PulsedMedia.com - Semidedicated rTorrent seedboxes

pieppiep

Hero Member

Offline

Activity: 1596
Merit: 502

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 10, 2012, 02:15:24 AM

#133

If you put this on Kickstarter or sell it or whatever, how much do you want for it?
Is it around $500, or more like $2,500, or even $50,000?
How many hours did you spend roughly?

2112

Legendary

Offline

Activity: 2128
Merit: 1073

Re: Algorithmically placed FPGA miner: 192MH/s and rising

March 10, 2012, 08:09:18 AM

#134

Quote from: eldentyrell on March 09, 2012, 01:12:50 AM

Number of DSP48A1s: 30 out of 180 16%

Aha! Interesting. When uncle Moshe (Gavrielov) gives you DSPs, make DSPeade. ;)

Please comment, critique, criticize or ridicule BIP 2112: https://bitcointalk.org/index.php?topic=54382.0
Long-term mining prognosis: https://bitcointalk.org/index.php?topic=91101.0

BR0KK

Hero Member

Offline

Activity: 784
Merit: 500

Re: Algorithmically placed FPGA miner: 202MH/s and rising

March 10, 2012, 03:50:06 PM

#135

Is there a way to port it to ZTEX or other FPGA boards?

Bitcoin donations: 13U22Hwi2tgyrThT6m3ivjEBYfJv8NFUAU
Bitcoin marketplace Germany: https://www.bitcoin.de/r/697nng

Inspector 2211

Sr. Member

Offline

Activity: 448
Merit: 250

Re: Algorithmically placed FPGA miner: 192MH/s and rising

March 10, 2012, 04:19:36 PM

#136

Quote from: 2112 on March 10, 2012, 08:09:18 AM

Quote from: eldentyrell on March 09, 2012, 01:12:50 AM

Number of DSP48A1s: 30 out of 180 16%

Aha! Interesting. When uncle Moshe (Gavrielov) gives you DSPs, make DSPeade. ;)

Thank you for providing an important puzzle piece on how Dr. Tyrell does it.

The multiplier in the DSP48-block is not needed in SHA-256, hence what he obviously uses is the 18-bit adder
BCOUT = B + D.
He uses 30 DSP blocks, 10 per red / green / blue SHA-256 instance.
For a 32 bit adder, two 18-bit adders BCOUT=B+D are needed.
Thus, he can implement five 32-bit adders per SHA instance.
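To illustrate the arithmetic behind that count, here is a rough Python model of building one 32-bit addition out of two narrower adders. It only models the numbers, not the actual DSP48A1 wiring: `add18` is a hypothetical stand-in for an 18-bit adder, and the `carry_in` argument is an assumption about how the low half would hand its carry to the high half.

```python
MASK16 = 0xFFFF
MASK18 = 0x3FFFF

def add18(b, d, carry_in=0):
    """Model of one 18-bit adder: operands and result are truncated to 18 bits."""
    return ((b & MASK18) + (d & MASK18) + carry_in) & MASK18

def add32_from_two_add18(x, y):
    """32-bit wrapping add built from two 18-bit adds (carry chaining is assumed)."""
    lo = add18(x & MASK16, y & MASK16)           # 17-bit result, bit 16 is the carry out
    carry = lo >> 16
    hi = add18((x >> 16) & MASK16, (y >> 16) & MASK16, carry)
    return ((hi & MASK16) << 16) | (lo & MASK16)  # drop the final carry, as SHA-256 does
```

Each half only needs 16 bits of operand plus room for a carry, which fits in 18 bits. With 10 DSP blocks per SHA-256 instance and two blocks per 32-bit add, that gives the five adders per instance estimated above.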

So, why not just use [slow] 32-bit ripple adders everywhere, and use a few [very fast] DSP adders in some places?

The answer is, IMHO, that he uses the fast DSP adders only where they feed into longlines.
Were he to use normal ripple adders where he feeds into longlines, the aggregate delay would limit
the design to a 5 ns clock cycle.
Using the fast DSP adders will allow this design, when properly fine-tuned, to march into 4 ns clock cycle
territory, for a total of approximately 125 MH/s per SHA-256 instance, or approximately 375 MH/s per Spartan6-150.
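The clock-period arithmetic can be sketched as follows. The factor of two clock cycles per hash is not stated anywhere in the thread; it is inferred here only because a 4 ns clock (250 MHz) is equated with roughly 125 MH/s per instance, so treat `cycles_per_hash=2` as an assumption.

```python
def hashrate_mhs(clock_period_ns, cycles_per_hash, instances):
    """Estimated hashrate in MH/s for a given clock period and core count."""
    clock_mhz = 1000.0 / clock_period_ns        # 4 ns -> 250 MHz, 5 ns -> 200 MHz
    # cycles_per_hash is an assumed parameter, inferred from the figures in the post
    return clock_mhz / cycles_per_hash * instances
```

Under that assumption, `hashrate_mhs(4, 2, 3)` gives 375.0 for three instances at 4 ns, and `hashrate_mhs(5, 2, 3)` gives 300.0 for the 5 ns case.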

BFL Single, watch out below.

TheSeven

Hero Member

Offline

Activity: 504
Merit: 500

FPGA Mining LLC

Re: Algorithmically placed FPGA miner: 192MH/s and rising

March 10, 2012, 04:48:22 PM

#137

Quote from: Inspector 2211 on March 10, 2012, 04:19:36 PM

BFL Single, watch out below.

Oh yeah!

750MH/s on X6500, at $550 bulk that's <0.74$/MH or >1.36MH/$. Wow! This can blow away GPUs!

And probably LargeCoin as well...

My tip jar: 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY

bulanula

Hero Member

Offline

Activity: 518
Merit: 500

Re: Algorithmically placed FPGA miner: 192MH/s and rising

March 10, 2012, 04:58:50 PM

#138

Quote from: Inspector 2211

BFL Single, watch out below.

What makes you think this cannot similarly be applied to the single (even after a hardware modification) ???

jamesg

VIP
Legendary

Offline

Activity: 1358
Merit: 1000

AKA: gigavps

Re: Algorithmically placed FPGA miner: 192MH/s and rising

March 10, 2012, 05:04:31 PM

#139

Quote from: bulanula on March 10, 2012, 04:58:50 PM

Quote from: Inspector 2211

BFL Single, watch out below.

What makes you think this cannot similarly be applied to the single (even after a hardware modification) ???

Bulanula,

Slow down. Please read his post more carefully. He is suggesting that the $/MH is in competition with the BFL Single, and his math is pretty close. I am getting 830 MH/s for $600, or $0.72/MH, which is pretty darn close.
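For anyone who wants to redo the comparison, a quick sketch of the cost arithmetic using only the two price points quoted in this exchange (a projected 750 MH/s on an X6500 at $550, and 830 MH/s on a BFL Single at $600). The function names are just for illustration.

```python
def dollars_per_mh(price_usd, hashrate_mhs):
    """Hardware cost per MH/s of hashrate."""
    return price_usd / hashrate_mhs

def mh_per_dollar(price_usd, hashrate_mhs):
    """Hashrate bought per dollar of hardware."""
    return hashrate_mhs / price_usd
```

`dollars_per_mh(550, 750)` comes out a little over $0.73/MH and `dollars_per_mh(600, 830)` a little over $0.72/MH, so the two are within about a cent per MH/s of each other.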

Turbor

Legendary

Offline

Activity: 1022
Merit: 1000

BitMinter

Re: Algorithmically placed FPGA miner: 192MH/s and rising

March 10, 2012, 05:12:51 PM

#140

Quote from: bulanula on March 10, 2012, 04:58:50 PM

What makes you think this cannot similarly be applied to the single (even after a hardware modification) ???

BitMinter -----> Knives4Bitcoin.com <-----
