Bitcoin Forum
December 08, 2016, 10:13:18 AM *
Topic: FPGA development board "Lancelot" - accept bitstream developer's orders.
bitfury
May 25, 2012, 09:01:34 PM
 #101

One interesting thing that I have researched is a single-bit design, i.e. instead of carry chains you use one D flip-flop for the carry and one D flip-flop for the result. It would then require 32 times fewer wires for the W-expander, and it allows constant optimization. This would be the smallest core, but with one IF: IF you are capable of designing long digital delay lines (like the SRL16 in Spartan) within the chip. I know that it is quite doable, but nobody I contacted can work at that level, and it is unlikely that you will find such a thing in a cell library from TSMC, for example. Carefully designed, this could beat everything and raise calculation speed to the silicon maximum. I doubt, however, that there are many _developers_ who would even understand what I wrote here, and zero who can do it not just in theory but with a more or less guaranteed result in hardware.

A carryless adder, 32 bits in + 32 bits in results in 64 bits out in a non-canonical "2 output bits for each output bit" representation?
The problem with this approach is, it's not compatible with the XOR operation, not even with rotate and shift operations.
So, yes, while you can build a large multiplier that way, converting the result to a canonical representation as the final step,
you cannot build SHA-256 that way. I have investigated it, and it's not possible.

If you have been talking about something else entirely, I apologize.

Not exactly. I meant the case where you do the addition in 32 clocks, one bit per clock edge. One D flip-flop holds the output and another D flip-flop holds the carry, which is fed back to the adder on the next clock.

So you get ONE wire instead of 32 wires for a fully unrolled round expander. The design is still pipelined.
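
The one-bit-per-clock addition described above can be sketched in software. This is only a behavioural model under stated assumptions: the function name `bit_serial_add` and the LSB-first bit order are illustrative choices, not something the post specifies, and the variable `carry` stands in for the carry flip-flop.

```python
def bit_serial_add(a: int, b: int, width: int = 32) -> int:
    """Add two width-bit words one bit per clock, least significant bit first.

    Hypothetical behavioural model: `carry` plays the role of the carry
    flip-flop, and each loop iteration corresponds to one clock edge.
    """
    carry = 0   # state of the carry flip-flop
    result = 0  # output bits collected as they leave the adder
    for clock in range(width):
        bit_a = (a >> clock) & 1
        bit_b = (b >> clock) & 1
        # One full adder: sum bit goes out, carry is stored for the next clock.
        sum_bit = bit_a ^ bit_b ^ carry
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))
        result |= sum_bit << clock
    # The final carry is dropped, so the addition is modulo 2**width.
    return result
```

Only one data wire per operand is needed because the same full adder is reused for all 32 bit positions; the cost is 32 clocks per word instead of one.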

But you need really long and compact shift registers, without access to the internal bits of course. These are required for the rotation operations (basically by delaying all variables in the calculation by 32 clocks, but using different delays for the RORs). And then you need really long delays for the W round (that would be a 224-bit delay line and a 256-bit delay line).
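
As a rough check on those delay-line lengths, the arithmetic can be sketched from the standard SHA-256 message schedule, W[t] = σ1(W[t-2]) + W[t-7] + σ0(W[t-15]) + W[t-16]. The mapping below is an assumption about how the figures arise, since the post does not spell out its layout; the names `tap_delays` and `gaps_between_taps` are hypothetical.

```python
WORD_BITS = 32  # one word takes 32 clocks in a one-bit-per-clock datapath

# Word offsets used by the standard SHA-256 message schedule:
# W[t] = sigma1(W[t-2]) + W[t-7] + sigma0(W[t-15]) + W[t-16]
TAP_OFFSETS = (2, 7, 15, 16)


def tap_delays(offsets=TAP_OFFSETS, word_bits=WORD_BITS):
    """Return {word offset: delay in clocks} for each schedule tap."""
    return {k: k * word_bits for k in offsets}


def gaps_between_taps(offsets=TAP_OFFSETS, word_bits=WORD_BITS):
    """Return the delay-line length between consecutive taps, in bits."""
    ordered = sorted(offsets)
    return [(hi - lo) * word_bits for lo, hi in zip(ordered, ordered[1:])]


delays = tap_delays()
gaps = gaps_between_taps()
```

Under this reading, the W[t-7] tap sits 224 clocks behind the current word, and the stretch between the W[t-7] and W[t-15] taps is 256 bits long, which seems consistent with the two figures quoted in the post.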

I've tried experimenting with this using BRAMs. It is nice when you have 32 rounds of the round expander around a single BRAM :-)
but static RAM is actually nowhere near the efficiency and density of such shift registers implemented in silicon.

As this register would work only dynamically, you basically need only a capacitor to hold each bit and a circuit to charge the next capacitor on the clock pulse. It will not work at slow clocks then, of course. And as far as I know it is extremely hard to implement such a circuit in silicon (basically because I have not spoken with elite ASIC developers).

If that approach cut the transistor count by 3-4 times compared to a series of flip-flops, the design would shine :-)
Inspector 2211
May 25, 2012, 09:26:07 PM
 #102

Ah, I see what you mean.
Maybe a less radical approach, adding 4 bits per clock, would be more practical. A 4-bit adder fits inside one slice.
I'll think about it...
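
The 4-bits-per-clock variant can be modelled the same way as the one-bit version. This is a minimal sketch with assumed names (`digit_serial_add`, `digit_bits`); it only shows the arithmetic and the clock count, not how it would map onto a slice.

```python
def digit_serial_add(a: int, b: int, width: int = 32, digit_bits: int = 4):
    """Add two words digit_bits at a time, least significant digit first.

    Hypothetical model: returns (sum modulo 2**width, clocks used).
    """
    digit_mask = (1 << digit_bits) - 1
    carry = 0   # single carry bit stored between clocks
    result = 0
    clocks = 0
    for shift in range(0, width, digit_bits):
        # One small adder handles one digit of each operand per clock.
        total = ((a >> shift) & digit_mask) + ((b >> shift) & digit_mask) + carry
        result |= (total & digit_mask) << shift
        carry = total >> digit_bits
        clocks += 1
    # Drop anything above the word width, including the final carry.
    return result & ((1 << width) - 1), clocks
```

With `digit_bits=4` a 32-bit addition takes 8 clocks instead of 32, at the cost of 4 data wires per operand instead of 1.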

>one capacitor to hold bit

In the old days, you could buy such chips: they were called CCDs, or charge-coupled devices.
Steve Wozniak based the video memory of the Apple I on such a device. It not only stored all the characters in the video buffer (1024 bytes IIRC), but generated the video signal as well, as its content rotated constantly. A new character would be inserted by breaking the loop for a moment.
kano
May 25, 2012, 10:09:22 PM
 #103

Hmm so bitfury might have quite a bit of incentive in this FPGA vs ASIC discussion :)
https://bitcointalk.org/index.php?topic=3889.msg915037#msg915037
(weapon of choice? :D)

Pool: https://kano.is BTC: 1KanoiBupPiZfkwqB7rfLXAzPnoTshAVmb
CKPool and CGMiner developer, IRC FreeNode #ckpool and #cgminer kanoi
Help keep Bitcoin secure by mining on pools with Stratum, the best protocol to mine Bitcoins with ASIC hardware
bitfury
May 27, 2012, 05:02:34 PM
 #104

Hmm so bitfury might have quite a bit of incentive in this FPGA vs ASIC discussion :)
https://bitcointalk.org/index.php?topic=3889.msg915037#msg915037
(weapon of choice? :D)

https://bitcointalk.org/index.php?topic=76351.msg925049#msg925049

This is for ngzhang and everyone else: I've run a comparison of FPGA vs. ASIC HardCopy...
Artix-7 allows the same tricks as the LX150, unlike Cyclone V, though. So by following that thread,
you can understand why claims about 28nm being obsolete are questionable.
bleza
June 05, 2012, 08:58:55 AM
 #105

subscribed
count me in for 4 or 5 boards 8)
ngzhang
June 05, 2012, 04:18:42 PM
 #106

got 4 samples today.... :D

CEO of Canaan-creative, Founder of Avalon project.
https://canaan.io/
Business contact: love@canaan.io
All PMs will be unread.
spiccioli
June 05, 2012, 04:30:48 PM
 #107

got 4 samples today.... :D

show some pics or it didn't happen ;)

spiccioli
arklan
June 05, 2012, 04:48:21 PM
 #108

got 4 samples today.... :D

oh, you TEASE.
ngzhang
June 05, 2012, 05:23:36 PM
 #109

got 4 samples today.... :D

show some pics or it didn't happen ;)

spiccioli

:D

the samples are full of small design bugs; i've already made a TODO list.

unfortunately, the firmware side is a bit delayed. i could only test with the previous icarus bitstream. it has run for hours with no errors so far. i hope they can make a breakthrough in the near future.

this afternoon, i tested the on-board power module at the school's lab. it can provide 16A of continuous current for each FPGA core, 25A peak. from 7A to 14A there is a wide high-efficiency zone (>85%). it looks like Lancelot can become a very good development platform for various third-party bitstreams.

tomorrow i will take some photos. it's the dark of night now :D

Turbor
June 05, 2012, 06:32:32 PM
 #110

Cool, looking forward to some pics.

Dexter770221
June 06, 2012, 04:06:55 PM
 #111

My bitcoins can't wait to go to your wallet ;)

Under development Modular UPGRADEABLE Miner (MUM). Looking for investors.
Changing one PCB with screwdriver and you have brand new miner in hand... Plug&Play, scalable from one module to thousands.
fuxianhui888
June 06, 2012, 06:51:38 PM
 #112

pics look really nice :) :)
allinvain
June 07, 2012, 12:29:35 PM
 #113

pics look really nice :) :)

Where where where ???

:)

Edit...found it ;D Very nice indeed!

fuxianhui888
June 07, 2012, 01:39:39 PM
 #114

pics look really nice :) :)

Where where where ???

:)

Edit...found it ;D Very nice indeed!

I like his camera ;D ;D ;D ;D ;D
rgzen
June 07, 2012, 02:57:03 PM
 #115

Hello,
I want to know how much the dev kit will cost.
Thanks.
ngzhang
June 07, 2012, 03:12:35 PM
 #116

Hello,
I want to know how much the dev kit will cost.
Thanks.

$69.
it includes a Platform Cable USB, a USB stick with the software, and some link cables.

ngzhang
June 08, 2012, 01:41:42 PM
 #117

waiting is boring ... :-[




rjk
June 08, 2012, 01:54:14 PM
 #118

Wow that is awesome. Is it just for your testing purposes, or will they be for sale?

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
ngzhang
June 08, 2012, 01:57:30 PM
 #119

Wow that is awesome. Is it just for your testing purposes, or will they be for sale?

do you want one? :D

rjk
June 08, 2012, 01:59:29 PM
 #120

Wow that is awesome. Is it just for your testing purposes, or will they be for sale?

do you want one? :D
Not me personally, but I imagine that some of the guys with large clusters might make use of them. What's the maximum power draw for each port, and the total maximum?
