Bitcoin Forum
Author Topic: Re: Stratix-5 A7 or D5 project 01.05.2013 update  (Read 6569 times)
kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 21, 2013, 10:45:58 AM
 #41

Which Stratix7 and ArriaV do you have in mind?

Did I miss something? Are those new chips? I never mentioned any Arria series chip, and a Stratix VII is not mentioned anywhere.

Sorry, brainfart. Which chip in the Virtex-7 and Stratix V families did you have in mind?
funnow (OP)
Full Member
***
Offline Offline

Activity: 347
Merit: 100


View Profile WWW
April 21, 2013, 10:58:32 AM
 #42

Which Stratix7 and ArriaV do you have in mind?

Did I miss something? Are those new chips? I never mentioned any Arria series chip, and a Stratix VII is not mentioned anywhere.

Sorry, brainfart. Which chip in the Virtex-7 and Stratix V families did you have in mind?

Virtex-7 -> It will not be used!
So Stratix V A7 or maybe Stratix V D5 -> will be used
kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 21, 2013, 11:44:18 AM
 #43

Virtex-7 -> It will not be used!
So Stratix V A7 or maybe Stratix V D5 -> will be used

A simple compile using Quartus 12.1 of the unoptimized design (1) in an A7 device gives the following result:

; Device                          ; 5SGXEA7K2F40C2                            ;
; Logic utilization (in ALMs)     ; 32,617 / 234,720 ( 14 % )                 ;

In this device you should fit at least 6 miners, more if you can share resources and use the DSP blocks.

A 10ns clock gives a slack of -1.9ns, which is pretty bad performance. I don't understand why.
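
As a rough sketch of the arithmetic behind those two figures (plain Python, using only the numbers from the report above): the helper names and the 85% fill limit are assumptions for illustration, since designs rarely route well in a completely full device, and the Fmax formula assumes a single clock domain where the reported slack is the worst setup slack.

```python
def cores_that_fit(alms_per_core, alms_available, max_fill=1.0):
    """Whole number of cores that fit if only max_fill of the device is used."""
    return int(alms_available * max_fill) // alms_per_core


def fmax_mhz_from_slack(clock_period_ns, slack_ns):
    """Fmax implied by a clock constraint and its worst setup slack."""
    achievable_period_ns = clock_period_ns - slack_ns  # negative slack lengthens the period
    return 1000.0 / achievable_period_ns


ALMS_PER_CORE = 32_617     # from the fitter report
ALMS_AVAILABLE = 234_720   # 5SGXEA7K2F40C2

print("utilization %:", round(100.0 * ALMS_PER_CORE / ALMS_AVAILABLE, 1))
print("cores, raw ALM count:", cores_that_fit(ALMS_PER_CORE, ALMS_AVAILABLE))
# 0.85 is an assumed routing headroom, not a Quartus figure
print("cores, 85% fill:", cores_that_fit(ALMS_PER_CORE, ALMS_AVAILABLE, 0.85))
print("Fmax MHz:", round(fmax_mhz_from_slack(10.0, -1.9), 1))
```

With these inputs the raw ALM count allows 7 cores, the 85% limit gives 6, and a 10ns constraint with -1.9ns slack corresponds to roughly 84MHz.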

1) git://github.com/progranism/Open-Source-FPGA-Bitcoin-Miner.git

Before you start making a PCB, you should get the tools to explore implementations and run simulations to make sure your implementation works. You would probably also want to get a dev kit to play with before you design your own PCB. Dev kits come with schematics so you can see how things are done, even though they are somewhat overengineered and have lots of stuff you don't need, like SerDes, PCIe, DDR3 interfaces and so on.

funnow (OP)
Full Member
***
Offline Offline

Activity: 347
Merit: 100


View Profile WWW
April 21, 2013, 01:21:18 PM
 #44

Virtex-7 -> It will not be used!
So Stratix V A7 or maybe Stratix V D5 -> will be used

A simple compile using Quartus 12.1 of the unoptimized design (1) in an A7 device gives the following result:

; Device                          ; 5SGXEA7K2F40C2                            ;
; Logic utilization (in ALMs)     ; 32,617 / 234,720 ( 14 % )                 ;

In this device you should fit at least 6 miners, more if you can share resources and use the DSP blocks.

A 10ns clock gives a slack of -1.9ns, which is pretty bad performance. I don't understand why.

1) git://github.com/progranism/Open-Source-FPGA-Bitcoin-Miner.git

Before you start making a PCB, you should get the tools to explore implementations and run simulations to make sure your implementation works. You would probably also want to get a dev kit to play with before you design your own PCB. Dev kits come with schematics so you can see how things are done, even though they are somewhat overengineered and have lots of stuff you don't need, like SerDes, PCIe, DDR3 interfaces and so on.


Thanks for the help.
It will not be easy to develop a good PCB... I have a friend who knows something about FPGAs, but he is busy with other projects.
phk
Newbie
*
Offline Offline

Activity: 28
Merit: 0


View Profile
April 21, 2013, 02:06:57 PM
 #45

A simple compile using Quartus 12.1 of the unoptimized design (1) in an A7 device gives the following result:

; Device                          ; 5SGXEA7K2F40C2                            ;
; Logic utilization (in ALMs)     ; 32,617 / 234,720 ( 14 % )                 ;

For what it's worth, I used 5SGXEA7H1F35C1 and got 11% utilization (~8 cores?), and Fmax >200MHz.
The only change I made was a new PLL megafunction.
kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 21, 2013, 02:46:28 PM
 #46

A simple compile using Quartus 12.1 of the unoptimized design (1) in an A7 device gives the following result:

; Device                          ; 5SGXEA7K2F40C2                            ;
; Logic utilization (in ALMs)     ; 32,617 / 234,720 ( 14 % )                 ;

For what it's worth, I used 5SGXEA7H1F35C1 and got 11% utilization (~8 cores?), and Fmax >200MHz.
The only change I made was a new PLL megafunction.


I have a serial interface and some other communication logic in there, which could explain the extra ALMs.

200MHz Fmax on the hash clock? That's more like what I would expect, or even faster for such a device. Did you use derive_pll_clocks and derive_clock_uncertainty in the SDC file for your timing analysis?

kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 21, 2013, 03:54:21 PM
 #47

For what it's worth, I used 5SGXEA7H1F35C1 and got 11% utilization (~8 cores?), and Fmax >200MHz.

BTW: was this the design found in the rtl directory or in some of the projects?

I will try to re-run it as my timing result can't be correct...
funnow (OP)
Full Member
***
Offline Offline

Activity: 347
Merit: 100


View Profile WWW
April 21, 2013, 05:12:45 PM
 #48

kingcoin -> for now I can only say thank you. I have to talk with some potential investors in the project, so for now I can't pay you for your help.
kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 21, 2013, 08:55:01 PM
 #49

For what it's worth, I used 5SGXEA7H1F35C1 and got 11% utilization (~8 cores?), and Fmax >200MHz.

BTW: was this the design found in the rtl directory or in some of the projects?

I will try to re-run it as my timing result can't be correct...

I checked my design and my clock was not properly constrained due to a syntax error in my SDC file. I ran it again with the corrected SDC file. The result is still not great: 185.84 MHz. Again, it was the plain unrolled design from the "rtl" directory without any specific optimization. However, the speed difference could be due to the difference between the C1 and the C2 device.
kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 21, 2013, 08:56:43 PM
 #50

kingcoin -> for now I can only say thank you. I have to talk with some potential investors in the project, so for now I can't pay you for your help.

No problem. I did not expect to get paid. These are very expensive FPGAs. In general, cheaper FPGAs will usually get you higher H/s/$. But of course I don't know what kind of deal you can get.
turtle83
Sr. Member
****
Offline Offline

Activity: 322
Merit: 250


Supersonic


View Profile WWW
April 21, 2013, 09:01:34 PM
 #51

kingcoin -> for now I can only say thank you. I have to talk with some potential investors in the project, so for now I can't pay you for your help.

No problem. I did not expect to get paid. These are very expensive FPGAs. In general, cheaper FPGAs will usually get you higher H/s/$. But of course I don't know what kind of deal you can get.


I would be very curious to know what khash/s it could achieve for LTC. Nobody has (at least in public) claimed to be mining LTC with an FPGA.

I don't know much about FPGAs or embedded programming, but from what I read, scrypt is very sensitive to the amount of fast RAM available to the cores. The new expensive FPGAs apparently have it in abundance.
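
To put a number on that memory sensitivity: scrypt needs a scratchpad of 128 * N * r bytes for each hash in flight. The sketch below assumes the parameters Litecoin is generally documented as using (N=1024, r=1), which are not stated anywhere in this thread, and the on-chip memory size is a placeholder rather than a datasheet value for any of the devices discussed.

```python
def scrypt_scratchpad_bytes(n, r):
    """Size of the scrypt V array: N blocks of 128 * r bytes each."""
    return 128 * n * r


def cores_limited_by_memory(on_chip_bits, n=1024, r=1):
    """How many full scratchpads fit in a given amount of on-chip RAM."""
    return (on_chip_bits // 8) // scrypt_scratchpad_bytes(n, r)


# Assumed Litecoin-style parameters (not from this thread)
print("bytes per hash in flight:", scrypt_scratchpad_bytes(1024, 1))

# Placeholder figure: look up the real block RAM size for the device you target
HYPOTHETICAL_ON_CHIP_BITS = 50 * 1024 * 1024
print("scratchpads that fit:", cores_limited_by_memory(HYPOTHETICAL_ON_CHIP_BITS))
```

Under those assumptions each core needs 128 KiB of fast memory, so on-chip RAM rather than logic is likely to cap the number of scrypt cores unless external memory is added.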

Further reading: http://bitcoin.stackexchange.com/questions/1305/what-features-of-scrypt-make-tenebrix-gpu-resistant

phk
Newbie
*
Offline Offline

Activity: 28
Merit: 0


View Profile
April 22, 2013, 12:31:43 AM
 #52

For what it's worth, I used 5SGXEA7H1F35C1 and got 11% utilization (~8 cores?), and Fmax >200MHz.

BTW: was this the design found in the rtl directory or in some of the projects?

I will try to re-run it as my timing result can't be correct...
It sounds like you have since sorted it out, but yes I was using a cloned DE2-115 project which refers to the common ../../src/xxx.v

I just changed the target device and created a new PLL.

kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 22, 2013, 06:35:36 AM
 #53

Virtex-7 -> It will not be used!

I don't know why you ruled out the Virtex-7. I just gave the biggest member (xc7v2000t) of the Virtex-7 family a run through Vivado. I constrained the clocks at 200MHz and the timing was reported as 4.575ns or 218MHz. The utilization of a single core was:

Code:
+----------------------------+-------+-------+-----------+-------+
|          Site Type         |  Used | Loced | Available | Util% |
+----------------------------+-------+-------+-----------+-------+
| Slice LUTs                 | 44753 |     0 |   1221600 |  3.66 |

Hence it might be possible to fit 27 hash cores (minus the logic needed to communicate with the cores) in the device, which would lead to a hashing performance of 5.8GH/s, assuming timing would be the same for 27 cores. Others on the forum have optimized the design using the Xilinx DSP blocks (this device has 2160 of them) and ran the Kintex-7 at much higher frequencies. This is a very expensive FPGA. But again, I don't know what kind of deal you can get.
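
The estimate above can be reproduced in a few lines of Python. This is a sketch that assumes a fully unrolled core producing one hash per clock cycle (which is what the 27 x 218MHz figure implies) and ignores the communication logic and any timing loss as the device fills up; the function names are made up for illustration.

```python
def max_cores(luts_per_core, luts_available):
    """Upper bound on cores from LUT count alone."""
    return luts_available // luts_per_core


def fmax_mhz(worst_path_ns):
    """Clock frequency corresponding to the reported worst-case path delay."""
    return 1000.0 / worst_path_ns


def hashrate_ghs(cores, clock_mhz, hashes_per_clock=1):
    """Aggregate hash rate; hashes_per_clock=1 assumes a fully unrolled core."""
    return cores * clock_mhz * hashes_per_clock / 1000.0


cores = max_cores(44_753, 1_221_600)   # Slice LUTs used / available
clock = fmax_mhz(4.575)                # timing reported by Vivado

print("cores:", cores)
print("Fmax MHz:", round(clock, 1))
print("GH/s:", round(hashrate_ghs(cores, clock), 2))
```

This gives 27 cores at about 218.6MHz, or roughly 5.9GH/s; rounding the clock down to 218MHz gives 5.886GH/s, in line with the figure quoted in the post.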
funnow (OP)
Full Member
***
Offline Offline

Activity: 347
Merit: 100


View Profile WWW
April 22, 2013, 07:14:34 AM
 #54

kingcoin -> for now I can only say thank you. I have to talk with some potential investors in the project, so for now I can't pay you for your help.

No problem. I did not expect to get paid. These are very expensive FPGAs. In general, cheaper FPGAs will usually get you higher H/s/$. But of course I don't know what kind of deal you can get.


I would be very curious to know what khash/s it could achieve for LTC. Nobody has (at least in public) claimed to be mining LTC with an FPGA.

I don't know much about FPGAs or embedded programming, but from what I read, scrypt is very sensitive to the amount of fast RAM available to the cores. The new expensive FPGAs apparently have it in abundance.

Further reading: http://bitcoin.stackexchange.com/questions/1305/what-features-of-scrypt-make-tenebrix-gpu-resistant

You can be curious, and you will stay curious, because it's a Bitcoin project.
funnow (OP)
Full Member
***
Offline Offline

Activity: 347
Merit: 100


View Profile WWW
April 22, 2013, 07:24:01 AM
 #55

Virtex-7 -> It will not be used!

I don't know why you ruled out the Virtex-7. I just gave the biggest member (xc7v2000t) of the Virtex-7 family a run through Vivado. I constrained the clocks at 200MHz and the timing was reported as 4.575ns or 218MHz. The utilization of a single core was:

Code:
+----------------------------+-------+-------+-----------+-------+
|          Site Type         |  Used | Loced | Available | Util% |
+----------------------------+-------+-------+-----------+-------+
| Slice LUTs                 | 44753 |     0 |   1221600 |  3.66 |

Hence it might be possible to fit 27 hash cores (minus the logic needed to communicate with the cores) in the device, which would lead to a hashing performance of 5.8GH/s, assuming timing would be the same for 27 cores. Others on the forum have optimized the design using the Xilinx DSP blocks (this device has 2160 of them) and ran the Kintex-7 at much higher frequencies. This is a very expensive FPGA. But again, I don't know what kind of deal you can get.

Kintex-7 is slower; the 410 series is about 1GH/s
Virtex-7: 1-3GH/s
Stratix V: +5% more

I got those details from some companies that do a lot of work with these chips. So why would they tell me something that is not true?
kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 22, 2013, 09:56:18 AM
 #56

Kintex-7 is slower; the 410 series is about 1GH/s
Virtex-7: 1-3GH/s
Stratix V: +5% more
I got those details from some companies that do a lot of work with these chips. So why would they tell me something that is not true?

It's difficult to compare FPGA families since the members of the families are available in different sizes and speed grades. It might be that they have compared the fastest/biggest devices in each family though. Still they should tell you exactly which device they have compared. It might be that they have used a different design than the one I've tested. It might be that they have used different optimizations for the different devices. It might also be that they have done a general comparison from other designs and used that to estimate the hashing speed for the various devices. So everybody could be telling you the truth, but all the parameters for the given results have not been given.

Also note that the Stratix V device I used was neither the fastest nor the biggest available.
Signus
Newbie
*
Offline Offline

Activity: 28
Merit: 0



View Profile
April 22, 2013, 10:32:08 AM
 #57

kingcoin -> for now I can only say thank you. I have to talk with some potential investors in the project, so for now I can't pay you for your help.

No problem. I did not expect to get paid. These are very expensive FPGAs. In general, cheaper FPGAs will usually get you higher H/s/$. But of course I don't know what kind of deal you can get.


I would be very curious to know what khash/s it could achieve for LTC. Nobody has (at least in public) claimed to be mining LTC with an FPGA.

I don't know much about FPGAs or embedded programming, but from what I read, scrypt is very sensitive to the amount of fast RAM available to the cores. The new expensive FPGAs apparently have it in abundance.

Further reading: http://bitcoin.stackexchange.com/questions/1305/what-features-of-scrypt-make-tenebrix-gpu-resistant

You can be curious, and you will stay curious, because it's a Bitcoin project.

Well, FPGAs are reprogrammable. A well-thought-out design could easily be programmed to do BTC and then be reflashed to do scrypt for LTC.
funnow (OP)
Full Member
***
Offline Offline

Activity: 347
Merit: 100


View Profile WWW
April 22, 2013, 11:01:03 AM
 #58

kingcoin -> for now I can only say thank you. I have to talk with some potential investors in the project, so for now I can't pay you for your help.

No problem. I did not expect to get paid. These are very expensive FPGAs. In general, cheaper FPGAs will usually get you higher H/s/$. But of course I don't know what kind of deal you can get.


I would be very curious to know what khash/s it could achieve for LTC. Nobody has (at least in public) claimed to be mining LTC with an FPGA.

I don't know much about FPGAs or embedded programming, but from what I read, scrypt is very sensitive to the amount of fast RAM available to the cores. The new expensive FPGAs apparently have it in abundance.

Further reading: http://bitcoin.stackexchange.com/questions/1305/what-features-of-scrypt-make-tenebrix-gpu-resistant

You can be curious, and you will stay curious, because it's a Bitcoin project.

Well, FPGAs are reprogrammable. A well-thought-out design could easily be programmed to do BTC and then be reflashed to do scrypt for LTC.
Sorry, but it will not have any memory :) so no LTC mining :) that's all :)
Signus
Newbie
*
Offline Offline

Activity: 28
Merit: 0



View Profile
April 22, 2013, 11:48:13 AM
 #59

Haha, I believe those chips could perform quite adequately. The paper written for Litecoin applications was written a while ago, and we're talking about new FPGA technology, even though, yes, the more memory the better for scrypt.

Granted, nobody has developed a legitimate and power-efficient method for mining LTC with FPGAs. What I'm saying is that FPGAs give you that opportunity if need be, so that you're not wasting time designing the hardware for just one thing.
kingcoin
Sr. Member
****
Offline Offline

Activity: 262
Merit: 250


View Profile
April 22, 2013, 12:29:08 PM
 #60

You would probably get a better response if you started a new thread about this subject.

FPGAs have internal memory. The size is limited, but the memory access has very low latency. Further, it's no problem to add external DDR3 or other types of external memory to the FPGA. Most modern FPGAs will typically contain one or more hard memory controllers.

I don't know anything about the memory access pattern of a Litecoin application, but I got the impression that it's basically a large cache. You could treat all the internal memory as a low-latency cache for the external DDR3 memory. The good thing about the FPGA is that you could potentially tune the cache replacement algorithm towards the application, which is something that would be difficult on a CPU. However, I don't know much about scrypt so I can't be more specific.
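
As a toy illustration of what tuning the replacement algorithm can mean, here is a software model of a small fast store in front of a slow backing store with a pluggable eviction policy. Everything in it (the class, the two policies, the cyclic access pattern) is hypothetical and chosen only to show that the policy matters; it does not model scrypt's real access pattern, which this thread does not describe, and says nothing about LTC performance.

```python
from collections import OrderedDict


class TinyCache:
    """Small fast store in front of a slow one, with a pluggable eviction policy."""

    def __init__(self, backing, capacity, pick_victim):
        self.backing = backing          # stands in for external DDR3
        self.capacity = capacity        # stands in for on-chip RAM lines
        self.pick_victim = pick_victim  # function: cached lines -> address to evict
        self.lines = OrderedDict()      # address -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        if addr in self.lines:
            self.hits += 1
            self.lines.move_to_end(addr)  # mark as most recently used
            return self.lines[addr]
        self.misses += 1
        if len(self.lines) >= self.capacity:
            del self.lines[self.pick_victim(self.lines)]
        self.lines[addr] = self.backing[addr]
        return self.lines[addr]


def lru(lines):
    return next(iter(lines))       # evict the least recently used line


def mru(lines):
    return next(reversed(lines))   # evict the most recently used line


def run(policy):
    backing = {addr: addr * addr for addr in range(8)}
    cache = TinyCache(backing, capacity=4, pick_victim=policy)
    for addr in [0, 1, 2, 3, 4] * 4:   # cyclic scan one line larger than the cache
        cache.read(addr)
    return cache.hits, cache.misses


for policy in (lru, mru):
    print(policy.__name__, run(policy))
```

On this particular pattern LRU never hits, while evicting the most recently used line keeps most of the working set resident. In an FPGA the equivalent choice would be made in the memory controller logic rather than in software.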