Bitcoin Forum
Author Topic: Best demonstrated efficiency: 1290 Mhash/Joule  (Read 20582 times)
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 01:46:19 AM
 #41

http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Atom+Z510+%40+1.10GHz

http://ark.intel.com/products/35469/Intel-Atom-Processor-Z510-%28512K-Cache-1_10-GHz-400-MHz-FSB%29

http://ark.intel.com/products/31855/Intel-Pentium-III-Processor---S-1_00-GHz-512K-Cache-133-MHz-FSB

The Z510 is a bit slower than the mobile P3 at 1 GHz: going from 130nm to 45nm brought no performance increase, but the chip uses 16.5% of the power. Keep in mind this is Intel, the biggest chip foundry in the world.

                                                                               
                
                                                       ╓▄▌██P                  
                                                 ╔▄▌███▀███▌                   
                                           ▄▄▌██▀▀╚  ╓██╩██                    
                                     ▄▄███▀▀╙      ▄██  ▓█                     
                               ▄▌███▀▀+          ▄█▀   ▐█                      
                        ,▄▌███▀▀¬              ▓█▀     █▄                      
                  ,▄▌███▀▀                  ,██▀      █▌                       
               '█████▌▄▄,                 ╓██╩       ██                        
                  ▀██▌▐▀▀▀█████▌▌▄▄╓    ▄██¬        ▄█                         
                     ▀██▄        ╚▀▀▀████          ▐█═                         
                        ▀██▄        ▓█▀██          █▀                          
                           ▀██▄  ,██▀   █µ        ██                           
                              ▀███Z     ██       ██                            
                                ▐██     ▐█      ▄█                             
                              ,,╓╓█▓▄▌   █▌    ▐█U                             
                        º▄▓▓▓▓▓▓▓▓▓███   ▀█    █▌                              
                          ▀█▓▓▓▓▓████▀█▌  █▌  ██                               
                            ▀███████▌  ▀█µ▀█ ██                                
                              ▀█████     ███▓█                                 
                                ▐███      ▀██Ñ                                 
                                            ▀                             

mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 02:20:08 AM
 #42


The round "2 W" number quoted for the Z510 is likely Intel rounding up.
Compare instead the (faster) 1.3 W Z600 which I linked above.
Going from 130nm to 45nm predicts a reduction of the power to about 12% (roughly 1/8th), and the Z600 reduces it to 11%, hence proving my point.
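
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes that ideal power scales with the square of the feature size, which is my reading of the "12%" figure and not something the post states outright. The Pentium III TDP is not quoted in the thread, so it is inferred here from the "2 W is 16.5%" claim above.

```python
# Sketch: ideal power scaling between process nodes.
# Assumption: power scales with the square of the feature size.

def ideal_power_ratio(old_nm: float, new_nm: float) -> float:
    """Fraction of the original power left after an ideal shrink."""
    return (new_nm / old_nm) ** 2

predicted = ideal_power_ratio(130, 45)

# TDP figures as quoted in the thread.
z510_tdp_w = 2.0                  # "2 W" Z510
p3_tdp_w = z510_tdp_w / 0.165     # inferred from the "16.5%" claim
z600_tdp_w = 1.3                  # "1.3 W" Z600

observed = z600_tdp_w / p3_tdp_w

print(f"predicted by scaling: {predicted:.1%}")
print(f"Z600 vs. P3 (quoted): {observed:.1%}")
```

With these inputs the predicted ratio is about 12% and the quoted Z600 figure works out to about 11%, matching the numbers in the post.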
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 02:44:12 AM
 #43

Which still brings us back to this:

Let me get this straight: BFL is claiming 1,750 MH/J and you are trying to say that is plausible based on some paper you found that demonstrated 71 MH/J?

Seriously?

And keep in mind that this is Intel showing that there is no free performance bonus when aiming for power reduction. Are we seriously going to armchair-ref that BFL is on par with Intel in terms of engineering and chip production?

                                                                               
                

mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 03:00:58 AM
 #44

I have explained many times I think they will do 700 Mh/J, not 1750 Mh/J. Read this thread.
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 03:08:37 AM
 #45

Using a 2012 research chip design? If they pull that off then they should just become a chip design firm because it'll mean they have some of the best engineers in the world.

                                                                               
                

mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 04:25:33 AM
 #46

My point, and rjk's point, is this: what makes you think the authors of that paper are the world's best ASIC designers? They are not. They are students and professors. The bleeding edge of ASIC research happens in the professional world (at TSMC, Intel, etc.), not in the academic world.

The authors did not need to be excellent ASIC designers to conduct this research. They merely tried to make an average design, which was all they needed to fairly compare the efficiency of different hash functions.

That team achieved 71 Mh/J at 130nm, using standard-cell tech. The true best ASIC designers would have achieved higher than that, using full-custom tech rather than standard-cell, and would have demonstrated it on a smaller process node like 45nm.

PS: the Virginia Tech researchers did not even do the VHDL design themselves, they implemented the one from GMU: https://cryptography.gmu.edu/athena/index.php?id=source_codes  It looks like it is https://cryptography.gmu.edu/athena/sources/2011_10_01/basic/SHA-2_basic.zip  -> any half-decent ASIC designers should be able to take it, implement it to 45nm standard-cell tech, and get 700 Mh/J
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 05:06:05 AM
 #47

I guess this 1+ Billion market cap company is slacking.

http://www.cavium.com/processor_security_nitrox-III.html

20W 30Gbps SHA2

                                                                               
                

mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 05:26:41 AM
 #48

Apples vs. Oranges.

Nitrox III implements much more than SHA-2: full-blown RISC cores, RSA acceleration, etc, blowing its TDP up.

30Gbps corresponds to 29 Mhash/s. At 20W that's 1.45 Mhash/J. Nitrox III is handily beaten by all the Spartan 6 FPGAs around here doing 20 Mhash/J. Why did you think that Cavium's chips were the "state of the art" in SHA-2 performance?
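
A minimal sketch of that conversion, for reference. It assumes one Bitcoin hash attempt costs two 512-bit SHA-256 block operations (1024 bits of SHA-2 throughput). That assumption is mine; it is chosen because it reproduces the 29 Mhash/s figure. The function names are made up for illustration.

```python
# Sketch: convert raw SHA-2 throughput (Gbps) into Bitcoin hash rate
# and energy efficiency.
# Assumption: one hash attempt = two 512-bit SHA-256 block operations.

SHA256_BLOCK_BITS = 512
BLOCKS_PER_HASH = 2  # assumed, see lead-in

def gbps_to_mhash_per_s(gbps: float) -> float:
    """Bitcoin hashes per second (in millions) for a given SHA-2 throughput."""
    bits_per_hash = SHA256_BLOCK_BITS * BLOCKS_PER_HASH
    return gbps * 1e9 / bits_per_hash / 1e6

def mhash_per_joule(mhash_per_s: float, watts: float) -> float:
    """Energy efficiency; 1 W = 1 J/s, so Mhash/s per W equals Mhash/J."""
    return mhash_per_s / watts

rate = gbps_to_mhash_per_s(30)           # Nitrox III: 30 Gbps
efficiency = mhash_per_joule(rate, 20)   # at 20 W for the whole chip

print(f"{rate:.1f} Mhash/s, {efficiency:.2f} Mhash/J")
```

This gives roughly 29.3 Mhash/s and 1.46 Mhash/J, in line with the rounded 29 and 1.45 above.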
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 05:39:40 AM
 #49

They are much bigger than BFL.  Roll Eyes

                                                                               
                

mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 05:46:48 AM
 #50

Maybe you should tip Cavium off: by taking an open source SHA-2 VHDL design from students/professors and implementing it on a 12-year-old 130nm process, they could increase their energy efficiency by a factor of 49, from 1.45 Mhash/J to 71 Mhash/J.

My point is: obviously Cavium did not aim at SHA-2 energy efficiency. You are comparing Apples vs. Oranges.
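
To put the figures quoted on this page side by side, here is a small sketch. The labels are mine, every value is a number claimed in the thread and not a measurement, and the 700 Mh/J entry is only an estimate.

```python
# Sketch: efficiency figures quoted in this thread, relative to the
# Nitrox III number. Values are claims from the posts, not measurements.

figures_mhash_per_joule = {
    "Nitrox III (whole chip)": 1.45,
    "Spartan-6 FPGA miners": 20.0,
    "130nm standard-cell paper": 71.0,
    "45nm standard-cell estimate": 700.0,
}

baseline = figures_mhash_per_joule["Nitrox III (whole chip)"]

# Ratio of each figure to the baseline.
ratios = {
    name: value / baseline
    for name, value in figures_mhash_per_joule.items()
}

for name, value in figures_mhash_per_joule.items():
    print(f"{name:30s} {value:8.2f} Mhash/J  {ratios[name]:7.1f}x")
```

The 130nm paper figure comes out at about 49 times the Nitrox III number, which is the factor mentioned above.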
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 05:59:06 AM
Last edit: July 29, 2012, 06:11:43 AM by Coinoisseur
 #51

Or possibly reaching that energy efficiency at higher clocks and with variable protocol settings is not easy. I don't see why they wouldn't want a well-balanced SHA2 logic block in their design.

                                                                               
                

mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 06:46:32 AM
 #52

I gave you the answer already: what consumes the bulk of their 20W is the other logic blocks, such as the RISC cores, RSA engines, etc. That's why comparing a complex chip like the Nitrox III to a bare-bones SHA-2 logic block is a pointless apples vs. oranges exercise.
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 06:58:46 AM
 #53

But if that's a 65nm or 45nm chip and the SHA2 block is allocated 1W of the power budget, shouldn't they be pulling 100+ MH/s? They better hire BFL stat.

                                                                               
                

MrTeal
Legendary
*
Offline Offline

Activity: 1274
Merit: 1004


View Profile
July 29, 2012, 07:02:15 AM
 #54

Do you really think that BFL would be able to raise the capital necessary to do a full custom 45nm design? I could see maybe something like American Semi's 1D 45nm process, but to do a standard 45nm full custom design just doesn't make sense given the market. Between late 2012 and the start of 2015 there might be 3M BTC produced. Over two years is forever in Bitcoin terms, and it might take that long for them to recover the millions in NRE.

They are (supposedly) going to be the first to market with an ASIC. Why would you attempt to fabricate on something like 45nm, which is still a modern process (the A5 and Exynos 4210 in the Galaxy SII are 45nm), and pay millions in NRE? If you design at 90nm, you could still destroy the competition and sell at basically the same price points, but the NRE would be a fraction of what you'd pay at 45nm. Your time to recover those expenses would be much shorter, and your all-around risk would be much lower. If later on competitors force you to a newer process, you're in the driver's seat: well funded, experienced, and with significant brand equity.

I just can't wrap my head around them doing something like that.
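
A back-of-the-envelope check of the "3M BTC" figure. The assumptions here are a 25 BTC block subsidy for the whole period (after the halving expected in late 2012), the 10-minute target block interval, and a window from December 1, 2012 to January 1, 2015. The exact dates are my own choice for "late 2012" and "the start of 2015".

```python
# Sketch: estimate BTC issued between late 2012 and the start of 2015.
# Assumptions: 25 BTC subsidy per block, one block every 10 minutes,
# and the window dates below, chosen for illustration.
from datetime import date

SUBSIDY_BTC = 25
BLOCKS_PER_DAY = 24 * 60 // 10  # 144 blocks at the 10-minute target

start = date(2012, 12, 1)   # assumed "late 2012"
end = date(2015, 1, 1)      # "start of 2015"

days = (end - start).days
btc_issued = days * BLOCKS_PER_DAY * SUBSIDY_BTC

print(f"{days} days -> about {btc_issued / 1e6:.2f}M BTC")
```

Under these assumptions the total is around 2.7M BTC, so "might be 3M" is the right order of magnitude.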
mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 07:44:36 AM
Last edit: July 29, 2012, 08:14:52 AM by mrb
 #55

Do you really think that BFL would be able to raise the capital necessary to do a full custom 45nm design?

I said standard cell, not full custom:

any half-decent ASIC designers should be able to take it, implement it to 45nm standard-cell tech, and get 700 Mh/J

And TSMC launched their standard-cell 45nm toolkits 5 years ago! As said earlier in this thread, this is hardly "bleeding edge" tech... The NRE costs are mostly proportional to the complexity of the chip you are designing. This is why dead-simple logic blocks (SRAM cells, NAND, etc.) are always the first ones to be built at the smaller nodes (e.g. 22 nm), whereas complex chips like the A5 lag behind (45 nm). A dumb SHA-256 logic block is much closer to SRAM/NAND in terms of complexity than a SoC like the A5. So I think BFL doing 45nm is absolutely plausible. But again, as I said in the OP, they may even get away with 65nm.
mrb (OP)
Legendary
*
Offline Offline

Activity: 1512
Merit: 1028


View Profile WWW
July 29, 2012, 08:10:02 AM
 #56

But if that's a 65nm or 45nm chip and the SHA2 block is allocated 1W of the power budget, shouldn't they be pulling 100+ MH/s?

No, because if you read the specs of the thing you found with 30 seconds of googling, you would see that the Nitrox III appears to run SHA-2/RSA/etc. on RISC cores (i.e. CPU cores). That implies it is not a custom SHA-256 ASIC, which explains its poor performance per Joule (after all, if it were an ASIC, it should beat Spartan-6 FPGAs, but it does not).
Coinoisseur
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 29, 2012, 08:18:09 AM
 #57

Accelerators mean there are at least some custom blocks for crypto in there. I'm going to side with you and say they are just terrible engineers and couldn't find a way to get 100MH/J equivalent out of their reserved SHA2 resources. That or they reserved just a few thousand transistors for SHA2 because having the best in class performance should be reserved for real winners like BFL.

                                                                               
                

MrTeal
Legendary
*
Offline Offline

Activity: 1274
Merit: 1004


View Profile
July 29, 2012, 12:22:10 PM
 #58

Do you really think that BFL would be able to raise the capital necessary to do a full custom 45nm design?

I said standard cell, not full custom:

any half-decent ASIC designers should be able to take it, implement it to 45nm standard-cell tech, and get 700 Mh/J

And TSMC launched their standard-cell 45nm toolkits 5 years ago! As said earlier in this thread, this is hardly "bleeding edge" tech... The NRE costs are mostly proportional to the complexity of the chip you are designing. This is why dead-simple logic blocks (SRAM cells, NAND, etc.) are always the first ones to be built at the smaller nodes (e.g. 22 nm), whereas complex chips like the A5 lag behind (45 nm). A dumb SHA-256 logic block is much closer to SRAM/NAND in terms of complexity than a SoC like the A5. So I think BFL doing 45nm is absolutely plausible. But again, as I said in the OP, they may even get away with 65nm.

If that's the case then BFL would be straight lying, as they've claimed that their design is full custom.
rjk
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250


1ngldh


View Profile
July 29, 2012, 12:30:46 PM
 #59

Accelerators mean there are at least some custom blocks for crypto in there. I'm going to side with you and say they are just terrible engineers and couldn't find a way to get 100MH/J equivalent out of their reserved SHA2 resources. That or they reserved just a few thousand transistors for SHA2 because having the best in class performance should be reserved for real winners like BFL.

Have you considered the fact that SHA-2 in THEIR application may not require the ultimate in efficiency and even speed? 30Gbps is smoking fast for most SHA-2 applications outside of Bitcoin, and 20 watts is comparatively low-powered. I'm sure they see no reason to spend hundreds of man-hours on optimizing a small part of the overall chip design when the application that it is intended for is bottlenecked by other unrelated factors.

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
pieppiep
Hero Member
*****
Offline Offline

Activity: 1596
Merit: 502


View Profile
July 29, 2012, 02:15:46 PM
 #60

For those who are not following the thread "Block Erupter: Dedicated Mining ASIC Project (Open for Discussion)":

Update

Our RTL design, optimization and simulation are finished. We have some data to predict the specification of actual chips after they are manufactured.

Hashrate: 1.25GH/s per chip
Area: 17.5mm^2 per chip
Power Consumption: 13.3W

Note that these figures are calculated from the front-end design and are not fully accurate, but the possible deviation should not be large. We will keep posting updates.
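
For comparison with the Mhash/J figures discussed earlier in this thread, the predicted specs above can be converted as follows. This is a quick sketch using only the three numbers from the update; the variable names are mine.

```python
# Sketch: derive efficiency and density from the predicted Block Erupter
# specs quoted above (pre-silicon estimates, not measurements).

hashrate_mhash_per_s = 1250.0   # 1.25 GH/s per chip
area_mm2 = 17.5                 # per chip
power_w = 13.3                  # per chip

# 1 W = 1 J/s, so Mhash/s divided by W gives Mhash/J.
efficiency_mhash_per_joule = hashrate_mhash_per_s / power_w
hash_density = hashrate_mhash_per_s / area_mm2   # Mhash/s per mm^2
power_density = power_w / area_mm2               # W per mm^2

print(f"efficiency:    {efficiency_mhash_per_joule:.1f} Mhash/J")
print(f"hash density:  {hash_density:.1f} Mhash/s per mm^2")
print(f"power density: {power_density:.2f} W per mm^2")
```

That works out to roughly 94 Mhash/J, if the predicted specs hold.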