Bitcoin Forum
May 06, 2024, 10:34:28 PM *
News: Latest Bitcoin Core release: 27.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: [1] 2 »  All
  Print  
Author Topic: Zcash Equihash PoW Released  (Read 7732 times)
rdnkjdi (OP)
Legendary
*
Offline Offline

Activity: 1256
Merit: 1009


View Profile
April 15, 2016, 01:34:04 PM
 #1

https://www.internetsociety.org/sites/default/files/blogs-media/equihash-asymmetric-proof-of-work-based-generalized-birthday-problem.pdf

1715034868
Hero Member
*
Offline Offline

Posts: 1715034868

View Profile Personal Message (Offline)

Ignore
1715034868
Reply with quote  #2

1715034868
Report to moderator
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
1715034868
Hero Member
*
Offline Offline

Posts: 1715034868

View Profile Personal Message (Offline)

Ignore
1715034868
Reply with quote  #2

1715034868
Report to moderator
1715034868
Hero Member
*
Offline Offline

Posts: 1715034868

View Profile Personal Message (Offline)

Ignore
1715034868
Reply with quote  #2

1715034868
Report to moderator
bitcoin carpenter
Legendary
*
Offline Offline

Activity: 1582
Merit: 1001


View Profile
April 15, 2016, 04:29:57 PM
 #2

Interesting..

Please elaborate.

If your not actively using the technology behind your crypto investment,

IT IS A SCAM!!!!
AngusCanine
Legendary
*
Offline Offline

Activity: 1414
Merit: 1001

To weird to live To rare to die


View Profile WWW
April 15, 2016, 05:44:17 PM
 #3

Well on page 6 part of the equation is infected with HIV and a first glance with out reading everything  it seems they propose a band aid to solve the problem
rdnkjdi (OP)
Legendary
*
Offline Offline

Activity: 1256
Merit: 1009


View Profile
April 15, 2016, 06:20:17 PM
 #4

Interesting..

Please elaborate.

I understand very little about PoW and I understand my capacity enough to know that learning will be less fruitful than reading from educated observers.  Wolf0, tbtb_need_war, claymore, Gregory Maxwell, etc would all be able to give good insight
TPTB_need_war
Sr. Member
****
Offline Offline

Activity: 420
Merit: 257


View Profile
April 15, 2016, 06:21:50 PM
 #5

I can't right now. Sorry. tromp is also qualified, but he is opinionated about the value of asymmetric proof-of-work versus memory hard hash algorithms.

TPTB_need_war
Sr. Member
****
Offline Offline

Activity: 420
Merit: 257


View Profile
April 16, 2016, 05:53:56 PM
 #6

Re: Zcash Equihash PoW Released

https://z.cash/blog/why-equihash.html

Well some Monero folks seem to very interested, so I decided to take a look because I suddenly remembered something from Iota's whitepaper:

http://188.138.57.93/tangle.pdf#page=20 (4.3 Resistance to quantum computations)

If you look on page 115 of Bernstein's Post Quantum Cryptography, it confirms a sqrt(N) speedup for Wagner's solution to the Generalized Birthday Problem for the hypothetical quantum computer. So if Equihash is using a large list of values to be sorted, then the speedup could be so great that a quantum computer could rewrite the entire block chain quite easily by redoing the past proof-of-work exponentially faster than it was originally done.

It appears that a memory hard algorithm such as Cryptonite would not have this problem.

I am just rushing and have not reread the Equihash white paper carefully, so I may have a mistake in my analysis.

Additionally I was thinking that this Equihash can be trivially sped up on an ASIC because the Equihash algorithm is not memory latency bound and thus is bound on the sorting and computation speed and/or the memory bandwidth, which can be optimized with specialized hardware.

Have I missed something in my haste Huh Surely the Z.cash folks are not this myopic.

The following assumption from the Equihash white paper seems to be wrong:

Quote
For fixed memory size, memory chips with bandwidth
significantly higher than that of typical desktop memory (such
as DDR3) are rare. Assuming that a prover does not have
access to memory with bandwidth higher than certain
Bw
max
,
we can efficiently bound the time-memory (and thus the time-
area) product for such implementations.

...

To the best of our knowledge, the highest reported band-
width in commercial products does not exceed 200 GB/s
(Playstation 4 has 172 GB/s bandwidth [5]), whereas the
desktop DDR3 can have as high as 17 GB/s bandwidth [3].
Thus we conclude that the highest advantage a prover can
get from parallel hardware does not exceed the factor of 15.
This may mean that our proof-of-work is GPU- and desktop-
friendly, whereas it is also ASIC-resistant in the sense that an
ASIC implementation does not yield smaller time-area product.

I am thinking faster memory bandwidth can be obtained by reading and writing to multiple, redundant independent memory banks simultaneously that are interleaved differently to have disjoint collisions on banks.

Or even if not faster, then perhaps orders-of-magnitude more electrically efficient, since the electric consumption will be primarily in the computation and not in the memory. And computation can be radically optimized on an ASIC.

As I predicted when I wrote in the Monero thread about this the other day, it appears the white paper didn't consider electricity as the most important factor. Duh.

tromp
Legendary
*
Offline Offline

Activity: 978
Merit: 1087


View Profile
April 16, 2016, 09:47:54 PM
 #7

Re: Zcash Equihash PoW Released
If you look on page 115 of Bernstein's Post Quantum Cryptography, it confirms a sqrt(N) speedup for Wagner's solution to the Generalized Birthday Problem for the hypothetical quantum computer.

You seem to have have misinterpreted the book. Page 115 says
Quote
If both lists have the same same size sqrt(N) , this means that the merging can be done in sqrt(N) operations instead of N , which is the same speed-up that can be achieved by Grover’s
algorithm. Thus, even with a quantum computer we can not expect to get attacks for FSB or CFS

That says there is *no* known quantum speedup for the Generalized Birthday Problem.

Quote
Additionally I was thinking that this Equihash can be trivially sped up on an ASIC because the Equihash algorithm is not memory latency bound and thus is bound on the sorting and computation speed and/or the memory bandwidth, which can be optimized with specialized hardware.

There is no publicly available optimized code for Equihash, so we don't really know to what extent it
is bandwidth or latency bound. There may be subtle trade offs between the two. We really need to see the memory behaviour of actual running optimized code.

Even once zcash gets around to publishing more optimized cpu and gpu code,
it won't be clear for a while whether further optimizations are possible, considering that the mining
algorithm is rather nontrivial.

I'm somewhat worried by this statement on their optimization page at https://github.com/zcash/zcash/issues/857

Quote
We don't have to do every possible optimization, but enough that it's feasible to run a miner with the right parameters.

Leaving optimizations on the table means that some miners will be able to run more efficiently than others.
arielbit
Legendary
*
Offline Offline

Activity: 3416
Merit: 1059


View Profile
April 17, 2016, 08:02:46 AM
 #8

There is no publicly available optimized code for Equihash, so we don't really know to what extent it
is bandwidth or latency bound. There may be subtle trade offs between the two. We really need to see the memory behaviour of actual running optimized code.

Even once zcash gets around to publishing more optimized cpu and gpu code,
it won't be clear for a while whether further optimizations are possible, considering that the mining
algorithm is rather nontrivial.

I'm somewhat worried by this statement on their optimization page at https://github.com/zcash/zcash/issues/857

Quote
We don't have to do every possible optimization, but enough that it's feasible to run a miner with the right parameters.

Leaving optimizations on the table means that some miners will be able to run more efficiently than others.

calling our miner developers in the house  Cool
TPTB_need_war
Sr. Member
****
Offline Offline

Activity: 420
Merit: 257


View Profile
April 17, 2016, 08:20:02 AM
Last edit: April 17, 2016, 09:01:13 AM by TPTB_need_war
 #9

Additionally I was thinking that this Equihash can be trivially sped up on an ASIC because the Equihash algorithm is not memory latency bound and thus is bound on the sorting and computation speed and/or the memory bandwidth, which can be optimized with specialized hardware.

There is no publicly available optimized code for Equihash, so we don't really know to what extent it
is bandwidth or latency bound. There may be subtle trade offs between the two. We really need to see the memory behaviour of actual running optimized code.

1. You didn't address my electrical efficiency point, which I think is probably the most damning. I very much doubt that the memory power consumption will be even 1/10 of the computational power consumption. Thus an ASIC will be at least 10 times more power efficient, Don't I remember pointing out the same issue with your Cuckoo hash? What was your retort again?

2. I also very much doubt it will be latency bound (even after the 10X speed up of the computation), because optimized sorting doesn't have to be random access.

For any memory bandwidth bound, there is this:


tromp
Legendary
*
Offline Offline

Activity: 978
Merit: 1087


View Profile
April 17, 2016, 02:47:39 PM
 #10

1. You didn't address my electrical efficiency point, which I think is probably the most damning. I very much doubt that the memory power consumption will be even 1/10 of the computational power consumption. Thus an ASIC will be at least 10 times more power efficient, Don't I remember pointing out the same issue with your Cuckoo hash? What was your retort again?

Cuckoo spends only 1/3 of its time doing hash computations, which take little power compared to DRAM switching rows.
So the possible gain by ASIC hashing is less than 33%.

Quote
2. I also very much doubt it will be latency bound (even after the 10X speed up of the computation), because optimized sorting doesn't have to be random access.

Again, we must wait to see the final parameters and optimized implementations. It seems likely they'll pick
n=144 and k=5, which means they need to find collisions on 16 bits at a time. That is just a lookup in a 64K entry table;
I don't expect to see any real sorting there. It's less clear how the actual items are represented, and if they are moved
themselves, or by pointer.
TPTB_need_war
Sr. Member
****
Offline Offline

Activity: 420
Merit: 257


View Profile
April 17, 2016, 06:49:58 PM
 #11

1. You didn't address my electrical efficiency point, which I think is probably the most damning. I very much doubt that the memory power consumption will be even 1/10 of the computational power consumption. Thus an ASIC will be at least 10 times more power efficient, Don't I remember pointing out the same issue with your Cuckoo hash? What was your retort again?

Cuckoo spends only 1/3 of its time doing hash computations, which take little power compared to DRAM switching rows.
So the possible gain by ASIC hashing is less than 33%.

Sorry not trying to make more work for you, but would you happen to have a reference for that claim? I am curious how you measured that? I remember you told smooth and I that you don't have a Kill-A-Watt meter to measure GPU power consumption.

What about multiple processors sharing the same memory to amortize the memory electrical cost?

2. I also very much doubt it will be latency bound (even after the 10X speed up of the computation), because optimized sorting doesn't have to be random access.

Again, we must wait to see the final parameters and optimized implementations. It seems likely they'll pick
n=144 and k=5, which means they need to find collisions on 16 bits at a time. That is just a lookup in a 64K entry table;
I don't expect to see any real sorting there. It's less clear how the actual items are represented, and if they are moved
themselves, or by pointer.

I thought Wagner's algorithm sorted. I'll have to reread the paper more carefully.

If lookup is in the static RAM cache, then ratio of computation to memory power consumption should increase.

Yeah we need to analyze the specifics, which is why I am thinking their white paper is myopic. They didn't present any of this analysis afaics on quick perusal. Perhaps they just assumed a memory bandwidth bound.

coins101
Legendary
*
Offline Offline

Activity: 1456
Merit: 1000



View Profile
April 17, 2016, 11:51:43 PM
 #12

I'm interested to give this a test drive and get this working on some rigs.
tromp
Legendary
*
Offline Offline

Activity: 978
Merit: 1087


View Profile
April 18, 2016, 01:56:29 AM
 #13

Cuckoo spends only 1/3 of its time doing hash computations, which take little power compared to DRAM switching rows.
So the possible gain by ASIC hashing is less than 33%.

Sorry not trying to make more work for you, but would you happen to have a reference for that claim? I am curious how you measured that? I remember you told smooth and I that you don't have a Kill-A-Watt meter to measure GPU power consumption.

I didn't measure that. I estimated that doing 90 64-bit operations (one siphash call) is cheaper than operating the (typically 16k)
sense amplifiers to read a new bitline. But I could be wrong about that...
I'm assuming a Cuckoo size much larger than 16k of course.

Quote
What about multiple processors sharing the same memory to amortize the memory electrical cost?

If they share the same memory bank then they likely cause more row switches and there is negligible amortization.

Quote
I thought Wagner's algorithm sorted. I'll have to reread the paper more carefully.

That puzzled me too. I don't see a need for sorting, just for binning to find collisions on the next chunk of n/(k+1) bits...,
and I will only believe sorting is more performant when I see it (i.e. when optimised code for that is made public).

Quote
If lookup is in the static RAM cache, then ratio of computation to memory power consumption should increase.

Their stated goal is a 1GB memory footprint, so we can rule out fitting in cache.
TPTB_need_war
Sr. Member
****
Offline Offline

Activity: 420
Merit: 257


View Profile
April 18, 2016, 07:26:37 AM
 #14

What about multiple processors sharing the same memory to amortize the memory electrical cost?

If they share the same memory bank then they likely cause more row switches and there is negligible amortization.

The GPU uses some coalescing to minimize latency and maximize utilization of memory bandwidth. Isn't that likely to merge accesses and thus amortize?

Excuse me we are getting in area of hardware that I am not well researched.

MaxDZ8
Hero Member
*****
Offline Offline

Activity: 672
Merit: 500



View Profile
April 18, 2016, 09:30:33 AM
 #15

The GPU uses some coalescing to minimize latency and maximize utilization of memory bandwidth. Isn't that likely to merge accesses and thus amortize?

Excuse me we are getting in area of hardware that I am not well researched.
Coalescing patterns are variable across implementations a common point across most hardware is occupancy ratio or hardware-based, context-switch-free multi-threading of sort. Intel calls it HyperThreading®, it's guaranteed to work on two threads whereas GPUs accomodate as many threads they can fit. In my experience is considerably more consistent and easy to use than exploiting coalescing patterns.
rdnkjdi (OP)
Legendary
*
Offline Offline

Activity: 1256
Merit: 1009


View Profile
April 29, 2016, 04:32:27 AM
 #16

Can someone explain to me the likely difference between a 4GB GPU vs an 8GB GPU?  I'm thinking about adding a few GPU's to my farm - the R9 390 has GB of ram which is pretty much useless when it comes to what was supposed to be memory hard PoW (Ethereum).  The newer larger memory bandwidth 4GB cards (Nano, Fury, etc) have a better hashrate.

Is more RAM per GPU going to wind up meaning better performance (allow more threads)?  Or will optimizations likely allow trading performance + bandwidth for RAM resulting in better performance from a newer lower ram
doremi
Hero Member
*****
Offline Offline

Activity: 654
Merit: 500


View Profile
April 29, 2016, 06:18:41 AM
 #17

wtf you guys need Zcash for? don`t you know yet that Vcash solved all bitcoin problems already? Cheesy Cheesy Cheesy






r0ach
Legendary
*
Offline Offline

Activity: 1260
Merit: 1000


View Profile
April 29, 2016, 06:24:13 AM
 #18

wtf you guys need Zcash for? don`t you know yet that Vcash solved all bitcoin problems already? Cheesy Cheesy Cheesy

It's a proof of stake coin.  Someone hacks Poloniex and coin instantly dies.  Many wow.

......ATLANT......
..Real Estate Blockchain Platform..
                    ▄▄▄▄▄▄▄▄▄
                    ████████████░
                  ▄██████████████░
                 ▒███████▄████████░
                ▒█████████░████████░
                ▀███████▀█████████
                  ██████████████
           ███████▐██▀████▐██▄████████░
          ▄████▄█████████▒████▌█████████░
         ███████▄█████████▀██████████████░
        █████████▌█████████▐█████▄████████░
        ▀█████████████████▐███████████████
          █████▀████████ ░███████████████
    ██████▐██████████▄████████████████████████░
  ▄████▄████████▐███████████████░▄▄▄▄░████████░
 ▄██████▄█████████▐█████▄█████████▀████▄█████████░
███████████████████▐█████▄█████████▐██████████████░
▀████████▀█████████▒██████████████▐█████▀█████████
  ████████████████ █████▀█████████████████████████
   ▀██▀██████████ ▐█████████████  ▀██▀██████████
    ▀▀█████████    ▀▀█████████    ▀▀██████████

..INVEST  ●  RENT  ●  TRADE..
 ✓Assurance     ✓Price Discovery     ✓Liquidity     ✓Low Fees





███
███
███
███
███
███





███
███
███
███
███
███
███
███
███
███
███
███

◣Whitepaper ◣ANN ThreadTelegram
◣ Facebook     ◣ Reddit          ◣ Slack


███
███
███
███
███
███
███
███
███
███
███
███





███
███
███
███
███
███








Hero/Legendary members
Omegasun
Hero Member
*****
Offline Offline

Activity: 742
Merit: 500



View Profile
April 29, 2016, 06:55:02 AM
 #19

Can someone explain to me the likely difference between a 4GB GPU vs an 8GB GPU?  I'm thinking about adding a few GPU's to my farm - the R9 390 has GB of ram which is pretty much useless when it comes to what was supposed to be memory hard PoW (Ethereum).  The newer larger memory bandwidth 4GB cards (Nano, Fury, etc) have a better hashrate.

Is more RAM per GPU going to wind up meaning better performance (allow more threads)?  Or will optimizations likely allow trading performance + bandwidth for RAM resulting in better performance from a newer lower ram

The performance between the 4 and 8GB cards are similar. But when the DAG size increases furhter, you will see the difference.


Tagz
█▀▀▀▀▀▀█
█ █▀▀▀ █
█  ▄██ █
█ ██▀  █
█ ▄▄▄█ █
█ █▀▀▀ █
█  ▄██ █
█ ██▀  █
█ ▄▄▄█ █
█      █
█▄▄▄▄▄▄█

█▀▀▀▀▀▀█
█ █▀▀▀ █
█  ▄██ █
█ ██▀  █
█ ▄▄▄█ █
█ █▀▀▀ █
█  ▄██ █
█ ██▀  █
█ ▄▄▄█ █
█      █
█▄▄▄▄▄▄█
tromp
Legendary
*
Offline Offline

Activity: 978
Merit: 1087


View Profile
April 29, 2016, 01:40:30 PM
Last edit: April 29, 2016, 03:11:02 PM by tromp
 #20

I thought Wagner's algorithm sorted. I'll have to reread the paper more carefully.

That puzzled me too. I don't see a need for sorting, just for binning to find collisions on the next chunk of n/(k+1) bits

In this case it turns out there is little difference between fully sorting and sorting into bins,
since the expected bin size is only 2 :-)

So equihash really seems to be all about sorting...
Pages: [1] 2 »  All
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!