Bitcoin Forum

Alternate cryptocurrencies => Mining (Altcoins) => Topic started by: tobeaj2mer01 on August 11, 2016, 02:21:33 AM



Title: Bounty crowdsale to GPU-based miner for Zcash
Post by: tobeaj2mer01 on August 11, 2016, 02:21:33 AM
I want to start a crowdsale to help developing a GPU-miner for Zcash, rules are as below, kindly let me know your opinion.

1. All of the money will go to the guy who developed a GPU-miner for Zcash
2. Every participate will donate 0.2 BTC, when miner is ready, every participate will get a copy of the miner, this miner will NOT go to public
3. These BTC will go to a trustworthy escrow, when miner is okay then it will go to the developer
4. Miner should be released before Zcash first version release.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: bbc.reporter on August 12, 2016, 07:58:03 AM
Is it not risky to commit to this? According to the zcash team the algo could change in the future which may or may not allow GPU mining.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: rdnkjdi on August 16, 2016, 07:02:30 AM
I would most certainly like to be involved in this.  In order to do it right - I would suggest perhaps trying to find a few developers and see what they would charge. 

Tromp, Wolf0 and I can't think of the other two I remember (one was the guy who built the Monero miner).

People are super non committal - but I would be happy to pay significantly more for a closed miner as long as the money was held in ESCROW by a reputable BTT member.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: nerdralph on August 16, 2016, 04:06:55 PM
I might be interested.  I've done a bunch of work on ethminer in the past:
https://github.com/nerdralph/ethminer-nr/tree/110

I was already researching zcash/equihash, and communicated with Alex Biryukov about the Equihash paper.

I can write OpenCL, but not CUDA.  Even though they are similar, I'm not interested in doing CUDA code.

I'd need at least 10 BTC to write it.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: YIz on August 16, 2016, 04:14:48 PM
A GPU miner for Zcash will get released to the public eventually. and you will need a serious amount of Bitcoin in order to get someone to code one for you.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: Mugatu on August 16, 2016, 08:03:39 PM
I may be able to write a CUDA based miner.

I haven't spent very much time looking at zcash/equihash yet, and won't be able to for a few weeks yet.

I would need to look into this more before I provide a quote


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tobeaj2mer01 on August 22, 2016, 02:03:55 AM
There is another way for this thing, you can build a zcash GPU-miner and sell it to the public, maybe you will get a lot of money, for sure I will buy one if it's not expensive.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tromp on August 22, 2016, 02:38:04 AM
I would most certainly like to be involved in this.  In order to do it right - I would suggest perhaps trying to find a few developers and see what they would charge. 

Tromp, Wolf0 and I can't think of the other two I remember (one was the guy who built the Monero miner).

It's a little early to be developing GPU miners given that the CPU miner is still being optimized,
and is lacking important features like multi-threading.

Once the CPU miner looks reasonably optimized, I might try my hand at a CUDA version.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: Q_R_V on August 22, 2016, 12:48:31 PM
There is another way for this thing, you can build a zcash GPU-miner and sell it to the public, maybe you will get a lot of money, for sure I will buy one if it's not expensive.

Miner with built in fee would be the best idea, something like CDM.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: Ty13rDerden on August 28, 2016, 09:43:30 PM
Formal proposals should be made here on the Zcash forum:

https://forum.z.cash/t/crowdfund-gpu-miner/1324

Most people have opted to commit 1 BTC each for access to the miner.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tromp on September 26, 2016, 02:45:38 PM
I would most certainly like to be involved in this.  In order to do it right - I would suggest perhaps trying to find a few developers and see what they would charge. 

Tromp, Wolf0 and I can't think of the other two I remember (one was the guy who built the Monero miner).

It's a little early to be developing GPU miners given that the CPU miner is still being optimized,
and is lacking important features like multi-threading.

Once the CPU miner looks reasonably optimized, I might try my hand at a CUDA version.

I ended up having to optimize the CPU miner myself:

https://forum.z.cash/t/breaking-equihash-in-solutions-per-gb-second/1995


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: Nauticalam on September 26, 2016, 03:15:33 PM
There is another way for this thing, you can build a zcash GPU-miner and sell it to the public, maybe you will get a lot of money, for sure I will buy one if it's not expensive.

Miner with built in fee would be the best idea, something like CDM.

0.2 bitcoin is quite expensive for the small miners. So it is better to use the fee based system for small miners.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: nerdralph on September 26, 2016, 04:01:51 PM
I would most certainly like to be involved in this.  In order to do it right - I would suggest perhaps trying to find a few developers and see what they would charge. 

Tromp, Wolf0 and I can't think of the other two I remember (one was the guy who built the Monero miner).

It's a little early to be developing GPU miners given that the CPU miner is still being optimized,
and is lacking important features like multi-threading.

Once the CPU miner looks reasonably optimized, I might try my hand at a CUDA version.

I ended up having to optimize the CPU miner myself:

https://forum.z.cash/t/breaking-equihash-in-solutions-per-gb-second/1995

I think you are making a mistake assuming the performance scales with available memory.  I've already figured out how to create the initial table of 2M hashes in a fully parallel way, such that on a GPU with n compute units, each one creates 2M/n of the hashes.  Evenly distributing the sorting amongst the CUs is looking a lot harder.  I'm confident I can get the memory requirements at k=9, n=200 down to 128MB.  Since each hash is only 25 bytes, using index compression it might be possible to do an efficient solution using half that.  The memory I/O requirements should be less than 4GB (i.e. < 4GB of reads + writes), meaning a low-end GPU like a R7 370 could do 10s of solutions per second.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tromp on September 26, 2016, 04:38:47 PM
I think you are making a mistake assuming the performance scales with available memory.

I make no such assumption.
Please read solardiz' reply, which I fully agree with.

Quote
I'm confident I can get the memory requirements at k=9, n=200 down to 128MB. 

I would be quite shocked if you can get peak memory usage down to that!


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: nerdralph on September 26, 2016, 06:23:14 PM
I think you are making a mistake assuming the performance scales with available memory.

I make no such assumption.
Please read solardiz' reply, which I fully agree with.
Your statement: "That gives it an estimated single-threaded time*space performance of 0.063 S/s / 0.54GB ~ 0.12 S/GBs"
If it doesn't scale with memory, then it was nonsense to include the memory use in GB as a multiplier in performance.

Quote
I'm confident I can get the memory requirements at k=9, n=200 down to 128MB. 

I would be quite shocked if you can get peak memory usage down to that!

The hashes themselves only take 50MB, so 128MB is not an unreasonable target.  After the first sort, you only need to keep 200-20=180 bits/hash + 21-bits of index per entry, since you only care about those that collide on the first 20 bits.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tromp on September 26, 2016, 06:50:28 PM
Your statement: "That gives it an estimated single-threaded time*space performance of 0.063 S/s / 0.54GB ~ 0.12 S/GBs"
If it doesn't scale with memory, then it was nonsense to include the memory use in GB as a multiplier in performance.

It makes about as much sense as measuring a country's productivity in GDP per capita.
Does GDP scale linearly with capita? Can you just stick more people in a country to increase its GDP?
No, it takes many other resources. Is GDP per capita nonsense then?
Maybe there are even better analogies, but that's all I could think of for now.

Quote
The hashes themselves only take 50MB, so 128MB is not an unreasonable target.  After the first sort, you only need to keep 200-20=180 bits/hash + 21-bits of index per entry, since you only care about those that collide on the first 20 bits.

You need a fair bit of space to hold the index-lists...


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: nerdralph on September 26, 2016, 07:27:23 PM
Your statement: "That gives it an estimated single-threaded time*space performance of 0.063 S/s / 0.54GB ~ 0.12 S/GBs"
If it doesn't scale with memory, then it was nonsense to include the memory use in GB as a multiplier in performance.

It makes about as much sense as measuring a country's productivity in GDP per capita.
Does GDP scale linearly with capita? Can you just stick more people in a country to increase its GDP?
No, it takes many other resources. Is GDP per capita nonsense then?
Maybe there are even better analogies, but that's all I could think of for now.

Quote
The hashes themselves only take 50MB, so 128MB is not an unreasonable target.  After the first sort, you only need to keep 200-20=180 bits/hash + 21-bits of index per entry, since you only care about those that collide on the first 20 bits.

You need a fair bit of space to hold the index-lists...


You're engaging in logical fallacies referring to the GDP analogy, besides taking the discussion on a tangent.  When I'm done writing a GPU implementation, I'll be quoting performance in solutions/second.  Having an 8GB card will give no more performance than 4GB of memory at the same clock speed.

As for space for indexes, 21 bits isn't that significant.  After the first sort round, you need 2*21 bits + 180 bits hash = 222 bits.  After the 2nd round, you need 4*21 bits + 160 bits hash = 244 bits, which fits nicely in 32-byte fixed size records.  By the 3rd round the size of the indexes overtakes the hash size, but the number of hashes diminishes.  My guess is the zcash c++ implementation is using inefficient data structures that contain 64-bit pointers, bloating the memory use.
I also think it is possible to modify Wagner's algorithm in ways that may optimize performance.

edit: here's a paper by DJB that discusses optimizing Wagner's algorithm.  Optimizations may or may not be possible while still conforming to equihash's algorithm binding requirement.
https://cr.yp.to/rumba20/expandxor-20070411.pdf



Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tromp on September 26, 2016, 09:37:48 PM
 When I'm done writing a GPU implementation, I'll be quoting performance in solutions/second.

Sure, that makes most sense.
But as Jonathan Toomim pointed out in the Zcash forum,
and as SolarDiz agreed with

Quote
Sol/s and Sol/(GiB•s) are two distinct metrics. Both are important.

I simply chose the somewhat less common measure because
1) for a memory-hard algorithm, it's crucial that you cannot trade off memory for speed without penalty.
2) i didn't want to disclose the precise Sol/s performance of my miner.

Quote
As for space for indexes, 21 bits isn't that significant.  After the first sort round, you need 2*21 bits + 180 bits hash = 222 bits.  After the 2nd round, you need 4*21 bits + 160 bits hash = 244 bits, which fits nicely in 32-byte fixed size records.  By the 3rd round the size of the indexes overtakes the hash size, but the number of hashes diminishes.

And by the 8th round you need 256*21 bits + 40 bits =5416 bits = 677 bytes.
That's where you need to get creative.

Quote
edit: here's a paper by DJB that discusses optimizing Wagner's algorithm.

I'm familiar with some of DJB's papers on wagner's algorithm, although not that particular one.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: nerdralph on September 27, 2016, 12:04:08 AM
Quote
As for space for indexes, 21 bits isn't that significant.  After the first sort round, you need 2*21 bits + 180 bits hash = 222 bits.  After the 2nd round, you need 4*21 bits + 160 bits hash = 244 bits, which fits nicely in 32-byte fixed size records.  By the 3rd round the size of the indexes overtakes the hash size, but the number of hashes diminishes.

And by the 8th round you need 256*21 bits + 40 bits =5416 bits = 677 bytes.
That's where you need to get creative.

Possibly.  If the c++ implementation is using 64-bits for each index, and takes ~1GB of peak memory, using packed 21-bit indexes cuts the memory used by 2/3rds.   You also probably know that GPUs are more efficient at manipulating 32-bit and 24-bit values than modern CPUs (which are optimized for 64-bit register operations and larger cache lines).  I don't know about NVidia, but on AMD GCN the memory is divided into 32-bit channels, and all channels can be reading or writing simultaneously.  Combined with the well-known higher peak bandwidth of GDDR5 vs DDR3/4 makes the usable memory bandwidth on the GPU much higher.

Even if I run into problems and can't get the memory requirements much lower than 256MB due to size of the index storage at the final stages, the larger record size will make it easier to max out the memory bandwidth.  With the latencies of GGDR5 and the way page open/close is queued, I find you need to be writing or reading in at least 128-byte chunks to maximize memory bandwidth.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: bluedeep on September 27, 2016, 12:05:29 AM
Where can we read more info about Zcash? Is there any BitcoinTalk Ann? Any links are welcome.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: nerdralph on September 27, 2016, 12:15:15 AM
Where can we read more info about Zcash? Is there any BitcoinTalk Ann? Any links are welcome.

There's lots on github.  You can start here:
https://github.com/zcash/zcash/wiki/specification


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: mjosephs on September 29, 2016, 04:13:44 AM
Hello folks, this thread appears to be the place with the highest concentration of zECQuihash optimization gurus.  May I trouble you with a stupid question?

You need a fair bit of space to hold the index-lists...

Indeed you do.

I'm trying to correlate that with the zcashd implementation, since it's the only available working cross-checker (too bad it is written for speed instead of readability).  The only part that has me confused is the need to eliminate entries whose index sets contain duplicate/repeated entries after EVERY one of the k collision steps.  There is no mention of this in the algorithm given in the Equihash paper.

I understand why these can't be part of the final solution (the X_i's must use distinct i's).  But why not wait until the end to drop these superfluous solutions?  Why check for duplicates k times?  I did some quick back-of-the-envelope calculations and I'm having a hard time believing that leaving these entries in the table would significantly increase memory consumption.  If you had a duplicate X_a in the second level of the collision, it would have to be a result of (X_a*X_b) colliding with (X_a*X_c) on the first 40 bits (where X_a collides with each of X_b and X_c on the first 20 bits).  If this is true then

 ((X_a*X_b)*(X_a*X_c)) = 00000000000000000000 00000000000000000000

and therefore

 (X_a*X_a*X_b*X_c) = 00000000000000000000 00000000000000000000

so

 (X_b*X_c) = 00000000000000000000 00000000000000000000

and moreover

 (X_a*X_b) = 00000000000000000000 ....................

 (X_a*X_c) = 00000000000000000000 ....................

In general it seems like repeated indices arise in the Equihash-constrained variant of Wagners algorithm as a result of an "overcollision" -- when two rows of the table collide not just on the current column but the next one as well.  When this happens the members of the overcollided set will generate one suprious entry in the next round for every entry they collide with in the current round.  It seems like that would happen on average twice per column.  Is that really enough to bloat the memory usage in a noticable way for k=10 k=9?  Surely for something like k=100 we'd be talking about serious combinatorial explosion, but for k=10 k=9 there just aren't that many steps compared to the size of the table.

(edit: changed k=10 to k=9)


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tromp on September 29, 2016, 02:24:22 PM
Hello folks, this thread appears to be the place with the highest concentration of zECQuihash optimization gurus.  May I trouble you with a stupid question?


I'm trying to correlate that with the zcashd implementation, since it's the only available working cross-checker (too bad it is written for speed instead of readability).  The only part that has me confused is the need to eliminate entries whose index sets contain duplicate/repeated entries after EVERY one of the k collision steps.  There is no mention of this in the algorithm given in the Equihash paper.

I understand why these can't be part of the final solution (the X_i's must use distinct i's).  But why not wait until the end to drop these superfluous solutions?  Why check for duplicates k times?  I did some quick back-of-the-envelope calculations and I'm having a hard time believing that leaving these entries in the table would significantly increase memory consumption.  If you had a duplicate X_a in the second level of the collision, it would have to be a result of (X_a*X_b) colliding with (X_a*X_c) on the first 40 bits (where X_a collides with each of X_b and X_c on the first 20 bits).  If this is true then

 ((X_a*X_b)*(X_a*X_c)) = 00000000000000000000 00000000000000000000

and therefore

 (X_a*X_a*X_b*X_c) = 00000000000000000000 00000000000000000000

so

 (X_b*X_c) = 00000000000000000000 00000000000000000000

and moreover

 (X_a*X_b) = 00000000000000000000 ....................

 (X_a*X_c) = 00000000000000000000 ....................

In general it seems like repeated indices arise in the Equihash-constrained variant of Wagners algorithm as a result of an "overcollision" -- when two rows of the table collide not just on the current column but the next one as well.  When this happens the members of the overcollided set will generate one suprious entry in the next round for every entry they collide with in the current round.  It seems like that would happen on average twice per column.  Is that really enough to bloat the memory usage in a noticable way for k=10?  Surely for something like k=100 we'd be talking about serious combinatorial explosion, but for k=10 there just aren't that many steps compared to the size of the table.

Sorry if I overlooked it, but where is the stupid question?

All I see are some very astute observations. Welcome to club Guru!


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: nerdralph on September 29, 2016, 03:11:59 PM
@mjosephs

I had previously thought about the need to keep all pairs ab, ac, bc when Xa, Xb, and Xc all collide on n/(k+1) bits.  Your question got me thinking more about the zcash isProbablyDuplicate code and possible ways of avoiding duplicates.  When there is a collision on 40 bits as in your example, I think it may be safe to discard all those tuples at the first stage.  In other words, when a collision on 20 bits is found, check if there is a collision on 40 bits before deciding to keep the tuples.  I think the cost of doing the check early is quite low, so I can think of little reason to postpone the check to the end.


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: mjosephs on September 30, 2016, 12:05:09 AM
Sorry if I overlooked it, but where is the stupid question?

All I see are some very astute observations. Welcome to club Guru!

Thank you @tromp, that's quite flattering coming from you.

I'm guessing that with the sprint towards October 28th you don't have a whole lot of bandwidth to think hard about this, but if you do come across a reason why waiting until the end to prune duplicates doesn't work, or conclude that it does, a quick post here would be much appreciated.


In other words, when a collision on 20 bits is found, check if there is a collision on 40 bits before deciding to keep the tuples.

Thanks @nerdralph, that was my plan too -- check for overcollisions while updating the rest of the row between sorting passes, instead of checking tuple-lists for duplicates.  This way the index tuples become sort of a "write-only/append-only" (or "read-infrequently") type of memory, which allows other much more valuable optimizations.  I just wanted to check if there was some obvious reason why it wouldn't work.

I will sketch out a monte carlo program to check the penalty (expected to be small) and if there are any other situations that lead to duplicate indices that aren't accounted for above.

I sure wish there were a simple ZECquihash solution checker (not solver) written for clarity rather than performance.  The only one we've got right now is the one that comes with zcashd and it is built out of subroutine calls shared with the miner, which is so heavily optimized that it's very hard to read the (uncommented!) code.  Maybe when @tromp becomes a ZEC-billionaire next month he'll publish one :)


Title: Re: Bounty crowdsale to GPU-based miner for Zcash
Post by: tromp on September 30, 2016, 01:07:42 AM
I'm guessing that with the sprint towards October 28th you don't have a whole lot of bandwidth to think hard about this, but if you do come across a reason why waiting until the end to prune duplicates doesn't work, or conclude that it does, a quick post here would be much appreciated.

Waiting until the end does in fact work.

Quote
Maybe when @tromp becomes a ZEC-billionaire next month he'll publish one :)

I have no plans to mine ZEC myself:-(