Title: GPU mining strategies (algorithm)
Post by: jacoh on July 09, 2014, 03:07:10 PM

Hi,
Sorry if this is the wrong forum. I am currently in the process of creating a fairly simple bitcoin miner for CPU and GPU (purely for demonstration purposes, not to earn money). However, I am having a little trouble understanding how GPU miners generally work. I do want a fairly simple version of this, but I hope to get one that performs decently (hopefully within 10-30% of normal miners, and definitely faster than the CPU version). In general I would think the strategy for executing on the GPU is something like the below. I hope someone can tell me whether I am doing something completely wrong and give me some pointers on how you usually do it.
I do have a general and very basic GPU implementation, but it is currently slower than my CPU implementation. I do more or less as above, where each kernel checks several nonces and returns the "best" one (i.e. the hash with the most trailing zeros, using the getwork protocol).

Title: Re: GPU mining strategies (algorithm)
Post by: Schleicher on July 10, 2014, 04:27:18 PM

Usually in the nonce loop you want to return a result as soon as you find a diff 1 hash (4 bytes of zeroes).
If you don't have 4 bytes of zeroes then you can continue. Some things don't change if you have a different nonce. These can be precomputed outside of the nonce loop. Did you use the unrolled version of sha256? Or do you have a for loop with 64 rounds?

Title: Re: GPU mining strategies (algorithm)
Post by: jacoh on July 11, 2014, 01:21:26 PM

Thanks for the feedback.
Do you then return the nonce from the kernel and do a full validity check on the host? On a GPU, is it even possible to return early from a calculation? As the threads run in lockstep, I do not think this is possible?

At the moment I am simply using a full sha256 function from a library, which of course is slower, and I don't do midstate pre-hashing yet. However, I am only getting 150 khash/sec, so I somewhat doubt that this is the main issue. Even with the sha256 function removed from the kernel I only get about 3000 khash/sec in theory, which is still much lower than the ~10 Mhash/sec I can get using other miners. It might actually be down to the rather experimental OpenCL framework I am using; I will investigate a bit further.
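
For illustration, the split discussed in this thread (precompute what does not depend on the nonce, do only a cheap "4 bytes of zeroes" test in the nonce loop, then do the full check on the host) can be sketched on the CPU in Python. This is a minimal sketch under stated assumptions, not how any particular miner is written: the names `scan_nonces`, `verify_on_host`, `DIFF1_TARGET` and `GENESIS_HEADER76` are made up for this example, and copying a `hashlib` object only stands in for a real midstate.

```python
import hashlib
import struct

# Difficulty-1 target (compact form 0x1d00ffff). A hash is valid at diff 1
# when, read as a little-endian 256-bit integer, it is <= this value.
DIFF1_TARGET = 0xFFFF << 208


def scan_nonces(header76: bytes, start: int, count: int) -> list:
    """Stand-in for the GPU kernel: collect nonces whose hash ends in 4 zero bytes."""
    # The first 64 header bytes do not depend on the nonce, so feed them once.
    # Copying this object plays the role of the precomputed midstate.
    midstate = hashlib.sha256(header76[:64])
    tail = header76[64:76]  # last 4 merkle-root bytes, time, bits
    candidates = []
    for nonce in range(start, start + count):
        first = midstate.copy()
        first.update(tail + struct.pack("<I", nonce))
        digest = hashlib.sha256(first.digest()).digest()
        # Cheap test only: the top 32 bits of the hash must be zero.
        if digest[28:] == b"\x00\x00\x00\x00":
            candidates.append(nonce)
    return candidates


def verify_on_host(header76: bytes, nonce: int, target: int = DIFF1_TARGET) -> bool:
    """Full check on the host: redo the double SHA-256 and compare to the target."""
    header = header76 + struct.pack("<I", nonce)
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "little") <= target


# Usage with the Bitcoin genesis block header (version, previous hash,
# merkle root, time, bits), whose known nonce is 2083236893.
GENESIS_HEADER76 = (
    struct.pack("<I", 1)
    + bytes(32)
    + bytes.fromhex(
        "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
    )[::-1]
    + struct.pack("<II", 1231006505, 0x1D00FFFF)
)

if __name__ == "__main__":
    found = scan_nonces(GENESIS_HEADER76, 2083236893 - 1000, 2000)
    print([n for n in found if verify_on_host(GENESIS_HEADER76, n)])
```

On a GPU, each work item would typically test one or a few nonces and write any hit to a small output buffer rather than trying to stop the whole launch early. Work items that execute in lockstep finish together anyway, so the gain comes from keeping the in-kernel test cheap and leaving the full target comparison to the host.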