nelisky (OP) | September 09, 2010, 01:57:12 AM
 
Honestly, I can't understand all the secrecy around the CUDA / OpenCL / whatever GPU-enabled versions of bitcoin. Sure, it's nice to have the extra edge if you want to build an efficient mining farm, but really, what's up with everyone in this regard?

So, despite not really having the time, I decided to take my first cup of CUDA (all the hardware I have to test on is a MacBook Pro with an NVIDIA GPU inside), and I'm attaching the initial version here. Yes, the source, for you to do as you please! Just remember the reason you got it in the first place, which is that someone didn't take it and hide it!

Anyway, it's *really* crude: the makefile has static compile instructions, only for OS X with the CUDA dev SDK in the default place, compiled for x86_64. I'm sure you can quickly tweak it to compile on your system, though.

The code that runs on the GPU is a 1:1 copy of the Cryptopp source, only slightly tweaked. There's a lot of room for improvement. I have some 20 hours total of working with CUDA, so I don't have the faintest idea what optimizations I could achieve, but at least in the memory layout there's a lot to do. I compared the resulting hashes with Cryptopp and they match, so I'm assuming it can generate blocks. I'm tired and thus haven't tried a local network, but I will soon.

I get 1400 kh/s from the CPU with all cores combined, and 1800 kh/s from the GPU alone, so it's pretty nice. It still takes a lot of CPU, for some reason, so run it with cores-1 or you'll lose performance. I *think* the hashes-per-second calculation is correct, but it may not be. The first processor is always the CUDA one, and it will not run if you don't have the GPU / kernel. There's no error handling and there's ugly hackish code throughout, but it serves as a start. Let's get this production grade to include in the main clients, shall we?

Now, if you want to thank me for doing this, just head to http://taabl.datlatec.com and place a few bets. I'm starting to think the time I spent developing the lottery wasn't worth it, as it has gotten too little interest so far, so it would be super if you all gave it a try. If you want me to continue to pursue this, then toss a good amount of coins my way, or send me some hardware. I have Linux and Windows machines, but not the GPU for them, so *wink* *wink*

Most of all, share your code!

Immanuel | September 09, 2010, 03:19:00 AM

I have a feeling this is going to be very controversial.

"I swear by my life, and my love of it, that I will never live for the sake of another man, nor ask another man to live for mine."

omegadraconis | September 09, 2010, 03:46:03 AM
 
I think the big secret with CUDA or OpenCL clients is that they give one person a decisive advantage over everyone else running only a CPU. Add to that the fact that most users don't know CUDA programming (me included), and you get a situation where a CUDA client can be valuable to its author. For example, artforz has made his own CUDA client that has not been released. It has been rumored that he makes up 25% of the generation power on the network. It gives him a huge advantage, and one I think artforz would like to keep.
The downside to opening up a GPU client to the masses is that it completely kills CPU-based generation unless you have a lot of CPUs. On my Windows client I went from 4500 khash/s to 29000 khash/s.

I will give this a try in a few days or so on my 8800 GTX hackintosh and see how it runs. What GPU does your Mac have in it?
I have no CUDA or Mac OS X programming knowledge, but I have done some C on BSD and will have a go at the code to see what I might be able to do. Thank you for your efforts!

nelisky (OP) | September 09, 2010, 03:58:43 AM
 
Quote from: omegadraconis
    I think the big secret with CUDA or OpenCL clients is that they give one person a decisive advantage over everyone else running only a CPU. Add to that the fact that most users don't know CUDA programming (me included), and you get a situation where a CUDA client can be valuable to its author. For example, artforz has made his own CUDA client that has not been released. It has been rumored that he makes up 25% of the generation power on the network. It gives him a huge advantage, and one I think artforz would like to keep.
    The downside to opening up a GPU client to the masses is that it completely kills CPU-based generation unless you have a lot of CPUs. On my Windows client I went from 4500 khash/s to 29000 khash/s.

I don't think the bias should be towards those with better printers having more money. Bitcoin is a whole ecosystem, and mining is a small part of it, I guess. If lots of people start running GPU-based mining, then the advantage owned by artforz and a handful of others will spread into more hands and pockets, thus making mining less efficient for everyone. But mining is part statistics, part luck. I recently generated 3 blocks, running a machine that has been mining non-stop for over a month. These 3 blocks are the only blocks it ever generated.

Of course artforz would like to keep the advantage. Heck, I could use this to my advantage too, but I'd rather take the open source business approach and say "here, take it, my treat", but expect you to do the same for me and others (yes, trust is the basis for this), and if you like it, I'll take whatever you feel is fair. You want me to do something specific? Need support? Well, priority and special attention come at a cost. I may never get as many coins as artforz created, but I hope I did my share in helping to spread them.

Quote from: omegadraconis
    I will give this a try in a few days or so on my 8800 GTX hackintosh and see how it runs. What GPU does your Mac have in it?
    I have no CUDA or Mac OS X programming knowledge, but I have done some C on BSD and will have a go at the code to see what I might be able to do. Thank you for your efforts!

I've got a GT330M, a really basic entry-level thing. And you are most welcome.

sgtstein | September 09, 2010, 04:28:09 AM | Last edit: September 09, 2010, 05:00:13 AM by sgtstein
 
I am also working on a version of the Bitcoin CUDA client and will do all that I can to help this project. It will be open-sourced as well, for the same reasons nelisky stated. I will test this first version on my GTX 460 (borrowed from a friend) as soon as I can. Also, the pseudocode I've been writing up should be quite a bit faster than this currently is, because it incorporates FPGA-style design and CUDA programming. I should have some of my code up after the weekend.

nelisky (OP) | September 09, 2010, 04:31:47 AM
 
Quote from: sgtstein
    I am also working on a version of the Bitcoin CUDA client and will do all that I can to help this project. It will be open-sourced as well, for the same reasons nelisky stated. I will test this first version on my GTX 460 (borrowed from a friend) as soon as I can. Also, the pseudocode I've been writing up should be quite a bit faster than this currently is, because it incorporates FPGA-style design and CUDA programming. I should have some of my code up after the weekend.

Nice! I've wanted to get into FPGAs too, and bitcoin seems very fitting, but unfortunately time doesn't stretch that much.

BTW, the code I posted is slightly broken. If the CUDA thread generates a valid block, it will die. Where main.cpp reads:

          if (DEBUG || hash <= hashTarget) {
            pblock->nNonce = keep+i;
            break;
          }

it should read:

          if (DEBUG || hash <= hashTarget) {
            pblock->nNonce = tmp.block.nNonce = keep+i;
            break;
          }

aceat64 | September 09, 2010, 05:19:48 AM

Nelisky, thank you for your work. I think that it's in the network's best interest that the fastest possible methods of generating blocks are freely available (see tcatm's 4-way SSE2).

Immanuel | September 09, 2010, 12:59:39 PM

I have one question: Do you need to have a good CPU to take full advantage of your GPU when generating bitcoins? I ask because I want to create a bitcoin mining rig with just pure GPU power. I don't want to spend extra on a CPU that I'm not going to use fully.

sgtstein | September 09, 2010, 01:36:13 PM
 
As of now, yes. It depends on how efficiently we can implement the miner on the GPU. If the entire control structure of the miner ran on the GPU, with only the rest of the system on the CPU (ideally), then no, you wouldn't. As of now, the CPU will definitely impact how fast it is. This will take a bit to come to fruition, but when it does, dedicated systems would be quite easy to build.

nelisky (OP) | September 09, 2010, 07:18:15 PM

This new patch has been tested and is generating (at least in a test network), and the move of scratch space into thread local memory has improved performance quite a bit, from ~1880kh/s to ~3000kh/s on my system.
Did anyone succeed in building and testing on other platforms / GPUs?

LobsterMan | September 13, 2010, 01:47:16 PM

I am interested in maybe getting something going for Windows...

aceat64 | September 13, 2010, 03:13:54 PM

Quote from: nelisky
    This new patch has been tested and is generating (at least in a test network), and the move of scratch space into thread local memory has improved performance quite a bit, from ~1880kh/s to ~3000kh/s on my system.
    Did anyone succeed in building and testing on other platforms / GPUs?

I've tried to compile it under Ubuntu 64bit, but have run into issues with it. I would greatly appreciate it if you could give me some info about compiling for CUDA or a link to such information.

sgtstein | September 13, 2010, 03:18:46 PM
 
Quote from: aceat64
    I've tried to compile it under Ubuntu 64bit, but have run into issues with it. I would greatly appreciate it if you could give me some info about compiling for CUDA or a link to such information.

Are you planning on using an IDE? If so, which one? I prefer NetBeans and have it set up and running on Fedora 13 64-bit. It's a bit of a pain to set up, but it works quite well once you have it done.

After a quick search, this popped up and looks very promising for you: http://lifeofaprogrammergeek.blogspot.com/2008/05/cuda-development-in-ubuntu.html

I ran out of time this weekend to get any work done, and I fried my PSU last week from overuse. A new 850W unit will get to me tomorrow night.

nelisky (OP) | September 13, 2010, 05:14:12 PM
 
Quote from: aceat64
    I've tried to compile it under Ubuntu 64bit, but have run into issues with it. I would greatly appreciate it if you could give me some info about compiling for CUDA or a link to such information.

Quote from: sgtstein
    Are you planning on using an IDE? If so, which one? I prefer NetBeans and have it set up and running on Fedora 13 64-bit. It's a bit of a pain to set up, but it works quite well once you have it done.
    After a quick search, this popped up and looks very promising for you: http://lifeofaprogrammergeek.blogspot.com/2008/05/cuda-development-in-ubuntu.html
    I ran out of time this weekend to get any work done, and I fried my PSU last week from overuse. A new 850W unit will get to me tomorrow night.

Thanks for the link! I'll be sure to try and follow it as soon as I have a moment. The only thing is that I don't have any Linux machine with a CUDA-enabled GPU, but I can still try to build things so others can test.

I don't use IDEs at all; I try very hard to avoid them. It pays in the long run to know and understand your tools.

As for help with compiling, I took a shortcut on my system, just copying the calls from the NVIDIA SDK template. There's a variable (is it debug or verbose?) that you set to 1 in your makefile, and it prints out the calls it makes to compile, so I just copied the switches over to the bitcoin makefile. That is one place I really need to improve, but I didn't have time to. I don't even check if the GPU is CUDA capable, for crying out loud.

sgtstein | September 13, 2010, 07:05:59 PM
 
From what I've read, you can use the -emu to emulate GPU code on your CPU. No idea if that's it or whether it actually works, but it's definitely worth a shot.

Oh, I completely understand wanting to understand your tools. I just felt that, because I am so used to working with the IDE and strapping it all together, it would be easier for me to write there and work across my Linux and Windows boxes.

Quote from: nelisky
    I don't even check if the GPU is CUDA capable, for crying out loud.

BAHAHAHAHA!!! NICE! Might wanna get that checked out instead of silently failing or imploding ;-)

nelisky (OP) | September 13, 2010, 07:17:39 PM
 
Quote from: sgtstein
    From what I've read, you can use the -emu to emulate GPU code on your CPU. No idea if that's it or whether it actually works, but it's definitely worth a shot.
    Oh, I completely understand wanting to understand your tools. I just felt that, because I am so used to working with the IDE and strapping it all together, it would be easier for me to write there and work across my Linux and Windows boxes.
    Quote from: nelisky
        I don't even check if the GPU is CUDA capable, for crying out loud.
    BAHAHAHAHA!!! NICE! Might wanna get that checked out instead of silently failing or imploding ;-)

It is not silent at all when it fails. The result is that the calculated hash is all zeros, which passes the difficulty check. Then the client double-checks the hash and dies on an assert. Still, I could just not start the CUDA miner and allow the client to still use the CPU.
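
The failure described here (an output buffer the GPU never wrote reads as an all-zero hash, which trivially passes any difficulty check) could be caught on the host before the assert is reached. The following is only a minimal sketch under stated assumptions: `Hash256`, `isAllZero`, `lessOrEqual` and `acceptGpuHash` are hypothetical names, not taken from the patch, and the hash is assumed to be stored as 32 bytes with the most significant byte last.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Assumption: a 256-bit hash stored little-endian (byte 31 is most significant).
using Hash256 = std::array<std::uint8_t, 32>;

// Hypothetical helper: true when every byte is zero, which is what an
// untouched output buffer looks like if the kernel never ran.
bool isAllZero(const Hash256& h) {
    return std::all_of(h.begin(), h.end(),
                       [](std::uint8_t b) { return b == 0; });
}

// Compare two hashes as 256-bit integers, most significant byte first.
bool lessOrEqual(const Hash256& a, const Hash256& b) {
    for (int i = 31; i >= 0; --i) {
        if (a[i] != b[i]) return a[i] < b[i];
    }
    return true;  // equal
}

// Accept a GPU result only if it is not the all-zero pattern and it
// actually meets the target.
bool acceptGpuHash(const Hash256& hash, const Hash256& target) {
    if (isAllZero(hash)) return false;  // treat as "GPU produced nothing"
    return lessOrEqual(hash, target);
}
```

Rejecting the all-zero pattern outright is a design shortcut: a genuine all-zero SHA-256 output is astronomically unlikely, so treating it as a device failure costs nothing in practice.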

sgtstein | September 13, 2010, 08:50:06 PM
 
Quote from: nelisky
    Still, I could just not start the CUDA miner and allow the client to still use the CPU.

IMHO, this would be a far better approach than simply failing. Although, there should be some type of warning that it is failing over from GPU to CPU.

omegadraconis | September 13, 2010, 09:20:09 PM

Anyone have a working OS X binary they wouldn't mind posting? I am still trying to get my build tools figured out on OSX as it is not my primary OS currently.

nelisky (OP) | September 13, 2010, 09:33:51 PM
 
Quote from: omegadraconis
    Anyone have a working OS X binary they wouldn't mind posting? I am still trying to get my build tools figured out on OSX as it is not my primary OS currently.

I can give you mine, BUT it doesn't statically link CUDA, so you need the SDK installed, AND it's 64-bit. Assuming both of these are fine for you, and you don't mind using the terminal instead of the GUI, give it a try. This is the latest version, which does ~6.2 MH/s on my system (assuming I'm calculating correctly), and you should try it with just one mining thread (although it works with 5, the speed actually decreases from 3 up, and if you generate, you can't be sure it was the GPU miner). Grab it at www.datlatec.com/bitcoind.zip and let me know how it goes.

nelisky (OP) | September 16, 2010, 10:29:40 PM
 
This patch is against the latest SVN, r157, where satoshi made it much easier to hook up a custom miner in a clean way, including a no-nonsense hashes-per-second calculation. Great work there, satoshi! I have tested it on a test network, and at difficulty 1, doing ~6 MH/s, I was generating a block every couple of minutes, with a standard client accepting it.

This patch does a much better job than the previous one because the hashes are no longer copied out of the GPU, only the number of 0's in them. That makes parsing a few thousand hashes calculated in parallel by the GPU a breeze. Then, if a candidate is found, its nonce is passed to the Crypto++ hasher, which makes doubly sure the hash is valid. I also cleaned up the makefile (still only for OS X 64-bit) to make it easier for people to port to whatever system they have.

Now, if any of you has a CUDA or ATI Stream enabled PCI-e card that is gathering dust, consider donating it to me and I'll make this work on Linux. It can be old and slowish, I don't care. I just need it to be compatible so I can test.

Have fun, and keep the code open!

PS: the first miner generating is always the GPU, and I don't check whether you have a compatible card, I just assume. When I have the time, I'll change that. I'm pretty much giving up on hoping for someone else to pick up where I left off. Everyone developing this GPU thing seems to have an agenda.
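
The zero-count approach described in this post can be sketched on the host side as follows. This is an illustration under stated assumptions, not the patch's code: `leadingZeroBits` and `candidateNonces` are hypothetical names, the hash is assumed to be 32 bytes with the most significant byte last, and the per-nonce zero counts are assumed to arrive from the device as one small integer each.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: a 256-bit hash stored little-endian (byte 31 is most significant).
using Hash256 = std::array<std::uint8_t, 32>;

// What each GPU thread would report instead of the full hash: the number of
// leading zero bits, counted from the most significant byte down.
int leadingZeroBits(const Hash256& hash) {
    int bits = 0;
    for (int i = 31; i >= 0; --i) {
        if (hash[i] == 0) {
            bits += 8;
            continue;
        }
        // hash[i] is non-zero here, so this loop stops at its highest set bit.
        for (int mask = 0x80; (hash[i] & mask) == 0; mask >>= 1) {
            ++bits;
        }
        break;
    }
    return bits;
}

// Host-side scan: zeroCounts[i] is the count reported for baseNonce + i.
// Only nonces that reach the threshold are returned; they are candidates
// that should still be re-hashed on the CPU before being trusted.
std::vector<std::uint32_t> candidateNonces(
        const std::vector<std::uint16_t>& zeroCounts,
        std::uint32_t baseNonce,
        int requiredZeroBits) {
    std::vector<std::uint32_t> candidates;
    for (std::size_t i = 0; i < zeroCounts.size(); ++i) {
        if (zeroCounts[i] >= requiredZeroBits) {
            candidates.push_back(baseNonce + static_cast<std::uint32_t>(i));
        }
    }
    return candidates;
}
```

At difficulty 1 the target has 32 leading zero bits, so a threshold of 32 is a necessary but not sufficient condition. That is one reason the CPU re-check of each candidate nonce is still worth keeping.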