Bitcoin Forum
Author Topic: R600 Bitcoin Miner (Warning: Exceedingly impractical)  (Read 17045 times)
Decade (OP)
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
August 28, 2011, 09:53:42 AM
Last edit: September 11, 2011, 09:57:36 AM by Decade
 #1

So, for the fun of it, and because I don't have any newer graphics cards, I've modified m0mchil's poclbm into something that works on R600 video cards. These are the Radeon HD 2000 and 3000 series. I never bothered to rename it.

64-bit Windows binary http://dl.dropbox.com/u/38079179/poclbm-ati-brook-x64.zip

Source code http://dl.dropbox.com/u/38079179/poclbm-ati-brook-src.zip

Some of the hints on how to build the source code are in the comments on the bottom of the C++ files. To use the Python source, you need some current version of Python 2, Numpy, and Boost::Python. To use the Brook+ source, you need ATI Stream SDK 1.4 beta and some C++ compiler, and some common sense.

Let's discuss the reasons not to mine on an R600.

  • By now, they are fairly old. Not all of them were that fast, either. I think my HP notebook is thermally throttling the video card.
  • They don't support the gather/scatter I/O operations that the OpenCL miners use to minimize memory and bandwidth use.
  • They don't have the single-instruction rotate operation that makes the modern Radeons so much faster than the modern GeForces. Nor do they have the single-instruction bitselect operation that AMD strangely doesn't expose to any APIs.
  • Shifting is extremely inefficient, only 1/5 the speed of normal operations. That's because the ATI design has 4 simple ALUs and 1 "Transcendental" ALU in each processor. In the R600, only the T-unit does integer multiplies and logical shifts. (All of them do floating-point multiplies.) With the number of shifts in SHA-256, most of the time the simple ALUs sit around waiting for the T-unit.
  • As a result, my video card (Mobility Radeon HD 3410) crunches through fewer hashes than my CPU (AMD Turion Neo X2 L625). If I didn't mangle poclbm too badly, it estimates that it crunches at a rate of roughly 0.850 MH/s.
  • This miner doesn't adjust its work size. That's because of point 2 above, so adjusting work size involves destroying and allocating buffers. For some reason, it's really slow on my computer, taking up to several seconds. This is not something that can be done several times per second.
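
To make the rotate and bitselect points concrete, here is a minimal Python sketch (not code from the miner) of what SHA-256 has to do when neither operation exists as a single instruction. The helper names `rotr32`, `ch_plain`, and `ch_select` and the two shift-count constants are made up for illustration. The counts follow from the SHA-256 definition, assuming every rotate is emulated with two shifts and an OR.

```python
MASK32 = 0xFFFFFFFF

def rotr32(x, n):
    """Rotate right by n (1..31), built from two shifts and an OR."""
    return ((x >> n) | (x << (32 - n))) & MASK32

def ch_plain(e, f, g):
    """SHA-256 Ch as written in the standard: AND, NOT, AND, XOR."""
    return ((e & f) ^ (~e & g)) & MASK32

def ch_select(e, f, g):
    """Same result with three operations; a bitselect would need only one."""
    return (g ^ (e & (f ^ g))) & MASK32

# Shifts per step when a rotate costs two shifts (assumption stated above).
SHIFTS_PER_ROUND = 2 * 6              # Sigma0 and Sigma1 use three rotates each
SHIFTS_PER_SCHEDULE_WORD = 2 * 4 + 2  # sigma0 and sigma1: two rotates plus one shift each

print(hex(rotr32(0x80000001, 1)))
print(SHIFTS_PER_ROUND, SHIFTS_PER_SCHEDULE_WORD)
```

If only one unit in five can execute a shift, those 12 shifts per round are what the other four units end up waiting on.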

The AMD KernelAnalyzer says this SHA-256 kernel should operate at a rate of 5M threads/sec on a Radeon HD 2900, or 57M threads/sec on a Radeon HD 6970. Clearly, this is less efficient than the OpenCL versions on the GPUs that can run OpenCL. I may also have made horrible mistakes in modifying the source code, especially in BitcoinMiner.py.

In retrospect, when I saw that Stream SDK was 64-bit, I should have installed the 32-bit Stream SDK, instead of installing the 64-bit Python. And, when I saw that Klöckner had used Boost::Python for PyOpenCL, I should have ignored that and used the raw Python C API.

Also, while destroying and creating buffers is extremely slow, I don't see why it wouldn't work to allocate several buffers up front and switch between them according to load. I expect that a dozen buffers should take only a few MB, and my video card has 512MB total. Something to do later, maybe, if the payoff weren't so low.
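
A rough sketch of that buffer-switching idea in plain Python, with bytearrays standing in for GPU buffers. The class name `BufferPool`, the power-of-two work sizes, and the 4 bytes per work item are all assumptions for illustration; none of them come from the miner's source.

```python
class BufferPool:
    """Allocate one buffer per work size up front, then pick by load."""

    def __init__(self, sizes, bytes_per_item=4):
        # Stand-ins for GPU buffers: each is allocated exactly once, here.
        self.sizes = sorted(sizes)
        self.buffers = {n: bytearray(n * bytes_per_item) for n in self.sizes}

    def total_bytes(self):
        """Memory held by all preallocated buffers together."""
        return sum(len(buf) for buf in self.buffers.values())

    def pick(self, wanted_items):
        """Return (size, buffer) for the largest size not above the request.

        Falls back to the smallest buffer if the request is below every size.
        """
        fitting = [n for n in self.sizes if n <= wanted_items]
        size = fitting[-1] if fitting else self.sizes[0]
        return size, self.buffers[size]

# A dozen hypothetical work sizes: 256, 512, ..., 524288 items.
pool = BufferPool([2 ** k for k in range(8, 20)])
print(len(pool.buffers), "buffers,", pool.total_bytes(), "bytes")
print("work size for a request of 1000 items:", pool.pick(1000)[0])
```

With these made-up sizes the whole pool comes to about 4 MB, which is at least consistent with the "few MB" guess. Changing work size then means picking another entry instead of destroying and allocating a buffer.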

Anyway, I think I can be convinced to produce a 32-bit binary for the low, low price of 5 BTC. Smiley

You can send me tips if you feel like it, too.
19gShNE2sdo9NP7N3kTYPjkqQ6ukPCP8jH
niooron
Full Member
***
Offline Offline

Activity: 193
Merit: 100


View Profile
September 08, 2011, 02:17:33 PM
 #2

So the old radeons are slower than current nvidia cards? I sold my radeon a long time ago, so I can't test it.
Remember remember the 5th of November
Legendary
*
Offline Offline

Activity: 1862
Merit: 1011

Reverse engineer from time to time


View Profile
September 08, 2011, 03:16:44 PM
 #3

Question decade. Can you make it work for an IGP? I.e an integrated HD4200(which people say is actually an HD3k igp)

BTC:1AiCRMxgf1ptVQwx6hDuKMu4f7F27QmJC2
FalconFour
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile WWW
September 08, 2011, 03:38:42 PM
 #4

Well...

- I have 2 ATI cards in my known technology collection (amongst about 7-10 nVidias, about 6 of which are capable of mining - at 2-5 Mhash/sec).
-- One of which is the 6770 that I use for mining 24/7.
-- The other is a 3700, if I remember correctly.
- I'd really like some freedom to play with other pools.
- I've only ever made a grand total of about 0.6 Bitcoin in the entire ~3 weeks of 24/7 mining on the 6770.
-- Low, low price of 5 BTC my ass? Wink I think it'd take an eternity to mine that back...

If the miner goes faster on the 3700, than the ~20 Mhash/sec that my horrifyingly over-powered (and now blown-out, evidently) 8800GTS produces, I'd love to play around with it... even if it's stupidly inefficient, you'd really just need to step back and look at how bad nVidias are that people still try to mine with Wink

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)
Decade (OP)
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
September 11, 2011, 10:22:06 AM
 #5

  • This miner doesn't adjust its work size. That's because of point 1 above...
Oops, I rearranged and forgot to renumber. Fixed.

So the old radeons are slower than current nvidia cards? I sold my radeon a long time ago, so I can't test it.
Probably. Bullet points 3 and 4. Bullet point 3 is a reference to the Bitcoin wiki. https://en.bitcoin.it/wiki/Why_a_GPU_mines_faster_than_a_CPU#Why_are_AMD_GPUs_faster_than_Nvidia_GPUs.3F

Essentially, by the same measure that a modern Radeon has a 1.7x advantage over a modern GeForce, the older Radeon does not have that advantage, and in fact has a sloppily estimated 3.5x disadvantage. And that doesn't account for manufacturing process or runtime differences.

Question decade. Can you make it work for an IGP? I.e an integrated HD4200(which people say is actually an HD3k igp)
I don't know. I don't have a Radeon IGP. AMD's spec thing says the Radeon 4200 has the 40 unified shaders supporting the ATI Stream Technology, so it should already work with 64-bit Windows. For the low, low price of 5 BTC, I might be convinced to try to make it work on other platforms, too. I doubt you could recover even the costs of electricity with that chip.

-- Low, low price of 5 BTC my ass? Wink I think it'd take an eternity to mine that back...

If the miner goes faster on the 3700, than the ~20 Mhash/sec that my horrifyingly over-powered (and now blown-out, evidently) 8800GTS produces, I'd love to play around with it... even if it's stupidly inefficient, you'd really just need to step back and look at how bad nVidias are that people still try to mine with Wink
I'm trying to consider the value of my time, here. And while the difficulty of mining increases, the exchange rate decreases, so I think even 5 BTC is only worth it for playing around. Instead, I give you my source code, which is everything you need to try it on your own, except for the links to the SDKs, and however long it takes to learn how to work those tools.

KernelAnalyzer says the SHA-256 kernel should do 6M threads/sec on a Radeon HD 3870, so I doubt that it will do better than a GeForce 8800GTS. Unless I also made horrible mistakes in writing the kernel.

Oh, yeah, for reference, I'm using ATI Stream SDK 1.4.0 beta, Visual Studio 2008, Python 2.7.2, NumPy 1.6.1, and Boost 1.47.0.
dikidera
Full Member
***
Offline Offline

Activity: 126
Merit: 100


View Profile
September 11, 2011, 12:01:27 PM
 #6

Can you convert 6 mill threads to mh/s?

Also, these 5 bitcoins...5 bitcoins per person, or it could be small donations from everyone to make 5 bitcoins?
Decade (OP)
Newbie
*
Offline Offline

Activity: 4
Merit: 0


View Profile
September 13, 2011, 08:51:26 AM
 #7

Can you convert 6 mill threads to mh/s?
No. I don't have any of the GPUs in KernelAnalyzer's group of simulated GPUs. I have only this Radeon HD 3410. I assume it's roughly 4 hashes per thread, so 6 Mthreads/s approximates 24 MH/s. Probably less because of runtime overhead.
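
Spelled out as arithmetic, using the KernelAnalyzer thread rates quoted in this topic. The factor of 4 hashes per thread is the assumption stated above, not a measured value, and the function name `estimated_mhs` is made up for this sketch.

```python
def estimated_mhs(mthreads_per_sec, hashes_per_thread=4):
    """Upper-bound hash rate in MH/s, ignoring runtime overhead."""
    return mthreads_per_sec * hashes_per_thread

# KernelAnalyzer estimates mentioned in the thread, in Mthreads/s.
for name, rate in [("Radeon HD 2900", 5), ("Radeon HD 3870", 6), ("Radeon HD 6970", 57)]:
    print(name, estimated_mhs(rate), "MH/s at most")
```

These are simulated ceilings, so real numbers should come out lower.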

Also, these 5 bitcoins...5 bitcoins per person, or it could be small donations from everyone to make 5 bitcoins?
Let's say 5 BTC per platform.
FalconFour
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile WWW
September 27, 2011, 03:01:32 AM
 #8

Okay, fine. If you can do it, count me in for 3 BTC. Anyone else want to cough up the missing 2?

I could really use this at the office. Best card I've got in my desk PC is an nVidia 8600GT that bakes at 83 C crunching out 3-4 Mhash/sec, or an nVidia Ion with an Atom that, if overclocked, crunches 7 Mhash/sec. And I've got a couple cards lying around gathering dust that could be crunching 20? Sheez. It's worth it!

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)
rTech
Sr. Member
****
Offline Offline

Activity: 305
Merit: 250


Trust but confirm!


View Profile
September 27, 2011, 04:59:02 PM
Last edit: September 29, 2011, 01:51:08 PM by rTech
 #9

Well, I can donate 0.5 BTC if I can get a miner for my RV620 (ATI HD 3450) Smiley
I'm just curious to see if I can use my server for mining in its idle time Cheesy

So what do I need to have/do/adjust? Smiley Guide me through it here Cheesy
FalconFour
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile WWW
September 27, 2011, 05:16:03 PM
 #10

@rTech: I think we just need to wait for the remaining 1.5 BTC to be volunteered Smiley

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)
rTech
Sr. Member
****
Offline Offline

Activity: 305
Merit: 250


Trust but confirm!


View Profile
September 27, 2011, 05:26:13 PM
Last edit: September 29, 2011, 01:51:23 PM by rTech
 #11

OK, I'll pay the remaining 2 BTC if you pay the 3 Smiley

But I have to be sure we get a working one!
FalconFour
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile WWW
September 27, 2011, 06:30:26 PM
 #12

Yep, I'm in for 3, so if you're in for 2, there's our 5 Bitcoin Smiley

Yo, Decade! Cheesy

edit: Just checked on the GPU I'll be looking at using this on, and it's a Radeon HD 3600. Just FYI.

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)
rTech
Sr. Member
****
Offline Offline

Activity: 305
Merit: 250


Trust but confirm!


View Profile
September 28, 2011, 06:24:41 PM
 #13

I actually managed to get the 64-bit version to work with my ASUS EAH 3450 256MB.

1.1 Mhash/s is the neat result; I'll post pictures later.


FalconFour
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile WWW
September 28, 2011, 06:26:12 PM
 #14

Wait, wait. I don't think you're mining with your GPU there. What's your CPU usage? What's the miner? If you have AMD-APP installed, it'll emulate GPU-processing using the CPU if you don't select the right device...

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)
gat3way
Sr. Member
****
Offline Offline

Activity: 256
Merit: 250


View Profile
September 28, 2011, 08:39:53 PM
 #15

Brook+ was deprecated in the ATI Stream (now AMD APP) SDK a long time ago. You can think of it as a kind of OpenCL predecessor. It was the framework I used when I got into GPGPU development. It looks like a very, very limited C++, rather brain-damaging. Common things like arrays were not allowed in kernels. No local memory, no barriers, limited scatter-gather that was performed in reduction steps, and stuff like that; it's like a nightmare from the past. Luckily for me, I quickly switched to OpenCL, which was something new back then and was also very badly supported. And no, Brook+ is not a heterogeneous compute environment; it does not support CPUs. It also has nothing to do with OpenCL.

Actually, I doubt recent AMD APP SDKs even ship with brcc. The headers have been gone for a long time. Well, even CAL/IL is being deprecated nowadays.
PLaci1982
Full Member
***
Offline Offline

Activity: 168
Merit: 100


Live long and prosper. \\//,


View Profile
September 28, 2011, 08:41:58 PM
 #16

I actually managed to get the 64-bit version to work with my ASUS EAH 3450 256MB.

1.1 Mhash/s is the neat result; I'll post pictures later.
Wait, wait. I don't think you're mining with your GPU there. What's your CPU usage? What's the miner? If you have AMD-APP installed, it'll emulate GPU-processing using the CPU if you don't select the right device...

Did you read the 1st post?

As a result, my video card (Mobility Radeon HD 3410) crunches through fewer hashes than my CPU (AMD Turion Neo X2 L625). If I didn't mangle poclbm too badly, it estimates that it crunches at a rate of roughly 0.850 MH/s.

Hardware Expert / WinXP, Win7 Expert

1J5oPkyGVdb4mv44KGZQYsHS2ch6e1t4rc
rTech
Sr. Member
****
Offline Offline

Activity: 305
Merit: 250


Trust but confirm!


View Profile
September 28, 2011, 10:24:06 PM
 #17



Irrelevant info:
I use this PC as a game server, and it has an Intel E2140 overclocked to 2.4 GHz.
I have tested the ufasoft CPU miner and it gives 5.7 Mhash/s at 73 C.
FalconFour
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile WWW
September 29, 2011, 02:14:40 AM
 #18


That's worse than nVidia mining...

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)
rTech
Sr. Member
****
Offline Offline

Activity: 305
Merit: 250


Trust but confirm!


View Profile
September 29, 2011, 02:36:31 AM
Last edit: September 29, 2011, 02:49:08 AM by rTech
 #19

Yeah, it's already mentioned in the OP's title Smiley  It's still fun to test new things.

Anyway, there is no point in using the R6xx series for mining.

The max I got out of my HD 3450 was 1.226 Mhash/s, and that was with the core overclocked to 700 MHz, stable at 48 C.

Before you ask, my 3450 is a low-profile passive model.
FalconFour
Full Member
***
Offline Offline

Activity: 176
Merit: 100



View Profile WWW
September 29, 2011, 04:06:31 AM
 #20

You keep saying things to the effect of, "hey dumbass, can't you read?"... and to answer that, yes, I'm not a fucking moron, unlike most of the twats around here, I actually read the posts before I reply. Most of the time. Sometimes I jump in with an "ooh shiney!" effect, but typically I'm the guy slapping everyone straight that didn't read the whole topic.

Case in point... hey dumbass, can't you read?
Can you convert 6 mill threads to mh/s?
No. I don't have any of the GPUs in KernelAnalyzer's group of simulated GPUs. I have only this Radeon HD 3410. I assume it's roughly 4 hashes per thread, so 6 Mthreads/s approximates 24 MH/s. Probably less because of runtime overhead.

Also, these 5 bitcoins...5 bitcoins per person, or it could be small donations from everyone to make 5 bitcoins?
Let's say 5 BTC per platform.

And for the attention-impaired...
No. I don't have any of the GPUs in KernelAnalyzer's group of simulated GPUs. I have only this Radeon HD 3410. I assume it's roughly 4 hashes per thread, so 6 Mthreads/s approximates 24 MH/s. Probably less because of runtime overhead.

Far fucking cry from the CPU- or nVidia-like 1-point-something hashes a sec.

Additionally, these hashing functions used in kernels like phatk and poclbm seem to be centered around a small handful of custom functions. So to port the mining operation to a new language with a stripped down function set, it should STILL be about as simple as writing those "core" functions (rotate, math, etc) in an optimized way in the new language, and adapting all the function calls of the original code to work with the new language. And before you get started, yes, I DO know what I'm speaking of here (that is, I "program" in several languages, but I'm not a professional "programmer").
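
As a rough illustration of that "small handful of core functions" idea, here is a self-contained Python sketch of SHA-256 in which the only bit-level primitive is an emulated rotate, checked against Python's `hashlib`. This is not the phatk or poclbm kernel, and not the Brook+ code from the first post. The helper names (`rotr`, `icbrt`, `first_primes`, `compress`, `sha256`) are made up here, and the all-zero 80-byte header is a dummy.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

def rotr(x, n):
    """The one core primitive: rotate right from two shifts and an OR."""
    return ((x >> n) | (x << (32 - n))) & MASK

def icbrt(n):
    """Integer cube root by binary search (exact, no floating point)."""
    lo, hi = 0, 1 << 44
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

PRIMES = first_primes(64)
# Round constants and initial state: fractional bits of cube and square roots.
K = [icbrt(p << 96) & MASK for p in PRIMES]
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]

def compress(state, block):
    """One SHA-256 compression of a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = g ^ (e & (f ^ g))
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) | (c & (a | b))
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def sha256(data):
    """Pad the message, run every block through compress, return 32 bytes."""
    pad = b"\x80" + b"\x00" * ((55 - len(data)) % 64)
    padded = data + pad + struct.pack(">Q", 8 * len(data))
    state = H0
    for off in range(0, len(padded), 64):
        state = compress(state, padded[off:off + 64])
    return struct.pack(">8I", *state)

header = bytes(80)  # dummy 80-byte block header, all zeros
double = sha256(sha256(header))
print(double == hashlib.sha256(hashlib.sha256(header).digest()).digest())
```

Everything outside `rotr` is adds, ANDs, ORs, XORs, and plain shifts, so the porting work really does concentrate in a few helpers. Whether the result runs fast on a given GPU is a separate question.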

Plus, I would REALLY, FUCKING REALLY love to know what you're doing to even be able to write statements like "Max I got out of my HD 3450 was 1.226 mhash"... when Decade - the guy WRITING the first-and-only miner for these older GPUs - hasn't even fucking REPLIED to this thread yet since our offers. So there's no conceivable way (unless I'm getting raped here) that you could have your hands on the software we're talking about paying to develop here...

edit: wate. what the hell? i swear to christ that link wasn't up there in OP before. Now there's a binary and source? I don't see how I could've skimmed past that - only thing I plan on using this on is Win7 x64. Shit. Well, I was under the impression it hadn't even been made usable yet, and I guess it hadn't... with all the optimizations being considered so far, and the potential of >=20MHPS speeds on junker GPUs, I still think it's worth the investment in optimizing!

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)