Bitcoin Forum
September 20, 2024, 07:34:07 PM *
  Show Posts
1601  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 11, 2012, 06:34:17 PM
I've just found a forum post where someone reported stable performance with 0.939V core voltage at stock speeds. Stock voltage is 1.170V so if we ignore the other components of the card that would mean a ~20% power reduction (or more, due to reduced temps), and therefore a ~25% efficiency gain (bringing the MH/J up from the 2.2-2.5 range to the 2.75-3.125 range). They also report a 7C temp drop which should let me reduce the fan speed from an annoying 60% to sub-50%, which is barely audible against the other case fans.

But this is all just theory until someone actually measures it, so off to find a BIOS that actually lets me lower the core voltage!

Actually, power scales with the square of the voltage.  A 20% undervolt is huge.  If true, one would expect more like a 35% drop in power.  Now let me hedge that by saying dropping core voltage doesn't drop mem voltage, so it is more like a 35% drop in CORE ONLY power.  Still, overall power savings should be >20% for a 20% undervolt.

If true, that could make an undervolted 7970 (especially if someone can figure out how to undervolt the memory, something I haven't been able to do on 5970) a hybrid between FPGA and GPU.

Power usage scales linearly with clock speed (i.e., going from 500 to 1000 MHz will double power usage), but with the square of voltage (using 100 watts at 1 volt? It will use about 156 watts at 1.25 volts).
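The rule of thumb in this exchange (power linear in clock, quadratic in voltage) can be sanity-checked with a few lines of Python. This is only a rough sketch: the function name `scaled_power` is made up for illustration, and it assumes core dynamic power is the only thing that changes, ignoring memory, VRM losses, and leakage.

```python
def scaled_power(base_watts, v_old, v_new, f_old=1.0, f_new=1.0):
    """Estimate power after a clock/voltage change, assuming P ~ f * V^2.

    Hypothetical helper: models core dynamic power only.
    """
    return base_watts * (f_new / f_old) * (v_new / v_old) ** 2


# Doubling the clock at constant voltage doubles power.
print(scaled_power(100, 1.0, 1.0, f_old=500, f_new=1000))

# Raising voltage from 1.0 V to 1.25 V at constant clock.
print(scaled_power(100, 1.0, 1.25))

# The undervolt quoted above: 1.170 V stock down to 0.939 V.
drop = 1 - scaled_power(1.0, 1.170, 0.939)
print(f"core power drop: {drop:.1%}")
```

Under those assumptions the 1.170V to 0.939V undervolt works out to a core power drop in the mid-30% range, in line with the estimate given in the reply.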
1602  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 11, 2012, 06:31:49 PM
So much for AMD's "Fold and mine faster than ever with AMD App Acceleration powered by the unprecedented 28nm GCN Architecture."

There is nothing false about their statement.  It looks like the 7970 IS the fastest mining card.  AMD made no claims in terms of efficiency (MH/W or MH/$).

I mean it is like buying the fastest sports car and then saying the company lied because it costs more and has worse gas mileage than a Honda Civic. Smiley

This. 7970 is the fastest GPGPU bar the 7990 coming out RSN. AMD hasn't misadvertised anything.
1603  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 11, 2012, 06:30:08 PM
Chapter 4.16 of the AMD OpenCL programming guide has some optimization guidelines for GCN. This snippet explains why my vectorization experiment didn't help performance:

Quote from: 4.16 Optimization Guidelines for Southern Islands GPUs
... "Vectorization is no longer needed, nor desirable." ...

Oh well, back to the drawing board!  Grin

I'm not going to believe that until I can prove it. Also, mining kernels are very abnormal compared to how kernels usually function; a miner is very much not a typical kernel. I'm glad they updated the programming guide for GCN though, it's very useful for intermediate OpenCL programmers.
1604  Bitcoin / Mining / Re: Want legit 7970 testing/benchmarking and tuning for cgminer and Diablominer? on: January 11, 2012, 06:29:01 PM
I wish I had stuck with assembly programming.. I totally rocked assembler on my Apple ][e  Smiley
 
But alas, life took me a different direction.. I'm thinking had I stuck with it, I could be squeezing some more performance out of a miner by redoing it in assembly. Smiley

The CPU miners are written heavily in assembly in some areas just to get maximum speed.
1605  Bitcoin / Mining / Re: Want legit 7970 testing/benchmarking and tuning for cgminer and Diablominer? on: January 11, 2012, 10:47:35 AM
BREAKING NEWS: I now have all the donations needed to grab that 7970, I should be ordering it within the next week give or take.

The last donation was big enough to help me cover that gap, and I think this wonderful person wishes to remain anonymous due to the size of the donation.

Gee, wish I got free stuff.

Spend hundreds of unpaid hours messing with internals of GPU programming (I guess at this point DiabloD3 knows parts of the ATI hardware better than the AMD engineers). Write one of the most used miners and a mining core that's used in other very popular miners as well. Also be supportive of it - answer questions, test bugs, do fixes and continue improving it until you squeeze another 1% performance, and another 1%, and another 1%. All this makes money for the miners using it and you're not getting paid.

Then you will get "free" stuff.

Not better than the engineers, not even close. The problem is, there's stuff that just isn't documented on how Radeons work, and most of what I know either came from ArtForz or from me throwing shit at the Radeon to see what would stick, and then mixing in some moon dust from there.

The GCN, though? Lemons.
1606  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 11, 2012, 09:13:10 AM
Wow, that's pretty impressive!  733 MH/s is really close to what my 5970 puts down, albeit the 7970 is drawing over 100W more power at that hashing rate.  Still though, that's really good for a single GPU. I can't wait to see what a 7990 will do!

5970 does 300w stock on gaming, 7970 does 250w stock on gaming.  At ~300 watts, the 7970@1225mhz is doing 716 mhash, the 5970@725 on SDK 2.5 is doing 646, 671 on magical SDK 2.1. The 5970 has the advantage of being undervolted over the rest of the 58xx family and being able to run SDK 2.1.

Given that, I think AMD has produced a very impressive chip.

I'm not sure where you got that "300W on gaming" figure, but I'm just going by my Kill-A-Watt meter. At idle my system draws about 140W. Full out mining (810/200/1.050V) @ 740 MH/s it draws 350W, which tells me (using 1onevvolfs math of mining-idle=card wattage) my 5970 is pulling about 210W -- about 100W less than the 7970.

294 and 250 are the official AMD quoted figures for maximum draw on 5970 and 7970. Mining uses less power due to parts of the chip shutting off (texture units, etc), and I suspect GCN has superior power savings over 58xx in that area; if not, they're very similar (i.e., the 5970 doesn't draw 294 while mining at stock speeds, and the 7970 falls short of 250 by about the same amount, give or take).
1607  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 11, 2012, 07:06:24 AM
Wow, that's pretty impressive!  733 MH/s is really close to what my 5970 puts down, albeit the 7970 is drawing over 100W more power at that hashing rate.  Still though, that's really good for a single GPU. I can't wait to see what a 7990 will do!

5970 does 300w stock on gaming, 7970 does 250w stock on gaming.  At ~300 watts, the 7970@1225mhz is doing 716 mhash, the 5970@725 on SDK 2.5 is doing 646, 671 on magical SDK 2.1. The 5970 has the advantage of being undervolted over the rest of the 58xx family and being able to run SDK 2.1.

Given that, I think AMD has produced a very impressive chip.
1608  Bitcoin / Mining / Re: Want legit 7970 testing/benchmarking and tuning for cgminer and Diablominer? on: January 11, 2012, 06:50:31 AM
BREAKING NEWS: I now have all the donations needed to grab that 7970, I should be ordering it within the next week give or take.

The last donation was big enough to help me cover that gap, and I think this wonderful person wishes to remain anonymous due to the size of the donation.
1609  Other / CPU/GPU Bitcoin mining hardware / Re: To overvolt or not to overvolt? on: January 11, 2012, 06:26:18 AM
Please excuse me if this has been discussed bef0re [and please excuse my 0's]. I did search and w0uld appreciate a link t0 appr0priate thread if it exists.

I'd appreciate if s0me0ne had #s sh0wing whether 0r n0t 0verv0lting is pr0fitable - ign0ring the extra wear 0n the card & p0tential pr0blems due t0 excess heat.


Ideally, if s0me0ne had p0wer draw & hash rate @ st0ck speeds, pwr draw & hash rate @ max stable cl0ck speed [w/0 0verv0lt], and pwr draw & hash rate @ max stable cl0ck speed and max stable 0perating v0ltage - that'd be great. As well, if s0me0ne has pwr draw & hash rate @ vari0us undercl0cks, that'd be all the m0re dandy.

If this data d0esn't exist, I'll d0 the testing myself in a few days when I'm back with my rigs [and fully-functi0nal keyb0ard!].

Dear lord, I have somehow traveled back in time to the 1990s! Damnit, Percival Dunwood!
1610  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 10, 2012, 11:24:10 PM
I can't wait for the 7990; it's going to be impressive but expensive  Sad. I might have missed this, but what is the heat like when hashing overclocked, and at what fan speed?

Overclocked @ 1125/975MHz with automatic fan speed I'm getting temperatures hovering around 81-83C, and the fan runs at 47-49% speed. You can see some screencaps on one of the earlier pages. But since I prefer lower temperatures and am worried about VRM and memory temps not yet being reported by GPU-Z, I usually run it at 60% fan speed and get temps around 72C. The blower fan at 60% speed is quite loud (it's a reference design from Sapphire).

At 100% fan speed, the overclocked card gets below 60C while mining but you can hear it from outside of the house at this point Tongue, so as lovely as these temps are this is not an option for me as it is also my gaming and work PC.

As a reminder, 100% fan speed is a good way to kill the fan; they were never meant to be run that high.

Don't go above 85%.
1611  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 10, 2012, 11:04:52 PM
I've measured my system and these are the results:

                          Stock (925/1375MHz)    Overclocked (1125/975MHz)
Mining                  : 371 W @ 550MH/s        385 W @ 670MH/s
Idle                    : 118 W                  118 W
Difference (gfx card W) : 253 W                  267 W
MH/J (system)           : 1.48                   1.74
MH/J (gfx card only)    : 2.17                   2.51
MH/$ (gfx card only)    : 1.00                   1.22

(MH/$ estimated using lowest listed price for HD 7970 on amazon.com today)
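For anyone who wants to redo these numbers with their own meter readings, here is a small Python sketch of the arithmetic behind the table. It uses the mining-minus-idle method for card wattage, and since a watt is a joule per second, MH/J is just MH/s divided by watts. The function name and the $550 price are assumptions: the price is not stated in the post and is only back-derived from the 1.00 MH/$ figure at 550 MH/s.

```python
ASSUMED_PRICE_USD = 550  # assumption: inferred from 550 MH/s at 1.00 MH/$


def efficiency(mining_w, idle_w, mhs, card_price_usd):
    """Derive the table's rows from wall-meter readings (hypothetical helper)."""
    card_w = mining_w - idle_w  # card draw estimated as mining minus idle
    return {
        "card_w": card_w,
        "mhj_system": round(mhs / mining_w, 2),
        "mhj_card": round(mhs / card_w, 2),
        "mh_per_usd": round(mhs / card_price_usd, 2),
    }


stock = efficiency(371, 118, 550, ASSUMED_PRICE_USD)
overclocked = efficiency(385, 118, 670, ASSUMED_PRICE_USD)
print("stock:      ", stock)
print("overclocked:", overclocked)
```

Note that the mining-minus-idle method slightly overstates card power, because the CPU and PSU losses also rise a little under load.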

Thanks for putting that all together.  For lack of a better term.... brutally bad.

Significant reduction in MH/W compared to 5000 series.

Not entirely. Remember, this card will need significant optimizations, and don't compare apples to oranges against 58xx if you're not using the same SDK. Nothing is going to beat 58xx on SDK 2.1, and you shouldn't expect anything that glorious ever again. It's a classic. That said, on SDK 2.5 with 58xx you lose about 4-5% give or take; dunno about 2.6, I still haven't quite figured out how to best fix that.
1612  Other / CPU/GPU Bitcoin mining hardware / Re: DiabloMiner GPU Miner (LP, BFI_INT, async nw, multipool, 79xx GCN) on: January 10, 2012, 02:22:20 PM
Is there any benefit to having multiples of 60?

Yeah. There are some things hardwired in the driver to work around 60hz (the refresh rate of virtually every LCD monitor on DVI/HDMI/Displayport), instead of just being triggered either after the last iteration is done, or on the actual refresh rate. So, multiples and divisors of 60 seem to give higher hash rates and/or give better desktop latency.
1613  Other / CPU/GPU Bitcoin mining hardware / Re: DiabloMiner GPU Miner (LP, BFI_INT, async nw, multipool, 79xx GCN) on: January 10, 2012, 11:20:48 AM
Is there any way to reduce the amount of graphics lag this causes in OS X? The latest version works perfectly on my iMac w/ a Radeon 4850 (getting 50 mhash), but it makes everything so laggy I can only run it when afk.

Currently the only flag I'm using is -w 64 as that's recommended in the OP.

Add -f 60 or a higher number (-f should always be a divisor or multiple of 60)

If that's anywhere close to -f settings in GUIminer, -f 60 lets me watch GPU-accelerated HD videos without a problem. Even -f 55 is ok.

Same goes for you, 55 isn't a divisor or multiple of 60. -f 60, 120, etc are nicer.
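The "divisor or multiple of 60" rule for -f is easy to check mechanically. The helper below is purely illustrative (`is_friendly_f` is a made-up name, not part of DiabloMiner) and only encodes the rule as stated in the replies above.

```python
def is_friendly_f(f, base=60):
    """True if f is a positive divisor or multiple of base (60 by default)."""
    if f <= 0:
        return False
    return base % f == 0 or f % base == 0


for f in (30, 55, 60, 120):
    verdict = "ok" if is_friendly_f(f) else "not a divisor or multiple of 60"
    print(f"-f {f}: {verdict}")
```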
1614  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 09, 2012, 11:24:37 PM
Code:
    int16 selection = XG2 == (x)(0x136032ED);
    if (any(selection))
    {
       x mask = Xnonce & 0xF;
       x temp = shuffle(select(Xnonce, 0, selection), mask);
       vstore16(temp, 0, output);
    }

That "if" might be totally unnecessary, and I still don't quite understand how the output array works, but it might give you a better idea of what I was trying to do to avoid all those branches.

I'll go add official 8 and 16 wide support in a bit, should be useful on, say, AVX if you manually enable CPU mining in the code. SDK 2.6's cpu compiler apparently has gotten a lot better from what I've heard.

I'll be watching the repository then Smiley It should almost definitely help with more modern CPUs and Larrabee/Intel MIC.

The output array is basically a massive hack to prevent multiple outputs from hitting each other, although the chances of getting multiple outputs is extremely low. The size of the array now is massive overkill, but it also seems to be a strangely optimum size for hardware.

Now, what would give me the most benefit is some way of sorting the outputs in a single cycle so that the pair of { nonce, H } could instantly give me the best nonce, and then only evaluate that. There seems to be no way to do this (and yes, I imply reverting that one bit of math so that H == 0 is literally done at the end again, makes it much easier to sort on shit). The nonces themselves can't be sorted because they're completely random; they're meaningless values essentially.
1615  Bitcoin / Mining / Re: We are important! AMD acknowledges Mining! on: January 09, 2012, 10:15:29 PM
The 7970 would be a horrible investment for bitcoin mining. Not even a good upgrade for gaming if you already have a decent card. Games only run so well.

Well, think of it like this. The best you can get on a 5870 at 188 watts at stock clocks is about 380 MH/s give or take (not on SDK 2.1, of course), and the best you can get out of a 7970 is theoretically over 600 MH/s at 200 or 250 watts. So if you either want to stick with a single GPU (you mine on the side), or you're at a slot premium (a bunch of mega miners at 4, 5, 6, whatever cards per box, and you have a couple of boxes) and it is cost prohibitive to buy more boxes (or you ran out of power and don't want to install more circuits), the 7970 still wins, especially once it starts dropping below $400.

I mean, it's either that, or you sell your soul to the FPGA people. FPGA just isn't right for everyone.
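To put rough numbers on that comparison, here is a quick Python sketch using only the figures quoted in this post (380 MH/s at 188 W for the 5870; a theoretical 600 MH/s at 200 to 250 W for the 7970). The helper names are made up, and the 6-card box is just one of the slot counts mentioned above.

```python
def mh_per_watt(mhs, watts):
    """Hash rate per watt; hypothetical helper for the comparison."""
    return mhs / watts


def box_hashrate(cards, mhs_per_card):
    """Total hash rate of one box with a fixed number of slots."""
    return cards * mhs_per_card


hd5870 = mh_per_watt(380, 188)
hd7970_best = mh_per_watt(600, 200)   # optimistic end of the quoted range
hd7970_worst = mh_per_watt(600, 250)  # pessimistic end of the quoted range

print(f"5870: {hd5870:.2f} MH/s per W")
print(f"7970: {hd7970_worst:.2f} to {hd7970_best:.2f} MH/s per W")

# Slot-limited case: same 6 slots, different cards.
gain = box_hashrate(6, 600) - box_hashrate(6, 380)
print(f"extra MH/s per 6-slot box: {gain}")
```

These are the post's theoretical 7970 figures, not measurements, so treat the result as an upper bound until real benchmarks settle.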
1616  Bitcoin / Mining software (miners) / Re: AMD Stream SDK 2.6 (Catalyst 11.12/12.1) - Get your performance back! (Phoenix) on: January 09, 2012, 10:07:58 PM
Will the CPU bug be gone with reversing to 2.1? How do I do that when I'm on 2.6 already? Heard all over the board going back isn't easy AT ALL.

Thx!

The CPU use bug currently is a flaw of the driver, not the SDK. I currently have no issues on 11.12 combined with any SDK, yet 10.7 through 11.10 cause me problems. SDK 2.2 and 2.3, however, had an identical CPU use bug and were the fault of the SDK (2.1 didn't do it, 2.2 and 2.3 do it with any driver revision); they may be identical bugs, but the source is different than the one people suffer from now. Don't get them confused.

On Windows, it is difficult to revert SDKs; on Linux it is easy. So, ymmv.
1617  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 09, 2012, 09:07:48 PM
Quote from: DiabloD3
I'll go add official 8 and 16 wide support in a bit, should be useful on, say, AVX if you manually enable CPU mining in the code. SDK 2.6's cpu compiler apparently has gotten a lot better from what I've heard.

So does that mean that is the best for 5870 cards ? Or stick to 2.1 or 2.4 ? I am quite confused as to what the best SDK / ati driver combo is ATM.

Notice I said CPU not GPU. CPU mining still sucks altogether. 2.1 is still best for 58xx cards.
1618  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 09, 2012, 08:30:21 PM
Wait wait wait. Are we sure uint16 is such a good idea? Last time I tried >4 (which was before 2.6, btw, I haven't tested with 2.6), it would crash in the compiler. Also, does anyone have a count on the number of registers per CU? There might not be enough registers to handle that.

I'm not sure if it's a good idea or not so I wanted to measure it Wink GCN has 64KB worth of registers per CU, and like you said I'm not sure if that's enough. The reason for my curiosity was that GCN's compute units each contain 4 SIMD units with a width of 16 elements (same size as Larrabee & Intel's MIC, coincidentally), and I recall reading somewhere that each of these SIMD units can retire one 16-way instruction every 4 cycles, so those 16-element vectors kind of jumped out at me. I also wanted to get familiar with the OpenCL bitcoin mining code and thought it would be a neat exercise (which it was!). Nice code by the way.

I can say for sure that 16-element vectors DO compile with the drivers that came with the card.

The -ds code dump for 16 element vectors came out nice and clean, although the last few lines where the result is stored in output seem a bit branchy. It looks something like this:

Code:
    if(XG2.s0 == 0x136032ED) { output[Xnonce.s0 & 0xF] = Xnonce.s0; }
    if(XG2.s1 == 0x136032ED) { output[Xnonce.s1 & 0xF] = Xnonce.s1; }
    if(XG2.s2 == 0x136032ED) { output[Xnonce.s2 & 0xF] = Xnonce.s2; }
    ...
    ...
    if(XG2.sd == 0x136032ED) { output[Xnonce.sd & 0xF] = Xnonce.sd; }
    if(XG2.se == 0x136032ED) { output[Xnonce.se & 0xF] = Xnonce.se; }
    if(XG2.sf == 0x136032ED) { output[Xnonce.sf & 0xF] = Xnonce.sf; }

I tried replacing it with a branch-less expression using shuffle() and vstore16() but haven't managed to get it working. What I've come up with looks something like this:

Code:
    x mask = Xnonce & 0xF;
    x temp = shuffle(select(Xnonce, 0, selection), mask);
    vstore16(temp, 0, output);

Anyhow I'm sure that my code modifications are doing all sorts of dumb things. I'm still learning how it all works so please ignore.

Also, check some of the larger -vs, -v 40 is two sets of uint4 and -v 44 does three uint4s (unlike cgminer, -v 4 does two uint2s).

I've tried all of the different -v settings available (according to the source) but haven't been able to get any higher than the 666MH/s with the default settings and 3 compute threads.

The branching has ended up becoming the best outcome. It can evaluate those branches in parallel, and you can't easily optimize away branches for memory writes (and there's apparently like 2 or 3 good tricks to get rid of branch waste, it's just none of them work on memory writes).

I should look at shuffle. Your way doesn't quite work though, vstore would output H != 0 hashes, which would trigger HW error alerts (and rightfully so) in the host code, and I consider the HW error tracking important. At least, assuming I'm reading that code right, anyways.
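A toy Python model may help show why the per-lane branches matter here. It is a sketch under several assumptions, none of which is the miner's actual code: the output buffer has 16 slots indexed by nonce & 0xF (as in the snippets above), the host treats any nonzero slot as a candidate nonce, and a whole-vector store is modelled as simply writing every lane.

```python
TARGET_G2 = 0x136032ED  # constant compared against XG2 in the snippets above
SLOTS = 16              # assumption: 16 output slots, indexed by nonce & 0xF


def branchy_store(g2_lanes, nonce_lanes):
    """Per-lane conditional store: only matching lanes touch the output."""
    output = [0] * SLOTS
    for g2, nonce in zip(g2_lanes, nonce_lanes):
        if g2 == TARGET_G2:
            output[nonce & 0xF] = nonce
    return output


def unconditional_store(nonce_lanes):
    """Simplified stand-in for a whole-vector store: every lane is written."""
    return list(nonce_lanes)


# 16 lanes, only lane 5 hits the target.
nonces = [0x1000 + i for i in range(16)]
g2 = [0] * 16
g2[5] = TARGET_G2

candidates_branchy = [n for n in branchy_store(g2, nonces) if n != 0]
candidates_vector = [n for n in unconditional_store(nonces) if n != 0]
print(len(candidates_branchy), "candidate(s) with per-lane branches")
print(len(candidates_vector), "candidate(s) with an unconditional store")
```

In this model the conditional version hands the host one candidate, while the unconditional one hands it every lane, and the host would have to reject the non-matching ones as bad hashes.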

I'll go add official 8 and 16 wide support in a bit, should be useful on, say, AVX if you manually enable CPU mining in the code. SDK 2.6's cpu compiler apparently has gotten a lot better from what I've heard.
1619  Other / Beginners & Help / Re: My initial Radeon HD 7970 mining benchmarks on: January 09, 2012, 07:51:59 PM
Hey OP, do you have a kill-a-watt you could purchase locally?  If you are in the states, Home Depot and Lowes carry them.  If you can find one locally I am sure we could get together the 3 or 4 BTC to get some accurate power readings.

The kill-a-watt brand doesn't appear to be sold here in Europe, and I've been searching for an equivalent device locally each time I've had a chance to head out to a store for the past couple of days, but no luck so far.

I also took a stab at modifying DiabloMiner and managed to get it to use 16-component vectors, which is what GCN is supposed to be tuned for, but performance isn't what I expected and it's really hard to profile/debug the Tahiti since I could not find any development tools that specifically support it yet.

BTW, they do make 240v/50hz euro Killawatts, but you might have to order it from the US. They also make 240v/60hz (double hot, like ovens and water heaters) ones and 208v ones for DC shit. Might have to look around. I love mine, it's been essential for planning stuff out.
1620  Bitcoin / Mining software (miners) / Re: AMD Stream SDK 2.6 (Catalyst 11.12/12.1) - Get your performance back! (Phoenix) on: January 09, 2012, 07:49:13 PM
Just switch to DiabloMiner or cgminer already.

Thanks for the troll!  Roll Eyes

cgminer hasn't implemented the fixes yet that allow me to get back to full speed after installing 12.1, so until that happens, I need to figure out a way to get this working right.

Fastest I can get out of cgminer is 265 Mhash/sec, whereas this fix gets me back up to the 290's. Additionally, the latest kernel from Diapolo won't compile in cgminer.

So any ACTUAL help is welcome.

Cheers! Grin Grin

Diapolo seems to be missing optimizations DiabloMiner has, and DiabloMiner works on 79xx (although in an unoptimized state). Some guy has already gotten 666 mhash out of his 7970 on it.

As for the 2.6 bug, on DiabloMiner, try -v 36 -w 64 with memory at full speed, or try your normal settings with memory at full speed. It should get at least most of it back. 2.6 seems to actually be using the hardware correctly, or at least doing something that requires much tighter memory latency.

Or, otoh, you can slink back to 2.1 or 2.5, the slowness is SDK related.