Bitcoin Forum
Author Topic: why would it help to lower memory speeds?  (Read 1913 times)
Desolator (OP)
Sr. Member
****
Offline

Activity: 392
Merit: 250



View Profile
July 21, 2011, 11:12:58 PM
 #1

Kinda wondering why everyone everywhere is suggesting lowering the memory speed on my graphics card (5830) from 1000 MHz to apparently 400 MHz.  What benefit would that possibly provide?  Performance?  I doubt lowering the RAM speed makes it run faster lol.
Heat?  Is it a performance-per-watt thing?  In a whole PC, the CPU is like 65 W on up and the RAM is like 6 W or something, so that doesn't sound right if it translates into graphics card land.

Doesn't it kinda need the fastest memory possible to run complex calculations like this?
JoelKatz
Legendary
*
Offline

Activity: 1596
Merit: 1012


Democracy is vulnerable to a 51% attack.


View Profile WWW
July 21, 2011, 11:15:15 PM
 #2

The calculations done for mining don't place any significant load on the memory at all. Once the parameters are loaded and the GPU is ready to go, about 200 billion calculations are done before there's any need to go to memory to store results or fetch more work.

A higher memory clock means more power drawn and more heat generated. That means higher fan speeds are needed, and thus reduced fan life and a greater chance of thermal throttling.
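A toy sketch of how small the mining working set is: the whole inner loop is double SHA-256 over an 80-byte header with an incrementing nonce, so everything fits in registers and on-chip storage. (This is illustrative Python, not GPU kernel code; the all-zero header prefix and the deliberately easy target are made up for the example.)

```python
import hashlib
import struct

def mine(header_prefix: bytes, target: int, max_nonce: int = 1 << 16):
    """Search nonces for a double-SHA-256 hash below `target`.

    The working set is just the 80-byte header and the 32-byte digest;
    nothing here touches more than a few hundred bytes of memory.
    """
    for nonce in range(max_nonce):
        header = header_prefix + struct.pack("<I", nonce)
        digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None

# Made-up 76-byte header prefix and an easy (1-in-256) target:
result = mine(b"\x00" * 76, target=1 << 248)
```

With a 1-in-256 target and 65,536 candidate nonces this essentially always finds a hit; the point is that between memory accesses the GPU can grind through an enormous number of these hashes.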

I am an employee of Ripple. Follow me on Twitter @JoelKatz
1Joe1Katzci1rFcsr9HH7SLuHVnDy2aihZ BM-NBM3FRExVJSJJamV9ccgyWvQfratUHgN
bcpokey
Hero Member
*****
Offline

Activity: 602
Merit: 500



View Profile
July 21, 2011, 11:16:33 PM
 #3

Read the dozens of threads and hundreds of posts already on this topic?

VRAM is essentially unused in hashing; lowering the clock speed reduces the heat generated and power consumed (the two go hand in hand). A GPU's architecture is not the same as a desktop computer's.
Desolator (OP)
Sr. Member
****
Offline

Activity: 392
Merit: 250



View Profile
July 22, 2011, 04:46:39 AM
 #4

Wow, I can't believe complex memory operations never hit the RAM.  Though, complexity is relative compared to graphics stuff.  It's probably the equivalent memory usage to calculating where one pixel of one shadow in a game should go.

You know, as a programmer, now I'm curious about the specifics.  Is it that the volume of memory transfers is crazy low cuz it's one single calculation at a time, so moving like 100 bytes per calculation is barely affected by memory speed?

Or is it more like the repetitive operations take place in the GPU equivalent of L2 cache? That sounds more like what you were saying.  That'd be pretty awesome :P  But since it's apparently like over 1000 individual processing cores or whatever, they'd each need some, right? And quite a sizeable bit of it, considering it's a hash and not adding 1 + 1.
Chris Acheson
Sr. Member
****
Offline

Activity: 266
Merit: 251


View Profile
July 22, 2011, 05:51:40 PM
 #5

Wow, I can't believe complex memory operations never hit the RAM.  Though, complexity is relative compared to graphics stuff.  It's probably the equivalent memory usage to calculating where one pixel of one shadow in a game should go.

You know, as a programmer, now I'm curious about the specifics.  Is it that the volume of memory transfers is crazy low cuz it's one single calculation at a time, so moving like 100 bytes per calculation is barely affected by memory speed?

Apparently the data being operated on is small enough to fit entirely within the GPU's registers.
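For a sense of scale, here is a back-of-the-envelope count of the SHA-256 working set (a sketch; actual register allocation is up to the GPU compiler, and many kernels keep only a 16-word rolling schedule, which is even smaller):

```python
# SHA-256 per-hash working state, in bytes:
STATE_WORDS = 8      # the a..h working variables, 32 bits each
SCHEDULE_WORDS = 64  # the W[0..63] message schedule, 32 bits each
BYTES_PER_WORD = 4

working_set = (STATE_WORDS + SCHEDULE_WORDS) * BYTES_PER_WORD
print(working_set)  # 288 bytes -- easily register-resident
```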
Starlightbreaker
Legendary
*
Offline

Activity: 1764
Merit: 1006



View Profile
July 22, 2011, 07:41:34 PM
 #6

Somehow, my 5830s like lower memory speeds. One prefers 225, the other one 275.

Anything higher than that, up to a certain point, just lowers the hash speed. Huh?

Desolator (OP)
Sr. Member
****
Offline

Activity: 392
Merit: 250



View Profile
July 22, 2011, 08:17:53 PM
 #7

My 5830 (or the driver) decided that 850 MHz passed the like five-second stabilization test built into the driver's software.  I was stepping it up slowly, and that's as far as I got before I had to leave; every step passed the test, though.  Today, nope, fail :P  In fact, it ran for like 24 hours straight at 850, but now 850 = test failure all of a sudden :P  Same with basically every frequency I tried, higher or lower; it was throwing out FAIL stamps left and right.  So I ignored it and set it to 900 anyway.  It's at 66C steady and ran for like 10 hours just fine, so take that, crappy indecisive, inconsistent driver :P

Between this and yours, umm... welcome to AMD graphics cards lol.  This is why everyone uses Nvidia for gaming lol.

Btw, the lowest setting in the driver's software was 900 MHz for the memory.  What's the utility called that people use to get Sapphire 5830s down to like 225-400?
marvinmartian
Full Member
***
Offline

Activity: 224
Merit: 100



View Profile
July 23, 2011, 04:48:48 AM
 #8

I've found (and you can too) that bumping memory clock most certainly boosts performance.  Just try it.

But ...

Bumping the core clock gives you more bang for the buck and costs less in terms of heat.  So the algorithm you want is:

First:
     bump core speed as high as it will stably go
Then:
     bump memory speed as high as you're comfortable with temperature-wise (and stable)

Now that being said, some people tweak cards by pushing core clocks beyond the normal ranges.  In most of those cases, you need to keep the memory clocks low or you'll overheat.

But in the ideal case (e.g., if you were running your machines in a basement somewhere in Antarctica), you'd crank everything to eleven.

"... and the geeks shall inherit the earth."
Xephan
Newbie
*
Offline

Activity: 42
Merit: 0


View Profile
July 23, 2011, 04:57:30 AM
 #9

I've found (and you can too) that bumping memory clock most certainly boosts performance.  Just try it.

Not enough to justify the cost, IMO. How many more Mhash do you get, and from bumping the memory clock by how much?

Dropping my memory clock saved me about 20+ W with no discernible performance loss. So given that most of us get at least 1.5 MH/s per watt, bumping the memory clock in my case would have to add more than 30 MH/s before it's profitable.
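That break-even arithmetic, spelled out (the 20 W and 1.5 MH/s-per-watt figures come from the post above; the rest just follows from them):

```python
watts_saved = 20.0  # power saved by underclocking the memory
efficiency = 1.5    # MH/s produced per watt on a typical rig

# Raising the memory clock costs `watts_saved` watts, so it must add at
# least this much hashrate before it beats simply underclocking:
break_even_mhs = watts_saved * efficiency
print(break_even_mhs)  # 30.0 MH/s
```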
Desolator (OP)
Sr. Member
****
Offline

Activity: 392
Merit: 250



View Profile
July 23, 2011, 05:50:07 AM
 #10

I dropped 'em from 1000 to 900 and watched extremely carefully under precisely the same controlled circumstances; even after letting it settle and adjust for 30 seconds, the hashrate dropped by an absolute maximum of 0.1 MH/s, if at all.  I didn't try increasing the memory clock, though.
Kermee
Full Member
***
Offline

Activity: 154
Merit: 100



View Profile
July 23, 2011, 06:47:40 AM
 #11

Kinda wondering why everyone everywhere is suggesting lowering the memory speed on my graphics card (5830) from 1000 MHz to apparently 400 MHz.  What benefit would that possibly provide?  Performance?  I doubt lowering the RAM speed makes it run faster lol.
Heat?  Is it a performance-per-watt thing?  In a whole PC, the CPU is like 65 W on up and the RAM is like 6 W or something, so that doesn't sound right if it translates into graphics card land.

Doesn't it kinda need the fastest memory possible to run complex calculations like this?

You're aware that VRAM is primarily used for two things:

* Frame buffering
* Textures

Neither of which is really used when doing complex math for hashing.

You would be surprised how much heat reduction you get when you underclock the memory.  I have three different 'brands' of 5830s in my rigs.  Two of the brands work fastest at 300 MHz, the other at 375 MHz.  You still have to tweak a bit to figure out the 'sweet spot'.

Underclocking the memory from the default clocks on my 5830s usually results in a ~10C drop in GPU temperature.  It may seem that 1 GB of GDDR5 on a 5830 shouldn't generate that much heat, but alas, it does.  Though it may be a red herring: underclocking the memory might also reduce power consumption in other components on the 5830.

Cheers,
Kermee
Desolator (OP)
Sr. Member
****
Offline

Activity: 392
Merit: 250



View Profile
July 23, 2011, 07:05:21 AM
 #12

I dunno the first thing about GPU structure, but if they're individual cores that can hold the hash calculation code, plus the hash results, plus the code that compares the results for a match, at a rate of up to like 400 million calculations a second, and do it without touching the memory, each core can't possibly have its own L2 cache, right?  I know all the data and instructions might fit in under 1 KB, let alone 1 MB of cache, but at 400 million a second on the fast cards, how fast can that EEPRAM or whatever it's called actually swap the last calculation's instructions out?

The entire card and all the processing cores on it must share one single bank of L2 (or at that point I guess L3) cache, huh?  And it must be gigantic.
Kermee
Full Member
***
Offline

Activity: 154
Merit: 100



View Profile
July 23, 2011, 07:10:23 AM
 #13

I dunno the first thing about GPU structure, but if they're individual cores that can hold the hash calculation code, plus the hash results, plus the code that compares the results for a match, at a rate of up to like 400 million calculations a second, and do it without touching the memory, each core can't possibly have its own L2 cache, right?  I know all the data and instructions might fit in under 1 KB, let alone 1 MB of cache, but at 400 million a second on the fast cards, how fast can that EEPRAM or whatever it's called actually swap the last calculation's instructions out?

The entire card and all the processing cores on it must share one single bank of L2 (or at that point I guess L3) cache, huh?  And it must be gigantic.

Bingo.

http://en.wikipedia.org/wiki/Evergreen_%28GPU_family%29

Each SIMD core is equipped with 32 kiB local data share and 8 kiB of L1 cache, while all SIMD cores share 64 kiB global data share. Each memory controller ties to two quad ROP units, one per 32-bit channel, and dedicated 128 kiB L2 cache.

It's more than enough...

Cheers,
Kermee
Xephan
Newbie
*
Offline

Activity: 42
Merit: 0


View Profile
July 23, 2011, 08:32:14 AM
 #14

You would be surprised how much heat reduction you get when you underclock the memory.  I have three different 'brands' of 5830s in my rigs.  Two of the brands work fastest at 300 MHz, the other at 375 MHz.  You still have to tweak a bit to figure out the 'sweet spot'.

Underclocking the memory from the default clocks on my 5830s usually results in a ~10C drop in GPU temperature.  It may seem that 1 GB of GDDR5 on a 5830 shouldn't generate that much heat, but alas, it does.  Though it may be a red herring: underclocking the memory might also reduce power consumption in other components on the 5830.

GDDR5 consumes about 0.5 W per Gbps of data rate per chip. There are 8 chips on a typical 58xx card running at 4 Gbps (1000 MHz QDR) each, so that's 8 × 4 × 0.5, or 16 W per card.
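That estimate as a small calculation (the 0.5 W/Gbps figure and 8-chip layout are taken from the post at face value; the 300 MHz underclocked case is a hypothetical point for comparison):

```python
W_PER_GBPS = 0.5  # rough GDDR5 power per Gbps of data rate, per chip
CHIPS = 8         # memory chips on a typical 58xx card

def mem_power(clock_mhz: float) -> float:
    """Estimated total GDDR5 power at a given memory clock (QDR)."""
    gbps_per_chip = clock_mhz * 4 / 1000  # QDR: 4 transfers per clock
    return CHIPS * gbps_per_chip * W_PER_GBPS

print(mem_power(1000))  # 16.0 W at stock
print(mem_power(300))   # 4.8 W underclocked -- roughly 11 W saved
```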
marvinmartian
Full Member
***
Offline

Activity: 224
Merit: 100



View Profile
July 23, 2011, 01:50:17 PM
 #15

Perhaps it's largely card specific:

* Current settings (5870)

Default Adapter - ATI Radeon HD 5800 Series
                            Core (MHz)    Memory (MHz)
           Current Clocks :    884           1287
             Current Peak :    884           1287
  Configurable Peak Range : [600-900]     [900-1300]
                 GPU load :    99%

Default Adapter - ATI Radeon HD 5800 Series
                  Sensor 0: Temperature - 70.50 C
Fan speed query:
Query Index: 0, Speed in percent
Result: Fan Speed: 74%

mhash 374.0/362.9 | accept: 68287 | reject: 721 | hw error: 127

--

* Memory clock dropped to 900 MHz

Default Adapter - ATI Radeon HD 5800 Series
                            Core (MHz)    Memory (MHz)
           Current Clocks :    884           900
             Current Peak :    884           900
  Configurable Peak Range : [600-900]     [900-1300]
                 GPU load :    0%

Default Adapter - ATI Radeon HD 5800 Series
                  Sensor 0: Temperature - 69.50 C
Fan speed query:
Query Index: 0, Speed in percent
Result: Fan Speed: 71%

mhash 366.0/362.9 | accept: 68305 | reject: 721 | hw error: 127

"... and the geeks shall inherit the earth."
Xephan
Newbie
*
Offline

Activity: 42
Merit: 0


View Profile
July 23, 2011, 02:28:28 PM
 #16

For comparison, on my 5870

939 / 313 / 1.125V / 70% / 81C / 424.5 MH/s / 435~439W

939 / 1200 / 1.125V / 70% / 87C / 408.8 MH/s / 466W~471W

939 / 600 / 1.125V / 70% / 83C / 411.6 MH/s / 450~453W


Ironically, it seems that on mine, lower memory clocks make it faster!
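Plugging those three readings into MH/s per watt (using the midpoint of each wattage range; the figures come from the post above, the arithmetic is just illustration):

```python
# (memory clock MHz, MH/s, watts low, watts high) from the readings above
readings = [
    (313, 424.5, 435, 439),
    (1200, 408.8, 466, 471),
    (600, 411.6, 450, 453),
]

for mem_clock, mhs, lo, hi in readings:
    watts = (lo + hi) / 2
    print(f"{mem_clock:>4} MHz: {mhs / watts:.3f} MH/s per watt")

# The 313 MHz setting wins on both raw hashrate and efficiency.
```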
marvinmartian
Full Member
***
Offline

Activity: 224
Merit: 100



View Profile
July 23, 2011, 02:33:43 PM
 #17

For comparison, on my 5870

939 / 313 / 1.125V / 70% / 81C / 424.5 MH/s / 435~439W

939 / 1200 / 1.125V / 70% / 87C / 408.8 MH/s / 466W~471W

939 / 600 / 1.125V / 70% / 83C / 411.6 MH/s / 450~453W


Ironically, it seems that on mine, lower memory clocks make it faster!


How are you getting your 5870s to go above (and below) the configurable ranges? 

My core clocks max out at 900.  Memory at 1200.  I have one card that was reflashed by the previous owner that goes to 1000 (core) and 1400 (mem).  But it locks up when I go above the normal configurable ranges of 900/1200.  Nor does that reflash let me go below the configurable memory clock range.

I'll guess your card is running a tick faster at 313 mem clock because the chip itself is running a bit cooler?  80+ C seems pretty hot to me regardless.

"... and the geeks shall inherit the earth."
Xephan
Newbie
*
Offline

Activity: 42
Merit: 0


View Profile
July 23, 2011, 02:41:07 PM
 #18

For comparison, on my 5870

939 / 313 / 1.125V / 70% / 81C / 424.5 MH/s / 435~439W

939 / 1200 / 1.125V / 70% / 87C / 408.8 MH/s / 466W~471W

939 / 600 / 1.125V / 70% / 83C / 411.6 MH/s / 450~453W


Ironically, it seems that on mine, lower memory clocks make it faster!


How are you getting your 5870s to go above (and below) the configurable ranges?  

My core clocks max out at 900.  Memory at 1200.  I have one card that was reflashed by the previous owner that goes to 1000 (core) and 1400 (mem).  But it locks up when I go above the normal configurable ranges of 900/1200.  Nor does that reflash let me go below the configurable memory clock range.

I'll guess your card is running a tick faster at 313 mem clock because the chip itself is running a bit cooler?  80+ C seems pretty hot to me regardless.

I'm using Sapphire's Trixx on Win7 to do that. It's running hot because I've got a 5850 warming the air under it :D
Not sure what platform you're on, but if you're on Linux you might be restricted to what AMD's official drivers allow; 900 MHz is the max for the core on my Catalyst as well.

I don't think being cooler by those few C is the issue, because I could lower the fan speed and still get the same performance:
939 / 313 / 55% / 86C / 423.4 MH/s

I suspect it's some memory latency and timing issue, but testing at a "mismatched" 939/300 also produces similar Mhash to 939/313, so I'm not quite sure what's causing this.

marvinmartian
Full Member
***
Offline

Activity: 224
Merit: 100



View Profile
July 24, 2011, 12:10:40 AM
 #19

I'm using Sapphire's Trixx on Win7 to do that. It's running hot because I've got a 5850 warming the air under it :D
Not sure what platform you're on, but if you're on Linux you might be restricted to what AMD's official drivers allow; 900 MHz is the max for the core on my Catalyst as well.

I don't think being cooler by those few C is the issue, because I could lower the fan speed and still get the same performance:
939 / 313 / 55% / 86C / 423.4 MH/s

I suspect it's some memory latency and timing issue, but testing at a "mismatched" 939/300 also produces similar Mhash to 939/313, so I'm not quite sure what's causing this.

Yup, I'm a Linux shop over here.  Wonder if, were I to boot into Win7 and muck with the clocks, the settings would stick.  Or... if the ATI Linux drivers would even let me go there.

I think you're right about matched and mismatched clocks.  Some alignments of speeds work better than others.  Even running "in the zone" I find sweet spots.

"... and the geeks shall inherit the earth."
PcChip
Sr. Member
****
Offline

Activity: 418
Merit: 250


View Profile
July 24, 2011, 12:43:54 AM
 #20

It's true that lowering the RAM speed decreases heat output (the GPU core temp drops instantly, actually), but that doesn't answer the original question of WHY it makes the hashing faster.

I have one theory as to why lowering GDDR5 clocks speeds up hashing:

Remembering the old P35 chipset days: the northbridge chip had different "latches", which were basically latency settings, because the northbridge on those motherboards might run anywhere from 266 MHz all the way up to 500+ MHz, so a single latency setting for the CPU -> northbridge and northbridge -> RAM links would not work.  Most motherboards picked the latch automatically based on the FSB you chose, and you ended up with strange situations where, if you graphed performance vs. FSB speed, you would see dips as the FSB went up (e.g. 450 MHz FSB sometimes performed worse than 440 MHz FSB).  Note that a few motherboards actually let you select the latch setting manually.

Therefore, perhaps the part of the GPU core that communicates with the GDDR5 RAM automatically negotiates the latency based on the speed at which the RAM is clocked.  If so, then perhaps SHA-256 hashing doesn't require high bandwidth from the RAM, so lowering the speed to 300 MHz (which is still plenty of bandwidth for hashing) might engage an extra-low latency setting that gives the hashing functions quicker access to RAM.
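That hypothesis can be put into a toy model. The cycle counts below are invented purely for illustration (real GDDR5 latency tables aren't published in these terms): if the memory controller pairs a lower clock with a proportionally lower cycle-count latency setting, the absolute latency in nanoseconds can come out lower at the slower clock.

```python
def access_latency_ns(clock_mhz: float, latency_cycles: int) -> float:
    """Absolute access latency for a given clock and cycle count."""
    return latency_cycles / clock_mhz * 1000  # ns

# Hypothetical latch settings, made up for illustration only:
fast_clock = access_latency_ns(1200, latency_cycles=20)  # ~16.7 ns
slow_clock = access_latency_ns(300, latency_cycles=4)    # ~13.3 ns

# Under these made-up numbers the underclocked memory answers sooner,
# which is what would matter for a latency-bound (not bandwidth-bound)
# workload like hashing.
print(fast_clock, slow_clock)
```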


Legacy signature from 2011: 
All rates with Phoenix 1.50 / PhatK
5850 - 400 MH/s  |  5850 - 355 MH/s | 5830 - 310 MH/s  |  GTX570 - 115 MH/s | 5770 - 210 MH/s | 5770 - 200 MH/s