Bitcoin Forum
Author Topic: More potential for 7970?  (Read 1782 times)
Hexadecibel (OP)
Human Intranet Liason
VIP
Hero Member
Activity: 571
Merit: 504
I still <3 u Satoshi
January 13, 2012, 06:07:15 AM
 #1

I'm getting 668 MH/s out of my 7970 @ 1125 MHz.

Am I to understand that none of the software out there for mining is taking advantage of the new GCN arch?

and if so, could I expect more performance out of my card?

yay first post.
jake262144
Full Member
Activity: 210
Merit: 100
January 13, 2012, 10:58:44 AM
Last edit: March 27, 2012, 11:10:01 PM by jake262144
 #2

DiabloMiner is being optimized for GCN (https://bitcointalk.org/index.php?topic=1721.0).
With time, the drivers will be refined as well. There should be plenty of unused potential in the new architecture.

Check these threads out:
https://bitcointalk.org/index.php?topic=57410.0
https://bitcointalk.org/index.php?topic=56630.0

Mind you, it's not pure performance that matters but power efficiency (performance for any given power usage).
1onevvolf
Newbie
Activity: 43
Merit: 0
January 17, 2012, 11:38:20 AM
Last edit: January 17, 2012, 05:46:22 PM by 1onevvolf
 #3

Am I to understand that none of the software out there for mining is taking advantage of the new GCN arch?

and if so, could I expect more performance out of my card?

Short answer: Yes, slightly better performance is possible.

Long answer: You can expect a little more performance, but unless there's a detail I'm not aware of there is really not much left to gain (1-2% ideally). Let me explain why.

To the best of my knowledge, the closest estimate of the number of mathematical operations required to compute 1 hash is ~3375 (according to Phateus). And if we consider an ideally efficient processor to be one that computes mathematical operations at a rate of one operation per cycle, then hashing would take ~3375 cycles on this ideally efficient processor.

Now let's take a look at what kind of performance we can measure with today's kernels. The 7970 has 2048 stream processors and a stock frequency of 925MHz, and with the best known kernels it computes 550MH/s. Knowing this, we can estimate the average number of cycles each stream processor takes to compute one hash using the following equation:

Code:
               Stream Processor Count x GPU Frequency    2048 x 925MHz
Cycles/Hash =  -------------------------------------- =  ------------- = ~3444 cycles
                         Hashes per second                  550 MH/s
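
For anyone who wants to plug in their own card, the equation above can be sketched in a few lines of Python. The function name and argument names are illustrative choices, not taken from any mining tool, and the sketch assumes the clock is given in MHz and the hashrate in MH/s so that the two factors of one million cancel.

```python
def cycles_per_hash(stream_processors: int, core_mhz: float, mhash_per_s: float) -> float:
    """Average stream-processor cycles spent per hash.

    core_mhz is in MHz and mhash_per_s is in MH/s; both carry a factor
    of 1e6, so the factors cancel and no unit conversion is needed.
    """
    return stream_processors * core_mhz / mhash_per_s


# 7970 figures quoted in the post: 2048 SPs, 925 MHz stock, 550 MH/s.
hd7970 = cycles_per_hash(2048, 925, 550)
print(f"7970: ~{hd7970:.0f} cycles/hash")
```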

Now if we consider that each stream processor can at best perform one ALU instruction per cycle, then the 7970 is extremely efficient (in cycles per hash), since this 3444-cycle measurement is really close to the ideal value of 3375 cycles at one instruction per cycle. This is only a ~2% difference from ideal and might even be due to measurement error. It's so efficient that, to the best of my knowledge, ~550MH/s at stock clocks is pretty much all we're ever going to get, unless one of three things happens: there is a breakthrough that reduces the number of operations required per hash, there's some new GCN instruction I'm unaware of that lets the GPU compute several steps of the hashing function in one cycle, or kernels are modified to start taking advantage of fixed-function hardware somehow.
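
Turning the same equation around gives a rough hashrate ceiling for a given clock. This is only a sketch under two assumptions: that ~3375 operations per hash (Phateus's estimate quoted above) is the true minimum, and that hashrate scales linearly with core clock. The helper name is made up for illustration; the 1125 MHz and 668 MH/s figures are the OP's overclock numbers from the first post.

```python
IDEAL_CYCLES_PER_HASH = 3375  # Phateus's estimate quoted above (assumed to be the minimum)


def ceiling_mhash(stream_processors: int, core_mhz: float,
                  cycles_per_hash: float = IDEAL_CYCLES_PER_HASH) -> float:
    """Hashrate in MH/s if every stream processor retired one operation per cycle."""
    return stream_processors * core_mhz / cycles_per_hash


stock_ceiling = ceiling_mhash(2048, 925)    # 7970 at stock clock
oc_ceiling = ceiling_mhash(2048, 1125)      # 7970 at the OP's 1125 MHz

print(f"stock ceiling: {stock_ceiling:.1f} MH/s, measured 550 MH/s "
      f"({550 / stock_ceiling:.1%} of ceiling)")
print(f"1125 MHz ceiling: {oc_ceiling:.1f} MH/s, OP's 668 MH/s "
      f"({668 / oc_ceiling:.1%} of ceiling)")
```

Under those assumptions the OP's overclocked card sits about as close to the ceiling as the stock figure does, which is consistent with the 1-2% headroom estimated above.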

To give you an idea how efficient the 7970 is at computing hashes we can compare its efficiency (in cycles per hash) with a 6970, which has 1536 stream processors and a stock frequency of 880MHz for the highest reported hashrate of 370MH/s at that frequency (from the mining hardware comparison chart):

Code:
               Stream Processor Count x GPU Frequency    1536 x 880MHz
Cycles/Hash =  -------------------------------------- =  ------------- = ~3653 cycles
                         Hashes per second                  370 MH/s

At an estimate of 3653 cycles, a 6970 stream processor takes ~6% more cycles per hash than a 7970 stream processor at the same frequency, and ~8% more than the ideal one-instruction-per-cycle processor.

Now let's compare to a 5870, which has a highest reported hash rate of 379MH/s with its 1600 stream processors and a stock speed of 850MHz:

Code:
               Stream Processor Count x GPU Frequency    1600 x 850MHz
Cycles/Hash =  -------------------------------------- =  ------------- = ~3588 cycles
                         Hashes per second                  379 MH/s

This makes the 5870 roughly 2% more efficient (in cycles per hash) than the 6970, but it still uses ~4% more cycles per hash than the 7970, and ~6% more than the ideal processor. So we can conclude that AMD's GCN is already making ~98% efficient use of its stream processors for hashing, which is better than the VLIW4 and VLIW5 architectures of the previous two generations and close to the ideal. This more efficient stream processor usage, along with the increased number of stream processors and higher stock frequency, explains the increased hashing performance compared to the previous generations of GPUs.
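
The three-card comparison can be reproduced in one pass. This is a minimal sketch using only the stream processor counts, stock clocks and hashrates quoted in this post; the dictionary layout and variable names are illustrative, and 3375 is again taken as the ideal cycles per hash.

```python
IDEAL_CYCLES_PER_HASH = 3375  # estimate quoted earlier in the post

# name: (stream processors, stock clock in MHz, reported hashrate in MH/s)
cards = {
    "7970 (GCN)": (2048, 925, 550),
    "6970 (VLIW4)": (1536, 880, 370),
    "5870 (VLIW5)": (1600, 850, 379),
}


def cycles_per_hash(stream_processors, core_mhz, mhash_per_s):
    """Average stream-processor cycles per hash (MHz and MH/s cancel)."""
    return stream_processors * core_mhz / mhash_per_s


results = {name: cycles_per_hash(*spec) for name, spec in cards.items()}

for name, cycles in results.items():
    overhead = cycles / IDEAL_CYCLES_PER_HASH - 1    # extra cycles relative to ideal
    efficiency = IDEAL_CYCLES_PER_HASH / cycles      # fraction of cycles doing useful work
    print(f"{name:<13} {cycles:6.0f} cycles/hash  "
          f"{overhead:+.1%} vs ideal  {efficiency:.1%} efficient")
```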

Disclaimer: I'm not a GPU programming expert (yet), so please take my answer with a grain of salt. But for what it's worth, I develop HPC software for a living that solves problems running on thousands of nodes in parallel.
vindimy
Full Member
Activity: 239
Merit: 100
March 27, 2012, 11:01:30 PM
 #4

Wow, 1onevvolf, thanks for the writeup! I wish there was a best-of I could nominate you to :)

z3rohour
Newbie
Activity: 13
Merit: 0
March 27, 2012, 11:05:28 PM
 #5

Those 7970s are amazing.
airnesst
Newbie
Activity: 29
Merit: 0
March 29, 2012, 05:17:15 PM
 #6

Why do you prefer the 7970 over the 6990?
vindimy
Full Member
Activity: 239
Merit: 100
March 29, 2012, 05:32:10 PM
 #7

Why do you prefer the 7970 over the 6990?

I just replied to the same question on a different thread: https://bitcointalk.org/index.php?topic=74220.msg826106#msg826106

yochdog
Legendary
Activity: 2044
Merit: 1000
March 29, 2012, 06:28:39 PM
 #8

Awesome. This is why I love this forum... so many enlightened people willing to take the time to share vast quantities of knowledge for nothing more than the love of learning.
Gabi
Legendary
Activity: 1148
Merit: 1008
If you want to walk on water, get out of the boat
March 29, 2012, 07:03:38 PM
 #9

Nice explanation

tossil
Newbie
Activity: 16
Merit: 0
March 31, 2012, 04:02:36 AM
 #10

Nice write up 1onevvolf!

Thanks for taking the time.

-Tossil
Koooooj
Member
Activity: 75
Merit: 10
June 28, 2012, 07:16:51 PM
 #11

Awesome writeup! To add my 2 Satoshis:

For comparison, I'm running a Radeon 6970 at stock clocks (880 MHz core, 1375 MHz memory) and am getting a sustained 389 and change MH/s. Re-running the calculation gives 880*1536/389 = 3,475 cycles per hash, which is only about 2.9% short of ideal, but still less efficient than the 7970 (way to go, AMD!). I'm probably getting that last couple of MH/s from having a faster processor than the people on the hardware comparison page: I use my computer for a variety of things, not just Bitcoin, so I put some money into the CPU, an Intel Core i7-3930K overclocked to 4.4 GHz. Not a huge leap, and probably not a fair comparison unless OP is using a similar CPU, but it adds another data point.
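
As a quick check of that arithmetic, here is a small Python sketch. It uses only the numbers in this post plus the 370 MH/s chart figure and the ~3375 ideal from the earlier writeup; the variable names are illustrative.

```python
IDEAL_CYCLES_PER_HASH = 3375   # estimate from the earlier writeup

stream_processors = 1536       # Radeon 6970
core_mhz = 880                 # stock core clock
measured_mhash = 389           # sustained rate reported in this post
chart_mhash = 370              # hardware comparison chart figure used earlier

cycles = stream_processors * core_mhz / measured_mhash
shortfall = 1 - IDEAL_CYCLES_PER_HASH / cycles   # fraction of cycles not doing ideal work
gain_over_chart = measured_mhash / chart_mhash - 1

print(f"{cycles:.0f} cycles/hash, {shortfall:.1%} short of ideal")
print(f"{gain_over_chart:.1%} faster than the chart's 6970 figure")
```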
Alwaysmining
Member
Activity: 72
Merit: 10
June 28, 2012, 10:12:54 PM
 #12

You can probably get a bit more out of it, yes. How much, though, I do not know.
Hexadecibel (OP)
Human Intranet Liason
VIP
Hero Member
Activity: 571
Merit: 504
I still <3 u Satoshi
June 29, 2012, 12:00:25 AM
 #13

oh look, it's this thread

thanks 1onevvolf ;)