Bitcoin Forum

Bitcoin => Mining => Topic started by: JinTu on October 18, 2011, 06:13:39 AM



Title: Cacti template for AMD GPU monitoring
Post by: JinTu on October 18, 2011, 06:13:39 AM
By leveraging my earlier work exposing GPU stats via SNMP (https://bitcointalk.org/index.php?topic=48771.0), I created a Cacti template for monitoring my mining rigs and would like to share my work with all of you.

Here are some teaser graphs from one of my (largely untuned) 6990 GPUs:

Adding graphs to Cacti
http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/images/cacti-web-gui-data-query.jpg

Core clock for GPU 0
http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/images/core-clock-0.png

Memory clock for GPU 0 (underclocked to 150MHz)
http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/images/memory-clock-0.png

Core voltage for GPU 0
http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/images/core-voltage-0.png

Fan speed for GPU 0
http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/images/fan-speed-0.png

Temperature for GPU 0
http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/images/temperature-0.png

Load for GPU 0
http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/images/load-0.png


Prerequisites
  • Operational Cacti instance
  • My AMD GPU SNMP script installed and operational (http://www.praecogito.com/bitcoin/amd-gpu/snmp-script/)

Installation instructions
  • Grab the latest template package from here (http://www.praecogito.com/bitcoin/amd-gpu/cacti-template/packages/).
  • Unzip the package and copy snmp_queries/amd_gpu.xml to your Cacti snmp_queries directory, e.g. /var/www/html/cacti/resource/snmp_queries
  • Import cacti_data_query_amd_gpu.xml from the Cacti web interface.
  • Add the 'AMD GPU' Data Query to your miner host or host template.
  • 'Create Graphs for this Host' as you would normally.
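
The snmp_queries directory differs between installs (three different locations show up in this thread alone), so a quick check that the XML file landed where Cacti looks for it can save some head scratching. A minimal sketch in Python; the function name find_query_file and the candidate list are only illustrative, and the paths are just the ones seen in this thread, not an exhaustive list:

```python
from pathlib import Path

# Directories seen in this thread; your install may well use another one.
CANDIDATE_DIRS = [
    "/var/www/html/cacti/resource/snmp_queries",
    "/usr/share/cacti/site/resource/snmp_queries",
    "/var/www/localhost/htdocs/cacti/resource/snmp_queries",
]

def find_query_file(dirs=CANDIDATE_DIRS, name="amd_gpu.xml"):
    """Return the first existing path to the data query XML, or None."""
    for directory in dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

If find_query_file() returns None, the file is probably in the wrong place or under a different name than the one the data query expects.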

Troubleshooting
  • Ensure that the host's SNMP Timeout value is set adequately long (e.g. >5 seconds) as the polling cycle for a host can be quite long.
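
To get a rough feel for how long a miner takes to answer, you can time a full walk of the GPU data from the Cacti box. This is only a sketch: the helper names are made up for illustration, it assumes the net-snmp snmpwalk binary is on the PATH and that the community string is "public", and the elapsed time of the whole walk is only an upper bound on what a single request costs.

```python
import subprocess
import time

def build_walk_command(host, community="public", timeout_s=10, retries=0):
    """Build an snmpwalk command line for the GPU extend output."""
    return [
        "snmpwalk", "-v2c", "-c", community,
        "-t", str(timeout_s),   # per-request timeout in seconds
        "-r", str(retries),     # no retries, so the timing is not inflated
        host, "NET-SNMP-EXTEND-MIB::nsExtendOutLine",
    ]

def time_walk(host, **kwargs):
    """Run the walk and return (elapsed_seconds, return_code)."""
    start = time.monotonic()
    result = subprocess.run(build_walk_command(host, **kwargs),
                            capture_output=True, text=True)
    return time.monotonic() - start, result.returncode
```

Calling time_walk("miner1") (hostname hypothetical) returns the elapsed seconds and the exit code. If the walk takes several seconds in total, a short SNMP Timeout in Cacti is a likely suspect.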

A Verbose Query should look like the following:
Code:
+ Running data query [16].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.1' value: '0'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.2' value: '1'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.3' value: '2'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.4' value: '3'
+ Located input field 'gpuId' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ Found item [gpuId='0'] index: 1 [from value]
+ Found item [gpuId='1'] index: 2 [from value]
+ Found item [gpuId='2'] index: 3 [from value]
+ Found item [gpuId='3'] index: 4 [from value]
+ Located input field 'gpuAddress' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.10.103.112.117.97.100.100.114.101.115.115'
+ Found item [gpuAddress='03:00.0'] index: 1 [from value]
+ Found item [gpuAddress='04:00.0'] index: 2 [from value]
+ Found item [gpuAddress='07:00.0'] index: 3 [from value]
+ Found item [gpuAddress='08:00.0'] index: 4 [from value]
+ Located input field 'gpuDescription' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.14.103.112.117.100.101.115.99.114.105.112.116.105.111.110'
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 1 [from value]
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 2 [from value]
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 3 [from value]
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 4 [from value]
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
...
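
In case the long numeric OIDs above look opaque: they are nsExtendOutLine from NET-SNMP-EXTEND-MIB followed by the extend name, encoded as its length and then the ASCII code of each character (5.103.112.117.105.100 is "gpuid"). A small Python sketch to convert in both directions; the function names are just for illustration:

```python
NS_EXTEND_OUT_LINE = ".1.3.6.1.4.1.8072.1.3.2.4.1.2"

def extend_oid(name):
    """Numeric OID prefix for nsExtendOutLine."<name>"."""
    encoded = [len(name)] + [ord(ch) for ch in name]
    return NS_EXTEND_OUT_LINE + "." + ".".join(str(n) for n in encoded)

def extend_name(oid):
    """Recover the extend name from a numeric OID as shown in a verbose query."""
    if not oid.startswith("."):
        oid = "." + oid  # some Cacti versions print OIDs without the leading dot
    suffix = oid[len(NS_EXTEND_OUT_LINE):].strip(".").split(".")
    length = int(suffix[0])
    # Anything after the name (e.g. the row number) is ignored.
    return "".join(chr(int(n)) for n in suffix[1:1 + length])
```

This makes it easy to check that the OID Cacti walks matches the name configured on the miner's snmpd side.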

Have fun!


Title: Re: Cacti template for AMD GPU monitoring
Post by: dlasher on February 17, 2012, 08:20:19 PM
EDIT: working for 2 of 9 boxes - grrrrr


Title: Re: Cacti template for AMD GPU monitoring
Post by: JinTu on February 18, 2012, 01:55:24 AM
Quote
EDIT: Ignore me working fine.. I was able to still go add the graphs and it worked.

Glad to hear it!

Feel free to share your graphs if you are willing.


Title: Re: Cacti template for AMD GPU monitoring
Post by: dlasher on February 18, 2012, 05:04:52 AM
Quote
EDIT: Ignore me working fine.. I was able to still go add the graphs and it worked.

Glad to hear it!

Feel free to share your graphs if you are willing.

Oddly I can only get graphs from 3 out of 9 machines... there's an issue with PCI ID 0a breaking the perl script (noted in your amd perl script thread) and I can't understand the others.. I may post working/notworking when I get a chance tomorrow.

------EDIT------

Kinda wondering whether we're running into a PCI-ID issue on the cacti supporting scripts/items as well.


Data from a non-working one. It shows "success, 0 items 0 rows". Won't show any cards when I go to add graphs.

Quote

Data Query Debug Information
+ Running data query [12].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ No SNMP data returned
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml

--------

root@stats1:~# snmpwalk -v2c -cpublic 10.4.18.22 NET-SNMP-EXTEND-MIB::nsExtendOutLine
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".1 = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".2 = STRING: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".3 = STRING: 2
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".1 = STRING: 34
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".2 = STRING: 72
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".3 = STRING: 33
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".1 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".2 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".3 = STRING: 98
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".1 = STRING: 7900
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".2 = STRING: 7850
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".3 = STRING: 7750
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".1 = STRING: 950
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".2 = STRING: 930
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".3 = STRING: 880
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".1 = STRING: 1100
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".2 = STRING: 1088
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".3 = STRING: 1088
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".1 = STRING: 300
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".2 = STRING: 300
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".3 = STRING: 1250
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".1 = STRING: 0a:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".2 = STRING: 09:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".3 = STRING: 04:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".1 = STRING: ATI Radeon HD 5800 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".2 = STRING: ATI Radeon HD 5800 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".3 = STRING: AMD Radeon HD 6900 Series
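
For comparing a working box against a non-working one, it can help to regroup snmpwalk output like the above into one record per GPU. A rough Python sketch, with made-up helper names; the temperature conversion assumes gputemp is reported in hundredths of a degree Celsius, which is a guess from the values shown here:

```python
import re
from collections import defaultdict

LINE_RE = re.compile(
    r'nsExtendOutLine\."(?P<name>[^"]+)"\.(?P<row>\d+) = STRING: (?P<value>.*)$'
)

def parse_walk(text):
    """Group snmpwalk output into {row: {field: value}}."""
    rows = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            rows[int(match["row"])][match["name"]] = match["value"].strip()
    return dict(rows)

def temp_celsius(raw):
    """Assumption: gputemp is in hundredths of a degree Celsius."""
    return int(raw) / 100.0
```

Feed parse_walk() the text captured from snmpwalk; each row number then maps to the id, address, clocks and so on for one card.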


And a working one, showing "6 items 2 rows"


Quote
+ Running data query [12].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ No SNMP data returned
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'

----------
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".1 = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".2 = STRING: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".1 = STRING: 20
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".2 = STRING: 20
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".1 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".2 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".1 = STRING: 7000
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".2 = STRING: 6700
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".1 = STRING: 870
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".2 = STRING: 870
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".1 = STRING: 1100
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".2 = STRING: 1100
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".1 = STRING: 1250
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".2 = STRING: 1250
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".1 = STRING: 03:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".2 = STRING: 04:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".1 = STRING: AMD Radeon HD 6900 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".2 = STRING: AMD Radeon HD 6900 Series


I suspect the errors in the cacti.log are telling me what's wrong, but I need more verbose output...

Quote
02/20/2012 03:25:02 PM - CMDPHP: Poller[0] Host[2] DS[8] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:04 PM - CMDPHP: Poller[0] Host[8] DS[80] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[118] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[119] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
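
To see at a glance which hosts are producing these warnings, the log lines can be tallied per host. A small Python sketch under the assumption that the warnings keep the format quoted above; the function name is illustrative only:

```python
import re
from collections import Counter

WARNING_RE = re.compile(
    r"Host\[(?P<host>\d+)\] DS\[(?P<ds>\d+)\] "
    r"WARNING: Result from (?P<source>\w+) not valid"
)

def count_warnings(log_text):
    """Count 'Result from ... not valid' warnings per (host id, source)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WARNING_RE.search(line)
        if match:
            counts[(int(match["host"]), match["source"])] += 1
    return counts
```

Hosts that show up under SNMP rather than CMD are the ones worth checking against the SNMP timeout setting first.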




Title: Re: Cacti template for AMD GPU monitoring
Post by: JinTu on February 29, 2012, 01:07:42 AM
Quote
Data from a non-working one. It shows "success, 0 items 0 rows". Won't show any cards when I go to add graphs.

Quote

Data Query Debug Information
+ Running data query [12].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ No SNMP data returned
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml

<snip>

I suspect the errors in the cacti.log are telling me what's wrong, but I need more verbose output...

Quote
02/20/2012 03:25:02 PM - CMDPHP: Poller[0] Host[2] DS[8] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:04 PM - CMDPHP: Poller[0] Host[8] DS[80] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[118] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[119] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U

I worked with dlasher via PM and we were able to resolve this for his setup (SNMP timeout). Please see the update to the first post (Troubleshooting) for additional details.


Title: Re: Cacti template for AMD GPU monitoring
Post by: dlasher on March 03, 2012, 04:39:13 AM

working perfectly now, thank you for the help..


Title: Re: Cacti template for AMD GPU monitoring
Post by: The LT on March 04, 2012, 12:11:24 PM
You sir, are my hero! Will donate in a couple of days after I've set up the system! This is going to be VERY useful.


Title: Re: Cacti template for AMD GPU monitoring
Post by: JinTu on March 05, 2012, 07:18:23 AM
Quote
You sir, are my hero! Will donate in a couple of days after I've set up the system! This is going to be VERY useful.

Thanks mate,

Be sure to post some sample graphs when you get it up and running. 


Title: Re: Cacti template for AMD GPU monitoring
Post by: The LT on March 07, 2012, 05:38:36 PM
I've hit a snag... Got net-snmp sorted out and snmpwalk seems to work fine from the Cacti machine; it connects to both my rigs.

Code:
garage ~ # snmpwalk -v2c -cpublic 192.168.2.4 NET-SNMP-EXTEND-MIB::nsExtendOutLine
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".1 = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".2 = STRING: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".1 = STRING: 53
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".2 = STRING: 85
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".1 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".2 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".1 = STRING: 7450
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".2 = STRING: 8050
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".1 = STRING: 920
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".2 = STRING: 880
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".1 = STRING: 1125
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".2 = STRING: 1125
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".1 = STRING: 180
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".2 = STRING: 180
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".1 = STRING: 05:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".2 = STRING: 04:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".1 = STRING: ATI Radeon HD 5800 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".2 = STRING: ATI Radeon HD 5800 Series

But there seems to be some problem with cacti. The verbose Data query information gives this:

Code:

+ Running data query [10].
+ Found type = '3' [SNMP Query].
+ Found data query XML file at '/var/www/localhost/htdocs/cacti/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ <oid_num_indexes> missing in XML file, 'Index Count Changed' emulated by counting oid_index entries
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100' Index Count: 2
+ Index found at OID: '1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.1' value: '0'
+ Index found at OID: '1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.2' value: '1'
+ Located input field 'gpuId' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ Found item [gpuId='0'] index: 1 [from value]
+ Found item [gpuId='1'] index: 2 [from value]
+ Located input field 'gpuAddress' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.10.103.112.117.97.100.100.114.101.115.115'
+ Found item [gpuAddress='01:00.0'] index: 1 [from value]
+ Found item [gpuAddress='02:00.0'] index: 2 [from value]
+ Located input field 'gpuDescription' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.14.103.112.117.100.101.115.99.114.105.112.116.105.111.110'
+ Found item [gpuDescription='ATI Radeon HD 5800 Series'] index: 1 [from value]
+ Found item [gpuDescription='ATI Radeon HD 5800 Series'] index: 2 [from value]

Something is definitely wrong here as far as I can tell... It finds the snmp_queries file as seen on the third line of the log but other than that... meh...

I've tried setting the timeout up to 10000 msec and increasing the OIDs per get request, but to no avail.

I'm not that experienced with Cacti so maybe JinTu can offer some insight?


Title: Re: Cacti template for AMD GPU monitoring
Post by: The LT on March 07, 2012, 06:37:42 PM
It seems I have the graphing going after all! Will gather some data and report back! This is so great, JinTu! How about a bounty for cgminer stats implementation? Once I mine some BTC I will send you a donation!


Title: Re: Cacti template for AMD GPU monitoring
Post by: JinTu on March 07, 2012, 11:37:18 PM
Quote
It seems I have the graphing going after all! Will gather some data and report back! This is so great, JinTu! How about a bounty for cgminer stats implementation? Once I mine some BTC I will send you a donation!

Glad to hear you got it going, and looking forward to your sample graphs.

I'm certainly open to a bounty for implementing cgminer stats. It's been on my todo list ever since the 2.1.0 release supporting the JSON API, but it has been near impossible to find the time. A bounty would help goad me into actually doing it.


Title: Re: Cacti template for AMD GPU monitoring
Post by: The LT on March 08, 2012, 10:33:33 AM
Quote

Glad to hear you got it going, and looking forward to your sample graphs.

I'm certainly open to a bounty for implementing cgminer stats. It's been on my todo list ever since the 2.1.0 release supporting the JSON API, but it has been near impossible to find the time. A bounty would help goad me into actually doing it.


I've made a small 0.5 BTC donation for your efforts for SNMP and Cacti, I know it's not much but I'm waiting for BFL's to arrive before the hashing power increases.

I'll post the graphs soon, when they stop being straight lines and become something "graphy". :)

A quick question, is "sudo" really needed for aticonfig to work? I wanted to minimize my logging. If it isn't explicitly required, then maybe we can add a variable to enable-disable sudo usage?


Title: Re: Cacti template for AMD GPU monitoring
Post by: JinTu on March 08, 2012, 06:29:08 PM

Quote
I've made a small 0.5 BTC donation for your efforts for SNMP and Cacti, I know it's not much but I'm waiting for BFL's to arrive before the hashing power increases.

I'll post the graphs soon, when they stop being straight lines and become something "graphy". :)

A quick question, is "sudo" really needed for aticonfig to work? I wanted to minimize my logging. If it isn't explicitly required, then maybe we can add a variable to enable-disable sudo usage?

Donation received, thanks for your support!

Yes, unfortunately aticonfig needs to run as the same user, and with the same DISPLAY, as your logged-in X session for most of the commands to work. You can test this yourself by logging in as root (assuming you are already logged into X as a different user) and running the following:

aticonfig --lsa
This should work, e.g.:
Code:
* 0. 03:00.0 AMD Radeon HD 6900 Series
  1. 04:00.0 AMD Radeon HD 6900 Series
  2. 07:00.0 AMD Radeon HD 6900 Series
  3. 08:00.0 AMD Radeon HD 6900 Series

aticonfig --odgt --adapter=0
This won't work, e.g.:
Code:
No protocol specified
ERROR - X needs to be running to perform AMD Overdrive(TM) commands

When run as the same user and DISPLAY as your X session (e.g. sudo -u jintu DISPLAY=:0 aticonfig --odgt --adapter=0), you should get the following:
Code:
Default Adapter - AMD Radeon HD 6900 Series
                  Sensor 0: Temperature - 76.00 C
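
For anyone scripting this, the same idea can be sketched in Python: build the sudo command with the X session's user and DISPLAY, then pull the temperature out of the --odgt output. The helper names are hypothetical, and the regular expression assumes the output format shown above:

```python
import re

def build_aticonfig_command(user, adapter, display=":0"):
    """Command line that runs aticonfig as the X session's user and DISPLAY."""
    return [
        "sudo", "-u", user,
        f"DISPLAY={display}",
        "aticonfig", "--odgt", f"--adapter={adapter}",
    ]

TEMP_RE = re.compile(r"Sensor \d+: Temperature - (?P<temp>[\d.]+) C")

def parse_temperature(output):
    """Return the temperature in Celsius from --odgt output, or None."""
    match = TEMP_RE.search(output)
    return float(match["temp"]) if match else None
```

parse_temperature() returns None for the "X needs to be running" error text, which makes the wrong-user case easy to detect.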




Title: Re: Cacti template for AMD GPU monitoring
Post by: The LT on March 09, 2012, 12:19:29 PM
Oh okay, thanks for clearing that up! The VRM on one of the cards is getting funky, it's really nice to see it nicely graphed. It's also much easier to tune the cards to desired temperature, you can see which parameters affect what.. :) Let me spend a couple of days graphing and I'll post some stats.


Title: Re: Cacti template for AMD GPU monitoring
Post by: dlasher on March 16, 2012, 05:40:36 PM

not a lot to see, but average temps across miners1-9.. using CGminer with a target temp of 80C, hence the flat lines around there.

example:
https://i.imgur.com/1Athj.png

See the rest at: http://imgur.com/a/Hxssv/all

Thanks again JinTu for your work and help!


Title: Re: Cacti template for AMD GPU monitoring
Post by: JinTu on March 16, 2012, 06:24:51 PM

Quote
not a lot to see, but average temps across miners1-9.. using CGminer with a target temp of 80C, hence the flat lines around there.

example:
https://i.imgur.com/1Athj.png

See the rest at: http://imgur.com/a/Hxssv/all

Thanks again JinTu for your work and help!

No problem dlasher. I am glad you are getting some use out of it.

wrt cgminer, the temperature graphs for devices 0, 1 and 3 on your sample miner5 show a lot more variability than I would expect if you are using the auto features. The device 2 graph for the same time frame shows more temperature stability. Is this by chance a card with reduced airflow compared with devices 0, 1 and 3, or are you using different settings for these cards?


Title: Re: Cacti template for AMD GPU monitoring
Post by: dlasher on March 17, 2012, 03:40:33 AM
Quote
wrt cgminer, the temperature graphs for devices 0, 1 and 3 on your sample miner5 show a lot more variability than I would expect if you are using the auto features. The device 2 graph for the same time frame shows more temperature stability. Is this by chance a card with reduced airflow compared with devices 0, 1 and 3, or are you using different settings for these cards?

Identical settings, miner5 has 4 58xx cards with zero space between them, lots of airflow, but the middle cards stay hotter.



Title: Re: Cacti template for AMD GPU monitoring
Post by: The LT on March 17, 2012, 07:50:11 PM
Happily monitoring my cards for a week now! The graphs were useful in reducing the failing VRM temperatures. :)