Bitcoin Forum
Topic: Cacti template for AMD GPU monitoring
JinTu
October 18, 2011, 06:13:39 AM
 #1

By leveraging my earlier work exposing GPU stats via SNMP, I created a Cacti template for monitoring my mining rigs and would like to share my work with all of you.

Here are some teaser graphs from one of my (largely untuned) 6990 GPUs:

  • Adding graphs to Cacti
  • Core clock for GPU 0
  • Memory clock for GPU 0 (underclocked to 150MHz)
  • Core voltage for GPU 0
  • Fan speed for GPU 0
  • Temperature for GPU 0
  • Load for GPU 0

Prerequisites

Installation instructions
  • Grab the latest template package from here.
  • Unzip the package and copy snmp_queries/amd_gpu.xml to your snmp_queries directory, e.g. /var/www/html/cacti/resource/snmp_queries
  • Import cacti_data_query_amd_gpu.xml from the Cacti web interface.
  • Add the 'AMD GPU' Data Query to your miner host or host template.
  • 'Create Graphs for this Host' as you would normally.

Troubleshooting
  • Ensure that the host's SNMP Timeout value is set adequately long (e.g. >5 seconds) as the polling cycle for a host can be quite long.
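
One way to check whether the timeout is the culprit is to run the same walk by hand from the Cacti server with a generous timeout and see how long it takes. The following is only a sketch: it builds the snmpwalk command used elsewhere in this thread and adds the standard net-snmp -t (timeout in seconds) and -r (retries) options. The host address, the 'public' community and the function names are placeholders, not part of the template package.

```python
import shlex
import subprocess
import time


def build_walk_command(host, community="public", timeout_s=10, retries=1):
    """Return the snmpwalk argument list for the GPU extend output."""
    return [
        "snmpwalk", "-v2c", "-c", community,
        "-t", str(timeout_s),   # per-request timeout in seconds
        "-r", str(retries),     # retries after a timeout
        host, "NET-SNMP-EXTEND-MIB::nsExtendOutLine",
    ]


def timed_walk(host, **kwargs):
    """Run the walk and report how long it took (needs the net-snmp tools)."""
    cmd = build_walk_command(host, **kwargs)
    start = time.monotonic()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return elapsed, result.returncode, result.stdout


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address; substitute your miner host.
    print(shlex.join(build_walk_command("192.0.2.10")))
```

If the hand-run walk takes longer than the timeout configured for the host in Cacti, that would be consistent with the data query coming back empty.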

A Verbose Query should look like the following:
Code:
+ Running data query [16].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.1' value: '0'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.2' value: '1'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.3' value: '2'
+ Index found at OID: '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.4' value: '3'
+ Located input field 'gpuId' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ Found item [gpuId='0'] index: 1 [from value]
+ Found item [gpuId='1'] index: 2 [from value]
+ Found item [gpuId='2'] index: 3 [from value]
+ Found item [gpuId='3'] index: 4 [from value]
+ Located input field 'gpuAddress' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.10.103.112.117.97.100.100.114.101.115.115'
+ Found item [gpuAddress='03:00.0'] index: 1 [from value]
+ Found item [gpuAddress='04:00.0'] index: 2 [from value]
+ Found item [gpuAddress='07:00.0'] index: 3 [from value]
+ Found item [gpuAddress='08:00.0'] index: 4 [from value]
+ Located input field 'gpuDescription' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.14.103.112.117.100.101.115.99.114.105.112.116.105.111.110'
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 1 [from value]
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 2 [from value]
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 3 [from value]
+ Found item [gpuDescription='AMD Radeon HD 6900 Series'] index: 4 [from value]
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/var/www/html/cacti/resource/snmp_queries/amd_gpu.xml'
...
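
The long numeric OIDs in the output above are easier to read once you know how they are built: the nsExtendOutLine prefix, then the length of the extend name, then the ASCII code of each character (5.103.112.117.105.100 is "gpuid"), then the output line number. A small sketch for converting in both directions, with helper names of my own choosing:

```python
# Numeric prefix of NET-SNMP-EXTEND-MIB::nsExtendOutLine
NS_EXTEND_OUT_LINE = ".1.3.6.1.4.1.8072.1.3.2.4.1.2"


def extend_name_to_oid(name):
    """Encode an extend name as length + ASCII codes under nsExtendOutLine."""
    codes = [str(len(name))] + [str(ord(ch)) for ch in name]
    return NS_EXTEND_OUT_LINE + "." + ".".join(codes)


def oid_to_extend_name(oid):
    """Decode the extend name (and optional line number) from a numeric OID."""
    oid = "." + oid.lstrip(".")
    if not oid.startswith(NS_EXTEND_OUT_LINE + "."):
        raise ValueError("not an nsExtendOutLine OID: " + oid)
    parts = [int(p) for p in oid[len(NS_EXTEND_OUT_LINE) + 1:].split(".")]
    length, rest = parts[0], parts[1:]
    name = "".join(chr(c) for c in rest[:length])
    # Anything after the name is the output line number (1-based).
    line = rest[length] if len(rest) > length else None
    return name, line


if __name__ == "__main__":
    print(extend_name_to_oid("gpuid"))
    print(oid_to_extend_name(
        ".1.3.6.1.4.1.8072.1.3.2.4.1.2.10.103.112.117.97.100.100.114.101.115.115.3"))
```

This can help when comparing what Cacti walks against what snmpwalk prints in its named form.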

Have fun!

Please donate if this was helpful: 14CLqCNphUJ54ro2PtqQWJDmW3Eic1WmUd
Cacti templates for pool, GPU and CGMINER monitoring.
GPU monitoring with SNMP
dlasher
February 17, 2012, 08:20:19 PM
 #2

EDIT: working for 2 of 9 boxes - grrrrr
JinTu
February 18, 2012, 01:55:24 AM
 #3

Quote
EDIT: Ignore me, it's working fine. I was still able to go add the graphs and it worked.

Glad to hear it!

Feel free to share your graphs if you are willing.

dlasher
February 18, 2012, 05:04:52 AM
 #4

Quote
EDIT: Ignore me, it's working fine. I was still able to go add the graphs and it worked.

Glad to hear it!

Feel free to share your graphs if you are willing.

Oddly, I can only get graphs from 3 out of 9 machines. There's an issue with PCI ID 0a breaking the Perl script (noted in your AMD Perl script thread), and I can't explain the others. I may post working/non-working output when I get a chance tomorrow.

------EDIT------

Kinda wondering whether we're running into a PCI-ID issue on the cacti supporting scripts/items as well.


Data from a non-working one. It shows "success, 0 items 0 rows". Won't show any cards when I go to add graphs.

Quote

Data Query Debug Information
+ Running data query [12].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ No SNMP data returned
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'

--------

root@stats1:~# snmpwalk -v2c -cpublic 10.4.18.22 NET-SNMP-EXTEND-MIB::nsExtendOutLine
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".1 = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".2 = STRING: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".3 = STRING: 2
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".1 = STRING: 34
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".2 = STRING: 72
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".3 = STRING: 33
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".1 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".2 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".3 = STRING: 98
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".1 = STRING: 7900
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".2 = STRING: 7850
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".3 = STRING: 7750
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".1 = STRING: 950
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".2 = STRING: 930
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".3 = STRING: 880
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".1 = STRING: 1100
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".2 = STRING: 1088
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".3 = STRING: 1088
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".1 = STRING: 300
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".2 = STRING: 300
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".3 = STRING: 1250
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".1 = STRING: 0a:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".2 = STRING: 09:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".3 = STRING: 04:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".1 = STRING: ATI Radeon HD 5800 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".2 = STRING: ATI Radeon HD 5800 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".3 = STRING: AMD Radeon HD 6900 Series


And a working one, showing "6 items 2 rows"


Quote
+ Running data query [12].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ No SNMP data returned
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'

----------
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".1 = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".2 = STRING: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".1 = STRING: 20
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".2 = STRING: 20
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".1 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".2 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".1 = STRING: 7000
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".2 = STRING: 6700
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".1 = STRING: 870
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".2 = STRING: 870
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".1 = STRING: 1100
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".2 = STRING: 1100
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".1 = STRING: 1250
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".2 = STRING: 1250
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".1 = STRING: 03:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".2 = STRING: 04:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".1 = STRING: AMD Radeon HD 6900 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".2 = STRING: AMD Radeon HD 6900 Series


I suspect the errors in cacti.log point to the cause, but I need more verbose output...

Quote
02/20/2012 03:25:02 PM - CMDPHP: Poller[0] Host[2] DS[8] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:04 PM - CMDPHP: Poller[0] Host[8] DS[80] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[118] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[119] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
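
To see at a glance which hosts are producing these warnings, the log lines can be tallied per host. This is a rough sketch that assumes the line layout shown in the quote above; the Host[...] numbers are Cacti's internal host IDs, and the function name is my own.

```python
import re
from collections import Counter

WARN_RE = re.compile(
    r"Poller\[(?P<poller>\d+)\] Host\[(?P<host>\d+)\] DS\[(?P<ds>\d+)\] "
    r"WARNING: Result from (?P<source>\w+) not valid"
)


def count_invalid_results(log_text):
    """Count 'Result from ... not valid' warnings per (host id, source)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = WARN_RE.search(line)
        if m:
            counts[(int(m.group("host")), m.group("source"))] += 1
    return counts


# Sample lines copied from the quoted cacti.log excerpt.
SAMPLE = """\
02/20/2012 03:25:02 PM - CMDPHP: Poller[0] Host[2] DS[8] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:04 PM - CMDPHP: Poller[0] Host[8] DS[80] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[118] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[119] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
"""

for (host, source), n in sorted(count_invalid_results(SAMPLE).items()):
    print("host", host, source, n)
```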


JinTu
February 29, 2012, 01:07:42 AM
 #5

Data from a non-working one. It shows "success, 0 items 0 rows". Won't show any cards when I go to add graphs.

Quote

Data Query Debug Information
+ Running data query [12].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ No SNMP data returned
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'
+ Found data query XML file at '/usr/share/cacti/site/resource/snmp_queries/amd_gpu.xml'

<snip>

I suspect the errors in cacti.log point to the cause, but I need more verbose output...

Quote
02/20/2012 03:25:02 PM - CMDPHP: Poller[0] Host[2] DS[8] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:04 PM - CMDPHP: Poller[0] Host[8] DS[80] WARNING: Result from SNMP not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[118] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:08 PM - CMDPHP: Poller[0] Host[9] DS[119] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U
02/20/2012 03:25:09 PM - CMDPHP: Poller[0] Host[9] DS[120] WARNING: Result from CMD not valid.  Partial Result: U

I worked with dlasher via PM and we were able to resolve this for his setup (SNMP timeout). Please see the update to the first post (Troubleshooting) for additional details.

dlasher
March 03, 2012, 04:39:13 AM
 #6


Working perfectly now, thank you for the help.
The LT
March 04, 2012, 12:11:24 PM
 #7

You sir, are my hero! Will donate in a couple of days after I've set up the system! This is going to be VERY useful.
JinTu
March 05, 2012, 07:18:23 AM
 #8

You sir, are my hero! Will donate in a couple of days after I've set up the system! This is going to be VERY useful.

Thanks mate,

Be sure to post some sample graphs when you get it up and running. 

The LT
March 07, 2012, 05:38:36 PM
 #9

I've hit a snag... I got net-snmp sorted out, and snmpwalk seems to work fine from the Cacti machine; it connects to both my rigs.

Code:
garage ~ # snmpwalk -v2c -cpublic 192.168.2.4 NET-SNMP-EXTEND-MIB::nsExtendOutLine
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".1 = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".2 = STRING: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".1 = STRING: 53
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpufan".2 = STRING: 85
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".1 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuload".2 = STRING: 99
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".1 = STRING: 7450
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".2 = STRING: 8050
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".1 = STRING: 920
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuclock".2 = STRING: 880
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".1 = STRING: 1125
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuvcore".2 = STRING: 1125
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".1 = STRING: 180
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpumemory".2 = STRING: 180
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".1 = STRING: 05:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".2 = STRING: 04:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".1 = STRING: ATI Radeon HD 5800 Series
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpudescription".2 = STRING: ATI Radeon HD 5800 Series
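
For anyone wanting to eyeball this output per card rather than per field, here is a small sketch that groups the lines by their trailing line number. The sample is a subset of the output above. Treating gputemp as hundredths of a degree Celsius is an assumption based on the values shown in this thread (e.g. 7450), not something the script documents.

```python
import re
from collections import defaultdict

LINE_RE = re.compile(
    r'nsExtendOutLine\."(?P<field>\w+)"\.(?P<line>\d+) = STRING: (?P<value>.*)$'
)


def parse_walk(text):
    """Group snmpwalk output lines into {line_number: {field: value}}."""
    gpus = defaultdict(dict)
    for raw in text.splitlines():
        m = LINE_RE.search(raw.strip())
        if m:
            gpus[int(m.group("line"))][m.group("field")] = m.group("value")
    return dict(gpus)


SAMPLE = '''\
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".1 = STRING: 0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuid".2 = STRING: 1
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".1 = STRING: 7450
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gputemp".2 = STRING: 8050
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".1 = STRING: 05:00.0
NET-SNMP-EXTEND-MIB::nsExtendOutLine."gpuaddress".2 = STRING: 04:00.0
'''

for line_no, fields in sorted(parse_walk(SAMPLE).items()):
    # Assumption: gputemp is reported in hundredths of a degree Celsius.
    temp_c = int(fields["gputemp"]) / 100
    print(fields["gpuid"], fields["gpuaddress"], temp_c)
```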

But there seems to be some problem with cacti. The verbose Data query information gives this:

Code:

+ Running data query [10].
+ Found type = '3' [SNMP Query].
+ Found data query XML file at '/var/www/localhost/htdocs/cacti/resource/snmp_queries/amd_gpu.xml'
+ XML file parsed ok.
+ <oid_num_indexes> missing in XML file, 'Index Count Changed' emulated by counting oid_index entries
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100' Index Count: 2
+ Index found at OID: '1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.1' value: '0'
+ Index found at OID: '1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100.2' value: '1'
+ Located input field 'gpuId' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.5.103.112.117.105.100'
+ Found item [gpuId='0'] index: 1 [from value]
+ Found item [gpuId='1'] index: 2 [from value]
+ Located input field 'gpuAddress' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.10.103.112.117.97.100.100.114.101.115.115'
+ Found item [gpuAddress='01:00.0'] index: 1 [from value]
+ Found item [gpuAddress='02:00.0'] index: 2 [from value]
+ Located input field 'gpuDescription' [walk]
+ Executing SNMP walk for data @ '.1.3.6.1.4.1.8072.1.3.2.4.1.2.14.103.112.117.100.101.115.99.114.105.112.116.105.111.110'
+ Found item [gpuDescription='ATI Radeon HD 5800 Series'] index: 1 [from value]
+ Found item [gpuDescription='ATI Radeon HD 5800 Series'] index: 2 [from value]

Something is definitely wrong here as far as I can tell... It finds the snmp_queries file, as seen on the third line of the log, but other than that... meh...

I've tried setting the timeout up to 10000 msec and increasing the OIDs per get request, but to no avail.

I'm not that experienced with Cacti so maybe JinTu can offer some insight?
The LT
March 07, 2012, 06:37:42 PM
 #10

It seems I have the graphing going after all! Will gather some data and report back! This is so great, JinTu! How about a bounty for a cgminer stats implementation? Once I mine some BTC I will send you a donation!
JinTu
March 07, 2012, 11:37:18 PM
 #11

It seems I have the graphing going after all! Will gather some data and report back! This is so great, JinTu! How about a bounty for a cgminer stats implementation? Once I mine some BTC I will send you a donation!

Glad to hear you got it going, and looking forward to your sample graphs.

I'm certainly open to a bounty for implementing cgminer stats. It's been on my todo list ever since the 2.1.0 release supporting the JSON API, but it has been near impossible to find the time. A bounty would help goad me into actually doing it.

The LT
March 08, 2012, 10:33:33 AM
 #12

Quote

Glad to hear you got it going, and looking forward to your sample graphs.

I'm certainly open to a bounty for implementing cgminer stats. It's been on my todo list ever since the 2.1.0 release supporting the JSON API, but it has been near impossible to find the time. A bounty would help goad me into actually doing it.


I've made a small 0.5 BTC donation for your efforts on SNMP and Cacti. I know it's not much, but I'm waiting for the BFLs to arrive before the hashing power increases.

I'll post the graphs soon, when they stop being straight lines and become something "graphy". :)

A quick question: is "sudo" really needed for aticonfig to work? I want to minimize my logging. If it isn't strictly required, maybe we could add a variable to enable or disable sudo usage?
JinTu
March 08, 2012, 06:29:08 PM
 #13


I've made a small 0.5 BTC donation for your efforts on SNMP and Cacti. I know it's not much, but I'm waiting for the BFLs to arrive before the hashing power increases.

I'll post the graphs soon, when they stop being straight lines and become something "graphy". :)

A quick question: is "sudo" really needed for aticonfig to work? I want to minimize my logging. If it isn't strictly required, maybe we could add a variable to enable or disable sudo usage?

Donation received, thanks for your support!

Yes, unfortunately aticonfig needs to run as the same user, and with the same DISPLAY, as your logged-in X session for most of its commands to work. You can test this yourself by logging in as root (assuming you are already logged into X as a different user) and running the following:

aticonfig --lsa
This should work e.g.
Code:
* 0. 03:00.0 AMD Radeon HD 6900 Series
  1. 04:00.0 AMD Radeon HD 6900 Series
  2. 07:00.0 AMD Radeon HD 6900 Series
  3. 08:00.0 AMD Radeon HD 6900 Series

aticonfig --odgt --adapter=0
This won't work e.g.
Code:
No protocol specified
ERROR - X needs to be running to perform AMD Overdrive(TM) commands
When run as the same user and with the same DISPLAY as your X session (e.g. sudo -u jintu DISPLAY=:0 aticonfig --odgt --adapter=0), you should get the following:
Code:
Default Adapter - AMD Radeon HD 6900 Series
                  Sensor 0: Temperature - 76.00 C
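
If you want to script this check, here is a minimal sketch that runs the same command as the X session's user and pulls the temperature out of the output format shown above. The function names are my own, and it passes DISPLAY through env rather than on the sudo command line, which is a variation on the command above and not what the SNMP script itself does.

```python
import re
import subprocess

TEMP_RE = re.compile(r"Sensor \d+: Temperature - (?P<temp>[\d.]+) C")


def parse_odgt(output):
    """Extract the temperature in Celsius from `aticonfig --odgt` output."""
    m = TEMP_RE.search(output)
    return float(m.group("temp")) if m else None


def read_gpu_temp(adapter, x_user, display=":0"):
    """Run aticonfig as the X session's user (needs sudo rights and aticonfig)."""
    cmd = ["sudo", "-u", x_user, "env", "DISPLAY=" + display,
           "aticonfig", "--odgt", "--adapter=" + str(adapter)]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return parse_odgt(result.stdout)


# Output format copied from the example above.
SAMPLE = """\
Default Adapter - AMD Radeon HD 6900 Series
                  Sensor 0: Temperature - 76.00 C
"""

print(parse_odgt(SAMPLE))
```

When X is not reachable, the output contains the error text instead of a sensor line, so parse_odgt returns None rather than a number.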



The LT
March 09, 2012, 12:19:29 PM
 #14

Oh okay, thanks for clearing that up! The VRM on one of the cards is getting funky, so it's really nice to see it graphed. It's also much easier to tune the cards to the desired temperature, since you can see which parameters affect what. :) Let me spend a couple of days graphing and I'll post some stats.
dlasher
March 16, 2012, 05:40:36 PM
 #15


not a lot to see, but average temps across miners1-9.. using CGminer with a target temp of 80C, hence the flat lines around there.

example:


See the rest at: http://imgur.com/a/Hxssv/all

Thanks again JinTu for your work and help!
JinTu
March 16, 2012, 06:24:51 PM
 #16


not a lot to see, but average temps across miners1-9.. using CGminer with a target temp of 80C, hence the flat lines around there.

example:


See the rest at: http://imgur.com/a/Hxssv/all

Thanks again JinTu for your work and help!

No problem dlasher. I am glad you are getting some use out of it.

wrt cgminer, your sample temperature graph for miner5 devices 0, 1 and 3 has a lot more variability than I would expect to see if you are using the auto features. I see that the graph for device 2 of miner5 over the same time frame shows more temperature stability. Is this by chance a card with reduced airflow compared with devices 0, 1 and 3, or are you using different settings for these cards?

dlasher
March 17, 2012, 03:40:33 AM
 #17

wrt cgminer, your sample miner5 devices 0, 1 and 3 temperature graph has a lot more variability than I would expect to see if you are using the auto features. I see that device 2 of miner5 graphs for the same time frame show more temperature stability. Is this by chance a card with reduced airflow compared with devices 0, 1 and 3 or are you using different settings for these cards?

Identical settings, miner5 has 4 58xx cards with zero space between them, lots of airflow, but the middle cards stay hotter.

The LT
March 17, 2012, 07:50:11 PM
 #18

Happily monitoring my cards for a week now! The graphs were useful in reducing the failing VRM temperatures. :)