Bitcoin Forum
Author Topic: cgmon - mining monitor for Linux - auto restart, reboot, sick gpu, ASIC, &more  (Read 48246 times)
jdape (OP)
April 05, 2014, 05:33:21 PM
 #281

Bro,

we need a way to manage and edit cgmon settings across multiple machines.

If somebody has 50-100 PCs, it's very hard to change all the config info in your file on each PC.

Could you help with that?

For example, a client file and a server file for mass control of cgmon on the PCs.

Place a customized copy of cgmon.tcl on a webserver and download it to each of your clients as needed.  

You could even change the auto-update URL to point to your custom copy and then update cgmon via the built-in update function:
Code:
./cgmon.tcl update
 The update function saves the unique configuration options on each machine.
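
One possible way to prepare such a customized copy before uploading it is to rewrite the option lines in a stock cgmon.tcl. The sketch below is only an illustration and not part of cgmon: the function name apply_overrides is made up, it assumes each option is written as set name "value" on a single line (the form of mail(notify_on_manual_runs) quoted later in this thread), and it does not escape characters that are special to Tcl.

```python
import re

# Matches lines of the assumed form:  set option_name "value"
SET_LINE = re.compile(r'^(\s*set\s+)(\S+)(\s+)"([^"]*)"(.*)$')


def apply_overrides(script_text, overrides):
    """Return script_text with the quoted value replaced for every option in overrides.

    overrides maps an option name, e.g. 'mail(notify_on_manual_runs)', to its new value.
    Lines that do not match the assumed form, or whose option is not listed, are kept as-is.
    """
    rewritten = []
    for line in script_text.splitlines():
        match = SET_LINE.match(line)
        if match and match.group(2) in overrides:
            prefix, name, gap, _old, rest = match.groups()
            line = f'{prefix}{name}{gap}"{overrides[name]}"{rest}'
        rewritten.append(line)
    return "\n".join(rewritten) + "\n"
```

The result could then be saved as the copy that the clients download; whether this is more convenient than editing the file by hand depends on how many options differ between rigs.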


Fork Networking - VPS, Colocation, Dedicated Servers for Bitcoin & Litecoin. Since 1994! www.forked.net
pitBullXXX
April 05, 2014, 05:44:01 PM
 #282

When that happens, run this command and send me the result.

Code:
php -f /tmp/cgmon-api.php notify

I may be able to add support for gridseed with that information.

Ok, miner GSD 3 went down.

Code:
--------------------------------------------------------------------------------
 [P]ool management [S]ettings [D]isplay options [Q]uit
 GSD 0: 8D8A16685449  875 MHz | 370.1K/370.2Kh/s | A:254784 R:512 HW: 51 WU: 1.4/m N: 259 255[49] 244 236 219[2
 GSD 1: 8D841F994849  888 MHz | 375.8K/375.7Kh/s | A:266880 R:  0 HW: 32 WU: 1.5/m N: 231 255 257[28] 241[4] 26
 GSD 2: 8D82348D5753  850 MHz | 359.5K/359.6Kh/s | A:263616 R:256 HW:164 WU: 1.5/m N: 270[65] 278[76] 279[22] 2
 GSD 3: 6D8514714857  913 MHz | OFF   /320.9Kh/s | A:226112 R:256 HW:  5 WU: 1.1/m N: 170 203 186 174[5] 188
--------------------------------------------------------------------------------

 [2014-04-05 13:21:26] Accepted 4a13a3d4 Diff 885/256 GSD 1 pool 0
 [2014-04-05 13:21:32] Accepted 2728d78c Diff 1.67K/256 GSD 2 pool 0
 [2014-04-05 13:21:46] Accepted cdf2dcd2 Diff 318/256 GSD 2 pool 0
 [2014-04-05 13:22:35] Accepted a150cecd Diff 406/256 GSD 2 pool 0
 [2014-04-05 13:22:37] Stratum from pool 0 detected new block
 [2014-04-05 13:22:38] Accepted 0a66c626 Diff 6.3K/256 GSD 1 pool 0
 [2014-04-05 13:22:49] Accepted 551185f5 Diff 770/256 GSD 0 pool 0
 [2014-04-05 13:22:53] Accepted 9b0cd043 Diff 423/256 GSD 1 pool 0
 [2014-04-05 13:23:01] Accepted 2cd1cb1c Diff 1.46K/256 GSD 2 pool 0
 [2014-04-05 13:23:10] Accepted a76977fd Diff 391/256 GSD 1 pool 0
 [2014-04-05 13:23:14] Accepted 5dd9ddf1 Diff 698/256 GSD 0 pool 0
 [2014-04-05 13:23:17] Stratum from pool 0 detected new block


Code:
pi@raspberrypi ~ $ php -f /tmp/cgmon-api.php notify
 notify returned 'STATUS=S,When=1396718524,Code=60,Msg=Notify,Description=cgminer 3.7.2|NOTIFY=0,Name=GSD,ID=0,Last Well=1396718524,Last Not Well=0,Reason Not Well=None,*Thread Fail Init=0,*Thread Zero Hash=0,*Thread Fail Queue=0,*Dev Sick Idle 60s=0,*Dev Dead Idle 600s=0,*Dev Nostart=0,*Dev Over Heat=0,*Dev Thermal Cutoff=0,*Dev Comms Error=0,*Dev Throttle=0|NOTIFY=1,Name=GSD,ID=1,Last Well=1396718524,Last Not Well=0,Reason Not Well=None,*Thread Fail Init=0,*Thread Zero Hash=0,*Thread Fail Queue=0,*Dev Sick Idle 60s=0,*Dev Dead Idle 600s=0,*Dev Nostart=0,*Dev Over Heat=0,*Dev Thermal Cutoff=0,*Dev Comms Error=0,*Dev Throttle=0|NOTIFY=2,Name=GSD,ID=2,Last Well=1396718524,Last Not Well=0,Reason Not Well=None,*Thread Fail Init=0,*Thread Zero Hash=0,*Thread Fail Queue=0,*Dev Sick Idle 60s=0,*Dev Dead Idle 600s=0,*Dev Nostart=0,*Dev Over Heat=0,*Dev Thermal Cutoff=0,*Dev Comms Error=0,*Dev Throttle=0|NOTIFY=3,Name=GSD,ID=3,Last Well=1396710329,Last Not Well=1396710329,Reason Not Well=Thread got zero hashes,*Thread Fail Init=0,*Thread Zero Hash=1,*Thread Fail Queue=0,*Dev Sick Idle 60s=0,*Dev Dead Idle 600s=0,*Dev Nostart=0,*Dev Over Heat=0,*Dev Thermal Cutoff=0,*Dev Comms Error=0,*Dev Throttle=0|'
Array
(
    [STATUS] => Array
        (
            [STATUS] => S
            [When] => 1396718524
            [Code] => 60
            [Msg] => Notify
            [Description] => cgminer 3.7.2
        )

    [NOTIFY0] => Array
        (
            [NOTIFY] => 0
            [Name] => GSD
            [ID] => 0
            [Last Well] => 1396718524
            [Last Not Well] => 0
            [Reason Not Well] => None
            [*Thread Fail Init] => 0
            [*Thread Zero Hash] => 0
            [*Thread Fail Queue] => 0
            [*Dev Sick Idle 60s] => 0
            [*Dev Dead Idle 600s] => 0
            [*Dev Nostart] => 0
            [*Dev Over Heat] => 0
            [*Dev Thermal Cutoff] => 0
            [*Dev Comms Error] => 0
            [*Dev Throttle] => 0
        )

    [NOTIFY1] => Array
        (
            [NOTIFY] => 1
            [Name] => GSD
            [ID] => 1
            [Last Well] => 1396718524
            [Last Not Well] => 0
            [Reason Not Well] => None
            [*Thread Fail Init] => 0
            [*Thread Zero Hash] => 0
            [*Thread Fail Queue] => 0
            [*Dev Sick Idle 60s] => 0
            [*Dev Dead Idle 600s] => 0
            [*Dev Nostart] => 0
            [*Dev Over Heat] => 0
            [*Dev Thermal Cutoff] => 0
            [*Dev Comms Error] => 0
            [*Dev Throttle] => 0
        )

    [NOTIFY2] => Array
        (
            [NOTIFY] => 2
            [Name] => GSD
            [ID] => 2
            [Last Well] => 1396718524
            [Last Not Well] => 0
            [Reason Not Well] => None
            [*Thread Fail Init] => 0
            [*Thread Zero Hash] => 0
            [*Thread Fail Queue] => 0
            [*Dev Sick Idle 60s] => 0
            [*Dev Dead Idle 600s] => 0
            [*Dev Nostart] => 0
            [*Dev Over Heat] => 0
            [*Dev Thermal Cutoff] => 0
            [*Dev Comms Error] => 0
            [*Dev Throttle] => 0
        )

    [NOTIFY3] => Array
        (
            [NOTIFY] => 3
            [Name] => GSD
            [ID] => 3
            [Last Well] => 1396710329
            [Last Not Well] => 1396710329
            [Reason Not Well] => Thread got zero hashes
            [*Thread Fail Init] => 0
            [*Thread Zero Hash] => 1
            [*Thread Fail Queue] => 0
            [*Dev Sick Idle 60s] => 0
            [*Dev Dead Idle 600s] => 0
            [*Dev Nostart] => 0
            [*Dev Over Heat] => 0
            [*Dev Thermal Cutoff] => 0
            [*Dev Comms Error] => 0
            [*Dev Throttle] => 0
        )

)
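
For anyone who wants to inspect a reply like the one above without PHP, here is a minimal sketch in Python. It is not how cgmon works internally. It assumes the reply is a series of sections separated by "|", each made of comma-separated key=value fields, and that no value contains a comma or pipe. The function names are made up, SAMPLE is abridged from the output above, and treating a device as unwell when "Last Not Well" is non-zero and not older than "Last Well" is only a heuristic.

```python
# Abridged from the notify reply shown above (only some fields kept).
SAMPLE = (
    "STATUS=S,When=1396718524,Code=60,Msg=Notify,Description=cgminer 3.7.2|"
    "NOTIFY=0,Name=GSD,ID=0,Last Well=1396718524,Last Not Well=0,Reason Not Well=None|"
    "NOTIFY=3,Name=GSD,ID=3,Last Well=1396710329,Last Not Well=1396710329,"
    "Reason Not Well=Thread got zero hashes|"
)


def parse_reply(reply):
    """Split a reply into {section_name: {field: value}} dictionaries."""
    sections = {}
    for chunk in reply.strip().split("|"):
        if not chunk:
            continue  # the reply ends with a trailing "|"
        fields = {}
        for pair in chunk.split(","):
            key, _, value = pair.partition("=")
            fields[key] = value
        # Name sections like the PHP dump does: STATUS, NOTIFY0, NOTIFY1, ...
        first_key, first_value = next(iter(fields.items()))
        name = first_key + first_value if first_value.isdigit() else first_key
        sections[name] = fields
    return sections


def unwell_devices(sections):
    """Return (name, id, reason) for devices that look unwell (heuristic)."""
    unwell = []
    for name, fields in sections.items():
        if not name.startswith("NOTIFY"):
            continue
        last_well = int(fields["Last Well"])
        last_not_well = int(fields["Last Not Well"])
        if last_not_well != 0 and last_not_well >= last_well:
            unwell.append((fields["Name"], int(fields["ID"]), fields["Reason Not Well"]))
    return unwell
```

Calling unwell_devices(parse_reply(SAMPLE)) picks out GSD 3, the device shown as OFF in the cgminer screen above.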
jdape (OP)
April 05, 2014, 05:48:58 PM
 #283

Is cgmon running every few minutes in your crontab?  If you're not sure, check the log file.




pitBullXXX
April 05, 2014, 06:43:28 PM
 #284

I think you solved the problem... I had the wrong path to cgmon.tcl in crontab.

Now the log file shows it's running.

Will see if it restarts cgminer when a miner goes down.

Thank you
jdape (OP)
April 05, 2014, 08:05:59 PM
 #285

Sure thing.  Cheers!

bomberb17
April 08, 2014, 08:12:36 AM
 #286

When I run it manually, I don't receive any mail (I have Gmail).
Waiting to see if I receive one when cgminer crashes.
whitetoo
April 08, 2014, 09:30:08 AM
 #287

Just been testing this script for the past 24 hours! It's caught 3 SICK GPUs so far and resolved each with an automatic reboot. Thank you to the developers, no more midnight panics!

I did have one question: when the system is told to reboot, cgmon starts cgminer in the background. I then have to open a terminal and type "screen -r" to see the readout from the miners. Is it possible to have this happen automatically after an auto-reboot? Thank you
jdape (OP)
April 08, 2014, 01:37:27 PM
 #288

When I run it manually, I don't receive any mail (I have Gmail).
Waiting to see if I receive one when cgminer crashes.

Check your configuration options.  This is the default:

Code:
# send email when running script by hand (no or yes)
set mail(notify_on_manual_runs) "no"

jdape (OP)
April 08, 2014, 01:48:23 PM
 #289

Just been testing this script for the past 24 hours! It's caught 3 SICK GPUs so far and resolved each with an automatic reboot. Thank you to the developers, no more midnight panics!

I did have one question: when the system is told to reboot, cgmon starts cgminer in the background. I then have to open a terminal and type "screen -r" to see the readout from the miners. Is it possible to have this happen automatically after an auto-reboot? Thank you

So you're saying you want to see your miner in a terminal window after a reboot? Hmm... I'm sure that's possible, but it would involve ugly hacks and wouldn't work reliably. I think you'll have to type screen -r as needed for now. That being said, you can use any computer to SSH (remote login) into your miner computer and then type screen -r. You don't have to physically go to your miner and pull it up on the big screen. For example, I can open five terminal windows on my laptop, each with an SSH session to a miner, and see all five at once from anywhere.

When I get sick GPUs over and over, I lower the clock speed (gpu-engine) by 10 and start it again. Repeat until the card is stable.

whitetoo
April 08, 2014, 02:15:14 PM
 #290

Thanks for the advice, jdape. Will look into the SSH option.
Noted on the GPU engine clock. Sometimes the GPUs can run for 48+ hours and be absolutely fine, and other times they will only run for an hour. Frustrating, as they appear to be stable and then throw a curveball.
jdape (OP)
April 08, 2014, 09:01:42 PM
 #291

Thanks for the advice, jdape. Will look into the SSH option.
Noted on the GPU engine clock. Sometimes the GPUs can run for 48+ hours and be absolutely fine, and other times they will only run for an hour. Frustrating, as they appear to be stable and then throw a curveball.

Now that spring is here, I notice that whether certain windows are open or closed makes the difference between stability and constant hangs. Will have to power down (depending on price) or break out the AC units soon!


bomberb17
April 09, 2014, 07:50:40 AM
 #292

When I run it manually, I don't receive any mail (I have Gmail).
Waiting to see if I receive one when cgminer crashes.

Check your configuration options.  This is the default:

Code:
# send email when running script by hand (no or yes)
set mail(notify_on_manual_runs) "no"

Yes, I have that option enabled. Mail does arrive when cgminer hangs, though.
I also have another question. I added the line
*/2 * * * *        root    /home/user/cgmon.tcl >/dev/null 2>&1
to /etc/crontab as in the instructions. While the script runs OK most of the time, I see that sometimes it does not. For example, this is the output of the log:
Code:
bomberb17@bomberb17-ltcminer:~$ tail -n 20 cgmon.log
Apr 09 02:42:02 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:44:02 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:46:02 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:48:01 bomberb17-ltcminer - GPU 0 Shares accepted since last run:  65  (5.43 shares/min)
Apr 09 02:48:01 bomberb17-ltcminer - GPU 1 Shares accepted since last run:  68  (5.68 shares/min)
Apr 09 02:48:01 bomberb17-ltcminer - GPU 2 Shares accepted since last run:  62  (5.18 shares/min)
Apr 09 02:48:01 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:50:01 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:52:02 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:54:01 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:56:01 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 02:58:01 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 03:00:02 bomberb17-ltcminer - GPU 0 no accepted shares in 721 seconds. GPU probably hung.
Apr 09 03:02:02 bomberb17-ltcminer - cgminer API is not enabled/responding.  Restart cgminer with '--api-listen' or check the status of mining pools.
Apr 09 03:04:04 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 03:06:03 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 03:08:04 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 03:10:04 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 03:12:03 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
Apr 09 03:14:03 bomberb17-ltcminer - cgmon 1.0.8 - cgminer running and all GPUs healthy.
And I received the email at 03:00 AM.
While it restarted my computer and cgminer continued to run fine, the script stopped running every 2 minutes for some reason. I also tried adding the line via crontab -e, but nothing changed.
Any ideas?
My OS is Xubuntu 12.10.
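
To check whether cron really skipped runs, the timestamps in cgmon.log can be compared against the expected 2-minute spacing. The sketch below is only an illustration: find_gaps is a made-up helper, it assumes each log line starts with a 15-character stamp such as "Apr 09 02:42:02" with English month names, and the year has to be supplied because the log does not record it. One thing to keep in mind when moving the entry: lines in /etc/crontab carry a user field (root above), while lines added with crontab -e do not, so the same line pasted there unchanged would make cron treat "root" as the command.

```python
from datetime import datetime, timedelta


def find_gaps(log_lines, year, max_gap=timedelta(minutes=3)):
    """Return (previous, current) timestamp pairs that are further apart than max_gap."""
    stamps = []
    for line in log_lines:
        try:
            # The first 15 characters are assumed to look like "Apr 09 02:42:02".
            stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
        except ValueError:
            continue  # skip lines that do not start with a timestamp
    gaps = []
    for previous, current in zip(stamps, stamps[1:]):
        if current - previous > max_gap:
            gaps.append((previous, current))
    return gaps
```

Usage could look like find_gaps(open("cgmon.log"), 2014); an empty list means no run was more than three minutes after the previous one.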
bomberb17
April 09, 2014, 07:55:49 AM
 #293

See this also:
Code:
bomberb17@bomberb17-ltcminer:~$ ./cgmon.tcl
invalid bareword "exited"
in expression "229 - exited";
should be "$exited" or "{exited}" or "exited(...)" or ...
    (parsing expression "229 - exited")
    invoked from within
"expr  $current_accepted($n) -  $previous_accepted($n)"
    (procedure "check_status" line 197)
    invoked from within
"check_status"
    (file "./cgmon.tcl" line 602)
jdape (OP)
April 09, 2014, 06:23:25 PM
 #294

Can you please paste the contents of /tmp/accepted_count?

bomberb17
April 10, 2014, 04:46:48 AM
 #295

Can you please paste the contents of /tmp/accepted_count?

Code:
bomberb17@bomberb17-ltcminer:~$ cat /tmp/accepted_count
1127 1397104922
1151 1397104922
1146 1397104922

I have restarted my rig and now cgmon works ok again.
However I have seen cgminer hanging many times without cgmon catching it, and I suspect that this "bug" (which happens once in a while) is the reason.
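
The Tcl error earlier in the thread ("229 - exited") shows that one operand of the subtraction was the word "exited" rather than a number. A minimal sketch of the kind of guard that would avoid it is below, written in Python rather than Tcl. It is not cgmon's actual code: the function names are made up, it assumes each line of /tmp/accepted_count holds an accepted-share count followed by a Unix timestamp (as in the paste above), and the previous reading used in the usage note is hypothetical.

```python
def read_pairs(text):
    """Split the file contents into (count, timestamp) string pairs, one per line."""
    return [tuple(line.split()) for line in text.splitlines() if line.strip()]


def to_int(token):
    """Return token as an int, or None if it is not a plain integer."""
    try:
        return int(token)
    except ValueError:
        return None


def shares_since(previous, current):
    """Return (new_shares, shares_per_minute), or None if the readings are unusable."""
    if len(previous) != 2 or len(current) != 2:
        return None
    values = [to_int(token) for token in (*previous, *current)]
    if None in values:
        return None  # e.g. a reading of "exited" instead of a number
    prev_count, prev_time, cur_count, cur_time = values
    elapsed = cur_time - prev_time
    if elapsed <= 0 or cur_count < prev_count:
        return None  # clock went backwards or the counter was reset
    new_shares = cur_count - prev_count
    return new_shares, round(new_shares * 60 / elapsed, 2)
```

With a hypothetical previous reading of ("1062", "1397104202") and the first pasted line as the current one, shares_since returns (65, 5.42); with "exited" in place of a count it returns None instead of raising an error.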
freskhu
April 10, 2014, 09:37:10 AM
 #296

Hi there,
I have this script running on one rig and I am trying to put it on the others.
I run them with BAMT 1.3. When I first installed it I had this error, but I don't remember how I fixed it!

I got this in the log:
http://prntscr.com/38nkfi

In the config file I have the right path:
http://prntscr.com/38nkt4

Any help?
freskhu
April 10, 2014, 09:44:15 AM
 #297

When running it manually:

http://prntscr.com/38nlyx

Ninja edit: I solved this error; the path to the log was missing a "/".

freskhu
April 10, 2014, 09:55:49 AM
 #298

If I run the script manually it works, but it won't run on its own! How do I fix this?
ivanlabrie
April 10, 2014, 02:09:11 PM
 #299

Hi, do I need to add this to, say, BAMT/SMOS/NotSMOS 1.3 or PiMP to have the rig reboot when sick devices are found?
jdape (OP)
April 10, 2014, 06:05:12 PM
 #300

Hi, do I need to add this to, say, BAMT/SMOS/NotSMOS 1.3 or PiMP to have the rig reboot when sick devices are found?

I haven't used those, but I believe so.
