jaebird
Member
Offline
Activity: 79
Merit: 10
|
|
July 20, 2011, 06:37:14 PM |
|
As a developer, I'd be reluctant to add network detection failure to my mining software, just because it would seem out of the core functionality of Smartcoin itself and adds bloat, especially because Jon is doing all of this in bash scripting. I would leave it up to the user to figure out if they have a network problem or not. GPU failure detection is within the core, so it makes sense to have Smartcoin do that and then provide hooks to allow the user to take actions.
I think you could argue it either way. Mining cannot be done if the GPU is locked or the network is unavailable. Both are edge conditions outside the normal operation of smartcoin, but both have the potential to reduce mining effectiveness.
BTW, it appears that internet connectivity was not what caused my lockup detection issue. It was the accidental rollback of SVN that Jon mentioned several posts back.
@jondecker76, when will lockup.sh get called? If this script exists, is it the only thing smartcoin calls? If the script is not there, does it default to killing smartcoin?
I've noticed that when a GPU locks up, the temps drop even though the card reports 99% utilization... this is another indicator that the GPU is locked. When the miner is idle (not locked), by contrast, both temps and % utilization are down. I'm thinking of a pre_lockup_detection hook script that smartcoin calls and that can return a result. This hook script could also check for connectivity.
Thanks, jaebird
|
|
|
|
Rob P.
|
|
July 20, 2011, 07:06:44 PM |
|
Jon --
I'm adding a 4th card to my motherboard and having some heat issues I'm trying to sort out. I'm watching the temps and as they crest 80C I want to shut the GPU's miner off.
I went under "Configure Devices" and edited GPU2, just took the defaults, but when it asked me if I wanted to disable it, I chose "y". This was while all four GPUs were mining.
When I went back to the miner screen, GPU2 had been removed from the top temp and load display. However, GPU2 was still listed under the mining display and was clearly still mining.
Not sure if this is a bug or intended, but I would think that both displays should be consistent. I would also LOVE a way to just Idle a worker easily, so I can make adjustments. I find I basically have to kill Smartcoin, make my changes, and if one of the GPUs is still too warm, I have to shut everything down again. I know you mentioned an IDLE profile, which would work, but my preference would be to just Idle an individual worker, or all workers, which would provide better flexibility.
Just a thought.
|
--
If you like what I've written here, consider tipping the messenger: 1GZu4CtHa6ai8iWoWiVFxV5VVoNte4SkoG
If you don't like what I've written, send me a Tip and I'll stop talking.
|
|
|
irishmick
|
|
July 20, 2011, 10:22:24 PM |
|
I want to donate and currently do, but can you lower the aggression level you have set for the donation period? My machines basically are frozen when donation runs.
|
|
|
|
jondecker76 (OP)
|
|
July 20, 2011, 10:25:00 PM |
|
As a developer, I'd be reluctant to add network detection failure to my mining software, just because it would seem out of the core functionality of Smartcoin itself and adds bloat, especially because Jon is doing all of this in bash scripting. I would leave it up to the user to figure out if they have a network problem or not. GPU failure detection is within the core, so it makes sense to have Smartcoin do that and then provide hooks to allow the user to take actions.
I think you could argue it either way. Mining cannot be done if the GPU is locked or the network is unavailable. Both are edge conditions outside the normal operation of smartcoin, but both have the potential to reduce mining effectiveness. BTW, it appears that internet connectivity was not what caused my lockup detection issue. It was the accidental rollback of SVN that Jon mentioned several posts back. @jondecker76, when will lockup.sh get called? If this script exists, is it the only thing smartcoin calls? If the script is not there, does it default to killing smartcoin? I've noticed that when a GPU locks up, the temps drop even though the card reports 99% utilization... this is another indicator that the GPU is locked. When the miner is idle (not locked), by contrast, both temps and % utilization are down. I'm thinking of a pre_lockup_detection hook script that smartcoin calls and that can return a result. This hook script could also check for connectivity. Thanks, jaebird
The lockup.sh script gets called after a lockup condition is detected, just before the miners are restarted, but only if the script exists. So, lockup script or not, smartcoin will continue to restart the miner instances each time a lockup is detected. It is then the user's responsibility to act on this event in their lockup.sh script if they want to (including stopping smartcoin with the 'smartcoin --kill' command). It would be easy to ping a known server from the lockup.sh script and decide for yourself what to do if the Internet is down. For example, if the Internet is down, kill smartcoin and run a loop pinging an Internet server every minute, then restart smartcoin when the Internet is back online.
Regarding detecting lockup via GPU temperature, I think it would be much harder than it appears. For example, my mining rig has 3 cards. One of them runs at about 55 degrees, another at around 68 degrees and another at around 80 degrees. There is so much variance (because of individual cases of airflow, whether the card is sandwiched between other cards or on the end, etc.) that I think it would be very hard to implement in a general sense.
Also, I still think the current lockup detection scheme works fine for failed internet connections. If the internet goes down, smartcoin will continue to restart the miners every 5 minutes or so until it comes back, which does no harm.
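The ping-and-wait idea described above could be sketched roughly as follows. This is only an illustration, not code shipped with smartcoin: the function names, the ping target, and the assumption that running smartcoin with no arguments starts it again are all hypothetical. Only 'smartcoin --kill' comes from the post itself.

```shell
#!/usr/bin/env bash
# Hypothetical lockup.sh sketch: stop mining while the network is down.

PING_HOST="${PING_HOST:-8.8.8.8}"     # assumption: any host you trust to be reachable
RETRY_SECS="${RETRY_SECS:-60}"        # "every minute", as suggested in the post
SMARTCOIN="${SMARTCOIN:-smartcoin}"   # assumption: smartcoin is on the PATH

# Succeeds (returns 0) if one ping to PING_HOST gets a reply within 5 seconds.
network_up() {
    ping -c 1 -W 5 "$PING_HOST" >/dev/null 2>&1
}

# Intended to be run when smartcoin invokes lockup.sh after a detected lockup.
handle_lockup() {
    if network_up; then
        return 0                      # network is fine; let smartcoin restart the miners
    fi
    "$SMARTCOIN" --kill               # documented in the post above
    until network_up; do
        sleep "$RETRY_SECS"
    done
    "$SMARTCOIN"                      # assumption: no arguments means "start smartcoin"
}

# A real lockup.sh would end with a call to: handle_lockup
```

Keeping the check in its own function makes it easy to swap ping for some other test, such as fetching a page from the pool.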
|
|
|
|
jondecker76 (OP)
|
|
July 20, 2011, 10:37:03 PM |
|
Jon --
I'm adding a 4th card to my motherboard and having some heat issues I'm trying to sort out. I'm watching the temps and as they crest 80C I want to shut the GPU's miner off.
I went under "Configure Devices" and edited GPU2, just took the defaults, but when it asked me if I wanted to disable it, I chose "y". This was while all four GPUs were mining.
When I went back to the miner screen, GPU2 had been removed from the top temp and load display. However, GPU2 was still listed under the mining display and was clearly still mining.
Not sure if this is a bug or intended, but I would think that both displays should be consistent. I would also LOVE a way to just Idle a worker easily, so I can make adjustments. I find I basically have to kill Smartcoin, make my changes, and if one of the GPUs is still too warm, I have to shut everything down again. I know you mentioned an IDLE profile, which would work, but my preference would be to just Idle an individual worker, or all workers, which would provide better flexibility.
Just a thought.
Yes, I need to start adding the code that syncs "live" changes. The lower-level stuff for this is already in place, but I haven't gotten around to making use of it yet. I'll be adding this soon though, and the principle behind it is quite simple: after an Add/Edit/Delete, a check will run to see if whatever was added/edited/deleted has any part in the current profile, and if so, force a reload of the miners.
I know this is personal preference, but I don't worry at all until over 90 degrees - I think cards will run along just fine in the 80-degree range. I do plan on adding some temperature-related functions eventually - I'm just not sure of the best way to go about it yet (controlling fan speed? dynamically enabling/disabling the GPU? etc.).
Also, regarding disabling workers, the workers table in the database already has a 'disabled' field (though it isn't used yet). Eventually, when the live changes syncing is in place and I give access to the disabled field through add/edit workers, you will have the functionality that you were thinking about!
|
|
|
|
jaebird
|
|
July 20, 2011, 10:37:16 PM |
|
@jondecker76
Yeah, my box has been running fine even with bad inet, so it's not really necessary to do something more custom than what you already have going! My cards also vary between each other, but each one is quite consistent with itself. So the only way you could use temp data is for each GPU miner to have a temp profile of idle and working temperatures with some slack built in. That said, I think your lockup detection is working fine. For me, I'll just add a system reboot on lockup, since a locked-up card can't be recovered AFAIK.
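A per-GPU temperature profile with slack might look something like the sketch below. It is purely illustrative: the function name, the 90% load cutoff, and the way the readings are obtained are all assumptions, and smartcoin does not provide anything like this.

```shell
#!/usr/bin/env bash
# Hypothetical check: a card that reports high load but runs well below its
# usual working temperature is treated as "probably locked".
#
# Usage: is_probably_locked LOAD_PCT TEMP_C WORKING_TEMP_C SLACK_C
is_probably_locked() {
    local load=$1 temp=$2 working=$3 slack=$4
    # Assumption: 90% or more counts as "reporting full utilization".
    [ "$load" -ge 90 ] && [ "$temp" -lt $((working - slack)) ]
}

# Example profile for one card: normally 80C under load, 10C of slack.
# if is_probably_locked 99 60 80 10; then echo "GPU looks locked"; fi
```

Each card would need its own working temperature, which is exactly the per-card variance discussed above.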
Thanks, jaebird
|
|
|
|
jaebird
|
|
July 20, 2011, 10:41:38 PM |
|
Jon --
I'm adding a 4th card to my motherboard and having some heat issues I'm trying to sort out. I'm watching the temps and as they crest 80C I want to shut the GPU's miner off.
I went under "Configure Devices" and edited GPU2, just took the defaults, but when it asked me if I wanted to disable it, I chose "y". This was while all four GPUs were mining.
When I went back to the miner screen, GPU2 had been removed from the top temp and load display. However, GPU2 was still listed under the mining display and was clearly still mining.
Not sure if this is a bug or intended, but I would think that both displays should be consistent. I would also LOVE a way to just Idle a worker easily, so I can make adjustments. I find I basically have to kill Smartcoin, make my changes, and if one of the GPUs is still too warm, I have to shut everything down again. I know you mentioned an IDLE profile, which would work, but my preference would be to just Idle an individual worker, or all workers, which would provide better flexibility.
Just a thought.
Yes, I need to start adding the code that syncs "live" changes. The lower-level stuff for this is already in place, but I haven't gotten around to making use of it yet. I'll be adding this soon though, and the principle behind it is quite simple: after an Add/Edit/Delete, a check will run to see if whatever was added/edited/deleted has any part in the current profile, and if so, force a reload of the miners. I know this is personal preference, but I don't worry at all until over 90 degrees - I think cards will run along just fine in the 80-degree range. I do plan on adding some temperature-related functions eventually - I'm just not sure of the best way to go about it yet (controlling fan speed? dynamically enabling/disabling the GPU? etc.). Also, regarding disabling workers, the workers table in the database already has a 'disabled' field (though it isn't used yet). Eventually, when the live changes syncing is in place and I give access to the disabled field through add/edit workers, you will have the functionality that you were thinking about!
I also run two of my cards in the 80-83 range and things seem fine. I've successfully modified overclock and fan speed settings using the command-line option of AMDOverdriveCtrl while GPUs were mining. Whatever you do for temp functions, I think you should use hook scripts that are user customizable. At least those are my thoughts...
|
|
|
|
jondecker76 (OP)
|
|
July 20, 2011, 10:42:30 PM |
|
I want to donate and currently do, but can you lower the aggression level you have set for the donation period? My machines basically are frozen when donation runs.
The donation profile uses the miner flagged as default. You can change the default miner and/or the options used by that miner (such as aggression) in Configure Miners -> Edit. The special donation profile was written before failover was implemented. In the future, I'm thinking of rewriting the donation profile to use failover (instead of multiple instances), which would also help.
|
|
|
|
jondecker76 (OP)
|
|
July 20, 2011, 10:47:40 PM |
|
I also run two of my cards in the 80-83 range and things seem fine. I've successfully modified overclock and fan speed settings using the command-line option of AMDOverdriveCtrl while GPUs were mining. Whatever you do for temp functions, I think you should use hook scripts that are user customizable. At least those are my thoughts...
Yeah, I'm thinking along the same lines, as there is just an incredible amount of variance to deal with, and each user will probably have a unique set of needs. Also, the hook scripts are going to get a nice overhaul soon, such as having parameters passed into them when they are launched. For instance, the machine number could be passed as an argument to the lockup.sh script so you can have different cases for different machines when multi-machine support is in, or the GPU temperatures could be passed in to a temperature.sh script to make things easier.
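If hook scripts do start receiving parameters, a lockup.sh could branch on them along these lines. This is a sketch under the assumption that the machine number arrives as the first argument; that interface was only planned at the time of the post, and the function name and action labels are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical: assumes a future smartcoin passes the machine number as $1.

# Prints the action chosen for a given machine number.
lockup_action() {
    local machine="${1:-unknown}"
    case "$machine" in
        1) echo "reboot" ;;   # e.g. a headless rig that is safe to reboot
        2) echo "notify" ;;   # e.g. a desktop where a reboot would be disruptive
        *) echo "ignore" ;;   # unknown machine: leave it to smartcoin's own restart
    esac
}

# A real lockup.sh would act on the result, for example:
# action=$(lockup_action "$1")
```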
|
|
|
|
irishmick
|
|
July 20, 2011, 11:05:47 PM |
|
I want to donate and currently do, but can you lower the aggression level you have set for the donation period? My machines basically are frozen when donation runs.
The donation profile uses the miner flagged as default. You can change the default miner and/or the options used by that miner (such as aggression) in Configure Miners -> Edit. The special donation profile was written before failover was implemented. In the future, I'm thinking of rewriting the donation profile to use failover (instead of multiple instances), which would also help.
Thanks Jon, I'll double check that.
|
|
|
|
jaebird
|
|
July 20, 2011, 11:09:39 PM |
|
I also run two of my cards in the 80-83 range and things seem fine. I've successfully modified overclock and fan speed settings using the command-line option of AMDOverdriveCtrl while GPUs were mining. Whatever you do for temp functions, I think you should use hook scripts that are user customizable. At least those are my thoughts...
Yeah, I'm thinking along the same lines, as there is just an incredible amount of variance to deal with, and each user will probably have a unique set of needs. Also, the hook scripts are going to get a nice overhaul soon, such as having parameters passed into them when they are launched. For instance, the machine number could be passed as an argument to the lockup.sh script so you can have different cases for different machines when multi-machine support is in, or the GPU temperatures could be passed in to a temperature.sh script to make things easier.
Yeah, that would be awesome. Btw, I've noticed a trailing "t" on the header line (example: "-----------------t"). Not sure if this is my terminal version or a side effect of turning off line wrap. Just thought I'd mention it.
|
|
|
|
jl
Newbie
Offline
Activity: 10
Merit: 0
|
|
July 21, 2011, 01:28:07 AM |
|
I was getting stale shares from 9%-38% with SmartCoin w/ my 4 rigs/workers - now back hooked up to pool straight, I'm at 0.10% ...is this a known side-effect of running SmartCoin? I don't know how much extra is going on behind the scenes with processes, etc. I do know how much I dig SmartCoin though - will keep updating.
|
|
|
|
jondecker76 (OP)
|
|
July 21, 2011, 01:39:47 AM |
|
I was getting stale shares from 9%-38% with SmartCoin w/ my 4 rigs/workers - now back hooked up to pool straight, I'm at 0.10% ...is this a known side-effect of running SmartCoin? I don't know how much extra is going on behind the scenes with processes, etc. I do know how much I dig SmartCoin though - will keep updating.
This isn't a known side effect of smartcoin. I normally run 3 instances per card with less than 0.05% rejected shares. It's normal for the rejection percentage to start a little higher and then settle out sometimes, though. Can you give me some information on how you have things set up? How many instances are you running per card? And to how many different pools? Which miner? etc.
|
|
|
|
Jen4538
|
|
July 21, 2011, 02:17:54 AM |
|
smartcoin is very much fun, just turned up aggression to 12
now things are looking much better
Jen
|
|
|
|
jl
|
|
July 21, 2011, 02:24:17 AM Last edit: July 21, 2011, 02:34:44 AM by jl |
|
I was getting stale shares from 9%-38% with SmartCoin w/ my 4 rigs/workers - now back hooked up to pool straight, I'm at 0.10% ...is this a known side-effect of running SmartCoin? I don't know how much extra is going on behind the scenes with processes, etc. I do know how much I dig SmartCoin though - will keep updating.
This isn't a known side effect of smartcoin. I normally run 3 instances per card with less than 0.05% rejected shares. It's normal for the rejection percentage to start a little higher and then settle out sometimes, though. Can you give me some information on how you have things set up? How many instances are you running per card? And to how many different pools? Which miner? etc.
I kept things running for over 8 hours with only 1 instance per card to 1 pool, using phoenix, and was getting those high numbers with three 2-card rigs and one 1-card rig. Open to suggestions for multi-instance/pool. Can anyone elaborate on how stales form? Could it be from other (home) network load/usage, computer system processes, the pool, or something else?
|
|
|
|
jondecker76 (OP)
|
|
July 21, 2011, 02:42:33 AM |
|
I was getting stale shares from 9%-38% with SmartCoin w/ my 4 rigs/workers - now back hooked up to pool straight, I'm at 0.10% ...is this a known side-effect of running SmartCoin? I don't know how much extra is going on behind the scenes with processes, etc. I do know how much I dig SmartCoin though - will keep updating.
This isn't a known side effect of smartcoin. I normally run 3 instances per card with less than 0.05% rejected shares. It's normal for the rejection percentage to start a little higher and then settle out sometimes, though. Can you give me some information on how you have things set up? How many instances are you running per card? And to how many different pools? Which miner? etc.
I kept things running for over 8 hours with only 1 instance per card to 1 pool, using phoenix, and was getting those high numbers with three 2-card rigs and one 1-card rig. Open to suggestions for multi-instance/pool. Can anyone elaborate on how stales form? Could it be from other (home) network load/usage, computer system processes, the pool, or something else?
Can you compare the launch string that you use manually (when you got decent rejection rates) to the launch string defined in Configure Miners->Edit->Phoenix? Perhaps you need to edit the launch string so that you use the same phoenix options that you normally use (the default options may not be playing nice on your setup).
|
|
|
|
jondecker76 (OP)
|
|
July 21, 2011, 02:47:43 AM |
|
Update r490e now available:
- Fixed a small bug in the lockup.sh custom script execution
- There are 3 new settings that can be fine-tuned:
* Lockup threshold (default is 50 iterations). The smaller the number, the faster the lockup detection will trip.
* Failover threshold (default is 10 iterations). The smaller the number, the faster the failover detection will trip.
* Failover Rejection (default is 10%). Increments the failover detection counter each iteration if the rejection percentage is above this number (INTEGER NUMBER ONLY!)
NOTE: After changing these settings, smartcoin must be restarted before the new values take effect.
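To make the interaction between the Failover Rejection and Failover threshold settings concrete, here is a rough sketch of a counter of that kind. It is not smartcoin's actual code: the variable and function names are hypothetical, and the post does not say whether or when the real counter resets, so this sketch never resets it. The integer-only restriction would fit with bash's integer comparisons, although that is a guess.

```shell
#!/usr/bin/env bash
# Hypothetical model of the failover settings described above.

FAILOVER_THRESHOLD=10   # iterations before failover trips (default from the post)
FAILOVER_REJECTION=10   # rejection percentage, integer only (default from the post)
failover_count=0

# update_failover REJECT_PCT
# Counts one iteration if the rejection percentage is above the limit.
# Returns 0 once the counter has reached the threshold.
update_failover() {
    local pct=$1
    if [ "$pct" -gt "$FAILOVER_REJECTION" ]; then
        failover_count=$((failover_count + 1))
    fi
    [ "$failover_count" -ge "$FAILOVER_THRESHOLD" ]
}

# Example: call once per status-loop iteration with the current percentage.
# if update_failover 15; then echo "failover would trip"; fi
```

With the defaults, ten iterations above 10% rejected would trip the failover in this model; lowering either number makes it trip sooner.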
|
|
|
|
jondecker76 (OP)
|
|
July 21, 2011, 08:16:37 AM |
|
Update r491e now available:
- Removed injected delay in the status loop - this pretty much doubles the refresh rate. Future optimizations will probably increase the refresh rate even more.
If everything looks stable for the next day or so, I'll trip the stable update flag.
|
|
|
|
jondecker76 (OP)
|
|
July 21, 2011, 10:52:55 AM |
|
Update r495e now available:
- Failed SQL queries will now wait 1/100 of a second between retries
- Added a --reload command line argument. 'smartcoin --reload' run from the command line will now reload all the miner instances. The underlying reload structure will also be used soon to reload miners whenever configurations are changed from the control screen. Another way to use this from your own script is to create a file /tmp/smartcoin.reload containing a message to display. For example: echo "Reloading because I have the power to do so!" > /tmp/smartcoin.reload
- Started refactoring some code to make reinitialization of global variables possible (many of these values are currently grabbed live from the database as needed). Once this refactor is complete, these values will be read in only once from the database unless I choose to reinitialize them at some point in the code. Basically just some framework changes to facilitate future features and upcoming optimizations.
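For use from a user script, the reload-file mechanism described in this update could be wrapped in a small helper like the one below. The path /tmp/smartcoin.reload comes from the update notes; the function name, the default message, and the overridable RELOAD_FILE variable are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper around the /tmp/smartcoin.reload trigger file.

RELOAD_FILE="${RELOAD_FILE:-/tmp/smartcoin.reload}"   # path from the update notes

# request_reload [MESSAGE]
# Writes MESSAGE to the trigger file; smartcoin is expected to display it
# and reload the miner instances.
request_reload() {
    local reason="${1:-Reload requested by user script}"
    printf '%s\n' "$reason" > "$RELOAD_FILE"
}

# Example:
# request_reload "Reloading after changing fan speeds"
```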
|
|
|
|
pickle
Newbie
Offline
Activity: 19
Merit: 0
|
|
July 21, 2011, 12:39:01 PM |
|
I just installed SmartCoin today and after playing with it, I have to say that it looks really great. You've really put a lot of polish on it.
Unfortunately I'm not having any luck getting my lockups detected. There's no relevant information in the SmartCoin logs, and the SmartCoin console just seems to stop running altogether. However, I can disconnect from screen and issue other commands so I know the machine isn't completely unresponsive. I can also run ./lockup.sh manually and it works fine.
What should I look at? Is there any troubleshooting information I can provide? Thanks!
|
|
|
|
|