jondecker76 (OP)
|
|
July 21, 2011, 02:18:51 PM |
|
I just installed SmartCoin today and after playing with it, I have to say that it looks really great. You've really put a lot of polish on it.
Unfortunately I'm not having any luck getting my lockups detected. There's no relevant information in the SmartCoin logs, and the SmartCoin console just seems to stop running altogether. However, I can disconnect from screen and issue other commands so I know the machine isn't completely unresponsive. I can also run ./lockup.sh manually and it works fine.
What should I look at? Is there any troubleshooting information I can provide? Thanks!
A lot depends on which revision you are running. There have been several fixes and improvements in the last few revisions. If I were you, I would first switch to the experimental branch (4) Edit Settings -> Development Branch to Follow), then run an update (11) Update Smartcoin), then restart smartcoin by first killing it (2) Kill smartcoin) and starting it again. That will bring you to the most current revision (r495e at this time).
Ok, once we know you are current, let's review what a lockup is so that we are on the same page. A lockup is almost always caused by either loss of Internet or stability problems from an overly aggressive overclock. A locked GPU in this case still responds to aticonfig commands, but you will notice that the miner just hangs without ever changing (you can view the miners individually if you want by typing .
Another thing to point out is that at default settings, it can take 5-10 minutes before the lockup is detected. You can speed up your testing by going to Edit Settings and changing the Lockup Threshold value to something lower (a value of 10 will take about a minute or two).
Let me know how things work out for you.
|
|
|
|
jondecker76 (OP)
|
|
July 21, 2011, 04:47:39 PM |
|
One of my miners updated to the latest 495e but doesn't have the lockup threshold setting in the settings. It keeps rebooting (using the custom lockup.sh). Is there any way to back up the profiles/miners and reinstall smartcoin clean?
They should have been added at r490. Did you use the built-in update system? I ask because simply running svn update can cause this behavior, since the patch system won't get executed (which is why I advise using only the built-in update mechanism). Do you have the failover threshold or failover rejection rate settings?
You can fix your database by running these from the terminal for the missing settings:
sqlite3 ~/.smartcoin/smartcoin.db "INSERT INTO settings (data,value,description) VALUES ('failover_threshold','10','Failover Threshold');"
sqlite3 ~/.smartcoin/smartcoin.db "INSERT INTO settings (data,value,description) VALUES ('failover_rejection','10','Failover on rejection % higher than');"
sqlite3 ~/.smartcoin/smartcoin.db "INSERT INTO settings (data,value,description) VALUES ('lockup_threshold','50','Lockup Threshold');"
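If you are not sure which of the three rows are already present, a guarded insert avoids creating duplicates. This is only a minimal sketch: it assumes the settings table has the data, value and description columns used in the commands above, and the function names and the SMARTCOIN_DB variable are made up for illustration and are not part of smartcoin.

```shell
#!/bin/sh
# Assumed database location, taken from the commands above.
DB="${SMARTCOIN_DB:-$HOME/.smartcoin/smartcoin.db}"

# Print an INSERT statement that adds a settings row only when no row
# with the same key (the "data" column) exists yet.
# $1 = key, $2 = default value, $3 = description
insert_if_missing_sql() {
    printf "INSERT INTO settings (data,value,description) SELECT '%s','%s','%s' WHERE NOT EXISTS (SELECT 1 FROM settings WHERE data='%s');\n" \
        "$1" "$2" "$3" "$1"
}

# Hypothetical helper: feed the three guarded inserts to sqlite3.
add_missing_settings() {
    {
        insert_if_missing_sql failover_threshold 10 'Failover Threshold'
        insert_if_missing_sql failover_rejection 10 'Failover on rejection % higher than'
        insert_if_missing_sql lockup_threshold 50 'Lockup Threshold'
    } | sqlite3 "$DB"
}
```

To use it, copy ~/.smartcoin/smartcoin.db somewhere safe first, then call add_missing_settings. Rows that already exist are left untouched, so running it twice should be harmless.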
|
|
|
|
EnzoMatrix
|
|
July 21, 2011, 10:04:36 PM |
|
For the failover to work does every miner on the profile need to stop working .. ?
|
|
|
|
Jen4538
|
|
July 21, 2011, 10:11:30 PM |
|
For the failover to work does every miner on the profile need to stop working .. ?
For failover to work, the primary pool has to go down; then it moves to the next one, and so on, from my understanding.
Jen
|
|
|
|
Rob P.
|
|
July 21, 2011, 10:26:34 PM |
|
I know this is personal preference, but I don't worry at all until over 90 degrees - I think cards will run along just fine in the 80-degree range.
I also run two of my cards in the 80-83 range and things seem fine. I've successfully modified overclock and fan speed settings using the command-line option of AMDOverdriveCtrl while GPUs were mining. My 5830s will lock up and stop mining if their temps are over 85C for more than a minute or two. I think the problem is that the cards are so close together they just cannot vent properly and/or the fans aren't spinning. It's hard to tell. Someone else on the forums has the same setup as I do and he's stated that his cards run in the mid-to-upper 70s. I just can't figure out how. I just dropped that rig back down to 3 cards and they run in the high 60s with no external fans. But I add the 4th card in and two of them run in the high 80s (other two are in mid-70s) with an external fan blowing into the back of the cards. Very puzzling.
|
--
If you like what I've written here, consider tipping the messenger: 1GZu4CtHa6ai8iWoWiVFxV5VVoNte4SkoG
If you don't like what I've written, send me a Tip and I'll stop talking.
|
|
|
jondecker76 (OP)
|
|
July 22, 2011, 01:34:26 AM |
|
For the failover to work does every miner on the profile need to stop working .. ?
For failover, any single instance in the profile needs to be down for the specified number of iterations. It doesn't have to be all of the instances going down.
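The rule can be pictured as one counter per instance. The sketch below is an illustration only and is not smartcoin's actual code: the function names are hypothetical, and it assumes the iterations have to be consecutive, which the description above does not state explicitly.

```shell
#!/bin/sh
# Illustration only: this is not smartcoin's actual code.

# Print the new consecutive-down count for one instance.
# $1 = previous count, $2 = current status ("down" or anything else)
next_down_count() {
    if [ "$2" = "down" ]; then
        echo $(( $1 + 1 ))
    else
        echo 0
    fi
}

# Succeed when any one of the per-instance counts has reached the threshold.
# $1 = threshold, remaining arguments = one count per instance
should_failover() {
    threshold=$1
    shift
    for count in "$@"; do
        [ "$count" -ge "$threshold" ] && return 0
    done
    return 1
}
```

With a threshold of 10, a profile whose instances have counts 0, 10 and 2 would fail over, while counts of 0, 9 and 2 would not.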
|
|
|
|
jondecker76 (OP)
|
|
July 22, 2011, 01:37:46 AM |
|
I know this is personal preference, but I don't worry at all until over 90 degrees - I think cards will run along just fine in the 80-degree range.
I also run two of my cards in the 80-83 range and things seem fine. I've successfully modified overclock and fan speed settings using the command-line option of AMDOverdriveCtrl while GPUs were mining. My 5830s will lock up and stop mining if their temps are over 85C for more than a minute or two. I think the problem is that the cards are so close together they just cannot vent properly and/or the fans aren't spinning. It's hard to tell. Someone else on the forums has the same setup as I do and he's stated that his cards run in the mid-to-upper 70s. I just can't figure out how. I just dropped that rig back down to 3 cards and they run in the high 60s with no external fans. But I add the 4th card in and two of them run in the high 80s (other two are in mid-70s) with an external fan blowing into the back of the cards. Very puzzling.
My rig only has 3 cards - I'm still amazed at the variance in temperature that I get. My coolest card is in the 50s under full load, while the hottest one is in the low 80s. The thing that helped my cooling the most was prying the tops of the cards apart about 1/2" (I used plastic 2-liter bottle caps as spacers). Mine is not in a case, though, so prying the tops of the cards apart a little was very easy.
|
|
|
|
Jen4538
|
|
July 22, 2011, 02:27:39 AM |
|
I know this is personal preference, but I don't worry at all until over 90 degrees - I think cards will run along just fine in the 80-degree range.
I also run two of my cards in the 80-83 range and things seem fine. I've successfully modified overclock and fan speed settings using the command-line option of AMDOverdriveCtrl while GPUs were mining. My 5830s will lock up and stop mining if their temps are over 85C for more than a minute or two. I think the problem is that the cards are so close together they just cannot vent properly and/or the fans aren't spinning. It's hard to tell. Someone else on the forums has the same setup as I do and he's stated that his cards run in the mid-to-upper 70s. I just can't figure out how. I just dropped that rig back down to 3 cards and they run in the high 60s with no external fans. But I add the 4th card in and two of them run in the high 80s (other two are in mid-70s) with an external fan blowing into the back of the cards. Very puzzling.
My rig only has 3 cards - I'm still amazed at the variance in temperature that I get. My coolest card is in the 50s under full load, while the hottest one is in the low 80s. The thing that helped my cooling the most was prying the tops of the cards apart about 1/2" (I used plastic 2-liter bottle caps as spacers). Mine is not in a case, though, so prying the tops of the cards apart a little was very easy.
Cable risers are generally the best option if you're going to mine with more than 2 cards. Few motherboards have good spacing for 3 cards. With 4 cards, it's going to need the fans at 100% and extra fans blowing on them to try to keep them cool. With risers, I have 5 video cards on a rack with a high temp of 75C, I think.
Jen
|
|
|
|
Rob P.
|
|
July 22, 2011, 02:36:13 AM |
|
Cable risers are generally the best option if you're going to mine with more than 2 cards. Few motherboards have good spacing for 3 cards. With 4 cards, it's going to need the fans at 100% and extra fans blowing on them to try to keep them cool. With risers, I have 5 video cards on a rack with a high temp of 75C, I think.
Jen
Thanks Jen. I have the MSI 890FXA-GD70, so I can easily fit 3 cards, each with 1 full PCI bay between them. However, when I run 4 cards on the motherboard, they have zero space between them. I also need to run them in a case. I know, tons of "issues". Anyway, I don't want to hijack Jon's thread, so I'll start a new one asking for some ideas.
|
|
|
|
EnzoMatrix
|
|
July 22, 2011, 08:25:46 PM |
|
Hello,
I am not sure if it is something I am doing wrong, but I am unable to trigger the failover.
I have profiles set up to mine on 3 different servers of the same pool, and I have tried simulating an outage using the /etc/hosts file, but it never fails over. Is there any specific data that might help in locating this? It is a fresh smartcoin install.
Smartcoin r495s
Here is the failover order (1 was a deleted profile):
2) BTCGuild All
3) BTCGuild US
4) BTCGuild USWest
5) BTCGuild USEast
6) BitClockers
|
|
|
|
jondecker76 (OP)
|
|
July 23, 2011, 01:27:51 AM |
|
Hello,
I am not sure if it is something I am doing wrong, but I am unable to trigger the failover.
I have profiles set up to mine on 3 different servers of the same pool, and I have tried simulating an outage using the /etc/hosts file, but it never fails over. Is there any specific data that might help in locating this? It is a fresh smartcoin install.
Smartcoin r495s
Here is the failover order (1 was a deleted profile):
2) BTCGuild All
3) BTCGuild US
4) BTCGuild USWest
5) BTCGuild USEast
6) BitClockers
/etc/hosts is not the correct way to test this (I think it will work after a reboot, but it is still a real pain). Use iptables to fake a domain being down. Here is an example that adds a rule to block traffic to X8S:
Block:
sudo iptables -A OUTPUT -p tcp -m string --string "x8s.de" --algo kmp -j DROP
(In my experience, it can sometimes take 30 seconds or so before the packets start dropping and you see <<<DOWN>>> in the smartcoin display.)
To unblock it after testing, first get the rule's index number:
sudo iptables -L OUTPUT --line-numbers
Make a note of the number of the rule to delete, then remove the rule with:
sudo iptables -D OUTPUT #
(replace # above with the rule number to delete)
For testing failover, the first criterion is that an instance shows "<<<DOWN>>>" for the number of iterations listed in the settings (default 10). This takes about a minute or so...
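If looking up the rule number each time gets tedious, iptables also accepts the full rule specification with -D, so the same text can be used to add and to remove the block. A small sketch under that assumption; the function names are hypothetical, and the rule itself is the one from the example above.

```shell
#!/bin/sh
# Print the rule specification shared by the add and delete commands.
# $1 = string to match, e.g. the pool's host name
block_rule_spec() {
    printf '%s\n' "OUTPUT -p tcp -m string --string $1 --algo kmp -j DROP"
}

# $(...) is left unquoted on purpose so the spec splits into separate
# arguments; this assumes the host name contains no spaces.
block_pool()   { sudo iptables -A $(block_rule_spec "$1"); }
unblock_pool() { sudo iptables -D $(block_rule_spec "$1"); }
```

For example, block_pool x8s.de starts dropping matching packets, and unblock_pool x8s.de removes that same rule once testing is done.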
|
|
|
|
jondecker76 (OP)
|
|
July 23, 2011, 02:37:09 PM |
|
Update 496e/s available - Nothing new. It's been pretty quiet, so I'm bringing the stable version current with the experimental version.
|
|
|
|
krzynek1
|
|
July 23, 2011, 05:16:35 PM |
|
Hello, today on 2 of my 3 rigs smartcoin shut itself down because it detected a lockup condition. But the GPUs were not locked up: when I manually started smartcoin without rebooting the system, it worked fine, and all cards on those two rigs are submitting shares. I think the lockup was due to too many workers being fired (in my case 28) because the failover profile was used. I'm using smartcoin stable version 452. I don't know if it's possible, but I think it would be good if smartcoin first restarted the workers, then itself, and only if that fails, rebooted the whole system. But about those 28 workers: Jon, please add that exclude functionality to the failover profile.
|
|
|
|
jondecker76 (OP)
|
|
July 23, 2011, 05:53:08 PM |
|
Hello, today on 2 of my 3 rigs smartcoin shut itself down because it detected a lockup condition. But the GPUs were not locked up: when I manually started smartcoin without rebooting the system, it worked fine, and all cards on those two rigs are submitting shares. I think the lockup was due to too many workers being fired (in my case 28) because the failover profile was used. I'm using smartcoin stable version 452. I don't know if it's possible, but I think it would be good if smartcoin first restarted the workers, then itself, and only if that fails, rebooted the whole system. But about those 28 workers: Jon, please add that exclude functionality to the failover profile.
Just do an update - the newest stable version is r496. The lockup detection on failed profiles was eliminated in r456. Also, after running the update, you can tweak the sensitivity of failover and lockup detection in the Edit Settings menu.
Regarding rebooting the whole system, that is left up to the user. You can put a custom 'lockup.sh' script in the smartcoin directory and it will be run on the lockup event. Also, in the new version, all smartcoin does on a lockup is restart the miners. It will stay in this cycle of running until a lockup is detected, then restarting the miners (this takes care of the miner software locking up). If you look some posts back, you can read where some sample lockup scripts were posted - so with the new update it is totally up to you what extra actions happen on a lockup (reboot, send yourself an email, etc.).
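As a rough idea of what such a script could look like, here is a sketch of a lockup.sh that lets the miner restarts handle the first few lockups and only reboots after several have been recorded. The log path, the limit of 3, the function names, passwordless sudo, and smartcoin being on the PATH are all assumptions for illustration; this is not one of the sample scripts posted earlier.

```shell
#!/bin/sh
# Hypothetical lockup.sh sketch; paths and limits below are assumptions.
LOCKUP_LOG="${LOCKUP_LOG:-$HOME/.smartcoin/lockup_events.log}"
MAX_LOCKUPS="${MAX_LOCKUPS:-3}"

# Append a timestamped line to the log and print the number of lines in it.
# $1 = log file
record_lockup() {
    date '+%Y-%m-%d %H:%M:%S lockup detected' >> "$1"
    wc -l < "$1" | tr -d ' '
}

# Succeed when the recorded count has reached the limit.
# $1 = count, $2 = limit
needs_reboot() {
    [ "$1" -ge "$2" ]
}

# Record this lockup; once the limit is reached, clear the log,
# stop smartcoin, and reboot the machine.
handle_lockup() {
    count=$(record_lockup "$LOCKUP_LOG")
    if needs_reboot "$count" "$MAX_LOCKUPS"; then
        : > "$LOCKUP_LOG"
        smartcoin --kill
        sudo reboot
    fi
}

# handle_lockup   # uncomment when installing this as lockup.sh
```

The counter never expires in this sketch, so three lockups spread over weeks would also trigger a reboot; clearing the log from a cron job would be one way around that.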
|
|
|
|
krzynek1
|
|
July 23, 2011, 06:24:55 PM Last edit: July 23, 2011, 07:26:26 PM by krzynek1 |
|
Updating right now. Can you explain once more the flags used to run smartcoin? --kill, --restart, etc.
Edit: on one rig the update didn't go right. There was a database lock error, and after that the workers failed to launch correctly, so I must do a fresh install.
|
|
|
|
jondecker76 (OP)
|
|
July 24, 2011, 12:23:54 AM |
|
Updating right now. Can you explain once more the flags used to run smartcoin? --kill, --restart, etc.
Edit: on one rig the update didn't go right. There was a database lock error, and after that the workers failed to launch correctly, so I must do a fresh install.
Sorry about the database lock error - that was finally fixed in r459.
As for the options currently supported:
smartcoin --kill : kills a running instance of smartcoin if one exists (this is how you should shut down smartcoin from a custom lockup.sh script)
smartcoin --reload : forces a reload of all the miners
smartcoin --delay=# : delays # seconds before continuing. For example, smartcoin --delay=5 --kill will wait 5 seconds, then kill smartcoin.
|
|
|
|
jondecker76 (OP)
|
|
July 24, 2011, 12:59:01 AM |
|
Update r499e now available!
- The miners now reload automatically whenever a configuration change is made (i.e. you edit settings, you change worker information, you change miner information, etc.).
- This will keep things in sync like they should be, though I expect a couple of bugs to pop up (for instance, when you disable a device, the miners reload, but disabling a device currently only disables temperature/load readings and not the actual profile. I'll be fixing this soon).
Please post if you find any other issues associated with this change!
|
|
|
|
Jen4538
|
|
July 24, 2011, 01:58:08 AM |
|
Update r499e now available! - The miners now reload automatically whenever a configuration change is made (I.e. you edit settings, you change worker information, you change miner information, etc...
- This will keep things in sync like they should be, though I expect for a couple of bugs to pop up (for instance, when you disable a device, miners reload - but disabling a device currently only disables temperature/load readings and not the actual profile.. I'll be fixing this soon) Please post if you find any other things associated with this change!
Smartcoin is working very well for me, but each time I make a change I just need to reboot, as I still can't figure out any other way. This revision may fix that problem for me, although it doesn't bother me as much as doing it manually all the time.
Jen
|
|
|
|
krzynek1
|
|
July 24, 2011, 07:38:51 AM Last edit: July 24, 2011, 02:49:47 PM by krzynek1 |
|
Updating right now. Can you explain once more the flags used to run smartcoin? --kill, --restart, etc.
Edit: on one rig the update didn't go right. There was a database lock error, and after that the workers failed to launch correctly, so I must do a fresh install.
Sorry about the database lock error - that was finally fixed in r459.
As for the options currently supported:
smartcoin --kill : kills a running instance of smartcoin if one exists (this is how you should shut down smartcoin from a custom lockup.sh script)
smartcoin --reload : forces a reload of all the miners
smartcoin --delay=# : delays # seconds before continuing. For example, smartcoin --delay=5 --kill will wait 5 seconds, then kill smartcoin.
Thank you for the information. After a fresh install on that one rig I have an error popping up: http://naforum.zapodaj.net/thumbs/3584d3a1288a.jpg
Edit: I have high CPU usage on all rigs. smartcoin shows > ~21 %, but htop gives 0.76 and more (on my last version, r45x, smartcoin showed 7-8 %). That didn't occur on previous releases. I'm now using r496s. Is there an option to manually roll back an update, or to fresh install a previous version, just to double-check whether it is my system's fault?
|
|
|
|
jondecker76 (OP)
|
|
July 24, 2011, 03:21:50 PM Last edit: July 24, 2011, 04:28:38 PM by jondecker76 |
|
Updating right now. Can you explain once more the flags used to run smartcoin? --kill, --restart, etc.
Edit: on one rig the update didn't go right. There was a database lock error, and after that the workers failed to launch correctly, so I must do a fresh install.
Sorry about the database lock error - that was finally fixed in r459.
As for the options currently supported:
smartcoin --kill : kills a running instance of smartcoin if one exists (this is how you should shut down smartcoin from a custom lockup.sh script)
smartcoin --reload : forces a reload of all the miners
smartcoin --delay=# : delays # seconds before continuing. For example, smartcoin --delay=5 --kill will wait 5 seconds, then kill smartcoin.
Thank you for the information. After a fresh install on that one rig I have an error popping up.
Edit: I have high CPU usage on all rigs. smartcoin shows > ~21 %, but htop gives 0.76 and more (on my last version, r45x, smartcoin showed 7-8 %). That didn't occur on previous releases. I'm now using r496s. Is there an option to manually roll back an update, or to fresh install a previous version, just to double-check whether it is my system's fault?
Thanks for the screenshot of the error - I'm looking into it, though I'm not finding anything yet. Are all 3 miners the same (i.e. running the same distro and version)? Also, can you post the result of this:
sqlite3 ~/.smartcoin/smartcoin.db "SELECT * FROM device;"
Yes, CPU usage going up in recent versions is normal. In older versions, a delay was put into the loop on purpose to make it run slower. I removed the delay to make the loop run much faster (you should see that the display updates more frequently now). It shouldn't hurt anything, though.
|
|
|
|
|