Bitcoin Forum
Topic: [ANN] CureCoin 2.0 is live - Mandatory Update is available now - DEC 2018
ChasingTheDream
Sr. Member
June 03, 2014, 07:00:20 PM
Last edit: June 03, 2014, 07:17:07 PM by ChasingTheDream
 #2201

Calling Aboy68!

Our production is dropping off, and I've been having a lot of hardware issues and instability despite underclocking the GPUs to their lowest possible settings (both core and speed).  If you are having similar issues, try underclocking your RAM.  I got that tip directly from F@H support and I think the guy may have nailed it!  At least I hope so!  I'll know more in about 24 hours.  I may be able to spin up the GPUs again.  They are running ridiculously underclocked right now.

Despite the bumps I'm still trying to beat you into the top 10.

Update:  GAH.  Still didn't make it three hours before two machines were down again.  LOL.  So the quest for stability continues...

I'm running stock values on all the hardware I have. Yes, sometimes the worker and the WU do get lost in space, and the result is a WU stuck at 99.99%.
Resyncing is the solution, i.e. the pause and fold commands.
I have a 100% fix for this that is automatic, with no human hands on! I have tried the fix out over the last 3 days and it still works.

Do you want to know how?

//Aboy68

Yes, any ideas are welcome.  Virtually every morning at least two of my machines are down, meaning I cannot restart folding on them without physically rebooting the machine.  The machine will not respond to remote restarts or keyboard input.  I actually have to press the reset button.  That happens during the day as well, but I'm not always available to do anything about it, so they sit for hours like that.  At this point I've got the GPUs underclocked as far as they will go, so it is not the GPUs.  It is something in the systems themselves.  Memory, CPU, something.  I've removed the CPU slot on the troubled machines (after the WU finished, of course) but it does not seem to have made any difference in terms of stability.

I'm going to gather my logs and present them to the F@H support group to see if they have any ideas to help speed up the process of getting these things running properly.  As a short term fix I may write a program that reads the logs and if too much time goes by before the log is updated it could force a system restart.  Unfortunately I don't think this will work because whatever is happening makes the system so unstable that I don't think it will be able to restart.
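
A rough sketch of the short-term fix described above (watch the log, restart if it goes quiet) might look like the following. It is only an illustration under assumptions: the function names and the idle limit are made up for this example, and the only real command used is the standard Windows `shutdown /r /f /t 0`. As noted in the post, it cannot help if the machine is locked up so hard that nothing can run.

```python
import os
import subprocess
import time

# Standard Windows command: /r = restart, /f = force-close apps, /t 0 = no delay.
RESTART_CMD = ["shutdown", "/r", "/f", "/t", "0"]


def seconds_since_last_write(log_path, now=None):
    """Seconds elapsed since the log file was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(log_path)


def check_once(log_path, max_idle_seconds, restart=None, now=None):
    """Call restart() if the log has been idle too long; return True if it fired.

    `restart` is injectable so the check can be tried out without rebooting.
    """
    if restart is None:
        restart = lambda: subprocess.run(RESTART_CMD, check=False)
    if seconds_since_last_write(log_path, now) > max_idle_seconds:
        restart()
        return True
    return False
```

Run from a scheduled task every few minutes, `check_once(path_to_log, 30 * 60)` would reboot the machine once the log has been silent for half an hour; both the path and the 30-minute limit are placeholders to adjust.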

Ironically there are no hardware errors or application errors in Windows Event Viewer though.  This has actually been plaguing me since I started but it was the same way with mining.  It took a long time to get the systems to behave.  This will eventually get worked out.

Unfortunately, as a result I'm only running at about 2/3 my expected output, but it is still better than nothing.  lol

If you have a fix I would love to hear about it!

Are the fans working correctly? Might want to get a tool that lets you see VRM temps too, I had a 7970 experiencing a similar issue (back in the mining days) and VRMs were around 117C. Some more work showed that the fan speed in CCC/afterburner/trixx was incorrect, as the fan had a hardware issue and was spinning with a much higher resistance than it should have.

I used GPU-Z to take a peek and it looks like the highest VRM temp on any of the cards in the troubled machines is 58C.  Also based on your recommendation some time ago I did swap the GPU's out of the most troubled machine with the machine next to it.  The most troubled machine is still the machine having the most issues.  Based on all that, I don't think it is the GPU's at this point but it was definitely worth a look.  Thanks for the suggestion.

Ironically though, I ran GPU-Z on several of my machines that don't seem to want to run for very long.  The most troubled machine ended up getting a video driver failure while I was watching.  The machine was still stable afterwards and I was able to remotely reboot it so it was responding appropriately.  Whatever else is happening makes it so unstable it is on a whole different level of ugly.  Definitely not just a video driver failure.

Try underclocking your system RAM. This helped with one of my rigs.

I actually was talking about that in the first post in this sequence and thought it was going to help because the RAM speed in all the machines was at 2133.  I brought it down to 1333 and unfortunately it didn't help.  I think I'm going to run a memory test next and maybe even swap memory between a machine that behaves somewhat well and the least stable machine.  Hard to believe the memory is bad in 2-3 different machines but I need to rule it out.

Another suggestion from the F@H folks was that I could be overloading a rail on the PSU but the computer that is having the most issues has a 1200 watt Corsair which is a single rail PSU.  So the quest continues.
Aboy68
Member
June 03, 2014, 09:47:24 PM
Last edit: June 03, 2014, 10:02:41 PM by Aboy68
 #2202

Hi, this is the fix.
I studied a computer that had stalled work units, and I noticed that the percentage shown in the GUI is really detached from what you find in the log file. The log file shows the results actually coming back from the GPU/WU; that is the true progress. When a worker loses sync with a work unit, the percentage in the GUI keeps increasing at the same speed until it reaches 99.99% and stops, which is why you can't tell that something is wrong until the counter reaches 99.99%. So the GUI is not the right place to look for stalled WUs.

Look deeper instead, in the computer's process list, at the process FahCore_17.exe; there is one of these for each GPU in your computer. While a WU is loaded and folding, its average CPU usage is around 1-4% and its memory size is 200-300 MB (rough numbers). When a WU stalls, the CPU activity goes to 0% and stays there until you pause and start folding again, at which point FAHControl.exe restarts the WU from the last saved file. So you can actually check whether the GPUs are folding or not: when the activity is 0%, you only need to terminate the stalled FahCore_17.exe and the WU restarts (from the last saved file).

BUT it is hard work to run around checking WUs, so the end of the story is the automatic solution: download a program called processlasso and install it. http://bitsum.com/processlasso/
Find one FahCore_17.exe process, right-click it, and select the menu option "Set watchdog rules for this process":
1: for - CPU
2: Less than
3: 1%
4: 300 sec
5: terminate the process
6: Push the button "Create new process watchdog rule"
Now the software terminates any WU process that has been inactive for 5 minutes, and the WU restarts. The rule is saved by the software and is active again every time the computer starts. - YES it works, no more lost hours of stalled WUs
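
For anyone who would rather script it than install a tool, the rule described above (CPU below 1% for 300 seconds, then terminate) can be sketched roughly as follows. This is not how processlasso works internally, just the same idea in a few lines; the class and method names are invented for the example. It only makes the decision; reading the CPU usage of FahCore_17.exe and killing the process are left to the caller.

```python
class StallDetector:
    """Tracks how long a process's CPU usage has stayed below a threshold."""

    def __init__(self, cpu_threshold_percent=1.0, max_idle_seconds=300.0):
        self.cpu_threshold_percent = cpu_threshold_percent
        self.max_idle_seconds = max_idle_seconds
        self._idle_since = None  # time of the first low-CPU sample in the current run

    def update(self, cpu_percent, now):
        """Feed one CPU sample; return True once the process should be terminated."""
        if cpu_percent >= self.cpu_threshold_percent:
            # Process is working again, so forget any earlier idle period.
            self._idle_since = None
            return False
        if self._idle_since is None:
            self._idle_since = now
        return (now - self._idle_since) >= self.max_idle_seconds
```

One detector per FahCore_17.exe process, fed a sample every few seconds, would be enough. The samples could come from something like the third-party psutil package, but that part is an assumption and is not shown here.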

//Aboy68, by the way: if you change the PCIe version to version 2 in the motherboard BIOS settings, you get a more stable system. The timing is not as fast as version 3, and version 2 has the bandwidth we need for folding. I have this setting on all my motherboards. Version 1 is too slow (I have tried it).
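
For reference, the raw bandwidth behind the PCIe versions mentioned above can be worked out from the published line rates and encodings. The small sketch below does that arithmetic; the function names are made up for the example. It says nothing about how much bandwidth the folding core actually needs, which the post reports only from experience.

```python
# (line rate in GT/s, payload bits, total bits) per PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8, 10),     # 8b/10b encoding
    2: (5.0, 8, 10),     # 8b/10b encoding
    3: (8.0, 128, 130),  # 128b/130b encoding
}


def lane_bandwidth_mb_s(generation):
    """Usable bandwidth of one lane, one direction, in MB/s (1 MB = 10**6 bytes)."""
    gt_per_s, payload_bits, total_bits = PCIE_GENERATIONS[generation]
    return gt_per_s * 1000 * payload_bits / total_bits / 8


def link_bandwidth_mb_s(generation, lanes):
    """Usable bandwidth of a whole link (for example x8 or x16), one direction."""
    return lane_bandwidth_mb_s(generation) * lanes
```

Per lane this gives 250 MB/s for version 1, 500 MB/s for version 2 and roughly 985 MB/s for version 3, so dropping from 3 to 2 about halves the link bandwidth.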
ChasingTheDream
Sr. Member
June 03, 2014, 11:23:15 PM
Last edit: June 04, 2014, 03:51:10 AM by ChasingTheDream
 #2203

Outstanding post Aboy68!  I love automated solutions and this will save me the trouble of having to try to write one myself!  I will give this a shot.  A couple of my computers get extremely unstable when some event happens that I have not been able to identify yet so I'm not sure this will work in my case but I'm definitely going to try it.  I will also try the PCI-E 2 settings.

Thanks again!

Update1:  I've just set up the software exactly as you describe on my machines.  Your instructions were very easy to follow.  Again well done!

Update2:  Immediately had an opportunity for processlasso to take some action on the troubled machine, and I got a message that I had been using it for 21,000,000 days so it was deactivated.  lol  UGH.  I would have liked to try it before buying it just to see if it would help, so I may have to pursue a homegrown approach if / when I ever get time to do it.
Burninj
Legendary
June 03, 2014, 11:28:08 PM
 #2204

Really really nice to share this!
kingscrown
Hero Member
June 04, 2014, 02:34:27 AM
 #2205

One of the coolest startup coins. I'm sure the price will rise; this stuff deserves it.

ChasingTheDream
Sr. Member
June 04, 2014, 02:37:26 AM
 #2206

Join us if you haven't already.
Aboy68
Member
June 04, 2014, 04:48:28 AM
 #2207

Strange, isn't the trial period 7 days?
ChasingTheDream
Sr. Member
June 04, 2014, 05:17:35 AM
 #2208

No worries.  It was worth a shot.  I can write something to do something similar if I get some free time.  I'm actually quite concerned with the stability of the machines.  I don't think it would work with how unstable the machines become and I have no idea why.  Nothing in Windows Event Viewer and nothing in the F@H logs.  No errors or warnings.  I'm interacting with the F@H support forum now to see if they have any ideas.

I'll figure it out but I can see it is going to take quite a while.  I suspect my sustainable PPD is going to be about 2 million instead of the 3 million I was doing earlier.  Just no way to keep the machines running to sustain it.  So you better pass me while you can.  lol  Sooner or later I might get the machines running right.
Aboy68
Member
June 04, 2014, 05:20:50 AM
 #2209

A display driver recovery happened = one of the folding GPUs stopped = no CPU activity. Waiting, waiting, and then the reset of the process happened.
This is the log file text for this event.

05:09:57:WU04:FS04:0x17:Completed 2550000 out of 5000000 steps (51%)
   REM - Display driver recovery event
   REM - 3min and reset of process.
05:16:20:WARNING:WU04:FS04:FahCore returned: FAILED_2 (1 = 0x1)
05:16:20:WU04:FS04:Starting
05:16:20:WU04:FS04:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 04 -suffix 01 -version 704 -lifeline 4756 -checkpoint 15 -gpu 3 -gpu-vendor ati
05:16:20:WU04:FS04:Started FahCore on PID 2068
05:16:20:WU04:FS04:Core PID:200
05:16:20:WU04:FS04:FahCore 0x17 started
05:16:21:WU04:FS04:0x17:*********************** Log Started 2014-06-04T05:16:21Z ***********************
05:16:21:WU04:FS04:0x17:Project: 9408 (Run 355, Clone 0, Gen 32)
05:16:21:WU04:FS04:0x17:Unit: 0x000000290a3b1e5c5342d6762c91b48a
05:16:21:WU04:FS04:0x17:CPU: 0x00000000000000000000000000000000
05:16:21:WU04:FS04:0x17:Machine: 4
05:16:21:WU04:FS04:0x17:Digital signatures verified
05:16:21:WU04:FS04:0x17:Folding@home GPU core17
05:16:21:WU04:FS04:0x17:Version 0.0.52
05:16:22:WU04:FS04:0x17:  Found a checkpoint file

AND it is running again.

If you want to check whether there have been any reset events, just tick the "Warnings and errors" box and look for rows like this:
   - 05:16:20:WARNING:WU04:FS04:FahCore returned: FAILED_2 (1 = 0x1)
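
The same check can be done outside the GUI by scanning the log text for that warning. The sketch below is only an illustration: the pattern is inferred from the single warning row quoted above, so other clients or versions may format the row differently, and the function name is made up for this example.

```python
import re

# Matches rows such as:
#   05:16:20:WARNING:WU04:FS04:FahCore returned: FAILED_2 (1 = 0x1)
FAILED_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}):WARNING:"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"FahCore returned: (?P<status>\S+)"
)


def find_core_failures(lines):
    """Return one dict (time, wu, slot, status) per 'FahCore returned' warning."""
    events = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            events.append(match.groupdict())
    return events
```

Feeding it the lines of the log file gives a list of reset events with their time and folding slot, which makes it easy to see which GPU keeps getting reset.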

//Aboy68
ChasingTheDream
Sr. Member
****
Offline Offline

Activity: 292
Merit: 250


View Profile
June 04, 2014, 05:23:41 AM
Last edit: June 04, 2014, 05:33:45 AM by ChasingTheDream
 #2210

Hi, this is the fix.
I studied a computer that had stalled work units, and I recognised that the percentage you read in the GUI is really detached from what you find in the log file. The log file shows the results you actually get from the GPU/WU; that is the true performance. When a worker loses sync with a work unit, the percentage in the GUI keeps increasing at the same speed until it reaches 99.99% and stops. That is why you cannot tell that something is wrong until the counter reaches 99.99%. So the GUI is not the right place to look for stalled WUs.

If you look deeper into the computer, at the list of processes and in particular the process FahCore_17.exe, there is one such process for each GPU in your computer. While a WU is loaded and folding, the average CPU usage is around 1-4% and the memory size is 200-300 MB (rough numbers). When a WU stalls, the CPU activity goes to 0% and stays there until you pause and start folding again, and then FAHControl.exe restarts the WU from the last saved file.

OK, that's nice: you can actually check whether the GPUs are folding or not. When the activity is 0%, you only need to terminate the stalled FahCore_17.exe and the WU restarts (from the last save file). BUT it is hard work to run around and check WUs, so the end of the story is the automatic solution: download a program called processlasso and install it. http://bitsum.com/processlasso/

Find one FahCore_17.exe process, right-click on it and select the menu option "Set watchdog rules for this process". Then set: 1: for CPU, 2: Less than, 3: 1%, 4: 300 sec, 5: terminate the process, 6: push the button "Create new process watchdog rule". The software now terminates any WU process that has been inactive for 5 minutes, and the WU restarts. The rule is saved by the software and is applied again every time the computer starts. - YES it works, no more lost hours of stalled WUs Smiley

//Aboy68, by the way - if you change the PCIe version to version 2 in the motherboard BIOS settings, that gives you a more stable system. The timing is not as fast as version 3, and version 2 has the bandwidth we need for folding. I have this setting on all my motherboards. Version 1 is too slow (I have tried it).

Outstanding post Aboy68!  I love automated solutions and this will save me the trouble of having to try to write one myself!  I will give this a shot.  A couple of my computers get extremely unstable when some event happens that I have not been able to identify yet so I'm not sure this will work in my case but I'm definitely going to try it.  I will also try the PCI-E 2 settings.

Thanks again!

Update1:  I've just set up the software exactly as you describe on my machines.  Your instructions were very easy to follow.  Again well done!

Update2:  Immediately had an opportunity for processlasso to take some action on the troubled machine, and I got a message that I had been using it for 21,000,000 days so it was deactivated.  lol  UGH.  I would have liked to try it before buying it just to see if it would help, so I may have to pursue a homegrown approach if / when I ever get time to do it.
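
For anyone going the homegrown route, the watchdog rule quoted above (CPU below 1% for 300 seconds, then terminate) comes down to a small amount of bookkeeping. The sketch below covers only that decision logic, not a replacement for processlasso: the 1% and 300 second values and the process name come from the post, while `StallWatchdog` and its `update` method are hypothetical names, and the CPU readings are passed in by the caller instead of being measured here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Values taken from the watchdog rule in the post.
CPU_THRESHOLD_PERCENT = 1.0
STALL_SECONDS = 300.0
TARGET_NAME = "FahCore_17.exe"  # the process the caller is expected to sample


@dataclass
class StallWatchdog:
    """Tracks how long each process has stayed below a CPU threshold."""
    threshold: float = CPU_THRESHOLD_PERCENT
    stall_seconds: float = STALL_SECONDS
    # pid -> time at which the process was first seen below the threshold
    _idle_since: Dict[int, float] = field(default_factory=dict)

    def update(self, now: float, samples: Dict[int, float]) -> List[int]:
        """Feed one round of {pid: cpu_percent} readings taken at time `now`.

        Returns the pids that have been below the threshold for at least
        `stall_seconds` and are therefore candidates for termination.
        """
        # Forget processes that no longer exist.
        for pid in list(self._idle_since):
            if pid not in samples:
                del self._idle_since[pid]

        stalled = []
        for pid, cpu in samples.items():
            if cpu >= self.threshold:
                # Active again, so the idle timer starts over.
                self._idle_since.pop(pid, None)
                continue
            first_idle = self._idle_since.setdefault(pid, now)
            if now - first_idle >= self.stall_seconds:
                stalled.append(pid)
                # The restarted core is expected to come back under a new pid.
                del self._idle_since[pid]
        return stalled


if __name__ == "__main__":
    watchdog = StallWatchdog()
    print(watchdog.update(0.0, {2068: 0.0}))    # just went idle
    print(watchdog.update(300.0, {2068: 0.0}))  # idle for 300 s
```

On a real machine the readings and the terminate step would have to come from the operating system or from a third-party library such as psutil, sampled every few seconds for processes named FahCore_17.exe. That part is left out here because it depends on the setup, and how the CPU percentage is scaled on multi-core machines may differ from what Task Manager shows.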


Display driver recovery happened = one of the folding GPUs stopped = no CPU activity, waiting, waiting, and then the reset of the process happened.
This is the log file text for this event.

05:09:57:WU04:FS04:0x17:Completed 2550000 out of 5000000 steps (51%)
   REM - Display driver recovery event
   REM - 3min and reset of process.
05:16:20:WARNING:WU04:FS04:FahCore returned: FAILED_2 (1 = 0x1)
05:16:20:WU04:FS04:Starting
05:16:20:WU04:FS04:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 04 -suffix 01 -version 704 -lifeline 4756 -checkpoint 15 -gpu 3 -gpu-vendor ati
05:16:20:WU04:FS04:Started FahCore on PID 2068
05:16:20:WU04:FS04:Core PID:200
05:16:20:WU04:FS04:FahCore 0x17 started
05:16:21:WU04:FS04:0x17:*********************** Log Started 2014-06-04T05:16:21Z ***********************
05:16:21:WU04:FS04:0x17:Project: 9408 (Run 355, Clone 0, Gen 32)
05:16:21:WU04:FS04:0x17:Unit: 0x000000290a3b1e5c5342d6762c91b48a
05:16:21:WU04:FS04:0x17:CPU: 0x00000000000000000000000000000000
05:16:21:WU04:FS04:0x17:Machine: 4
05:16:21:WU04:FS04:0x17:Digital signatures verified
05:16:21:WU04:FS04:0x17:Folding@home GPU core17
05:16:21:WU04:FS04:0x17:Version 0.0.52
05:16:22:WU04:FS04:0x17:  Found a checkpoint file

AND it is running again.

To check whether there have been any reset events, tick only the Warnings and errors box and look for rows like this:
   - 05:16:20:WARNING:WU04:FS04:FahCore returned: FAILED_2 (1 = 0x1)

//Aboy68

Yeah when I do it mine looks like this....

*********************** Log Started 2014-06-04T04:17:08Z ***********************

There are literally no errors or warnings.
pastet89
Sr. Member
****
Offline Offline

Activity: 378
Merit: 265


View Profile WWW
June 04, 2014, 07:18:30 AM
 #2211

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.

Cryptostats.es
intrinsic coins
Newbie
*
Offline Offline

Activity: 5
Merit: 0


View Profile
June 04, 2014, 09:15:01 AM
 #2212

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.


Well, start up an exchange for the secondary computation already.  Like reserve 60% of the network for protein folding, then leave the remaining 40% up for bid.  Then you use the money to do coin buybacks:  like 10% at the start, 30% in the middle, then 50% at the finish.  Use an easy proof of concept such as prime number crunching for demos.
Tweek
Sr. Member
****
Offline Offline

Activity: 308
Merit: 250

CoinTweak profitability charts


View Profile WWW
June 04, 2014, 09:42:00 AM
 #2213

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.
The whole multipool idea is a direct assault on everything cryptocurrency stands for. Instead of working together to make a coin for everyone, instead of a fiat controlled by government and banks, you are directly attacking other coins in order to gain a little wealth of your own. What makes you any better than any of those bankers creating a financial crisis and only covering for themselves?

Contacting health organisations and seeking sponsorships is a great idea though.

pastet89
Sr. Member
****
Offline Offline

Activity: 378
Merit: 265


View Profile WWW
June 04, 2014, 10:25:22 AM
 #2214

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.
The whole multipool idea is a direct assault on everything cryptocurrency stands for. Instead of working together to make a coin for everyone, instead of a fiat controlled by government and banks, you are directly attacking other coins in order to gain a little wealth of your own. What makes you any better than any of those bankers creating a financial crisis and only covering for themselves?

Contacting health organisations and seeking sponsorships is a great idea though.

Nothing makes me better than other coins. However, the idea behind curecoin makes curecoin not only better but the best coin so far, because it is the only one with a real life application, helping people with something beneficial. And yes, I truly believe that for this cause it would be fine even if other coins die in order for curecoin to survive.

Cryptostats.es
curetheworld
Newbie
*
Offline Offline

Activity: 21
Merit: 0


View Profile
June 04, 2014, 10:43:11 AM
 #2215

How is the voting going at Mintpal?

Everyone newly interested in the coin should submit their comment and like/vote on Cryptsy
https://cryptsy.freshdesk.com/support/discussions/topics/4000277225
Vorksholk
Legendary
*
Offline Offline

Activity: 1713
Merit: 1029



View Profile WWW
June 04, 2014, 04:46:20 PM
 #2216

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.


Unfortunately health organizations have strict regulations, and any who deal with medicaid/medicare can only accept standard fiat currencies. In the next week or two we'll be rolling out advertisements on coinmarketcap's top banner space.

VeriBlock: Securing The World's Blockchains Using Bitcoin
https://veriblock.org
mistersushi
Member
**
Offline Offline

Activity: 112
Merit: 10


View Profile
June 04, 2014, 07:23:18 PM
 #2217

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.


Unfortunately health organizations have strict regulations, and any who deal with medicaid/medicare can only accept standard fiat currencies. In the next week or two we'll be rolling out advertisements on coinmarketcap's top banner space.

I've already sent out an email to this effect, but I'm not sure it will go anywhere.  Stanford should put its money where its mouth is and offer something(s) of value for CureCoin:  tuition, legal assistance/counsel, medical care, t-shirts, coffee.  You got brilliant minds over there, and a lot of money.  I'm sure they can come up with something.
spiffcow
Full Member
***
Offline Offline

Activity: 308
Merit: 100



View Profile
June 04, 2014, 08:24:31 PM
 #2218

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.


What?  Like they're increasing the total amount of coins paid out for folding each day?  I got a dramatic increase recently, but I thought it was because they finally fixed the points calculation so that I was getting the amount that matched what F@H reported (essentially getting a bigger piece of the pie).  My PPD increased from 400k to 800k over the last 2 days.
ivanlabrie
Hero Member
*****
Offline Offline

Activity: 812
Merit: 1000



View Profile
June 05, 2014, 01:40:09 AM
 #2219

I'm all for Curecoin, but the implementation IS flawed...devs seem awfully silent and that is not good in the eyes of the public.

There's too much inflation and too little uses for the Curecoins right now, sha256 miners will flock to btc again and the coin won't be secure with PoS alone. Something must be done asap...  Undecided
+1. This coin is dying. Even people are leaving folding as number of coins earned per day is growing each day. Make multipool and make paid advertisment. You have money from the IPO. Contact health organisations, ask for sponsorship. Otherwise this good idea will die.


What?  Like they're increasing the total amount of coins paid out for folding each day?  I got a dramatic increase recently, but I thought it was because they finally fixed the points calculation so that I was getting the amount that matched what F@H reported (essentially getting a bigger piece of the pie).  My PPD increased from 400k to 800k over the last 2 days.

Hush!  Grin
Aboy68
Member
**
Offline Offline

Activity: 96
Merit: 10


View Profile
June 05, 2014, 09:12:38 AM
 #2220

I'm currently folding in 4 different cancer projects: 9406, 9408, 13000 and 13001.

Is there any page where we could find out how the different projects are progressing?
Now that I'm folding, I'm kind of part of the project groups Smiley
I'm investing money and time, and that is significant for an investment.

If you stretch this: what are my shares in a project, e.g. 9406?

//Aboy68