Doff
April 12, 2012, 03:41:34 AM
I have a segfault issue I'd like to see if anyone can help me fix. The segfault only happens when I try to use the 2.4 or 2.5 SDK with the 12.3 drivers on Debian/unstable. What is odd is that even when I recompile with the 2.6 SDK it still doesn't work; I have to reinstall the drivers to get it to stop segfaulting. Which leads me to believe I'm doing something wrong with the SDKs, although I switch between 2.4 and 2.5 without an issue using a stable build of Debian with older ATI drivers, so I'm basically at a loss.
I'm using a single 5850 on this machine, and cgminer compiles fine and even shows the correct SDK loaded when I compile for 2.4. The 2.5 just flat out segfaults, even with -n.
I'd really like to get the 2.4/2.5 SDK working with the 12.3 drivers if possible.
Thanks
Doff

-ck (OP)
Legendary
Offline
Activity: 4228
Merit: 1644
Ruu \o/
April 13, 2012, 05:56:27 AM
12.3 is your problem. It's a stinker. Drop down to 12.1 or 12.2 if you need 79x0 support.
Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel 2% Fee Solo mining at solo.ckpool.org -ck

-ck (OP)
April 13, 2012, 05:59:03 AM
Woke up to a hung cgminer, 0% activity and no hashing.
The log below hasn't changed in the last 5 hours, so I've restarted cgminer (had to X out of the window, as "q" wasn't working).
GPU 0/1 = 5970
GPU 2 = 5830
GPU 3 = 5830
GPU 4 = 5830
[2012-04-09 04:07:28] Accepted 00000000.398219cb.0ccf2195 GPU 4 thread 9
[2012-04-09 04:07:30] Accepted 00000000.5bb7eedb.2e813ef8 GPU 3 thread 7
[2012-04-09 04:07:30] GPU 3 stopped reporting fanspeed
[2012-04-09 04:07:30] Will attempt to re-initialise ADL
[2012-04-09 04:07:30] ADL re-initialisation complete
[2012-04-09 04:07:32] Accepted 00000000.031c03e9.d02f5836 GPU 3 thread 6
[2012-04-09 04:07:32] GPU 3 stopped reporting fanspeed
[2012-04-09 04:07:32] Will attempt to re-initialise ADL
[2012-04-09 04:07:32] ADL re-initialisation complete
Well that settles it, I cannot successfully re-initialise ADL. I haven't said it for a while since I've been away for a week, but thanks AMD.
I guess the other solution is for cgminer to completely restart with all its original settings. Would people like cgminer to attempt to do this? The problem with doing this unconditionally is that if a GPU has hung, usually the other GPUs can keep mining, but if you try to stop cgminer, they all stop mining. So I would need to make it try to restart itself from scratch only if it hasn't got a dead GPU. Comments?
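The condition described above (restart from scratch only when ADL has failed and no GPU is dead) could be sketched roughly like this. This is an illustration in Python, not cgminer's actual C code; the `Gpu` class and its `lost_fanspeed` and `dead` fields are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    index: int
    lost_fanspeed: bool  # hypothetical: previously reported fan speed, now does not
    dead: bool           # hypothetical: the GPU itself has hung


def should_restart(gpus):
    """Return True only if ADL looks broken and every GPU is still alive."""
    adl_failed = any(g.lost_fanspeed for g in gpus)
    any_dead = any(g.dead for g in gpus)
    # Restarting with a dead GPU would stop the healthy GPUs as well,
    # so in that case keep running as-is.
    return adl_failed and not any_dead
```

With one GPU that has stopped reporting fan speed and no dead GPUs this returns True; as soon as any GPU is marked dead it returns False, so the remaining cards keep mining.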

-ck (OP)
April 13, 2012, 06:00:47 AM
Huge issue with cgminer: it can't shut down a GPU if it overheats. Cgminer keeps trying to disable the GPU over and over again, but it continues to mine!
I must have broken it when I instituted the REST followed by restart if it detected overheat. Unless of course it overheated, cooled enough and then restarted over and over again in short bursts? Was it submitting shares at the same rate?

-ck (OP)
April 13, 2012, 06:06:14 AM
I'm a bit out of the loop, but did something drastic change in 2.3.2? I have rigs reporting as high as 170% Efficiency right now, with 125-150% being the average range for the others.
Win7 x64, CGMiner 2.3.2, 5 rigs each with either SDK 2.1 or 2.4 and CAT 11.12 on all. Mixed 5xxx/6xxx series card machines, exclusive 5xxx series card machines and exclusive 6xxx series card machines. The performance is not specific to any card/config. The ONLY thing they all have in common is the 11.12 driver and CGMiner version 2.3.2. WTF?
I double checked and didn't rely on the Efficiency% readout only. The ratio of GetWorks to Accepteds (even with Rejected included/excluded) does not lie. Colour me impressed.
I'm guessing your pool operator just changed software or settings at their end. The efficiency shouldn't have changed much from 2.3.1 to 2.3.2.
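For anyone wanting to repeat the cross-check by hand, here is a small sketch. It assumes the Efficiency figure is simply accepted shares divided by getwork requests, which is what the post above implies; the function name and the sample counts are made up for illustration.

```python
def efficiency_percent(accepted, getworks):
    """Accepted shares per getwork request, expressed as a percentage."""
    if getworks <= 0:
        return 0.0  # nothing requested yet, avoid dividing by zero
    return 100.0 * accepted / getworks


# Hypothetical counts: 170 accepted shares from 100 getwork requests.
print(efficiency_percent(170, 100))
```

Anything above 100% just means more than one accepted share came out of each getwork on average.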

-ck (OP)
April 13, 2012, 06:11:35 AM
Why not leave autofan on?
That way it maintains a constant temperature for the GPUs which should be good for longer-term usage.
Default ATI algorithm is better imo and saves the fan & GPU for longer-term usage as well. With cgminer's autofan on they're less efficient. For example, when 50% is enough for one card to keep a good temp, cgminer switches it to 60%. Or when more speed is needed to keep the temp below 69C, the ATI algo spins it at 48% and keeps a great temp, while cgminer switches it to 40% and the GPU slowly starts overheating. These are real examples from my rig. So no, thanks. And yes, as DeathAndTaxes said, auto-fan doesn't fix the bug.
That's funny. Cgminer only goes to the temperature you choose and if you don't choose a temperature, it will use 75 degrees, so of course it will go over 69C. Unless of course you set 69 C. * ckolivas shrugs

tnkflx
April 13, 2012, 06:36:30 AM
blabla
Well that settles it, I cannot successfully re-initialise ADL. I haven't said it for a while since I've been away for a week, but thanks AMD. I guess the other solution is for cgminer to completely restart with all its original settings. Would people like cgminer to attempt to do this? The problem with doing this unconditionally is that if a GPU has hung, usually the other GPUs can keep mining, but if you try to stop cgminer, they all stop mining. So I would need to make it try to restart itself from scratch only if it hasn't got a dead GPU. Comments?
Yes, I would like to see this ;-) Also, restart with all stats still available if possible ;-) I will donate 10 BTC for this feature ;-)
Operating electrum.be & us.electrum.be

-ck (OP)
April 13, 2012, 09:47:49 AM
Restart with all stats carried over might be more than a little messy, and much more prone to failure...

BCMan
April 13, 2012, 12:49:35 PM
Huge issue with cgminer: it can't shut down a GPU if it overheats. Cgminer keeps trying to disable the GPU over and over again, but it continues to mine!
I must have broken it when I instituted the REST followed by restart if it detected overheat. Unless of course it overheated, cooled enough and then restarted over and over again in short bursts? Was it submitting shares at the same rate?
No, it never stops; shares are submitted at the same speed at the same temp. I've tested an identical config with 2.2.1 and it worked without any issues.

-ck (OP)
April 13, 2012, 03:00:58 PM
Okay thanks. I have reviewed the code in question and indeed it would not cut out. I have fixed this bug in the current git tree.

P_Shep
Legendary
Offline
Activity: 1795
Merit: 1208
This is not OK.
April 13, 2012, 10:33:17 PM
Kano,
In the devs list you've got 'ID' and 'BFL'/'ICA' etc. to give us some number. What's the difference between these numbers? Are they equivalent, or might you get something like:
ID 0, BFL 0
ID 1, BFL 1
ID 2, ICA 0
...
And when you request data with gpu|N or PGA|N, is that N 'ID' or 'BFL'/'ICA' etc?
Thanks

kano
Legendary
Offline
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
April 14, 2012, 01:38:19 AM
Not quite. It's: PGA, Name, ID
The PGA number starts at 0 and goes up to the number of PGA devices - 1.
The ID matches the order on the screen, and it's the cgminer internal sequential device_id.
The point of PGA is that you send PGA-only commands and they refer to the PGA devices in device_id order, but without skipping numbers. Thus it's always just a simple number range starting at 0.
So on my rig (2x6950 + 2xIcarus):
GPU=0,...|
GPU=1,...|
PGA=0,Name=ICA,ID=2,...|
PGA=1,Name=ICA,ID=3,...|
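To make the PGA versus ID numbering concrete, here is a rough sketch that parses a reply shaped like the example above and maps each PGA number to its cgminer-wide ID. It is only an illustration: the helper names are made up, and the sample string contains just the fields shown in the post, whereas a real reply carries more fields (the "..." above).

```python
def parse_devices(response):
    """Split a '|'-separated reply into one dict of key=value fields per device."""
    devices = []
    for section in response.split("|"):
        if not section:
            continue  # skip the empty piece after the trailing '|'
        fields = dict(
            item.split("=", 1) for item in section.split(",") if "=" in item
        )
        devices.append(fields)
    return devices


def pga_to_device_id(devices):
    """Map each PGA number (0..n-1) to the sequential device ID."""
    return {
        int(d["PGA"]): int(d["ID"])
        for d in devices
        if "PGA" in d and "ID" in d
    }


# Shaped like the 2x6950 + 2xIcarus rig above, with the elided fields left out.
SAMPLE = "GPU=0|GPU=1|PGA=0,Name=ICA,ID=2|PGA=1,Name=ICA,ID=3|"
print(pga_to_device_id(parse_devices(SAMPLE)))  # {0: 2, 1: 3}
```

So a PGA-only command for N=0 addresses the first Icarus, which is device ID 2 on screen.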

Doff
April 14, 2012, 01:44:26 AM
I get the same error on 12.1, 12.2, and 12.3. I even thought maybe I didn't unzip my files in root or something and redid all the SDKs, but no luck. 2.4 looks like it will almost work, then segfaults just about as it looks like it's going to LP. Is there more debugging info I can provide somehow? -n just goes right to segfault with 2.5, and 2.4 actually looks like it works when you -n but doesn't once you start it up. Sigh..
I have a segfault issue I'd like to see if anyone can help me fix. The segfault only happens when I try to use the 2.4 or 2.5 SDK with the 12.3 drivers on Debian/unstable. What is odd is that even when I recompile with the 2.6 SDK it still doesn't work; I have to reinstall the drivers to get it to stop segfaulting. Which leads me to believe I'm doing something wrong with the SDKs, although I switch between 2.4 and 2.5 without an issue using a stable build of Debian with older ATI drivers, so I'm basically at a loss.
I'm using a single 5850 on this machine, and cgminer compiles fine and even shows the correct SDK loaded when I compile for 2.4. The 2.5 just flat out segfaults, even with -n.
I'd really like to get the 2.4/2.5 SDK working with the 12.3 drivers if possible.
Thanks
Doff
12.3 is your problem. It's a stinker. Drop down to 12.1 or 12.2 if you need 79x0 support.

-ck (OP)
April 14, 2012, 01:47:34 AM
Basically the issue is you have a corrupt SDK installation, usually a mixture of files from multiple SDKs and not a complete installation of any one. You've gotta clear it all out somehow...

Internet151
April 14, 2012, 05:56:38 AM
I updated cgminer from version 2.3.1 to 2.3.2 on a few of my Windows 7 machines recently, and let it run for over 8 days straight (which is around when ADL normally fails). Now instead of just ADL failing, cgminer completely stops hashing and refuses to respond to any keyboard input, but the API still works.

-ck (OP)
April 14, 2012, 06:23:41 AM
Bug's been mentioned. Working on it now.

freakfantom
Newbie
Offline
Activity: 73
Merit: 0
April 14, 2012, 06:24:21 AM
It's true. So I just restart cgminer every seven days, usually on saturdays lol

-ck (OP)
April 15, 2012, 12:44:52 AM Last edit: April 15, 2012, 08:01:16 AM by ckolivas
Version 2.3.3 - April 15, 2012
Human readable summary:
- Over temperature GPUs that should have had mining suspended but did not, should now be fixed.
- Windows lusers that had the ATI Display Library fail and stop reporting fan speed, which would then cause cgminer to just abruptly stop, should now have cgminer spontaneously restart from scratch if it detects this mode of AMD failure. It is a gamble, but should work based on feedback from people that had this problem.
- There is now a restart cgminer option within cgminer.
- When mining with more than 8 devices, the display will only show a summary instead of corruption.

Full changelog:
- Don't even display that cpumining is disabled on ./configure to discourage people from enabling it.
- Do a complete cgminer restart if the ATI Display Library fails, as it does on windows after running for some time, when fanspeed reporting fails.
- Cache the initial arguments passed to cgminer and implement an attempted restart option from the settings menu.
- Disable per-device status lines when there are more than 8 devices since screen output will be corrupted, enumerating them to the log output instead at startup.
- Reuse Vals[] array more than W[] till they're re-initialised on the second sha256 cycle in poclbm kernel.
- Minor variable alignment in poclbm kernel.
- Make sure to disable devices with any status not being DEV_ENABLED to ensure that thermal cutoff code works as it was setting the status to DEV_RECOVER.
- Re-initialising ADL simply made the driver fail since it is corruption over time within the windows driver that's responsible. Revert "Attempt to re-initialise ADL should a device that previously reported fanspeed stops reporting it."
- Microoptimise poclbm kernel by ordering Val variables according to usage frequency.
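The "cache the initial arguments and restart" item can be pictured with a short sketch. This is a Python analogue of the general technique (save the original argument list at startup, then re-execute the same program with it), not cgminer's actual C implementation; `INITIAL_ARGV`, `restart_command` and `restart_from_scratch` are names invented for the example.

```python
import os
import sys

# Captured once at startup, before anything has a chance to modify sys.argv.
INITIAL_ARGV = list(sys.argv)


def restart_command(initial_argv, executable=sys.executable):
    """Build the (path, argv) pair used to re-execute the same program."""
    return executable, [executable] + list(initial_argv)


def restart_from_scratch():
    """Replace the running process with a fresh copy using the cached arguments."""
    path, argv = restart_command(INITIAL_ARGV)
    os.execv(path, argv)  # does not return if the exec succeeds
```

Because the process image is replaced rather than patched up in place, any state that degraded over time in the old process is discarded, which fits the changelog's note that re-initialising ADL alone did not help. It also means run-time stats do not survive the restart.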

Krak
April 15, 2012, 04:23:54 AM Last edit: April 15, 2012, 05:19:31 AM by Krak
Installed 2.3.3 and it still shows "cgminer version 2.3.2" for the Ubuntu binary. Also been getting some weird screen lag on my display GPU, but it's still set at dynamic intensity like it's always been. The display GPU is getting about half the hashrate that it usually gets too.
EDIT: It looks like this was triggered by my trying out a new pool and changing from d,7 intensities and -g 1 to d,8 and default threads. Switching back to a p2pool node with d,7 and -g 1 again fixed it.
BTC: 1KrakenLFEFg33A4f6xpwgv3UUoxrLPuGn

-ck (OP)
April 15, 2012, 07:20:09 AM
The version number was the only thing missing from the binary. I've reuploaded it (it's the same but just shows 2.3.3).