-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
February 09, 2012, 11:12:11 PM |
|
Thanks for the reply, ckolivas, and thanks again for the good work. If I remember right, you upgraded the kernels with version 1.5.1 or a bit later, and again with version 2.2.1 or 2.2.3. I don't blame you for the performance decrease, ckolivas! I just want to find out what other users do for best performance and what the best config is. By the way, do you still support SDK 2.1? Your FAQ mentions 2.4/2.5 only.
The recent kernel updates were purely bugfixes for platforms they wouldn't work on. SDK 2.1 should work fine for 5x cards with poclbm. I don't think they work with phatk.
|
Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel 2% Fee Solo mining at solo.ckpool.org -ck
|
|
|
Endeavour79
|
|
February 09, 2012, 11:14:13 PM |
|
Thanks for the reply, ckolivas, and thanks again for the good work. If I remember right, you upgraded the kernels with version 1.5.1 or a bit later, and again with version 2.2.1 or 2.2.3. I don't blame you for the performance decrease, ckolivas! I just want to find out what other users do for best performance and what the best config is. By the way, do you still support SDK 2.1? Your FAQ mentions 2.4/2.5 only.
The recent kernel updates were purely bugfixes for platforms they wouldn't work on. SDK 2.1 should work fine for 5x cards with poclbm. I don't think they work with phatk.
Thanks. Will try 11.6 with SDK 2.1 then. Will post an update soon. By the way, 1.5.x was the best performance ever.
|
NSW, Australia - Rigs, Mining, Pools - Local help needed? Send me a message!
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
February 09, 2012, 11:15:19 PM |
|
Thanks for the reply, ckolivas, and thanks again for the good work. If I remember right, you upgraded the kernels with version 1.5.1 or a bit later, and again with version 2.2.1 or 2.2.3. I don't blame you for the performance decrease, ckolivas! I just want to find out what other users do for best performance and what the best config is. By the way, do you still support SDK 2.1? Your FAQ mentions 2.4/2.5 only.
The recent kernel updates were purely bugfixes for platforms they wouldn't work on. SDK 2.1 should work fine for 5x cards with poclbm. I don't think they work with phatk.
Thanks. Will try 11.6 with SDK 2.1 then. Will post an update soon. By the way, 1.5.x was the best performance ever.
I seriously cannot see how that can happen...
|
|
|
|
gnar1ta$
Donator
Hero Member
Offline
Activity: 798
Merit: 500
|
|
February 09, 2012, 11:25:12 PM |
|
Anyone know how to solve rcocchiararo's "no protocol specified" issue? I've reformatted a couple of rigs because I couldn't figure that one out. Sometimes xhost + works, but usually not.
|
Losing hundreds of Bitcoins with the best scammers in the business - BFL, Avalon, KNC, HashFast.
|
|
|
rcocchiararo
Newbie
Offline
Activity: 78
Merit: 0
|
|
February 09, 2012, 11:27:16 PM |
|
Something to add to my issue:
1) autologin is enabled on that rig
2) if a display is attached, I can remote control it with VNC (Ubuntu/Debian integrated)
3) if no display is attached, that won't work
|
|
|
|
kano
Legendary
Offline
Activity: 4620
Merit: 1851
Linux since 1997 RedHat 4
|
|
February 09, 2012, 11:46:54 PM |
|
Didn't forget about this. I ran 2.1.2 with no errors for 18 hours, then started 2.2.3 yesterday with the same flags and got this: Weird how last initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.
Both 2.2.2 and 2.2.3 will not even start on the machine I compile on; they fail to initialize the GPUs. I never tried them on the others I have. Just an aside ... don't try to use 2.2.2.
|
|
|
|
Diapolo
|
|
February 10, 2012, 05:54:14 AM |
|
Hey Con, I looked again through every kernel argument and compared line by line with my Python code. I found 2 small differences and 2 brackets that are not needed (see last commit https://github.com/Diapolo/cgminer/commit/68e36c657318fbe1e7714be470cf954a1d512333), but I guess they don't fix the persisting problem with false-positive nonces (perhaps you can give it a try, as I have no compiler or IDE set up to test it myself). The argument order is exactly as DiaKGCN expects it, so that can't be the problem either. It could be a problem with your changes to the output code in the kernel, a problem with the base nonces that are passed to the kernel, or something with the output buffer in the CGMINER host code ... :-/. Where does the output-buffer processing reside? As I said, my kernel used ulong * natively, which I changed to uint * in one commit of my fork; I guess I need to look at it. Edit: OMFG, I introduced a bug with one of my former commits, which changed the type of the output buffer from uint * to int * ... fixed that one! It's time for another try, Con. Dia
Diapolo... I appreciate the effort you're putting in, and I realise you're new to this collaborative coding and source control management, but it's probably a good idea to check that your code actually compiles before you ask someone to test it. Usually people compile and test their own code before asking someone else to test it for them. Anyway... I fixed the !(find) in my local copy and it still produces hardware errors. edit: It doesn't matter what vectors or worksize I try this with.
Well ... as I said, I have no IDE set up, so currently I can't compile a version myself. If you don't have the time to fiddle around with my commits, then I really need help setting up an IDE on Windows. Have you got this in a readme or wiki, or can you give me a brief explanation of how to do this? I worked with MS VC++ Express as a hobby some time ago ... You said local copy; is it a copy of the last version of my fork?
As you've observed, I am new to this kind of working, but I hope you see my progress. Dia
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
February 10, 2012, 05:59:47 AM |
|
Well ... as I said, I have no IDE set up, so currently I can't compile a version myself. If you don't have the time to fiddle around with my commits, then I really need help setting up an IDE on Windows. Have you got this in a readme or wiki, or can you give me a brief explanation of how to do this? I worked with MS VC++ Express as a hobby some time ago ... You said local copy; is it a copy of the last version of my fork? As you've observed, I am new to this kind of working, but I hope you see my progress. Dia
Compiling this on Windows is nothing short of a DISASTER, so forget it. Anyway, I fixed up a few things on my Diapolo branch on github. Pull the changes to bring your local tree into sync. Alas I'm still only getting HW errors, so there's clearly something wrong. The return code for giving me a nonce I use works fine, provided I'm testing for the right thing before sending the nonce back. I've stared at it for half a day and can't find what's wrong. I even tried diablo's kernel and encountered exactly the same problem. For some reason I keep thinking it's something to do with confusion about the initial offset of the nonce and what is passed to the kernel.
|
|
|
|
Diapolo
|
|
February 10, 2012, 06:27:55 AM Last edit: February 10, 2012, 06:45:23 AM by Diapolo |
|
Well ... as I said, I have no IDE set up, so currently I can't compile a version myself. If you don't have the time to fiddle around with my commits, then I really need help setting up an IDE on Windows. Have you got this in a readme or wiki, or can you give me a brief explanation of how to do this? I worked with MS VC++ Express as a hobby some time ago ... You said local copy; is it a copy of the last version of my fork? As you've observed, I am new to this kind of working, but I hope you see my progress. Dia
Compiling this on Windows is nothing short of a DISASTER, so forget it. Anyway, I fixed up a few things on my Diapolo branch on github. Pull the changes to bring your local tree into sync. Alas I'm still only getting HW errors, so there's clearly something wrong. The return code for giving me a nonce I use works fine, provided I'm testing for the right thing before sending the nonce back. I've stared at it for half a day and can't find what's wrong. I even tried diablo's kernel and encountered exactly the same problem. For some reason I keep thinking it's something to do with confusion about the initial offset of the nonce and what is passed to the kernel.
Okay, so as I wrote, if Phatk works, then the base nonces passed to the kernel should be correct for diakgcn. I will check phatk.cl to be sure. I saw you added a BITALIGN path to diakgcn that doesn't use bitalign() or any other OpenCL function, but simply does its thing directly. What is that for? I'm not sure that's needed for a GCN kernel anyway. Another idea: are you applying a BFI_INT patch on Tahiti (it must not use amd_bytealign())? This is not needed and produces wrong values ... I want that damn thing working; I've stared at it for quite a few hours too ^^.
Edit: Perhaps we could try my old approach of writing to output in the kernel, because I know that worked for me? That's the code I used, but it uses your NFLAG. It would need to scan the output buffer on the host side every time after a kernel execution, which could lead to higher CPU usage (and needs changes in host code), but saves the IF-clause and another write into output (which saves the kernel quite some instructions, even on GCN).
u result = (V[7] == 0x136032ed) * nonce;
output[NFLAG & result] = result;
This code would be more like your current code, but uses the approach of comparison and mul to save 0 or a positive nonce in result (and is slower than your current code). But for sure that can't be the problem we are looking for ...
u result = (V[7] == 0x136032ed) * nonce;
if (result) output[FOUND] = output[NFLAG & result] = result;
Dia
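To make the first variant concrete, here is a rough host-side C sketch of what each work-item would do and how the host would have to scan the buffer afterwards. It is only an illustration: the NFLAG value of 0x7F, the buffer size, and the function names emulate_work_item and collect_nonces are assumptions, not taken from cgminer, and u is treated as a scalar 32-bit value.

```c
#include <stdint.h>
#include <stddef.h>

#define NFLAG 0x7Fu                 /* assumed mask, not the value cgminer uses */
#define OUTPUT_SLOTS (NFLAG + 1u)   /* assumed buffer size: one slot per masked index */

/* Emulates one work-item: result is the nonce on a match and 0 otherwise,
 * and it is stored unconditionally at index (NFLAG & result). */
static void emulate_work_item(uint32_t v7, uint32_t nonce, uint32_t *output)
{
    uint32_t result = (uint32_t)(v7 == 0x136032edu) * nonce;
    output[NFLAG & result] = result;
}

/* Host-side scan after a kernel run: every non-zero slot is a candidate nonce. */
static size_t collect_nonces(const uint32_t *output, uint32_t *found, size_t max_found)
{
    size_t n = 0;
    for (uint32_t i = 0; i < OUTPUT_SLOTS && n < max_found; i++) {
        if (output[i] != 0)
            found[n++] = output[i];
    }
    return n;
}
```

In this sketch every non-matching work-item writes 0 into slot 0, so a matching nonce whose masked index is also 0 could be overwritten. That is one reason the host scan and the choice of mask would need care if this approach were tried.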
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
February 10, 2012, 06:45:46 AM |
|
Okay, so as I wrote, if Phatk works, then the base nonces passed to the kernel should be correct for diakgcn. I will check phatk.cl to be sure. I saw you added a BITALIGN path to diakgcn that doesn't use bitalign() or any other OpenCL function, but simply does its thing directly. What is that for? I'm not sure that's needed for a GCN kernel anyway. Another idea: are you applying a BFI_INT patch on Tahiti (it must not use amd_bytealign())? This is not needed and produces wrong values ... I want that damn thing working; I've stared at it for quite a few hours too ^^.
BITALIGN is to enable AMD media ops for platforms that have it. BFI INT patching does NOT work on Tahiti. It makes a corrupt kernel. SDK 2.6 automatically uses the BFI INT instruction anyway, so there is no need for this crappy patching.
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
February 10, 2012, 06:49:36 AM |
|
Edit: Perhaps we could try my old approach of writing to output in the kernel, because I know that worked for me? That's the code I used, but it uses your NFLAG. It would need to scan the output buffer on the host side every time after a kernel execution, which could lead to higher CPU usage (and needs changes in host code), but saves the IF-clause and another write into output (which saves the kernel quite some instructions, even on GCN).
u result = (V[7] == 0x136032ed) * nonce;
output[NFLAG & result] = result;
This code would be more like your current code, but uses the approach of comparison and mul to save 0 or a positive nonce in result (and is slower than your current code). But for sure that can't be the problem we are looking for ...
u result = (V[7] == 0x136032ed) * nonce;
if (result) output[FOUND] = output[NFLAG & result] = result;
Dia
Writing to output on every iteration isn't going to fix the problem, and I can't see how this would help, to be honest. Note that your last code will end up setting output[FOUND] to 0 and would undo anything you wrote to it with other threads.
|
|
|
|
Diapolo
|
|
February 10, 2012, 07:11:21 AM |
|
Okay, so as I wrote, if Phatk works, then the base nonces passed to the kernel should be correct for diakgcn. I will check phatk.cl to be sure. I saw you added a BITALIGN path to diakgcn that doesn't use bitalign() or any other OpenCL function, but simply does its thing directly. What is that for? I'm not sure that's needed for a GCN kernel anyway. Another idea: are you applying a BFI_INT patch on Tahiti (it must not use amd_bytealign())? This is not needed and produces wrong values ... I want that damn thing working; I've stared at it for quite a few hours too ^^.
BITALIGN is to enable AMD media ops for platforms that have it. BFI INT patching does NOT work on Tahiti. It makes a corrupt kernel. SDK 2.6 automatically uses the BFI INT instruction anyway, so there is no need for this crappy patching.
That's what I said: I wanted to be sure that BFI_INT patching is DISABLED for Tahiti. I thought it could be active, which would have been a problem. The BITALIGN flag is not needed for DiaKGCN, because amd_bitalign() is used nowhere ... cl_amd_media_ops is only needed for doing BFI_INT patching on non-GCN hardware (where amd_bytealign() is "patched" into the bfi_int instruction). amd_bitalign() was used with former SDKs to speed up the rotations; these are now optimized into bitalign by the OpenCL compiler anyway. Dia
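For anyone following along, the two hardware operations discussed above can be emulated on the host in plain C. This is a sketch for illustration only: the emu_ and _plain function names are made up here, and the code models what the operations compute, not how any particular kernel in cgminer is written.

```c
#include <stdint.h>

/* Emulation of amd_bitalign: treat src0:src1 as one 64-bit value
 * and shift it right by (src2 & 31), keeping the low 32 bits. */
static uint32_t emu_bitalign(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint64_t wide = ((uint64_t)src0 << 32) | src1;
    return (uint32_t)(wide >> (src2 & 31u));
}

/* Rotate left expressed through bitalign, as kernels for older SDKs did. */
static uint32_t rotl_bitalign(uint32_t x, uint32_t n)
{
    return emu_bitalign(x, x, 32u - n);
}

/* Plain rotate left, which the compiler is expected to optimize on its own. */
static uint32_t rotl_plain(uint32_t x, uint32_t n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* What a bitfield insert computes: bits of b where a is 1, bits of c where a is 0.
 * This is the SHA-256 Ch function. */
static uint32_t emu_bfi_int(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (~a & c);
}

/* Ch written with ordinary operators, needing no patching. */
static uint32_t ch_plain(uint32_t x, uint32_t y, uint32_t z)
{
    return z ^ (x & (y ^ z));
}
```

Both pairs give identical results for every input, so dropping the media-ops path changes only which instructions the compiler has to find by itself, not the hash output.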
|
|
|
|
Diapolo
|
|
February 10, 2012, 07:21:22 AM |
|
Edit: Perhaps we could try my old approach of writing to output in the kernel, because I know that worked for me? That's the code I used, but it uses your NFLAG. It would need to scan the output buffer on the host side every time after a kernel execution, which could lead to higher CPU usage (and needs changes in host code), but saves the IF-clause and another write into output (which saves the kernel quite some instructions, even on GCN).
u result = (V[7] == 0x136032ed) * nonce;
output[NFLAG & result] = result;
This code would be more like your current code, but uses the approach of comparison and mul to save 0 or a positive nonce in result (and is slower than your current code). But for sure that can't be the problem we are looking for ...
u result = (V[7] == 0x136032ed) * nonce;
if (result) output[FOUND] = output[NFLAG & result] = result;
Dia
Writing to output on every iteration isn't going to fix the problem, and I can't see how this would help, to be honest. Note that your last code will end up setting output[FOUND] to 0 and would undo anything you wrote to it with other threads.
You are right about the last code; at least I made no commit for it :-P. Okay, phatk has const u base, which DiaKGCN has too if GOFFSET is not set, which is currently always true.
phatk: base + get_local_id(0) + get_group_id(0) * (WORKSIZE);
diakgcn: ((uint)get_group_id(0) * (uint)get_local_size(0)) + (uint)get_local_id(0) + base;
I have the (uint) casts because the returned size_t can have 64 bits on some systems. The brackets I have around group-id * local_size should be unneeded, because * binds before +. Guess there's no problem here and, as we discussed, output writing should be okay for now. Dia
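The two nonce expressions can be compared on the host with a small C sketch. It assumes the kernel is launched with a local size equal to WORKSIZE and models the OpenCL ids as size_t parameters; the function names nonce_phatk and nonce_diakgcn are made up for this illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* phatk style: WORKSIZE is a compile-time constant in the kernel;
 * here it is passed in as a parameter. */
static uint32_t nonce_phatk(uint32_t base, size_t local_id, size_t group_id, uint32_t worksize)
{
    return base + (uint32_t)local_id + (uint32_t)group_id * worksize;
}

/* diakgcn style: each size_t id is narrowed to 32 bits before the arithmetic,
 * so a 64-bit size_t cannot widen the result. */
static uint32_t nonce_diakgcn(uint32_t base, size_t local_id, size_t group_id, size_t local_size)
{
    return ((uint32_t)group_id * (uint32_t)local_size) + (uint32_t)local_id + base;
}
```

Under that assumption both functions return base plus the global work-item index, wrapping modulo 2^32, which supports the conclusion that the nonce offset itself is not where the two kernels differ.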
|
|
|
|
Peao
Legendary
Offline
Activity: 1320
Merit: 1001
|
|
February 10, 2012, 11:04:46 AM |
|
Didn't forget about this. I ran 2.1.2 with no errors for 18 hours, then started 2.2.3 yesterday with the same flags and got this: (...) Weird how last initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.
I had to stay on 2.1.2. It's weird that only the two of us have reported this problem. I believe that when people with 5970 rigs running Linux update cgminer, more posts like ours will appear.
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
February 10, 2012, 11:18:17 AM |
|
Didn't forget about this. I ran 2.1.2 with no errors for 18 hours, then started 2.2.3 yesterday with the same flags and got this: (...) Weird how last initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.
I had to stay on 2.1.2. It's weird that only the two of us have reported this problem. I believe that when people with 5970 rigs running Linux update cgminer, more posts like ours will appear.
Working on it... workaround for now is to set fan to fixed speed and disable --auto-fan
|
|
|
|
der__flo
Newbie
Offline
Activity: 9
Merit: 0
|
|
February 10, 2012, 11:24:46 AM |
|
Didn't forget about this. I ran 2.1.2 with no errors for 18 hours, then started 2.2.3 yesterday with the same flags and got this: (...) Weird how last initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.
I had to stay on 2.1.2. It's weird that only the two of us have reported this problem. I believe that when people with 5970 rigs running Linux update cgminer, more posts like ours will appear.
Same here, 2x 6770 on Ubuntu 11.04 ~ 443 MH/s. With 2.2.3, GPU 0 stops working after a few hours. When I restart cgminer it works again for a few hours. With 2.1.2 both cards have been working fine for 30 hrs now...
|
|
|
|
Peao
Legendary
Offline
Activity: 1320
Merit: 1001
|
|
February 10, 2012, 01:09:08 PM |
|
Working on it... workaround for now is to set fan to fixed speed and disable --auto-fan
Thank you, ck. No need to hurry. I know you are very busy these days.
|
|
|
|
-ck (OP)
Legendary
Offline
Activity: 4284
Merit: 1645
Ruu \o/
|
|
February 10, 2012, 03:33:23 PM |
|
Working on the 7970 tuning, I have tried to port both the diapolo and diablo kernels to cgminer. Alas neither of them is actually working yet, so instead I started modifying the existing poclbm kernel in cgminer to improve throughput. This should work on other GPUs as well as the 7970, but I have no idea if it's better or worse than phatk. When it's released it will get a new date/version number, but I haven't changed the number right now so that people can download it now and give it a try: https://raw.github.com/ckolivas/cgminer/kernels/poclbm120203.cl
Remember to delete any .bin files, and if you're not on a 7970 with the latest cgminer, you'll have to tell it to use that kernel with -k poclbm. So what's the damage? Well, on the 7970 at 1200/1050 clocks, which was getting 694 MHash, it's now getting 711 MHash. The 7970 has this unusual behaviour where the hashrate slowly rises for the first 5-10 minutes.
|
|
|
|
johnyj
Legendary
Offline
Activity: 1988
Merit: 1012
Beyond Imagination
|
|
February 10, 2012, 04:07:13 PM |
|
2.2.3 had strange behavior on my fixed-fan-rate rig: 10% lower hash rate on the 5870s and a crash on the 5970.
|
|
|
|
Proofer
Member
Offline
Activity: 266
Merit: 36
|
|
February 10, 2012, 04:50:50 PM |
|
Didn't forget about this. I ran 2.1.2 with no errors for 18 hours, then started 2.2.3 yesterday with the same flags and got this: (...) Weird how last initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.
I had to stay on 2.1.2. It's weird that only the two of us have reported this problem. I believe that when people with 5970 rigs running Linux update cgminer, more posts like ours will appear.
Must be something other than 5970+Linux: I've been running 2.2.3 on a 3x5970 rig for several days without problems. (Well, I did have one DEAD core, but I had that occasionally with previous versions as well.) With p2pool:
"intensity" : "9",
"gpu-threads" : "1",
"gpu-engine" : "810-810",
"gpu-memclock" : "200",
"gpu-fan" : "85",
"temp-cutoff" : "80",
"temp-overheat" : "77",
"temp-target" : "70",
"temp-hysteresis" : "3",
"auto-fan" : true,
"auto-gpu" : true,
|
|
|
|
|