I was trying to identify linked GPUs with this latest code to prevent frying a GPU by disabling the one that had the fan control while the one without the fan control was still burning.
Hmmmm D&T... I was thinking that I can't fix the ordering, as that happens on a bios+mobo+os+driver+/- crossfire basis. There is a way though... hmm but it would require running through all the ADL code first and then renumbering everything... hmmmmmm
* ckolivas hmms some more...
I think I have a way.
* ckolivas ponders.
BBIAW...
|
|
|
Thanks very much for that D&T
You said:
A iBusNumber 10
B iBusNumber 14
C iBusNumber 9
D iBusNumber 13
E iBusNumber 5
F iBusNumber 6
and:
GPU #0 & GPU #2 = 2nd physical card (2nd expansion slot from CPU)
GPU #1 & GPU #3 = 3rd physical card (3rd expansion slot from CPU)
GPU #5 & GPU #6 = 1st physical card (1st expansion slot from CPU)
This is reassuring. Using my existing logic, cgminer should already be grouping A+C, B+D and E+F which corresponds with your layout. This should actually already be working. GPU 0 and GPU 2 should be showing the same fanspeed and so on. Is it not working?
I'm trying to recall if you watercooled or something, because cgminer also relies on one card having a fancontrol while the other does not. Is there any way you have removed the fancontrol? Ultimately all I'm trying to do with this code is make the fan speed control work from the temps from both GPUs.
If not, I could always remove the test for one having a fancontrol and the other not having one, and rely entirely on the bus numbers.
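To make the grouping concrete, here is a minimal Python sketch of the idea. It is not cgminer's actual C code. It assumes that two GPUs count as linked when their iBusNumber values differ by exactly one and, optionally, when exactly one of the pair exposes a fan control. The `Gpu` type, the `has_fan_control` field and `pair_linked_gpus` are hypothetical names, and the fan-control flags in the example data are invented for illustration; only the bus numbers come from the post above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Gpu:
    label: str             # letter used in the post above (A..F)
    bus_number: int        # iBusNumber as reported by the ADL
    has_fan_control: bool  # assumed flag; the values below are invented


def pair_linked_gpus(gpus: List[Gpu], require_fan_asymmetry: bool = True) -> Dict[str, str]:
    """Map each GPU label to the label of its presumed twin.

    Assumption: twins sit on adjacent bus numbers. When
    require_fan_asymmetry is True, exactly one GPU of the pair
    must also report a fan control.
    """
    twins: Dict[str, str] = {}
    for a in gpus:
        for b in gpus:
            if a is b or a.label in twins or b.label in twins:
                continue
            if abs(a.bus_number - b.bus_number) != 1:
                continue
            if require_fan_asymmetry and a.has_fan_control == b.has_fan_control:
                continue
            twins[a.label] = b.label
            twins[b.label] = a.label
    return twins


# Bus numbers from the post; fan-control flags are made up.
reported = [
    Gpu("A", 10, True), Gpu("B", 14, True), Gpu("C", 9, False),
    Gpu("D", 13, False), Gpu("E", 5, True), Gpu("F", 6, False),
]

if __name__ == "__main__":
    print(pair_linked_gpus(reported))
```

If every GPU reported a fan control (for example because the stock cooler logic was bypassed), the strict variant would find no pairs at all, while `require_fan_asymmetry=False` would still pair on bus numbers alone. That is the trade-off described above: relying only on bus numbers finds more pairs but raises the risk of a false positive.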
|
|
|
Thanks. Here is a cgdebug.exe special build that outputs extra information about each device: http://ck.kolivas.org/apps/cgminer/temp/cgdebug.exe
For those on windows with a 5970 or 6990 whose GPUs are out of order, please run this like you would run cgminer, but with -T, and stop it once it starts mining. Then please post the lines that look like this:
[2012-01-22 17:31:58] lpAdapterID 20016752 iBusNumber 1 iDeviceNumber 0 iFunctionNumber 0 iVendorID 4098 strAdapterName AMD Radeon HD 6900 Series strDisplayName :0.0 lpInfo.strUDID 256:26393:4098:12583:5762
And then please describe the layout of where the GPUs of each GPU "twin" appear. Thanks!
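For anyone collecting these lines from several machines, a small Python sketch like the following can pull the fields out for comparison. It assumes the debug lines always follow the layout of the sample above; `parse_adapter_line` is a hypothetical helper and is not part of cgminer or cgdebug.

```python
import re
from typing import Dict, Union

# Integer fields as they appear in the sample debug line.
INT_FIELDS = ("lpAdapterID", "iBusNumber", "iDeviceNumber",
              "iFunctionNumber", "iVendorID")

# The three string fields come last; the adapter name may contain spaces.
STRING_TAIL = re.compile(
    r"strAdapterName (?P<name>.*?) strDisplayName (?P<display>\S*) "
    r"lpInfo\.strUDID (?P<udid>\S+)"
)


def parse_adapter_line(line: str) -> Dict[str, Union[int, str]]:
    """Extract the fields of one debug line into a dict.

    Fields that cannot be found are simply left out of the result.
    """
    info: Dict[str, Union[int, str]] = {}
    for field in INT_FIELDS:
        match = re.search(r"\b" + field + r" (-?\d+)", line)
        if match:
            info[field] = int(match.group(1))
    tail = STRING_TAIL.search(line)
    if tail:
        info["strAdapterName"] = tail.group("name")
        info["strDisplayName"] = tail.group("display")
        info["strUDID"] = tail.group("udid")
    return info


sample = ("[2012-01-22 17:31:58] lpAdapterID 20016752 iBusNumber 1 "
          "iDeviceNumber 0 iFunctionNumber 0 iVendorID 4098 "
          "strAdapterName AMD Radeon HD 6900 Series strDisplayName :0.0 "
          "lpInfo.strUDID 256:26393:4098:12583:5762")

if __name__ == "__main__":
    print(parse_adapter_line(sample))
```

Sorting the parsed entries by `iBusNumber` makes it easier to see which adapters sit next to each other on the bus.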
|
|
|
True I guess when trying to debug a graphics card, crashes are gonna happen and doing that remotely would suck.
Indeed. When I was working remotely on that 5970 card I literally had someone 10,000 km away who would push the reset button for me whenever he saw the screen go haywire, telling him it had crashed. Yet that wasn't entirely reliable either and I'd have to email him regularly... and all I was doing was adjusting fan control on it. We were in different timezones too, so he really didn't appreciate me spinning the fans up and down while he was trying to sleep.
|
|
|
Does this mean you don't need another Linux/6990 box?
Correct, thanks for the offer though. I just happened to bug someone online and get the access I needed. Linux is working mostly ok now with these dual card GPUs. Windows on the other hand... as you can see.
|
|
|
Does the AMD API expose a card serial # or any other unique identifying information?
Nope nope, as I've said numerous times before this is the most unsatisfying part of it. The OpenCL information is useless for determining which card is which. Furthermore if you have 2 monitors connected to a single GPU, it will come up as 2 unique OpenCL devices. Then motherboards decide to order cards backwards sometimes, with the PCI bus id order being the opposite of the device order we end up getting. Some motherboards also do not order their PCIe lanes in numerical order, with lanes going 1,3,2 or other randomness.
Then the ATI Display Library gives unique adapter ID information that has absolutely nothing in common with the OpenCL devices. The ADL will also give you one for each device, including the ones that don't have OpenCL support, so if you have a card that can't mine in your setup, the number of devices cannot match. There are also unique "thermal devices" in the ADL which had the opportunity to designate GPU 0 and GPU 1 in shared devices (like 6990) but instead they simply come up as unique devices. Shared GPUs in single cards also do not have a single identifying feature whatsoever to say they're on the same card. Then of course, windows does something different to linux, and osx doesn't even have an ADL.
All I can do is enumerate them in the order they appear and *hope* they align. Then I use surrogate markers with circumstantial evidence that the GPUs are on the same card. As far as I'm concerned, getting any of these bastards behaving in concert is somewhat of a miracle. All in all it's a mindfuck of epic proportions.
|
|
|
Righto, well still some work to do then
|
|
|
So I got some debugging on a 6990 and found that, of course, it does things differently to the 5970.
I've committed some changes to the git tree which should detect 5970s and 6990s reliably on linux, and mumble mumble something maybe on windows. I have at least one report of success on windows with 5970 already.
Give it a go and report back!
I gave that windows build a spin and unfortunately it doesn't work like it should. I have 2x6990, so 4 GPUs visible to cgminer. I've run it with 3 GPUs enabled and one disabled. By looking at the temperatures, it is clear that cgminer is not showing the correct temperature for the correct core. Also, this version does not show temperatures for cores that are disabled; I think it would be useful if we could still see those. Next remark: for my first 2 cores I see the fan RPM (always have, unchanged in this version), but the other 2 just show a percentage...
This is a windows machine, yes? Was this one where the two GPUs from the same card would not be one after the other? I don't know if I can pick those up at all. As I've said before, false positives for linked GPUs would be a big problem whereas false negatives would be no worse than current releases of cgminer. I'll try and make a debugging version for windows which might give me more information.
|
|
|
So I got some debugging on a 6990 and found that, of course, it does things differently to the 5970. I've committed some changes to the git tree which should detect 5970s and 6990s reliably on linux, and mumble mumble something maybe on windows. I have at least one report of success on windows with 5970 already.
You can now also mine with 7970s if you specify the poclbm kernel with -k poclbm, but it will perform poorly so it's not recommended. Intensity can also be increased beyond 10 now (specifically put there for the 7970s) but it is highly advised against for most cards, where 8-9 is usually best.
All this and more, and I even made an exe for windows (this is not the final new version): http://ck.kolivas.org/apps/cgminer/temp/cgminer.exe
This is taking longer than I'd hoped for the next release to come out, but there are still some final touches to put to it, and I really want things working well. Alas 7970 support is still woeful for now, but that will change thanks to this: https://bitcointalk.org/index.php?topic=61027.0
Give it a go and report back!
|
|
|
I committed a little more 79x0 specific code and built a windows binary. Again, no idea if it works, nor if it performs, but here is a windows build: http://ck.kolivas.org/apps/cgminer/temp/cgminer.exe
Just drop it into a 2.1.2 directory, replacing the existing exe.
Still broken it appears, never mind for now... Should change when I get a 7970 from the sponsor thread.
|
|
|
You only would need to compile and shit if you change the API. If the API is static and you're just fiddling with the cl code, modify the cl file, just delete any .bins generated and start the app again.
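The "delete any .bins" step can be scripted. Below is a minimal Python sketch, assuming the generated binaries sit in the same directory as the exe and end in `.bin`; `remove_cached_kernels` is a hypothetical helper, not something shipped with cgminer. It defaults to a dry run because it will match every `*.bin` file in that directory, not only the ones cgminer generated.

```python
from pathlib import Path
from typing import List


def remove_cached_kernels(directory: str, dry_run: bool = True) -> List[Path]:
    """List (and optionally delete) the *.bin files in a directory.

    Assumption: the cached kernel binaries live directly in
    `directory`. With dry_run=True nothing is deleted; the matching
    paths are only returned so they can be checked first.
    """
    matched: List[Path] = []
    for path in sorted(Path(directory).glob("*.bin")):
        if not path.is_file():
            continue
        if not dry_run:
            path.unlink()
        matched.append(path)
    return matched


if __name__ == "__main__":
    # Dry run against the current directory: prints what would be removed.
    for cached in remove_cached_kernels("."):
        print("would remove", cached)
```

Call it with `dry_run=False` once the list looks right, then start the app again so the binaries get rebuilt from the edited .cl file.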
|
|
|
That's fantastic thanks so much
I committed a little more 79x0 specific code and built a windows binary. Again, no idea if it works, nor if it performs, but here is a windows build: http://ck.kolivas.org/apps/cgminer/temp/cgminer.exe
Just drop it into a 2.1.2 directory, replacing the existing exe.
With this new cgminer.exe, at first it said my 7970 failed to start, but once in cgminer I enabled it and it started hashing. My 7970 is getting 384 MH/s at 925 MHz, 0.9 V; my 5870 is getting 349 MH/s at 850 MHz; 338 watts at the wall. With diablo miner I'm getting 880 MH/s from the same cards at the same speeds, but 388 watts at the wall. I just noticed my 7970 is showing A:0 R:0 HW:53. Doesn't look like it's accepting any shares?? Using diablominer now.
Thanks. I guess it's still horribly broken then.
|
|
|
Ah. Gotcha. I'll give it another go.
Well, in that case, there's another thing to sort out that doesn't make any damn sense (of course). The phatk kernel doesn't always run on GPU0. It's fairly random. It always runs on the other 3 cards every time. CGMiner pauses with a message about closing other apps that use the GPU (like Afterburner). If I close CGMiner and restart it with the same command line, it'll run. It's probably 50/50.
Yes this is the never ending mindfuck that is windows creating a binary that reports size zero after it's built. Then it changes its mind and works the next time around. It's a bug that's plagued cgminer on windows for 6 months now and makes no sense whatsoever to anyone.
Uploaded a fresh .exe which may fix this problem. It turns out I may have been attempting to build the OpenCL program twice.
|
|
|
Further investigation reveals the phatk kernel included with the current cgminer basically ONLY works with the bfi int patching. That means the kernel itself needs fixing to even work in its current intended form, so there's still some work to go... However for those brave, at least you have an exe you can use with the kernel and you can delete the .bin files and manually fiddle with the code in the phatk*.cl file included. Other kernels taken from other projects diablo, diapolo, phoenix etc will NOT under ANY circumstances work directly as the API is different so don't waste your time trying that.
I've uploaded a *potential* fix for the phatk kernel in here: http://ck.kolivas.org/apps/cgminer/temp
Make sure to right click on phatk110817.cl and choose "save link as" to avoid your browser falsely turning it into html. For 79x0: try replacing the existing file with that one, alongside the new exe, and see if that submits valid shares...
|
|
|
Further investigation reveals the phatk kernel included with the current cgminer basically ONLY works with the bfi int patching. That means the kernel itself needs fixing to even work in its current intended form, so there's still some work to go... However for those brave, at least you have an exe you can use with the kernel and you can delete the .bin files and manually fiddle with the code in the phatk*.cl file included. Other kernels taken from other projects diablo, diapolo, phoenix etc will NOT under ANY circumstances work directly as the API is different so don't waste your time trying that.
My current posted kernel version doesn't work with 7970, but I'm currently in the process of rewriting / reordering the kernel, which currently gives a performance of ~540 MHash/s for my 7970 card (this was over 100 MHash/s lower before I started my work). I guess there is still more potential in it, DiabloD3's kernel seems to be even faster! But as Con said, this won't work for CGMINER ...
Dia
It's most unusual that you work with only phoenix... it's not like working with cgminer is hard.
|
|
|
Thanks. By the way, I reuploaded the exe after that first message so perhaps you got the dud build. Please try re-downloading it.
Increasing threads is nothing like increasing intensity so it will not have the same effect.
Finally, we are never going to get there with the poclbm kernel. The one it is loading is ancient and numerous improvements have happened since then, most of which are in the phatk kernel. Unless I can get the existing phatk kernel working it's going to be nigh on useless trying to mine on 79x0 until I can create an entirely new kernel...
|
|
|
Interesting. Redownload the .exe, I've made a few minor changes and now you can increase intensity to 14. The other thing to try is increasing the threads from 2 to 3 or more with -g 3
Oops, hang on, intensity doesn't work yet, but try -g 3. Now it should work (redownload exe).
|
|
|
|