dsky
|
|
July 07, 2011, 07:22:21 AM |
|
All miners are Windows 7 x32, SDK 2.4, Catalyst 11.6.
Latest changes:
HD5770 - from 219 up to 220
HD6950 (unlockable) - from 367 up to 370
HD6970 (6950 with 6950 BIOS) - from 405 up to 408
Small speed increase on all three kinds of cards, and the reject rate seems better, too.
Well done again, Sir!
|
|
|
|
hugolp
Legendary
Offline
Activity: 1148
Merit: 1001
|
|
July 07, 2011, 07:40:36 AM Last edit: July 07, 2011, 09:07:07 AM by hugolp |
|
5870, Ubuntu 11.04, Catalyst 11.6, SDK 2.4, poclbm: went up 1 MH/s (latest modification compared with the previous one).
The good news is that the card that was randomly crashing the miner every 20 minutes with the previous patch has been running for more than an hour without problems, so it seems stable now. Update: it just crashed. I don't know what happens with this card and the modified kernel. Also, consumption has gone down by about 5 W. I'm very puzzled by these changes in consumption between the different kernels.
Very good job. A small donation is going your way.
|
|
|
|
teukon
Legendary
Offline
Activity: 1246
Merit: 1011
|
|
July 07, 2011, 07:48:45 AM |
|
Ok, here are the errors for the latest kernel on SDK 2.1.
{
Build on <pyopencl.Device 'Cypress' at 0x34a3680>:
/tmp/OCLthVTDN.cl(126): error: mixed vector-scalar operation not allowed unless up-convertable(scalar-type=>vector-element-type)
W[19] = P4(19) + 0x11002000 + P1(19);
^
/tmp/OCLthVTDN.cl(138): error: mixed vector-scalar operation not allowed unless up-convertable(scalar-type=>vector-element-type)
W[30] = P3(30) + 0xA00055 + P1(30);
^
/tmp/OCLthVTDN.cl(261): error: mixed vector-scalar operation not allowed unless up-convertable(scalar-type=>vector-element-type)
Vals[3] = L + W[64];
^
/tmp/OCLthVTDN.cl(286): error: mixed vector-scalar operation not allowed unless up-convertable(scalar-type=>vector-element-type)
W[81] = P4(81) + P2(81) + 0xA00000;
^
/tmp/OCLthVTDN.cl(299): error: mixed vector-scalar operation not allowed unless up-convertable(scalar-type=>vector-element-type)
W[87] = P4(87) + P3(87) + 0x11002000 + P1(87);
^
/tmp/OCLthVTDN.cl(316): error: mixed vector-scalar operation not allowed unless up-convertable(scalar-type=>vector-element-type)
W[94] = P3(94) + 0x400022 + P1(94);
^
6 errors detected in the compilation of "/tmp/OCLthVTDN.cl".
}
Hopefully this is just some implicit casting of a kind which SDK 2.1 wants to be babied through. If I were versed in OpenCL I'd have a go at fixing this myself but I'm sure you would be much more efficient.
You could try to add (u) in front of the raw hex values like (u)0x400022 and report back then.
Dia
No good. Now I get a ton of "expression must have a constant value" errors. The end of the log looks like:
{
/tmp/OCLgV3our.cl(25): error: expression must have a constant value
(u)0x6a09e667, (u)0xbb67ae85, (u)0x3c6ef372, (u)0x510e527f, (u)0x9b05688c, (u)0x1f83d9ab, (u)0xfc08884d, (u)0x5be0cd19
^
/tmp/OCLgV3our.cl(29): error: expression must have a constant value
__constant ulong L = (u)0x198c7e2a2;
^
/tmp/OCLgV3our.cl(261): error: mixed vector-scalar operation not allowed unless up-convertable(scalar-type=>vector-element-type)
Vals[3] = L + W[64];
^
74 errors detected in the compilation of "/tmp/OCLgV3our.cl".
}
Only one of the "mixed vector-scalar operation" errors remains, but I'm guessing the others are still there and just buried by the even more urgent "constant value" errors.
|
|
|
|
OCedHrt
Member
Offline
Activity: 111
Merit: 10
|
|
July 07, 2011, 08:32:42 AM |
|
The 7/6 kernel seems to have the following effects for me:
4850 - ELF kernel is much larger, but there is a slight speed increase of ~3-4%, from 56-57 @ 460 core to 58-59 @ 460 core. It also seems to run cooler. Since my card is overheating I have to underclock, so a cooler kernel means a higher clock: at 480 I am getting 60. Using -v -w 128. Worksize 128 is still optimal; 64 is slightly slower and 256 is much slower (< 50 MH/s).
6450 - again the ELF kernel is much larger, but there is a slight speed decrease at worksize 128. The previous kernel was optimal at 128, but this one is optimal at 64. Worksize 64 with the new kernel is equivalent to worksize 128 with the old kernel. It potentially runs cooler, but I did not check. This is a fanless card and only gets 32 MH/s.
|
|
|
|
teukon
|
|
July 07, 2011, 08:50:22 AM |
|
Sorry about that last post. I'm not usually that dumb, I assure you.
I've modified your kernel code by adding (u) before each of the 5 raw hex values corresponding to the error messages. I also added (u) directly before L from the other error message. After this, everything starts working in SDK 2.1.
For my stock voltage 5850: 423.7 (+/- 0.1) MH/s -> 425.9 (+/- 0.05) MH/s
This does of course mean that SDK 2.1 has increased its lead against SDK 2.4 for me. So many people are convinced that SDK 2.4 is faster, so perhaps this is a Windows/Linux thing.
If this runs for 24 hours without freezing then I have a new personal best! I will want to test what proportion of these hashes are inaccurate, but things are looking good.
Another donation is coming your way.
|
|
|
|
Diapolo (OP)
|
|
July 07, 2011, 09:46:52 AM |
|
Ah sorry, I was not clear enough. You must not add (u) in front of every hex value in the kernel, but ONLY in front of the hex values (and L) that generated an error:
W[19] = P4(19) + (u)0x11002000 + P1(19);
W[30] = P3(30) + (u)0xA00055 + P1(30);
Vals[3] = (u)L + W[64];
W[81] = P4(81) + P2(81) + (u)0xA00000;
W[87] = P4(87) + P3(87) + (u)0x11002000 + P1(87);
W[94] = P3(94) + (u)0x400022 + P1(94);
Could you be so kind as to test this and report back? I would restore the latest kernel and then modify the 6 places.
Dia
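For anyone who would rather script the six targeted edits than make them by hand, here is a minimal Python sketch. It is not part of the released kernel or of poclbm: the helper name `patch_kernel_source` is hypothetical, and it assumes each statement appears in the kernel file with exactly the spacing shown in the error log. Statements that are not found are reported instead of being guessed at, and no other hex constants are touched.

```python
# Hypothetical helper, not part of poclbm or the kernel release.
# Maps each statement from the SDK 2.1 error log to its (u)-cast version.
FIXES = {
    "W[19] = P4(19) + 0x11002000 + P1(19);":
        "W[19] = P4(19) + (u)0x11002000 + P1(19);",
    "W[30] = P3(30) + 0xA00055 + P1(30);":
        "W[30] = P3(30) + (u)0xA00055 + P1(30);",
    "Vals[3] = L + W[64];":
        "Vals[3] = (u)L + W[64];",
    "W[81] = P4(81) + P2(81) + 0xA00000;":
        "W[81] = P4(81) + P2(81) + (u)0xA00000;",
    "W[87] = P4(87) + P3(87) + 0x11002000 + P1(87);":
        "W[87] = P4(87) + P3(87) + (u)0x11002000 + P1(87);",
    "W[94] = P3(94) + 0x400022 + P1(94);":
        "W[94] = P3(94) + (u)0x400022 + P1(94);",
}


def patch_kernel_source(source):
    """Apply only the six casts; return (patched_source, missing_statements)."""
    missing = []
    for old, new in FIXES.items():
        if old not in source:
            # Assumption: exact text match. Different spacing means no match.
            missing.append(old)
            continue
        source = source.replace(old, new)
    return source, missing


if __name__ == "__main__":
    sample = (
        "__constant uint K = 0x6a09e667;\n"
        "W[30] = P3(30) + 0xA00055 + P1(30);\n"
        "Vals[3] = L + W[64];\n"
    )
    patched, missing = patch_kernel_source(sample)
    print(patched)
    print("not found:", len(missing))
```

To use it, read the kernel source into a string, pass it through `patch_kernel_source`, check that the list of missing statements is empty, and write the result back to a copy of the file.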
|
|
|
|
teukon
|
|
July 07, 2011, 09:58:45 AM |
|
You were perfectly clear, I was just being dumb. In case you missed my second post, this fix works for SDK 2.1. Thank you very much.
|
|
|
|
Diapolo (OP)
|
|
July 07, 2011, 10:06:00 AM |
|
Great, so we have a fix and a version that works with 2.1. Will release a fix later today!
Dia
|
|
|
|
n4l3hp
|
|
July 07, 2011, 10:44:05 AM |
|
My 4850 @ 675 core, 250 mem gets 85 MH/s, with a 0.32% stale rate at DeepBit. (Bought from eBay, don't know what brand, came with a Zalman cooler. Anything higher than 680 core will cause it to stop hashing, even if overvolted.) Temps are at 71 degrees Celsius in a closed case. It had been running Milkyway@Home for more than a year at the same settings before I switched it to Bitcoin mining.
For the ATI 4000 series, use SDK 2.1 and poclbm (April 28 version). Using phatk and a higher OpenCL SDK version on these cards will only lower the hash rate.
|
|
|
|
OCedHrt
|
|
July 07, 2011, 12:11:48 PM |
|
You misread my post. I am running at 460 core because the card is at 105C at that speed. I cannot run it any faster. I can run 480 core with this new kernel.
Btw, the days of SDK 2.1 and poclbm are nearly over. I get 84 MH/s at 675 core and 494 mem. I can't do 250 mem; the card doesn't downclock that far with Afterburner. At 250 mem it would be even higher. However, I can actually clock 700+, even though only for a few seconds.
|
|
|
|
Diapolo (OP)
|
|
July 07, 2011, 03:20:41 PM |
|
New version 2011-07-07 is ready: http://www.mediafire.com/?7j70gnmllgi9b73
This is mainly a bugfix release for SDK 2.1, with some code restructuring to save a few writes and additions. I cannot guarantee that this really works for 2.1, because I didn't test it. If you are unsure, wait for users to test it for you and consider applying this patch later!
By the way, I want to thank all of those who donated a few Bitcents to me, feels great!
Thanks,
Dia
PS: If it works, please post here and consider a small donation @ 1B6LEGEUu1USreFNaUfvPWLu6JZb7TLivM.
|
|
|
|
Saturn7
|
|
July 07, 2011, 03:29:29 PM |
|
Went from 433 MH/s to 440 MH/s on a 5870. Overclocked to 970 MHz. Thanks! Donation sent.
|
First there was Fire, then Electricity, and now Bitcoins
|
|
|
BOARBEAR
Member
Offline
Activity: 77
Merit: 10
|
|
July 07, 2011, 03:30:10 PM |
|
This kernel causes poclbm to exit after running for a while. I have tested several times: after a few hours poclbm is gone and my machine is left there doing nothing.
AMD 5870 with SDK 2.5
|
|
|
|
Diapolo (OP)
|
|
July 07, 2011, 03:37:03 PM |
|
That's the first report I have received of that problem. Any other poclbm users with that observation?
Dia
|
|
|
|
hugolp
|
|
July 07, 2011, 03:45:03 PM |
|
I reported it twice in this same thread... It only happens on one of my cards (I have 4 5870s, and another one of an identical model works fine). In phoenix, it reports a kernel error and continues mining (I suspect phoenix reloads the kernel).
|
|
|
|
phorensic
|
|
July 07, 2011, 03:56:54 PM |
|
Excellent work on this kernel. It seems like the original author got to a point where he thought he had improved performance to almost the max, but you are progressing very nicely!
|
|
|
|
CYPER
|
|
July 07, 2011, 07:16:33 PM |
|
Well, finally I can see some improvement.
With today's version I went from 1748 to 1758, so a ~10 MH/s increase.
This is for 4x 5870 @ 960 MHz core & 300 MHz memory, SDK 2.1.
|
|
|
|
Diapolo (OP)
|
|
July 07, 2011, 08:01:50 PM |
|
Seems like good news for SDK 2.1 users, right?
|
|
|
|
CYPER
|
|
July 07, 2011, 08:11:12 PM |
|
Well, to be perfectly honest and objective, a 0.5% increase won't make any difference even with my setup, at least not in terms of financial benefits. But I don't mean to belittle your work. Well done!
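For scale, the before/after figures reported in this thread can be turned into percentages with a few lines of Python. This is only an illustrative sketch: the labels and the `percent_gain` and `summarize` helpers are made up for this example, and the three reports compare different kernel versions on different setups, so the results are not directly comparable with each other.

```python
# Before/after hash rates in MH/s, copied from reports in this thread.
REPORTS = {
    "5850, SDK 2.1 (teukon)": (423.7, 425.9),
    "5870 (Saturn7)": (433.0, 440.0),
    "4x 5870, SDK 2.1 (CYPER)": (1748.0, 1758.0),
}


def percent_gain(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def summarize(reports):
    """Map each label to its percentage gain, rounded to two decimals."""
    return {
        label: round(percent_gain(before, after), 2)
        for label, (before, after) in reports.items()
    }


if __name__ == "__main__":
    for label, gain in summarize(REPORTS).items():
        print(f"{label}: +{gain:.2f}%")
```

The 1748 to 1758 MH/s step works out to roughly 0.57%, in line with the 0.5% quoted above.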
|
|
|
|
erek
Newbie
Offline
Activity: 36
Merit: 0
|
|
July 07, 2011, 09:40:20 PM |
|
808.6 MH/sec max now (up 1-2% at least) from 7-6-11 to 7-7-11
|
|
|
|
|