Bitcoin Forum
October 24, 2017, 06:11:23 AM *
Author Topic: [ANN][GRS][DMD][DGB] Pallas optimized groestl opencl kernels  (Read 59283 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
pallas
Legendary
*
Offline Offline

Activity: 1442


Black Belt Developer


View Profile
January 08, 2015, 09:57:06 PM
 #161

I'm using your kernel: groestlcoin.cl.

Now I disassembled a dummy kernel with the appropriate parameters, and I had forgotten about the T buffers: OpenCL uploads them in an extra buffer automatically. I don't even want to know how the driver sends that extra buffer, and most importantly I can't make an automatic skeleton kernel to get a binary with a placeholder for constant data that my program can patch with the output of the assembler.

So the easiest way would be to modify sgminer to handle my kernel. I have found the 'queue_sph_kernel()' function, which I can start from.

I never tested my kernel with cards smaller than Tahiti, and I have no reports of it running on Pitcairn or smaller: other groestlcoin kernels might be faster in that case.

pallas
January 09, 2015, 03:55:25 PM
 #162

I see there is very little interest in mining groestl coins with GPU: very few users joined the recent discussion (2/3).
Let alone contributing to the code (2) or donating (2), in the whole life of this thread.

utahjohn
Hero Member
*****
Offline Offline

Activity: 630


View Profile
January 09, 2015, 04:08:09 PM
 #163

Quote from: pallas on January 09, 2015, 03:55:25 PM
I see there is very little interest in mining groestl coins with GPU: very few users joined the recent discussion (2/3).
Let alone contributing to the code (2) or donating (2), in the whole life of this thread.

Well, I still prefer GPU mining while the block reward is 1.0, and I'll see what happens to the difficulty when the reward drops to 0.1 ... So count me in on the new kernel. I donated a bit last time you did a new kernel and will donate again for the new super-super asm kernel Smiley
I expect the difficulty will drop remarkably when the reward drops, and solo mining might still be attractive even after ...
I have one 280x solo mining DMD (Pallas Diamond) at approx. 18.6 MH/s (2-4 coins per day)
and a 7950 solo mining FTC (neoscrypt) at 278 KH/s (would be sweet if these optimizations could be applied to Neoscrypt also ... wolf0 where are u?)

@realhet
Would be great if you could add a kernel setting parameter (perhaps realhet) that selects your kernel, and supply a Windows x64 build of your sgminer ... I'd donate for that Smiley
realhet
Jr. Member
*
Offline Offline

Activity: 32


View Profile WWW
January 10, 2015, 01:41:26 AM
 #164

Well, I found it better not to alter sgminer, since I'm totally unfamiliar with it, and instead started making my kernel look exactly the same as groestlcoin.cl from the outside. It will take half a page of additional code to deal with the kernel parameters. It already works with a small dummy kernel, but I'm just too tired to continue now. Cheesy
pallas
January 10, 2015, 03:48:37 AM
 #165

Quote from: realhet on January 10, 2015, 01:41:26 AM
Well, I found it better not to alter sgminer, since I'm totally unfamiliar with it, and instead started making my kernel look exactly the same as groestlcoin.cl from the outside. It will take half a page of additional code to deal with the kernel parameters. It already works with a small dummy kernel, but I'm just too tired to continue now. Cheesy

Well, that's easier for people to use.
Looking forward to seeing your progress! :-)

realhet
January 11, 2015, 03:01:34 AM
 #166

Do I need a better proof than this? Grin

I'm the proud owner of my first 19 GRS coins, haha. I guess I was super lucky to get an 'accepted' right after 10 minutes of mining.

The speed increase in sgminer is the same as what I measured in my 'workbench': it rose from 2 MH/s to 7 MH/s. (Or, if we count in GroestlHash/s, from 4 MH/s to 14 MH/s.)
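
A quick sanity check of that arithmetic, as a minimal Python sketch. It assumes, going only by the 2 -> 4 and 7 -> 14 conversion in the post, that one hash as counted by sgminer corresponds to two Groestl invocations. The constant and helper names are made up for illustration and are not part of sgminer or HetPas.

```python
# Assumption taken from the post's own conversion (2 -> 4, 7 -> 14):
# one hash as counted by sgminer corresponds to two Groestl invocations.
GROESTL_CALLS_PER_HASH = 2


def to_groestl_rate(miner_rate_mhs):
    """Convert a miner-reported rate (MH/s) to GroestlHash/s (in MH/s)."""
    return miner_rate_mhs * GROESTL_CALLS_PER_HASH


def speedup(before_mhs, after_mhs):
    """Ratio of the new rate to the old one; the unit cancels out."""
    return after_mhs / before_mhs


before, after = 2.0, 7.0  # MH/s figures quoted in the post
print(to_groestl_rate(before), "->", to_groestl_rate(after), "MH/s (Groestl)")
print("speedup:", speedup(before, after))
```

The ratio is the same whichever unit is used, which is why the sgminer figure and the workbench figure can be compared directly.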

If anyone is willing to help me test this, please tell me! You'll need a Windows machine with cat14.9, and you also have to be brave enough to run my IDE (HetPas.exe) on that system.

I can't wait to see your reports on how fast it is on the big cards. Cheesy
pallas
January 11, 2015, 03:24:35 AM
 #167

The compiled bin file should work regardless of catalyst version or operating system, so could you please post a link to the bin file? Thanks.
There is something weird about the sgminer screenshot; are you sure it's working correctly? It shows a single, disabled GPU with id 0, and the accepted share came from GPU id 1. The diff numbers are also kinda weird.

Wolf0
Legendary
*
Online Online

Activity: 1722


Miner Developer


View Profile
January 11, 2015, 05:23:58 AM
 #168

Quote from: pallas on January 11, 2015, 03:24:35 AM
The compiled bin file should work regardless of catalyst version or operating system, so could you please post a link to the bin file? Thanks.
There is something weird about the sgminer screenshot; are you sure it's working correctly? It shows a single, disabled GPU with id 0, and the accepted share came from GPU id 1. The diff numbers are also kinda weird.

SG bug.

qwep1
Hero Member
*****
Offline Offline

Activity: 542


View Profile
January 11, 2015, 09:21:06 AM
 #169

I would also test it.

realhet
January 11, 2015, 10:54:26 PM
 #170

Sorry it took a bit long.

Here's all you need to know if you're willing to test: http://realhet.wordpress.com/gcn-asm-groestl-coin-kernel/

Please send me benchmarks and compiled kernels for various cards!

I've been running it for an hour now and I got a 'rejected'. I'm solo mining GRS. Do I need to worry? Or is that usual? Can it be caused by a slow network?
pallas
January 11, 2015, 10:56:59 PM
 #171

Realhet, thanks for the capeverde bin, unfortunately I can't use it because it's 32 bit.
I created a bootable win7 stick in order to compile the kernel: it compiles fine but, when run, it says "no target Hawaii" and no bin is created.

pallas
January 11, 2015, 10:58:04 PM
 #172

Quote from: realhet on January 11, 2015, 10:54:26 PM
I've been running it for an hour now and I got a 'rejected'. I'm solo mining GRS. Do I need to worry? Or is that usual? Can it be caused by a slow network?

Yes, it can be because of the network: if the wallet is behind on sync, the block may be rejected (or orphaned).
Try with a pool...

utahjohn
January 11, 2015, 11:11:06 PM
 #173

Runtime error: No GCN device found

I have 2 AMD cards on gpu-platform 1
and 1 Intel GPU on gpu-platform 0

Edit: DOH 14.7RC3 not GCN ...
realhet
January 11, 2015, 11:47:28 PM
 #174

Thx for testing! So many errors :S But usually that's how it goes.

"No GCN device found" error.

That could be because I can't recognize new cards.
I know only these at the moment.
'TAHITI', 'PITCAIRN', 'CAPEVERDE', 'UNKNOWN5');
Importing new names right now.

Meanwhile you can select an OpenCL device by uncommenting this line in the code:
var dev:=cl.devices[0]; //access device by index (must be a GCN one)

The findDevices function can't recognize new cards. I'll repair it now.

@pallas: Thanks for fiddling with Win7! Cheesy What do you mean by 32-bit code? That has no meaning regarding the GCN hardware o.O
But I'm 100% sure that you can't use my Capeverde binary unless you have that chip in the device you selected. ( var dev:=cl.devices[CLdeviceIndex]; )
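
The name-based detection and the manual index override described in this post can be sketched roughly as follows. This is plain Python, not HetPas code: the function name, its parameters, and the idea of passing device names as strings are all assumptions made for illustration. Only the three target names come from the list above.

```python
# Target names the post says are recognized at the moment.
KNOWN_GCN_TARGETS = {"TAHITI", "PITCAIRN", "CAPEVERDE"}


def find_gcn_device(target_names, forced_index=None, known=KNOWN_GCN_TARGETS):
    """Return the index of the device to use.

    target_names: device target names in enumeration order (hypothetical input).
    forced_index: manual override, analogous to picking cl.devices[n] by hand.
    """
    if forced_index is not None:
        if not 0 <= forced_index < len(target_names):
            raise IndexError("device index out of range")
        return forced_index  # the caller vouches that this is a GCN device
    for index, name in enumerate(target_names):
        if name.upper() in known:
            return index
    # A card missing from the known list ends up here, even if it is GCN.
    raise RuntimeError("No GCN device found")


print(find_gcn_device(["Cayman", "Capeverde"]))        # picks the Capeverde card
print(find_gcn_device(["Hawaii"], forced_index=0))     # manual override
```

With a plain allow-list like this, any card whose name is not in the list fails detection, which matches the "No GCN device found" reports; the index override is the workaround until the list is extended.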
   
realhet
January 12, 2015, 12:19:53 AM
 #175

I've updated HetPas and the groestl_isa.hpas too. Pls download HetPas150111_Groestl.zip.

From now on it will start with a list of the cards:
writeln("List of opencl devices:");
for var i:=0 to cl.devices.count-1 do begin
  writeln("Device #",i);
  writeln(cl.devices[i].dump);
end;

It should display something like this:
List of opencl devices:
Device #0
Target: Cayman  Series: 6  Core:880 MHz  CU:24  RAM:2048 MB  UID:4098
ext: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics ...
Device #1
Target: Capeverde  Series: 7  Core:880 MHz  CU:10  RAM:1024 MB  UID:4098
ext: cl_khr_fp64 cl_amd_fp64 ...

Using device:
Target: Capeverde  Series: 7  Core:880 MHz  CU:10  RAM:1024 MB  UID:4098
ext: cl_khr_fp64 cl_amd_fp64 ...
* core MHz value is not always accurate, use Catalyst Control Center (or ADL) instead!

For the GCN cards, the 'Series' must be at least 7. If it fails and the card is indeed a GCN card, then I detected it wrongly; please report it. My first card is a series 6xxx Northern Islands part, so it can't be used for this kernel.
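
For anyone who wants to check a pasted device list automatically, here is a minimal Python sketch of the "Series must be at least 7" rule. It only parses the text format shown in the sample output above; the regular expression, the function names, and the dictionary keys are assumptions for illustration, not anything HetPas provides.

```python
import re

# Matches lines such as:
# Target: Capeverde  Series: 7  Core:880 MHz  CU:10  RAM:1024 MB  UID:4098
DEVICE_LINE = re.compile(
    r"Target:\s*(?P<target>\S+)\s+Series:\s*(?P<series>\d+)\s+"
    r"Core:\s*(?P<core>\d+)\s*MHz\s+CU:\s*(?P<cu>\d+)\s+RAM:\s*(?P<ram>\d+)\s*MB"
)

MIN_GCN_SERIES = 7  # per the post: GCN cards report series 7 or higher


def parse_device_line(line):
    """Turn one 'Target: ...' line into a dict of typed fields."""
    match = DEVICE_LINE.search(line)
    if match is None:
        raise ValueError("not a device description line: %r" % line)
    return {
        "target": match.group("target"),
        "series": int(match.group("series")),
        "core_mhz": int(match.group("core")),
        "compute_units": int(match.group("cu")),
        "ram_mb": int(match.group("ram")),
    }


def is_gcn(device):
    """Apply the series rule to a parsed device."""
    return device["series"] >= MIN_GCN_SERIES


sample = [
    "Target: Cayman  Series: 6  Core:880 MHz  CU:24  RAM:2048 MB  UID:4098",
    "Target: Capeverde  Series: 7  Core:880 MHz  CU:10  RAM:1024 MB  UID:4098",
]
for line in sample:
    device = parse_device_line(line)
    print(device["target"], "GCN" if is_gcn(device) else "not GCN")
```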

@utahjohn: Maybe it works on 14.7 too. I can't tell, but I know that it will crash on 13.4 because the kernel parameters are handled differently in that driver.
utahjohn
January 12, 2015, 12:22:34 AM
 #176

Temporarily upgraded to 14.9 to run hetpas, built for 280x.
Had hell of a time reverting back to 14.7 ... several tries later 14.7 working again and I have a kernel.elf for 280x.

Testing now ...

Very early results ...
280x I=22 E=1180 M=150 WS=256 ... 26 MHs Solo . No blocks yet ... approx 1.4x normal diamond kernel (18.5MHs)

Intensity 22 is sweet spot for my 280x, now playing with mem clock ...

No significant effect on raising mem-clock other than higher temps ...

stick with low mem clock.
realhet
January 12, 2015, 12:56:32 AM
 #177

"Very early results ..."

Very good that it runs for you!

The speedup is not that impressive, but let me ask you to do a test:

Please stop sgminer, run groestl_isa.hpas, and copy/paste my program's output here, like this:

-----------------------------------------
Using new GCN ASM code
Kernel binary saved: C:\Work\Groestl\kernel_dump\kernel.elf

elapsed: 190.661 ms  13.749 MH/s   gain:   3.44x
elapsed: 188.444 ms  13.911 MH/s   gain:   3.48x
elapsed: 188.218 ms  13.928 MH/s   gain:   3.48x
elapsed: 188.225 ms  13.927 MH/s   gain:   3.48x

Functional test: RESULT IS OK
-----------------------------------------

Then go to around line 23, comment out the "#define USE_NEW_ASM_KERNEL" and run it again! This will compile the original OpenCL kernel that I downloaded with sgminer5.1.

-----------------------------------------
Using original OpenCL code
Kernel binary saved: C:\Work\Groestl\kernel_dump\kernel.elf

elapsed: 657.623 ms  3.986 MH/s   gain:   1.00x
elapsed: 655.396 ms  4.000 MH/s   gain:   1.00x
elapsed: 654.897 ms  4.003 MH/s   gain:   1.00x
elapsed: 655.055 ms  4.002 MH/s   gain:   1.00x

Functional test: RESULT IS OK
-----------------------------------------

As you can see, on my small card the speedup is 3.5x. I'd like to check these results on your 280x as well.
I'm thinking that the problem is only that your big card doesn't get enough threads, or something similar.
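
To compare two pasted runs without eyeballing them, the benchmark lines above can be averaged with a few lines of Python. This is only a sketch: it assumes the "elapsed: ... ms  ... MH/s" line format shown in the two sample outputs, and the names used here are made up for illustration.

```python
import re

# Matches lines such as: "elapsed: 190.661 ms  13.749 MH/s   gain:   3.44x"
BENCH_LINE = re.compile(r"elapsed:\s*([\d.]+)\s*ms\s+([\d.]+)\s*MH/s")


def mean_rate(log_text):
    """Average the MH/s column over all benchmark lines in a pasted log."""
    rates = [float(m.group(2)) for m in BENCH_LINE.finditer(log_text)]
    if not rates:
        raise ValueError("no benchmark lines found")
    return sum(rates) / len(rates)


ASM_LOG = """
elapsed: 190.661 ms  13.749 MH/s   gain:   3.44x
elapsed: 188.444 ms  13.911 MH/s   gain:   3.48x
elapsed: 188.218 ms  13.928 MH/s   gain:   3.48x
elapsed: 188.225 ms  13.927 MH/s   gain:   3.48x
"""

ORIGINAL_LOG = """
elapsed: 657.623 ms  3.986 MH/s   gain:   1.00x
elapsed: 655.396 ms  4.000 MH/s   gain:   1.00x
elapsed: 654.897 ms  4.003 MH/s   gain:   1.00x
elapsed: 655.055 ms  4.002 MH/s   gain:   1.00x
"""

measured_speedup = mean_rate(ASM_LOG) / mean_rate(ORIGINAL_LOG)
print("asm kernel:      %.3f MH/s" % mean_rate(ASM_LOG))
print("original kernel: %.3f MH/s" % mean_rate(ORIGINAL_LOG))
print("speedup:         %.2fx" % measured_speedup)
```

Replacing the two strings with the output from a 280x would show whether the gain on the bigger card is really lower than on the Capeverde.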

Just a silly test: what if you turn Memory clock up to normal speed? Maybe it will change the L1 cache's behaviour? My kernel uses 0 memory, but uses L1 cache extensively.

And finally I had an 'accepted', phew...

"Had hell of a time reverting back to 14.7" -> Is there still a tool called "Catalyst Clean Uninstall Utility" nowadays? 2-3 years ago it was useful when downgrading the Catalyst version.
utahjohn
January 12, 2015, 01:09:06 AM
 #178

No significant effect on raising mem-clock other than higher temps ...

Use "DDU" to clean Catalyst drivers; it's not always 100% effective, though, and sometimes a little manual cleaning is needed too ...

BTW I am using Pallas's kernel as reference, not the one supplied with stock sgminer ...

Any tweaks you can do with 2048 shaders (280x) and 1792 shaders (7950) ?
realhet
January 12, 2015, 01:24:21 AM
 #179

Yes, that must be the same kernel that I copied into the groestl directory next to the groestl_isa.hpas file.

When you compile the original kernel within the groestl_isa.hpas program, it will use the groestl_original.cl kernel. It's Pallas's kernel, except that I hardcoded the workgroup size in it and made another very minor change.

I also compared it with the kernel I downloaded from the very first post in this topic: it's the same.
utahjohn
January 12, 2015, 01:29:45 AM
 #180

I did not try running the kernel under Catalyst 14.9; all I wanted was to generate the kernel.elf to run under 14.7 ... because I run multiple algos concurrently under 14.7 that suffer under 14.9 ...

Also note that I am running sgminer 4.1.0