Bitcoin Forum
Author Topic: [ANN][GRS][DMD][DGB] Pallas optimized groestl opencl kernels  (Read 61242 times)
utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
April 13, 2015, 10:44:28 AM
 #301

The need for an improved groestl kernel is now immediate ... please do what you can ... I am just a C/C++ coder and am not fully into multi-threaded GPU coding ...
sp_
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
April 13, 2015, 11:38:45 AM
 #302

Pallas, can you rewrite this groestl-256 implementation as a groestl-512 and add it to sgminer (x11, x13, x15)?


Team Black Miner (ETHB3 ETH ETC VTC KAWPOW FIROPOW EVRPROGPOW MEOWPOW + dual mining + tripple mining.. https://github.com/sp-hash/TeamBlackMiner
sp_
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
April 13, 2015, 01:53:50 PM
 #303

Some of the competitors are awake, doing exercises with pen and paper to get all the AES-wannabes at once :D

Wolf0 claims to know AES from the inside, backwards and forwards. Me too.

The answer is SEA :-)


smolen
Hero Member
*****
Offline Offline

Activity: 524
Merit: 500


View Profile
April 14, 2015, 08:59:49 AM
 #304

Wolf0 claims to know AES from the inside, backwards and forwards. Me too.

The answer is SEA :-)
Yes, that makes the game damn addictive.
Look, you told us about wide tables, a great idea, but to skip sboxing with them you need to go a couple more inches deeper inside AES :)

Of course I gave you bad advice. Good one is way out of your price range.
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
April 14, 2015, 02:03:28 PM
 #305

Pallas, can you rewrite this groestl-256 implementation as a groestl-512 and add it to sgminer (x11, x13, x15)?

Sorry for the delay.
That would be nice, but everybody's using wolf0's binaries, so why bother? It would make sense if there were a plan to open-source optimized versions of most of the algos.

utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
April 16, 2015, 05:33:39 AM
 #306

@wolf0 do you have anything better than the neoscrypt kernel you leaked on the feathercoin thread? I am getting 278 KH/s on a 7950 and 295 on a 280x.
utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
April 16, 2015, 03:03:22 PM
Last edit: April 16, 2015, 03:29:16 PM by utahjohn
 #307

@wolf0 do you have anything better than the neoscrypt kernel you leaked on the feathercoin thread? I am getting 278 KH/s on a 7950 and 295 on a 280x.

I didn't leak that, I released it. Checking my records...

EDIT: Okay, most recent record of Neoscrypt I have is 12/23/2014 (NSFW): https://ottrbutt.com/miner/neoscryptwolf-12232014.png
It goes without saying, but I'll say it anyway: I appreciate your work. I have no grasp of wavefronts and such; I have tried, but I'm just too old to embrace new concepts. If you have something better for me, please put it on Mega :) Same goes for groestl, Pallas :) You are my heroes :)
And realhet, who understands AMD GPU coding better than all of us :) realhet's hetpas assembly kernel is still the best for the 280x and other Tahiti cards, AFAIK :)
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
April 16, 2015, 03:27:14 PM
 #308

Just wanted to say I've tried applying some of the tricks I learnt working on whirlpoolx to the groestl kernel, but it's not so simple.
This kernel is much bigger, so you can't just copy some good lines of code and expect it to run faster. Furthermore, some of the optimizations I made in the past make it more time-consuming to apply some apparently simple hacks. Wolf0, I'm sure you know what I mean ;-)
Still, there is room for improvement and I have some ideas, but the question is: when the profit is gone, and the fun is gone, is it still worth it?

utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
April 16, 2015, 03:32:50 PM
 #309

Just wanted to say I've tried applying some of the tricks I learnt working on whirlpoolx to the groestl kernel, but it's not so simple.
This kernel is much bigger, so you can't just copy some good lines of code and expect it to run faster. Furthermore, some of the optimizations I made in the past make it more time-consuming to apply some apparently simple hacks. Wolf0, I'm sure you know what I mean ;-)
Still, there is room for improvement and I have some ideas, but the question is: when the profit is gone, and the fun is gone, is it still worth it?

I expect DMD to drop to a difficulty in the low teens after a week or so :) If it does not, mining is dead LOL. I have a direct interest in this as a partner on donkypool ... 12 miners, up from 6 a few weeks ago ... I am currently mining neoscrypt for sale on westhash lol, selling at p=4.8 :) Anything less goes to yaamp ...
utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
April 16, 2015, 11:31:53 PM
Last edit: April 17, 2015, 12:01:43 AM by utahjohn
 #310

@wolf0 do you have anything better than the neoscrypt kernel you leaked on the feathercoin thread? I am getting 278 KH/s on a 7950 and 295 on a 280x.

I didn't leak that, I released it. Checking my records...

EDIT: Okay, most recent record of Neoscrypt I have is 12/23/2014 (NSFW): https://ottrbutt.com/miner/neoscryptwolf-12232014.png
It goes without saying, but I'll say it anyway: I appreciate your work. I have no grasp of wavefronts and such; I have tried, but I'm just too old to embrace new concepts. If you have something better for me, please put it on Mega :) Same goes for groestl, Pallas :) You are my heroes :)
And realhet, who understands AMD GPU coding better than all of us :) realhet's hetpas assembly kernel is still the best for the 280x and other Tahiti cards, AFAIK :)

Nope, I have 21MH/s out of a 7950 at 1125/1250, IIRC, using OpenCL.
Wow! May I have the new Neoscrypt kernel? My 7950 is working hard and only doing 278 KH/s with your older kernel!

Looking ... 1160/1500 :) I have modded the card a bit for better cooling :)
utahjohn
Hero Member
*****
Offline Offline

Activity: 630
Merit: 500


View Profile
April 17, 2015, 12:08:03 AM
 #311

I get 26 MH/s on a 280x mining groestl; however, I have quit groestl mining of DMD for the moment until the diff drops back into the teens. For some reason the ASM kernel crashes the 7950 within a few minutes ... I am mining neoscrypt on yaamp at present and also selling neo on westhash :)
I'm buying more DMD than I used to mine directly ??? We'll see what happens in the next week or so as miners drop like flies on DMD ...
smolen
Hero Member
*****
Offline Offline

Activity: 524
Merit: 500


View Profile
April 17, 2015, 04:02:13 AM
 #312

Just wanted to say I've tried applying some of the tricks I learnt working on whirlpoolx to the groestl kernel, but it's not so simple.
This kernel is much bigger, so you can't just copy some good lines of code and expect it to run faster. Furthermore, some of the optimizations I made in the past make it more time-consuming to apply some apparently simple hacks. Wolf0, I'm sure you know what I mean ;-)
Still, there is room for improvement and I have some ideas, but the question is: when the profit is gone, and the fun is gone, is it still worth it?
Another trick, not for speed but for cleaning up the code: when you want to postpone the sboxing of a byte, put the preimage of zero (0x81 in Whirlpool) there.
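
To make the trick above concrete, here is a minimal Python sketch (not kernel code, and not taken from anyone's miner). It assumes the AES S-box, which Groestl uses, instead of the Whirlpool one, so the placeholder byte comes out as 0x52 rather than 0x81. The names `T`, `column` and `ZERO_PREIMAGE`, and the byte order inside the wide table, are made up for illustration; `column` is a toy stand-in for a table-based round, not the real Groestl round.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def aes_sbox(x):
    """AES S-box: multiplicative inverse in GF(2^8), then the affine transform."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):  # x^254 is the multiplicative inverse of x
            inv = gf_mul(inv, x)

    def rot(v, n):
        return ((v << n) | (v >> (8 - n))) & 0xFF

    return inv ^ rot(inv, 1) ^ rot(inv, 2) ^ rot(inv, 3) ^ rot(inv, 4) ^ 0x63


SBOX = [aes_sbox(x) for x in range(256)]
ZERO_PREIMAGE = SBOX.index(0)  # the byte that the S-box maps to zero

# Wide table: S-box output times each coefficient of the Groestl MixBytes row.
# The packing order of the eight bytes is an assumption for this sketch.
COEFFS = (2, 2, 3, 4, 5, 3, 5, 7)
T = [sum(gf_mul(SBOX[x], c) << (8 * i) for i, c in enumerate(COEFFS))
     for x in range(256)]


def rotl64(v, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((v << n) | (v >> (64 - n))) & 0xFFFFFFFFFFFFFFFF


def column(byte_values):
    """Toy table-based column: XOR of rotated wide-table lookups."""
    acc = 0
    for i, b in enumerate(byte_values):
        acc ^= rotl64(T[b], 8 * i)
    return acc


state = [0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77]
direct = column(state)

# Postpone byte 3: the placeholder contributes nothing, so the real
# contribution can be XORed in later without a special case in the loop.
postponed = list(state)
postponed[3] = ZERO_PREIMAGE
deferred = column(postponed) ^ rotl64(T[state[3]], 8 * 3)
```

Because the table entry for the placeholder is all zeros, the lookup loop stays uniform and the postponed byte is simply XORed in once it is known.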

realhet
Newbie
*
Offline Offline

Activity: 32
Merit: 0


View Profile WWW
May 17, 2015, 10:21:24 AM
 #313

Hi,

Have you checked the new GCN3 ISA manual? http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/07/AMD_GCN3_Instruction_Set_Architecture.pdf

It has some really useful things like:

- Byte permute (no more shifts and masks)
- VOP_DPP: it effectively does 2 ds_swizzle operations within the instruction at no extra cost, so optimizing a single thread for 4 lanes costs no more cycles.
- VOP_SDWA: access a word or a byte in the 32-bit inputs and in the output too (again: no more shifts and masks).
- The scalar ALU can write to memory.

No 3-operand add or 3-operand bitwise ops, though.

And they altered some instruction encodings, so I guess my asm will crash on GCN3 immediately. :D
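
As a rough illustration of the "no more shifts and masks" point, here is a small Python model comparing manual byte extraction with a single byte-permute step. It is a simplified sketch only: `byte_perm` is a hypothetical helper, it accepts only selector values 0 to 7, and it does not claim to match the operand order or the special selector values of the real GCN3 instruction (the ISA manual linked above is the authority there).

```python
def bytes_shift_mask(word):
    """Classic approach: one shift and one mask per extracted byte."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def byte_perm(hi, lo, selector):
    """Simplified byte permute: each selector byte (0..7) picks one byte
    out of the 64-bit value formed by hi (upper 32 bits) and lo (lower 32 bits)."""
    combined = ((hi & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)
    out = 0
    for i in range(4):
        sel = (selector >> (8 * i)) & 0xFF
        if sel > 7:
            raise ValueError("this sketch only models selectors 0..7")
        out |= ((combined >> (8 * sel)) & 0xFF) << (8 * i)
    return out


word = 0x11223344

# Byte swap done by hand: four extractions plus four shifts to reassemble.
b = bytes_shift_mask(word)
swapped_manual = (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]

# The same byte swap expressed as one permute with a constant selector.
swapped_perm = byte_perm(0, word, 0x00010203)
```

In a table-based kernel, the extracted bytes are what index the lookup tables, so doing the selection in a single instruction is where the savings would come from.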

MaxDZ8
Hero Member
*****
Offline Offline

Activity: 672
Merit: 500



View Profile
May 18, 2015, 03:44:15 PM
 #314

Those are some truly slick updates!

I was indeed planning to do a full AES round without T-tables, as the number of masks is nonsensical.
I had the impression the SALU was immensely updated for Tonga, given that it takes many more VGPRs in the analyzer.

I wonder how to trick the CL compiler into emitting this code.

But most importantly, what are they waiting for to just make an AMD_GCN_swizzle extension!?
realhet
Newbie
*
Offline Offline

Activity: 32
Merit: 0


View Profile WWW
May 21, 2015, 12:31:48 PM
 #315

It doesn't seem like they are implementing GCN-specific goodies in the current compiler stack. It's kind of bloated, and AMD_IL has been awaiting its replacement since the 7970 came out. I'm sure the upcoming HSA language will implement many more GCN features (except the separate V and S programming).
pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
July 01, 2015, 09:45:05 PM
 #316

I'm interested in knowing the hashrate of R9 285 and R9 Fury X cards, anybody?

pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
July 03, 2015, 08:31:30 AM
 #317

Wolf0 created a faster Tahiti binary and posted about it in the groestlcoin thread:

I have a faster Tahiti binary than Pallas' for Groestlcoin - works on DMD, too. The usage is the same as his binary; I should have more info later.

Get it here: https://ottrbutt.com/miner/wolf-groestlcoinTahitigw256l4.bin

It is indeed faster and works flawlessly.
Usage: just rename it over the old one and make sure you set worksize 256 for that card; you can get a bit more hashrate by using 2 or 4 threads.

pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
October 13, 2015, 11:32:00 AM
 #318

Nothing new in the groestl+groestl area, but I've worked a bit on the groestl+sha variant (myr-groestl for myriad, digibyte, saffron, etc.).
Tahiti is a mess, but I could easily push Hawaii over 60 MH/s while keeping the kernel compatible with the old miners.

pallas (OP)
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
October 16, 2015, 12:59:30 PM
 #319

Nothing new in the groestl+groestl area, but I've worked a bit on the groestl+sha variant (myr-groestl for myriad, digibyte, saffron, etc.).
Tahiti is a mess, but I could easily push Hawaii over 60 MH/s while keeping the kernel compatible with the old miners.

I could finally get rid of scratch registers on Tahiti: now the 280x is doing 35 MH/s with a moderate overclock :-)

carlo_0000
Legendary
*
Offline Offline

Activity: 1281
Merit: 1003


View Profile
November 02, 2015, 01:30:27 AM
Last edit: November 02, 2015, 01:52:27 AM by carlo_0000
 #320

The diamond.cl is missing from the download;
I only see groestlcoin-v1.cl.

Or must we just rename it to diamond?

So I renamed it to diamond.cl,
but there is no change in my speed: I get 4.7 MH/s on an R9 270 with sgminer 4.1.0.

I guess groestlcoin-v1.cl is not for diamond; I've got a lot of rejected shares.