Bitcoin Forum
Poll
Question: Do you want to see improvements in Ethash dual-mining with GGS?
I desperately need it. - 8 (15.1%)
It would be nice. - 12 (22.6%)
It's not worth it anymore. - 33 (62.3%)
Total Voters: 53

Author Topic: Gateless Gate Sharp 1.3.8: 30Mh/s (Ethash) on RX 480!  (Read 214420 times)
m1n1ngP4d4w4n
Full Member
***
Offline Offline

Activity: 224
Merit: 100

CryptoLearner


View Profile
January 05, 2017, 12:11:32 PM
 #221

xD, well, to be honest, there are no pros to keeping old drivers when there is plenty of proof that newer ones work as well or even better. It's not THAT hard to update, even if you've got a lot of rigs, as long as you know a little about scripting/dev (if you have that many rigs, you must have some knowledge, at least enough for proper monitoring). I prefer the dev to be able to focus on one driver (that's what Claymore does, for example, if I recall); otherwise you spend all your time on compatibility instead of improving performance.
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 05, 2017, 12:12:14 PM
 #222

It turned out that the "legacy" AMD drivers require a totally different set of optimizations. This must be the reason why GG was running rather slow on older (GCN1/2) cards. I suppose optimizations for legacy drivers are worth the effort after all...

or people could update :)

Well, the thing is, with older cards, even the latest drivers always switch back to the "legacy" mode.
Weird, weird...

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
m1n1ngP4d4w4n
Full Member
***
Offline Offline

Activity: 224
Merit: 100

CryptoLearner


View Profile
January 05, 2017, 12:14:21 PM
 #223

It turned out that the "legacy" AMD drivers require a totally different set of optimizations. This must be the reason why GG was running rather slow on older (GCN1/2) cards. I suppose optimizations for legacy drivers are worth the effort after all...

or people could update :)

Well, the thing is, with older cards, even the latest drivers always switch back to the "legacy" mode.
Weird, weird...

Ah, I see, so it's not the driver that is updated, just the package... lazy AMD lol. They don't make the newest drivers compatible with old hardware; they just pack different driver versions into one package. No wonder the packages are so big. NVIDIA probably does the same, seeing that their package is 350+ MB.
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
January 05, 2017, 02:34:19 PM
Last edit: January 05, 2017, 02:51:00 PM by nerdralph
 #224

It looks like AMDIL is a dead-end anyway.
http://lists.llvm.org/pipermail/llvm-dev/2015-May/085684.html

HSAIL will probably be short-lived, since most of the work is now focused on the llvm amdgpu back-end. It even supports inline asm, but I'm not sure if it will generate a kernel binary that conforms to AMD's CL2.0 ABI. With clang/llvm-3.9, I've only got as far as getting it to output GCN assembler from the OpenCL + inline asm code.




Like Wolf said, CLRX is the way to go if you haven't looked into it. I used it in my previous project with great success. I am trying to figure out how to enable GDS on Ellesmere, which turned out to be rather tricky. It seems that there is no way to enable GDS with the CL2.0 ABI and you have to fall back to the CL1.2 ABI with the "-legacy" build option. This totally sucks as I need to redo the optimizations all over again. I have no idea what the engineers at AMD had in mind when they decided to make this design change.

I'm not so sure. While Mateusz has done some great work with CLRX, he hasn't fully reverse-engineered how the AMD drivers load kernel binaries. For example, he thought the amdcl2 binaries were 64-bit only until I sent him a Tonga CL2.0 32-bit kernel binary. The fglrx Linux OCL driver contains the string "gds_segment_byte_size", which the CLRX docs list only in the ROCm ABI. Despite the drivers being "closed source", much of the llvm code contained in them is open source. I think it may be possible to compile a mixed OpenCL/asm kernel offline using llvm that can be loaded (clCreateProgramWithBinary) by the 16.x Crimson drivers on Windoze, the Linux fglrx drivers, and the AMDGPU-PRO drivers.

p.s. According to a source at AMD, "The ROCm ABI is the same as the HSAIL ABI and will be common".  With fglrx, building with CL2.0 and --save-temps generates hsail along with asm, meaning it supports the HSAIL ABI.  Ergo the ROCm ABI should be supported.
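
For anyone who wants to repeat the string check on their own driver files, here is a minimal sketch in Python. It only does what `strings | grep` would do: it reports the byte offsets at which an ASCII string occurs in a binary blob. The helper names (`find_ascii_string`, `scan_file`) are made up for illustration, and a hit only shows that the string is present in the file, not that the driver actually honors it.

```python
from pathlib import Path
from typing import List


def find_ascii_string(blob: bytes, needle: str) -> List[int]:
    """Return every byte offset at which `needle` occurs in `blob`."""
    pattern = needle.encode("ascii")
    offsets = []
    start = blob.find(pattern)
    while start != -1:
        offsets.append(start)
        # Continue searching just past the previous hit.
        start = blob.find(pattern, start + 1)
    return offsets


def scan_file(path: str, needle: str = "gds_segment_byte_size") -> List[int]:
    """Read a driver or kernel binary from disk and search it for `needle`."""
    return find_ascii_string(Path(path).read_bytes(), needle)


# Tiny in-memory demonstration; a real check would call scan_file()
# on the OpenCL driver library of interest.
demo = b"\x00abc gds_segment_byte_size\x00"
print(find_ascii_string(demo, "gds_segment_byte_size"))
```

An empty list means the string was not found in that file.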
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 05, 2017, 03:18:32 PM
 #225

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
January 05, 2017, 03:38:23 PM
 #226

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

It'll be a while.  I'm not a "just make it work" kind of guy.  I need to understand how things work at a low level first.
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 05, 2017, 08:51:52 PM
 #227

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
QuintLeo
Legendary
*
Offline Offline

Activity: 1498
Merit: 1030


View Profile
January 05, 2017, 11:06:31 PM
 #228


Ah, I see, so it's not the driver that is updated, just the package... lazy AMD lol. They don't make the newest drivers compatible with old hardware; they just pack different driver versions into one package. No wonder the packages are so big. NVIDIA probably does the same, seeing that their package is 350+ MB.

More like 450 for the ReLive bloatware, as opposed to 250ish for most Crimson versions.

 Seems like AMD has caught Mickey$loth Bloatware disease BAD the last few months.

I'm no longer legendary just in my own mind!
Like something I said? Donations gratefully accepted. LYLnTKvLefz9izJFUvEGQEZzSkz34b3N6U (Litecoin)
1GYbjMTPdCuV7dci3iCUiaRrcNuaiQrVYY (Bitcoin)
m1n1ngP4d4w4n
Full Member
***
Offline Offline

Activity: 224
Merit: 100

CryptoLearner


View Profile
January 05, 2017, 11:40:01 PM
 #229


Ah, I see, so it's not the driver that is updated, just the package... lazy AMD lol. They don't make the newest drivers compatible with old hardware; they just pack different driver versions into one package. No wonder the packages are so big. NVIDIA probably does the same, seeing that their package is 350+ MB.

 More like 450 for the Relive bloatware - as opposed to 250ish for the most Crimson versions.

 Seems like AMD has caught Mickey$loth Bloatware disease BAD the last few months.

Haha, isn't that true... damn, what a waste of space and bandwidth :( Same with NVIDIA: they should package the VR / 3D stuff separately from the rest, since there aren't many people who actually use it...
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
January 06, 2017, 12:16:03 AM
 #230

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve a reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

A CL2.0 kernel won't touch the GDS unless you use pipes or atomic counters (counter32_t).
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 06, 2017, 04:16:28 AM
 #231

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve a reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

A CL2.0 kernel won't touch the GDS unless you use pipes or atomic counters (counter32_t).


I don't know about pipes, but counter32_t is officially deprecated already, and I confirmed that counter32_t is not available with OpenCL 2.0 binaries. I suppose I am really lucky that I was able to touch GDS with the "-legacy" build option as even assembler programmers struggled to find a way to access GDS on newer devices:

GDS memory revisited, moving from Tahiti to Hawaii
https://community.amd.com/thread/170216

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 06, 2017, 07:21:40 AM
 #232

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve a reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

A CL2.0 kernel won't touch the GDS unless you use pipes or atomic counters (counter32_t).


I also tried using pipes to see if they result in GDS access, but CodeXL did not show any GDS instructions. What a bummer...
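
Checking a disassembly dump for GDS access by eye gets tedious, so here is a rough Python sketch that flags candidate lines. It rests on two assumptions that are not from this thread: that the dump is plain text with one instruction per line and `//` comments, and that GDS accesses show up either as `ds_*` instructions carrying a `gds` modifier or as `ds_gws_*` instructions. Adjust the pattern if your tool prints things differently. `gds_lines` is a made-up helper name.

```python
import re

# Assumption: GDS accesses appear as "ds_... gds" or as "ds_gws_..." opcodes.
GDS_PATTERN = re.compile(r"^\s*(ds_gws_\w+|ds_\w+\b.*\bgds\b)", re.IGNORECASE)


def gds_lines(isa_text):
    """Return (line_number, instruction) pairs that look like GDS/GWS accesses."""
    hits = []
    for number, line in enumerate(isa_text.splitlines(), start=1):
        code = line.split("//", 1)[0]  # ignore trailing comments
        if GDS_PATTERN.search(code):
            hits.append((number, code.strip()))
    return hits


# Hand-written sample, not real CodeXL output.
sample = """s_load_dword s0, s[4:5], 0x0
ds_add_u32 v0, v1 gds
ds_write_b32 v2, v3 offset:4
ds_gws_barrier v0 offset:0 gds
s_endpgm // gds mentioned in a comment only
"""

for number, instruction in gds_lines(sample):
    print(number, instruction)
```

On the sample, only the `ds_add_u32 ... gds` and `ds_gws_barrier` lines are reported; the plain LDS write and the comment are skipped.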

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 06, 2017, 08:21:29 AM
 #233

I was finally able to assemble GCN ISA code with the OpenCL 1.2 ABI for Ellesmere by modifying CLRX. This means that I now have the ability to access GDS on the RX 480 without restrictions. I will be out of town for a week, but we should see pretty interesting things after I come back home.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 07, 2017, 01:32:25 PM
 #234

I appreciate what you are doing and look forward to switching my farm to your miner when it is a bit faster. A moderate difference in hashrate is too costly with a bunch of miners running, but I will accept a small loss in hashrate just to stop using the closed source stuff.

There are tons of optimizations that can be done with the GCN assembly, so that should happen sooner rather than later. Now I am thinking about taking a completely different strategy. My current game plan is (1) to reduce the number of kernels by using global synchronization across work-items and (2) to move the pathetically slow row counters in global memory to the fast 64KB GDS. (2) is GCN-specific, but (1) should also be possible with NVIDIA. Too bad I cannot try these ideas until next Wednesday...
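
To get a feel for the constraint in (2), here is a back-of-the-envelope sketch in Python of how many counters fit into 64KB at a given counter width. Only the 64KB figure comes from the post; the counter widths are hypothetical and say nothing about how the miner actually packs its row counters.

```python
GDS_BYTES = 64 * 1024  # the 64KB GDS size quoted above


def counters_that_fit(bits_per_counter: int, gds_bytes: int = GDS_BYTES) -> int:
    """Number of fixed-width counters that fit into `gds_bytes` bytes."""
    if bits_per_counter <= 0:
        raise ValueError("bits_per_counter must be positive")
    return (gds_bytes * 8) // bits_per_counter


# Hypothetical counter widths, for illustration only.
for bits in (32, 16, 8):
    print(f"{bits}-bit counters: {counters_that_fit(bits)}")
# 32-bit counters: 16384
# 16-bit counters: 32768
# 8-bit counters: 65536
```

So whether the counters fit depends entirely on the number of rows and how tightly each counter can be packed.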

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 07, 2017, 10:51:39 PM
 #235

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py, as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they will bring GG on par with Claymore's in terms of usability.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
January 07, 2017, 11:39:00 PM
 #236

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they should bring GG on par with Claymore's in terms of usability.
If you feel the need to test on real graphics cards, I have Windows 10 + NV 1070, Linux 14.04 with an R9 390, and Linux 16.04 with an RX 480 :)


Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 07, 2017, 11:43:33 PM
 #237

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they should bring GG on par with Claymore's in terms of usability.
If you feel the need to test graphics I have windows 10+NV 1070 on it, linux 14.04 r9 390, linux 16.04 rx480 Smiley



That would be great! Could you set up the Linux box with RX 480?

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
January 07, 2017, 11:51:38 PM
 #238

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they should bring GG on par with Claymore's in terms of usability.
If you feel the need to test graphics I have windows 10+NV 1070 on it, linux 14.04 r9 390, linux 16.04 rx480 Smiley



That would be great! Could you set up the Linux box with RX 480?
I will PM you with the details.

Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/
Subw
Hero Member
*****
Offline Offline

Activity: 672
Merit: 500


View Profile
January 08, 2017, 11:41:27 AM
 #239

I am planning to evaluate sgminer-gm

Moving to sgminer would be awesome, as it has great pool management support and a popular API for monitoring and control.
zawawa (OP)
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
January 08, 2017, 12:47:42 PM
 #240

Well, I guess I did something wrong. With the latest amdgpu-pro drivers, I built with make and then ran gatelessgate.py, and I'm getting 10/sec on each 480. Does anyone know where I messed up? :'( Thanks!

There was a compatibility issue, but I already fixed it and pushed a patch to the repository.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ