Bitcoin Forum
December 18, 2017, 06:22:10 AM *
Author Topic: Gateless Gate Sharp 1.1.5: zawawa's open-source dual ETH/XMR/PASC/LBC/FTC miner  (Read 164610 times)
ioglnx
Sr. Member
****
Offline Offline

Activity: 434

Fighting mob law and inquisition in this forum


View Profile
January 05, 2017, 12:07:34 PM
 #221

It turned out that the "legacy" AMD drivers require a totally different set of optimizations. This must be the reason why GG was running rather slow on older (GCN1/2) cards. I suppose optimizations for legacy drivers are worth the effort after all...

or people could update Smiley
Smart move :-D

GTX 1080Ti rocks da house... seriously... this card is a beast³
Owning by now 18x GTX1080Ti :-D @serious love of efficiency
m1n1ngP4d4w4n
Full Member
***
Offline Offline

Activity: 154

CryptoLearner


View Profile
January 05, 2017, 12:11:32 PM
 #222

xD, well, to be honest there are no pros to keeping old drivers when there is plenty of proof that newer ones work as well or even better. It's not THAT hard to update, even if you have a lot of rigs, as long as you know a little about scripting/dev (if you have that many rigs you must have some knowledge, to have proper monitoring at least). I prefer the dev to be able to focus on one driver (that's what Claymore does, for example, if I recall); otherwise you spend all your time on compatibility instead of improving performance.

BTC - 1B1RBYkzxiTmrbnFe2vj8EaNPSYftW8186 for tips Wink
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 05, 2017, 12:12:14 PM
 #223

It turned out that the "legacy" AMD drivers require a totally different set of optimizations. This must be the reason why GG was running rather slow on older (GCN1/2) cards. I suppose optimizations for legacy drivers are worth the effort after all...

or people could update Smiley

Well, the thing is, with older cards, even the latest drivers always switch back to the "legacy" mode.
Weird, weird...

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
m1n1ngP4d4w4n
Full Member
***
Offline Offline

Activity: 154

CryptoLearner


View Profile
January 05, 2017, 12:14:21 PM
 #224

It turned out that the "legacy" AMD drivers require a totally different set of optimizations. This must be the reason why GG was running rather slow on older (GCN1/2) cards. I suppose optimizations for legacy drivers are worth the effort after all...

or people could update Smiley

Well, the thing is, with older cards, even the latest drivers always switch back to the "legacy" mode.
Weird, weird...

Ah I see, so it's not the driver that is updated but just the package then... lazy AMD lol, they don't make the newest drivers compatible with old hardware, they just pack different driver versions into one package. No wonder they're so big. NVIDIA probably does the same, when you see that the package is 350+ MB.

BTC - 1B1RBYkzxiTmrbnFe2vj8EaNPSYftW8186 for tips Wink
nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 05, 2017, 02:34:19 PM
 #225

It looks like AMDIL is a dead-end anyway.
http://lists.llvm.org/pipermail/llvm-dev/2015-May/085684.html

HSAIL will probably be short-lived since most of the work is now focused on the llvm amdgpu back-end. It even supports inline asm, but I'm not sure if it will generate a kernel binary that conforms to AMD's CL2.0 ABI. With clang/llvm-3.9, I've only got as far as getting it to output GCN assembler from the OpenCL + inline asm code.




Like Wolf said, CLRX is the way to go if you haven't looked into it. I used it in my previous project with great success. I am trying to figure out how to enable GDS on Ellesmere, which turned out to be rather tricky. It seems that there is no way to enable GDS with the CL2.0 ABI and you have to fall back to the CL1.2 ABI with the "-legacy" build option. This totally sucks as I need to redo the optimizations all over again. I have no idea what the engineers at AMD had in mind when they decided to make this design change.

I'm not so sure. While Mateusz has done some great work with CLRX, he hasn't fully reverse-engineered how the AMD drivers load kernel binaries. For example, he thought the amdcl2 binaries were 64-bit only until I sent him a Tonga CL2.0 32-bit kernel binary. The fglrx linux ocl driver contains the string "gds_segment_byte_size", which the CLRX docs only list in the ROCm ABI. Despite being "closed source", much of the llvm code contained in the drivers is open source. I think it may be possible to compile a mixed OpenCL/asm kernel offline using llvm that can be loaded (clCreateProgramWithBinary) by the 16.x Crimson drivers on Windoze, the Linux fglrx drivers, and the AMDGPU-PRO drivers.

p.s. According to a source at AMD, "The ROCm ABI is the same as the HSAIL ABI and will be common".  With fglrx, building with CL2.0 and --save-temps generates hsail along with asm, meaning it supports the HSAIL ABI.  Ergo the ROCm ABI should be supported.
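
For anyone who wants to repeat the string check on their own driver files, here is a minimal sketch of the idea (roughly what `strings | grep` does). The function name `find_marker` is made up for illustration, and the file you point it at is up to you; nothing here is specific to fglrx beyond the marker string quoted above.

```python
from pathlib import Path


def find_marker(path, marker=b"gds_segment_byte_size"):
    """Return every byte offset at which `marker` occurs in the file at `path`."""
    data = Path(path).read_bytes()
    offsets = []
    start = data.find(marker)
    while start != -1:
        offsets.append(start)
        # Continue searching just past the previous hit.
        start = data.find(marker, start + 1)
    return offsets
```

An empty list means the marker is absent from that file. A non-empty list only shows that the string is embedded in the binary, not that the driver actually honours the setting.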
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 05, 2017, 03:18:32 PM
 #226

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 05, 2017, 03:38:23 PM
 #227

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

It'll be a while.  I'm not a "just make it work" kind of guy.  I need to understand how things work at a low level first.
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 05, 2017, 08:51:52 PM
 #228

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve a reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
QuintLeo
Hero Member
*****
Offline Offline

Activity: 910


View Profile
January 05, 2017, 11:06:31 PM
 #229


Ah i see, so it's not the drivers that is updated but just the package then... lazy amd lol, they don't make newest drivers compatible with old hardware, they just pack different drivers version into one package, no wonder they're so big, nvidia prob does the same, when you see that the package is 350+ MB

 More like 450 for the Relive bloatware - as opposed to 250ish for most Crimson versions.

 Seems like AMD has caught Mickey$loth Bloatware disease BAD the last few months.
m1n1ngP4d4w4n
Full Member
***
Offline Offline

Activity: 154

CryptoLearner


View Profile
January 05, 2017, 11:40:01 PM
 #230


Ah i see, so it's not the drivers that is updated but just the package then... lazy amd lol, they don't make newest drivers compatible with old hardware, they just pack different drivers version into one package, no wonder they're so big, nvidia prob does the same, when you see that the package is 350+ MB

 More like 450 for the Relive bloatware - as opposed to 250ish for the most Crimson versions.

 Seems like AMD has caught Mickey$loth Bloatware disease BAD the last few months.

ahah isn't that true... damn, what a waste of space and bandwidth  Angry, same with NVIDIA, they should pack the VR / 3D stuff separately from the rest, there aren't many people that actually use this...

BTC - 1B1RBYkzxiTmrbnFe2vj8EaNPSYftW8186 for tips Wink
nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 06, 2017, 12:16:03 AM
 #231

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve a reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

A CL2.0 kernel won't touch the GDS unless you use pipes or atomic counters (counter32_t).
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 06, 2017, 04:16:28 AM
 #232

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve a reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

A CL2.0 kernel won't touch the GDS unless you use pipes or atomic counters (counter32_t).


I don't know about pipes, but counter32_t is officially deprecated already, and I confirmed that counter32_t is not available with OpenCL 2.0 binaries. I suppose I am really lucky that I was able to touch GDS with the "-legacy" build option as even assembler programmers struggled to find a way to access GDS on newer devices:

GDS memory revisited, moving from Tahiti to Hawaii
https://community.amd.com/thread/170216

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 06, 2017, 07:21:40 AM
 #233

the string "gds_segment_byte_size"

This is exactly what I needed! Thank you! I will stick with CLRX for now as I am already used to it.
I will add the setting for GDS to CLRX and see if that would work.
In the meantime, please let me know if you manage to get the ROCm stuff working.

I just noticed this string is not present in the output of CodeXL, which means that the AMD OpenCL 2.0 ABI is not capable of handling GDS. Oh well. I was able to achieve a reasonable performance in the legacy mode, so this should be a good foundation for the upcoming GCN assembly version. If I can get GDS and GWS working, the new version should be able to surpass Claymore's performance. We will see.

A CL2.0 kernel won't touch the GDS unless you use pipes or atomic counters (counter32_t).


I also tried using pipes to see if they result in GDS access, but CodeXL did not show any GDS instructions. What a bummer...
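
A rough sketch of how this kind of check can be automated on a saved ISA listing, rather than reading it by eye. It assumes the listing is plain text with one instruction per line, `//` comments, and that GDS accesses appear as DS instructions carrying a `gds` modifier (for example `ds_add_u32 v1, v2 gds`). The helper name `gds_instructions` is hypothetical, and the exact text layout of a given tool's dump may differ.

```python
import re

# Assumption: a GDS access is a DS instruction with a trailing "gds" modifier.
GDS_LINE = re.compile(r"^\s*ds_\w+\b.*\bgds\b", re.IGNORECASE)


def gds_instructions(isa_text):
    """Return the instructions in an ISA listing that look like GDS accesses."""
    hits = []
    for line in isa_text.splitlines():
        code = line.split("//", 1)[0]  # ignore trailing comments
        if GDS_LINE.search(code):
            hits.append(code.strip())
    return hits
```

An empty result is consistent with the compiler not emitting any GDS operations for that kernel, under the assumptions above.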

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 06, 2017, 08:21:29 AM
 #234

I was finally able to assemble GCN ISA code with the OpenCL 1.2 ABI for Ellesmere by modifying CLRX. This means that I now have the ability to access GDS on the RX 480 without restrictions. I will be out of town for a week, but we should see pretty interesting things after I come back home.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 07, 2017, 01:32:25 PM
 #235

I appreciate what you are doing and look forward to switching my farm to your miner when it is a bit faster. A moderate difference in hashrate is too costly with a bunch of miners running, but I will accept a small loss in hashrate just to stop using the closed-source stuff.

There are tons of optimizations that can be done with the GCN assembly, so that should happen sooner than later. Now I am thinking about taking a completely different strategy. My current game plan is (1) to reduce the number of kernels by using global synchronizations across work-items and (2) to move pathetically slow row counters in global memory to the fast 64KB GDS. (2) is GCN-specific, but (1) should also be possible with NVIDIA. Too bad I cannot try these ideas until next Wednesday...
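
To get a feel for point (2), here is a back-of-the-envelope sketch of how many row counters fit in 64KB and how they could be packed. The counter widths, the saturating behaviour, and the names `counters_that_fit` and `PackedCounters` are assumptions for illustration only; this is not the miner's actual layout, and it runs on the CPU rather than touching GDS.

```python
GDS_BYTES = 64 * 1024  # GDS size quoted in the post above


def counters_that_fit(bits_per_counter, budget_bytes=GDS_BYTES):
    """Number of fixed-width counters that fit in the given byte budget."""
    return (budget_bytes * 8) // bits_per_counter


class PackedCounters:
    """Fixed-width saturating counters packed into a bytearray."""

    def __init__(self, count, bits):
        if bits not in (1, 2, 4, 8):
            raise ValueError("bits must divide 8 evenly")
        self.bits = bits
        self.per_byte = 8 // bits
        self.mask = (1 << bits) - 1
        self.data = bytearray((count + self.per_byte - 1) // self.per_byte)

    def get(self, i):
        shift = (i % self.per_byte) * self.bits
        return (self.data[i // self.per_byte] >> shift) & self.mask

    def increment(self, i):
        """Increment counter i (saturating) and return its previous value."""
        old = self.get(i)
        if old < self.mask:
            shift = (i % self.per_byte) * self.bits
            # Safe: the field is below its maximum, so it cannot carry
            # into the neighbouring counter in the same byte.
            self.data[i // self.per_byte] += 1 << shift
        return old
```

With these assumptions, `counters_that_fit(32)` gives 16384 and `counters_that_fit(4)` gives 131072, so whether all the row counters fit depends heavily on how narrow they can be made.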

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 07, 2017, 10:51:39 PM
 #236

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py, as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they will bring GG on par with Claymore's in terms of usability.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
laik2
Sr. Member
****
Offline Offline

Activity: 392


View Profile
January 07, 2017, 11:39:00 PM
 #237

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they should bring GG on par with Claymore's in terms of usability.
If you feel the need to test graphics I have windows 10+NV 1070 on it, linux 14.04 r9 390, linux 16.04 rx480 Smiley


ZEC: t1KbbHtXqzSS6qHBaPZDKyWnzxhRjr9oCtW
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 07, 2017, 11:43:33 PM
 #238

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they should bring GG on par with Claymore's in terms of usability.
If you feel the need to test graphics I have windows 10+NV 1070 on it, linux 14.04 r9 390, linux 16.04 rx480 Smiley



That would be great! Could you set up the Linux box with RX 480?

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
laik2
Sr. Member
****
Offline Offline

Activity: 392


View Profile
January 07, 2017, 11:51:38 PM
 #239

Since I am away from home until Wednesday and do not have access to dedicated graphics cards, I just decided to try potential replacements for gatelessgate.py as I feel more comfortable with C++ than Python and I think the Python component of SA v5 is rather lacking as far as functionality is concerned. I am planning to evaluate sgminer-gm and nheqminer. I hope they should bring GG on par with Claymore's in terms of usability.
If you feel the need to test graphics I have windows 10+NV 1070 on it, linux 14.04 r9 390, linux 16.04 rx480 Smiley



That would be great! Could you set up the Linux box with RX 480?
I will PM you with details.

ZEC: t1KbbHtXqzSS6qHBaPZDKyWnzxhRjr9oCtW
Subw
Hero Member
*****
Offline Offline

Activity: 672


View Profile
January 08, 2017, 11:41:27 AM
 #240

I am planning to evaluate sgminer-gm

Moving to sgminer would be awesome, as it has great pool management support and a popular API for monitoring and control.