Bitcoin Forum
Author Topic: Gateless Gate Sharp 1.1.4: zawawa's open-source dual ETH/XMR/PASC/LBC miner  (Read 163630 times)
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 21, 2017, 08:17:14 PM
 #381

I might have to look into HSAIL after all for reliable access to GDS on RX 480.
What a pain in the rear...

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 21, 2017, 08:20:33 PM
 #382

It seems like basic tools are available on Windows:

https://github.com/HSAFoundation/HSAIL-Tools

zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 21, 2017, 09:55:20 PM
 #383



Ah, the joy... I'm pretty much over the hump.
I didn't need the OpenCL 1.2 ABI or HSAIL after all.
This most likely means I should be able to catch up with Optiminer.
Good stuff, good stuff.

nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 22, 2017, 01:36:58 AM
 #384


Quote from: zawawa
I didn't need the OpenCL 1.2 ABI or HSAIL after all.
This most likely means I should be able to catch up with Optiminer.
Good stuff, good stuff.

You wouldn't have had much luck with HSAIL anyway; I'm pretty sure I already mentioned that there are no GDS instructions in HSAIL.
zzzzzzzzzz
Full Member
***
Offline Offline

Activity: 148


View Profile
January 22, 2017, 02:03:00 AM
 #385

@zawawa Just in case, I'll ask: Are you working on GPUs other than RX4xx? I ask because that's the only GPU that anyone has even mentioned in this thread. How about R9 Fury/Nano, for instance? 290x? Etc..? In any case, thank you for all the effort you've given to this!
manotroll
Sr. Member
****
Offline Offline

Activity: 287


View Profile
January 22, 2017, 02:06:39 AM
 #386

Quote from: zzzzzzzzzz
@zawawa Just in case, I'll ask: Are you working on GPUs other than RX4xx? I ask because that's the only GPU that anyone has even mentioned in this thread. How about R9 Fury/Nano, for instance? 290x? Etc..? In any case, thank you for all the effort you've given to this!


I use a 390X for ETH.
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 22, 2017, 02:24:07 AM
 #387

Quote from: zzzzzzzzzz
@zawawa Just in case, I'll ask: Are you working on GPUs other than RX4xx? I ask because that's the only GPU that anyone has even mentioned in this thread. How about R9 Fury/Nano, for instance? 290x? Etc..? In any case, thank you for all the effort you've given to this!


I am currently focusing on RX 480, but I am planning to work on other cards once I'm done with it.

zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 22, 2017, 03:55:32 AM
 #388

The miner is running stably with 2 threads with a 32KB GDS segment each. Very cool...
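
As a rough illustration of the arithmetic, here is a minimal Python sketch that splits a GDS pool evenly between miner threads. The 64 KB total is an assumption (it is the figure usually quoted for GCN parts, not something stated in this thread), and `split_gds` is a hypothetical helper, not part of the miner.

```python
GDS_TOTAL_BYTES = 64 * 1024  # assumed total GDS size; not stated in the thread


def split_gds(num_threads, total_bytes=GDS_TOTAL_BYTES):
    """Return one (base, size) byte range per thread, splitting the pool evenly."""
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    size = total_bytes // num_threads
    return [(i * size, size) for i in range(num_threads)]


segments = split_gds(2)
for thread_id, (base, size) in enumerate(segments):
    print(f"thread {thread_id}: base=0x{base:04x}, size={size // 1024} KB")
```

With two threads, each segment comes out at 32 KB, which matches the setup described above if the 64 KB assumption holds.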

zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 22, 2017, 04:14:43 AM
 #389

I added a new pseudo-op for Global Data Share (GDS) to CLRadeonExtender:

https://github.com/CLRX/CLRX-mirror/pull/11

It will be so much fun if we can freely exploit this killer feature at last...

zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 22, 2017, 04:19:20 AM
 #390


Quote from: nerdralph
Quote from: zawawa
I didn't need the OpenCL 1.2 ABI or HSAIL after all.
This most likely means I should be able to catch up with Optiminer.
Good stuff, good stuff.

You wouldn't have had much luck with HSAIL anyway; I'm pretty sure I already mentioned that there are no GDS instructions in HSAIL.


Really? I don't recall that... The ROCm ABI does expose GDS, though. I will double-check.

cryptominer420
Full Member
***
Offline Offline

Activity: 183


View Profile
January 22, 2017, 05:50:38 PM
 #391

Sounds interesting, I'm anxious to see what you find out.

BTC: 1Eeb9SoBeY7AQjjFn7YMJZMY7Jtw5gxxHs  ETH: 0x68e4EA3b7e60C8D6fC9BA92775ccE27Ca542D114
nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 22, 2017, 06:39:32 PM
 #392

Quote from: zawawa
I added a new pseudo-op for Global Data Share (GDS) to CLRadeonExtender:

https://github.com/CLRX/CLRX-mirror/pull/11

It will be so much fun if we can freely exploit this killer feature at last...

Nice.  With this change there should be no more need to explicitly initialize M0 (except maybe for GCN1 devices, since they only have OpenCL 1.2 driver support).
nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 22, 2017, 06:48:27 PM
 #393


Quote from: zawawa
Quote from: nerdralph
Quote from: zawawa
I didn't need the OpenCL 1.2 ABI or HSAIL after all.
This most likely means I should be able to catch up with Optiminer.
Good stuff, good stuff.

You wouldn't have had much luck with HSAIL anyway; I'm pretty sure I already mentioned that there are no GDS instructions in HSAIL.


Really? I don't recall that... The ROCm ABI does expose GDS, though. I will double-check.

I confirmed it with one of the AMD devs working on llvm.  He said there were plans for a GCN extension that never got implemented in the HSAIL llvm backend, since they are now focused on the AMDGPU backend.
ROCm also now supports OpenCL kernels.
https://www.khronos.org/news/permalink/rocm-1.4-has-support-for-opencl-1.2-host-code-and-2.0-kernels

The possibility of using inline asm for GDS access with the rest of the kernel in straight OpenCL looks promising to me...
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 23, 2017, 03:00:12 AM
 #394


Quote from: nerdralph
The possibility of using inline asm for GDS access with the rest of the kernel in straight OpenCL looks promising to me...


That would be really nice, but I need a solution that works right now.
I had to jump through another hoop and turn on the "enable_ordered_append_gds" bit, but I finally located where the GDS base is stored. I am getting really close!

zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 23, 2017, 05:33:28 AM
 #395

Do I need to initialize GDS before actually using it?
These instructions are documented nowhere.

Code:
DS_CONSUME
DS_APPEND
DS_ORDERED_COUNT

nerdralph, do you have any ideas?

zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 23, 2017, 08:54:04 AM
 #396

Hmm... It seems that GDS is not activated for some reason.
What to do, what to do...

chronek
Sr. Member
****
Offline Offline

Activity: 262


View Profile
January 23, 2017, 12:13:14 PM
 #397

I heard that the RX 480 supports OpenCL 2.0. Would there be any benefit to using the 2.0 ABI?
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 23, 2017, 01:58:06 PM
 #398

Quote from: chronek
I heard that the RX 480 supports OpenCL 2.0. Would there be any benefit to using the 2.0 ABI?

The OpenCL 2.0 ABI does not make any difference. I might have to bypass the driver and send raw packets directly to the GPU to enable GDS. This is crazy.

http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/10/si_programming_guide_v2.pdf
https://github.com/fail0verflow/radeon-tools/blob/master/f32/f32dis.py

zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 23, 2017, 02:07:57 PM
 #399

There you go!

Quote
2.9 Misc/Data Transfer Packets
2.9.1 ALLOC_GDS
The packet will allocate a new segment within its corresponding GDS partition. The corresponding partition is determined from the Ring to which the packet is submitted. The microcode will first wait until the active partition count equals zero before continuing. This guarantees that the entire contents of the previous allocated segment have been dumped to memory before allocating the new segment within the current partition. It will also check if the segment size is less than partition size and interrupt if the current segment does not fit into its specified partition
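
The checks described in that passage can be sketched as a toy Python model. This is only an illustration of the quoted wording, not the actual microcode or any driver API: `GdsPartition`, `alloc_gds`, and the string results are all hypothetical names, and treating a segment exactly equal to the partition size as fitting is an assumption.

```python
class GdsPartition:
    """Toy model of one GDS partition, following the quoted ALLOC_GDS text."""

    def __init__(self, partition_size):
        self.partition_size = partition_size
        self.active_count = 0   # work still using the previously allocated segment
        self.segment_size = 0   # size of the currently allocated segment

    def alloc_gds(self, segment_size):
        # The microcode is described as waiting until the active count is zero;
        # this model simply reports that it would have to wait.
        if self.active_count != 0:
            return "wait"
        # A segment that does not fit in the partition triggers an interrupt.
        # Assumption: a segment exactly as large as the partition still fits.
        if segment_size > self.partition_size:
            return "interrupt"
        self.segment_size = segment_size
        return "allocated"


part = GdsPartition(32 * 1024)
results = [part.alloc_gds(32 * 1024), part.alloc_gds(48 * 1024)]
part.active_count = 1           # pretend earlier work is still running
results.append(part.alloc_gds(1024))
print(results)
```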

nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 23, 2017, 02:11:57 PM
 #400

Quote from: zawawa
Do I need to initialize GDS before actually using it?
These instructions are documented nowhere.

Code:
DS_CONSUME
DS_APPEND
DS_ORDERED_COUNT

nerdralph, do you have any ideas?

I suspect the driver initializes M0 when gds_segment_byte_size is set in the kernel configuration.  If you look in the GCN ISA docs, it says M0 has 16 bits for offset and 16 bits for size.  M0 is also used for LDS, so when you use both in your code you'll need to save it to another register.

I hadn't looked at the DS_ instructions you refer to, and a quick look at the ISA confirms your observation about them having no documentation.  The llvm source would at least have the instruction encoding.

I'm not sure why you want to use those instructions though.  For the global row counters I'd use ds_add_u32 with the GDS bit set.

P.S. The M0 description is in section 3.7 of the GCN ISA docs.
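
To make the 16-bit offset plus 16-bit size split concrete, here is a small Python sketch of packing and unpacking such a value. The field order (offset in the upper half, size in the lower half) is an assumption for illustration; check the ISA docs for the actual layout. `pack_m0` and `unpack_m0` are hypothetical helpers.

```python
def pack_m0(base, size):
    """Pack a 16-bit GDS base offset and a 16-bit size into one 32-bit value.

    Assumed layout: base in bits 31..16, size in bits 15..0.
    """
    if not (0 <= base <= 0xFFFF and 0 <= size <= 0xFFFF):
        raise ValueError("base and size must each fit in 16 bits")
    return (base << 16) | size


def unpack_m0(m0):
    """Split a 32-bit value back into (base, size) under the same assumed layout."""
    return (m0 >> 16) & 0xFFFF, m0 & 0xFFFF


# Second of two 32 KB segments: base 0x8000, size 0x8000.
m0 = pack_m0(0x8000, 0x8000)
print(hex(m0), unpack_m0(m0))
```

Because M0 is shared with LDS accesses, a kernel that uses both would keep a value like this in a spare register and restore it before each GDS access.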
