Bitcoin Forum
December 11, 2017, 03:04:33 AM *
Author Topic: Gateless Gate Sharp 1.1.3: zawawa's open-source dual ETH/XMR/PASC/LBC miner  (Read 161322 times)
nerdralph
March 12, 2017, 04:42:14 PM
 #841


Looks promising; however, I find you really need to dig through the code and experiment to see what does and does not work.  I don't consider myself a kernel module developer, so you might already know more about this than I do.  With closed-source drivers like AMDGPU-Pro, it's hard to figure out which parts of the kernel DRM API are implemented and, even when they are implemented, whether they work.  For example, the 16.40 driver implements the powerplay function force_clock_level(), but it only seems to support type PP_SCLK.
zawawa
March 12, 2017, 07:59:33 PM
 #842


Quote from: nerdralph on March 12, 2017, 04:42:14 PM
    Looks promising; however, I find you really need to dig through the code and experiment to see what does and does not work.  I don't consider myself a kernel module developer, so you might already know more about this than I do.  With closed-source drivers like AMDGPU-Pro, it's hard to figure out which parts of the kernel DRM API are implemented and, even when they are implemented, whether they work.  For example, the 16.40 driver implements the powerplay function force_clock_level(), but it only seems to support type PP_SCLK.


You are absolutely right about that. Thanks a lot for sticking around!

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa
March 13, 2017, 10:58:06 AM
 #843

While I was trying to find a way to access the entire GDS, I realized I don't need that much memory for row counters.
All I have to do is to squeeze four 10-bit row counters into a 32-bit integer. Let's see...
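
A minimal sketch of the packing idea in plain C, under stated assumptions. Four 10-bit fields would need 40 bits, so this sketch assumes three 10-bit counters per 32-bit word (30 bits used); the post does not show the actual layout. The names row_counter_get and row_counter_inc are hypothetical, and on a GPU the increment would presumably be a single atomic add on the word rather than the plain add used here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: three 10-bit counters per 32-bit word.
 * Bits 0-9, 10-19 and 20-29 hold counters; bits 30-31 are unused. */
#define COUNTER_BITS      10u
#define COUNTERS_PER_WORD 3u
#define COUNTER_MASK      ((1u << COUNTER_BITS) - 1u)

/* Read the counter that belongs to `row` from the packed array. */
static uint32_t row_counter_get(const uint32_t *words, uint32_t row)
{
    uint32_t shift = (row % COUNTERS_PER_WORD) * COUNTER_BITS;
    return (words[row / COUNTERS_PER_WORD] >> shift) & COUNTER_MASK;
}

/* Increment the counter for `row` and return its previous value,
 * which is the slot index the caller would write to. */
static uint32_t row_counter_inc(uint32_t *words, uint32_t row)
{
    uint32_t shift = (row % COUNTERS_PER_WORD) * COUNTER_BITS;
    uint32_t old = row_counter_get(words, row);

    /* A full counter would carry into the neighbouring field. */
    assert(old < COUNTER_MASK);

    /* Adding 1 << shift bumps only the selected field. */
    words[row / COUNTERS_PER_WORD] += 1u << shift;
    return old;
}
```

Because the increment is a single add of 1 << shift, neighbouring counters in the same word are untouched as long as no field overflows, which is why the overflow check matters.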

zawawa
March 13, 2017, 09:47:34 PM
 #844

Alright, I'm getting 250 sol/s with a stock RX 480 on Ubuntu.
This should be good enough for a start.
I'm getting the hang of GCN inline assembly pretty well, so the rest of GG's development should be a pretty smooth ride.

qwep1
March 13, 2017, 10:05:47 PM
 #845

Quote from: zawawa on March 13, 2017, 09:47:34 PM
    Alright, I'm getting 250 sol/s with a stock RX 480 on Ubuntu.
    This should be good enough for a start.
    I'm getting the hang of GCN inline assembly pretty well, so the rest of GG's development should be a pretty smooth ride.

Modded BIOS or not?

zawawa
March 13, 2017, 10:09:25 PM
 #846

Quote from: qwep1 on March 13, 2017, 10:05:47 PM
    Quote from: zawawa on March 13, 2017, 09:47:34 PM
        Alright, I'm getting 250 sol/s with a stock RX 480 on Ubuntu.
        This should be good enough for a start.
        I'm getting the hang of GCN inline assembly pretty well, so the rest of GG's development should be a pretty smooth ride.
    Modded BIOS or not?

No BIOS mod.

zawawa
March 14, 2017, 05:48:31 AM
 #847

I just uploaded an experimental version for Ellesmere (RX 470/480):

https://github.com/zawawawa/gatelessgate/releases/tag/v0.1.3-pre3

With a stock RX 480, it runs at around 250 sol/s on Linux and 230 sol/s on Windows.
The miner should get full speed as soon as I figure out how to access the entire GDS without restrictions.

toptek
March 14, 2017, 07:01:15 AM
 #848

ETH is about where CM is for me on my two new cards; I haven't tried my other cards yet. At stock, a 480 8GB and a 470 8GB are holding at around 24 to 25 MH/s, which is what CM does, but with no fee, so I get it all now. I'm going to give ZEC a shot later on. No HW errors mining ETH so far.
Daniel0785
March 14, 2017, 10:27:14 AM
 #849

Great work! Any improvements for XMR in pre3?
lexele
March 14, 2017, 10:38:15 AM
 #850

Quote from: Daniel0785 on March 14, 2017, 10:27:14 AM
    Great work! Any improvements for XMR in pre3?

Hi, it's already the fastest XMR miner (at least as far as I've found) and it's open source, so what's the point of it being even faster? Everybody will be faster, so you will not get more XMR.

On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why zawawa's miner is awaited.
Daniel0785
March 14, 2017, 12:15:19 PM
 #851

Quote from: lexele on March 14, 2017, 10:38:15 AM
    Quote from: Daniel0785 on March 14, 2017, 10:27:14 AM
        Great work! Any improvements for XMR in pre3?
    Hi, it's already the fastest XMR miner (at least as far as I've found) and it's open source, so what's the point of it being even faster? Everybody will be faster, so you will not get more XMR.
    On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why zawawa's miner is awaited.

I agree. But if there are any improvements, I have to update; otherwise I can just stay with pre2.
nerdralph
March 14, 2017, 02:04:54 PM
 #852

Quote from: zawawa on March 13, 2017, 10:58:06 AM
    While I was trying to find a way to access the entire GDS, I realized I don't need that much memory for row counters.
    All I have to do is to squeeze four 10-bit row counters into a 32-bit integer. Let's see...

I doubt going over 8-bit row counters with 2^14 rows would give a material performance improvement.  It also complicates the code and increases LDS use.  Once you've finished the code with 10-bit row counters, I may try optimizing it for 8.  Using SLC or GLC memory read/write may also give a small performance boost.
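
To put a number on the memory-use point, here is a rough sketch of the counter storage for 2^14 rows at each width. It assumes counters are packed into 32-bit words without straddling word boundaries (so three 10-bit counters per word, four 8-bit counters per word); counter_bytes is a hypothetical helper, not something from the miner's source.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ROWS (1u << 14) /* 2^14 rows, as stated in the post */

/* Bytes needed to hold NUM_ROWS counters of `bits` bits each,
 * assuming whole counters are packed into 32-bit words. */
static uint32_t counter_bytes(uint32_t bits)
{
    assert(bits >= 1u && bits <= 32u);

    uint32_t per_word = 32u / bits;                         /* counters per word */
    uint32_t words = (NUM_ROWS + per_word - 1u) / per_word; /* round up */
    return words * (uint32_t)sizeof(uint32_t);
}
```

Under these assumptions counter_bytes(8) is 16384 bytes (16 KiB) and counter_bytes(10) is 21848 bytes, about a third more. The trade-off is capacity: an 8-bit counter tops out at 255 entries per row, a 10-bit one at 1023.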

UnclWish
March 14, 2017, 02:27:38 PM
 #853

Quote from: lexele on March 14, 2017, 10:38:15 AM
    Quote from: Daniel0785 on March 14, 2017, 10:27:14 AM
        Great work! Any improvements for XMR in pre3?
    Hi, it's already the fastest XMR miner (at least as far as I've found) and it's open source, so what's the point of it being even faster? Everybody will be faster, so you will not get more XMR.
    On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why zawawa's miner is awaited.

That's not right. The fastest XMR miner at the moment is Claymore CryptoNote 9.7.
Even with the 2% fee it gives about 5-10% more speed on a 280X.
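
As a quick check on that arithmetic: if the 5-10% figure is the raw hashrate advantage before the fee is taken (an assumption; the post may already mean the net figure), the net gain can be sketched as below. net_gain is a hypothetical helper for illustration only.

```c
#include <assert.h>

/* Net speed-up of a fee-charging miner over a fee-free one.
 * raw_speedup and fee are fractions, e.g. 0.05 for 5%. */
static double net_gain(double raw_speedup, double fee)
{
    assert(fee >= 0.0 && fee < 1.0);
    return (1.0 + raw_speedup) * (1.0 - fee) - 1.0;
}
```

With a 2% fee, a 5% raw advantage nets roughly 2.9% and a 10% raw advantage nets roughly 7.8%, so the fee does not erase the lead under this reading.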
R0mi
March 14, 2017, 02:53:12 PM
 #854

Quote from: zawawa on March 14, 2017, 05:48:31 AM
    I just uploaded an experimental version for Ellesmere (RX 470/480):

    https://github.com/zawawawa/gatelessgate/releases/tag/v0.1.3-pre3

    With a stock RX 480, it runs at around 250 sol/s on Linux and 230 sol/s on Windows.
    The miner should get full speed as soon as I figure out how to access the entire GDS without restrictions.

Great, and thanks for this.  Any chance of adding a blake2 kernel? It seems it was included in SGMiner 5.1.1 and then disappeared.

zawawa
March 14, 2017, 08:00:24 PM
 #855

Quote from: nerdralph on March 14, 2017, 02:04:54 PM
    Using SLC or GLC memory read/write may also give a small performance boost.

Could you elaborate on this? I recall you said Wolf was using them for his private miner, but I am not entirely sure how to use SLC/GLC bits for performance enhancements.

zawawa
March 14, 2017, 08:38:30 PM
 #856

It seems the RX 480 has plenty of GDS-related configuration registers.

https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

These are helper functions in the Linux kernel:

http://lxr.free-electrons.com/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L92

The size and offset of the GDS are set through the following per-VMID register table (the code below indexes it as an array rather than calling it as a function):

amdgpu_gds_reg_offset[]

This must be it!

Code:
/* GDS Base: one WRITE_DATA packet that stores gds_base
 * in the base register for this VMID */
amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
                         WRITE_DATA_DST_SEL(0)));
amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].mem_base); /* register offset */
amdgpu_ring_write(ring, 0);
amdgpu_ring_write(ring, gds_base);                             /* value written */

/* GDS Size: same packet shape, targeting the size register */
amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
                         WRITE_DATA_DST_SEL(0)));
amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].mem_size); /* register offset */
amdgpu_ring_write(ring, 0);
amdgpu_ring_write(ring, gds_size);                             /* value written */
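
To make the packet shape easier to see, here is a user-space mock of the same sequence. It is only a sketch: struct mock_ring, mock_ring_write, emit_reg_write and emit_gds_window are hypothetical stand-ins, the macro values are written from memory of the amdgpu headers and should be checked against the actual tree, and the register offsets passed in are placeholders rather than real GDS registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the amdgpu headers; verify before relying on them. */
#define PACKET_TYPE3             3u
#define PACKET3(op, n)           ((PACKET_TYPE3 << 30) | \
                                  (((op) & 0xFFu) << 8) | \
                                  (((n) & 0x3FFFu) << 16))
#define PACKET3_WRITE_DATA       0x37u
#define WRITE_DATA_DST_SEL(x)    ((uint32_t)(x) << 8)
#define WRITE_DATA_ENGINE_SEL(x) ((uint32_t)(x) << 30)

/* Hypothetical stand-in for struct amdgpu_ring: a flat dword buffer. */
struct mock_ring {
    uint32_t buf[16];
    size_t   count;
};

/* Append one dword to the mock ring. */
static void mock_ring_write(struct mock_ring *ring, uint32_t v)
{
    assert(ring->count < sizeof(ring->buf) / sizeof(ring->buf[0]));
    ring->buf[ring->count++] = v;
}

/* Emit one five-dword WRITE_DATA packet that stores `value` in `reg`. */
static void emit_reg_write(struct mock_ring *ring, uint32_t reg, uint32_t value)
{
    /* Header: the count field is the number of body dwords minus one. */
    mock_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
    /* Control word: engine and destination selects, both 0 here. */
    mock_ring_write(ring, WRITE_DATA_ENGINE_SEL(0) | WRITE_DATA_DST_SEL(0));
    mock_ring_write(ring, reg);   /* destination offset, low dword */
    mock_ring_write(ring, 0);     /* destination offset, high dword */
    mock_ring_write(ring, value); /* payload */
}

/* Mirror of the quoted snippet: one packet for the base, one for the size. */
static void emit_gds_window(struct mock_ring *ring,
                            uint32_t base_reg, uint32_t size_reg,
                            uint32_t gds_base, uint32_t gds_size)
{
    emit_reg_write(ring, base_reg, gds_base);
    emit_reg_write(ring, size_reg, gds_size);
}
```

Calling emit_gds_window on an empty mock ring leaves ten dwords in the buffer, five per packet, which matches the two groups of five amdgpu_ring_write() calls in the kernel snippet.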

zawawa
March 14, 2017, 08:44:23 PM
 #857

Quote from: Daniel0785 on March 14, 2017, 10:27:14 AM
    Great work! Any improvements for XMR in pre3?

Not yet, not yet. I plan to optimize the hell out of other kernels once I'm done with Equihash.
The GCN inline assembly is ridiculously powerful, I'm telling you.

zawawa
March 14, 2017, 09:26:16 PM
 #858

GDS-related parameters for the compute kernel are stored here... Good stuff.

Code:
struct amdgpu_job {
    struct amd_sched_job    base;
    struct amdgpu_device    *adev;
    struct amdgpu_vm        *vm;
    struct amdgpu_ring      *ring;
    struct amdgpu_sync      sync;
    struct amdgpu_ib        *ibs;
    struct dma_fence        *fence; /* the hw fence */
    uint32_t                preamble_status;
    uint32_t                num_ibs;
    void                    *owner;
    uint64_t                fence_ctx; /* the fence_context this job uses */
    bool                    vm_needs_flush;
    unsigned                vm_id;
    uint64_t                vm_pd_addr;
    uint32_t                gds_base, gds_size;
    uint32_t                gws_base, gws_size;
    uint32_t                oa_base, oa_size;

    /* user fence handling */
    uint64_t                uf_addr;
    uint64_t                uf_sequence;
};

lexele
March 14, 2017, 11:06:44 PM
 #859

Hi,
Tested pre3 on an RX 470 4G: 240 H/s (1230/2025, modded BIOS).
lexele
March 14, 2017, 11:09:11 PM
 #860

Quote from: UnclWish on March 14, 2017, 02:27:38 PM
    Quote from: lexele on March 14, 2017, 10:38:15 AM
        Quote from: Daniel0785 on March 14, 2017, 10:27:14 AM
            Great work! Any improvements for XMR in pre3?
        Hi, it's already the fastest XMR miner (at least as far as I've found) and it's open source, so what's the point of it being even faster? Everybody will be faster, so you will not get more XMR.
        On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why zawawa's miner is awaited.
    That's not right. The fastest XMR miner at the moment is Claymore CryptoNote 9.7.
    Even with the 2% fee it gives about 5-10% more speed on a 280X.

On my 290X, GG is 15% faster; on my RX 470 it's not much faster, but it is faster.