Bitcoin Forum
Poll
Question: Do you want to see improvements in Ethash dual-mining with GGS?
I desperately need it. - 8 (15.1%)
It would be nice. - 12 (22.6%)
It's not worth it anymore. - 33 (62.3%)
Total Voters: 53

Author Topic: Gateless Gate Sharp 1.3.8: 30Mh/s (Ethash) on RX 480!  (Read 214344 times)
zawawa (OP)
March 12, 2017, 07:59:33 PM
 #841


Looks promising; however, I find you really need to dig through the code and experiment to see what does and does not work. I don't consider myself a kernel module developer, so you might already know more about this than I do. With closed-source drivers like AMDGPU-Pro, it's hard to figure out which parts of the kernel DRM API are implemented, and even if they are implemented, whether they work. For example, the 16.40 drivers implement the powerplay function force_clock_level(), but it only seems to support type PP_SCLK.


You are absolutely right about that. Thanks a lot for sticking around!

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa (OP)
March 13, 2017, 10:58:06 AM
 #842

While I was trying to find a way to access the entire GDS, I realized I don't need that much memory for row counters.
All I have to do is to squeeze four 10-bit row counters into a 32-bit integer. Let's see...
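
A minimal sketch of the packing idea, not the miner's actual code. Note that four 10-bit fields add up to 40 bits, so this sketch assumes three counters per 32-bit word (bits 0-9, 10-19, 20-29); the thread does not show the layout that was actually used. The names counter_get and counter_inc are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: three 10-bit counters per 32-bit word.
   A fourth 10-bit counter would need bits 30-39 and does not fit. */
enum { COUNTER_BITS = 10, COUNTERS_PER_WORD = 3 };
#define COUNTER_MASK ((1u << COUNTER_BITS) - 1u)

/* Read counter `slot` (0..COUNTERS_PER_WORD-1) from a packed word. */
static inline uint32_t counter_get(uint32_t word, unsigned slot)
{
    return (word >> (slot * COUNTER_BITS)) & COUNTER_MASK;
}

/* Increment counter `slot` and return its previous value.
   A saturated counter is left unchanged so it cannot carry
   into the neighbouring field. */
static inline uint32_t counter_inc(uint32_t *word, unsigned slot)
{
    uint32_t old = counter_get(*word, slot);
    if (old < COUNTER_MASK)
        *word += 1u << (slot * COUNTER_BITS);
    return old;
}
```

On a GPU the increment would have to be an atomic add on the shared word; this host-side version only illustrates the bit layout.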

zawawa (OP)
March 13, 2017, 09:47:34 PM
 #843

Alright, I'm getting 250 sol/s with stock RX 480 on Ubuntu.
This should be good enough as a starter.
I'm getting a pretty good hang of the GCN inline assembly, so the rest of GG's development should be a pretty smooth ride.

qwep1
March 13, 2017, 10:05:47 PM
 #844

Alright, I'm getting 250 sol/s with stock RX 480 on Ubuntu.
This should be good enough as a starter.
I'm getting a pretty good hang of the GCN inline assembly, so the rest of GG's development should be a pretty smooth ride.
Modded BIOS or not?

zawawa (OP)
March 13, 2017, 10:09:25 PM
 #845

Alright, I'm getting 250 sol/s with stock RX 480 on Ubuntu.
This should be good enough as a starter.
I'm getting a pretty good hang of the GCN inline assembly, so the rest of GG's development should be a pretty smooth ride.
Modded BIOS or not?

No BIOS mod.

zawawa (OP)
March 14, 2017, 05:48:31 AM
 #846

I just uploaded an experimental version for Ellesmere (RX 470/480):

https://github.com/zawawawa/gatelessgate/releases/tag/v0.1.3-pre3

With a stock RX 480, it runs at around 250 sol/s on Linux and 230 sol/s on Windows.
The miner should get full speed as soon as I figure out how to access the entire GDS without restrictions.

toptek
March 14, 2017, 07:01:15 AM
 #847

I got ETH to where CM is for me on my two new cards; I haven't tried my other cards yet. At stock, a 480 8GB and a 470 8GB hold at around 24 to 25 MH/s, which is what CM does, but with no fee, so I get it all now. I'm going to give ZEC a shot later on. No HW errors mining ETH so far.

Daniel0785
March 14, 2017, 10:27:14 AM
 #848

Great work! Any improvements about XMR in pre3?
lexele
March 14, 2017, 10:38:15 AM
 #849

Great work! Any improvements about XMR in pre3?

Hi, it's already the fastest XMR miner (at least among those I've found) and it's open source, so what's the point of being even faster? Everybody will be faster, so you will not get more XMR.

On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why Zawawa's miner is awaited.
Daniel0785
March 14, 2017, 12:15:19 PM
 #850

Great work! Any improvements about XMR in pre3?

Hi, it's already the fastest XMR miner (at least among those I've found) and it's open source, so what's the point of being even faster? Everybody will be faster, so you will not get more XMR.

On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why Zawawa's miner is awaited.
I agree. But if there are any improvements, I have to update; otherwise I can just stay with pre2.
nerdralph
March 14, 2017, 02:04:54 PM
 #851

While I was trying to find a way to access the entire GDS, I realized I don't need that much memory for row counters.
All I have to do is to squeeze four 10-bit row counters into a 32-bit integer. Let's see...

I doubt going over 8-bit row counters with 2^14 rows would give a material performance improvement. It also complicates the code and increases LDS use. Once you've finished the code with 10-bit row counters, I may try optimizing it for 8. Using SLC or GLC memory read/write may also give a small performance boost.
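
To put rough numbers on the storage side of this trade-off, here is a small sketch that computes the bytes needed for 2^14 row counters at different widths. It assumes counters are packed into 32-bit words and never straddle a word boundary; the helper name packed_counter_bytes is hypothetical, and the actual memory layout in the miner may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for `rows` counters of `bits` bits each (1..32),
   packed into 32-bit words with no counter split across words. */
static size_t packed_counter_bytes(size_t rows, unsigned bits)
{
    size_t per_word = 32u / bits;                     /* counters per word */
    size_t words = (rows + per_word - 1) / per_word;  /* round up */
    return words * sizeof(uint32_t);
}
```

Under these assumptions, 2^14 rows take 16,384 bytes with 8-bit counters (four per word), 21,848 bytes with 10-bit counters (three per word), and 65,536 bytes with one full 32-bit word per counter.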

UnclWish
March 14, 2017, 02:27:38 PM
 #852

Great work! Any improvements about XMR in pre3?

Hi, it's already the fastest XMR miner (at least among those I've found) and it's open source, so what's the point of being even faster? Everybody will be faster, so you will not get more XMR.

On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why Zawawa's miner is awaited.
That's not right. The fastest XMR miner at the moment is Claymore CryptoNote 9.7.
Even with its 2% fee it gives about 5-10% more speed on a 280X.
R0mi
March 14, 2017, 02:53:12 PM
 #853

I just uploaded an experimental version for Ellesmere (RX 470/480):

https://github.com/zawawawa/gatelessgate/releases/tag/v0.1.3-pre3

With a stock RX 480, it runs at around 250 sol/s on Linux and 230 sol/s on Windows.
The miner should get full speed as soon as I figure out how to access the entire GDS without restrictions.


Great, and thanks for this. Any chance of adding a blake2 kernel? It seems it was included in SGMiner 5.1.1 and then disappeared.

zawawa (OP)
March 14, 2017, 08:00:24 PM
 #854

Using SLC or GLC memory read/write may also give a small performance boost.

Could you elaborate on this? I recall you said Wolf was using them for his private miner, but I am not entirely sure how to use SLC/GLC bits for performance enhancements.

zawawa (OP)
March 14, 2017, 08:38:30 PM
 #855

It seems like the RX 480 has plenty of GDS-related configuration registers.

https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

These are helper functions in the Linux kernel:

http://lxr.free-electrons.com/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L92

The size and offset of the GDS are written to the registers listed in the following per-VMID table:

amdgpu_gds_reg_offset[]

This must be it!

Code:
/* GDS Base */
amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
                         WRITE_DATA_DST_SEL(0)));
amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].mem_base);
amdgpu_ring_write(ring, 0);
amdgpu_ring_write(ring, gds_base);

/* GDS Size */
amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
                         WRITE_DATA_DST_SEL(0)));
amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].mem_size);
amdgpu_ring_write(ring, 0);
amdgpu_ring_write(ring, gds_size);
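
The excerpt above emits the same five-dword pattern twice, once per register. The userspace mock below only illustrates that ordering. struct mock_ring, mock_ring_write and mock_write_reg are hypothetical names, not kernel APIs, and the header and control words are passed in as opaque values because their bit encodings are not shown in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a command ring; not the kernel's amdgpu_ring. */
struct mock_ring {
    uint32_t dw[16];
    size_t   count;
};

/* Append one dword, silently dropping writes once the mock is full. */
static void mock_ring_write(struct mock_ring *r, uint32_t v)
{
    if (r->count < sizeof r->dw / sizeof r->dw[0])
        r->dw[r->count++] = v;
}

/* Emit one register write as five dwords, in the order used by the excerpt. */
static void mock_write_reg(struct mock_ring *r, uint32_t header,
                           uint32_t control, uint32_t reg, uint32_t value)
{
    mock_ring_write(r, header);   /* PACKET3(PACKET3_WRITE_DATA, 3) in the excerpt */
    mock_ring_write(r, control);  /* engine and destination select bits */
    mock_ring_write(r, reg);      /* register offset: mem_base or mem_size */
    mock_ring_write(r, 0);        /* written as 0 in the excerpt */
    mock_ring_write(r, value);    /* gds_base or gds_size */
}
```

Calling mock_write_reg twice, once with the base register and once with the size register, reproduces the ten dwords the excerpt places on the ring.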

zawawa (OP)
March 14, 2017, 08:44:23 PM
 #856

Great work! Any improvements about XMR in pre3?

Not yet, not yet. I plan to optimize the hell out of other kernels once I'm done with Equihash.
The GCN inline assembly is ridiculously powerful, I'm telling you.

zawawa (OP)
March 14, 2017, 09:26:16 PM
 #857

GDS-related parameters for the compute kernel are stored here... Good stuff.

Code:
struct amdgpu_job {
	struct amd_sched_job	base;
	struct amdgpu_device	*adev;
	struct amdgpu_vm	*vm;
	struct amdgpu_ring	*ring;
	struct amdgpu_sync	sync;
	struct amdgpu_ib	*ibs;
	struct dma_fence	*fence; /* the hw fence */
	uint32_t		preamble_status;
	uint32_t		num_ibs;
	void			*owner;
	uint64_t		fence_ctx; /* the fence_context this job uses */
	bool			vm_needs_flush;
	unsigned		vm_id;
	uint64_t		vm_pd_addr;
	uint32_t		gds_base, gds_size;
	uint32_t		gws_base, gws_size;
	uint32_t		oa_base, oa_size;

	/* user fence handling */
	uint64_t		uf_addr;
	uint64_t		uf_sequence;
};

lexele
March 14, 2017, 11:06:44 PM
 #858

Hi,
Tested pre3 on an RX 470 4G: 240 H/s (1230/2025, modded BIOS).
lexele
March 14, 2017, 11:09:11 PM
 #859

Great work! Any improvements about XMR in pre3?

Hi, it's already the fastest XMR miner (at least among those I've found) and it's open source, so what's the point of being even faster? Everybody will be faster, so you will not get more XMR.

On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why Zawawa's miner is awaited.
That's not right. The fastest XMR miner at the moment is Claymore CryptoNote 9.7.
Even with its 2% fee it gives about 5-10% more speed on a 280X.

On my 290X GG is 15% faster; on my RX 470 it's not much faster, but it is faster.
UnclWish
March 15, 2017, 12:01:20 AM
 #860

Great work! Any improvements about XMR in pre3?

Hi, it's already the fastest XMR miner (at least among those I've found) and it's open source, so what's the point of being even faster? Everybody will be faster, so you will not get more XMR.

On ZEC, by contrast, the faster miners are closed source with a dev fee; that's why Zawawa's miner is awaited.
That's not right. The fastest XMR miner at the moment is Claymore CryptoNote 9.7.
Even with its 2% fee it gives about 5-10% more speed on a 280X.

On my 290X GG is 15% faster; on my RX 470 it's not much faster, but it is faster.
But on the 280X Claymore is faster. That means GG is not the fastest miner at the moment. Maybe the author will optimize something else and it will become the best :)