zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 12, 2017, 07:59:33 PM |
|
Quote: Looks promising; however, I find you really need to dig through the code and experiment to see what does and does not work. I don't consider myself a kernel module developer, so you might already know more about this than I do. With closed-source drivers like AMDGPU-Pro, it's hard to figure out which parts of the kernel DRM API are implemented, and even if they are implemented, whether they work. For example, the 16.40 driver implements the powerplay function force_clock_level(), but it only seems to support type PP_SCLK.
You are absolutely right about that. Thanks a lot for sticking around!
|
Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 13, 2017, 10:58:06 AM |
|
While I was trying to find a way to access the entire GDS, I realized I don't need that much memory for row counters. All I have to do is to squeeze four 10-bit row counters into a 32-bit integer. Let's see...
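A minimal sketch of the packing idea, written in plain C rather than GCN assembly. The names (`COUNTER_BITS`, `row_counter_get`, `row_counter_inc`) are illustrative and do not come from the Gateless Gate source. One caveat on the arithmetic: four 10-bit fields need 40 bits, so a 32-bit word holds three 10-bit counters; four per word works with 8-bit fields.

```c
#include <stdint.h>

#define COUNTER_BITS      10u
#define COUNTERS_PER_WORD (32u / COUNTER_BITS)        /* 3 for 10-bit fields */
#define COUNTER_MASK      ((1u << COUNTER_BITS) - 1u)

/* Read the counter for a given row out of the packed array. */
static uint32_t row_counter_get(const uint32_t *words, uint32_t row)
{
    uint32_t shift = (row % COUNTERS_PER_WORD) * COUNTER_BITS;
    return (words[row / COUNTERS_PER_WORD] >> shift) & COUNTER_MASK;
}

/* Increment a row's counter and return its previous value (the slot index).
 * The counter saturates at COUNTER_MASK so that a full row never carries
 * into the neighbouring field of the same word. */
static uint32_t row_counter_inc(uint32_t *words, uint32_t row)
{
    uint32_t shift = (row % COUNTERS_PER_WORD) * COUNTER_BITS;
    uint32_t *w = &words[row / COUNTERS_PER_WORD];
    uint32_t old = (*w >> shift) & COUNTER_MASK;

    if (old < COUNTER_MASK)
        *w += 1u << shift;
    return old;
}
```

On the GPU the increment would presumably be an atomic add on LDS or GDS; this single-threaded version only shows the bit layout.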
|
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 13, 2017, 09:47:34 PM |
|
Alright, I'm getting 250 sol/s with stock RX 480 on Ubuntu. This should be good enough as a starter. I'm getting a pretty good hang of the GCN inline assembly, so the rest of GG's development should be a pretty smooth ride.
|
|
|
|
qwep1
|
|
March 13, 2017, 10:05:47 PM |
|
Quote from zawawa: Alright, I'm getting 250 sol/s with stock RX 480 on Ubuntu. This should be good enough as a starter. I'm getting a pretty good hang of the GCN inline assembly, so the rest of GG's development should be a pretty smooth ride.
Modded BIOS or not?
|
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 13, 2017, 10:09:25 PM |
|
Quote from zawawa: Alright, I'm getting 250 sol/s with stock RX 480 on Ubuntu. This should be good enough as a starter. I'm getting a pretty good hang of the GCN inline assembly, so the rest of GG's development should be a pretty smooth ride.
Quote from qwep1: Modded BIOS or not?
No BIOS mod.
|
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 14, 2017, 05:48:31 AM |
|
I just uploaded an experimental version for Ellesmere (RX 470/480): https://github.com/zawawawa/gatelessgate/releases/tag/v0.1.3-pre3
With stock RX 480, it runs at around 250 sol/s on Linux and 230 sol/s on Windows. The miner should get full speed as soon as I figure out how to access the entire GDS without restrictions.
|
|
|
|
toptek
Legendary
Offline
Activity: 1274
Merit: 1000
|
|
March 14, 2017, 07:01:15 AM |
|
Got ETH to where CM is for me on my two new cards; I haven't tried my other cards yet. At stock, a 480 8GB and a 470 8GB are holding at around 24 to 25 MH/s, which is what CM does, but with no fee, so I get it all now. Gonna give ZEC a shot later on. And no HW errors mining ETH so far.
|
|
|
|
Daniel0785
Newbie
Offline
Activity: 7
Merit: 0
|
|
March 14, 2017, 10:27:14 AM |
|
Great work! Any improvements for XMR in pre3?
|
|
|
|
lexele
|
|
March 14, 2017, 10:38:15 AM |
|
Quote from Daniel0785: Great work! Any improvements for XMR in pre3?
Hi, it's already the fastest XMR miner (at least that I have found) and it's open source, so what's the point of being even faster? Everybody will be faster and you will not get more XMR. On ZEC, by contrast, the faster miners are closed source with a devfee; that's why Zawawa's miner is awaited.
|
|
|
|
Daniel0785
Newbie
Offline
Activity: 7
Merit: 0
|
|
March 14, 2017, 12:15:19 PM |
|
Quote from Daniel0785: Great work! Any improvements for XMR in pre3?
Quote from lexele: Hi, it's already the fastest XMR miner (at least that I have found) and it's open source, so what's the point of being even faster? Everybody will be faster and you will not get more XMR. On ZEC, by contrast, the faster miners are closed source with a devfee; that's why Zawawa's miner is awaited.
I agree. But if there are any improvements, I have to update; otherwise I can just stay with pre2.
|
|
|
|
nerdralph
|
|
March 14, 2017, 02:04:54 PM |
|
Quote from zawawa: While I was trying to find a way to access the entire GDS, I realized I don't need that much memory for row counters. All I have to do is to squeeze four 10-bit row counters into a 32-bit integer. Let's see...
I doubt going over 8-bit row counters with 2^14 rows would give a material performance improvement. It also complicates the code and increases LDS use. Once you've finished the code with 10-bit row counters, I may try optimizing it for 8. Using SLC or GLC memory read/write may also give a small performance boost.
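For a rough sense of the footprint difference with 2^14 rows, here is a small sketch. It assumes whole counters are packed into 32-bit words with no counter straddling a word boundary; the thread does not spell out the actual layout, and `counter_bytes` is a hypothetical helper.

```c
#include <stdint.h>

/* Bytes needed to hold `rows` counters of `bits` bits each when whole
 * counters are packed into 32-bit words (no counter straddles a word). */
static uint32_t counter_bytes(uint32_t rows, uint32_t bits)
{
    uint32_t per_word = 32u / bits;                      /* counters per word */
    uint32_t words = (rows + per_word - 1u) / per_word;  /* round up */
    return words * 4u;
}
```

Under these assumptions, `counter_bytes(1u << 14, 8)` gives 16384 bytes and `counter_bytes(1u << 14, 10)` gives 21848 bytes, so the 10-bit layout costs roughly a third more memory.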
|
|
|
|
UnclWish
|
|
March 14, 2017, 02:27:38 PM |
|
Quote from Daniel0785: Great work! Any improvements for XMR in pre3?
Quote from lexele: Hi, it's already the fastest XMR miner (at least that I have found) and it's open source, so what's the point of being even faster? Everybody will be faster and you will not get more XMR. On ZEC, by contrast, the faster miners are closed source with a devfee; that's why Zawawa's miner is awaited.
You're not right. The fastest XMR miner at this moment is Claymore CryptoNote 9.7. Even with the 2% fee it gives about 5-10% more speed on a 280X.
|
|
|
|
R0mi
Full Member
Offline
Activity: 305
Merit: 148
Theranos Coin - IoT + micro-blood arrays = Moon!
|
|
March 14, 2017, 02:53:12 PM |
|
Quote from zawawa: I just uploaded an experimental version for Ellesmere (RX 470/480): https://github.com/zawawawa/gatelessgate/releases/tag/v0.1.3-pre3 With stock RX 480, it runs at around 250 sol/s on Linux and 230 sol/s on Windows. The miner should get full speed as soon as I figure out how to access the entire GDS without restrictions.
Great, and thanks for this. Any chance of adding a blake2 kernel? It seems it was included in SGMiner 5.1.1 and then disappeared.
|
Walton Chain CEO Mo' Bling: "Walton Chain will be the Qualcomm + Cisco in the blockchain industry, the ‘Google’ of the Blockchain." It's December 1999, do you know how your shitcoin holdings are doing? Magic 8 ball market analysis: www.doiownashitcoin.com
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 14, 2017, 08:00:24 PM |
|
Quote from nerdralph: Using SLC or GLC memory read/write may also give a small performance boost.
Could you elaborate on this? I recall you said Wolf was using them for his private miner, but I am not entirely sure how to use SLC/GLC bits for performance enhancements.
|
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 14, 2017, 08:38:30 PM |
|
It seems like RX 480 has plenty of GDS-related configuration registers.
https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
These are helper functions in the Linux kernel:
http://lxr.free-electrons.com/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L92
The size and offset of the GDS are set in the following function:
amdgpu_gds_reg_offset()
This must be it!

    /* GDS Base */
    amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
    amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
                             WRITE_DATA_DST_SEL(0)));
    amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].mem_base);
    amdgpu_ring_write(ring, 0);
    amdgpu_ring_write(ring, gds_base);

    /* GDS Size */
    amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
    amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
                             WRITE_DATA_DST_SEL(0)));
    amdgpu_ring_write(ring, amdgpu_gds_reg_offset[vmid].mem_size);
    amdgpu_ring_write(ring, 0);
    amdgpu_ring_write(ring, gds_size);
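To make the five-dword layout of each register write easier to see, here is a user-space sketch that builds the same packet into a plain array. The macro values are assumed to mirror the PM4 definitions in the amdgpu driver's vid.h and should be checked against the kernel tree; `emit_reg_write` is a hypothetical helper, and the register offset is left as a parameter because the real GDS register offsets are not given in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel's vid.h; verify before relying on them. */
#define PACKET_TYPE3             3u
#define PACKET3(op, n)           ((PACKET_TYPE3 << 30) |               \
                                  (((uint32_t)(op) & 0xFFu) << 8) |    \
                                  (((uint32_t)(n) & 0x3FFFu) << 16))
#define PACKET3_WRITE_DATA       0x37u
#define WRITE_DATA_DST_SEL(x)    ((uint32_t)(x) << 8)
#define WRITE_DATA_ENGINE_SEL(x) ((uint32_t)(x) << 30)

/* Hypothetical helper: fill out[0..4] with one WRITE_DATA packet that
 * stores `value` into the register at dword offset `reg`.
 * Returns the number of dwords written. */
static size_t emit_reg_write(uint32_t *out, uint32_t reg, uint32_t value)
{
    out[0] = PACKET3(PACKET3_WRITE_DATA, 3);  /* header: 4 payload dwords, count field is 3 */
    out[1] = WRITE_DATA_ENGINE_SEL(0) | WRITE_DATA_DST_SEL(0);
    out[2] = reg;    /* destination register offset (low dword) */
    out[3] = 0;      /* high dword of the destination, unused here */
    out[4] = value;  /* e.g. gds_base or gds_size */
    return 5;
}
```

Calling it once with the base register and once with the size register would reproduce the two blocks quoted above, assuming the ring accepts the same dwords.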
|
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 14, 2017, 08:44:23 PM |
|
Quote from Daniel0785: Great work! Any improvements for XMR in pre3?
Not yet, not yet. I plan to optimize the hell out of other kernels once I'm done with Equihash. The GCN inline assembly is ridiculously powerful, I'm telling you.
|
|
|
|
zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
March 14, 2017, 09:26:16 PM |
|
GDS-related parameters for the compute kernel are stored here... Good stuff.

    struct amdgpu_job {
        struct amd_sched_job    base;
        struct amdgpu_device    *adev;
        struct amdgpu_vm        *vm;
        struct amdgpu_ring      *ring;
        struct amdgpu_sync      sync;
        struct amdgpu_ib        *ibs;
        struct dma_fence        *fence; /* the hw fence */
        uint32_t                preamble_status;
        uint32_t                num_ibs;
        void                    *owner;
        uint64_t                fence_ctx; /* the fence_context this job uses */
        bool                    vm_needs_flush;
        unsigned                vm_id;
        uint64_t                vm_pd_addr;
        uint32_t                gds_base, gds_size;
        uint32_t                gws_base, gws_size;
        uint32_t                oa_base, oa_size;

        /* user fence handling */
        uint64_t                uf_addr;
        uint64_t                uf_sequence;
    };
|
|
|
|
lexele
|
|
March 14, 2017, 11:06:44 PM |
|
Hi, tested pre3 on RX 470 4G: 240 H/s (1230/2025, modded BIOS)
|
|
|
|
lexele
|
|
March 14, 2017, 11:09:11 PM |
|
Quote from Daniel0785: Great work! Any improvements for XMR in pre3?
Quote from lexele: Hi, it's already the fastest XMR miner (at least that I have found) and it's open source, so what's the point of being even faster? Everybody will be faster and you will not get more XMR. On ZEC, by contrast, the faster miners are closed source with a devfee; that's why Zawawa's miner is awaited.
Quote from UnclWish: You're not right. The fastest XMR miner at this moment is Claymore CryptoNote 9.7. Even with the 2% fee it gives about 5-10% more speed on a 280X.
On my 290X GG is 15% faster; on my RX 470 it's not much faster, but it is faster.
|
|
|
|
UnclWish
|
|
March 15, 2017, 12:01:20 AM |
|
Quote from Daniel0785: Great work! Any improvements for XMR in pre3?
Quote from lexele: Hi, it's already the fastest XMR miner (at least that I have found) and it's open source, so what's the point of being even faster? Everybody will be faster and you will not get more XMR. On ZEC, by contrast, the faster miners are closed source with a devfee; that's why Zawawa's miner is awaited.
Quote from UnclWish: You're not right. The fastest XMR miner at this moment is Claymore CryptoNote 9.7. Even with the 2% fee it gives about 5-10% more speed on a 280X.
Quote from lexele: On my 290X GG is 15% faster; on my RX 470 it's not much faster, but it is faster.
But on the 280X Claymore is faster. That means GG is not the fastest miner at this moment. Maybe the author will optimize something else and it will become the best :)
|
|
|
|
|