zawawa (OP)
Sr. Member
Offline
Activity: 728
Merit: 304
Miner Developer
|
|
February 06, 2017, 06:03:55 PM |
|
The ROCm OpenCL driver doesn't seem to support inline GCN assembly and I have to use HCC after all. Grrrrrrrrrr...
Yeah, their OpenCL implementation is pretty barebones... the current release was the first, and it was just a developer beta. It will probably be a while before OpenCL matures on that platform.
It makes perfect sense that AMD's commitment to OpenCL seems half-hearted if AMD wants to promote HCC over OpenCL in the long run. I'm not entirely sure if I want to rewrite the kernel, but their ROCm stuff looks pretty neat...
Yeah, but HCC does not make much sense for mining algorithms unless you find a speedup by utilizing inter-GPU memory and data sharing between GPU cores on multi-GPU systems. It could be interesting to explore for more advanced algos like ethash and equihash.
Well, the only real reason I would be interested in HCC is that I would be able to use inline GCN assembly. If I can convert ROCm (hsaco) binaries into OpenCL binaries, I wouldn't even need HCC. I am running some experiments, and the results are pretty good so far. In fact, I am almost done replacing one of the kernels with an LLVM-generated substitute. It would be such a dirty hack, but if it works well, who cares?
|
Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
|
|
|
hawkfish007
|
|
February 06, 2017, 06:34:18 PM |
|
Very nice, zawawa. I appreciate that you are doing this for the community. What is the difference between platforms 0 and 1? Platform 0 seems to get a higher hash rate (~239) from a 4 GB RX 480 than platform 1 (~229).
|
|
|
|
woodaxe
Member
Offline
Activity: 129
Merit: 10
|
|
February 06, 2017, 07:13:02 PM |
|
I must be thick; I cannot get the bat file to run on dwarfpool for Ether.
This is my bat file:
@echo off
@set GPU_FORCE_64BIT_PTR 0
@set GPU_MAX_HEAP_SIZE 100
@set GPU_USE_SYNC_OBJECTS 1
@set GPU_MAX_ALLOC_PERCENT 100
@set GPU_SINGLE_ALLOC_PERCENT 100
gatelessgate.exe --gpu-platform 1 -k ethash -o stratum+tcp://eth-eu.dwarfpool.com:8008 u mywalletaddress --xintensity 4620 --worksize 192 --gpu-threads 2 --no-extranonce
pause
|
|
|
|
zawawa (OP)
|
|
February 06, 2017, 07:16:31 PM |
|
Very nice, zawawa. I appreciate that you are doing this for the community. What is the difference between platforms 0 and 1? Platform 0 seems to get a higher hash rate (~239) from a 4 GB RX 480 than platform 1 (~229).
clinfo should give you a pretty good idea.
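One quick way to see which driver sits behind each platform index is to pull the platform names out of clinfo's output. This is only a minimal sketch: it assumes clinfo is on PATH, that it prints one "Platform Name" line per platform (the exact layout differs between clinfo builds, and some repeat the line in the device section), and that the miner's --gpu-platform index follows the same order. The helper names are made up for illustration.

```python
import re
import subprocess


def platform_names(clinfo_text):
    """Return platform names in the order they first appear in clinfo output."""
    names = []
    for line in clinfo_text.splitlines():
        # Accepts both "Platform Name: AMD ..." and "Platform Name    AMD ..."
        match = re.match(r"\s*Platform Name:?\s+(.+?)\s*$", line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


def read_clinfo():
    """Run clinfo and return its text output (assumes clinfo is installed)."""
    return subprocess.run(["clinfo"], capture_output=True, text=True).stdout


def describe_platforms(clinfo_text):
    """Format one 'platform N: name' line per detected platform."""
    return [f"platform {i}: {name}" for i, name in enumerate(platform_names(clinfo_text))]
```

Calling describe_platforms(read_clinfo()) on the rig should list each platform next to its index, which usually makes it clear whether 0 and 1 are two different driver stacks.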
|
|
|
|
zawawa (OP)
|
|
February 07, 2017, 03:47:54 AM |
|
|
|
|
|
zawawa (OP)
|
|
February 07, 2017, 08:50:10 AM |
|
I must be thick; I cannot get the bat file to run on dwarfpool for Ether.
This is my bat file:
@echo off
@set GPU_FORCE_64BIT_PTR 0
@set GPU_MAX_HEAP_SIZE 100
@set GPU_USE_SYNC_OBJECTS 1
@set GPU_MAX_ALLOC_PERCENT 100
@set GPU_SINGLE_ALLOC_PERCENT 100
gatelessgate.exe --gpu-platform 1 -k ethash -o stratum+tcp://eth-eu.dwarfpool.com:8008 u mywalletaddress --xintensity 4620 --worksize 192 --gpu-threads 2 --no-extranonce
pause
ETH mining hasn't been fully tested yet. I will take a look when I get a chance.
|
|
|
|
elgi76
Member
Offline
Activity: 124
Merit: 10
|
|
February 07, 2017, 09:16:34 AM |
|
I must be thick; I cannot get the bat file to run on dwarfpool for Ether.
This is my bat file:
@echo off
@set GPU_FORCE_64BIT_PTR 0
@set GPU_MAX_HEAP_SIZE 100
@set GPU_USE_SYNC_OBJECTS 1
@set GPU_MAX_ALLOC_PERCENT 100
@set GPU_SINGLE_ALLOC_PERCENT 100
gatelessgate.exe --gpu-platform 1 -k ethash -o stratum+tcp://eth-eu.dwarfpool.com:8008 u mywalletaddress --xintensity 4620 --worksize 192 --gpu-threads 2 --no-extranonce
pause
Maybe -u and not just u. Also, I don't see -p. This works fine for me on miningpoolhub:
gatelessgate.exe --api-listen -k ethash -o stratum+ssl://hub.miningpoolhub.com:17020 -u <login.worker> -p x --gpu-threads 1 --worksize 512 --xintensity 2048
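For comparison, here is a rough sketch of the same launch written out with every option carrying its dash and the environment variables given as real name/value pairs. One thing that may also matter in the bat file: cmd.exe only assigns a variable with "set NAME=value"; "set NAME value" without the "=" does not set anything. The flags and values below are simply copied from the posts above, the "x" password is borrowed from the miningpoolhub example, and none of this has been tested against dwarfpool. The function names are made up for illustration.

```python
import os
import subprocess

# Same variables as in the bat file, expressed as name/value pairs
# (the equivalent of "set NAME=value" in cmd.exe).
GPU_ENV = {
    "GPU_FORCE_64BIT_PTR": "0",
    "GPU_MAX_HEAP_SIZE": "100",
    "GPU_USE_SYNC_OBJECTS": "1",
    "GPU_MAX_ALLOC_PERCENT": "100",
    "GPU_SINGLE_ALLOC_PERCENT": "100",
}


def build_command(pool_url, user, password="x"):
    """Build the miner command line; note -u and -p both carry a dash."""
    return [
        "gatelessgate.exe",
        "--gpu-platform", "1",
        "-k", "ethash",
        "-o", pool_url,
        "-u", user,
        "-p", password,  # assumption: the pool accepts a dummy password
        "--xintensity", "4620",
        "--worksize", "192",
        "--gpu-threads", "2",
        "--no-extranonce",
    ]


def launch(pool_url, user, password="x"):
    """Start the miner with the GPU variables added to the current environment."""
    env = dict(os.environ, **GPU_ENV)
    return subprocess.run(build_command(pool_url, user, password), env=env)
```

Usage would be something like launch("stratum+tcp://eth-eu.dwarfpool.com:8008", "mywalletaddress"), run from the folder that contains gatelessgate.exe.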
|
|
|
|
zawawa (OP)
|
|
February 07, 2017, 09:31:48 AM |
|
I finally tracked down the last bug and was able to confirm that LLVM/Clang can be used to generate binaries for AMD's proprietary OpenCL drivers. Since LLVM is only capable of generating ROCm binaries, I had to insert a thin ABI translation layer at the beginning of the kernel.
At this point, I don't think it's difficult to rewrite LLVM and make it compatible with CLRadeonExtender and have it directly generate OpenCL binaries. This would be huge for open-source mining as LLVM has a variety of neat features such as inline GCN assembly and limiting the number of registers. More importantly, we wouldn't have to rely on AMD's flaky drivers for code generation any more!
|
|
|
|
m1n1ngP4d4w4n
Full Member
Offline
Activity: 224
Merit: 100
CryptoLearner
|
|
February 07, 2017, 02:44:41 PM |
|
I finally tracked down the last bug and was able to confirm that LLVM/Clang can be used to generate binaries for AMD's proprietary OpenCL drivers. Since LLVM is only capable of generating ROCm binaries, I had to insert a thin ABI translation layer at the beginning of the kernel.
At this point, I don't think it's difficult to rewrite LLVM and make it compatible with CLRadeonExtender and have it directly generate OpenCL binaries. This would be huge for open-source mining as LLVM has a variety of neat features such as inline GCN assembly and limiting the number of registers. More importantly, we wouldn't have to rely on AMD's flaky drivers for code generation any more!
This would indeed be a revolution. AMD has plagued us with bad drivers for years. Keep up the good work, man; it's impressive.
|
|
|
|
IOTUSA
|
|
February 07, 2017, 03:52:16 PM |
|
Does anyone here have the 'Eco BIOS' for R9 Fury and RX 480 cards? I don't like using WattMan or other software to get power draw down and would prefer to flash a BIOS for the lowest possible power consumption at or close to stock clocks.
|
|
|
|
xeridea
|
|
February 07, 2017, 05:00:34 PM |
|
Does anyone here have the 'Eco BIOS' for R9 Fury and RX 480 cards? I don't like using WattMan or other software to get power draw down and would prefer to flash a BIOS for the lowest possible power consumption at or close to stock clocks.
I just use the miner to set voltage and clocks. It's really easy and flexible, since different algorithms have different optimal clocks and undervolts. The Eco BIOS isn't undervolted; it just has lower clocks (lower volts, but still stock). You can easily undervolt another 50-100 mV or more and save tons of power. I use CM, but SGM/GG supports all that stuff, right?
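To put a rough number on what an extra 50-100 mV buys, dynamic power in CMOS logic scales approximately with voltage squared times clock. The sketch below applies only that rule of thumb: it ignores leakage, memory, fans, and VRM losses, and the 1100 mV starting point is a made-up example value, not a measured stock voltage for any card.

```python
def dynamic_power_ratio(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of dynamic power after a voltage/clock change (P ~ V^2 * f)."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


STOCK_MV = 1100  # hypothetical starting core voltage in millivolts

for undervolt_mv in (50, 100):
    ratio = dynamic_power_ratio(STOCK_MV - undervolt_mv, STOCK_MV)
    saving = 100 * (1 - ratio)
    print(f"-{undervolt_mv} mV at the same clock: about {saving:.0f}% less dynamic power")
```

Under those assumptions the saving comes out at roughly 9% for 50 mV and 17% for 100 mV at unchanged clocks; real cards will differ.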
|
Profitability over time charts for many GPUs - http://xeridea.us/charts BTC: bc1qr2xwjwfmjn43zhrlp6pn7vwdjrjnv5z0anhjhn LTC: LXDm6sR4dkyqtEWfUbPumMnVEiUFQvxSbZ Eth: 0x44cCe2cf90C8FEE4C9e4338Ae7049913D4F6fC24
|
|
|
IOTUSA
|
|
February 07, 2017, 05:35:21 PM |
|
Does anyone here have the 'Eco BIOS' for R9 Fury and RX 480 cards? I don't like using WattMan or other software to get power draw down and would prefer to flash a BIOS for the lowest possible power consumption at or close to stock clocks.
I just use the miner to set voltage and clocks. It's really easy and flexible, since different algorithms have different optimal clocks and undervolts. The Eco BIOS isn't undervolted; it just has lower clocks (lower volts, but still stock). You can easily undervolt another 50-100 mV or more and save tons of power. I use CM, but SGM/GG supports all that stuff, right?
I am mostly looking to undervolt so I can run Fury/480 cards at low power and squeeze 6 cards onto a 1300 W PSU, so I'd have to load it at the BIOS level, no?
|
|
|
|
laik2
|
|
February 07, 2017, 05:43:14 PM |
|
Does anyone here have the 'Eco BIOS' for R9 Fury and RX 480 cards? I don't like using WattMan or other software to get power draw down and would prefer to flash a BIOS for the lowest possible power consumption at or close to stock clocks.
I just use the miner to set voltage and clocks. It's really easy and flexible, since different algorithms have different optimal clocks and undervolts. The Eco BIOS isn't undervolted; it just has lower clocks (lower volts, but still stock). You can easily undervolt another 50-100 mV or more and save tons of power. I use CM, but SGM/GG supports all that stuff, right?
Wrong. GG/SGM basically supports what the driver supports. If you use fglrx (ADL) and it supports undervolting, you can do it, but if you are using amdgpu/amdgpu-pro, there's no vddc/vddci support (yet).
|
|
|
|
nerdralph
|
|
February 07, 2017, 06:33:51 PM |
|
I finally tracked down the last bug and was able to confirm that LLVM/Clang can be used to generate binaries for AMD's proprietary OpenCL drivers. Since LLVM is only capable of generating ROCm binaries, I had to insert a thin ABI translation layer at the beginning of the kernel.
At this point, I don't think it's difficult to rewrite LLVM and make it compatible with CLRadeonExtender and have it directly generate OpenCL binaries. This would be huge for open-source mining as LLVM has a variety of neat features such as inline GCN assembly and limiting the number of registers. More importantly, we wouldn't have to rely on AMD's flaky drivers for code generation any more!
AMD's drivers embed LLVM, so "flakiness" in the drivers is often related to problems with the version of LLVM they included. So using Clang/LLVM directly doesn't completely bypass the problems, but it certainly does give you more control.
|
|
|
|
nerdralph
|
|
February 07, 2017, 06:46:30 PM |
|
I finally tracked down the last bug and was able to confirm that LLVM/Clang can be used to generate binaries for AMD's proprietary OpenCL drivers. Since LLVM is only capable of generating ROCm binaries, I had to insert a thin ABI translation layer at the beginning of the kernel.
According to a source at AMD, the ROCm ABI is the same as the HSAIL/brig ABI. When I used CodeXL to generate a kernel binary on a system with Crimson 16.10 drivers, the binary contained HSAIL as well as brig symbols. Since AMD's drivers can handle HSA/brig kernels, I would think they should also handle ROCm binaries.
|
|
|
|
cryptomined
|
|
February 07, 2017, 07:04:06 PM |
|
This runs great for Pascal Coin. Is there any way to get Claymore-type speeds for ETH or XMR with RX cards? Both seem to hash a bit less than with the corresponding Claymore miners... ETH, for example, gets 2 MH/s less on each RX 470.
Thanks
|
|
|
|
xeridea
|
|
February 07, 2017, 07:06:53 PM |
|
Does anyone here have the 'Eco BIOS' for R9 Fury and RX 480 cards? I don't like using WattMan or other software to get power draw down and would prefer to flash a BIOS for the lowest possible power consumption at or close to stock clocks.
I just use the miner to set voltage and clocks. It's really easy and flexible, since different algorithms have different optimal clocks and undervolts. The Eco BIOS isn't undervolted; it just has lower clocks (lower volts, but still stock). You can easily undervolt another 50-100 mV or more and save tons of power. I use CM, but SGM/GG supports all that stuff, right?
Wrong. GG/SGM basically supports what the driver supports. If you use fglrx (ADL) and it supports undervolting, you can do it, but if you are using amdgpu/amdgpu-pro, there's no vddc/vddci support (yet).
Oh, yeah, I use Windows. I know there are some limitations for Polaris under Linux, but Fury should be fine, I think.
|
|
|
|
xeridea
|
|
February 07, 2017, 07:11:57 PM |
|
Does anyone here have the 'Eco BIOS' for R9 Fury and RX 480 cards? I don't like using WattMan or other software to get power draw down and would prefer to flash a BIOS for the lowest possible power consumption at or close to stock clocks.
I just use the miner to set voltage and clocks. It's really easy and flexible, since different algorithms have different optimal clocks and undervolts. The Eco BIOS isn't undervolted; it just has lower clocks (lower volts, but still stock). You can easily undervolt another 50-100 mV or more and save tons of power. I use CM, but SGM/GG supports all that stuff, right?
I am mostly looking to undervolt so I can run Fury/480 cards at low power and squeeze 6 cards onto a 1300 W PSU, so I'd have to load it at the BIOS level, no?
Not sure about getting it to work on Linux, but you can undervolt 6 470s at the stock 1260 MHz clock and run ~930 W at the wall while mining Zec (CM 11.1, Platinum PSU, ~1020 mV); 480s would be ~1000 W or so at the wall. Underclocking allows substantial undervolting but affects Zec speed. I dual mine Eth/Dcr right now, so it's not as much of an issue. If you can't change volts in Linux on 480s, then in Windows use atiwinflash to pull the BIOS, edit the voltages with polarisbioseditor, then reflash. Dunno about Linux flashing/editing tools. What cards do you have? If there is a dual BIOS, there is usually a "silent" BIOS.
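As a back-of-the-envelope check on the 1300 W question: a PSU's rating refers to DC output, while the figures above are measured at the wall, so the DC load is the wall draw times the PSU's efficiency. The sketch below assumes a flat 92% efficiency as a stand-in for a Platinum unit (real efficiency varies with load and mains voltage) and uses the ~930 W and ~1000 W wall figures quoted above.

```python
def dc_load_watts(wall_watts, efficiency):
    """DC power delivered to the rig for a given draw at the wall."""
    return wall_watts * efficiency


def psu_utilization(wall_watts, psu_rating_watts, efficiency=0.92):
    """Fraction of the PSU's rated DC output in use (efficiency is an assumed value)."""
    return dc_load_watts(wall_watts, efficiency) / psu_rating_watts


PSU_RATING_W = 1300  # the PSU size mentioned in the question

for label, wall_watts in (("6x 470", 930), ("6x 480", 1000)):
    load = dc_load_watts(wall_watts, 0.92)
    used = psu_utilization(wall_watts, PSU_RATING_W)
    print(f"{label}: ~{load:.0f} W DC, {used:.0%} of a {PSU_RATING_W} W PSU")
```

Under these assumptions both cases stay well below the PSU rating, which leaves headroom for the CPU, board, and risers.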
|
|
|
|
m0niker
Newbie
Offline
Activity: 39
Merit: 0
|
|
February 07, 2017, 08:19:55 PM |
|
Does anyone here have the 'Eco BIOS' for R9 Fury and RX 480 cards? I don't like using WattMan or other software to get power draw down and would prefer to flash a BIOS for the lowest possible power consumption at or close to stock clocks.
I just use the miner to set voltage and clocks. It's really easy and flexible, since different algorithms have different optimal clocks and undervolts. The Eco BIOS isn't undervolted; it just has lower clocks (lower volts, but still stock). You can easily undervolt another 50-100 mV or more and save tons of power. I use CM, but SGM/GG supports all that stuff, right?
I am mostly looking to undervolt so I can run Fury/480 cards at low power and squeeze 6 cards onto a 1300 W PSU, so I'd have to load it at the BIOS level, no?
Not sure about getting it to work on Linux, but you can undervolt 6 470s at the stock 1260 MHz clock and run ~930 W at the wall while mining Zec (CM 11.1, Platinum PSU, ~1020 mV); 480s would be ~1000 W or so at the wall. Underclocking allows substantial undervolting but affects Zec speed. I dual mine Eth/Dcr right now, so it's not as much of an issue. If you can't change volts in Linux on 480s, then in Windows use atiwinflash to pull the BIOS, edit the voltages with polarisbioseditor, then reflash. Dunno about Linux flashing/editing tools. What cards do you have? If there is a dual BIOS, there is usually a "silent" BIOS.
I'm running 6x 480s overclocked at 1050 W at the wall in most of my rigs. Overclocking is fun!
|
|
|
|
zawawa (OP)
|
|
February 07, 2017, 09:27:55 PM |
|
I finally tracked down the last bug and was able to confirm that LLVM/Clang can be used to generate binaries for AMD's proprietary OpenCL drivers. Since LLVM is only capable of generating ROCm binaries, I had to insert a thin ABI translation layer at the beginning of the kernel.
According to a source at AMD, the ROCm ABI is the same as the HSAIL/brig ABI. When I used CodeXL to generate a kernel binary on a system with Crimson 16.10 drivers, the binary contained HSAIL as well as brig symbols. Since AMD's drivers can handle HSA/brig kernels, I would think they should also handle ROCm binaries.
What I found out was that AMD's drivers add extra hidden arguments to kernels, which completely messes up the order of arguments. Therefore, modifications to LLVM are necessary in order to use it for AMD's proprietary drivers.
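To illustrate why extra hidden arguments break things, here is a toy model of a kernel argument buffer. Everything in it is an assumption made for illustration: the example kernel signature, the three 8-byte hidden arguments, the idea that they are placed in front of the user arguments, and the natural-alignment rule. The real count, names, sizes, and placement depend on the driver.

```python
def kernarg_offsets(args):
    """Lay out (name, size) arguments in order, aligning each to its own size."""
    offsets = {}
    cursor = 0
    for name, size in args:
        cursor = (cursor + size - 1) // size * size  # round up to a multiple of size
        offsets[name] = cursor
        cursor += size
    return offsets


# Hypothetical user-visible kernel arguments: two pointers and a 32-bit value.
user_args = [("input", 8), ("output", 8), ("nonce", 4)]

# Hypothetical hidden arguments; placeholders, not the driver's actual list.
hidden_args = [("hidden_0", 8), ("hidden_1", 8), ("hidden_2", 8)]

compiled_layout = kernarg_offsets(user_args)              # what the compiled kernel expects
driver_layout = kernarg_offsets(hidden_args + user_args)  # what the driver actually passes

# Every user argument sits this many bytes later than the kernel expects.
shift = driver_layout["input"] - compiled_layout["input"]

for name, _ in user_args:
    print(f"{name}: kernel reads offset {compiled_layout[name]}, driver wrote offset {driver_layout[name]}")
```

In this toy model a kernel that reads its first pointer from offset 0 would pick up a hidden value instead, which is the kind of mismatch that either a translation prologue or a change in the compiler's argument layout has to correct.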
|
|
|
|
|