ghostlander (OP)
Legendary
Offline
Activity: 1241
Merit: 1020
No surrender, no retreat, no regret.
|
|
February 20, 2016, 09:24:46 PM |
|
NSGminer v0.9.2 released with my NeoScrypt OpenCL kernel v7 and other enhancements.
1. All GCN based AMD Radeons get a hash rate increase of 20% to 100% depending on driver version. The difference between 14.6 and 15.7 drivers is less than 2% now, so R9 280X @ 1000MHz may deliver 500KH/s.
2. Performance of the older VLIW based AMD Radeons has simply doubled. HD6970 @ 925MHz delivers 255KH/s now.
3. Added support for the very old VLIW based AMD Radeons of HD4000 series. HD4870 @ 750MHz can do 60KH/s. Not very much, but what do you expect of a card 8 years old?
4. Added initial support for the NVIDIA hardware. Thanks to the Feathercoin community for their donation of 0.3 BTC spent on a GTX 750 Ti. Performance improved from 50KH/s to 185KH/s @ 1400MHz shaders. Older GeForce cards down to the very old 8000 series are also known to work.
5. NVIDIA Management Library (NVML) may be used to provide temperature and fan speed data. Copy nvml.dll from your driver distribution package to the miner's directory.
A good deal of work has been put into this release, so consider a donation. The addresses and download links are in the OP.
|
|
|
|
wrapperband0lite
Newbie
Offline
Activity: 33
Merit: 0
|
|
February 21, 2016, 10:42:52 AM |
|
@Ghostlander, thanks for all your work on NSGminer. Triple kudos
|
|
|
|
semajjames
|
|
February 21, 2016, 01:06:49 PM |
|
nice work @Ghostlander
thanks
does this miner work solo ??
|
|
|
|
ghostlander (OP)
|
|
February 21, 2016, 02:33:51 PM |
|
I don't say this often, but Ghostlander - really well done on the improvements to FastKDF and everything it calls. I know I gave the idea for the aligned copies/XORs, but your implementation is quite nice. I do wish you would use loops with #pragma unroll just a little more often, but your code is decently readable regardless. Extremely well done implementation on that. In fact, I wouldn't be surprised if, comparing only FastKDF, your implementation surpasses my own.
However, I must say, you still haven't optimized quite an important bit here... the main loop.
FastKDF was a major bottleneck in v6, so I had to fix it first. Catalysts above 14.7 lost the ability to align reads and writes properly on their own, so bitalign was the way to go. I know there are other places which need optimisations. That's for the next release.
Thanks for all your work Ghost, but just a heads up R9 Nano gets hardware errors with same settings from 0.9.0. BIG jumps on my other GPUs.
I wish I had a Nano or Fury for testing. I have added their ID as well as the Carrizo ID (the last AMD APU) to the kernel. Hope they work well with the default GCN settings. The ISA code looks good at least. Pull it from my GitHub and let me know.
nice work @Ghostlander
thanks
does this miner work solo ??
Of course it does. That's how I use it most of the time.
|
|
|
|
ghostlander (OP)
|
|
February 22, 2016, 10:32:40 AM |
|
iBeLink DM384M Dash Miner
If what they advertise is true, 384MH/s for 715W and $2098 in cash, say good-bye to X11 GPU mining. A reference R9 280X outputs 2MH/s for maybe 150W. If they produce enough of these ASICs for themselves, they can do a 51% attack on any X11 coin including Dash.
|
|
|
|
sp_
Legendary
Offline
Activity: 2954
Merit: 1087
Team Black developer
|
|
February 22, 2016, 10:58:31 AM Last edit: February 22, 2016, 11:09:23 AM by sp_ |
|
iBeLink DM384M Dash Miner
If what they advertise is true, 384MH/s for 715W and $2098 in cash, say good-bye to X11 GPU mining. A reference R9 280X outputs 2MH/s for maybe 150W. If they produce enough of these ASICs for themselves, they can do a 51% attack on any X11 coin including Dash.
The latest bins do 10MH/s on the reference 280X (nicehashminer).
DM384M Dash Miner: 0.537 MH/s per watt
R9 280X: 0.066 MH/s per watt (1/8 of the ASIC) (Tahiti, 2012)
750 Ti Maxwell: 0.075 MH/s per watt (1/7 of the ASIC) (Maxwell, 2014)
R9 Nano: 0.141 MH/s per watt (1/4 of the ASIC) (Fury, 2015)
NVIDIA Pascal (16nm): ... (1/2? of the ASIC, or perhaps the same speed) (2016?)
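The hash-per-watt figures above can be reproduced with a few lines of arithmetic. This is only a sketch using the numbers quoted in this thread: the 715W figure is the advertised one, and the 150W draw for the R9 280X is the rough estimate given earlier, not a measurement. The function and variable names are made up for illustration.

```python
def mh_per_watt(hashrate_mh: float, power_w: float) -> float:
    """Efficiency in MH/s per watt."""
    return hashrate_mh / power_w

# Figures quoted in the thread (advertised or estimated, not measured here).
asic = mh_per_watt(384, 715)            # iBeLink DM384M as advertised
r9_280x_old = mh_per_watt(2, 150)       # older X11 kernels, ~150W assumed
r9_280x_new = mh_per_watt(10, 150)      # latest bins, same ~150W assumed

# How many times more efficient the ASIC is than the GPU.
ratio_new = asic / r9_280x_new
ratio_old = asic / r9_280x_old

print(f"ASIC: {asic:.3f} MH/s per watt")
print(f"R9 280X (10MH/s): {r9_280x_new:.3f} MH/s per watt, ASIC is {ratio_new:.1f}x better")
print(f"R9 280X (2MH/s): {r9_280x_old:.3f} MH/s per watt, ASIC is {ratio_old:.1f}x better")
```

With the 10MH/s figure the ASIC comes out roughly 8 times more efficient, which matches the "1/8 of the ASIC" entry in the list.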
|
|
|
|
ghostlander (OP)
|
|
February 22, 2016, 12:57:33 PM |
|
iBeLink DM384M Dash Miner
If what they advertise is true, 384MH/s for 715W and $2098 in cash, say good-bye to X11 GPU mining. A reference R9 280X outputs 2MH/s for maybe 150W. If they produce enough of these ASICs for themselves, they can do a 51% attack on any X11 coin including Dash.
The latest bins do 10MH/s on the reference 280X (nicehashminer).
DM384M Dash Miner: 0.537 MH/s per watt
R9 280X: 0.066 MH/s per watt (1/8 of the ASIC) (Tahiti, 2012)
750 Ti Maxwell: 0.075 MH/s per watt (1/7 of the ASIC) (Maxwell, 2014)
R9 Nano: 0.141 MH/s per watt (1/4 of the ASIC) (Fury, 2015)
NVIDIA Pascal (16nm): ... (1/2? of the ASIC, or perhaps the same speed) (2016?)
10MH/s is much better, but still not enough to compete with these ASICs. There may be even faster ASICs by the time Pascal comes into production.
|
|
|
|
sp_
|
|
February 23, 2016, 01:10:19 PM |
|
10MH/s is much better, but still not enough to compete with these ASICs. There may be even faster ASICs by the time Pascal comes into production.
The (28nm) R9 Nano is already 1/2 the speed of the announced, unreleased ASIC if you measure hash/Watt. With a modded kernel you can reach 20MH/s per card. For $2000 you can buy many cards. Pascal (16nm) is rumoured to be 12 times faster and will be released in 2016. The announced ASIC is slow. It should be 2.3GH/s at 700W to beat the competition.
|
|
|
|
ghostlander (OP)
|
|
February 23, 2016, 01:58:00 PM |
|
10MH/s is much better, but still not enough to compete with these ASICs. There may be even faster ASICs by the time Pascal comes into production.
The (28nm) R9 Nano is already 1/2 the speed of the announced, unreleased ASIC if you measure hash/Watt. With a modded kernel you can reach 20MH/s per card. For $2000 you can buy many cards. Pascal (16nm) is rumoured to be 12 times faster and will be released in 2016. The announced ASIC is slow. It should be 2.3GH/s at 700W to beat the competition.
I'm sure this ASIC is of a very inexpensive design and production. Something like 150nm and multiproject wafers. Indeed it's overpriced. Any subsequent designs should be much more impressive.
|
|
|
|
KloNEM
Member
Offline
Activity: 182
Merit: 11
|
|
February 23, 2016, 04:03:09 PM |
|
cl_amd_media_ops is for bitalign/bytealign mostly which are not used in v6 directly. The compiler is supposed to take care of this, but it doesn't do well in the drivers newer than 14.7. It won't be an issue in the next release.
Hi ghostlander, thanks for the amazing job! You're right, that isn't an issue in the new 0.9.2 version anymore. Also, I removed "static" from the `neoscrypt.cl' source file (it appears there three times), and now the error "OpenCL does not support the 'static' storage class specifier" is gone. Great!
But with the --neoscrypt option, nsgminer exits very quickly: sometimes it lists the current config and then reports "Segmentation fault (core dumped)", sometimes the fault comes immediately. Unfortunately there is no dump file for further analysis, even with the --verbose option. The same nsgminer runs fine with --scrypt, or without an algorithm option (which means SHA-256). Well, the hash power is really ridiculous, but that's not only down to the OSS drivers; I'm also testing on a very weak IGP/APU (Radeon HD 7660D). If everything works with --neoscrypt there, I'll try more powerful discrete AMD cards too...
Finally, when compiling nsgminer from the git repo, I didn't add the AMD APP SDK; I guess it doesn't make sense with the OSS radeon.ko driver... or would there be some performance impact in this specific OSS case as well? Thanks again for your help and your work on nsgminer!
|
|
|
|
ghostlander (OP)
|
|
February 23, 2016, 05:12:26 PM |
|
cl_amd_media_ops is for bitalign/bytealign mostly which are not used in v6 directly. The compiler is supposed to take care of this, but it doesn't do well in the drivers newer than 14.7. It won't be an issue in the next release.
Hi ghostlander, thanks for the amazing job! You're right, that isn't an issue in the new 0.9.2 version anymore. Also, I removed "static" from the `neoscrypt.cl' source file (it appears there three times), and now the error "OpenCL does not support the 'static' storage class specifier" is gone. Great!
But with the --neoscrypt option, nsgminer exits very quickly: sometimes it lists the current config and then reports "Segmentation fault (core dumped)", sometimes the fault comes immediately. Unfortunately there is no dump file for further analysis, even with the --verbose option. The same nsgminer runs fine with --scrypt, or without an algorithm option (which means SHA-256). Well, the hash power is really ridiculous, but that's not only down to the OSS drivers; I'm also testing on a very weak IGP/APU (Radeon HD 7660D). If everything works with --neoscrypt there, I'll try more powerful discrete AMD cards too...
Finally, when compiling nsgminer from the git repo, I didn't add the AMD APP SDK; I guess it doesn't make sense with the OSS radeon.ko driver... or would there be some performance impact in this specific OSS case as well? Thanks again for your help and your work on nsgminer!
HD7660D is a VLIW4 design which should be capable of some 60KH/s @ 800MHz. I'm almost sure the OSS driver doesn't expose any GPU specific data to an OpenCL compiler, so you have to set it up manually. Add "#define VLIW 1" somewhere in the kernel beginning. If there is no support for cl_amd_media_ops, change it to "#define OLD_VLIW 1"
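For readers wondering what the kernel loses without cl_amd_media_ops: amd_bitalign treats two 32-bit words as one 64-bit value, shifts it right by 0 to 31 bits, and keeps the low 32 bits. The Python sketch below only illustrates that semantics; it is not code from neoscrypt.cl, and the helper names are made up for this example.

```python
MASK32 = 0xFFFFFFFF

def bitalign(hi: int, lo: int, shift: int) -> int:
    """Emulate amd_bitalign(hi, lo, shift):
    low 32 bits of ((hi << 32) | lo) >> (shift & 31)."""
    combined = ((hi & MASK32) << 32) | (lo & MASK32)
    return (combined >> (shift & 31)) & MASK32

def rotr32(x: int, n: int) -> int:
    """A 32-bit rotate right is bitalign with both inputs equal."""
    return bitalign(x, x, n)

# An unaligned 32-bit read straddling two words, 8 bits in:
print(hex(bitalign(0x11223344, 0xAABBCCDD, 8)))   # 0x44aabbcc
print(hex(rotr32(0x00000001, 1)))                 # 0x80000000
```

A driver that lacks the extension has to do the same thing with separate shifts and an OR, which is presumably what the OLD_VLIW code path falls back to.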
|
|
|
|
Klokan-NXT
Newbie
Offline
Activity: 3
Merit: 0
|
|
February 23, 2016, 07:42:41 PM |
|
HD7660D is a VLIW4 design which should be capable of some 60KH/s @ 800MHz. I'm almost sure the OSS driver doesn't expose any GPU specific data to an OpenCL compiler, so you have to set it up manually. Add "#define VLIW 1" somewhere in the kernel beginning. If there is no support for cl_amd_media_ops, change it to "#define OLD_VLIW 1"
You're right. When I added "#define VLIW 1" to `neoscrypt.cl', just after:
/* NeoScrypt(128, 2, 1) with Salsa20/20 and ChaCha20/20
 * Optimised for the AMD GCN, VLIW4 and VLIW5 architectures
 * v7, 20-Feb-2016 */
#define VLIW 1
I got this error:
[20:37:42] Probing for an alive pool
[20:37:43] The network difficulty has been set to 8307
[20:37:43] Stratum from pool 0 detected new block
[20:37:43] Error -11: Building Program (clBuildProgram)
[20:37:43] input.cl:100:26: warning: unknown OpenCL extension 'cl_amd_media_ops' - ignoring
input.cl:126:14: warning: implicit declaration of function 'amd_bitalign' is invalid in C99
unsupported call to function amd_bitalign in search
[20:37:43] Failed to init GPU thread 0, disabling device 0
[20:37:43] Restarting the GPU from the menu will not fix this.
[20:37:43] Try to restart the miner.
When I changed it in the same place to "#define OLD_VLIW 1", it dumped core again:
[20:40:07] Probing for an alive pool
[20:40:07] The network difficulty has been set to 8307
[20:40:07] Stratum from pool 0 detected new block
[20:40:07] Pool 0 is sending mismatched block contents to us (0 is not 1-1)
./nsgminer.sh: line 24: 23981 Segmentation fault (core dumped) /usr/local/bin/nsgminer --neoscrypt -o stratum+tcp://....
Could it be a problem with insufficient free memory, for example? (The HD7660D can allocate 256MB at most, if I'm not mistaken.)
|
|
|
|
ghostlander (OP)
|
|
February 23, 2016, 08:09:02 PM |
|
You're right. When I added "#define VLIW 1" to `neoscrypt.cl', just after:
/* NeoScrypt(128, 2, 1) with Salsa20/20 and ChaCha20/20
 * Optimised for the AMD GCN, VLIW4 and VLIW5 architectures
 * v7, 20-Feb-2016 */
#define VLIW 1
I got this error:
[20:37:42] Probing for an alive pool
[20:37:43] The network difficulty has been set to 8307
[20:37:43] Stratum from pool 0 detected new block
[20:37:43] Error -11: Building Program (clBuildProgram)
[20:37:43] input.cl:100:26: warning: unknown OpenCL extension 'cl_amd_media_ops' - ignoring
input.cl:126:14: warning: implicit declaration of function 'amd_bitalign' is invalid in C99
unsupported call to function amd_bitalign in search
[20:37:43] Failed to init GPU thread 0, disabling device 0
[20:37:43] Restarting the GPU from the menu will not fix this.
[20:37:43] Try to restart the miner.
When I changed it in the same place to "#define OLD_VLIW 1", it dumped core again:
[20:40:07] Probing for an alive pool
[20:40:07] The network difficulty has been set to 8307
[20:40:07] Stratum from pool 0 detected new block
[20:40:07] Pool 0 is sending mismatched block contents to us (0 is not 1-1)
./nsgminer.sh: line 24: 23981 Segmentation fault (core dumped) /usr/local/bin/nsgminer --neoscrypt -o stratum+tcp://....
Could it be a problem with insufficient free memory, for example? (The HD7660D can allocate 256MB at most, if I'm not mistaken.)
Yes, it doesn't seem to support bitalign. You can allocate more memory for your APU in the BIOS, 512MB or 1GB. It should work with -I 12 for 256MB.
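A rough way to see why -I 12 suits a 256MB allocation is sketched below. It rests on two assumptions not confirmed in this thread: that the miner launches 2^I threads for a given intensity I (the usual convention in cgminer-derived miners for scrypt-family algorithms), and that each thread needs one 32KiB scratchpad, which is what NeoScrypt(128, 2, 1) implies (128 entries of 256 bytes). The actual buffer layout in nsgminer may differ.

```python
# NeoScrypt(128, 2, 1): N = 128 entries, each 2 * r * 64 = 256 bytes for r = 2.
SCRATCHPAD_BYTES = 128 * 256  # 32 KiB per hash (assumed: one per GPU thread)

def scratch_mib(intensity: int) -> float:
    """Estimated scratchpad memory in MiB, assuming 2**intensity threads."""
    threads = 1 << intensity
    return threads * SCRATCHPAD_BYTES / (1024 * 1024)

for i in (11, 12, 13, 14):
    print(f"-I {i}: {scratch_mib(i):.0f} MiB")
```

Under these assumptions -I 12 needs about 128 MiB, which fits in 256MB with room left for everything else on the device, while -I 13 would already need the whole 256MB for scratchpads alone.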
|
|
|
|
fredeq
Legendary
Offline
Activity: 1537
Merit: 1005
|
|
February 25, 2016, 08:38:20 PM |
|
Impressive improvements, getting 470Khash on my crappy 15.11 drivers
|
|
|
|
Buratinos
|
|
February 29, 2016, 05:48:19 PM |
|
Hi guys, my HD 7750 produces 111KH/s on NeoScrypt. Is that normal? Catalyst 14.6 is installed.
|
|
|
|
barry8212
Newbie
Offline
Activity: 5
Merit: 0
|
|
March 02, 2016, 12:45:54 AM |
|
Hi,
Can someone please help me set it up on Ubuntu 14.04?
I'm not sure what commands to type, so please post them step by step. I appreciate the help. I managed to install it on my PC with an R9 280X and the hashrate is currently 515KH/s.
thanks
barry
|
|
|
|
ghostlander (OP)
|
|
March 02, 2016, 06:54:50 PM |
|
Hi guys, my HD 7750 produces 111KH/s on NeoScrypt. Is that normal? Catalyst 14.6 is installed.
That's expected for 512 shader units. About 20% of the performance of an R9 280X.
Hi,
Can someone please help me set it up on Ubuntu 14.04?
I'm not sure what commands to type, so please post them step by step. I appreciate the help. I managed to install it on my PC with an R9 280X and the hashrate is currently 515KH/s.
thanks
barry
git clone https://github.com/ghostlander/nsgminer.git
cd nsgminer
./autogen.sh
make
sudo make install
|
|
|
|
barry8212
|
March 02, 2016, 07:37:17 PM |
|
Hi,
Thanks for the code. Will check it out later tonight. Can you please assist me in downloading the correct driver version for MSI R9-280X Tahiti needed for the best performance on Windows 10 64bit. What is the best feathercoin pool to mine on?
Thanks again for the assistance.
Barry
|
|
|
|
ghostlander (OP)
|
|
March 02, 2016, 07:48:12 PM |
|
Hi,
Thanks for the code. Will check it out later tonight. Can you please assist me in downloading the correct driver version for MSI R9-280X Tahiti needed for the best performance on Windows 10 64bit. What is the best feathercoin pool to mine on?
Thanks again for the assistance.
Barry
Windows 10? Catalyst 15.7 probably, although it shouldn't make much difference there. The best pool is the one closest to your location with low fees and as little downtime as possible.
|
|
|
|
barry8212
|
March 02, 2016, 08:57:52 PM |
|
Hi,
Thanks for the code. Will check it out later tonight. Can you please assist me in downloading the correct driver version for MSI R9-280X Tahiti needed for the best performance on Windows 10 64bit. What is the best feathercoin pool to mine on?
Thanks again for the assistance.
Barry
Windows 10? Catalyst 15.7 probably, although it shouldn't make much difference there. The best pool is the one closest to your location with low fees and as little downtime as possible.
I have Crimson 16.1.1 installed. Attached is a screenshot of the error. Can you please assist?
https://www.dropbox.com/s/giv1zx6hfpwth1c/Miner.jpg?dl=0
thanks
|
|
|
|
|