laik2
|
|
December 17, 2018, 11:29:17 PM |
|
I benched other algos. On v8 i'm on par with equal fees, within a ~1% margin of error, considering xmrig has 1% fees. So a well-compiled xmrig with 0 fees would be 1% better on normal CPUs. In extreme cases (very weak CPU or ultra-big multi-Xeon/EPYC...) i'm better.
No longer, i improved my assembly and i'm now 2% faster than xmrig, so even with fees deducted i'm still faster, but by 0.5%. I also got +1% on old non-AES CPUs, where i already provide +30% speed compared to other miners. Release planned tomorrow as the 0.33i.
Still no linux
|
|
|
|
JeffJohnsonTheNameYouKnow
Jr. Member
Offline
Activity: 41
Merit: 2
|
|
December 18, 2018, 03:15:47 AM |
|
Have you compared your v8 vs TeamRed? On Vegas?
|
|
|
|
pbfarmer
Member
Offline
Activity: 340
Merit: 29
|
|
December 18, 2018, 03:32:35 AM Last edit: December 18, 2018, 03:47:09 AM by pbfarmer |
|
The effective hashrate of recent JCE is back to >98%, close to 99% efficiency, as before. And as all current versions of miners so far (xmrig, srb, teamred...).
I'm polishing my fix for the regression on CN-Fast on Vega. Looks like my auto-hybrid, introduced in the experimental -x, -y, -z and -sync versions, works badly in such a case, so i introduced an override, --legacy, to force the good old non-hybrid mode. I benched it to be either a little faster or slower than b12 depending on the cards and algo, so it's hard to tell if it was an improvement or not. This way i'll let the user choose.
edit: @cryptoprofitswitcher: online is the b13, a partial release with just the .exe. Inside, you may try the undocumented parameter --legacy to check if it restores the speed on your dual Vega.
@other: it contains a very small optim for Vega, and for non-heavy algos, if you want to take a look. The --legacy may also give a little extra perf, but it's ignored for heavy-class algos; it's just for CN and CN-Light.
if it solves the problem, i'll make it a full documented release.
Are you sure hybrid is not affecting heavy algos? If i run 32q heavy on my 8 polaris rig, I can run my ryzen 1600 @ ~525H/s (cnv2). If I run any 33 heavy, I can only get ~400H/s on the cpu while GPUs are mining. Given that 33b13 gives me an extra ~100H/s per GPU vs 32q, I'll take the tradeoff if it's necessary, but just checking that something hasn't been overlooked. EDIT: --legacy doesn't help, but still just want to make sure something hasn't been inadvertently changed.
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
December 18, 2018, 09:49:40 AM |
|
The 0.33i CPU is a major version, the +2% increase is a huge gap for CPU, where we're all close to hardware max. But this is for v8 only, no gain on other algos. So of course i'll do a Linux release for it.
TeamRed: this is the best v8 miner for Vega, sure. I congratulated them on day one, they did a wonderful job. If you're looking for the best, most efficient v8 miner for your Vega, that's TeamRed. Mine is the best for the niche case of older GPUs they don't support (older than the Vega and RX). And CPU. You may still look at the raw performance of my miner, where i may be better in some cases, and i happen to be better than xmrig and other reference miners, but looking at power consumption, TeamRed is the best.
CPU impact: this is expected. Bad, but expected. The recent 0.33+ versions and their hybrid mode, as the name implies, involve the CPU in helping the GPU more than before. Because a CPU is basically an AES asic. The CPU usage and power consumption are negligible, less than 2%, but it causes a lot of cache invalidation.
When you mine with CPU and GPU in the same JCE instance, each thread knows what the others do and gets the CPU at the right time, resulting in a negligible performance impact on the CPU part when the GPUs mine. But if you use two separate miners, the CPU one being another JCE or another miner, there's no way to sync, and the CPU impact when mining a cache-intensive algo like CN is high. For normal use (Internet surfing, gaming...) it's still negligible.
However, i did some extra tests and observed that, when --legacy is used, the CPU job is of zero help and should be skipped. Skipping it will make the 33 with --legacy behave very much like the 0.32 on the CPU side. The current --legacy makes it legacy on the GPU side only. While i was aware of that problem from the beginning, and solved it when dual mining CPU+GPU with the same JCE, i overlooked the case of dual mining with two separate miners. But there's a way to fix it, and i'll do it for b14. For people who don't use --legacy, or don't mine with CPU, or mine with CPU using the same JCE instance, there will be no change.
|
|
|
|
tipo70
Newbie
Offline
Activity: 15
Merit: 0
|
|
December 18, 2018, 12:38:42 PM |
|
Which version do you think is best for 550-560-470-570?
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
December 18, 2018, 01:02:18 PM |
|
i'd say the last one, 0.33b13
my autoconfig aims for safety; for max perf, use the manual config, the github page provides some examples. https://github.com/jceminer/cn_gpu_miner
but each card may be different (overclocking, memory...) so take time to tune the values. only three are relevant: multi_hash (a multiple of 16), alpha (64 or 128) and beta (8 or 16).
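The three tunable values and their constraints can be turned into a list of candidates to try one by one. The following is only a sketch: the `multi_hash` range (400 to 464) is a made-up example, not a recommendation for any card, and `candidate_configs` is a hypothetical helper, not part of the miner.

```python
from itertools import product

# Allowed values as stated in the post above.
ALPHAS = (64, 128)
BETAS = (8, 16)
MULTI_HASH_STEP = 16  # multi_hash must be a multiple of 16


def candidate_configs(mh_min, mh_max):
    """Yield every (multi_hash, alpha, beta) combination in the given range.

    mh_min and mh_max are hypothetical bounds chosen by the user;
    the lower bound is rounded up to the next multiple of 16.
    """
    first = -(-mh_min // MULTI_HASH_STEP) * MULTI_HASH_STEP
    for mh, alpha, beta in product(
        range(first, mh_max + 1, MULTI_HASH_STEP), ALPHAS, BETAS
    ):
        yield {"multi_hash": mh, "alpha": alpha, "beta": beta}


if __name__ == "__main__":
    for cfg in candidate_configs(400, 464):
        print(cfg)
```

Each printed combination would then be benched for a few minutes on the actual card, since the best values differ with overclocking and memory.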
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
December 18, 2018, 06:47:06 PM |
|
Still no linux
Online is the 0.33i CPU Windows and Linux, 32 and 64-bits
major release with a big +2% speed on v8, making my miner the best in all cases on CPU, even with fees deducted. The only case where i don't provide the best speed is the rare CPUs mining Heavy/Haven with an exact even number of threads and cache (like 4 threads on 16M cache on the Ryzen 1500). In such a case it's a tie with xmrig. Otherwise, i get the best speed. On Bittube i'm like +30% faster even on AES CPUs. And on non-AES, still about +35% faster (assuming someone still uses them; i admit i myself have shut down my core2 rigs due to the current low coin prices).
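The "exact fit" case mentioned above (4 threads on 16M cache) can be checked with simple arithmetic. This is a rough sketch that assumes the commonly documented scratchpad sizes (2 MB for regular CryptoNight, 1 MB for Light, 4 MB for Heavy) and looks at L3 size only; real tuning also depends on core count and cache layout. `cache_fit` is a hypothetical helper.

```python
# Scratchpad size per hash in MB (assumed, commonly documented values).
SCRATCHPAD_MB = {"cn": 2, "cn-light": 1, "cn-heavy": 4}


def cache_fit(l3_mb, algo):
    """Return (threads, leftover_mb): how many scratchpads fit in L3
    and how much cache is left unused."""
    return divmod(l3_mb, SCRATCHPAD_MB[algo])


if __name__ == "__main__":
    threads, leftover = cache_fit(16, "cn-heavy")
    print(f"cn-heavy on 16 MB L3: {threads} threads, {leftover} MB left over")
```

A leftover of zero corresponds to the exact-fit situation where, per the post, the result is a tie with xmrig.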
|
|
|
|
GKumaran
Member
Offline
Activity: 204
Merit: 10
|
|
December 19, 2018, 02:15:21 AM |
|
Thx for the new version.
Your thread optimisation guide for the 1mb algo is great for cpu, is it possible to tweak the gpu thread in a similar way to get higher performance on 1mb algos?
|
|
|
|
impynick
Jr. Member
Offline
Activity: 77
Merit: 6
|
|
December 19, 2018, 03:01:08 AM |
|
on my i7 im getting 295 on xmr stak vs 286 on JCE. How can i improve these numbers? I do have hyperthreading on.
I'm unsure on how to config this? Currently use xmr stak with the following:
{ "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 0 },
{ "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 2 },
{ "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 4 },
{ "low_power_mode" : false, "no_prefetch" : true, "asm" : "auto", "affine_to_cpu" : 6 },
this is on v8....and compiled xmr stak with 0 dev fee....can you compete? If so I'll happily make the move if its worth it.
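The xmr-stak entries above pin one thread to every second logical CPU (0, 2, 4, 6), which is a usual way to run one thread per physical core when hyperthreading is on. A small sketch that generates such entries, assuming logical CPUs 2k and 2k+1 are siblings of the same core (this depends on how the OS enumerates CPUs and is not guaranteed); `one_thread_per_core` is a hypothetical helper, and the keys are copied from the post.

```python
import json


def one_thread_per_core(physical_cores, stride=2):
    """Build one thread entry per physical core.

    stride=2 assumes two logical CPUs per core, numbered adjacently.
    """
    return [
        {
            "low_power_mode": False,
            "no_prefetch": True,
            "asm": "auto",
            "affine_to_cpu": core * stride,
        }
        for core in range(physical_cores)
    ]


if __name__ == "__main__":
    entries = one_thread_per_core(4)
    print(",\n".join(json.dumps(entry) for entry in entries))
```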
|
|
|
|
laik2
|
|
December 19, 2018, 08:54:30 AM |
|
Actually I was talking about the GPU version not having a Linux port.
|
|
|
|
PIOUPIOU99
Copper Member
Member
Offline
Activity: 293
Merit: 11
|
|
December 19, 2018, 10:10:47 AM |
|
Still no linux
Online is the 0.33i CPU Windows and Linux, 32 and 64-bits
major release with a big +2% speed on v8, making my miner the best in all cases on CPU, even fees deduced.
for my light config v8 0.33e 0.33i
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
December 19, 2018, 11:12:13 AM |
|
Hi all,
Linux GPU: unlikely. I'm a niche miner (CPU and older GPUs), and adding Linux would make it a niche of a niche while costing a lot of dev time. The Win GPU is already like 15% of my fees but 90% of the support; the Linux version would be like 1% of my users for 95% of the support. I cannot afford this. Sometimes i don't look at the market and do things for fun, like supporting the HD6000, but that remains an acceptable dev time. Linux GPU wouldn't. Btw try TeamRed on Linux for v8 mining, it burns like fire.
@PIOUPIOU99: yeah thanks, my new CPU miner also burns like fire.
Speed on Intel: i don't even have any big Intel CPU, i'm all AMD, same for the GPUs (i've zero nVidia). But that's ok, i'll do some theoretical optimizations for big Intel CPUs too. Can you tell me what exact CPU you have? Maybe a good config can close the gap with xmrstak. I know i must beat it by more than 1.5% to compensate for the devfee. That's true in most cases, but yeah, maybe not on the i7.
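The break-even remark can be made concrete with the i7 figures quoted earlier in the thread (295 h/s on a zero-fee xmr-stak build, 286 h/s on JCE). This is a sketch that assumes a 1.5% CPU dev fee, inferred from the "more than 1.5%" statement rather than taken from any documentation; both function names are hypothetical.

```python
def net_hashrate(raw_hs, dev_fee):
    """Hashrate that actually works for the user after the dev fee."""
    return raw_hs * (1.0 - dev_fee)


def required_speedup(own_fee, other_fee=0.0):
    """Raw speed ratio needed to match a competitor once fees are deducted."""
    return (1.0 - other_fee) / (1.0 - own_fee)


if __name__ == "__main__":
    jce_fee = 0.015  # assumed CPU dev fee
    print(f"JCE net:      {net_hashrate(286, jce_fee):.2f} h/s")
    print(f"xmr-stak net: {net_hashrate(295, 0.0):.2f} h/s")
    print(f"Break-even raw speedup: {(required_speedup(jce_fee) - 1) * 100:.2f}%")
```

With these inputs the break-even raw advantage comes out slightly above 1.5%, which is consistent with the wording "more than 1.5%".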
|
|
|
|
Iamtutut
|
|
December 19, 2018, 11:55:28 AM |
|
Mining bittube with the latest GPU version: (4X RX574: 1240/2070; 1240/2070; 1240/2040; 1240/2035).
Starting GPU Thread 0, on GPU 0
Created OpenCL Context for GPU 0 at 000001cf487080a0
Created OpenCL Thread 0 Command-Queue for GPU 0 at 000001cf48720ca0
Scratchpad Allocation success for OpenCL Thread 0
Allocating big 1856MB scratchpad for OpenCL Thread 0...
Compiling kernels of OpenCL Thread 0...
Kernels of OpenCL Thread 0 compiled.
Starting GPU Thread 1, on GPU 0
Created OpenCL Thread 1 Command-Queue for GPU 0 at 000001cf4d55d740
Scratchpad Allocation success for OpenCL Thread 1
Allocating big 1856MB scratchpad for OpenCL Thread 1...
Compiling kernels of OpenCL Thread 1...
Kernels of OpenCL Thread 1 compiled.
Starting GPU Thread 2, on GPU 1
Created OpenCL Context for GPU 1 at 000001cf487839f0
Created OpenCL Thread 2 Command-Queue for GPU 1 at 000001cf4d55db30
Scratchpad Allocation success for OpenCL Thread 2
Allocating big 1856MB scratchpad for OpenCL Thread 2...
Compiling kernels of OpenCL Thread 2...
Kernels of OpenCL Thread 2 compiled.
Starting GPU Thread 3, on GPU 1
Created OpenCL Thread 3 Command-Queue for GPU 1 at 000001cf4d55d200
Scratchpad Allocation success for OpenCL Thread 3
Allocating big 1856MB scratchpad for OpenCL Thread 3...
Compiling kernels of OpenCL Thread 3...
Kernels of OpenCL Thread 3 compiled.
Starting GPU Thread 4, on GPU 2
Created OpenCL Context for GPU 2 at 000001cf487844f0
Created OpenCL Thread 4 Command-Queue for GPU 2 at 000001cf4d55d4a0
Scratchpad Allocation success for OpenCL Thread 4
Allocating big 1856MB scratchpad for OpenCL Thread 4...
Compiling kernels of OpenCL Thread 4...
Kernels of OpenCL Thread 4 compiled.
Starting GPU Thread 5, on GPU 2
Created OpenCL Thread 5 Command-Queue for GPU 2 at 000001cf588aa3a0
Scratchpad Allocation success for OpenCL Thread 5
Allocating big 1856MB scratchpad for OpenCL Thread 5...
Compiling kernels of OpenCL Thread 5...
Kernels of OpenCL Thread 5 compiled.
Starting GPU Thread 6, on GPU 3
Created OpenCL Context for GPU 3 at 000001cf48783470
Created OpenCL Thread 6 Command-Queue for GPU 3 at 000001cf588ab210
Scratchpad Allocation success for OpenCL Thread 6
Allocating big 1856MB scratchpad for OpenCL Thread 6...
Compiling kernels of OpenCL Thread 6...
Kernels of OpenCL Thread 6 compiled.
Starting GPU Thread 7, on GPU 3
Created OpenCL Thread 7 Command-Queue for GPU 3 at 000001cf588aa640
Scratchpad Allocation success for OpenCL Thread 7
Allocating big 1856MB scratchpad for OpenCL Thread 7...
Compiling kernels of OpenCL Thread 7...
Kernels of OpenCL Thread 7 compiled.
Keep-Alive enabled
Devfee for GPU is 0.9%
12:39:39 | Miner uptime 4:05:07
12:39:39 | Effective net hashrate 3591.50 h/s
12:39:39 | Devices results - Shares Accepted/Ignored/Rejected - Net Hashrate
12:39:39 | * GPU 0 - 98/0/0 - 891.67 h/s
12:39:39 | * GPU 1 - 87/0/0 - 817.14 h/s
12:39:39 | * GPU 2 - 101/0/0 - 976.39 h/s
12:39:39 | * GPU 3 - 91/0/0 - 906.30 h/s
12:40:56 | Hashrate GPU Thread 0: 462.00 h/s
12:40:56 | Hashrate GPU Thread 1: 461.32 h/s - Total GPU 0: 923.31 h/s
12:40:56 | Hashrate GPU Thread 2: 440.52 h/s
12:40:56 | Hashrate GPU Thread 3: 444.89 h/s - Total GPU 1: 885.41 h/s
12:40:56 | Hashrate GPU Thread 4: 449.96 h/s
12:40:56 | Hashrate GPU Thread 5: 449.84 h/s - Total GPU 2: 899.79 h/s
12:40:56 | Hashrate GPU Thread 6: 464.51 h/s
12:40:56 | Hashrate GPU Thread 7: 464.82 h/s - Total GPU 3: 929.33 h/s
12:40:56 | Total: 3637.83 h/s - Max: 3649.71 h/s
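The log above gives both per-thread hashrates and an effective net hashrate, so the two can be compared. The sketch below parses the relevant lines with regular expressions written for this excerpt only (they are not an official log format specification) and computes the ratio of effective to instantaneous speed. Note the comparison is rough: the effective figure covers about four hours of uptime, while the per-thread figures are a single snapshot.

```python
import re

# Lines copied from the log in this post.
LOG = """\
12:39:39 | Effective net hashrate 3591.50 h/s
12:40:56 | Hashrate GPU Thread 0: 462.00 h/s
12:40:56 | Hashrate GPU Thread 1: 461.32 h/s - Total GPU 0: 923.31 h/s
12:40:56 | Hashrate GPU Thread 2: 440.52 h/s
12:40:56 | Hashrate GPU Thread 3: 444.89 h/s - Total GPU 1: 885.41 h/s
12:40:56 | Hashrate GPU Thread 4: 449.96 h/s
12:40:56 | Hashrate GPU Thread 5: 449.84 h/s - Total GPU 2: 899.79 h/s
12:40:56 | Hashrate GPU Thread 6: 464.51 h/s
12:40:56 | Hashrate GPU Thread 7: 464.82 h/s - Total GPU 3: 929.33 h/s
12:40:56 | Total: 3637.83 h/s - Max: 3649.71 h/s
"""

THREAD_RE = re.compile(r"Hashrate GPU Thread (\d+): ([\d.]+) h/s")
EFFECTIVE_RE = re.compile(r"Effective net hashrate ([\d.]+) h/s")


def summarize(log):
    """Return (sum of per-thread hashrates, effective hashrate, their ratio)."""
    threads = {int(idx): float(hs) for idx, hs in THREAD_RE.findall(log)}
    effective = float(EFFECTIVE_RE.search(log).group(1))
    instant_total = sum(threads.values())
    return instant_total, effective, effective / instant_total


if __name__ == "__main__":
    instant_total, effective, ratio = summarize(LOG)
    print(f"Sum of threads: {instant_total:.2f} h/s")
    print(f"Effective net:  {effective:.2f} h/s")
    print(f"Ratio:          {ratio * 100:.2f}%")
```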
|
|
|
|
sergneo
Newbie
Offline
Activity: 33
Merit: 0
|
|
December 19, 2018, 06:49:16 PM |
|
jce_cn_cpu_miner.windows.033i -1 h/s in comparison with the previous version. Where is the optimization on V8 ? No improvement seen. CPU Xeon E5440 , Core2Quad Q9400.
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
December 19, 2018, 08:07:41 PM |
|
Right, let me rephrase the comment explicitly: the gain applies to all CPUs where i lacked extra performance versus xmrig. Of course my new v8 assembly is for modern AES CPUs like Zen; the one for non-AES Core2 is already ultra-optimized, and the 33i gives no extra perf compared to 33h.
Also, i observed a little regression too, you're right. It was hard to understand how there could be a side effect, but i found it: it's a cache allocation problem. It will be fixed in 33j, which i already planned to release with an optim for modern Intel CPUs and the UPlexa fork.
|
|
|
|
laik2
|
|
December 19, 2018, 08:09:37 PM |
|
I do use TeamRed, but a competitive Linux miner is always welcome.
|
|
|
|
|
pp55
Newbie
Offline
Activity: 41
Merit: 0
|
|
December 19, 2018, 11:30:03 PM |
|
Hi, JCE!
AMD FX-8320E, Turbo boost OFF, CNv8
0.33i CPU
03:41:01 | Hashrate CPU Thread 0: 39.68 h/s
03:41:01 | Hashrate CPU Thread 1: 41.26 h/s
03:41:01 | Hashrate CPU Thread 2: 41.06 h/s
03:41:01 | Hashrate CPU Thread 3: 41.17 h/s
03:41:01 | Hashrate CPU Thread 4: 40.56 h/s
03:41:01 | Hashrate CPU Thread 5: 40.86 h/s
03:41:01 | Hashrate CPU Thread 6: 50.80 h/s
03:41:01 | Total: 295.35 h/s - Max: 296.09 h/s
0.33g CPU
23:39:08 | Hashrate CPU Thread 0: 40.86 h/s
23:39:08 | Hashrate CPU Thread 1: 41.28 h/s
23:39:08 | Hashrate CPU Thread 2: 41.38 h/s
23:39:08 | Hashrate CPU Thread 3: 41.46 h/s
23:39:08 | Hashrate CPU Thread 4: 41.06 h/s
23:39:08 | Hashrate CPU Thread 5: 41.38 h/s
23:39:08 | Hashrate CPU Thread 6: 52.71 h/s
23:39:08 | Total: 300.09 h/s - Max: 301.51 h/s
Analyzing Processors topology...
AMD FX-8320E Eight-Core Processor
Assembly codename: generic_aes_avx
SSE2 : Yes
SSE3 : Yes
SSE4 : Yes
AES : Yes
AVX : Yes
AVX2 : No
Auto-configuration, selected CPUs will be highlighted...
Found CPU 0, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 1
  L3 Cache: 8192 KB, shared with CPU 1, 2, 3, 4, 5, 6, 7
Found CPU 1, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 0
  L3 Cache: 8192 KB, shared with CPU 0, 2, 3, 4, 5, 6, 7
Found CPU 2, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 3
  L3 Cache: 8192 KB, shared with CPU 0, 1, 3, 4, 5, 6, 7
Found CPU 3, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 2
  L3 Cache: 8192 KB, shared with CPU 0, 1, 2, 4, 5, 6, 7
Found CPU 4, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 5
  L3 Cache: 8192 KB, shared with CPU 0, 1, 2, 3, 5, 6, 7
Found CPU 5, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 4
  L3 Cache: 8192 KB, shared with CPU 0, 1, 2, 3, 4, 6, 7
Found CPU 6, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 7
  L3 Cache: 8192 KB, shared with CPU 0, 1, 2, 3, 4, 5, 7
Found CPU 7, with:
  L1 Cache: 16 KB
  L2 Cache: 2048 KB, shared with CPU 6
  L3 Cache: 8192 KB, shared with CPU 0, 1, 2, 3, 4, 5, 6
HTTP Local Server on port 3334
Preparing 7 Mining Threads...
+-- Thread 0 config ------------------------+
| Run on CPU: 0                             |
| Use cache: yes                            |
| Multi-hash: no                            |
| Assembly module: generic_aes_avx          |
+-------------------------------------------+
+-- Thread 1 config ------------------------+
| Run on CPU: 1                             |
| Use cache: yes                            |
| Multi-hash: no                            |
| Assembly module: generic_aes_avx          |
+-------------------------------------------+
+-- Thread 2 config ------------------------+
| Run on CPU: 2                             |
| Use cache: yes                            |
| Multi-hash: no                            |
| Assembly module: generic_aes_avx          |
+-------------------------------------------+
+-- Thread 3 config ------------------------+
| Run on CPU: 3                             |
| Use cache: yes                            |
| Multi-hash: no                            |
| Assembly module: generic_aes_avx          |
+-------------------------------------------+
+-- Thread 4 config ------------------------+
| Run on CPU: 4                             |
| Use cache: yes                            |
| Multi-hash: no                            |
| Assembly module: generic_aes_avx          |
+-------------------------------------------+
+-- Thread 5 config ------------------------+
| Run on CPU: 5                             |
| Use cache: yes                            |
| Multi-hash: no                            |
| Assembly module: generic_aes_avx          |
+-------------------------------------------+
+-- Thread 6 config ------------------------+
| Run on CPU: 6                             |
| Use cache: yes                            |
| Multi-hash: no                            |
| Assembly module: generic_aes_avx          |
+-------------------------------------------+
Cryptonight Variation: Cryptonight V8 fork of Oct-2018
Low intensity.
Starting CPU Thread 0, affinity: CPU 0
Thread 0 successfully bound to CPU 0
Allocated shared Large Page at: 0000014709e00000
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 0 of NUMA node 0 at: 000001470a000000
Starting CPU Thread 1, affinity: CPU 1
Thread 1 successfully bound to CPU 1
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 1 of NUMA node 0 at: 000001470a200000
Starting CPU Thread 2, affinity: CPU 2
Thread 2 successfully bound to CPU 2
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 2 of NUMA node 0 at: 000001470a400000
Starting CPU Thread 3, affinity: CPU 3
Thread 3 successfully bound to CPU 3
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 3 of NUMA node 0 at: 000001470a600000
Starting CPU Thread 4, affinity: CPU 4
Thread 4 successfully bound to CPU 4
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 4 of NUMA node 0 at: 000001470a800000
Starting CPU Thread 5, affinity: CPU 5
Thread 5 successfully bound to CPU 5
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 5 of NUMA node 0 at: 000001470aa00000
Starting CPU Thread 6, affinity: CPU 6
Thread 6 successfully bound to CPU 6
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 6 of NUMA node 0 at: 000001470ac00000
15:59:58 | Monero (XMR/XMV) Mining session starts!
Both with --auto --archi vishera -t 7 --low in config
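To put a number on the difference between the two runs above, the totals and per-thread figures from the log can be compared as a percentage change. A minimal sketch using only the values posted here; `percent_change` is a hypothetical helper, and a single pair of snapshots is of course not a rigorous benchmark.

```python
# Per-thread hashrates (h/s) copied from the two logs in this post.
V033G = [40.86, 41.28, 41.38, 41.46, 41.06, 41.38, 52.71]
V033I = [39.68, 41.26, 41.06, 41.17, 40.56, 40.86, 50.80]


def percent_change(old, new):
    """Relative change from old to new, in percent (negative = slower)."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Totals as reported by the miner itself.
    print(f"Total: {percent_change(300.09, 295.35):+.2f}%")
    for idx, (old, new) in enumerate(zip(V033G, V033I)):
        print(f"Thread {idx}: {percent_change(old, new):+.2f}%")
```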
|
|
|
|
HardKano
Newbie
Offline
Activity: 76
Merit: 0
|
|
December 19, 2018, 11:31:23 PM |
|
Nice !
|
|
|
|
pp55
Newbie
Offline
Activity: 41
Merit: 0
|
|
December 19, 2018, 11:39:08 PM |
|
I mean, in my case old versions are better on CPU on CNv8.
And add the miner version to the log, pls.
|
|
|
|
|