Lonnegan64
Jr. Member
Offline
Activity: 37
Merit: 5
|
|
June 22, 2018, 06:33:38 AM |
|
When I try the latest version on an AMD Epyc (basically four Ryzen dice on a package) I get the following error:

Thread 30 successfully bound to CPU 30
Allocated 2MB Cached Large Page Scratchpad Buffer for CPU 30 of NUMA node 3 at: 0000021b3c200000
Starting CPU Mining thread 31, affinity: CPU 31
Thread 31 successfully bound to CPU 31
GetNumaProcessorNode failed for cpu 32, error code: 87
Retrying with no NUMA
Allocated 2MB Cached Large Page Scratchpad Buffer at: 0000021b3c400000
Connecting to mining pool support.ipbc.io:17777 ...
Devfee is 1.5%

That's strange because I only defined threads for the CPUs 0 to 31 in the config file. Why does the miner try to access a CPU (core) 32, which is not present? Apart from that the miner starts mining and has better hashrate than XMR-stak with Bittube: 5200 H/s vs 4900 H/s.
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
June 22, 2018, 06:40:47 PM Last edit: June 22, 2018, 06:59:41 PM by JCE-Miner |
|
Impressive processor!
And yes, I have IPBC-specific assembly for Ryzen/Threadripper, so JCE is expected to be faster. That's why the binary is so big: it contains optimizations for all possible combinations.
I'm looking into the ghost CPU 32 bug. Your log is very helpful since, as you might expect, I don't own an Epyc myself. I'm fixing the JSON regression too.
Edit: both bugs are fixed. The CPU 32 bug was due to an overflow in my CPU counter. It's somewhat lucky that the rest of the code still worked. The biggest thread flood I had tested so far was my Ryzen (12 logical CPUs) plus the five double-mem GPUs of my rig (2x 5 GPU), 22 threads in total.
Now rebuilding as version 0.29e; 0.29d will be removed.
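For illustration only, here is a minimal Python sketch of the kind of bounds check that keeps a miner from querying a CPU index one past the end, as happened with the ghost CPU 32 (Windows error code 87 is ERROR_INVALID_PARAMETER). This is not JCE's actual code, which is not shown in this thread; the function name `split_valid_cpus` and the shape of the input are assumptions.

```python
import os

def split_valid_cpus(requested, logical_cpus=None):
    """Split requested CPU indices into (valid, invalid) lists.

    Hypothetical helper: 'requested' stands for the CPU indices a
    config file asks for; it is not a real JCE data structure.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    valid = [c for c in requested if 0 <= c < logical_cpus]
    invalid = [c for c in requested if not 0 <= c < logical_cpus]
    return valid, invalid

# On a 32-thread Epyc, CPUs 0..31 exist; index 32 does not.
valid, invalid = split_valid_cpus(range(33), logical_cpus=32)
print(invalid)
```

Rejecting out-of-range indices before any per-CPU NUMA lookup turns a confusing runtime error into a clear configuration message.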
|
|
|
|
UnclWish
|
|
June 22, 2018, 09:50:25 PM |
|
Wikipedia has no info about L1/L2 cache on Epyc, only L3. Does it have L1/L2 cache or not? Just curious...
P.S. 2 MB of L3 cache per thread is already too low... AMD could add more L3 cache on Epyc, especially if L1/L2 cache is absent...
|
|
|
|
s0ftcorn
Newbie
Offline
Activity: 70
Merit: 0
|
|
June 22, 2018, 10:12:07 PM |
|
L1 -> Level 1
L2 -> Level 2
L3 -> Level 3
As the levels get higher, the cache is slower but larger. Maybe AMD is using some clever technique so L1/L2 caches are more shared than they are already, so mentioning it is useless. But an absent L1 or L2 cache would have a huge impact on performance.
|
|
|
|
UnclWish
|
|
June 22, 2018, 11:03:51 PM |
|
I know that. No need to explain what L1/L2/L3 mean and what they do... I just looked at the wiki and there is no info about L1/L2 on Epyc... It just says "N/A"...
|
|
|
|
4ward
Member
Offline
Activity: 473
Merit: 18
|
|
June 23, 2018, 05:47:08 AM |
|
Not sure which Wikipedia you are looking at, but on https://en.wikipedia.org/wiki/Epyc you can see the L2 cache, or even better, on https://en.wikichip.org/wiki/amd/epyc you have even more information.
And if you use logic, there is no way a CPU will have L3 cache and not have L2/L1, since in that case L3 becomes L1...
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
June 23, 2018, 10:13:27 AM |
|
0.29e available
Fix 32+ CPU support
Fix JSON syntax
If you have fewer than 32 CPUs and don't use JSON output, there is no need to update.
|
|
|
|
UnclWish
|
|
June 23, 2018, 12:14:48 PM |
|
But neither of your links lists the L1 cache... I know that L1/L2 must exist, I just wanted to know the amount...
|
|
|
|
4ward
Member
Offline
Activity: 473
Merit: 18
|
|
June 23, 2018, 01:37:33 PM |
|
Then you can make the small effort of clicking on a specific CPU and checking the stats... like here: https://en.wikichip.org/wiki/amd/epyc/7351p
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
June 23, 2018, 06:47:22 PM |
|
GPU version done. Now entering the polishing and test phase.
A prototype with no autoconfig will be released soon. There will be only a Windows 64-bit version for GPU+CPU, but CPU-only releases will continue on all platforms.
Can you give me the perf ratio loss when you switch from CN-v7 to CN-Heavy? I mean, on other miners. My RX goes down from 528 to 355, which is pretty disappointing; I expected the Heavy hashrate to be similar.
|
|
|
|
UnclWish
|
|
June 23, 2018, 08:31:57 PM |
|
Heavy needs twice the amount of video memory, and intensity is usually half of what it is in CN-v7.
On a 4 GB 270X, max speed on CN-v7 with Claymore 11.3 is 550 h/s (manually modded BIOS), intensity 460 (-h 460 -dmem 1). Heavy on SRB Miner with intensity 26-27 got 400-420 h/s and many HW errors.
Heavy is a very strange algo... On an RX 580 8 GB, Heavy gives about 10% more speed than CN-v7. But on 4 GB cards Heavy gives less speed. It looks like Heavy can give the same or even better speed only with 8 GB+ cards.
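To put rough numbers on the memory point: assuming the usual scratchpad sizes of 2 MB per hash for CN-v7 and 4 MB per hash for CN-Heavy, the same video memory holds half as many parallel hashes on Heavy. The helper below is only an illustrative sketch; `max_parallel_hashes` and the 3800 MB usable-memory figure are hypothetical and not taken from any miner.

```python
# Assumed per-hash scratchpad sizes in MB.
SCRATCHPAD_MB = {"cn-v7": 2, "cn-heavy": 4}

def max_parallel_hashes(usable_vram_mb, algo):
    """Upper bound on hashes computed in parallel for a given usable VRAM."""
    return usable_vram_mb // SCRATCHPAD_MB[algo]

# Hypothetical card with about 3800 MB of allocatable memory.
for algo in SCRATCHPAD_MB:
    print(algo, max_parallel_hashes(3800, algo))
```

This only bounds how many hashes fit in memory at once; the actual hashrate also depends on memory bandwidth and timings.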
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
June 24, 2018, 06:52:25 AM |
|
I tested on a 2 GB card; my best one is a 4 GB. I'll take a look to check whether I can get the same perf on CN-v7 and Heavy when I use the whole 4 GB. Thanks for the reply.
|
|
|
|
maxfunky
|
|
June 24, 2018, 06:56:50 AM |
|
Hi! Is there support or an option for a SOCKS5 proxy (like Tor) in the config? Something similar to various versions of cpuminer: -x SOCKS5://127.0.0.1:9050
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
June 24, 2018, 09:09:15 AM |
|
Nope, it must mine connected directly to the real Internet on a real pool. No stratum or proxy or SOCKS support. This is the exact same rule as for Claymore's netcode; it proved to be right, so I do the same.
|
|
|
|
Lonnegan64
Jr. Member
Offline
Activity: 37
Merit: 5
|
|
June 24, 2018, 02:07:34 PM |
|
In reply to UnclWish's question about L1/L2 cache on Epyc: my AMD Epyc 7351P has 64 MB of shared L3 cache, in addition to 16x 512 KB of L2 cache, one per core. http://www.cpu-world.com/CPUs/Zen/AMD-EPYC%207351P.html
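As a quick sanity check of the "2 MB of L3 per thread" figure from earlier in the thread, here is the arithmetic with the 7351P numbers above. The 32-thread count is an assumption based on two threads per core, which matches CPUs 0 to 31 in the log.

```python
l3_mb = 64                        # shared L3 cache
cores = 16
threads = cores * 2               # assumes SMT with 2 threads per core
l2_total_mb = cores * 512 / 1024  # 16 x 512 KB of L2

print(l3_mb / threads)            # L3 per thread, in MB
print(l2_total_mb)                # total L2, in MB
```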
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
June 25, 2018, 05:28:49 AM |
|
I was testing an optimization on CN-Heavy that gave a negligible perf increase. Then I backtested on previous algos and GPUs to look for regressions.
And, I don't know why, but on Pitcairn the perf has skyrocketed. My 7870 now mines at 540 on v7, on par with my 7950, and far above any other miner, even Claymore 9.7. The gain on Tahiti is smaller: I jumped from 540 to 545.
|
|
|
|
UnclWish
|
|
June 25, 2018, 10:56:54 AM |
|
That's good news... I'm waiting to test it...
|
|
|
|
syncro2017
Newbie
Offline
Activity: 81
Merit: 0
|
|
June 25, 2018, 01:13:34 PM |
|
Great job man, please post the GPU version for testing. I have a variety of RX 550/570/580 2GB/4GB/8GB cards I would happily test with, and I can provide comparison results against other miners.
|
|
|
|
JCE-Miner (OP)
Member
Offline
Activity: 350
Merit: 22
|
|
June 25, 2018, 08:01:43 PM Last edit: June 25, 2018, 08:16:57 PM by JCE-Miner |
|
Thanks for the tip bro, I tested on my one and only 4GB card, an RX560 Sapphire, and... yes, it mines Heavy faster than CN-v7 when I push the memory to max allocation (~3.8G): 380 against 360 (which is a bad score, but that card has very bad memory).
I'm doing the final tests, then I'll release a prototype. As I said, it will be single-algo (meaning all GPUs and CPUs on the same pool, same algo) with no GPU autoconfig yet. The GPU fee will be 0.9% (same as the future complete release), with no change for CPU. If you mine with both, it will do a fair linear adjustment.

22:11:57 | Hashrate CPU Thread 0: 61.29 h/s
22:11:57 | Hashrate CPU Thread 1: 2.93 h/s
22:11:57 | Hashrate CPU Thread 2: 2.93 h/s
22:11:57 | Hashrate CPU Thread 3: 61.65 h/s
22:11:57 | Hashrate CPU Thread 4: 61.81 h/s
22:11:57 | Hashrate CPU Thread 5: 2.94 h/s
22:11:57 | Hashrate CPU Thread 6: 2.94 h/s
22:11:57 | Hashrate CPU Thread 7: 61.68 h/s
22:11:57 | Hashrate GPU Thread 8: 174.52 h/s
22:11:57 | Hashrate GPU Thread 9: 174.04 h/s
22:11:57 | Hashrate GPU Thread 10: 167.87 h/s
22:11:57 | Hashrate GPU Thread 11: 167.84 h/s
22:11:57 | Hashrate GPU Thread 12: 190.95 h/s
22:11:57 | Hashrate GPU Thread 13: 191.54 h/s
22:11:57 | Total: 1324.86 h/s - Max: 1324.86 h/s
22:12:03 | GPU Thread 10 Lane 357 finds a Share, value 16650
22:12:03 | Accepted by the pool.

That's from my Ryzen 1600 (4 cached threads + 4 uncached) + 3x RX560 rig, only the last card (threads 12+13) being a 4G.
All cards use dual-mem, and JCE GPU has been developed to use dual-mem on every card, even the small 1G ones. At least, my implementation is always a lot faster with dual-mem than without, even on my old 7790 or 7850.
The "Lane" thing means the Nth parallel compute has found the share. It's stable in memory: during a mining session, the same Lane of the same thread uses the same memory cells. So if you always get bad shares from the same Lane, it's a good way to check whether your memory is broken. Yes, JCE also has an indirect video-memtest feature.
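The Lane-based memory check described above could be automated roughly as follows. This is a sketch under assumptions: the share line format is copied from the log in this post, but the wording of a rejection line ("Rejected by the pool.") is hypothetical, since no rejected share appears in the thread, and `count_bad_lanes` is an illustrative name rather than a JCE feature.

```python
import re
from collections import Counter

# Matches the share line format shown in the log above.
SHARE_RE = re.compile(r"GPU Thread (\d+) Lane (\d+) finds a Share")

def count_bad_lanes(log_lines):
    """Count rejected shares per (thread, lane) pair."""
    bad = Counter()
    pending = None  # (thread, lane) of the share awaiting a pool verdict
    for line in log_lines:
        match = SHARE_RE.search(line)
        if match:
            pending = (int(match.group(1)), int(match.group(2)))
        elif pending is not None and "Rejected" in line:  # assumed wording
            bad[pending] += 1
            pending = None
        elif "Accepted by the pool" in line:
            pending = None
    return bad

sample = [
    "22:12:03 | GPU Thread 10 Lane 357 finds a Share, value 16650",
    "22:12:03 | Accepted by the pool.",
    "22:13:10 | GPU Thread 12 Lane 41 finds a Share, value 20011",
    "22:13:10 | Rejected by the pool.",  # hypothetical line
    "22:15:42 | GPU Thread 12 Lane 41 finds a Share, value 18200",
    "22:15:42 | Rejected by the pool.",  # hypothetical line
]
print(count_bad_lanes(sample).most_common(1))
```

A lane that keeps showing up at the top of this tally across a session would be the one to suspect of sitting on faulty memory cells.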
|
|
|
|
UnclWish
|
|
June 25, 2018, 10:30:34 PM |
|
Will the new version allow running several instances? For example, if I want to mine a different algo on the CPU than on the GPU...
|
|
|
|
|