18-Terahash
Newbie
Offline
Activity: 29
Merit: 0
|
|
March 30, 2019, 01:37:41 PM |
|
Dear dev, I really look forward to the function of the failover pool in pools.txt file!
Yeah! That would be great!!!
|
|
|
|
rednoW
Legendary
Offline
Activity: 1510
Merit: 1003
|
|
March 30, 2019, 03:03:12 PM |
|
Guys, if you are having issues with a Vega 64 LC, you can try a 1406/1130/905mV (880mV in fact) setup. It will give you 20.2 kH/s and rock-solid stability (provided you adjust the fan settings to keep the HBM cool). Yes, it is not as efficient as the 1290/960/825mV (800mV in fact) setup on an original Samsung-based Vega 56 (19.8 kH/s), but it is better than nothing ))
|
|
|
|
kerney666
Member
Offline
Activity: 658
Merit: 86
|
|
March 30, 2019, 03:54:26 PM |
|
Dear dev, I really look forward to the function of the failover pool in pools.txt file!
Yeah! That would be great!!!
Yeah, it's embarrassing we've managed to put that one off for so long. That is the one remaining thing before I think we'll drop the beta version marker. We'll get it done in the not-too-distant future, promise.
|
|
|
|
professorkuusamo
Newbie
Offline
Activity: 14
Merit: 0
|
|
March 30, 2019, 04:50:45 PM |
|
OK. I have been trying over several days to replicate the results from pbfarmer on cnv8_trtl, and I am unable to get the efficiency he reported. Any attempt to get close to his reported voltage levels of 805mV to 840mV resulted in random DEAD GPUs (stuck in enqueue) after 20-30 minutes of hashing, or a significant hashrate drop per GPU. Even at 870mV, I am still unable to get a stable system running. Dead GPUs and hashrate drops to 15 kH/s still occur after several tens of minutes of mining. And the weird thing is that the hashrate drops happen to Hynix mem GPUs only. Somehow the CN-TRTL algo is more taxing than CN_R? I am able to run CN_R without failures for a week and with lower voltage settings (850mV - 870mV). I can't seem to do this for CN_TRTL.
My settings are as below, but the hashrates aren't sustainable.
GPUs: Ref Vega 64 and Vega 56, reference bios
V64 cclk/memclk 1220/1100 @ 870mV L28+28 (Samsung mem) - 19.5 kH/s
V56 cclk/memclk 1220/940 @ 870mV L24+24 (Samsung mem) - 19.3 kH/s, (Hynix mem) - 18.7 kH/s
ATW power draw 190W per GPU
Adrenalin Driver 18.6.1
Kerney/Todd, any ideas what could possibly be wrong? Unoptimized CN_TRTL code?
I have the same situation: unstable only on trtl. I tried different drivers, nothing changed.
Hey guys! Well, unoptimized code isn't the issue, it's rather that it's optimized so much that it's pounding the gpu more than anything we've produced before, especially the memory subsystem. I'm not chasing efficiency to the same degree that you professional tuners are since my rig(s) are a combination of test and more serious mining. However, on my 8 x Vega 56 ref cards flashed to V64, win10, 18.6.1, clocks at 1408@900, 1100@900, I have zero stability issues. It's been mining for 14h straight now in the current run, and much longer before that as well. I've just stopped it to reconfigure things a few times.
However, on my Vega 64 Liquid Cooling in my dev workstation, CN-trtl is the first algo I've ever seen that kills that specific card but not my blower Vegas. It dies after 1-2h, also running at 1408@900, 1100@900. Effective clocks+voltages in hwinfo64 look very similar to the V56s. Right now, I'm doing tests with the single-threaded config support that we also added in 0.4.3. This means I'm running --cn_config=L56+0 instead of --cn_config=L28+28. On my V64 LC, hashrate drops from 19.6 kh/s to 18.8 kh/s, same efficiency. The hashrate is expected to drop a little, but this should be leaner on the gpu: it won't be going full throttle on all parts of the hardware at the same time. Theory vs practice is always a bitch though, I'll report back in a few hours on the results. Meanwhile, you're of course free to try the same trick: switch your problematic Vegas to L56+0 and see if it helps. I'd love to get some more data here.
The 56s with Samsung mem are stable and have worked since release 0.4.3. Crashes happen only in rigs with Hynix.
Got it, although I've gotten a number of reports from liquid V64s having issues as well. Regardless, does anything change for you if you try L56+0 on those Hynix V56s?
Hash drops to 14 and power to 150W. Unstable.
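For readers comparing the two settings mentioned above, here is a minimal Python sketch of how a value such as L28+28 or L56+0 could be read. It assumes the two numbers are per-thread intensities and that 0 means the second thread is unused, which is how the post describes the single-threaded mode. The names parse_cn_config, active_threads and total_intensity are hypothetical helpers for illustration only; this is not the miner's own parsing code, and the miner may accept formats this sketch rejects.

```python
import re
from typing import NamedTuple, Tuple


class CnConfig(NamedTuple):
    prefix: str                    # leading letter of the value, e.g. "L"
    intensities: Tuple[int, int]   # assumed per-thread intensities; 0 = thread unused


def parse_cn_config(value: str) -> CnConfig:
    """Split a value like 'L28+28' into its prefix and two numbers.

    Assumption: the format is <letter><int>+<int>, as seen in the thread.
    """
    match = re.fullmatch(r"([A-Za-z])(\d+)\+(\d+)", value)
    if match is None:
        raise ValueError(f"unrecognized cn_config value: {value!r}")
    return CnConfig(match.group(1), (int(match.group(2)), int(match.group(3))))


def active_threads(cfg: CnConfig) -> int:
    """Count the threads that have a non-zero intensity."""
    return sum(1 for intensity in cfg.intensities if intensity > 0)


def total_intensity(cfg: CnConfig) -> int:
    """Sum the intensities across both threads."""
    return sum(cfg.intensities)


if __name__ == "__main__":
    for raw in ("L28+28", "L56+0"):
        cfg = parse_cn_config(raw)
        print(raw, "threads:", active_threads(cfg), "total:", total_intensity(cfg))
```

Under these assumptions both settings carry the same total of 56, and the only difference is whether the work is split across two threads or kept on one.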
|
|
|
|
dragonmike
|
|
March 30, 2019, 05:54:10 PM |
|
OK. I have been trying over several days to replicate the results from pbfarmer on cnv8_trtl, and I am unable to get the efficiency he reported. Any attempt to get close to his reported voltage levels of 805mV to 840mV resulted in random DEAD GPUs (stuck in enqueue) after 20-30 minutes of hashing, or a significant hashrate drop per GPU. Even at 870mV, I am still unable to get a stable system running. Dead GPUs and hashrate drops to 15 kH/s still occur after several tens of minutes of mining. And the weird thing is that the hashrate drops happen to Hynix mem GPUs only. Somehow the CN-TRTL algo is more taxing than CN_R? I am able to run CN_R without failures for a week and with lower voltage settings (850mV - 870mV). I can't seem to do this for CN_TRTL.
My settings are as below, but the hashrates aren't sustainable.
GPUs: Ref Vega 64 and Vega 56, reference bios
V64 cclk/memclk 1220/1100 @ 870mV L28+28 (Samsung mem) - 19.5 kH/s
V56 cclk/memclk 1220/940 @ 870mV L24+24 (Samsung mem) - 19.3 kH/s, (Hynix mem) - 18.7 kH/s
ATW power draw 190W per GPU
Adrenalin Driver 18.6.1
Kerney/Todd, any ideas what could possibly be wrong? Unoptimized CN_TRTL code?
That's quite weird. Out of my six reference Vega 56s (flashed to 64), four are hashing rock solid at 1408@825 core / 1107 mem, and the weaker two cards get 850mV. I reckon I could potentially go even lower. Have you fiddled with your power play tables? Is your SoC set higher? That's quite typical for instability, I can sing a song or two about it...
|
|
|
|
pbfarmer
Member
Offline
Activity: 340
Merit: 29
|
|
March 30, 2019, 09:32:20 PM |
|
So, some additional tuning considerations from things I've observed since my first TRTL tests, and general musings:
1. There's almost certainly some cross-talk or strange interaction between GPUs. I had one I was struggling with, and *raising* its voltage from 831 to 837mv at one point caused another GPU, which had been stable, to start crashing immediately after init. I'd seen hints of this situation in the past (even w/ cnr), but this one was pretty obvious, and very odd. These situations can make it very difficult to tune near uv thresholds, as the source of a crash can be unclear, so configuring each GPU and running it for hours individually may be required.
2. Possibly related, once again I'm seeing h/r discrepancies run-to-run (10-20 h/s per GPU), which seem to be influenced by which other GPUs are online. They're not always lower either - in some cases, GPUs which are clocked lower and should be underperforming my average are hashing at the same rate as my top clocked GPUs. In later runs, the h/rs are more properly aligned w/ the tunings. These 'overperforming' situations almost certainly seem like they would lead to a crash - though I don't have any evidence of actual correlation.
3. My problem GPU really didn't want to run < 837mv around 1200 MHz (actual) - even there I'm not sure it was stable. I've finally settled on 825mv @ <1050MHz, and am still getting 19.43 kh/s out of it. So if you want to run icy-cool, core clocks can come way down.
4. It seemed that somewhere below ~1150 MHz actual cclock, dialing back cn_config was required. The GPU previously mentioned is now at L26+26.
5. I had a pool connection go down, and when it came back up, one of my GPUs came back online at 17.x kh/s instead of 19.x. I'm assuming a dead thread, but the miner kept going w/o issues otherwise, so not sure. @kerney666 - any reason why re-init would be harsher than first init?
6. If you see h/r drops in the middle of a run, it's probably one of two things. a.) you lost a thread, which means your uv is probably too aggressive for your clocks, or b.) your GPU is throttling due to hbm temperature.
7. A stock 56 tuned to 940MHz mem is different than a 64 at 1100MHz mem. They may perform similarly, but the 56 is near a mem performance limit (due to reduced mem voltage), while the 64 is only at the SOC scaling limit. Given this algo is so mem intensive, I could see how it could crush a stock 56, esp w/ hynix mem. I would first try dialing back the mem clock if you're having issues.
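The triage in item 6 can be sketched as a small Python helper. This is only an illustration of that rule of thumb: the function name diagnose_drop, the 3% drop tolerance and the 85 C HBM limit are all hypothetical placeholders, not values from the miner or from AMD, so substitute whatever your own cards actually show in a monitoring tool.

```python
def diagnose_drop(baseline_khs: float,
                  current_khs: float,
                  hbm_temp_c: float,
                  drop_tolerance: float = 0.03,   # assumed: ignore drops under 3%
                  hbm_limit_c: float = 85.0) -> str:  # assumed throttle point
    """Classify a mid-run hashrate drop along the lines of item 6.

    Returns one of "ok", "hbm_throttle" or "lost_thread".
    """
    if baseline_khs <= 0:
        raise ValueError("baseline_khs must be positive")

    relative_drop = (baseline_khs - current_khs) / baseline_khs
    if relative_drop <= drop_tolerance:
        return "ok"

    # Case b) from the list: the GPU is throttling because the HBM is too hot.
    if hbm_temp_c >= hbm_limit_c:
        return "hbm_throttle"

    # Case a) from the list: a thread was probably lost, which suggests the
    # undervolt is too aggressive for the current clocks.
    return "lost_thread"


if __name__ == "__main__":
    print(diagnose_drop(19.4, 19.3, hbm_temp_c=70))
    print(diagnose_drop(19.4, 17.5, hbm_temp_c=70))
    print(diagnose_drop(19.4, 17.5, hbm_temp_c=90))
```

A "lost_thread" result points toward raising voltage or lowering clocks, while "hbm_throttle" points toward cooling or a lower mem clock, matching the two causes named in the list.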
|
|
|
|
pbfarmer
Member
Offline
Activity: 340
Merit: 29
|
|
March 30, 2019, 09:58:32 PM Last edit: March 30, 2019, 10:14:40 PM by pbfarmer |
|
Also, some additional GPU performance numbers (tho I only have 1 of each). This is on Windows 10 w/ latest drivers.
RX590: 1280Mhz cclock, 2200Mhz mclock, 825mv, L16+16 - 9.11 kh/s
- Auto-tuned L16+16 seemed ideal, after trying a number of different settings.
- I would expect 580s to perform similarly, but at a ~7% efficiency loss.
- For some reason, running in combination w/ the R VII, I lose about 200-300 h/s on the 590.
R VII: 1500Mhz cclock, 1200Mhz mclock, 835mv, L40+0 - 30.5 kh/s
- Auto-tuned L30+30 seemed super-conservative. Bumping to L32+32 caused a 20% increase in h/r, from ~25 to ~30 kh/s @ 1650/1100/850 iirc.
- L35+35 or L36+36 looked like the peak, but switching to single thread ended up being even better.
- Over L40+0 caused a bsod, but not sure if it was just that the GPU needs more clock/power for higher cn_config - haven't played w/ it enough yet.
- Best h/r I saw was 32 kh/s around 1700mhz/1200mhz/900mv(?)/L36+36, tho I hadn't gotten into single thread testing yet, and clearly clocks can keep going (tho prob not worth it efficiency-wise). From my limited testing, mclock and cclock tuning was only pushing at the margins (maybe .5 kh/s for 50mhz) - not sure how to get an additional 8 kh/s out of this like @heavyarms1912 suggested (w/o getting into timings).
- I'm still on shipped bios - so maybe there's something to be gained w/ an update to v106?
|
|
|
|
kerney666
Member
Offline
Activity: 658
Merit: 86
|
|
March 30, 2019, 10:40:01 PM |
|
Quick reply on this one:
5. I had a pool connection go down, and when it came back up, one of my GPUs came back online at 17.x kh/s instead of 19.x. I'm assuming a dead thread, but the miner kept going w/o issues otherwise, so not sure. @kerney666 - any reason why re-init would be harsher than first init?
This might be host-side miner code failing to get a properly staggered setup between the two threads (i.e. the stuff xmr-stak calls interleaving nowadays) after an "intra-mining restart" (for lack of a better term). We have a few use cases we need to improve (gpu paused due to high temps, pool/network down). So, I'm guessing your two threads got stuck in a non-interleaving mode and your perf dropped. It should be sufficient for us to detect this and delay one thread a little as a one-time action. This is just a hypothesis, but it makes sense given that you didn't see a hung thread, and since you had been down for a while, HBM temps/throttling shouldn't be an issue.
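The "detect this and delay one thread a little" idea can be illustrated with a toy Python model. Everything here is an assumption made for the sketch: two threads that each launch work with the same fixed period, a target stagger of half a period, and a 25% threshold for calling the threads "not interleaved". The names phase_offset and one_time_delay are hypothetical, and this is not the miner's host-side code.

```python
def phase_offset(start_a: float, start_b: float, period: float) -> float:
    """Offset of thread B relative to thread A, folded into [0, period)."""
    if period <= 0:
        raise ValueError("period must be positive")
    return (start_b - start_a) % period


def one_time_delay(start_a: float, start_b: float, period: float,
                   min_fraction: float = 0.25) -> float:
    """Return how long to delay thread B once so the threads are staggered.

    If B already starts at least min_fraction of a period away from A
    (in either direction), no delay is needed and 0.0 is returned.
    Otherwise the delay moves B to half a period after A.
    """
    offset = phase_offset(start_a, start_b, period)
    distance_from_aligned = min(offset, period - offset)
    if distance_from_aligned >= min_fraction * period:
        return 0.0
    target = period / 2.0
    return (target - offset) % period


if __name__ == "__main__":
    # Both threads restarted at the same instant: B needs half a period of delay.
    print(one_time_delay(0.0, 0.0, period=1.0))
    # Already staggered by half a period: nothing to do.
    print(one_time_delay(0.0, 0.5, period=1.0))
```

In this model a restart that brings both threads up at the same instant is the non-interleaved case, and a single delay on one thread restores the stagger without touching the other.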
|
|
|
|
tvukoman
Jr. Member
Offline
Activity: 69
Merit: 5
|
|
March 30, 2019, 10:40:47 PM |
|
OK. I have been trying over several days to replicate the results from pbfarmer on cnv8_trtl, and I am unable to get the efficiency he reported. Any attempt to get close to his reported voltage levels of 805mV to 840mV resulted in random DEAD GPUs (stuck in enqueue) after 20-30 minutes of hashing, or a significant hashrate drop per GPU. Even at 870mV, I am still unable to get a stable system running. Dead GPUs and hashrate drops to 15 kH/s still occur after several tens of minutes of mining. And the weird thing is that the hashrate drops happen to Hynix mem GPUs only. Somehow the CN-TRTL algo is more taxing than CN_R? I am able to run CN_R without failures for a week and with lower voltage settings (850mV - 870mV). I can't seem to do this for CN_TRTL.
My settings are as below, but the hashrates aren't sustainable.
GPUs: Ref Vega 64 and Vega 56, reference bios
V64 cclk/memclk 1220/1100 @ 870mV L28+28 (Samsung mem) - 19.5 kH/s
V56 cclk/memclk 1220/940 @ 870mV L24+24 (Samsung mem) - 19.3 kH/s, (Hynix mem) - 18.7 kH/s
ATW power draw 190W per GPU
Adrenalin Driver 18.6.1
Kerney/Todd, any ideas what could possibly be wrong? Unoptimized CN_TRTL code?
That's quite weird. Out of my six reference Vega 56s (flashed to 64), four are hashing rock solid at 1408@825 core / 1107 mem, and the weaker two cards get 850mV. I reckon I could potentially go even lower. Have you fiddled with your power play tables? Is your SoC set higher? That's quite typical for instability, I can sing a song or two about it...
DragonMike, my SOC on the reference V56 (V64 bios) is 1199. I think that is because I use the "Safe" ppt file that u/Hellea made. That is the only ppt table that, for some reason, works stably. I want to try lowering the SOC to 1107. Can you (or someone else) share a ppt with the lower SOC of 1107 that works OK for a reference V56 (bios changed to V64, Samsung mem)? Thanks.
|
|
|
|
pbfarmer
Member
Offline
Activity: 340
Merit: 29
|
|
March 30, 2019, 11:56:39 PM |
|
Quick reply on this one:
5. I had a pool connection go down, and when it came back up, one of my GPUs came back online at 17.x kh/s instead of 19.x. I'm assuming a dead thread, but the miner kept going w/o issues otherwise, so not sure. @kerney666 - any reason why re-init would be harsher than first init?
This might be host-side miner code failing to get a properly staggered setup between the two threads (i.e. the stuff xmr-stak calls interleaving nowadays) after an "intra-mining restart" (for lack of a better term). We have a few use cases we need to improve (gpu paused due to high temps, pool/network down). So, I'm guessing your two threads got stuck in a non-interleaving mode and your perf dropped. It should be sufficient for us to detect this and delay one thread a little as a one-time action. This is just a hypothesis, but it makes sense given that you didn't see a hung thread, and since you had been down for a while, HBM temps/throttling shouldn't be an issue.
Thanks - I figured as much. Just watched it happen again in real-time, and this time it crashed the miner. Looked like one GPU came back up at 18.x, then a different one died. Might want to move this one up the list, as it seems to be a stability concern...
EDIT: I guess we can test your hypothesis by running single thread? Tho maybe that's introducing other variables since you said it's less pressure on the hw...
|
|
|
|
seefatlow
Jr. Member
Offline
Activity: 80
Merit: 1
|
|
March 31, 2019, 02:24:52 AM |
|
OK. I have been trying over several days to replicate the results from pbfarmer on cnv8_trtl, and I am unable to get the efficiency he reported. Any attempt to get close to his reported voltage levels of 805mV to 840mV resulted in random DEAD GPUs (stuck in enqueue) after 20-30 minutes of hashing, or a significant hashrate drop per GPU. Even at 870mV, I am still unable to get a stable system running. Dead GPUs and hashrate drops to 15 kH/s still occur after several tens of minutes of mining. And the weird thing is that the hashrate drops happen to Hynix mem GPUs only. Somehow the CN-TRTL algo is more taxing than CN_R? I am able to run CN_R without failures for a week and with lower voltage settings (850mV - 870mV). I can't seem to do this for CN_TRTL.
My settings are as below, but the hashrates aren't sustainable.
GPUs: Ref Vega 64 and Vega 56, reference bios
V64 cclk/memclk 1220/1100 @ 870mV L28+28 (Samsung mem) - 19.5 kH/s
V56 cclk/memclk 1220/940 @ 870mV L24+24 (Samsung mem) - 19.3 kH/s, (Hynix mem) - 18.7 kH/s
ATW power draw 190W per GPU
Adrenalin Driver 18.6.1
Kerney/Todd, any ideas what could possibly be wrong? Unoptimized CN_TRTL code?
I have the same situation: unstable only on trtl. I tried different drivers, nothing changed.
Hey guys! Well, unoptimized code isn't the issue, it's rather that it's optimized so much that it's pounding the gpu more than anything we've produced before, especially the memory subsystem. I'm not chasing efficiency to the same degree that you professional tuners are since my rig(s) are a combination of test and more serious mining. However, on my 8 x Vega 56 ref cards flashed to V64, win10, 18.6.1, clocks at 1408@900, 1100@900, I have zero stability issues. It's been mining for 14h straight now in the current run, and much longer before that as well. I've just stopped it to reconfigure things a few times.
However, on my Vega 64 Liquid Cooling in my dev workstation, CN-trtl is the first algo I've ever seen that kills that specific card but not my blower Vegas. It dies after 1-2h, also running at 1408@900, 1100@900. Effective clocks+voltages in hwinfo64 look very similar to the V56s. Right now, I'm doing tests with the single-threaded config support that we also added in 0.4.3. This means I'm running --cn_config=L56+0 instead of --cn_config=L28+28. On my V64 LC, hashrate drops from 19.6 kh/s to 18.8 kh/s, same efficiency. The hashrate is expected to drop a little, but this should be leaner on the gpu: it won't be going full throttle on all parts of the hardware at the same time. Theory vs practice is always a bitch though, I'll report back in a few hours on the results. Meanwhile, you're of course free to try the same trick: switch your problematic Vegas to L56+0 and see if it helps. I'd love to get some more data here.
The Hynix GPUs continue to crash with single thread and increased mem voltage. I will try scaling down the mem frequency in this case and see what happens. I think what pbfarmer said makes sense in regards to the mem frequency limit for the stock V56 bios. The prospect of spending more time modding my V56s and re-installing them on my rig is scary.
|
|
|
|
seefatlow
Jr. Member
Offline
Activity: 80
Merit: 1
|
|
March 31, 2019, 02:34:15 AM |
|
OK. I have been trying over several days to replicate the results from pbfarmer on cnv8_trtl, and I am unable to get the efficiency he reported. Any attempt to get close to his reported voltage levels of 805mV to 840mV resulted in random DEAD GPUs (stuck in enqueue) after 20-30 minutes of hashing, or a significant hashrate drop per GPU. Even at 870mV, I am still unable to get a stable system running. Dead GPUs and hashrate drops to 15 kH/s still occur after several tens of minutes of mining. And the weird thing is that the hashrate drops happen to Hynix mem GPUs only. Somehow the CN-TRTL algo is more taxing than CN_R? I am able to run CN_R without failures for a week and with lower voltage settings (850mV - 870mV). I can't seem to do this for CN_TRTL.
My settings are as below, but the hashrates aren't sustainable.
GPUs: Ref Vega 64 and Vega 56, reference bios
V64 cclk/memclk 1220/1100 @ 870mV L28+28 (Samsung mem) - 19.5 kH/s
V56 cclk/memclk 1220/940 @ 870mV L24+24 (Samsung mem) - 19.3 kH/s, (Hynix mem) - 18.7 kH/s
ATW power draw 190W per GPU
Adrenalin Driver 18.6.1
Kerney/Todd, any ideas what could possibly be wrong? Unoptimized CN_TRTL code?
That's quite weird. Out of my six reference Vega 56s (flashed to 64), four are hashing rock solid at 1408@825 core / 1107 mem, and the weaker two cards get 850mV. I reckon I could potentially go even lower. Have you fiddled with your power play tables? Is your SoC set higher? That's quite typical for instability, I can sing a song or two about it...
Where do I see the SOC voltage, and where can I set it? My ppt table starts with 800mv for P0 and moves up to 820mv for P3-P5. Do you see a problem with this?
|
|
|
|
cas333
Newbie
Offline
Activity: 52
Merit: 0
|
|
March 31, 2019, 06:25:17 AM Last edit: March 31, 2019, 07:07:47 AM by cas333 |
|
I have trouble running the new turtle algo. My rigs crash within 20 minutes of starting, at most. Cards: RX 570 8GB MSI MK2, Micron memory, CC 1200, MC 2190, 860mV. Any suggestions, please? Other algos work great with the same settings. Thanks in advance!
|
|
|
|
tvukoman
Jr. Member
Offline
Activity: 69
Merit: 5
|
|
March 31, 2019, 07:19:33 AM |
|
Where do I see the SOC voltage and where can I set them? My ppt tables starts with 800mv for P0 and moves up to 820mv for P3-P5. Do you see a problem with this?
HWInfo (to see it)
|
|
|
|
seefatlow
Jr. Member
Offline
Activity: 80
Merit: 1
|
|
March 31, 2019, 09:25:21 AM |
|
I have trouble running the new turtle algo. My rigs crash within 20 minutes of starting, at most. Cards: RX 570 8GB MSI MK2, Micron memory, CC 1200, MC 2190, 860mV. Any suggestions, please? Other algos work great with the same settings. Thanks in advance!
We are also struggling with this algo on Vega 56 with Hynix memory. I suggest you dial back your mem clock to below 2190 and see what happens.
|
|
|
|
bobben2
|
|
March 31, 2019, 10:24:45 AM |
|
For some reason this miner works waay better under Linux than Windows 10. On Xubuntu 18.04 my 2 x Sapphire Pulse RX 574 are mining Monero at 960 H/s each at 1175/2000, core voltage at 850mV, consuming only 225 Watts for the whole 2-card rig. With the same clocks in Windows, the rig drew above 240 Watts, needed a core voltage of 900mV for stability, and the hashrate was "only" 933 H/s per card.
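To put the Linux and Windows figures above on the same scale, here is a small Python sketch that converts them to hashes per watt. The function name hashes_per_watt is a hypothetical helper. Since the post only says the Windows rig drew "above 240 Watts", 240 W is used as a lower bound on power, so the Windows efficiency computed here is an upper bound.

```python
def hashes_per_watt(hashrate_hs: float, power_w: float) -> float:
    """Efficiency in H/s per watt for a whole rig."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return hashrate_hs / power_w


# Figures quoted in the post: two cards per rig, wall power for the whole rig.
linux_eff = hashes_per_watt(2 * 960, 225)     # about 8.53 H/s per W
windows_eff = hashes_per_watt(2 * 933, 240)   # at most about 7.78 H/s per W

# Relative advantage of the Linux setup; a lower bound, because the
# Windows power draw was reported only as "above 240 Watts".
linux_gain = linux_eff / windows_eff - 1.0    # roughly 0.10, i.e. about 10%

if __name__ == "__main__":
    print(f"Linux:   {linux_eff:.2f} H/s per W")
    print(f"Windows: {windows_eff:.2f} H/s per W (upper bound)")
    print(f"Linux advantage: at least {linux_gain:.1%}")
```

On these numbers the Linux rig is at least about 10% more efficient, most of which comes from the lower power draw rather than the higher hashrate.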
|
Fellow miners, get your thens and thans in order and help other forum readers understand what you are writing. Remember the grammar basics: B larger THAN A (comparator operator). If something THEN ....
|
|
|
cas333
Newbie
Offline
Activity: 52
Merit: 0
|
|
March 31, 2019, 10:54:27 AM |
|
I have trouble running the new turtle algo. My rigs crash within 20 minutes of starting, at most. Cards: RX 570 8GB MSI MK2, Micron memory, CC 1200, MC 2190, 860mV. Any suggestions, please? Other algos work great with the same settings. Thanks in advance!
We are also struggling with this algo on Vega 56 with Hynix memory. I suggest you dial back your mem clock to below 2190 and see what happens.
Thanks seefatlow, but nothing helps! I just switched to another coin. Anyway, the difficulty for Turtle and Loki is too high now, and it is not worth the time and effort to experiment!
|
|
|
|
Iamtutut
|
|
March 31, 2019, 10:59:51 AM |
|
Running it on an RX 580 8GB, I got 8.6 kH/s @ 128W, or 8 kH/s @ 109W when undervolting.
Been messing with one of mine... an RX 570 4GB at L16+16 seems to be running 8.2 kH/s. I haven't found any better combos than that for the config.
JCE CN GPU miner is still faster then. Except for my bad GPU (MSI Gaming), my 574s get +/- 8.5 kH/s.
|
|
|
|
todxx (OP)
Member
Offline
Activity: 176
Merit: 76
|
|
March 31, 2019, 12:58:25 PM |
|
Have any of you guys played around with editing HBM2 memory timings on linux?
If anyone has a reference V64/V56/FE card with Samsung Mem on linux and is feeling adventurous, could you guys try the following timings? --rp 10 --rc 45 --rfc 290
Please keep in mind that editing timings can result in damage to your cards. These timings work for me on my Vega FE, but there's no guarantee they will work for other cards. In particular, I'm curious what your cn trtl results are. Timings currently have a very limited impact on our other kernels.
|
|
|
|
heavyarms1912
|
|
March 31, 2019, 04:07:15 PM |
|
Have any of you guys played around with editing HBM2 memory timings on linux?
If anyone has a reference V64/V56/FE card with Samsung Mem on linux and is feeling adventurous, could you guys try the following timings? --rp 10 --rc 45 --rfc 290
Please keep in mind that editing timings can result in damage to your cards. These timings work for me on my Vega FE, but there's no guarantee they will work for other cards. In particular, I'm curious what your cn trtl results are. Timings currently have a very limited impact on our other kernels.
Any idea why it could cause damage? Heat? I get 40+ kH/s on the VII on cn-trtl without the memory mods @ 1450 core and 1295 HBM2.
|
|
|
|
|