Title: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on November 28, 2015, 10:24:19 AM
During the development of the CUDA miner for Ethereum, I ran into an issue where the hashrate on the GTX 750 Ti drops dramatically when the size of the memory buffer the miner operates on exceeds a certain threshold (1GB on Win7/Linux, 512MB on Win8/10). After a long discussion on the CUDA forums, one of the designers of CUDA weighed in and identified the issue as TLB thrashing. I'm currently doing a bit of research on the subject and have created a simple test program that measures these effects. It simulates the 'dagger' part of the Ethereum algorithm at different memory buffer (DAG) sizes and writes the results to a CSV file. So far, I have concluded that it is not an Nvidia-only issue; it manifests on AMD hardware as well. Apparently it is not an ETH-only issue either: I've had some reports from scrypt-jane miners as well.
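To make the TLB thrashing diagnosis a bit more concrete, here is a minimal back-of-the-envelope sketch. It is not taken from dagSimCL, and the entry count and page size are made-up placeholders, since real GPU TLB parameters are not given in this thread. The model assumes a TLB with N entries and page size P covers N × P bytes; once the buffer is larger than that, a uniformly random access hits the TLB only with a probability of roughly reach / buffer size.

```python
# Toy model of TLB reach under uniformly random accesses.
# ASSUMPTION: entries and page_kb are hypothetical placeholders, not measured GPU values.

def tlb_reach_mb(entries: int, page_kb: int) -> float:
    """Memory covered by the TLB, in MB (entries * page size)."""
    return entries * page_kb / 1024


def expected_hit_rate(buffer_mb: float, entries: int, page_kb: int) -> float:
    """Approximate chance that a uniformly random access hits the TLB.

    While the buffer fits inside the TLB reach, every page can stay cached (1.0);
    beyond that, only a reach/buffer fraction of the pages can be resident.
    """
    reach = tlb_reach_mb(entries, page_kb)
    return min(1.0, reach / buffer_mb)


if __name__ == "__main__":
    # Hypothetical TLB: 512 entries of 2048 KB pages -> 1024 MB reach.
    for size_mb in (512, 1024, 1536, 2048, 4096):
        rate = expected_hit_rate(size_mb, entries=512, page_kb=2048)
        print(f"{size_mb:5d} MB buffer -> ~{rate:.0%} TLB hits")
```

With these placeholder numbers the hit rate stays at 100% up to 1024 MB and then falls off, which is the general shape of the cliff described above; the real thresholds depend on hardware and driver.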
I'm currently looking for as many hardware/OS combinations as possible, to come to a recommendation for miners as well as designers of new algos. Below is an example of ETH hashrate on a GTX 780 on Windows with increasing buffer size (in MB):
http://i.snag.gy/JQhR3.jpg
The test program can be downloaded from https://github.com/Genoil/dagSimCL. Win-64 binaries are in the x64/Release folder. You can also build it yourself, but I only have supporting MSVC files targeted at Nvidia OpenCL. On AMD hardware you may want to run
Code:
set GPU_MAX_ALLOC_PERCENT=100
first. By default, the program tries to use all of your GPU's RAM, up to 4096MB. If you have less system RAM, you can add a command-line parameter to test up to a lower maximum:
Code:
dagsimCL.exe 2048
If you have multiple GPUs, you need to add a second parameter:
Code:
dagsimCL.exe 4096 1
If you have multiple OpenCL platforms installed:
Code:
dagsimCL.exe 4096 0 1
I would be very grateful if you could participate in this bit of research and possibly discuss any workarounds. Thanks!
P.S. Note that the hashrates achieved with the test program can be significantly higher than what you actually get with ethminer. This is because it only simulates the Dagger stages, not the Keccak stages.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: MaxDZ8 on November 29, 2015, 09:05:57 AM
Interesting.
Please, can you share links to all discussions? Considering OpenCL is possibly higher level than GL ever was, I'm quite surprised someone pinpointed a hardware construct issue, especially as GPUs are traditionally managed and there's a huge gap between different OSes, which in my experience should not be there for HW constructs... odd. I have a 1GiB card so there's little I can do. I will try to take a look in the next few days if I can set aside some time. Initial analysis in CodeXL gave me inconsistent results. Have you investigated different access patterns?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on November 29, 2015, 01:41:46 PM
This thread on the CUDA forums is most relevant:
https://devtalk.nvidia.com/default/topic/878455/cuda-programming-and-performance/gtx750ti-and-buffers-gt-1gb-on-win7/
Somebody over there (@allnamac) wrote a completely independent test that verified my findings. This one is not so interesting, but it shows that the problems affect both Nvidia and AMD:
http://gathering.tweakers.net/forum/list_messages/1659186

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: gielbier on November 30, 2015, 01:29:57 PM
I'm still trying to get my 7850 2GB to go above 1280MB, but I keep getting the out of memory error.
Even with
set GPU_MAX_ALLOC_PERCENT=100 / GPU_MAX_ALLOC_PERCENT=95
set GPU_MAX_HEAP_SIZE=100
set GPU_USE_SYNC_OBJECTS=1
Code:
DAG size (MB) Bandwidth (GB/s) Hashrate (MH/s)
Chunked (512) version below:
Code:
128 130.953 17.1643
Chunked (256) version below:
Code:
DAG size (MB) Bandwidth (GB/s) Hashrate (MH/s)

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on November 30, 2015, 03:23:57 PM
I've modified the source code a bit to allocate in 256MB chunks. Now it should be possible for AMD cards to use more RAM. On my GTX 780, the hashrate curve is just about the same (a tiny bit slower) when using 256MB chunks.
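The chunked allocation described above can be sketched roughly as follows. This is not the dagSimCL code: the helper names and the 128-byte item size are assumptions for illustration only. The idea is to split one large buffer into fixed-size chunks that each stay under the driver's per-allocation limit, and to translate a global item index into a (chunk, offset) pair on every access.

```python
# Sketch of chunked buffer addressing.
# ASSUMPTION: names and the 128-byte item size are illustrative, not from dagSimCL.

MB = 1024 * 1024


def chunk_sizes(total_mb: int, chunk_mb: int) -> list[int]:
    """Sizes (in MB) of the allocations needed to cover total_mb."""
    full, rest = divmod(total_mb, chunk_mb)
    return [chunk_mb] * full + ([rest] if rest else [])


def locate(index: int, chunk_mb: int, item_bytes: int = 128) -> tuple[int, int]:
    """Map a global item index to (chunk number, item offset inside that chunk)."""
    items_per_chunk = chunk_mb * MB // item_bytes
    return divmod(index, items_per_chunk)


if __name__ == "__main__":
    print(chunk_sizes(1280, 256))   # five 256 MB chunks
    print(chunk_sizes(1280, 512))   # two 512 MB chunks plus a 256 MB remainder
    print(locate(3_000_000, 256))   # chunk number and offset for item 3,000,000
```

The extra index arithmetic on every access may be one reason a chunked run can be slightly slower on cards that do not need it, but that is a guess, not something measured here.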
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on November 30, 2015, 04:35:27 PM
I've modified the sourcecode a bit to allocate in

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: apriyoni on November 30, 2015, 04:47:31 PM
Do you have a binary for the variable chunk size? I wonder if a future ethminer could also let the user choose the chunk size for optimization.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on November 30, 2015, 05:16:32 PM
Quote from apriyoni: "Do you have a binary for the variable chunk size? [...]"
Binaries are in the x64/Release folder on the git repo.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: apriyoni on November 30, 2015, 05:24:51 PM
Quote from Genoil: "Binaries are in the x64/Release folder on the git repo"
What parameter do I use on the command line to change the chunk size?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on November 30, 2015, 06:43:01 PM
Quote from apriyoni: "What parameter do I use in the command line to change the chunk size?"
Just a number: dagSimCL.exe 128

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Omegasun on November 30, 2015, 07:19:48 PM
It seems the code can be optimised further.
I used a chunk size of 640; the speed varies.
Code:
DAG size (MB) Bandwidth (GB/s) Hashrate (MH/s)

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Omegasun on November 30, 2015, 07:38:22 PM
More details:
AMD 7970, Catalyst 15.7, Win 8.1
Code:
DAG size (MB) Bandwidth (GB/s) Hashrate (MH/s)

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on December 01, 2015, 09:06:10 AM
That's interesting. Many thanks.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: MaxDZ8 on December 01, 2015, 10:09:12 AM
I've looked at the resources. Considering the linked threads are 1) in German and 2) hundreds of messages long, I cannot be 100% sure I got it completely.
What I can tell you is that I've observed considerably lower than expected memory performance on GCN 1.0 even with much smaller buffers. I think it is also worth noting that before 'compute on GPU' became a thing, graphics resources always had an upper bound (!= the CL max alloc). It is my understanding that no such limitation should be in place at this point (and it should be bigger than 1GiB anyway)... I'm still very surprised that this hardware issue manifests itself at such large sizes (unless the historical limitation still applies). Leaving aside the max alloc, have you tried how varying the stride affects the result (for a K MiB buffer)?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: virasog on December 01, 2015, 01:59:38 PM
With the 640MB chunk size above, the hash rate keeps swinging between 20 and 40 MH/s, and then the swing shrinks from 20 MH/s to a much lower value. Does this indicate an optimisation opportunity?
Can anybody make the chunks work in ethminer? The latest ethminer does not display the hash rate, which makes it difficult to compare results. I wonder whether this can be added as well.
Code:
catch (cl::Error const& err)

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Eliovp on December 01, 2015, 10:18:30 PM
Hey,
as you already noticed, it is indeed correct: a bigger DAG file decreases speed drastically. I've done some tests too.
First result: 390X
Code:
DAG size (MB) Bandwidth (GB/s) Hashrate (MH/s)
Second result: Fury X (stopped @ 2816MB)
Code:
DAG size (MB) Bandwidth (GB/s) Hashrate (MH/s)
"If needed, it's quite possible to run some more tests. I have other cards as well..."

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on December 02, 2015, 12:12:36 PM
Quote from virasog: "Does it indicate an optimisation opportunity. Can anybody make the chunk work in the ethminers? [...]"
It may be an opportunity for an optimization. The chunked implementation in the current ethminer is disabled because it doesn't work. I'll see if I can find some time to check whether this could work in ethminer.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Dofnatues on December 02, 2015, 05:54:48 PM
Quote from Genoil: "I'll see if I can find some time to check if this could work in ethminer."
If you can make it work, you will save a lot of AMD cards from becoming useless in a month or two.
By the way, why does the latest ethminer (1.1.0) not show the hash rate?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Grim on December 10, 2015, 08:16:43 AM
Besides all that TLB thrashing, how come the 280X has more bandwidth (~300 GB/s) compared to
the 390X with only 262 GB/s
and the Fury X with only 249 GB/s ???
(Also, besides bandwidth, the GPU memory timings seem to be a major factor.)
PS: maybe the 280X has optimized timings from The Stilt (BIOS update)?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: MaxDZ8 on December 13, 2015, 09:08:23 AM
It's a possibility. I am positive that the mix of math operations vs. memory accesses has a major impact on GCN; the AMD OpenCL driver is super fast but also very stupid.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Fasdurcas on December 13, 2015, 07:51:42 PM
We have about 60 days left in which most AMD cards will work without an update of ethminer. Who is responsible for updating the software?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Masked_Immortal on December 14, 2015, 08:32:16 AM
Is this issue just related to bandwidth? The GTX 970 has less bandwidth than the 280X.
And what about Maxwell GPUs?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on December 15, 2015, 01:13:26 PM
Quote from Fasdurcas: "We have about 60 days for most AMD to work without an update of the ethminer. Who is responsible for update the software?"
I filed this as a potential threat in the Ethereum bug bounty program but haven't heard anything from their end. Keep in mind I'm not 100% certain about this bug. It was an issue with my dagSimCL test program that may apply to ethminer as well. Unfortunately I don't have any time at the moment to look into this further. If it really is an issue, rest assured the private kernel gang has already jumped on it and resolved it, possibly using the approach that's publicly available (https://github.com/Genoil/dagSimCL/commit/cd900ffd83559a3764abfe2fbc6aa5d509c7a448) in the dagSimCL repo. The owners of such modded kernels should be in for some serious profit...
Quote from Masked_Immortal: "is this issue just related to bandwidth, gtx970 has less bandwidth than 280x. and what about Maxwell GPU?"
Maxwell cards with Compute 5.2 (GTX 9xx) only start suffering badly from TLB thrashing after 2GB+ allocations, so they are fine until the switch to PoS. Maxwell cards with Compute 5.0 (GTX 750) have already bitten the dust and are useless for ETH mining. Note that TLB thrashing and the AMD max allocation problem are two separate issues.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: vatusasid on December 15, 2015, 02:30:22 PM
The developers of Ethereum were paid 13 million Ether. How come they could not solve this problem? ethminer 1.0.1 is not stable yet.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Omegasun on January 08, 2016, 09:11:37 AM
Is there any news about the development of ethminer so that it can cope with the larger DAG size? We are approaching 1280MB.
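For a feel of how quickly the DAG reaches sizes like 1280MB, here is a rough estimate. The constants are not stated anywhere in this thread; they are assumed to be the standard Ethash parameters (1 GiB initial dataset, 8 MiB growth per 30000-block epoch). The sketch also ignores the spec's rounding down to a prime number of rows, so real sizes are slightly smaller.

```python
# Rough DAG size estimate per block height.
# ASSUMPTION: constants are the standard Ethash parameters, not values from this thread;
# the spec's rounding down to a prime row count is ignored.

EPOCH_LENGTH = 30_000            # blocks per epoch
DATASET_BYTES_INIT = 2 ** 30     # 1 GiB at epoch 0
DATASET_BYTES_GROWTH = 2 ** 23   # +8 MiB per epoch


def approx_dag_mb(block_number: int) -> int:
    """Approximate DAG size in MB for the epoch containing block_number."""
    epoch = block_number // EPOCH_LENGTH
    return (DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch) // 2 ** 20


def first_block_at(dag_mb: int) -> int:
    """First block whose epoch has an (approximate) DAG of at least dag_mb."""
    # Ceiling division: number of epochs needed to grow to dag_mb.
    epochs = -(-(dag_mb * 2 ** 20 - DATASET_BYTES_INIT) // DATASET_BYTES_GROWTH)
    return max(0, epochs) * EPOCH_LENGTH


if __name__ == "__main__":
    print(approx_dag_mb(840_000))   # approximate DAG size at block 840000
    print(first_block_at(1280))     # approximate block where 1280 MB is reached
```

Under these assumptions the DAG grows by 8 MB every 30000 blocks, so a card that fails or slows down at a given buffer size has a predictable block height at which it will start to struggle.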
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Bagdar13 on January 12, 2016, 08:54:00 PM
Quote from Omegasun: "Is there any news about the development of the etherminer so that it can cope with the larger DAG size. We are approaching 1280MB."
I have heard nothing. I also tested this and found a problem at ~1280 as well. I started looking on forums when I noticed a substantial drop-off on the 7970 cards at ~1.2 GB.
At this point my 280Xs are down from 27 to about 24.
At this point my 7970s are down from 22 to about 17 each (this is what surprised me; the problem is present on XFX, PowerColor and one other brand).
This drop in performance seems to be larger than the expected drop.
Oddly enough, my 7870s seem to have suffered little if any performance hit and are still happily doing 15, same as at launch.
Edit: my 7870s are the GHz Edition; however, I also have a Sapphire and a 7870 Myst (which is really a broken 7950), and both of these are also unaffected.
Food for thought.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Venon on January 13, 2016, 08:32:35 PM
Quote from Bagdar13: "I also tested this and found a problem at ~1280 as well. [...]"
Do you see the drop during the test or during actual mining?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: sp_ on January 13, 2016, 09:05:42 PM
Let the dagger grow. The ether algo will be perfect for the botnets.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Bagdar13 on January 15, 2016, 04:24:49 AM
Quote from Venon: "Do you find the drop during the test or the actual mining?"
I am now seeing the drop in actual mining on this hardware, with the DAG update at block 840000; the point is that my drop in hashrate seems to be larger than the increase in DAG size would predict.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Dofnatues on January 15, 2016, 06:15:27 PM
Because of the drop in hash rate, I decided to reduce the core clock frequency and keep the memory frequency the same. Is that a good idea?
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on January 18, 2016, 04:09:29 PM
I just finished implementing the chunk allocation in my fork of ethminer.
https://github.com/Genoil/cpp-ethereum/tree/opencl-chunks
By allocating DAG memory in chunks (--cl-chunks <chunkSizeInMB>), issues with RAM allocation may be averted. A nice side effect of this may be (significantly) higher hashrates. Based on what I've seen from people using dagSimCL, --cl-chunks 640 yields quite good results. It may be, however, that the optimal chunk size correlates with the DAG size. I wrote this change without access to AMD hardware, so your mileage may vary. Don't bother trying this on CUDA devices; using chunks there only has a negative impact on hashrate.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: RustyNoman on January 18, 2016, 05:54:13 PM
Quote from Genoil: "I just finished implementing the chunk allocation into my fork of ethminer. [...]"
Do you have instructions for building the program? Do you have an exe version so that we can try it?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on January 18, 2016, 06:02:29 PM
The binary is on the eth forum, in the mining section.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Justicemaxx on January 19, 2016, 07:42:35 PM
Quote from Genoil: "Binary is on the eth forum in mining section"
I tried different chunk settings, 640 and 660; these reduce the hash rate about 3 times on the R9 280X and R9 290. 6x 280X give about 50 MH/s. At the same time, a setting of 1300 or more does not affect the speed: it goes back to normal, about 150 MH/s. Maybe I'm doing something wrong? ....Before it starts hashing, the miner writes that it can't create the 2nd block of the DAG file because the GPU is blocked.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: mandica on January 19, 2016, 09:04:29 PM
Quote from Justicemaxx: "I tried with different settings of chunks, 640, 660, these figures reduce hash rate about 3 times on R280x, R290. [...]"
Did your miner submit valid shares?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Justicemaxx on January 20, 2016, 01:12:28 PM
I mine solo. The miner showed solutions, but the server node seems to ignore them, that is, the shares from the miner with chunks.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Assanger on January 20, 2016, 04:37:58 PM
Quote from Justicemaxx: "[...] the server node seems to be ignored, that is, the balls from the miner with chunks."
Does that mean ethminer is mining, but the shares are not recognized?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Justicemaxx on January 20, 2016, 04:47:56 PM
Yes
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Genoil on January 25, 2016, 09:00:14 AM
Quote from Justicemaxx: "[...] the server node seems to be ignored, that is, the balls from the miner with chunks."
LOL true. I'm sorry man, I just knocked this out blindly without access to an actual AMD card. For now, some further testing by others has indicated that there is presently no need to worry about allocation problems in the near future. I will have to verify this for myself to be absolutely sure though.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Marvell1 on January 25, 2016, 10:01:22 PM
Quote from Genoil: "I'm sorry man, just knocked this out blindly without access to an actual AMD card. [...]"
I could send you one of my 7950s if you want to pay for shipping; I have a bunch lying around because I have no motherboards to host them in.
This DAG problem is getting huge for me: my 900 MH/s farm is down to like 700 MH/s.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: mandica on January 26, 2016, 08:30:57 PM
Quote from Marvell1: "This dag problem is getting huge for my my 900mh/s farm is down to like 700mh/s"
The DAG problem is not a problem, as it affects all graphics cards. But I heard that it affects the R9 380 less.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Akarabzie on January 26, 2016, 10:00:18 PM
Quote from mandica: "But I heard that it affects R9 380 less."
I keep hearing this as well, but I don't think I've seen enough data to be sure about it yet, or about the reason why the 380s aren't affected. Is it the difference in memory types or what? Also, what kind of difference, if any, does the thrashing make on the 380 vs the 380X?

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: RustyNoman on January 27, 2016, 08:58:23 AM
Quote from Akarabzie: "Also what kind of difference if any does the trashing have on the 380 vs 380X?"
Yes. We need more data to assess the situation. I am also interested in knowing the performance of the 380 vs the 380X.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: Marvell1 on January 27, 2016, 05:45:52 PM
Quote from RustyNoman: "I am also interested in knowing the performance of 380 vs 380x."
I have both the 380 and 380X 4G cards and the hash rate is pretty underwhelming: 18 MH/s vs 19.5 MH/s max, it seems. They are both pretty power hungry too, around 240 watts, maybe 250 for the X. A 7950 gets close to 23 MH/s for around the same power.
One thing I do notice is that the hash rate on the 380 and 380X has remained constant regardless of DAG size, versus the drop in hash rate of the 7950s to around 22-21 MH/s.
Not sure what to make of all of this. I think the best bet right now is to get 390s and mix and match them with 380s, so at least you get better resale value on your GPUs than with the older cards, unless you can get those really cheap.
The problem with the 390 and 390X is that they run crazy hot and consume close to 300 watts of power; that's even worse with a 290X.
I'm trying out various brands of 380X cards this week, but by my estimation it's not worth paying anything more for the 380X, at least for mining, since it hashes only 5% higher than the 380 and uses more power. Basically a worthless card.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: RustyNoman on January 28, 2016, 11:06:29 AM
Quote from Marvell1: "[...] it hashes only 5% higer than the 380 and uses more power basically a worthless card."
The 380X has 2048 cores while the 380 has 1792. The core count is 14% higher, but the hash rate is just 5% higher, with higher power consumption. So it is not worth it.

Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms
Post by: adaseb on February 25, 2016, 08:33:27 AM
So we are currently at 1280MB for the DAG file size and most people are still mining. Was the bug fixed?
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: Genoil on February 25, 2016, 08:53:33 AM It turned out the big bug wasn't really there. My (false) assumptions were based on reports by testers of dagSimCL who apparently didn't know how to tune their AMD cards correctly. The impact of DAG size on hashrate is a fact though. While on Nvidia it has the most dramatic effects in certain circumstances, the impact on AMD cards has been growing steadily, to such a level that the 280X is now dethroned as the most cost-effective card to mine on, losing its position to the GTX970 on Win7/Linux.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: adaseb on February 25, 2016, 09:11:11 AM I noticed the decrease in speed also. The 970, however, seems to be at least double the price of the 280X.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: Genoil on February 25, 2016, 09:16:23 AM Yes, it only counts when you have already ROI'd on the cards mining other coins :)
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: sp_ on February 25, 2016, 09:27:14 AM You can get the GTX 970 to 21 MHASH by putting it in P1 mode (nvidia-smi tool).
The best card for mining Ethereum is the R9 Nano. It does 28 MHASH.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: Realetim on February 25, 2016, 09:51:42 AM Does the R9 Nano use more electricity? Which is more efficient in terms of hash per watt, the Nano or the 970?
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: sp_ on February 25, 2016, 10:07:50 AM The Nano uses less electricity, but costs more.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: apriyoni on February 25, 2016, 12:55:14 PM The R9 Nano costs £388 while the 970 costs £250. So there is a £138, or $200, difference. That is quite a lot.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: sp_ on February 25, 2016, 01:53:10 PM 33% faster and 55% more expensive, but it draws less power.
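The percentages quoted in the last few posts can be checked with a few lines of Python. This is only a sketch: the hash rates (21 and 28 MH/s) and prices (£250 and £388) are the figures posted above, the function and variable names are made up for illustration, and power draw is left out because nobody posted a measured wattage for either card.

```python
def percent_more(new, base):
    """Return how much larger `new` is than `base`, in percent."""
    return (new - base) / base * 100.0


def price_per_mhs(price, hashrate_mhs):
    """Purchase price paid per MH/s of hash rate."""
    return price / hashrate_mhs


# Figures as posted in this thread (assumed, not measured here).
GTX970 = {"mhs": 21.0, "gbp": 250.0}
R9_NANO = {"mhs": 28.0, "gbp": 388.0}

speedup = percent_more(R9_NANO["mhs"], GTX970["mhs"])
markup = percent_more(R9_NANO["gbp"], GTX970["gbp"])

print(f"Nano vs 970: {speedup:.0f}% faster, {markup:.0f}% more expensive")
print(f"970:  {price_per_mhs(GTX970['gbp'], GTX970['mhs']):.2f} GBP per MH/s")
print(f"Nano: {price_per_mhs(R9_NANO['gbp'], R9_NANO['mhs']):.2f} GBP per MH/s")
```

On purchase price alone the 970 comes out cheaper per MH/s, so whether the Nano pays off depends on the power savings, which this sketch does not model.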
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: rednoW on February 25, 2016, 02:07:19 PM Nope, the best card for ETH is the 390X now; the Fury is good for Decred.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: adaseb on February 26, 2016, 07:19:10 AM You guys are all wrong. The best card to mine with is probably the 7950/7970, since it can be bought second-hand dirt cheap, and it gets 20 MH/s.
Buying the Nano or Fury? What are the chances that ETH will still be profitable the day you reach ROI?
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: Satlite on February 26, 2016, 08:34:50 AM In percentage terms, it could be a good deal if you can squeeze in 6 GPUs and reduce the overhead of the system.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: RustyNoman on March 10, 2016, 10:27:47 AM I usually use 8 GPUs in a system: 4x7990 + 4x other GPUs. AMD allows up to 8 GPUs in a Windows system, so I use 8.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: Akarabzie on March 10, 2016, 02:53:49 PM Most people don't like using the 7990s because they are pretty finicky and a pain to keep cool. I haven't had too much of a problem with mine after a pretty big underclock. I had one GPU go out on me while the other worked, and I had some problems with another one constantly crashing my system. I'd rather just run (5) 280Xs with no system downtime. Hey, if you actually got 4x7990s running with no issues, more power to you. Your rig is like what, 2100 watts?
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: asrilani on March 10, 2016, 04:03:09 PM I have 4x7990+4x7970. I undervolt and underclock them a lot: 950 mV, 850/1500 MHz. The power is about 1330 W and the hash rate is 156 MH/s.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: vatusasid on March 28, 2016, 07:58:09 AM I have a similar configuration. The hash rate is just 149 MH/s. So the DAG file size has reduced the hash rate by 3%.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: Akarabzie on March 29, 2016, 08:35:13 PM What are you guys using to undervolt your 7990s? I just looked at mine and realized they weren't actually changing from stock speeds. This may help me keep them running 24/7, since I still get the occasional crash from my 7990.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: Venon on April 01, 2016, 12:58:36 PM I undervolt my 7990 to 950 mV, and the frequency is from 820 to 880 MHz, depending on the card.
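For anyone comparing undervolted rigs, the efficiency figures in the posts above reduce to two small calculations. The sketch below uses only numbers posted in this thread (156 MH/s at about 1330 W, and 149 MH/s on a similar rig); the function names are hypothetical. Note that 156 and 149 MH/s come from two different rigs, so the drop it prints is only indicative and need not match the 3% quoted above.

```python
def mhs_per_watt(hashrate_mhs, power_w):
    """Mining efficiency: hash rate delivered per watt drawn."""
    return hashrate_mhs / power_w


def percent_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Figures as posted in this thread (assumed, not measured here).
RIG_MHS = 156.0        # 4x7990 + 4x7970, undervolted and underclocked
RIG_WATTS = 1330.0     # reported power draw of that rig
SIMILAR_RIG_MHS = 149.0

efficiency = mhs_per_watt(RIG_MHS, RIG_WATTS)
drop = percent_drop(RIG_MHS, SIMILAR_RIG_MHS)

print(f"Efficiency: {efficiency:.3f} MH/s per watt")
print(f"Difference between the two rigs: {drop:.1f}%")
```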
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: adaseb on September 28, 2016, 11:35:40 AM Bumping this thread...
Wondering if the RX470/RX480 will be affected by the sudden drop in hashpower when the DAG file goes to 2050MB. Is that DAG simulator accurate? It seems that all cards would suffer at >2GB, not just the Tahiti-based cards.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: nerdralph on September 28, 2016, 11:42:56 AM AMD GCN does not have a TLB to thrash. See pg. 10. https://www.amd.com/Documents/GCN_Architecture_whitepaper.pdf
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: ahsbqt on November 30, 2018, 07:45:45 PM Old thread, but R9 390s are doing very badly these days, 26 MH/s, thanks to the TLB bug.
Title: Re: Assessing the impact of TLB trashing on memory hard algorhitms Post by: adaseb on December 01, 2018, 08:57:00 AM Yes, with the R9 290 it's even worse. I think I got 29 MH/s with the stock clock settings (947/1250) and the Stilt BIOS. Now it gets less than 25 MH/s, and despite the speed decrease the power consumption more or less remains the same, so it's no longer profitable to mine with those GPUs. Surprisingly, they still hold decent value for gamers and are selling on eBay for fair prices. I will most likely be putting mine up for auction soon. I highly doubt AMD will release a fix for the Hawaii chipsets.