Bitcoin Forum
August 22, 2025, 11:11:54 PM *
Author Topic: Claymore's ZCash/BTG AMD GPU Miner v12.6 (Windows/Linux)  (Read 3839401 times)
naeme18720
Sr. Member
****
Offline Offline

Activity: 290
Merit: 250


View Profile
November 25, 2016, 07:58:12 PM
 #5581

I don't understand which server to use for mining ZEC: Asia, US, or China. Which one is better to use? Please help me.

Thanks for the help. I'm testing.


Try each of the pools and see which one gives the lowest number here:

ZEC: 11/25/16-13:28:55 - SHARE FOUND - (GPU 0)
ZEC: Share accepted (125 ms)!  <<<<<<<<<<<<< HERE
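
If you want to compare servers from saved logs instead of watching the console, something like this rough Python sketch could do it. It only assumes the "Share accepted (NNN ms)" line format shown above; the pool names and the "asia" timings are made up for illustration, and `accept_times_ms` / `best_pool` are hypothetical helper names, not part of the miner.

```python
import re
from statistics import mean

# Matches lines shaped like the log excerpt above, e.g. "ZEC: Share accepted (125 ms)!"
ACCEPT_RE = re.compile(r"Share accepted \((\d+) ms\)")

def accept_times_ms(log_text):
    """Return every share-accept time (in ms) found in the miner log text."""
    return [int(m.group(1)) for m in ACCEPT_RE.finditer(log_text)]

def best_pool(logs_by_pool):
    """Given {pool_name: log_text}, return (pool with lowest mean accept time, all means)."""
    averages = {}
    for pool, text in logs_by_pool.items():
        times = accept_times_ms(text)
        if times:  # skip pools with no accepted shares in the sample
            averages[pool] = mean(times)
    return min(averages, key=averages.get), averages

# Illustrative data only: the "asia" numbers are invented for the example.
sample_logs = {
    "asia": "ZEC: Share accepted (310 ms)!\nZEC: Share accepted (290 ms)!",
    "us": "ZEC: 11/25/16-13:28:55 - SHARE FOUND - (GPU 0)\nZEC: Share accepted (125 ms)!",
}
winner, averages = best_pool(sample_logs)
print(winner, averages)
```

Run the miner for a while against each server, save the output, and feed the text in. A few shares per server is a small sample, so treat the result as a hint rather than a measurement.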
Rusguy
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
November 25, 2016, 08:00:03 PM
 #5582

I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best as it can. If Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better, even more so when we know that on 2xx and 39x cards the GPU and memory clocks are more "aligned" and in sync than on RX cards, which usually run at 11xx/2000.

CRYING here about the RX 4xx is pointless if you know NOTHING about internal GPU architecture and even less about the Zcash proof-of-work algo and how it's computed.

Even the 480, with its 256-bit bus, still uses only 50% of its memory bandwidth capacity !!! And I think the manufacturer made this move knowingly: for the new Polaris chip a 256-bit bus is probably sufficient, with its new memory controller, which provides slightly lower performance than the 390 !!!
Sorry for my English


It would be good to see the memory controller load on 390 models; I think it would give some insight into this issue.
lithiumviper12
Member
**
Offline Offline

Activity: 105
Merit: 10


View Profile
November 25, 2016, 08:13:01 PM
 #5583

Hi guys I have a question. I have 4 RX 480's. For some reason, 3 of them are hashing around 180mh, but the other one is only doing 40mh. Is there anything you can recommend to fix my issue? I have asus h170 pro gaming mb, and a 1200psu, 8gb ram, 120ssd, windows 10.
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 08:22:10 PM
 #5584

Hi guys I have a question. I have 4 RX 480's. For some reason, 3 of them are hashing around 180mh, but the other one is only doing 40mh. Is there anything you can recommend to fix my issue? I have asus h170 pro gaming mb, and a 1200psu, 8gb ram, 120ssd, windows 10.
If you have some performance issues - check GPU-Z "sensors" tab.
bardacuda
Sr. Member
****
Offline Offline

Activity: 430
Merit: 254


View Profile
November 25, 2016, 08:25:23 PM
 #5585

I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best as it can. If Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better, even more so when we know that on 2xx and 39x cards the GPU and memory clocks are more "aligned" and in sync than on RX cards, which usually run at 11xx/2000.

CRYING here about the RX 4xx is pointless if you know NOTHING about internal GPU architecture and even less about the Zcash proof-of-work algo and how it's computed.

Even the 480, with its 256-bit bus, still uses only 50% of its memory bandwidth capacity !!! And I think the manufacturer made this move knowingly: for the new Polaris chip a 256-bit bus is probably sufficient, with its new memory controller, which provides slightly lower performance than the 390 !!!
Sorry for my English


It would be good to see the memory controller load on 390 models; I think it would give some insight into this issue.

R9 290 MC usage:




Do you also know that if you want to check whether an algo is memory limited, you can go into GPU-Z and look at the load on the MCU (memory controller unit)?

I think this is wrong.  Although I primarily mine using Linux, I have a Windoze box that I use for testing cards.  GPU-z appears to show only external bus bandwidth use (to the GDDR), and not the utilization of the bandwidth between the controller and core.  In practical terms, a miner kernel may be using 200GB/s of memory bandwidth, but a significant percentage of it can be from the L2 cache.  The collision counter tables in SA5 would be an example of this.


Do you have a source for this hypothesis? In all memory restricted algos that correlates to MCU usage. Pretty sure it pertains to any sort of memory overload, bandwidth or bus width...

My knowledge of the AMD GCN architecture (and computer architecture in general), and my experience writing OpenCL.

tc61
Hero Member
*****
Offline Offline

Activity: 494
Merit: 500


View Profile
November 25, 2016, 08:25:54 PM
 #5586

anyone running into an out of memory error? win 10 16gb ram rx480
Rusguy
Newbie
*
Offline Offline

Activity: 10
Merit: 0


View Profile
November 25, 2016, 08:33:59 PM
 #5587

I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best as it can. If Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better, even more so when we know that on 2xx and 39x cards the GPU and memory clocks are more "aligned" and in sync than on RX cards, which usually run at 11xx/2000.

CRYING here about the RX 4xx is pointless if you know NOTHING about internal GPU architecture and even less about the Zcash proof-of-work algo and how it's computed.

Even the 480, with its 256-bit bus, still uses only 50% of its memory bandwidth capacity !!! And I think the manufacturer made this move knowingly: for the new Polaris chip a 256-bit bus is probably sufficient, with its new memory controller, which provides slightly lower performance than the 390 !!!
Sorry for my English


It would be good to see the memory controller load on 390 models; I think it would give some insight into this issue.

R9 290 MC usage:

https://i.imgur.com/UX0NIVb.png



Do you also know that if you want to check whether an algo is memory limited, you can go into GPU-Z and look at the load on the MCU (memory controller unit)?

I think this is wrong.  Although I primarily mine using Linux, I have a Windoze box that I use for testing cards.  GPU-z appears to show only external bus bandwidth use (to the GDDR), and not the utilization of the bandwidth between the controller and core.  In practical terms, a miner kernel may be using 200GB/s of memory bandwidth, but a significant percentage of it can be from the L2 cache.  The collision counter tables in SA5 would be an example of this.


Do you have a source for this hypothesis? In all memory restricted algos that correlates to MCU usage. Pretty sure it pertains to any sort of memory overload, bandwidth or bus width...

My knowledge of the AMD GCN architecture (and computer architecture in general), and my experience writing OpenCL.


The controller load is slightly higher than on the 480, but GPU BPM temperature 2 is about the same as GPU BPM temperature 1. On the 480, the speed of the memory controller itself is probably what gets in the way, on top of the limited bandwidth of the 256-bit bus.
KrokoTill
Newbie
*
Offline Offline

Activity: 51
Merit: 0


View Profile
November 25, 2016, 08:35:54 PM
 #5588


r7 370 is actually a "pro" chip meaning it has 1024 sps like the 7850 and r7 265, but for some reason seems to perform more like a 1280 sp "XT" chip. They must have made some minor performance tweaks. The r9 270s are the same as 7870s and 270Xs with 1280 sps but most were voltage locked and so just couldn't clock as high without BIOS mods.

chip wise/core count wise: 7850 = r7 265 = r7 370  <  7870 = r9 270 = r9 270X = 370X  <  7870XT


Regarding clocks: in the past I had Sapphire Toxic and MSI Hawk 270X models, and now I have an MSI Gaming 370 4GB. The maximum 100% stable clock I could achieve with all of them is 1200 MHz. The only difference is that on the 270X cards the voltage was unlocked and I was able to undervolt them to 1150 mV, but on the 370 it is fixed at 1162 mV. The nice thing about the 370 is that while it is doing 128 sol/s the desktop responds well enough that I can work at the same time, and it is quiet and not too hot. So I do not know about "pro" or not, but I like the card. With v8 I had to reduce the GPU clock from 1200 to 1175 because it was not 100% stable any more.
topgeek
Member
**
Offline Offline

Activity: 96
Merit: 10


View Profile
November 25, 2016, 08:39:52 PM
 #5589

I have something I cannot figure out and was wondering if any of you gents have an idea.

Two computers.
Each has an identical MSI RX480 8G in it.
Both are running Claymore v8.
Both are using an identical start script - except the worker name.

One miner periodically reports the GPU temp and fan % - which I really like.
The other doesn't.

Here is a screen capture showing the difference:
https://snag.gy/YAPBXj.jpg


Anyone have any ideas?
cheers and thanks
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 08:49:23 PM
 #5590

Quote
Anyone have any ideas?
Press "s" on that miner when it runs.
KrokoTill
Newbie
*
Offline Offline

Activity: 51
Merit: 0


View Profile
November 25, 2016, 08:50:45 PM
 #5591

I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best as it can. If Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better, even more so when we know that on 2xx and 39x cards the GPU and memory clocks are more "aligned" and in sync than on RX cards, which usually run at 11xx/2000.

CRYING here about the RX 4xx is pointless if you know NOTHING about internal GPU architecture and even less about the Zcash proof-of-work algo and how it's computed.

Even the 480, with its 256-bit bus, still uses only 50% of its memory bandwidth capacity !!! And I think the manufacturer made this move knowingly: for the new Polaris chip a 256-bit bus is probably sufficient, with its new memory controller, which provides slightly lower performance than the 390 !!!
Sorry for my English


It would be good to see the memory controller load on 390 models; I think it would give some insight into this issue.

On my 290 the memory controller load is rarely over 60%. The big difference is that, besides the 290's memory bus being 2x wider, its memory runs at 1250 MHz vs 2000 MHz on your 480. This means that whatever tricks you try, you cannot use timings as tight as on a 290 at 1250 MHz or a 390 at 1500 MHz. OK, suppose you reduce the mem clock on the 480 to 1500 or 1250 MHz to get the same timings: you still do not get the speed that is possible with a 2x wider bus.
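
To put rough numbers on the bus-width argument, here is a small Python sketch of theoretical peak bandwidth. It assumes GDDR5 moving 4 bits per pin per memory-clock cycle, a 512-bit bus on the 290/390 and a 256-bit bus on the 480; `peak_bandwidth_gbs` is a name made up for this example. These are paper figures only and say nothing about timings or how much of the bandwidth a miner can actually use.

```python
def peak_bandwidth_gbs(bus_width_bits, mem_clock_mhz, transfers_per_clock=4):
    """Theoretical peak memory bandwidth in GB/s.

    Assumes GDDR5, which moves 4 bits per pin per memory-clock cycle,
    where mem_clock_mhz is the clock as reported by tools like GPU-Z.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mem_clock_mhz * transfers_per_clock / 1000

cards = [
    ("R9 290, 512-bit @ 1250 MHz", 512, 1250),
    ("R9 390, 512-bit @ 1500 MHz", 512, 1500),
    ("RX 480, 256-bit @ 2000 MHz", 256, 2000),
    ("RX 480, 256-bit @ 1500 MHz", 256, 1500),
    ("RX 480, 256-bit @ 1250 MHz", 256, 1250),
]
for name, bus, clock in cards:
    print(f"{name}: {peak_bandwidth_gbs(bus, clock):.0f} GB/s")
```

At equal memory clocks the 256-bit card ends up with exactly half the peak bandwidth of the 512-bit one, which is the point about the wider bus.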
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 08:57:03 PM
Last edit: November 25, 2016, 09:11:24 PM by Jinx99
 #5592

On my 290 the memory controller load is rarely over 60%. The big difference is that, besides the 290's memory bus being 2x wider, its memory runs at 1250 MHz vs 2000 MHz on your 480. This means that whatever tricks you try, you cannot use timings as tight as on a 290 at 1250 MHz or a 390 at 1500 MHz. OK, suppose you reduce the mem clock on the 480 to 1500 or 1250 MHz to get the same timings: you still do not get the speed that is possible with a 2x wider bus.
This is not a question of memory throughput.
Cutting the memory clock almost in half causes only about a 20% hashrate drop.


Update: Tahiti has 768 kB of L2, Hawaii has 1 MB of L2, and Ellesmere has 2 MB of L2 cache.
bardacuda
Sr. Member
****
Offline Offline

Activity: 430
Merit: 254


View Profile
November 25, 2016, 09:02:29 PM
 #5593

On my 290 the memory controller load is rarely over 60%. The big difference is that, besides the 290's memory bus being 2x wider, its memory runs at 1250 MHz vs 2000 MHz on your 480. This means that whatever tricks you try, you cannot use timings as tight as on a 290 at 1250 MHz or a 390 at 1500 MHz. OK, suppose you reduce the mem clock on the 480 to 1500 or 1250 MHz to get the same timings: you still do not get the speed that is possible with a 2x wider bus.
This is not a question of memory throughput.
Cutting the memory clock almost in half causes only about a 20% hashrate drop.
https://ip.bitcointalk.org/?u=http%3A%2F%2Fi.piccy.info%2Fi9%2Fce2e18589c91c75caab3b10a46a2c9f2%2F1480097808%2F34039%2F1051816%2Fmemdrop.png&t=571&c=zWfZWhuCJrGaPQ


While my initial analysis was focused on the external GDDR5 bandwidth limits, current ZEC GPU mining software seems to be limited by the memory controller/core bus.  On AMD GCN, each memory controller can xfer 64 bytes (1 cache line) per clock.  In SA5, the ht_store function, in addition to adding to row counters, does 4 separate memory writes for most rounds (3 writes for the last couple rounds).  All of these writes are either 4 or 8 bytes, so much less than 64 bytes per clock are being transferred to the L2 cache.  A single thread (1 SIMD element) can transfer at most 16 bytes (dwordX4) in a single instruction.  This means a modified ht_store thread could update a row slot in 2 clocks.  If the update operation is split between 2 (or 4 or more) threads, one slot can be updated in one clock, since 2 threads can simultaneously write to different parts of the same 64-byte block.  This would mean each row update operation could be done in 2 GPU core clock cycles; one for the counter update, and one for updating the row slot.
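
The write-size argument above can be sketched with a bit of Python arithmetic. The 64-byte cache line and the 16-byte per-thread limit come from the post; the 32-byte slot size is only an inference from the "2 clocks" figure for a single thread, and treating one store instruction as one clock is a simplification. The function names are made up for this example.

```python
import math

CACHE_LINE_BYTES = 64      # per the post: one cache line per clock per memory controller
MAX_BYTES_PER_THREAD = 16  # per the post: one dwordX4 store per thread per instruction
SLOT_BYTES = 32            # ASSUMPTION: inferred from "2 clocks" for a single thread

def line_utilization(write_bytes):
    """Fraction of a 64-byte cache line carried by a single write."""
    return write_bytes / CACHE_LINE_BYTES

def clocks_per_slot(threads, slot_bytes=SLOT_BYTES):
    """Clocks to write one slot when each of `threads` threads stores 16 bytes per clock."""
    return math.ceil(slot_bytes / (MAX_BYTES_PER_THREAD * threads))

for size in (4, 8, 16):
    print(f"{size:2d}-byte write fills {line_utilization(size):.2%} of a cache line")
for threads in (1, 2, 4):
    print(f"{threads} thread(s): {clocks_per_slot(threads)} clock(s) per slot")
```

Under these assumptions a 4- or 8-byte write uses only a small fraction of what the controller could move per clock, and splitting the slot update across two threads gets it down to one clock, matching the reasoning in the post.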

Even with those changes, my calculations indicate that a ZEC miner would be limited by the core clock, according to a ratio of approximately 5:6.  In other words, when an RX 470 has a memory clock of 1750 MHz, the core would need to be clocked at 1750 * 5/6 = 1458 MHz in order to achieve maximum performance.

If the row counters can be kept in LDS or GDS, the core:memory ratio required would be 1:2, thereby allowing full use of the external memory bandwidth.  There is 64KB of LDS per CU, and the AMD GCN architecture docs indicate the LDS can be globally addressed; i.e. one CU can access the LDS of another CU.  However the syntax of OpenCL does not permit the local memory of one work-group to be accessed by a different work-group.  There is only 64KB of GDS shared by all CUs, and even if the row counters could be stored in such a small amount of memory, OpenCL does not have any concept of GDS.
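
As a quick check of the two core:memory ratios quoted above (5:6 with the counters in ordinary memory, 1:2 if they could live in LDS or GDS), here is the arithmetic as a short Python sketch. The function name is invented for the example, and the ratios are taken from the post as given, not derived independently.

```python
def required_core_mhz(mem_clock_mhz, core_parts, mem_parts):
    """Core clock needed to keep up with memory for a core:memory ratio of core_parts:mem_parts."""
    return mem_clock_mhz * core_parts / mem_parts

mem_clock = 1750  # MHz, the RX 470 example from the post

print(f"5:6 ratio -> {required_core_mhz(mem_clock, 5, 6):.0f} MHz core")
print(f"1:2 ratio -> {required_core_mhz(mem_clock, 1, 2):.0f} MHz core")
```

The first line reproduces the 1458 MHz figure; the second shows why the 1:2 case matters, since the core clock it calls for is well below what these cards normally run at.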

This likely means writing a top performance ZEC miner for AMD is the domain of someone who codes in GCN assembler.  Canis lupus?


Core speed has more of an effect on 480s but they are still limited by memory bandwidth.
pacolito
Jr. Member
*
Offline Offline

Activity: 36
Merit: 5


View Profile
November 25, 2016, 09:03:15 PM
 #5594

Hi guys I have a question. I have 4 RX 480's. For some reason, 3 of them are hashing around 180mh, but the other one is only doing 40mh. Is there anything you can recommend to fix my issue? I have asus h170 pro gaming mb, and a 1200psu, 8gb ram, 120ssd, windows 10.

When this happens to me I remove the driver with DDU then reinstall driver. Problem solved. Good luck.
xeridea
Sr. Member
****
Offline Offline

Activity: 449
Merit: 251


View Profile WWW
November 25, 2016, 09:06:13 PM
 #5595

Tonga is really not optimized

Tonga is the problem, by itself...
It was close to being a scam from AMD...

They said Tonga was going to replace the aged Tahiti.... but they were just kiddin'
Tonga is more efficient for gaming, it has better memory efficiency, and perhaps more efficient GPU?  My memory is cloudy.  Anyway, for mining it isn't as good, improvements are for gaming.

Profitability over time charts for many GPUs - http://xeridea.us/charts

BTC:  bc1qr2xwjwfmjn43zhrlp6pn7vwdjrjnv5z0anhjhn LTC:  LXDm6sR4dkyqtEWfUbPumMnVEiUFQvxSbZ Eth:  0x44cCe2cf90C8FEE4C9e4338Ae7049913D4F6fC24
PontiacGTX
Member
**
Offline Offline

Activity: 71
Merit: 10


View Profile
November 25, 2016, 10:06:26 PM
 #5596

I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best as it can. If Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better
Fiji should be faster, but maybe the code isn't suited for HBM.

arielbit
Legendary
*
Offline Offline

Activity: 3444
Merit: 1061


View Profile
November 25, 2016, 10:15:46 PM
 #5597

I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best as it can. If Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better, even more so when we know that on 2xx and 39x cards the GPU and memory clocks are more "aligned" and in sync than on RX cards, which usually run at 11xx/2000.

CRYING here about the RX 4xx is pointless if you know NOTHING about internal GPU architecture and even less about the Zcash proof-of-work algo and how it's computed.

A lot of butthurt people here bought RX 4xx cards, and some sold their old cards.

NikitaS
Newbie
*
Offline Offline

Activity: 11
Merit: 0


View Profile
November 25, 2016, 10:20:46 PM
 #5598

Too many rejects on v8.0 with -i 4 and higher intensity.

r9 280x
win 8.1
virtual mem, 16gb
15.12
environment variables setx to on

http://c2n.me/3EOYCOz

naeme18720
Sr. Member
****
Offline Offline

Activity: 290
Merit: 250


View Profile
November 25, 2016, 10:27:24 PM
 #5599

About devfee mining in Claymore v8:
Every 15 minutes my 7-GPU rig mines the devfee for 1 minute. Is that good or bad for me?
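
For a rough idea of what those numbers would mean, here is a small Python sketch. It only uses the figures from the post and makes no claim about the miner's documented fee. It is not clear whether the 1 minute is part of each 15-minute window or comes on top of it, so both readings are shown; `devfee_fraction` is a name made up for the example.

```python
def devfee_fraction(dev_minutes, cycle_minutes):
    """Share of total mining time spent on the devfee within one full cycle."""
    return dev_minutes / cycle_minutes

# Reading 1: 1 minute out of every 15 minutes.
inside = devfee_fraction(1, 15)
# Reading 2: 1 minute after every 15 minutes of normal mining (16-minute cycle).
on_top = devfee_fraction(1, 15 + 1)

print(f"1 min within each 15 min: {inside:.2%}")
print(f"1 min after each 15 min:  {on_top:.2%}")
```

Comparing the result with the fee stated in the miner's readme would show whether what you see on your rig is expected.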
orbital_station
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
November 25, 2016, 10:41:40 PM
Last edit: November 25, 2016, 11:07:07 PM by orbital_station
 #5600

1 x sapphire r9 390 stock clock, stock bios, 63 degrees
~ 260 H/s

any advice how to raise that number?
Also I have no way to measure my wattage, can anyone tell me approx. consumption?