Bitcoin Forum
Author Topic: Claymore's ZCash/BTG AMD GPU Miner v12.6 (Windows/Linux)  (Read 3839040 times)
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 08:57:03 PM
Last edit: November 25, 2016, 09:11:24 PM by Jinx99
 #5601

On my 290, memory controller load is rarely over 60%. The big difference is that, besides the 290's memory bus being 2x wider, its memory runs at 1250 MHz vs 2000 MHz on your 480. This means that you can try every possible trick, but there is no way to use timings as tight as on a 290 at 1250 MHz or a 390 at 1500 MHz. OK, suppose you reduce the mem clock on the 480 to 1500 or 1250 MHz to get the same timings; you still do not get the speed that is possible with a 2x wider bus.
This is not a question of memory throughput.
Reducing the memory clock by almost half causes only a 20% hashrate drop.


Update: Tahiti has 768 kB of L2 cache, Hawaii has 1 MB, and Ellesmere has 2 MB.
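
For reference, the bus-width and memory-clock comparison above can be put into numbers with a small Python sketch. It assumes GDDR5 moves 4 bits per pin per memory-clock cycle, a 512-bit bus on the 290 and a 256-bit bus on the 480; the function name is made up for illustration, and the result is peak theoretical bandwidth only, not what a miner actually achieves.

```python
# Peak theoretical GDDR5 bandwidth from bus width and memory clock.
# Assumption: GDDR5 transfers 4 bits per pin per memory-clock cycle,
# so the effective data rate is 4x the clock shown in tuning tools.

def gddr5_bandwidth_gbs(bus_width_bits: int, mem_clock_mhz: float) -> float:
    """Return peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = mem_clock_mhz * 1e6 * 4
    return bytes_per_transfer * transfers_per_second / 1e9

cards = {
    "R9 290 (512-bit @ 1250 MHz)": (512, 1250),
    "RX 480 (256-bit @ 2000 MHz)": (256, 2000),
    "RX 480 (256-bit @ 1250 MHz)": (256, 1250),
}
for name, (width, clock) in cards.items():
    print(f"{name}: {gddr5_bandwidth_gbs(width, clock):.0f} GB/s")
```

Under these assumptions the 480 at 1250 MHz has half the peak bandwidth of the 290 at the same clock, which is the "2x wider bus" point made above.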
The forum was founded in 2009 by Satoshi and Sirius. It replaced a SourceForge forum.
Advertised sites are not endorsed by the Bitcoin Forum. They may be unsafe, untrustworthy, or illegal in your jurisdiction.
bardacuda
Sr. Member
****
Offline Offline

Activity: 430
Merit: 254


View Profile
November 25, 2016, 09:02:29 PM
 #5602

On my 290, memory controller load is rarely over 60%. The big difference is that, besides the 290's memory bus being 2x wider, its memory runs at 1250 MHz vs 2000 MHz on your 480. This means that you can try every possible trick, but there is no way to use timings as tight as on a 290 at 1250 MHz or a 390 at 1500 MHz. OK, suppose you reduce the mem clock on the 480 to 1500 or 1250 MHz to get the same timings; you still do not get the speed that is possible with a 2x wider bus.
This is not a question of memory throughput.
Reducing the memory clock by almost half causes only a 20% hashrate drop.
https://ip.bitcointalk.org/?u=http%3A%2F%2Fi.piccy.info%2Fi9%2Fce2e18589c91c75caab3b10a46a2c9f2%2F1480097808%2F34039%2F1051816%2Fmemdrop.png&t=571&c=zWfZWhuCJrGaPQ


While my initial analysis was focused on the external GDDR5 bandwidth limits, current ZEC GPU mining software seems to be limited by the memory controller/core bus.  On AMD GCN, each memory controller can xfer 64 bytes (1 cache line) per clock.  In SA5, the ht_store function, in addition to adding to row counters, does 4 separate memory writes for most rounds (3 writes for the last couple rounds).  All of these writes are either 4 or 8 bytes, so much less than 64 bytes per clock are being transferred to the L2 cache.  A single thread (1 SIMD element) can transfer at most 16 bytes (dwordX4) in a single instruction.  This means a modified ht_store thread could update a row slot in 2 clocks.  If the update operation is split between 2 (or 4 or more) threads, one slot can be updated in one clock, since 2 threads can simultaneously write to different parts of the same 64-byte block.  This would mean each row update operation could be done in 2 GPU core clock cycles; one for the counter update, and one for updating the row slot.

Even with those changes, my calculations indicate that a ZEC miner would be limited by the core clock, according to a core:memory ratio of approximately 5:6.  In other words, when an RX 470 has a memory clock of 1750 MHz, the core would need to be clocked at 1750 * 5/6 = 1458 MHz in order to achieve maximum performance.

If the row counters can be kept in LDS or GDS, the core:memory ratio required would be 1:2, thereby allowing full use of the external memory bandwidth.  There is 64KB of LDS per CU, and the AMD GCN architecture docs indicate the LDS can be globally addressed; i.e. one CU can access the LDS of another CU.  However the syntax of OpenCL does not permit the local memory of one work-group to be accessed by a different work-group.  There is only 64KB of GDS shared by all CUs, and even if the row counters could be stored in such a small amount of memory, OpenCL does not have any concept of GDS.

This likely means writing a top performance ZEC miner for AMD is the domain of someone who codes in GCN assembler.  Canis lupus?


Core speed has more of an effect on 480s but they are still limited by memory bandwidth.
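
The arithmetic in the quoted analysis can be checked with a short Python sketch. The 64 bytes per clock, the 4/8/16-byte write sizes, and the 5:6 and 1:2 ratios are taken from the quote; the function names are hypothetical and this only reproduces the numbers, it does not model the GPU.

```python
from fractions import Fraction

CACHE_LINE_BYTES = 64  # per memory controller per clock, as stated in the quote

def required_core_clock(mem_clock_mhz: int, core_to_mem_ratio: Fraction) -> float:
    """Core clock (MHz) needed to keep up with a given memory clock."""
    return float(mem_clock_mhz * core_to_mem_ratio)

def write_utilization(write_bytes: int) -> float:
    """Fraction of a 64-byte cache line filled by a single write."""
    return write_bytes / CACHE_LINE_BYTES

# 5:6 case from the quote, then the 1:2 case (row counters in LDS/GDS)
print(required_core_clock(1750, Fraction(5, 6)))
print(required_core_clock(1750, Fraction(1, 2)))

# 4- and 8-byte writes as in ht_store, and a 16-byte dwordX4 write
for size in (4, 8, 16):
    print(size, write_utilization(size))
```

The first value comes out at roughly 1458 MHz, matching the figure in the quote, and a 4- or 8-byte write fills only 1/16 or 1/8 of a cache line.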

pacolito
Jr. Member
*
Offline Offline

Activity: 36
Merit: 5


View Profile
November 25, 2016, 09:03:15 PM
 #5603

Hi guys, I have a question. I have 4 RX 480s. For some reason, 3 of them are hashing around 180mh, but the other one is only doing 40mh. Is there anything you can recommend to fix my issue? I have an ASUS H170 Pro Gaming motherboard, a 1200 W PSU, 8 GB RAM, a 120 GB SSD, and Windows 10.

When this happens to me, I remove the driver with DDU and then reinstall it. Problem solved. Good luck.
xeridea
Sr. Member
****
Offline Offline

Activity: 449
Merit: 251


View Profile WWW
November 25, 2016, 09:06:13 PM
 #5604

Tonga is really not optimized

Tonga is the problem, by itself...
It was close to being a scam from AMD...

They said Tonga was going to replace the aged Tahiti... but they were just kidding.
Tonga is more efficient for gaming, it has better memory efficiency, and perhaps a more efficient GPU?  My memory is cloudy.  Anyway, for mining it isn't as good; the improvements are for gaming.

Profitability over time charts for many GPUs - http://xeridea.us/charts

BTC:  bc1qr2xwjwfmjn43zhrlp6pn7vwdjrjnv5z0anhjhn LTC:  LXDm6sR4dkyqtEWfUbPumMnVEiUFQvxSbZ Eth:  0x44cCe2cf90C8FEE4C9e4338Ae7049913D4F6fC24
PontiacGTX
Member
**
Offline Offline

Activity: 71
Merit: 10


View Profile
November 25, 2016, 10:06:26 PM
 #5605

I don't know how exactly you can tune access to GPU memory, but people who compare cards and complain here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best it can; if Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better.
Fiji should be faster, but maybe the code isn't suited for HBM.

arielbit
Legendary
*
Offline Offline

Activity: 3416
Merit: 1059


View Profile
November 25, 2016, 10:15:46 PM
 #5606

I don't know how exactly you can tune access to GPU memory, but people who compare cards and complain here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best it can; if Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better, all the more so when we know that on 2xx and 39x cards the GPU and memory clocks are more "aligned" and in sync than on RX cards, which usually run at 11xx/2000.

Complaining here about the RX 4xx is pointless if you know NOTHING about internal GPU architecture and even less about the Zcash proof-of-work algorithm and how it's computed.

A lot of butt-hurt people here bought RX 4xx cards, and some sold their old cards  Grin

NikitaS
Newbie
*
Offline Offline

Activity: 11
Merit: 0


View Profile
November 25, 2016, 10:20:46 PM
 #5607

Too many rejects on v8.0 with -i 4 and higher intensity.

r9 280x
win 8.1
virtual mem, 16gb
15.12
environment variables setx to on

http://c2n.me/3EOYCOz

naeme18720
Sr. Member
****
Offline Offline

Activity: 290
Merit: 250


View Profile
November 25, 2016, 10:27:24 PM
 #5608

About the devfee mining in Claymore v8:
on my 7-GPU rig it runs for 1 minute every 15 minutes. Is that good or bad?
orbital_station
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
November 25, 2016, 10:41:40 PM
Last edit: November 25, 2016, 11:07:07 PM by orbital_station
 #5609

1 x sapphire r9 390 stock clock, stock bios, 63 degrees
~ 260 H/s

any advice how to raise that number?
Also I have no way to measure my wattage, can anyone tell me approx. consumption?
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 10:49:42 PM
 #5610

1 x sapphire r9 380 stock clock, stock bios, 63 degrees
~ 260 H/s

any advice how to raise that number?
Also I have no way to measure my wattage, can anyone tell me approx. consumption?

380 or 390 ?
bardacuda
Sr. Member
****
Offline Offline

Activity: 430
Merit: 254


View Profile
November 25, 2016, 10:56:40 PM
 #5611

I don't know how exactly you can tune access to GPU memory, but people who compare cards and complain here that the RX 4xx should be as fast as the 390X should first learn that every architecture is different. The driver accesses GPU memory as best it can; if Zcash needs many small accesses and a 256-bit bus is not wide enough, it's logical that a 384- or 512-bit bus will be better.
Fiji should be faster, but maybe the code isn't suited for HBM.

I feel the 280X is the card Claymore loves the most - it will have 200 sols per card in the update Smiley)

I like the 390/390X the most - I'm going to reach 300 H/s on stock clocks.
The RX 480 will show about 190-200, I think.
The 280X - about 200 or a bit more.
And what about the Nano/Fury? They have 512 GB/s of bandwidth...

Yes, but the memory bus is too wide; 4096-bit is too much for most PoW algos and therefore cannot be used completely.
The Nano will show about 250 H/s, maybe I will reach a bit more.

adaseb
Legendary
*
Offline Offline

Activity: 3752
Merit: 1710



View Profile
November 25, 2016, 10:56:55 PM
 #5612

Too many rejects on v8.0 with -i 4 and higher intensity.

r9 280x
win 8.1
virtual mem, 16gb
15.12
environment variables setx to on

http://c2n.me/3EOYCOz



It means you overclocked too much. Check the log for invalid solutions or buffer overflows. I had this too.

reelen
Full Member
***
Offline Offline

Activity: 189
Merit: 100


View Profile
November 25, 2016, 10:57:57 PM
 #5613

1 x sapphire r9 380 stock clock, stock bios, 63 degrees
~ 260 H/s

any advice how to raise that number?
Also I have no way to measure my wattage, can anyone tell me approx. consumption?

380 or 390 ?

It has got to be a 390.  My 390, slightly overclocked, is getting ~290 sols/s with v8.0 i4.
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 11:02:41 PM
 #5614

 
It has got to be a 390.  My 390, slightly overclocked, is getting ~290 sols/s with v8.0 i4.

or 2*380
 Grin
orbital_station
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
November 25, 2016, 11:11:12 PM
 #5615

It has got to be a 390.  My 390, slightly overclocked, is getting ~290 sols/s with v8.0 i4.

or 2*380
 Grin


It is a 390, sorry about the typo.

What settings should I use for an "easy" OC that doesn't cut into the card's lifespan much?

By the way, I was thinking about getting another one, but my PSU doesn't have any more connectors. The R9 390 uses 2x 8-pin. I believe my PSU can handle the load, it's 850 W, but it doesn't have any more 8-pin connectors...
Are there any Molex-to-8-pin adapters? Or something like that?
AKRO
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
November 25, 2016, 11:14:49 PM
 #5616

It has got to be a 390.  My 390, slightly overclocked, is getting ~290 sols/s with v8.0 i4.

or 2*380
 Grin


It is a 390, sorry about the typo.

What settings should I use for an "easy" OC that doesn't cut into the card's lifespan much?

By the way, I was thinking about getting another one, but my PSU doesn't have any more connectors. The R9 390 uses 2x 8-pin. I believe my PSU can handle the load, it's 850 W, but it doesn't have any more 8-pin connectors...
Are there any Molex-to-8-pin adapters? Or something like that?

Yes, but honestly it's not recommended at all. Most 390s are 6+8; only some are 8+8. If it's an 850 W Gold or Platinum unit, with a slight undervolt and near-stock clocks, then yeah, it should be fine, but the Molex isn't a good idea. Maybe someone else can explain it better than I can, but it comes down to amperage on the rail. In short, if the PSU only comes with two 8-pins, it should only handle that many.
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 11:17:06 PM
 #5617

It is a 390, sorry about the typo.

What settings should I use for an "easy" OC that doesn't cut into the card's lifespan much?

By the way, I was thinking about getting another one, but my PSU doesn't have any more connectors. The R9 390 uses 2x 8-pin. I believe my PSU can handle the load, it's 850 W, but it doesn't have any more 8-pin connectors...
Are there any Molex-to-8-pin adapters? Or something like that?

What is the exact model of your PSU?
You must have at least 2x 6-pin and 2x 8-pin PCI-E connectors.
orbital_station
Newbie
*
Offline Offline

Activity: 18
Merit: 0


View Profile
November 25, 2016, 11:26:03 PM
 #5618

It is a 390, sorry about the typo.

What settings should I use for an "easy" OC that doesn't cut into the card's lifespan much?

By the way, I was thinking about getting another one, but my PSU doesn't have any more connectors. The R9 390 uses 2x 8-pin. I believe my PSU can handle the load, it's 850 W, but it doesn't have any more 8-pin connectors...
Are there any Molex-to-8-pin adapters? Or something like that?

What is the exact model of your PSU?
You must have at least 2x 6-pin and 2x 8-pin PCI-E connectors.

XFX Pro 850W 80+ Bronze
The box says 2x 6/8-pin, 2x 6-pin, 24-pin mobo, 8-pin CPU, along with a bunch of Molex and SATA connectors.
So I have 2x 6-pin left to use, along with the Molex and SATA connectors.

So there is no room to add one more R9 390?
manotroll
Sr. Member
****
Offline Offline

Activity: 305
Merit: 250


View Profile
November 25, 2016, 11:29:47 PM
 #5619

Is it not possible to optimize for the Crimson Edition 16.9.2 driver?
Jinx99
Member
**
Offline Offline

Activity: 91
Merit: 10



View Profile
November 25, 2016, 11:31:57 PM
 #5620


XFX Pro 850W 80+ Bronze
The box says 2x 6/8-pin, 2x 6-pin, 24-pin mobo, 8-pin CPU, along with a bunch of Molex and SATA connectors.


You can try plugging one 6-pin and one 8-pin plug into your card.
Or you can look for a PCI-E 6-pin to 8-pin adapter, like this one
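
As a rough sanity check on the connector question above, here is a minimal Python sketch of the in-spec power budget. The 75 W (slot), 75 W (6-pin) and 150 W (8-pin) limits are the PCI Express specification figures; the 275 W board power for an R9 390 is an assumed typical value, not a measurement, and the function name is made up for illustration.

```python
# In-spec power available to one graphics card from the slot plus its
# auxiliary PCIe power plugs. Limits per the PCI Express specification.
SLOT_W = 75
PLUG_W = {"6-pin": 75, "8-pin": 150}

def in_spec_budget_w(plugs):
    """Sum the slot limit and the limit of each auxiliary plug."""
    return SLOT_W + sum(PLUG_W[p] for p in plugs)

R9_390_BOARD_POWER_W = 275  # assumption: typical board power, not measured

for plugs in (["8-pin", "8-pin"], ["6-pin", "8-pin"], ["6-pin", "6-pin"]):
    budget = in_spec_budget_w(plugs)
    headroom = budget - R9_390_BOARD_POWER_W
    print(f"{' + '.join(plugs)}: {budget} W budget, {headroom:+d} W headroom")
```

Under these assumptions, two 6-pin leads alone fall short of the assumed draw. A 6-to-8-pin adapter changes the plug shape, not what the lead is rated for, so undervolting as suggested earlier in the thread still matters.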

