Jinx99
Member
Offline
Activity: 91
Merit: 10
|
|
November 25, 2016, 08:57:03 PM Last edit: November 25, 2016, 09:11:24 PM by Jinx99 |
|
On my 290, memory controller load is rarely over 60%. The big difference is that, besides the 290's memory bus being 2x wider, its memory runs at 1250 MHz vs 2000 MHz on your 480. This means you can try every possible trick, but there is no way you can use timings as tight as on a 290 at 1250 MHz or a 390 at 1500 MHz. OK, suppose you reduce the memory clock on the 480 to 1500 or 1250 MHz to get the same timings; you still do not get the speed that is possible with a 2x wider bus.
This is not a question of memory throughput. Cutting the memory clock almost in half causes only about a 20% hashrate drop. Update: Tahiti has 768 kB of L2 cache, Hawaii has 1 MB, and Ellesmere has 2 MB.
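For reference, a rough sketch of the raw bandwidth figures behind this exchange. The thread only says the 290's bus is "2x wider"; the 512-bit and 256-bit widths and the 4 transfers per memory clock for GDDR5 are assumptions filled in here, not numbers from the posts.

```python
def gddr5_bandwidth_gbs(bus_width_bits: int, mem_clock_mhz: float) -> float:
    """Peak GDDR5 bandwidth in GB/s.

    Assumption: GDDR5 moves 4 bits per pin per memory-clock cycle,
    and the clocks quoted in the thread are base memory clocks.
    """
    transfers_per_second = mem_clock_mhz * 1e6 * 4
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9


# Bus widths are assumed (512-bit for the 290, 256-bit for the 480).
r9_290 = gddr5_bandwidth_gbs(512, 1250)
rx_480 = gddr5_bandwidth_gbs(256, 2000)
rx_480_downclocked = gddr5_bandwidth_gbs(256, 1250)

print(f"290 @ 1250 MHz: {r9_290:.0f} GB/s")
print(f"480 @ 2000 MHz: {rx_480:.0f} GB/s")
print(f"480 @ 1250 MHz: {rx_480_downclocked:.0f} GB/s")
```

Under these assumptions the 480 at stock keeps 80% of the 290's peak bandwidth, and half of it when downclocked to 1250 MHz, which is the gap the quoted post is pointing at.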
|
|
|
|
bardacuda
|
|
November 25, 2016, 09:02:29 PM |
|
While my initial analysis was focused on the external GDDR5 bandwidth limits, current ZEC GPU mining software seems to be limited by the memory controller/core bus. On AMD GCN, each memory controller can transfer 64 bytes (1 cache line) per clock. In SA5, the ht_store function, in addition to adding to row counters, does 4 separate memory writes for most rounds (3 writes for the last couple of rounds). All of these writes are either 4 or 8 bytes, so much less than 64 bytes per clock is being transferred to the L2 cache. A single thread (1 SIMD element) can transfer at most 16 bytes (dwordX4) in a single instruction. This means a modified ht_store thread could update a row slot in 2 clocks. If the update operation is split between 2 (or 4 or more) threads, one slot can be updated in one clock, since 2 threads can simultaneously write to different parts of the same 64-byte block. This would mean each row update operation could be done in 2 GPU core clock cycles: one for the counter update, and one for updating the row slot.
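The arithmetic in the paragraph above can be sketched roughly as follows. The 64-byte and 16-byte figures come from the post; the 32-byte slot size is an assumption inferred from "2 clocks" at 16 bytes per store, and the function names are hypothetical.

```python
import math

CACHE_LINE_BYTES = 64       # per memory controller per clock (from the post)
MAX_BYTES_PER_THREAD = 16   # one dwordX4 store per thread (from the post)
SLOT_BYTES = 32             # assumption: inferred from "2 clocks" per slot


def write_utilization(write_bytes: int) -> float:
    """Fraction of a 64-byte cache-line transfer used by one write."""
    return write_bytes / CACHE_LINE_BYTES


def clocks_per_slot(slot_bytes: int, threads: int) -> int:
    """Clocks to store one slot if each thread stores 16 bytes per clock."""
    return math.ceil(slot_bytes / (MAX_BYTES_PER_THREAD * threads))


# Current 4- and 8-byte writes use only a small part of each transfer.
print(write_utilization(4), write_utilization(8))

# One thread needs 2 clocks per slot; two cooperating threads need 1.
print(clocks_per_slot(SLOT_BYTES, 1), clocks_per_slot(SLOT_BYTES, 2))

# Row update = 1 clock for the counter + 1 clock for the slot.
row_update_clocks = 1 + clocks_per_slot(SLOT_BYTES, 2)
print(row_update_clocks)
```

This only models the store width, not caching or atomics, so it is a ceiling on what the described change could gain rather than a prediction.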
Even with those changes, my calculations indicate that a ZEC miner would be limited by the core clock, at a core:memory ratio of approximately 5:6. In other words, when an RX 470 has a memory clock of 1750 MHz, the core would need to be clocked at 1750 * 5/6 = 1458 MHz in order to achieve maximum performance.
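A minimal sketch of that ratio as a calculation, assuming the 5:6 core:memory figure from the post holds; the helper name is hypothetical.

```python
from fractions import Fraction

# Core:memory clock ratio claimed in the post (approximate).
CORE_TO_MEM_RATIO = Fraction(5, 6)


def required_core_mhz(mem_clock_mhz: int, ratio: Fraction = CORE_TO_MEM_RATIO) -> float:
    """Core clock needed so the core does not limit a given memory clock."""
    return float(mem_clock_mhz * ratio)


for mem_clock in (1750, 2000):
    print(f"memory {mem_clock} MHz -> core {required_core_mhz(mem_clock):.0f} MHz")
```

For 1750 MHz memory this reproduces the 1458 MHz figure; at 2000 MHz memory the same ratio would ask for a core clock of about 1667 MHz.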
If the row counters can be kept in LDS or GDS, the core:memory ratio required would be 1:2, thereby allowing full use of the external memory bandwidth. There is 64KB of LDS per CU, and the AMD GCN architecture docs indicate the LDS can be globally addressed; i.e. one CU can access the LDS of another CU. However the syntax of OpenCL does not permit the local memory of one work-group to be accessed by a different work-group. There is only 64KB of GDS shared by all CUs, and even if the row counters could be stored in such a small amount of memory, OpenCL does not have any concept of GDS.
This likely means writing a top performance ZEC miner for AMD is the domain of someone who codes in GCN assembler. Canis lupus?
Core speed has more of an effect on 480s but they are still limited by memory bandwidth.
|
|
|
|
pacolito
Jr. Member
Offline
Activity: 36
Merit: 5
|
|
November 25, 2016, 09:03:15 PM |
|
Hi guys, I have a question. I have 4 RX 480s. For some reason, 3 of them are hashing around 180mh, but the other one is only doing 40mh. Is there anything you can recommend to fix my issue? I have an ASUS H170 Pro Gaming motherboard, a 1200 W PSU, 8 GB RAM, a 120 GB SSD, and Windows 10.
When this happens to me, I remove the driver with DDU and then reinstall the driver. Problem solved. Good luck.
|
|
|
|
xeridea
|
|
November 25, 2016, 09:06:13 PM |
|
Tonga is really not optimized
Tonga itself is the problem... It was close to being a scam from AMD... They said Tonga was going to replace the aged Tahiti... but they were just kidding.
Tonga is more efficient for gaming; it has better memory efficiency, and perhaps a more efficient GPU? My memory is cloudy. Anyway, for mining it isn't as good; the improvements are for gaming.
|
Profitability over time charts for many GPUs - http://xeridea.us/charts BTC: bc1qr2xwjwfmjn43zhrlp6pn7vwdjrjnv5z0anhjhn LTC: LXDm6sR4dkyqtEWfUbPumMnVEiUFQvxSbZ Eth: 0x44cCe2cf90C8FEE4C9e4338Ae7049913D4F6fC24
|
|
|
PontiacGTX
Member
Offline
Activity: 71
Merit: 10
|
|
November 25, 2016, 10:06:26 PM |
|
I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as a 390X should first learn that every architecture is different. The driver accesses GPU memory as best it can; if Zcash needs many small accesses and a 256-bit bus is not wide enough, it is logical that a 384- or 512-bit bus will be better.
Fiji should be faster, but maybe the code isn't suited for HBM.
|
|
|
|
arielbit
Legendary
Offline
Activity: 3416
Merit: 1059
|
|
November 25, 2016, 10:15:46 PM |
|
I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as a 390X should first learn that every architecture is different. The driver accesses GPU memory as best it can; if Zcash needs many small accesses and a 256-bit bus is not wide enough, it is logical that a 384- or 512-bit bus will be better, especially since we know that on 2xx and 39x cards the GPU and memory clocks are more "aligned" and in sync than on RX cards, which usually run at 11xx/2000.
CRYING here about the RX 4xx is pointless if you know NOTHING about internal GPU architecture and even less about the Zcash proof-of-work algorithm and how it is computed.
A lot of butt-hurt people here who bought RX 4xx cards, and some sold their old cards.
|
|
|
|
NikitaS
Newbie
Offline
Activity: 11
Merit: 0
|
|
November 25, 2016, 10:20:46 PM |
|
Too many rejects on v8.0 with -i 4 and higher intensity. R9 280X, Win 8.1, virtual memory 16 GB, 15.12, environment variables set with setx. http://c2n.me/3EOYCOz
|
|
|
|
naeme18720
|
|
November 25, 2016, 10:27:24 PM |
|
In Claymore v8, the devfee mining on my 7-GPU rig runs every 15 minutes... for 1 minute. Is that good or bad for me?
|
|
|
|
orbital_station
Newbie
Offline
Activity: 18
Merit: 0
|
|
November 25, 2016, 10:41:40 PM Last edit: November 25, 2016, 11:07:07 PM by orbital_station |
|
1 x sapphire r9 390 stock clock, stock bios, 63 degrees ~ 260 H/s
any advice how to raise that number? Also I have no way to measure my wattage, can anyone tell me approx. consumption?
|
|
|
|
Jinx99
Member
Offline
Activity: 91
Merit: 10
|
|
November 25, 2016, 10:49:42 PM |
|
1 x sapphire r9 380 stock clock, stock bios, 63 degrees ~ 260 H/s
any advice how to raise that number? Also I have no way to measure my wattage, can anyone tell me approx. consumption?
380 or 390 ?
|
|
|
|
bardacuda
|
|
November 25, 2016, 10:56:40 PM |
|
I don't know how exactly you can adjust memory access to GPU memory, but people who compare and cry here that the RX 4xx should be as fast as a 390X should first learn that every architecture is different. The driver accesses GPU memory as best it can; if Zcash needs many small accesses and a 256-bit bus is not wide enough, it is logical that a 384- or 512-bit bus will be better.
Fiji should be faster, but maybe the code isn't suited for HBM.
I feel the 280X is the card Claymore loves the most - it will have 200 sols per card in the update )
I like the 390/390X the most - I'm going to reach 300 H/s on stock clocks. The RX 480 will show about 190-200, I think. The 280X - about 200 or a bit more.
And what about the Nano/Fury? They have 512 GB/s of bandwidth...
Yes, but the memory bus is too wide; 4096 bits is too much for most PoW algos and therefore cannot be used completely. The Nano will show about 250 H/s; maybe I will reach a bit more.
|
|
|
|
adaseb
Legendary
Offline
Activity: 3766
Merit: 1718
CoinPoker.com
|
|
November 25, 2016, 10:56:55 PM |
|
Too many rejects on v8.0 with -i 4 and higher intensity. R9 280X, Win 8.1, virtual memory 16 GB, 15.12, environment variables set with setx. http://c2n.me/3EOYCOz
That means you overclocked too much. Check the log for invalid solutions or buffer overflow messages. I had this too.
|
|
|
|
reelen
|
|
November 25, 2016, 10:57:57 PM |
|
1 x sapphire r9 380 stock clock, stock bios, 63 degrees ~ 260 H/s
any advice how to raise that number? Also I have no way to measure my wattage, can anyone tell me approx. consumption?
380 or 390?
Has got to be a 390. My slightly overclocked 390 is getting ~290 sols/s with v8.0 i4.
|
|
|
|
Jinx99
Member
Offline
Activity: 91
Merit: 10
|
|
November 25, 2016, 11:02:41 PM |
|
Has got to be 390. My 390 slightly overclocked is getting 290~sols/s with v8.0 i4.
or 2*380
|
|
|
|
orbital_station
Newbie
Offline
Activity: 18
Merit: 0
|
|
November 25, 2016, 11:11:12 PM |
|
Has got to be 390. My 390 slightly overclocked is getting 290~sols/s with v8.0 i4.
or 2*380
it is 390, sry about the typo
what numbers to set for "easy" OC that doesn't cut into card lifespan much?
btw, I was thinking about getting another one, but my PSU doesn't have any more connectors. R9 390 uses 2x 8 pin. I believe my PSU can handle the load, it's 850W, but it doesn't have any more 8 pin connectors... Are there any adapters molex -> 8 pin? Or something like that?
|
|
|
|
AKRO
Member
Offline
Activity: 83
Merit: 10
|
|
November 25, 2016, 11:14:49 PM |
|
Has got to be 390. My 390 slightly overclocked is getting 290~sols/s with v8.0 i4.
or 2*380
it is 390, sry about the typo
what numbers to set for "easy" OC that doesn't cut into card lifespan much?
btw, I was thinking about getting another one, but my PSU doesn't have any more connectors. R9 390 uses 2x 8 pin. I believe my PSU can handle the load, it's 850W, but it doesn't have any more 8 pin connectors... Are there any adapters molex -> 8 pin? Or something like that?
Yes, but honestly it's not recommended at all. Most 390s are 6+8; only some are 8+8. If it's an 850 W Gold or Platinum unit, with a slight undervolt and near-stock clocks, then yeah, it should be fine, but the molex isn't a good idea. Maybe someone else can explain it better than I can, but it comes down to amperage on the rail. In short, if the PSU only comes with two 8-pin connectors, it should only handle that many.
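To put rough numbers on the connector question, here is a small sketch using the PCI Express power limits of 75 W for the slot, 75 W per 6-pin, and 150 W per 8-pin connector. These are spec ceilings added here for illustration, not measurements of a 390, and the helper name is hypothetical.

```python
# Spec ceilings in watts (assumed from the PCI Express power limits).
PCIE_SLOT_W = 75
CONNECTOR_W = {6: 75, 8: 150}


def max_spec_power_w(connectors: list[int]) -> int:
    """Maximum in-spec power for a card with the given aux connectors."""
    return PCIE_SLOT_W + sum(CONNECTOR_W[pins] for pins in connectors)


print("6+8 card:", max_spec_power_w([6, 8]), "W")
print("8+8 card:", max_spec_power_w([8, 8]), "W")
```

A card's real draw is normally well under these ceilings, so this only shows why an 8+8 card expects more headroom from its cables than a 6+8 one.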
|
|
|
|
Jinx99
Member
Offline
Activity: 91
Merit: 10
|
|
November 25, 2016, 11:17:06 PM |
|
it is 390, sry about the typo
what numbers to set for "easy" OC that doesn't cut into card lifespan much?
btw, I was thinking about getting another one, but my PSU doesn't have any more connectors. R9 390 uses 2x 8 pin. I believe my PSU can handle the load, it's 850W, but it doesn't have any more 8 pin connectors... Are there any adapters molex -> 8 pin? Or something like that?
What is the exact model of your PSU? You must have at least 2*6 pin and 2*8 pin PCI-E connectors.
|
|
|
|
orbital_station
Newbie
Offline
Activity: 18
Merit: 0
|
|
November 25, 2016, 11:26:03 PM |
|
it is 390, sry about the typo
what numbers to set for "easy" OC that doesn't cut into card lifespan much?
btw, I was thinking about getting another one, but my PSU doesn't have any more connectors. R9 390 uses 2x 8 pin. I believe my PSU can handle the load, it's 850W, but it doesn't have any more 8 pin connectors... Are there any adapters molex -> 8 pin? Or something like that?
What is the exact model of your PSU? You must have at least 2*6 pin and 2*8 pin PCI-E connectors.
XFX Pro 850W 80+ Bronze. The box says 2x 6/8 pin, 2x 6-pin, 24-pin mobo, 8-pin CPU, along with a bunch of molex and SATA connectors. So I have 2x 6-pin left to use, along with the molex and SATA connectors. So there is no room to add 1 more R9 390?
|
|
|
|
manotroll
|
|
November 25, 2016, 11:29:47 PM |
|
Is it not possible to optimize for the Crimson Edition 16.9.2 driver?
|
|
|
|
Jinx99
Member
Offline
Activity: 91
Merit: 10
|
|
November 25, 2016, 11:31:57 PM |
|
XFX Pro 850W 80+ Bronze. The box says 2x 6/8 pin, 2x 6-pin, 24-pin mobo, 8-pin CPU, along with a bunch of molex and SATA connectors.
You can try plugging one 6-pin and one 8-pin plug into your card. Or you can look for a PCI-E 6-pin to 8-pin adapter, like this one
|
|
|
|
|