Bitcoin Forum
May 25, 2024, 08:21:05 PM *
Author Topic: Custom RAM Timings for GPU's with GDDR5 - DOWNLOAD LINKS - UPDATED  (Read 155461 times)
niko2004x
Member
**
Offline Offline

Activity: 126
Merit: 10


View Profile
March 21, 2017, 12:53:33 PM
Last edit: March 21, 2017, 01:09:40 PM by niko2004x
 #321

Guess what, I know almost nothing about CAS, ARB, MISC, but by trial and error I managed to get a pretty good strap Smiley
Looks like it's time to learn some shit and understand what I am really doing when changing hex values LOL

Thank you ohgod people for sharing the knowledge Smiley

Since I am not Stilt and do not have access to internal AMD stuff,
I did mine 'mad science style' (shameless dick swinging) with a search routine based on
genetic algorithms guided by regression forests on PCA-transformed register field values (machine learning for fun and profit).
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
March 21, 2017, 01:15:26 PM
 #322

...finally getting interesting... Smiley

Could somebody share the Hynix AJR and Samsung datasheets, perhaps?

I wonder how the optimized custom timings relate to the minimum values given in the official datasheet.
I mean, how loose are the official values? Is there even room for improvement below them, or are custom straps simply adjusted to the recommended official values?

Here's an older Hynix GDDR5 datasheet.
https://drive.google.com/file/d/0BwLnDyLLT3WkeTBtekxTTloxMW8/view?usp=sharing

I don't have one for Samsung.  You can also find Elpida/Micron product briefs that will have basic stuff like the number of banks (usually 16) and bank groups (4).
Often you can push the official timings by ~30%.  Sometimes, particularly when the RAM is not thermally connected to the heatsink, you may have a hard time pushing the timing by 10%.
http://nerdralph.blogspot.ca/2017/01/hot-video-cards.html

As has been mentioned in these (and other) forums many times, the best straps depend on the mining algorithm.  ZEC is harder to optimize for than ETH since it has lots of writes and more variation in memory access pattern.
https://bitcointalk.org/index.php?topic=1679855.0
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
March 21, 2017, 01:19:22 PM
 #323

So my first try at a custom strap didn't work (GPU crashed almost immediately when mining ETH).
custom 1900: 1500RAS, 1625CAS, MISC2, & ARB
777000000000000022CC1C00AD515A3ED0570F15B98CA50A004AE7001C0714207A8900A0030000001B11353F922A3217

A straight copy of the 1625 strap to 2000 works fine, while the 1500 strap gave errors even at 1900.  I tried taking the 1900 strap, RAS from the 1500, and CAS, MISC2 & ARB2 from the 1625 strap and using it for the 2000 strap.

My friend, you have a lot to learn... I was like you a few weeks ago, then I read all the documentation regarding GDDR5, and with a little help (well... not so little) I managed to understand what those timings actually do Smiley
Keep up the good work by the way!
EDIT: I am also very keen to understand HBM/2 timings; if anyone has some knowledge of those (I already know the mode registers), any help via PM is highly appreciated!

I know how to make an optimized strap, just like I know how to re-shingle a shed.  But tying a tarp over the roof is a lot easier...
NisamRobot
Newbie
*
Offline Offline

Activity: 31
Merit: 0


View Profile
March 21, 2017, 02:08:20 PM
 #324

A kitten tamed the wolf Smiley
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
March 21, 2017, 02:27:35 PM
Last edit: March 21, 2017, 05:03:47 PM by nerdralph
 #325

Now that the strap tools are out, let's talk about how to optimize the timings.  I want to start with ETH since it is the simplest (and coincidentally the most profitable ATM).

Ethash does many random 128-byte DAG reads, 8KB in total per hash, so 20MH/s requires about 160GB/s of random read bandwidth.  For AMD cards 128 bytes is 2 cache lines of 64 bytes each, and each cache line fill reads 32 bytes from each of 2 GDDR5 memory chips.  Each 32-byte GDDR5 read burst takes 2 clocks, so when the RAM is clocked at 2GHz, the data will be transferred in 1ns (each bit takes just 125ps!).
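
For anyone who wants to check the arithmetic, here is a rough sketch using only the figures quoted above. The one thing not stated in the post is the chip width: 32 data pins per chip is an assumption (it is what makes a 32-byte burst come out at 8 bits per pin).

```python
# Back-of-the-envelope Ethash bandwidth, using the figures quoted in the post.
DAG_READ_BYTES = 128        # one random DAG read
BYTES_PER_HASH = 8 * 1024   # 8KB of DAG reads per hash
HASHRATE = 20e6             # 20 MH/s

reads_per_hash = BYTES_PER_HASH // DAG_READ_BYTES
bandwidth_gb_s = HASHRATE * BYTES_PER_HASH / 1e9

# Timing of a single 32-byte read burst.
BURST_BYTES = 32
BURST_CLOCKS = 2
CLOCK_HZ = 2e9
PINS_PER_CHIP = 32          # ASSUMPTION: x32 chips, not stated in the post

burst_ns = BURST_CLOCKS / CLOCK_HZ * 1e9
bits_per_pin = BURST_BYTES * 8 // PINS_PER_CHIP
ps_per_bit = burst_ns * 1000 / bits_per_pin

print(f"{reads_per_hash} reads per hash, {bandwidth_gb_s:.2f} GB/s at 20 MH/s")
print(f"burst: {burst_ns:.2f} ns, {bits_per_pin} bits per pin, {ps_per_bit:.0f} ps per bit")
```

The exact product is a little under 164GB/s; the 160GB/s in the post is the same number rounded.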

Here's a couple references to help the noobs get started:
https://www.micron.com/~/media/documents/products/technical-note/dram/tned01_gddr5_sgram_introduction.pdf
https://www.micron.com/~/media/documents/products/data-sheet/dram/gddr5/4gb_gddr5_sgram_brief.pdf

I'm not going to do one long post, so as to make this more readable.  For the more experienced folks, here's a taste of ideas to come: set tFAW and t32AW to 0.  Even Hynix's old H5GQ1H24AFR has FAW (23ns) =~ 4 * RRD (5.5ns), so virtually all modern GDDR5 should be able to work fine without FAW and 32AW limits.  I get 27.0Mh/s with sgminer on my Rx470/K4G4 clocked at 2GHz, tRRD=5, tFAW=0.  Zeroing t32AW gives a bump to 27.35Mh/s.
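
A small sketch of why FAW barely matters when it is close to 4 * RRD. This is a deliberately simplified model, not how the memory controller is known to schedule: it assumes activates go back to back to different banks, and it ignores bank groups, refresh, t32AW and everything else. The function name is made up for illustration.

```python
def activate_spacing_ns(t_rrd_ns: float, t_faw_ns: float) -> float:
    """Average spacing between back-to-back activates to different banks.

    tRRD limits consecutive activates; tFAW allows at most four activates
    in any tFAW window, i.e. an average of tFAW / 4 per activate.
    """
    return max(t_rrd_ns, t_faw_ns / 4)


# Datasheet figures quoted above for the H5GQ1H24AFR.
T_RRD_NS = 5.5
T_FAW_NS = 23.0

with_faw = activate_spacing_ns(T_RRD_NS, T_FAW_NS)
without_faw = activate_spacing_ns(T_RRD_NS, 0.0)   # FAW limit zeroed

print(f"with FAW:    {with_faw} ns per activate")
print(f"without FAW: {without_faw} ns per activate")
print(f"difference:  {(with_faw / without_faw - 1) * 100:.1f}%")
```

Under these assumptions the datasheet FAW only costs a few percent over the RRD limit, which fits the idea that dropping it is a small tweak rather than a big jump.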

 
tharp
Newbie
*
Offline Offline

Activity: 19
Merit: 0


View Profile
March 21, 2017, 02:53:56 PM
 #326

Since everyone is sharing now, I suppose I'll put what I've come up with out here. I'm running a 470 Nitro Sapphire 8GB with Samsung memory at -125mV. On ETH it hits between 28.5MH/s and 29.2MH/s @1140 core and @2100 mem, pulling around 920 watts at the wall with 6 GPUs per rig. On XMR it hits 785h/s to 795h/s @1170 core and @2100 mem, pulling around 660 watts at the wall with 6 GPUs per rig. Also running ethOS 1.2.0. I'm here to learn more about the mistakes I made on the mod and see what others in the community have come up with.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

The timings from wolf and ohgodagirls vbios decode tools release:
TRCDW = 16
TRCDWA = 16
TRCDR = 26
TRCDRA = 22
TRRD = 5
TRC = 71
Pad0 = 0

TRP_WRA = 48
Pad0 = 2
TRP_RDA = 12
TRP = 22
TRFC = 144

PA2RDATA = 0
Pad0 = 0
PA2WDATA = 0
Pad1 = 0
TFAW = 8
TCRCRL = 3
TCRCWL = 7
TFAW32 = 6

MC_SEQ_MISC1: 0x20140514

MC_SEQ_MISC3: 0xA00089FA

MC_SEQ_MISC8: 0x00000003

ACTRD = 25
ACTWR = 13
RASMACTRD = 47
RASMACTWR = 57

RAS2RAS = 157
RP = 45
WRPLUSRP = 46
BUS_TURN = 23

Looking forward to others input! Cheesy
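
The raw MISC1/MISC3/MISC8 values and the ARB fields listed above can be reproduced from the hex strap with a few lines of Python. A minimal sketch, under two assumptions: the strap is twelve little-endian 32-bit words stored back to back in the register order used elsewhere in this thread, and the ARB fields are four plain bytes each. The helper names are made up for illustration.

```python
import struct

# ASSUMPTION: registers are stored in this order, 4 bytes each, little-endian.
REGISTERS = [
    "MC_SEQ_WR_CTL_D0", "MC_SEQ_WR_CTL_D1", "MC_SEQ_PMG_TIMING",
    "MC_SEQ_RAS_TIMING", "MC_SEQ_CAS_TIMING", "MC_SEQ_MISC_TIMING",
    "MC_SEQ_MISC_TIMING2", "MC_SEQ_MISC1", "MC_SEQ_MISC3", "MC_SEQ_MISC8",
    "MC_ARB_DRAM_TIMING", "MC_ARB_DRAM_TIMING2",
]


def split_strap(hex_strap: str) -> dict:
    """Split a 48-byte timing strap into named 32-bit register words."""
    raw = bytes.fromhex(hex_strap.replace(" ", ""))  # forum wrapping may add a space
    if len(raw) != 4 * len(REGISTERS):
        raise ValueError(f"expected {4 * len(REGISTERS)} bytes, got {len(raw)}")
    return dict(zip(REGISTERS, struct.unpack("<12I", raw)))


def byte_fields(word: int) -> list:
    """Four 8-bit fields of a word, lowest byte first."""
    return [(word >> shift) & 0xFF for shift in (0, 8, 16, 24)]


STRAP = ("777000000000000022CC1C00106A5B47C0570E16B08C0509"
         "0068C70014051420FA8900A003000000190D2F399D2D2E17")

regs = split_strap(STRAP)
for name, word in regs.items():
    print(f"{name:<20} 0x{word:08X}")

# ACTRD, ACTWR, RASMACTRD, RASMACTWR
print(byte_fields(regs["MC_ARB_DRAM_TIMING"]))
# RAS2RAS, RP, WRPLUSRP, BUS_TURN
print(byte_fields(regs["MC_ARB_DRAM_TIMING2"]))
```

The word order is confirmed only to the extent that the three MISC words and the eight ARB values come out the same as in the listing above.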
Eliovp
Legendary
*
Offline Offline

Activity: 1050
Merit: 1293

Huh?


View Profile WWW
March 21, 2017, 03:32:56 PM
 #327

Since everyone is sharing now, I suppose I'll put what I've come up with out here. I'm running a 470 Nitro Sapphire 8GB with Samsung memory at -125mV. On ETH it hits between 28.5MH/s and 29.2MH/s @1140 core and @2100 mem, pulling around 920 watts at the wall with 6 GPUs per rig. On XMR it hits 785h/s to 795h/s @1170 core and @2100 mem, pulling around 660 watts at the wall with 6 GPUs per rig. Also running ethOS 1.2.0. I'm here to learn more about the mistakes I made on the mod and see what others in the community have come up with.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

....

Looking forward to others input! Cheesy


Cleaned it up for you.

Code:
--> HEX strap: 777000000000000022CC1C00AD695D47C0570E16B08C05090048C70014051420FA8900A003000000190D2F399D2D2E17

--> MC_SEQ_WR_CTL_D0
    DAT_DLY = 7,   DQS_DLY = 7,  DQS_XTR = 0,  DAT_2Y_DLY = 0,  ADR_2Y_DLY = 0,    CMD_2Y_DLY = 0,  OEN_DLY = 7,  OEN_EXT = 0

--> MC_SEQ_WR_CTL_D1
    DAT_DLY = 0,   DQS_DLY = 0,  DQS_XTR = 0,  DAT_2Y_DLY = 0,  ADR_2Y_DLY = 0,    CMD_2Y_DLY = 0,  OEN_DLY = 0,  OEN_EXT = 0

--> MC_SEQ_PMG_TIMING
    TCKSRE = 2,  Pad0 = 0,  TCKSRX = 2,  Pad1 = 0,  TCKE_PULSE = 12,  TCKE = 12,  SEQ_IDLE = 7,  Pad2 = 0,  TCKE_PULSE_MSB = 0, SEQ_IDLE_SS = 0

--> MC_SEQ_RAS_TIMING
    TRCDW = 13,  TRCDWA = 13,  TRCDR = 26,  TRCDRA = 26,  TRRD = 5,  TRC = 71,  Pad0 = 0

--> MC_SEQ_CAS_TIMING
    TNOPW = 0,  TNOPR = 0,  TR2W = 28, TCCLD = 3,  TR2R = 5,  Pad0 = 0,  TW2R = 14,  TCL = 22,  Pad1 = 0

--> MC_SEQ_MISC_TIMING
    TRP_WRA = 48,  Pad0 = 2,  TRP_RDA = 12,  TRP = 22,  TRFC = 144

--> MC_SEQ_MISC_TIMING2
    PA2RDATA = 0,  Pad0 = 0,  PA2WDATA = 0,  Pad1 = 0,  FAW = 8,  TREDC = 2,  TWEDC = 7,  T32AW = 6,  Pad2 = 0,  TWDATATR = 0

--> MC_SEQ_MISC1
 -- MR0
    WL = 4,  CL = 23,  TM = 0,  WR = 25,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR1
    DS = 0,  DT = 1,  ADR = 1,  CAL = 0,  PLL = 0,  RDBI = 0,  WDBI = 0,  ABI = 0,
    RES = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 0

--> MC_SEQ_MISC3
 -- MR4
    EDCHP = 10,  CRC WL = 7,  CRC RL = 3,  RD CRC = 0,  WR CRC = 0,  EDCHPi = 1,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 1
 -- MR5
    LP1 = 0,  LP2 = 0,  LP3 = 0,  PLL/DLL BW = 0,  RAS = 0,  BA0 = 0,  BA1 = 1,  BA2 = 0,  BA3 = 1


--> MC_SEQ_MISC8
 -- MR8
    CLEHF = 1,  WREHF = 1,  RFU = 0,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0
 -- MR7
    PLL Stby = 0,  PLL Fclk = 0,  PLL DelC = 0,  LF Mode = 0,  Auto Sync = 0,  DQ PreA = 0, Temp Sensor = 0, HVFRED = 0,
    VDD Range = 0,  RFU = 0,  BA0 = 0,  BA1 = 0,  BA2 = 0,  BA3 = 0


--> MC_ARB_DRAM_TIMING
    ACTRD = 25,  ACTWR = 13,  RASMACTRD = 47,  RASMACTWR = 57

--> MC_ARB_DRAM_TIMING2
    RAS2RAS = 157,  RP = 45,  WRPLUSRP = 46,  BUS_TURN = 23

Lots of options.. lots of things to fine tune..
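
To see what actually changed between the strap as posted and the cleaned-up one, the two hex strings can be compared word by word. A minimal sketch, assuming the strap is twelve little-endian 32-bit words in the same order as the decode above; that ordering is an assumption, not something the tool's author has confirmed here.

```python
import struct

# ASSUMPTION: same register order as the decoded dump, 4 bytes each.
REGISTERS = [
    "MC_SEQ_WR_CTL_D0", "MC_SEQ_WR_CTL_D1", "MC_SEQ_PMG_TIMING",
    "MC_SEQ_RAS_TIMING", "MC_SEQ_CAS_TIMING", "MC_SEQ_MISC_TIMING",
    "MC_SEQ_MISC_TIMING2", "MC_SEQ_MISC1", "MC_SEQ_MISC3", "MC_SEQ_MISC8",
    "MC_ARB_DRAM_TIMING", "MC_ARB_DRAM_TIMING2",
]


def words(hex_strap: str) -> tuple:
    """Unpack a 48-byte strap into twelve little-endian 32-bit words."""
    return struct.unpack("<12I", bytes.fromhex(hex_strap))


ORIGINAL = ("777000000000000022CC1C00106A5B47C0570E16B08C0509"
            "0068C70014051420FA8900A003000000190D2F399D2D2E17")
CLEANED = ("777000000000000022CC1C00AD695D47C0570E16B08C0509"
           "0048C70014051420FA8900A003000000190D2F399D2D2E17")

# Keep only the registers whose word differs between the two straps.
changed = {
    name: (old, new)
    for name, old, new in zip(REGISTERS, words(ORIGINAL), words(CLEANED))
    if old != new
}

for name, (old, new) in changed.items():
    print(f"{name:<20} 0x{old:08X} -> 0x{new:08X}")
```

With that ordering only two words differ: the RAS word, which matches the TRCDW/TRCDWA/TRCDRA changes visible in the decode, and the word in the MC_SEQ_MISC_TIMING2 slot.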

nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
March 21, 2017, 03:41:57 PM
Last edit: March 21, 2017, 04:12:17 PM by nerdralph
 #328

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

The timings from wolf and ohgodagirls vbios decode tools release:

You should update to the version with my changes that show CAS timing.  I see you're using CL=22.  With Samsung CL=21 I was getting errors at 2100 (OK at 2000).  I'll give 22 a try.
Here's what I was using @2000:
555000000000000022CC1C00CE595B3ED0570F1531CB2409004007000B0314207A8900A003000000170F2E36922A3217

update: no luck with CL=22@2100; too many HW errors.  I'm pretty sure the Samsung RAM on these cards is rated for 1750, so getting stable results at 2000 is still pretty good.
niko2004x
Member
**
Offline Offline

Activity: 126
Merit: 10


View Profile
March 21, 2017, 04:09:20 PM
 #329

Since everyone is sharing now, I suppose I'll put what I've come up with out here. I'm running a 470 Nitro Sapphire 8GB with Samsung memory at -125mV. On ETH it hits between 28.5MH/s and 29.2MH/s @1140 core and @2100 mem, pulling around 920 watts at the wall with 6 GPUs per rig. On XMR it hits 785h/s to 795h/s @1170 core and @2100 mem, pulling around 660 watts at the wall with 6 GPUs per rig. Also running ethOS 1.2.0. I'm here to learn more about the mistakes I made on the mod and see what others in the community have come up with.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

The timings from wolf and ohgodagirls vbios decode tools release:
TRCDW = 16
TRCDWA = 16
TRCDR = 26
TRCDRA = 22
TRRD = 5
TRC = 71
Pad0 = 0

TRP_WRA = 48
Pad0 = 2
TRP_RDA = 12
TRP = 22
TRFC = 144

PA2RDATA = 0
Pad0 = 0
PA2WDATA = 0
Pad1 = 0
TFAW = 8
TCRCRL = 3
TCRCWL = 7
TFAW32 = 6

MC_SEQ_MISC1: 0x20140514

MC_SEQ_MISC3: 0xA00089FA

MC_SEQ_MISC8: 0x00000003

ACTRD = 25
ACTWR = 13
RASMACTRD = 47
RASMACTWR = 57

RAS2RAS = 157
RP = 45
WRPLUSRP = 46
BUS_TURN = 23

Looking forward to others input! Cheesy

TRCDR & TRCDRA should be the same.

And I don't think that MISC decoded properly. Or at least it is reasonable to expect Pad0 in
Code:
TRP_WRA = 48
Pad0 = 2
TRP_RDA = 12
TRP = 22
TRFC = 144
to be zero.
tharp
Newbie
*
Offline Offline

Activity: 19
Merit: 0


View Profile
March 21, 2017, 04:10:23 PM
 #330

Since everyone is sharing now, I suppose I'll put what I've come up with out here. I'm running a 470 Nitro Sapphire 8GB with Samsung memory at -125mV. On ETH it hits between 28.5MH/s and 29.2MH/s @1140 core and @2100 mem, pulling around 920 watts at the wall with 6 GPUs per rig. On XMR it hits 785h/s to 795h/s @1170 core and @2100 mem, pulling around 660 watts at the wall with 6 GPUs per rig. Also running ethOS 1.2.0. I'm here to learn more about the mistakes I made on the mod and see what others in the community have come up with.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

....

Looking forward to others input! Cheesy


Cleaned it up for you.

....

Thanks, will give it a shot when I get back on a computer.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

The timings from wolf and ohgodagirls vbios decode tools release:

You should update to the version with my changes that show CAS timing.  I see you're using CL=22.  With Samsung CL=21 I was getting errors at 2100 (OK at 2000).  I'll give 22 a try.
Here's what I was using @2000:
555000000000000022CC1C00CE595B3ED0570F1531CB2409004007000B0314207A8900A003000000170F2E36922A3217


I am seeing HW errors with the current mod I'm running, but not so many that they affect the hash rate reported by the pool. I will be able to test more once I get back to my computer.
laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
March 21, 2017, 04:44:00 PM
 #331

Since everyone is sharing now, I suppose I'll put what I've come up with out here. I'm running a 470 Nitro Sapphire 8GB with Samsung memory at -125mV. On ETH it hits between 28.5MH/s and 29.2MH/s @1140 core and @2100 mem, pulling around 920 watts at the wall with 6 GPUs per rig. On XMR it hits 785h/s to 795h/s @1170 core and @2100 mem, pulling around 660 watts at the wall with 6 GPUs per rig. Also running ethOS 1.2.0. I'm here to learn more about the mistakes I made on the mod and see what others in the community have come up with.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

....

Looking forward to others input! Cheesy


Cleaned it up for you.

....

Thanks, will give it a shot when I get back on a computer.

Here is the strap I've put together:
777000000000000022CC1C00106A5B47C0570E16B08C05090068C70014051420FA8900A003000000190D2F399D2D2E17

The timings from wolf and ohgodagirls vbios decode tools release:

You should update to the version with my changes that show CAS timing.  I see you're using CL=22.  With Samsung CL=21 I was getting errors at 2100 (OK at 2000).  I'll give 22 a try.
Here's what I was using @2000:
555000000000000022CC1C00CE595B3ED0570F1531CB2409004007000B0314207A8900A003000000170F2E36922A3217


I am seeing HW errors with the current mod I'm running, but not so many that they affect the hash rate reported by the pool. I will be able to test more once I get back to my computer.
It was a cleaned ROM, with no modding other than the SEQ_RAS params; there is a deliberate error for you to figure out.
Hint: MC_SEQ_MISC_TIMING
EDIT: OhGodAGirl format:
Quote
typedef struct _SEQ_MISC_TIMING_FORMAT
{
        uint32_t TRP_WRA : 6;
        uint32_t Pad0 : 2;
        uint32_t TRP_RDA : 6;
        uint32_t TRP : 6;
        uint32_t TRFC : 11;
} SEQ_MISC_TIMING_FORMAT;
Expected format:
Quote
     5:0  TRP_WRA = 0x0
    13:8  TRP_RDA = 0x0
   19:15  TRP = 0x0
   28:20  TRFC = 0x0
I must admit that I think OhGodAGirl's format looks more like it.
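
To make the difference between the two layouts concrete, here is a minimal sketch that decodes one MISC_TIMING word both ways. The word is the `B08C0509` group from the strap quoted above, read as a little-endian 32-bit value; that it really is the MISC_TIMING slot is an assumption. The struct widths are packed from bit 0 upward, which is how common little-endian C compilers lay out bitfields (also an assumption). The helper names are made up.

```python
def field(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) from a 32-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(word: int, layout: dict) -> dict:
    """Decode every field of a layout given as name -> (hi, lo)."""
    return {name: field(word, hi, lo) for name, (hi, lo) in layout.items()}


# OhGodAGirl struct: 6 + 2 + 6 + 6 + 11 bits, packed from bit 0.
OHGODAGIRL = {
    "TRP_WRA": (5, 0), "Pad0": (7, 6), "TRP_RDA": (13, 8),
    "TRP": (19, 14), "TRFC": (30, 20),
}
# "Expected format" bit ranges exactly as listed above.
EXPECTED = {
    "TRP_WRA": (5, 0), "TRP_RDA": (13, 8), "TRP": (19, 15), "TRFC": (28, 20),
}

WORD = int.from_bytes(bytes.fromhex("B08C0509"), "little")  # 0x09058CB0

print("OhGodAGirl:", decode(WORD, OHGODAGIRL))
print("Expected:  ", decode(WORD, EXPECTED))
```

For this word the two layouts agree on everything except TRP, where one extra low bit doubles the value. Note that Pad0 comes out non-zero in the first layout, which is unusual for a padding field.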

lexele
Full Member
***
Offline Offline

Activity: 190
Merit: 100


View Profile
March 21, 2017, 04:45:55 PM
 #332

Gosh, this thread is becoming hotter and hotter. If only I weren't tied up with work lately. Good luck folks.
niko2004x
Member
**
Offline Offline

Activity: 126
Merit: 10


View Profile
March 21, 2017, 05:23:23 PM
Last edit: March 21, 2017, 06:15:07 PM by niko2004x
 #333

Expected format:
Quote
    5:0  TRP_WRA = 0x0
    13:8  TRP_RDA = 0x0
   19:15  TRP = 0x0
   28:20  TRFC = 0x0
I must admit that I think OhGodAGirl's format looks more like it.
I think this one is correct, but only for the pre-RX series. At least, the corresponding header in the Linux kernel dates from well before RX.

Here is an additional hint (produced by my tool).
Code:
RX480(Elpida EDW4032BABG)
 20000 0 999000000000000022aa1c0060881107c0540b078f82c00000204100150014209a8840a1000004c0030105070c0a100c [TRP_WRA=015,TRP_RDA=005,unused1=000,TRP=002,unused2=000,TRFC=012,unused3=000]
 40000 0 999000000000000022aa1c006094120fd0540c0815449101002041001d0314209a8880a2000004c006010a0f190e160d [TRP_WRA=021,TRP_RDA=008,unused1=000,TRP=005,unused2=000,TRFC=025,unused3=000]
 80000 0 999000000000000022aa1c00a5ac351f10550e0c21c73203004482003d0914202a8900a5000004c00c06141a33182210 [TRP_WRA=033,TRP_RDA=014,unused1=000,TRP=011,unused2=000,TRFC=051,unused3=000]
100000 0 777000000000000022aa1c002939572750550d0fa68803040068c200540c1420aa8900a6000004c00f0a191e401e2712 [TRP_WRA=038,TRP_RDA=017,unused1=000,TRP=014,unused2=000,TRFC=064,unused3=000]
125000 0 777000000000000022aa1c00ad49593270550e12ad8a14050068c300640f1420ba8980a7000004c0130e202551242e13 [TRP_WRA=045,TRP_RDA=021,unused1=000,TRP=018,unused2=000,TRFC=081,unused3=000]
137500 0 777000000000000022aa1c00ef516a3790550f14b20b9505006ae40074021420ca89c0a8020004c01510232859283315 [TRP_WRA=050,TRP_RDA=023,unused1=000,TRP=020,unused2=000,TRFC=089,unused3=000]
142500 0 777000000000000022aa1c0010d66a3990550f14344cc505006ae40074031420ca8900a9020004c0161124295c293515 [TRP_WRA=052,TRP_RDA=024,unused1=000,TRP=021,unused2=000,TRFC=092,unused3=000]
150000 0 777000000000000022aa1c00315a6b3ca0550f15b68c1506006ae4007c041420ca8980a9020004c01712262b612b3715 [TRP_WRA=054,TRP_RDA=025,unused1=000,TRP=022,unused2=000,TRFC=097,unused3=000]
162500 0 777000000000000022aa1c0073627c41b0551016ba0d9606006c060104061420ea8940aa030004c01914292e692e3b16 [TRP_WRA=058,TRP_RDA=027,unused1=000,TRP=024,unused2=000,TRFC=105,unused3=000]
175000 0 777000000000000022aa1c00b56a7d46c0551017be8e1607006c07010c081420fa8900ab030004c01b162c3171313f17 [TRP_WRA=062,TRP_RDA=029,unused1=000,TRP=026,unused2=000,TRFC=113,unused3=000]
200000 0 999000000000000022aa1c0018f77e4fd055121946501708006c07011d0c1420fa8980ac030004c01e19323781364718 [TRP_WRA=070,TRP_RDA=032,unused1=000,TRP=029,unused2=000,TRFC=129,unused3=000]
R9390(Elpida EDW4032BABG)
 20000 0 999133200000000060881107c0540b060f05c1000020410022aa1c08150014209a8840a1000000c0030105070c0a100c [TRP_WRA=015,unused1=000,TRP_RDA=005,unused2=000,TRP=002,TRFC=012,unused3=000]
 40000 0 99913320000000006094120fd0540c07158892010020410022aa1c081d0314209a8880a2000000c006010a0f190e160d [TRP_WRA=021,unused1=000,TRP_RDA=008,unused2=000,TRP=005,TRFC=025,unused3=000]
 80000 0 9991332000000000a5ac351f10550e0b218e35030044820022aa1c083d0914202a8900a5000000c00c06141a33182210 [TRP_WRA=033,unused1=000,TRP_RDA=014,unused2=000,TRP=011,TRFC=051,unused3=000]
100000 0 77713320000000002939572750550d0e261107040068c20022aa1c08540c1420aa8900a6000000c00f0a191e401e2712 [TRP_WRA=038,unused1=000,TRP_RDA=017,unused2=000,TRP=014,TRFC=064,unused3=000]
125000 0 7771332000000000ad49593270550e102d1519050068c30022aa1c08640f1420ba8980a7000000c0130e202551242e13 [TRP_WRA=045,unused1=000,TRP_RDA=021,unused2=000,TRP=018,TRFC=081,unused3=000]
137500 0 7771332000000000ef516a3790550f1232179a05006ae40022aa1c0874021420ca89c0a8020000c01510232859283315 [TRP_WRA=050,unused1=000,TRP_RDA=023,unused2=000,TRP=020,TRFC=089,unused3=000]
142500 0 777133200000000010d66a3990550f123498ca05006ae40022aa1c0874031420ca8900a9020000c0161124295c293515 [TRP_WRA=052,unused1=000,TRP_RDA=024,unused2=000,TRP=021,TRFC=092,unused3=000]
150000 0 7771332000000000315a6b3ca0550f1336191b06006ae40022aa1c087c041420ca8980a9020000c01712262b612b3715 [TRP_WRA=054,unused1=000,TRP_RDA=025,unused2=000,TRP=022,TRFC=097,unused3=000]
162500 0 777133200000000073627c41b05510143a1b9c06006c060122aa1c0804061420ea8940aa030000c01914292e692e3b16 [TRP_WRA=058,unused1=000,TRP_RDA=027,unused2=000,TRP=024,TRFC=105,unused3=000]
175000 0 7771332000000000b56a7d46c05510153e1d1d07006c070122aa1c080c081420fa8900ab030000c01b162c3171313f17 [TRP_WRA=062,unused1=000,TRP_RDA=029,unused2=000,TRP=026,TRFC=113,unused3=000]
200000 0 999133200000000018f77e4f0054121a06a01e08006c070122aa1c08350c1420fa8980ac030000c01e1932378139471a [TRP_WRA=006,unused1=000,TRP_RDA=032,unused2=000,TRP=029,TRFC=129,unused3=000]
RX480(Hynix H5GC4H24AJR)
 40000 0 555000000000000022dd1c0084941212f0540b0795847102002041001b0414209a8800a00000312006050d0e270f160e [TRP_WRA=021,TRP_RDA=009,unused1=000,TRP=006,unused2=000,TRFC=039,unused3=000]
 80000 0 777000000000000022dd1c00e7ac352210550d0a20c7f20400248100340914209a8800a0000031200c08171b4f172110 [TRP_WRA=032,TRP_RDA=014,unused1=000,TRP=011,unused2=000,TRFC=079,unused3=000]
 90000 0 777000000000000022dd1c002931462620550e0ba20793050026a2003c0a1420aa8800a0000031200d0a1a1d59192311 [TRP_WRA=034,TRP_RDA=015,unused1=000,TRP=012,unused2=000,TRFC=089,unused3=000]
100000 0 777000000000000022dd1c0029b5462930550e0c244823060026a200440b1420aa8800a0000031200e0a1c20621b2511 [TRP_WRA=036,TRP_RDA=016,unused1=000,TRP=013,unused2=000,TRFC=098,unused3=000]
112500 0 777000000000000022ff1c006bbd572f40550f0d28c9f3060048c5004c0d14205a8900a000003120100c20246f1e2912 [TRP_WRA=040,TRP_RDA=018,unused1=000,TRP=015,unused2=000,TRFC=111,unused3=000]
125000 0 777000000000000022ff1c008cc5583460550f0f2c4ab4070048c5005c0f14205a8900a000003120120d23287b222d13 [TRP_WRA=044,TRP_RDA=020,unused1=000,TRP=017,unused2=000,TRFC=123,unused3=000]
137500 0 777000000000000022339d00cecd593980551111ae8a84080048c6006c0014206a8900a002003120140f262b88252f15 [TRP_WRA=046,TRP_RDA=021,unused1=000,TRP=018,unused2=000,TRFC=136,unused3=000]
142500 0 777000000000000022339d00ce516a3b805511112fcbd408004ae6006c0014206a8900a002003120150f272d8d263015 [TRP_WRA=047,TRP_RDA=022,unused1=000,TRP=019,unused2=000,TRFC=141,unused3=000]
150000 0 777000000000000022339d00ce516a3d9055111230cb4409004ae600740114206a8900a002003120150f292f94273116 [TRP_WRA=048,TRP_RDA=022,unused1=000,TRP=019,unused2=000,TRFC=148,unused3=000]
162500 0 999000000000000022559d0010de7b4480551312b78c450a004c0601750414206a8900a00200312018112d34a42a3816 [TRP_WRA=055,TRP_RDA=025,unused1=000,TRP=022,unused2=000,TRFC=164,unused3=000]
175000 0 999000000000000022559d0031627c489055131339cdd50a004c06017d0514206a8900a00200312019123037ad2c3a17 [TRP_WRA=057,TRP_RDA=026,unused1=000,TRP=023,unused2=000,TRFC=173,unused3=000]
200000 0 bbb000000000000022889d0073ee8d53805515133ecf560c004e26017e0514206a8900a0020031201c143840c5303f17 [TRP_WRA=062,TRP_RDA=030,unused1=000,TRP=027,unused2=000,TRFC=197,unused3=000]
R9390(Hynix H5GC4H24AJR)
 40000 0 555133200000000084941212f0540b07150973020020410022dd1c081b0414209a8800a00000012006050d0e270f160e [TRP_WRA=021,unused1=000,TRP_RDA=009,unused2=000,TRP=006,TRFC=039,unused3=000]
 80000 0 7771332000000000e7ac352210550d0a208ef5040024810022dd1c08340914209a8800a0000001200c08171b4f172110 [TRP_WRA=032,unused1=000,TRP_RDA=014,unused2=000,TRP=011,TRFC=079,unused3=000]
 90000 0 77713320000000002931462620550e0b220f96050026a20022dd1c083c0a1420aa8800a0000001200d0a1a1d59192311 [TRP_WRA=034,unused1=000,TRP_RDA=015,unused2=000,TRP=012,TRFC=089,unused3=000]
100000 0 777133200000000029b5462930550e0c249026060026a20022dd1c08440b1420aa8800a0000001200e0a1c20621b2511 [TRP_WRA=036,unused1=000,TRP_RDA=016,unused2=000,TRP=013,TRFC=098,unused3=000]
112500 0 77713320000000006bbd572f40550f0d2892f7060048c50022ff1c084c0d14205a8900a000000120100c20246f1e2912 [TRP_WRA=040,unused1=000,TRP_RDA=018,unused2=000,TRP=015,TRFC=111,unused3=000]
125000 0 77713320000000008cc5583460550f0f2c94b8070048c50022ff1c085c0f14205a8900a000000120120d23287b222d13 [TRP_WRA=044,unused1=000,TRP_RDA=020,unused2=000,TRP=017,TRFC=123,unused3=000]
137500 0 7771332000000000cecd5939805511112e1589080048c60022339d086c0014206a8900a002000120140f262b88252f15 [TRP_WRA=046,unused1=000,TRP_RDA=021,unused2=000,TRP=018,TRFC=136,unused3=000]
142500 0 7771332000000000ce516a3b805511112f96d908004ae60022339d086c0014206a8900a002000120150f272d8d263015 [TRP_WRA=047,unused1=000,TRP_RDA=022,unused2=000,TRP=019,TRFC=141,unused3=000]
150000 0 7771332000000000ce516a3d9055111230964909004ae60022339d08740114206a8900a002000120150f292f94273116 [TRP_WRA=048,unused1=000,TRP_RDA=022,unused2=000,TRP=019,TRFC=148,unused3=000]
162500 0 999133200000000010de7b448055131237194b0a004c060122559d08750414206a8900a00200012018112d34a42a3816 [TRP_WRA=055,unused1=000,TRP_RDA=025,unused2=000,TRP=022,TRFC=164,unused3=000]
175000 0 999133200000000031627c4890551313399adb0a004c060122559d087d0514206a8900a00200012019123037ad2c3a17 [TRP_WRA=057,unused1=000,TRP_RDA=026,unused2=000,TRP=023,TRFC=173,unused3=000]
200000 0 bbb133200000000073ee8d53805515133e9e5d0c004e260122889d087e0514206a8900a0020001201c143840c5303f17 [TRP_WRA=062,unused1=000,TRP_RDA=030,unused2=000,TRP=027,TRFC=197,unused3=000]
The registers in RX and pre-RX straps are obviously at different offsets, but in addition there is no way to decode MISC with the same decoder and get
reasonably similar values for the same memory type on RX and R9 cards.

EDIT: For those wondering why TRP_WRA=006 for Elpida on the R9 at the 200000 strap: my theory is that it is a bug in the BIOS. Only 6 bits were allotted to the field, so
the 64 in 70 = 64 + 6 was cut off.

EDIT: Thinking about it, I think the realignment of the MISC fields was done to give TRP one additional bit.
In Elpida at higher straps TRP=029, which is close to overflowing.
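
For reference, a sketch of an RX-series decode that reproduces the numbers in the table above. The bit boundaries are inferred by matching the posted values, not taken from the tool's source, so treat them as an assumption; in particular the width of TRP cannot be pinned down from values this small. The input words are the sixth 4-byte group of the quoted straps, read little-endian.

```python
def field(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) from a 32-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


# ASSUMPTION: RX layout inferred from the posted decode. Compared with the
# pre-RX layout, TRP_WRA is 7 bits wide and TRP_RDA starts at bit 7.
RX_LAYOUT = {
    "TRP_WRA": (6, 0), "TRP_RDA": (12, 7), "TRP": (19, 14), "TRFC": (28, 20),
}


def decode_rx(hex_group: str) -> dict:
    """Decode one 4-byte hex group of a strap with the inferred RX layout."""
    word = int.from_bytes(bytes.fromhex(hex_group), "little")
    return {name: field(word, hi, lo) for name, (hi, lo) in RX_LAYOUT.items()}


print("RX480 Elpida 200000:", decode_rx("46501708"))
print("RX480 Hynix 150000: ", decode_rx("30cb4409"))
print("Samsung strap above:", decode_rx("B08C0509"))

# The R9 390 Elpida 200000 oddity: 70 does not fit in 6 bits.
print("70 truncated to 6 bits:", 70 & 0x3F)
```

With this layout the Samsung strap posted earlier no longer shows a non-zero pad; the bits that decoded as Pad0=2, TRP_RDA=12 come out as TRP_RDA=25 instead.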
doktor83
Hero Member
*****
Offline Offline

Activity: 2548
Merit: 626


View Profile WWW
March 21, 2017, 05:56:33 PM
 #334

yeah, just wanted to ask which one is accurate, ohgod's or niko's MISC_TIMING cause one is 31 bits the other one is 32 Smiley

nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
March 21, 2017, 05:58:18 PM
 #335

I've started doing the detailed analysis on memory timing for Eth mining.

With tRRD=6, tRC=62, tCL=21 and 2000 mem clock, I can get almost 27Mh/s mining eth.  Each hash takes 64 random DAG reads of 128 bytes each, and since they are random, each read should be from a different page.  As well, the L2 cache hit rate should be near 0, so each DAG access requires a read from GDDR (2x32-byte reads from 2 GDDR chips).

Before reading, a page (row) has to be activated (opened), so 27Mh * 64 activates = 1728M activates per second.  The Rx470/480 has 4 independent cache controllers, so a single GDDR5 chip will open 432M pages per second.  With a 2GHz mem clock, that's about 5 (4.63) clocks per activate.  The closer that gets to 4, the better.  Lower than 4 is not possible with Eth mining, since it takes 4 clocks to transfer 64 bytes (half a DAG entry).  Note that if tRRD=6 means 6 clocks, some other timing factor must be allowing the RAM to sustain <5 clocks per activate.
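
The same arithmetic as a short sketch, taking the figures in this post at face value (27Mh/s, 64 reads per hash, 4 controllers, 2GHz). With exactly those inputs the division gives about 4.63 clocks per activate. The `hashrate_at` helper is made up for illustration, and the ceiling it prints only holds under this simple model.

```python
HASHRATE = 27e6          # hashes per second
READS_PER_HASH = 64      # random DAG reads, one activate each
CONTROLLERS = 4          # independent controllers, per the post
MEM_CLOCK_HZ = 2e9       # command clock
MIN_CLOCKS = 4           # clocks needed to move 64 bytes from one chip

activates_per_s = HASHRATE * READS_PER_HASH
per_chip = activates_per_s / CONTROLLERS
clocks_per_activate = MEM_CLOCK_HZ / per_chip


def hashrate_at(clocks: float) -> float:
    """Hashrate implied by a given number of clocks per activate."""
    return MEM_CLOCK_HZ / clocks * CONTROLLERS / READS_PER_HASH


print(f"{activates_per_s / 1e6:.0f}M activates/s, {per_chip / 1e6:.0f}M per chip")
print(f"{clocks_per_activate:.2f} clocks per activate")
print(f"ceiling at {MIN_CLOCKS} clocks: {hashrate_at(MIN_CLOCKS) / 1e6:.2f} MH/s")
```

So at a 2GHz clock, going from the measured rate to the 4-clock floor would be worth roughly 15% more hashrate, if nothing else gets in the way.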

I tried tRRD=5, and it only makes a small (~1%) improvement.  That makes sense, since RRD is the delay between 2 activate commands when they are going to different banks.  With only 16 banks, the memory controller has lots of opportunity to batch activate commands together in the same bank.  However tRC is defined as, "The minimum time interval between two successive ACTIVE commands on the same bank".  With tRC=62, the fastest access pattern would be to spread the accesses across different banks rather than batching them in the same bank.

So it seems I'm missing something about how the RAM timing works.  I know there are multiple clocks for GDDR5, and some run at double data rate (i.e. WCK).  If tRRD=6 means six DDR address clocks, that would be 3 SDR command clocks (2GHz is the command clock rate).



niko2004x
Member
**
Offline Offline

Activity: 126
Merit: 10


View Profile
March 21, 2017, 06:03:14 PM
 #336

yeah, just wanted to ask which one is accurate, ohgod's or niko's MISC_TIMING cause one is 31 bits the other one is 32 Smiley

3 highest bits are unused anyway (so difference between 31 and 32 is irrelevant).
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
March 21, 2017, 06:06:30 PM
 #337

yeah, just wanted to ask which one is accurate, ohgod's or niko's MISC_TIMING cause one is 31 bits the other one is 32 Smiley

I think the linux kernel asic reg headers are misleading.  As far as I can tell the straps are copied into 32-bit registers, and therefore the mask and offset definitions have no functional effect.
Some of the old register names can't even be found in the GDDR5 datasheets.  For example you won't find tR2R in the Hynix datasheet, but you will find tCCDL and tCCDS.  I suspect what the Linux headers refer to as tR2R may actually be tCCDS.
laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
March 21, 2017, 06:44:26 PM
 #338

yeah, just wanted to ask which one is accurate, ohgod's or niko's MISC_TIMING cause one is 31 bits the other one is 32 Smiley

3 highest bits are unused anyway (so difference between 31 and 32 is irrelevant).

And what's the correct structure for MC_SEQ_MISC_TIMING according to your decoding tool for the RX series?

niko2004x
Member
**
Offline Offline

Activity: 126
Merit: 10


View Profile
March 21, 2017, 06:45:22 PM
 #339

yeah, just wanted to ask which one is accurate, ohgod's or niko's MISC_TIMING cause one is 31 bits the other one is 32 Smiley

I think the linux kernel asic reg headers are misleading.  As far as I can tell the straps are copied into 32-bit registers, and therefore the mask and offset definitions have no functional effect.
Some of the old register names can't even be found in the GDDR5 datasheets.  For example you won't find tR2R in the Hynix datasheet, but you will find tCCDL and tCCDS.  I suspect what the Linux headers refer to as tR2R may actually be tCCDS.

Well, you could be right.
But the linked Hynix H5GQ2H24AFR datasheet (a part last seen in the R9 290) is dated 2009, and
the Linux header is more recent (although whether its data is up to date is questionable),
so from my point of view the question is which one is more out of date.
niko2004x
Member
**
Offline Offline

Activity: 126
Merit: 10


View Profile
March 21, 2017, 06:46:01 PM
 #340

yeah, just wanted to ask which one is accurate, ohgod's or niko's MISC_TIMING cause one is 31 bits the other one is 32 Smiley

3 highest bits are unused anyway (so difference between 31 and 32 is irrelevant).

And what's the correct structure for MC_SEQ_MISC_TIMING according to your decoding tool for the RX series?

As stated in atom_rom_timings.py in git.