Bitcoin Forum
April 24, 2024, 10:22:30 AM *
Author Topic: Hacking The KNC Firmware: Overclocking  (Read 144306 times)
fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
October 19, 2013, 12:18:30 PM
Merited by ABCbits (4)
 #1

Since KNC themselves have released no information on how to overclock our units, I have started hacking the firmware myself to achieve that goal.  Here is what I have learned so far.

TL;DR
I have not yet succeeded in discovering how to change the hashing clocks.  It appears possible.
The VRM voltages should be easy to adjust over I2C.  Though I do not recommend it, yet.

initc

cgminer itself adjusts neither the voltage nor the clock.  Instead, there is a program called "initc" that runs on start-up.  The code for this program is not currently available.  This program appears to be responsible for a whole host of tasks.  It runs various self tests on the unit, loads the FPGA's bitstream, checks system configuration, etc.  Based on what I have seen poking around, (this is conjecture until confirmed) it is also responsible for setting up the VRMs, and the per-die PLLs.

Per Board EEPROM
It seems that every ASIC board has an EEPROM on it.  This EEPROM contains a checksum, serial number, PLL configuration data, a failure flag, and I'm not sure what else.  Only about 128 bytes of data or so.  They can be read like so: "cat /sys/bus/i2c/devices/X-0050/eeprom | hexdump -C" where X is 3, 4, 5, 6, 7, 8.  I have a Jupiter, so I have 3,4,5,6.

The checksum is the first 32 bytes.  I don't know what algorithm they're using for checksum.  I would have thought SHA-256, but that didn't seem to match.  Luckily it doesn't matter; we can just mod initc itself to calculate the checksum for us, when the time comes.
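
As a rough illustration of the layout described so far, here is a minimal Python 3 sketch that splits an EEPROM image into the stored checksum and the serial number.  The serial field location (0x20, NUL-padded ASCII) and its 16-byte width are read off the dumps posted below, not confirmed from initc, and the helper names read_eeprom and split_eeprom are made up for this example.

```python
# Sketch only: offsets are inferred from the hexdumps in this thread.
CHECKSUM_LEN = 32      # first 32 bytes hold the stored checksum
SERIAL_OFFSET = 0x20   # assumption: serial follows the checksum
SERIAL_LEN = 16        # assumption: NUL-padded ASCII field


def read_eeprom(bus):
    """Read the raw EEPROM image for I2C bus number `bus` (3..8)."""
    path = "/sys/bus/i2c/devices/%d-0050/eeprom" % bus
    with open(path, "rb") as f:
        return f.read()


def split_eeprom(image):
    """Return (stored_checksum, serial_string) from a raw image."""
    checksum = image[:CHECKSUM_LEN]
    raw_serial = image[SERIAL_OFFSET:SERIAL_OFFSET + SERIAL_LEN]
    serial = raw_serial.rstrip(b"\x00").decode("ascii")
    return checksum, serial


# First 0x30 bytes of the 3-0050 dump posted below.
SAMPLE = bytes.fromhex(
    "75224a271bd826f5fcc50c5c28232863"
    "4d6c019f0055a24c79d97362d3db5127"
    "41313030303030374239413045000000"
)

checksum, serial = split_eeprom(SAMPLE)
print(checksum.hex(), serial)
```

On the miner itself, split_eeprom(read_eeprom(3)) would do the same for the first ASIC board, assuming Python is available there.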

The PLL configuration data is 48 bytes long and is located at 0x60 in the EEPROM.  The first two bytes are the data length (48 bytes); those two bytes are not themselves counted in the length.  There appear to be three 16-bit numbers for each PLL, with the addresses 0x84, 0x85, and 0x86 associated with them.  There are 4 PLLs per ASIC board, since each chip has 4 dies in it.  One of the numbers is always 0xd101.  The others I haven't figured out.  They appear to be completely random, which is odd.  It's entirely possible that they are encrypted somehow, if KNC had intended to restrict our ability to overclock.  If anyone has some insight here, that would be appreciated.
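
To make that structure easier to see, here is a small Python 3 sketch that walks the PLL block of the 3-0050 dump posted below.  It assumes each entry is a 4-byte record of (address, die index, two value bytes) and that the length is big-endian; both are guesses from the dump layout, and the function name parse_pll_block is made up for this example.  The value bytes are kept raw because the byte order is not established.

```python
import struct

# Bytes 0x60..0x91 of the 3-0050 dump: 2-byte length + 48 bytes of records.
PLL_BLOCK = bytes.fromhex(
    "0030"
    "84000df1" "8401c9e7" "8402f914" "84039165"
    "860001d1" "860101d1" "860201d1" "860301d1"
    "8500c2cd" "8501f531" "8502eb66" "85030301"
)


def parse_pll_block(block):
    """Return a list of (address, die, value_bytes) tuples.

    Assumption: big-endian length, then 4-byte records.
    """
    (length,) = struct.unpack(">H", block[:2])
    payload = block[2:2 + length]
    if len(payload) != length or length % 4:
        raise ValueError("unexpected PLL block length")
    records = []
    for off in range(0, length, 4):
        address, die = payload[off], payload[off + 1]
        records.append((address, die, payload[off + 2:off + 4]))
    return records


records = parse_pll_block(PLL_BLOCK)
for address, die, value in records:
    print("addr 0x%02x die %d value bytes %s" % (address, die, value.hex()))
```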

I know that initc loads this data, verifies the size of the PLL configuration data, and checks the checksum, but I haven't yet reverse engineered the parts of the code that would actually use this data ... if it's used at all.  I'm guessing the data is pushed to the ASICs over SPI.

Phone Home
I also discovered that initc contacts http://192.168.100.1:8080/%s?beagleserial=%s.  I don't know under what circumstances it does so, nor all the combinations of strings it'll format that with.  My guess: it is either part of factory set-up or their hosting facility.  Each miner, on bootup, contacts a master server to retrieve pool information and such.  Either this always happens, only happens once, or only happens when some flag is set.  But we won't know for sure until I finish reverse engineering the program.  This part of it isn't my priority though.
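
For anyone unsure how that format string expands, a minimal Python 3 sketch follows.  The two argument values are pure placeholders; I don't know what initc actually passes for the first %s, and the function name phone_home_url is made up for this example.

```python
# Format string as found in initc; the arguments below are placeholders.
PHONE_HOME_FORMAT = "http://192.168.100.1:8080/%s?beagleserial=%s"


def phone_home_url(path, beagle_serial):
    """Expand the format string the same way C printf-style code would."""
    return PHONE_HOME_FORMAT % (path, beagle_serial)


print(phone_home_url("SOME_PATH", "SOME_SERIAL"))
```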

Reverse Engineered Source Code
I've manually de-compiled about 10% of initc so far.  It sure would be nice if KNC just released this code (rolls eyes).  This will have to suffice in the meantime.  I'm hoping initc manipulates the PLL configuration data, so we can gain more insight into its format.  At the very least, it'll be nice to know what this program is doing as a whole.


My EEPROM Dumps
Code:
root@Jupiter-6AB:~# cat /sys/bus/i2c/devices/3-0050/eeprom | hexdump -C
00000000  75 22 4a 27 1b d8 26 f5  fc c5 0c 5c 28 23 28 63  |u"J'..&....\(#(c|
00000010  4d 6c 01 9f 00 55 a2 4c  79 d9 73 62 d3 db 51 27  |Ml...U.Ly.sb..Q'|
00000020  41 31 30 30 30 30 30 37  42 39 41 30 45 00 00 00  |A1000007B9A0E...|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000050  04 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000060  00 30 84 00 0d f1 84 01  c9 e7 84 02 f9 14 84 03  |.0..............|
00000070  91 65 86 00 01 d1 86 01  01 d1 86 02 01 d1 86 03  |.e..............|
00000080  01 d1 85 00 c2 cd 85 01  f5 31 85 02 eb 66 85 03  |.........1...f..|
00000090  03 01 aa 28 91 14 a0 64  12 16 f1 e5 a6 5d 1b a5  |...(...d.....]..|
000000a0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*

root@Jupiter-6AB:~# cat /sys/bus/i2c/devices/4-0050/eeprom | hexdump -C
00000000  5d 90 81 61 21 96 61 2f  79 aa da 0e b0 c2 97 1b  |]..a!.a/y.......|
00000010  96 a4 82 86 a3 eb cb 11  68 87 a0 6c a1 c6 ab d6  |........h..l....|
00000020  41 31 30 30 30 30 30 38  30 30 31 36 33 00 00 00  |A100000800163...|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000050  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000060  00 30 84 00 94 50 84 01  82 5c 84 02 29 a8 84 03  |.0...P...\..)...|
00000070  ab 7b 86 00 01 d1 86 01  01 d1 86 02 01 d1 86 03  |.{..............|
00000080  01 d1 85 00 e6 d0 85 01  57 f8 85 02 60 5b 85 03  |........W...`[..|
00000090  b0 48 a7 af 63 41 fd 14  5e 7c f5 95 66 47 59 d0  |.H..cA..^|..fGY.|
000000a0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*


root@Jupiter-6AB:~# cat /sys/bus/i2c/devices/5-0050/eeprom | hexdump -C
00000000  5b 59 fc 0f 94 ab a3 d7  d8 7c 91 6b a5 29 d6 51  |[Y.......|.k.).Q|
00000010  4a 0f d6 6b e6 35 5e a5  a4 59 04 5f f1 26 1e 19  |J..k.5^..Y._.&..|
00000020  41 31 30 30 30 30 30 37  42 42 45 43 46 00 00 00  |A1000007BBECF...|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 01 00 00  |................|
00000050  08 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000060  00 30 84 00 56 87 84 01  74 fa 84 02 c0 34 84 03  |.0..V...t....4..|
00000070  8f 7a 86 00 01 d1 86 01  01 d1 86 02 01 d1 86 03  |.z..............|
00000080  01 d1 85 00 db 11 85 01  b7 d7 85 02 3c ce 85 03  |............<...|
00000090  3d 1c e7 e3 75 f7 fe 65  21 ea 89 91 25 3a e7 ed  |=...u..e!...%:..|
000000a0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*


root@Jupiter-6AB:~# cat /sys/bus/i2c/devices/6-0050/eeprom | hexdump -C
00000000  7d 1c cb 21 6e a1 4a ad  65 29 bf af 12 de a3 cd  |}..!n.J.e)......|
00000010  76 93 dc e7 0f 8c 28 33  5f 49 b2 86 81 b4 19 d4  |v.....(3_I......|
00000020  41 31 30 30 30 30 30 37  42 41 35 38 41 00 00 00  |A1000007BA58A...|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 04 10 00  |................|
00000050  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000060  00 30 84 00 f3 49 84 01  cb 1e 84 02 d9 dc 84 03  |.0...I..........|
00000070  b1 58 86 00 01 d1 86 01  01 d1 86 02 01 d1 86 03  |.X..............|
00000080  01 d1 85 00 cd 76 85 01  bd b6 85 02 fc f6 85 03  |.....v..........|
00000090  e9 05 cb d2 3f 91 df 4e  ac 30 d4 32 25 65 26 e7  |....?..N.0.2%e&.|
000000a0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

If anyone wants to dump theirs as well, I would be interested in comparing the PLL configuration data.

There's also EEPROMs at 0-0050 and 1-0054.  One of those is an EEPROM on the Beagle board, I believe, and just contains the Beagle's serial number.  The other one I believe is on the IO board.

sickpig
Legendary
*
Offline Offline

Activity: 1260
Merit: 1008


View Profile
October 19, 2013, 12:59:10 PM
 #2

Thanks really appreciated, keep up the good work!

Bitcoin is a participatory system which ought to respect the right of self determinism of all of its users - Gregory Maxwell.
Zeek_W
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
October 19, 2013, 01:01:29 PM
 #3

That phone home address seems local to me. Correct me if wrong.
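
A quick way to check this is Python 3's standard ipaddress module.  This is only a sanity check on the address itself; it says nothing about what KNC uses it for.

```python
import ipaddress

addr = ipaddress.ip_address("192.168.100.1")

# 192.168.0.0/16 is one of the private (RFC 1918) ranges.
in_private_block = addr in ipaddress.ip_network("192.168.0.0/16")

print(addr.is_private, in_private_block)
```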

sickpig
Legendary
*
Offline Offline

Activity: 1260
Merit: 1008


View Profile
October 19, 2013, 01:15:42 PM
 #4

Quote
That phone home address seems local to me. Correct me if wrong.

You are right, but as he said, it could be used in KNC's hosting facility to manage firmware updates and admin tasks.

Bitcoin is a participatory system which ought to respect the right of self determinism of all of its users - Gregory Maxwell.
joeventura
Hero Member
*****
Offline Offline

Activity: 854
Merit: 500



View Profile
October 19, 2013, 01:18:20 PM
 #5

Quote
That phone home address seems local to me. Correct me if wrong.

Yes, the URL seems to indicate that the Beaglebone board serial number is the unique identifier.

So they create an internal website and (speculating here) when Zeek orders a miner,
they put his unique config on that webserver at:

 http://192.168.100.1:8080/1813BBBK3065   <<typical BB serial number format

and when it is built, they plug it into Ethernet, power it up, and it programs Zeek's info from his profile and off it goes.

ALL SPECULATION, but that is how I would do it.
fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
October 19, 2013, 01:49:06 PM
 #6

Someone was kind enough to share their EEPROM dumps with me.  The basic structure for the PLL data is the same; 4 sets of 3 values.  The values for "0x86" are still always 0xd101.  The other values, though, continue to appear random.

So, at least the results are consistent.  I just wish I knew why the values appear random.  That's pretty strange for what should be PLL parameters.  Either I'm totally off-mark and this isn't PLL configuration data, the data is encrypted, or I'm missing something...

bobsag3
Hero Member
*****
Offline Offline

Activity: 546
Merit: 500

Owner, Minersource.net


View Profile
October 19, 2013, 04:28:53 PM
 #7

Quote
Someone was kind enough to share their EEPROM dumps with me.  The basic structure for the PLL data is the same; 4 sets of 3 values.  The values for "0x86" are still always 0xd101.  The other values, though, continue to appear random.

So, at least the results are consistent.  I just wish I knew why the values appear random.  That's pretty strange for what should be PLL parameters.  Either I'm totally off-mark and this isn't PLL configuration data, the data is encrypted, or I'm missing something...

If you want to walk me through what you need, I have 1 Jup on .95 and I will have 3 more at my facility this week.
thomashrev89
Full Member
***
Offline Offline

Activity: 190
Merit: 100



View Profile
October 19, 2013, 05:36:58 PM
 #8

Quote
Someone was kind enough to share their EEPROM dumps with me.  The basic structure for the PLL data is the same; 4 sets of 3 values.  The values for "0x86" are still always 0xd101.  The other values, though, continue to appear random.

So, at least the results are consistent.  I just wish I knew why the values appear random.  That's pretty strange for what should be PLL parameters.  Either I'm totally off-mark and this isn't PLL configuration data, the data is encrypted, or I'm missing something...
Quote
If you want to walk me through what you need, I have 1 Jup on .95 and I will have 3 more at my facility this week.

If you set up a basic guide on how to do it, I'll gladly share whatever you need.

fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
October 20, 2013, 08:25:08 AM
 #9

EEPROM Checksum Algorithm
I finally reverse engineered the algorithm KNC is using to checksum the ASIC board EEPROMs.  I had initially thought it was just SHA-256, but a quick try with that failed.  After digging into the code my original hypothesis proved correct, it's SHA-256, but KNC has ... taken some liberties with SHA-256.  It's pretty confusing code.  First, the checksum function is completely broken unless the data being checksummed is a multiple of 64 bytes (the data in the EEPROMs is).  Second, they use SHA-256 without any padding.  Third, they use SHA-256 with their own KNC flavored initial state.  Their initial state is the ASCII representation of "KnCMiner placeholder SHA256 init".  I've recreated the important bits of the algorithm in Python:

Code:
# NOTE: this is Python 2 code (it uses xrange, str.decode('hex') and the print statement).
# SHA-256 implementation copied from pypy/lib_pypy/_sha256.py, which is licensed under the MIT license:
# https://github.com/pypy/pypy/blob/master/lib_pypy/_sha256.py
# https://github.com/pypy/pypy/blob/master/LICENSE
import struct

SHA_BLOCKSIZE = 64
SHA_DIGESTSIZE = 32


def new_shaobject():
    return {
        'digest': [0]*8,
        'count_lo': 0,
        'count_hi': 0,
        'data': [0]* SHA_BLOCKSIZE,
        'local': 0,
        'digestsize': 0
    }


ROR = lambda x, y: (((x & 0xffffffff) >> (y & 31)) | (x << (32 - (y & 31)))) & 0xffffffff
Ch = lambda x, y, z: (z ^ (x & (y ^ z)))
Maj = lambda x, y, z: (((x | y) & z) | (x & y))
S = lambda x, n: ROR(x, n)
R = lambda x, n: (x & 0xffffffff) >> n
Sigma0 = lambda x: (S(x, 2) ^ S(x, 13) ^ S(x, 22))
Sigma1 = lambda x: (S(x, 6) ^ S(x, 11) ^ S(x, 25))
Gamma0 = lambda x: (S(x, 7) ^ S(x, 18) ^ R(x, 3))
Gamma1 = lambda x: (S(x, 17) ^ S(x, 19) ^ R(x, 10))


def sha_transform(sha_info):
    W = []
   
    d = sha_info['data']
    for i in xrange(0,16):
        W.append( (d[4*i]<<24) + (d[4*i+1]<<16) + (d[4*i+2]<<8) + d[4*i+3])
   
    for i in xrange(16,64):
        W.append( (Gamma1(W[i - 2]) + W[i - 7] + Gamma0(W[i - 15]) + W[i - 16]) & 0xffffffff )
   
    ss = sha_info['digest'][:]
   
    def RND(a,b,c,d,e,f,g,h,i,ki):
        t0 = h + Sigma1(e) + Ch(e, f, g) + ki + W[i];
        t1 = Sigma0(a) + Maj(a, b, c);
        d += t0;
        h  = t0 + t1;
        return d & 0xffffffff, h & 0xffffffff
   
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],0,0x428a2f98);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],1,0x71374491);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],2,0xb5c0fbcf);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],3,0xe9b5dba5);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],4,0x3956c25b);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],5,0x59f111f1);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],6,0x923f82a4);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],7,0xab1c5ed5);
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],8,0xd807aa98);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],9,0x12835b01);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],10,0x243185be);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],11,0x550c7dc3);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],12,0x72be5d74);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],13,0x80deb1fe);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],14,0x9bdc06a7);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],15,0xc19bf174);
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],16,0xe49b69c1);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],17,0xefbe4786);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],18,0x0fc19dc6);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],19,0x240ca1cc);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],20,0x2de92c6f);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],21,0x4a7484aa);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],22,0x5cb0a9dc);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],23,0x76f988da);
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],24,0x983e5152);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],25,0xa831c66d);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],26,0xb00327c8);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],27,0xbf597fc7);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],28,0xc6e00bf3);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],29,0xd5a79147);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],30,0x06ca6351);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],31,0x14292967);
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],32,0x27b70a85);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],33,0x2e1b2138);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],34,0x4d2c6dfc);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],35,0x53380d13);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],36,0x650a7354);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],37,0x766a0abb);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],38,0x81c2c92e);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],39,0x92722c85);
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],40,0xa2bfe8a1);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],41,0xa81a664b);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],42,0xc24b8b70);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],43,0xc76c51a3);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],44,0xd192e819);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],45,0xd6990624);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],46,0xf40e3585);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],47,0x106aa070);
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],48,0x19a4c116);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],49,0x1e376c08);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],50,0x2748774c);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],51,0x34b0bcb5);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],52,0x391c0cb3);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],53,0x4ed8aa4a);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],54,0x5b9cca4f);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],55,0x682e6ff3);
    ss[3], ss[7] = RND(ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],56,0x748f82ee);
    ss[2], ss[6] = RND(ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],57,0x78a5636f);
    ss[1], ss[5] = RND(ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],ss[5],58,0x84c87814);
    ss[0], ss[4] = RND(ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],ss[4],59,0x8cc70208);
    ss[7], ss[3] = RND(ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],ss[3],60,0x90befffa);
    ss[6], ss[2] = RND(ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],ss[2],61,0xa4506ceb);
    ss[5], ss[1] = RND(ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],ss[1],62,0xbef9a3f7);
    ss[4], ss[0] = RND(ss[1],ss[2],ss[3],ss[4],ss[5],ss[6],ss[7],ss[0],63,0xc67178f2);
   
    dig = []
    for i, x in enumerate(sha_info['digest']):
        dig.append( (x + ss[i]) & 0xffffffff )
    sha_info['digest'] = dig


whole_data = "4131303030303037424245434600000000000000000000000000000000000000000000000000000000000000000100000800000000000000ffffffffffffffff003084005687840174fa8402c03484038f7a860001d1860101d1860201d1860301d18500db118501b7d785023cce85033d1ce7e375f7fe6521ea8991253ae7ed".decode ('hex')

sha_info = new_shaobject ()
sha_info['digest'] = [0x4b6e434d, 0x696e6572, 0x20706c61, 0x6365686f, 0x6c646572, 0x20534841, 0x32353620, 0x696e6974]
sha_info['data'] = [ord (whole_data[i]) for i in range (0, 64)]
sha_transform (sha_info)
sha_info['data'] = [ord (whole_data[i]) for i in range (64, 128)]
sha_transform (sha_info)

for i in range (8):
    print "%08X" % sha_info['digest'][i]

whole_data in the code above is a copy of the data off one of my ASIC board EEPROMs (without the first 32 bytes, which is the stored checksum).  The code prints the calculated checksum, which matched the EEPROM I tested.
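
For anyone on Python 3, here is a compact port of the same idea: plain SHA-256 compression, no padding, seeded with the ASCII string instead of the standard initial state.  It is a sketch written from the description above, not a copy of KNC's code; the names compress and knc_checksum are made up, and the byte order used to serialize the final eight words (big-endian here) is an assumption.

```python
import struct

MASK = 0xffffffff
K = [
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
    0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
    0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc,
    0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7,
    0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13,
    0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3,
    0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5,
    0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208,
    0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
]
# The 32-byte ASCII string used in place of the standard initial state.
KNC_IV = struct.unpack(">8I", b"KnCMiner placeholder SHA256 init")


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    """One SHA-256 compression round over a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def knc_checksum(data):
    """Unpadded SHA-256 with the KnC initial state; data must be N*64 bytes."""
    if len(data) == 0 or len(data) % 64:
        raise ValueError("data length must be a non-zero multiple of 64")
    state = list(KNC_IV)
    for off in range(0, len(data), 64):
        state = compress(state, data[off:off + 64])
    return struct.pack(">8I", *state)
```

Usage would be knc_checksum(image[32:160]) on a raw EEPROM image, compared against image[:32].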

So, yay, but it really makes me wonder what the KNC developers were doing.  I can understand not doing the padding and only working on multiples of 64 bytes; the devs probably had higher priorities.  I just don't understand the initial state thing.  The real SHA-256 initial state values exist within initc.  It's just that the checksum function overwrites them.  Wonky.


Progress Update
We can use this to update our EEPROMs and have initc accept the checksum.  I still don't know how to make sense of the PLL configuration data, but some progress is better than none.  I also manually decompiled the write_eeprom function of initc.  It's nothing special though, I can craft modified EEPROM data by hand easily and just use the command line to write it.
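
Here is a rough Python 3 sketch of what "crafting modified EEPROM data" could look like for a single PLL value.  It assumes the PLL block is a big-endian length followed by 4-byte records of (address, die, two value bytes), which is a guess from the dumps; the function name patch_pll_value is made up.  It only edits bytes in memory: the checksum in the first 32 bytes would still have to be recomputed, and writing the result back is not shown.

```python
PLL_OFFSET = 0x60  # from the first post: PLL data starts at 0x60


def patch_pll_value(image, address, die, value):
    """Return a copy of `image` with the value bytes of one record replaced."""
    if len(value) != 2:
        raise ValueError("value must be exactly two bytes")
    out = bytearray(image)
    length = int.from_bytes(out[PLL_OFFSET:PLL_OFFSET + 2], "big")
    start = PLL_OFFSET + 2
    for off in range(start, start + length, 4):
        if out[off] == address and out[off + 1] == die:
            out[off + 2:off + 4] = value
            return bytes(out)
    raise KeyError("no record for address 0x%02x, die %d" % (address, die))


# Fake image: zeroed header area plus the PLL block from the 3-0050 dump.
SAMPLE_IMAGE = bytes(0x60) + bytes.fromhex(
    "0030"
    "84000df1" "8401c9e7" "8402f914" "84039165"
    "860001d1" "860101d1" "860201d1" "860301d1"
    "8500c2cd" "8501f531" "8502eb66" "85030301"
)

patched = patch_pll_value(SAMPLE_IMAGE, 0x86, 1, b"\x01\x68")
print(patched[0x76:0x7a].hex())
```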

Regarding the PLL configuration data: when write_eeprom is building a new EEPROM image, it first fills the buffer with random data (calling rand()).  This explains why the padding on all the EEPROMs was garbage data.  I'm not sure why they do it, but it leads me to believe that KNC's developers may have a habit of filling unused data with rand().  Perhaps the PLL 0x84 and 0x85 registers are unused in the current ASIC revision?  Maybe they were used by their FPGA prototype(s).  Most importantly, perhaps only 0x86 is important, and is simply a multiplier value (or two 8-bit values, Multiplier and Divider).  I don't know what the chip's input clock frequency is, so it's hard to make an educated guess about how to interpret the current setting.

That theory could be tested by adjusting the 0x86 registers of the PLL data and ... seeing what happens.  Though if the theory is wrong, doing so could potentially set the chips ablaze.  I don't personally want to try that (yet) ... if anyone with bigger balls than I wants to do so, feel free to try...

I'll continue decompiling initc to see if there are more hints as to the meaning of the PLL data.

fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
October 20, 2013, 12:02:23 PM
 #10

Well I bit the bullet and tweaked the 0x86 register on one die.  I tried a few different values, 0x0168, 0x0068, and 0x01ff.  initc did not seem to throw any errors, and everything hashed like normal.  I did not notice a difference in power consumption on any of the settings, so I can only assume the hash clock was not affected.

So ... there goes that theory.

FeedbackLoop
Hero Member
*****
Offline Offline

Activity: 742
Merit: 500



View Profile
October 20, 2013, 12:39:50 PM
 #11


Thanks for the effort Fpgaminer!

Are there strong indications that the clock is controllable in software (like supplied by the beagle board) and not just some IC in each board with a fixed frequency?
ur0pl
Newbie
*
Offline Offline

Activity: 56
Merit: 0


View Profile
October 20, 2013, 09:04:47 PM
 #12

How can I keep running asic_test until I get the highest number of good cores, and then put that information into the EEPROM so I get the maximum number of cores?
bondus
Newbie
*
Offline Offline

Activity: 13
Merit: 0


View Profile
October 22, 2013, 09:47:03 AM
 #13

Quote
Well I bit the bullet and tweaked the 0x86 register on one die.  I tried a few different values, 0x0168, 0x0068, and 0x01ff.  initc did not seem to throw any errors, and everything hashed like normal.  I did not notice a difference in power consumption on any of the settings, so I can only assume the hash clock was not affected.

So ... there goes that theory.

Awesome job figuring out that checksum algorithm.

Take a look in /etc/init.d/cgminer.sh and you will have your answer to why nothing changed.
It is writing to the same SPI PLL registers by hand in there just before starting cgminer, not looking in the eeprom.

for p in $good_ports ; do
    # Re-enable PLL
    i2cset -y 2 0x71 1 $((p+1))
    for c in 0 1 2 3 ; do
        cmd=$(printf "0x84,0x%02X,0,0" $c)
        spi-test -s 50000 -OHC -D /dev/spidev1.0 $cmd >/dev/null
        cmd=$(printf "0x86,0x%02X,0x01,0xD1" $c)
        spi-test -s 50000 -OHC -D /dev/spidev1.0 $cmd >/dev/null
        cmd=$(printf "0x85,0x%02X,0,0" $c)
        spi-test -s 50000 -OHC -D /dev/spidev1.0 $cmd >/dev/null
    done

The table for disabled cores is also stored in the eeprom at offset 0x4c and 192 bits forward.
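
To spell out what that loop sends, here is a small Python 3 sketch that builds the same comma-separated strings the script passes to spi-test, in the same order (0x84, then 0x86, then 0x85 for each die).  It only builds strings and does not touch the hardware; the function name pll_spi_payloads and its parameters are made up for this example.

```python
def pll_spi_payloads(value_hi=0x01, value_lo=0xD1, dies=(0, 1, 2, 3)):
    """Mirror the printf calls in the cgminer.sh excerpt, one die at a time."""
    payloads = []
    for die in dies:
        payloads.append("0x84,0x%02X,0,0" % die)
        payloads.append("0x86,0x%02X,0x%02X,0x%02X" % (die, value_hi, value_lo))
        payloads.append("0x85,0x%02X,0,0" % die)
    return payloads


payloads = pll_spi_payloads()
for payload in payloads:
    print(payload)
```

Changing value_hi and value_lo shows which two bytes in the script correspond to the 0x86 setting.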


fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
October 24, 2013, 09:28:47 PM
 #14

Quote
Take a look in /etc/init.d/cgminer.sh and you will have your answer to why nothing changed.
Awesome, thanks for the heads up.  I knew I should have done a more thorough grep of the firmware...

I'll take a look, and maybe I'll *gulp* tweak those values.

fpgaminer (OP)
Hero Member
*****
Offline Offline

Activity: 560
Merit: 517



View Profile WWW
October 24, 2013, 10:23:43 PM
 #15

Okay, I tried tweaking the cgminer.sh script to set up my board with a PLL setting of 0x01,0x68.  Power consumption did not change.  I also tried 0x00,0x68; again no change.  This was tested with a restart of the service.  I haven't tried a reboot (would need to modify the BBB's flash or something).  It's possible the PLL settings can be set only once per power-cycle, but that'd be a bit out of the ordinary.

Ytterbium
Full Member
***
Offline Offline

Activity: 238
Merit: 100



View Profile WWW
October 30, 2013, 11:56:17 AM
 #16

Maybe they threw in a bunch of totally random values just to mess with people trying to tweak their hardware.

Maybe you should try probing the data that comes over the ribbon cables.  I would imagine the clockrate (or some analog) would have to be transferred over at some point.

1l1l11ll1l
Legendary
*
Offline Offline

Activity: 1274
Merit: 1000


View Profile WWW
November 08, 2013, 05:33:56 PM
 #17

Any Updates?

dlasher
Sr. Member
****
Offline Offline

Activity: 467
Merit: 250



View Profile WWW
November 08, 2013, 06:18:09 PM
 #18

Thanks for the digging so far..appreciate the share.

Quote
I have not yet succeeded in discovering how to change the hashing clocks.  It appears possible.
The VRM voltages should be easy to adjust over I2C.  Though I do not recommend it, yet.

They are easy to check/set, but like you, I wouldn't recommend it. It can be a great way to try to keep your temps in the 'magic zone', but you need to watch the VRM amp loads and temps very carefully.

The values are an offset from a 'full' voltage value, which is 0.950 volts. The higher the value, the less is subtracted from 0.950, with 0xFFFF meaning ZERO subtracted. Older firmware versions (like 0.91) came with much higher volts, and the current firmware (0.98) sets 0xff67, which works out to about 0.797 volts. But even that value appears to be a 'suggestion', since each core on a die will be some variance off THAT number.
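
A worked version of that arithmetic in Python 3, as a sketch.  The 1 mV per count step is an assumption chosen because it lands close to the quoted figure; the real step size of the VRM is not stated here, and the function name offset_word_to_volts is made up.

```python
FULL_VOLTS = 0.950       # 'full' voltage from the post
VOLTS_PER_COUNT = 0.001  # assumption: 1 mV per count


def offset_word_to_volts(word):
    """0xFFFF means nothing subtracted; lower words subtract more."""
    counts_below_full = 0xFFFF - word
    return round(FULL_VOLTS - counts_below_full * VOLTS_PER_COUNT, 3)


for word in (0xFFFF, 0xFF67):
    print("0x%04x -> %.3f V" % (word, offset_word_to_volts(word)))
```

With these assumptions 0xff67 comes out at 0.798 V, within a millivolt of the 0.797 V quoted above.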
tolip_wen
Sr. Member
****
Offline Offline

Activity: 386
Merit: 250


View Profile
December 04, 2013, 11:47:41 AM
Last edit: December 04, 2013, 10:06:30 PM by tolip_wen
 #19

Quote
Well I bit the bullet and tweaked the 0x86 register on one die.  I tried a few different values, 0x0168, 0x0068, and 0x01ff.  initc did not seem to throw any errors, and everything hashed like normal.  I did not notice a difference in power consumption on any of the settings, so I can only assume the hash clock was not affected.

So ... there goes that theory.
Quote
Awesome job figuring out that checksum algorithm.

Take a look in /etc/init.d/cgminer.sh and you will have your answer to why nothing changed.
It is writing to the same SPI PLL registers by hand in there just before starting cgminer, not looking in the eeprom.

for p in $good_ports ; do
    # Re-enable PLL
    i2cset -y 2 0x71 1 $((p+1))
    for c in 0 1 2 3 ; do
        cmd=$(printf "0x84,0x%02X,0,0" $c)
        spi-test -s 50000 -OHC -D /dev/spidev1.0 $cmd >/dev/null
        cmd=$(printf "0x86,0x%02X,0x01,0xD1" $c)
        spi-test -s 50000 -OHC -D /dev/spidev1.0 $cmd >/dev/null
        cmd=$(printf "0x85,0x%02X,0,0" $c)
        spi-test -s 50000 -OHC -D /dev/spidev1.0 $cmd >/dev/null
    done

The table for disabled cores is also stored in the eeprom at offset 0x4c and 192 bits forward.




Thx for the huge hint!

I can confirm it works.
150 Gh/s per module on Oct 4 VRM modules.
I have 2 Saturn doing a reliable 600 at the pool and 720 at the wall.

Along the way I managed to put to sleep and awaken individual die.
I think this is the way to address the dead die issue.
I offered to discuss my findings with a KnCMiner engineer, no reply.
My guess/observation is it is a low priority for KnC to fix the dead die issue.
Sad if true.
The fix seems simple, monitordcdc knows, try different clocks when dead die detected.
I have tested many low current working solutions at stock speeds.

One problem I ran into is that every ASIC is different and the .sh treats them all the same.
One Saturn was easy, and on the other I had to do a lot of hunting (not recommended).

My next step will be unrolling the aforementioned program loop and trying each module separately, and possibly each die separately, for maximum yield.

Trying any of this will VOID YOUR WARRANTY.
You might kill your miner AND have no warranty!
You have been warned.

I have an educated guess on the field width of the variables.

I treat them as 4 bit fields with some success.
If the middle one is 5 or 6 bits wide more finesse is possible.

The lowest 2-4 bits seem to control the output and/or loop divider.
Larger numbers more division (behaves like divider)
Has a major impact, compensate with what I am calling input/reference divider.

Middle nibble seems like 4 bits of loop control.
There may be more lower bits to this field, unconfirmed.
Larger is faster.
This is the one to fiddle with once the others are close.

Highest nibble 3-4 bits of what I call input/reference divider.
Larger numbers less division (behaves like add or multiply)

Start with the middle one only and watch the current!!!
Be ready to bail if needed.
Just stop modified sh and start unmodified one.
(you did modify the COPY of the file right?)

I might have the fields backwards but I get predictable results.
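
A minimal Python 3 sketch of the field split described above, treating the low 12 bits as three 4-bit fields.  The field names are my own labels for this example, the widths are the guesses stated above, and the order may be backwards, as already noted.

```python
def split_pll_fields(value):
    """Split a 16-bit PLL word into three guessed 4-bit fields."""
    return {
        "ref_div": (value >> 8) & 0xF,   # "highest nibble", guessed input/reference divider
        "loop": (value >> 4) & 0xF,      # "middle nibble", guessed loop control
        "out_div": value & 0xF,          # "lowest" bits, guessed output/loop divider
    }


def join_pll_fields(ref_div, loop, out_div):
    """Inverse of split_pll_fields for the low 12 bits."""
    return ((ref_div & 0xF) << 8) | ((loop & 0xF) << 4) | (out_div & 0xF)


stock = split_pll_fields(0x01D1)  # stock value written by cgminer.sh
print(stock)
```

Bumping only the middle field would then be join_pll_fields(stock["ref_div"], stock["loop"] + 1, stock["out_div"]).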

This is a dangerous game to play with an expensive ASIC.
You will VOID YOUR WARRANTY and might kill your machine.

Please refrain from simplifying the process for the masses, yet.
The edits are easy, the possibility of meltdown is real.
Further testing is needed.

Don't even think about editing this stuff on a Nov unit with 8 VRMs.
There is easily enough power available to cook your ASIC.
It is also not the same file to edit.
Think twice about it if you have an Oct. unit with 8 VRMs.
There is easily enough power available to cook your ASIC.

I use 47 A and 37 W as the max per die for 150 Gh/s per module.
Under 45 A and 35 W max with stock 144 per module speed.
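
A quick sanity check of the figures in this post, as a Python 3 sketch.  It assumes "600 at the pool" is GH/s and "720 at the wall" is watts, takes 4 dies per module from the first post, and assumes the per-die amps and watts are measured at the same point; none of that is confirmed.

```python
DIES_PER_MODULE = 4     # first post: each chip has 4 dies
GHS_PER_MODULE = 150
TOTAL_GHS = 600         # assumption: pool figure is GH/s
WALL_WATTS = 720        # assumption: wall figure is watts
MAX_WATTS_PER_DIE = 37
MAX_AMPS_PER_DIE = 47

modules = TOTAL_GHS // GHS_PER_MODULE
die_power_ceiling = modules * DIES_PER_MODULE * MAX_WATTS_PER_DIE
watts_per_ghs = WALL_WATTS / TOTAL_GHS
implied_core_volts = MAX_WATTS_PER_DIE / MAX_AMPS_PER_DIE

print(modules, die_power_ceiling, watts_per_ghs, round(implied_core_volts, 3))
```

The implied core voltage from 37 W at 47 A is a bit under 0.8 V, which is in the same range as the VRM setting discussed earlier in the thread.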

The main use I have for the tuning suite is feedback.
The SPI clock @ 200k is plenty. Higher is fun but not very effective.
I looked with a scope, there is room to spare @ 200k.
I leave everything default until I get the ASIC clock happy.
Once the clock is happy the rest makes little difference and I only tune V for lower power.
VCO current is IMneverHO more important than all else.
It is significant, an Idle die uses ~15 A, much of that is the clock.
If you see a die with 0 A suspect a stalled clock.
There are even ways to occasionally unstall a stalled clock.
Make a small edit to the SPI clock and hit the apply button.

For the 'hardware' tweakers, I speculate 2 loop caps per die on bottom of board.
Next diff increase or so and I may take the iron and find out.

I spent a lot to find and document the details and pass them on, downtime is money.
You can help fund this research if interested.
Just point your miner to one of my workers for a few minutes.

For BTCGuild worker = tolip_ZsearchFund (PPS, best for short donations)
For GHash.io  worker = tolip.anything (replace anything with your name if you like)

You can also use my reseller ID if you purchase a Neptune.

https://www.kncminer.com/?resellerid=206

Enjoy :)

'twisted research and opinion' donations happily accepted @
13362fxFAdrhagmCvSmFy4WoHrNRPG2V57
My sub 1337 vanity address ;)
volosator
Sr. Member
****
Offline Offline

Activity: 272
Merit: 250



View Profile
December 05, 2013, 05:21:02 PM
 #20

What is maximum stable result you have?