I've never seen anything like it, and I have no idea how it could happen.
Here's the parsing code:
job_id = json_array_string(val, 0);
prev_hash = json_array_string(val, 1);
coinbase1 = json_array_string(val, 2);
coinbase2 = json_array_string(val, 3);
bbversion = json_array_string(val, 5);
nbit = json_array_string(val, 6);
ntime = json_array_string(val, 7);
clean = json_is_true(json_array_get(val, 8));
and here's the submit generation code:
work->ntime = strdup(pool->swork.ntime);
sprintf(s, "{\"params\": [\"%s\", \"%s\", \"%s\", \"%s\", \"%s\"], \"id\": %d, \"method\": \"mining.submit\"}",
pool->rpc_user, work->job_id, work->nonce2, work->ntime, noncehex, sshare->id);
Pretty straightforward, so your guess is as good as mine. It could be memory corruption in the miner caused by scrypt's weird and wonderful use of memory, but you're describing a simple swap, which doesn't look like memory corruption. Could you track the message the pool sent whenever the miner returns something odd, to make sure it went out OK? This is the first time this sort of bug has been reported, and other scrypt pools have been supporting stratum for a while.
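One way to narrow this down on the miner side would be to sanity-check the fields just before the sprintf shown earlier and log whenever one of them looks wrong. This is only a rough sketch: is_hex_of_len and submit_fields_sane are hypothetical helpers, not part of cgminer, and the extranonce2 size has to come from whatever the pool announced at subscribe time.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of cgminer): true if s is exactly
 * len characters long and every character is a hex digit. */
static bool is_hex_of_len(const char *s, size_t len)
{
    assert(s != NULL);
    if (strlen(s) != len)
        return false;
    for (size_t i = 0; i < len; i++) {
        if (!isxdigit((unsigned char)s[i]))
            return false;
    }
    return true;
}

/* Hypothetical check of the strings that go into mining.submit.
 * n2size is the extranonce2 size in bytes as announced by the pool.
 * ntime and the nonce are 4 bytes each, i.e. 8 hex digits. */
static bool submit_fields_sane(const char *job_id, const char *nonce2,
                               size_t n2size, const char *ntime,
                               const char *noncehex)
{
    return job_id[0] != '\0' &&
           is_hex_of_len(nonce2, n2size * 2) &&
           is_hex_of_len(ntime, 8) &&
           is_hex_of_len(noncehex, 8);
}
```

A check like this catches empty or malformed fields, but not a swap between two fields of the same length (ntime and a 4-byte extranonce2 are both 8 hex digits), so it would only help with part of the picture.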
It is not only swapped values. The job id, extranonce2 and ntime can be corrupted in many ways (non-hex characters, empty strings, even " characters).
I didn't observe any swaps involving the nonce. This field seems unaffected.
This issue occurs in neither the BTC nor the TRC pool. TRC mining doesn't use scrypt.
Yes, I can track all mining.notify messages. Here is an example conversation with client 12:
1. Pool sends mining.notify:
[2013-02-03 11:26:46.688] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 12, Outbound message: {"params": ["3", "xxx", "xxx", "xxx", [], "00000001", "1c0d849c", "510e4975", false], "id": null, "method": "mining.notify"}
2. Miner responds (the first response is corrupted):
[2013-02-03 11:26:47.546] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 12, Inbound message: {"params": ["xxx.elke", "", "", "", "d2af4400"], "id": 10097, "method": "mining.submit"}
[2013-02-03 11:26:54.332] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 12, Inbound message: {"params": ["xxx.elke", "3", "08000000", "510e4975", "0f1b0e00"], "id": 10098, "method": "mining.submit"}
[2013-02-03 11:26:56.033] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 12, Inbound message: {"params": ["xxx.elke", "3", "08000000", "510e4975", "ad8d1100"], "id": 10099, "method": "mining.submit"}
There are 17 miners from 8 different users currently mining at the LTC stratum pool. All are using cgminer 2.10.4. Only 5 miners (out of 17) don't send these corrupted submissions. The rest have 2%-6% corrupted submissions. If you want, I can provide a larger sample from my log file. Here are more examples:
[2013-02-03 11:27:00.619] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 14, Inbound message: {"params": ["yyy.c1l", "3", "510e4975", "10000000", "d9270a00"], "id": 6356, "method": "mining.submit"}
[2013-02-03 11:27:00.619] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 14, Outbound message: {"id": 6356, "result": null, "error": [28, "Ntime out of range", null]}
[2013-02-03 11:27:17.124] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 1, Inbound message: {"params": ["zzz.docointron", "4", "F000000", "510e4993", "92f0ad00"], "id": 1274, "method": "mining.submit"}
[2013-02-03 11:27:17.124] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 1, Outbound message: {"id": 1274, "result": null, "error": [26, "Incorrect Extranonce2 size", null]}
[2013-02-03 11:27:48.558] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 11, Inbound message: {"params": ["bbb.crunch2", "510e49b1", "5", "10000000", "9a3a8300"], "id": 17147, "method": "mining.submit"}
[2013-02-03 11:27:48.558] TraceActivity Verbose: 0 : [AsyncTcpServer], ClientId: 11, Outbound message: {"id": 17147, "result": null, "error": [26, "Incorrect Extranonce2 size", null]}
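To get counts per kind of corruption rather than eyeballing the log, the submissions could be bucketed by comparing each one against the mining.notify it refers to. The following is a rough sketch only, written in C to match the miner code quoted above. The enum and function names are made up for illustration, and it assumes the miner submits the ntime it was given (no ntime rolling) and that the extranonce2 size is known.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buckets for a mining.submit compared with its notify. */
enum submit_class {
    SUBMIT_OK,
    SUBMIT_EMPTY_FIELD,      /* job id, extranonce2 or ntime is "" */
    SUBMIT_NTIME_IN_NONCE2,  /* sent ntime shows up in the extranonce2 slot */
    SUBMIT_NTIME_IN_JOBID,   /* sent ntime shows up in the job id slot */
    SUBMIT_BAD_NONCE2_SIZE,  /* extranonce2 is not n2size bytes of hex */
    SUBMIT_OTHER
};

/* job_id, nonce2, ntime: params 1..3 of the submit.
 * sent_job_id, sent_ntime: values from the matching mining.notify.
 * n2size: extranonce2 size in bytes. */
static enum submit_class classify_submit(const char *job_id,
                                         const char *nonce2,
                                         const char *ntime,
                                         const char *sent_job_id,
                                         const char *sent_ntime,
                                         size_t n2size)
{
    if (!*job_id || !*nonce2 || !*ntime)
        return SUBMIT_EMPTY_FIELD;
    if (strcmp(nonce2, sent_ntime) == 0)
        return SUBMIT_NTIME_IN_NONCE2;
    if (strcmp(job_id, sent_ntime) == 0)
        return SUBMIT_NTIME_IN_JOBID;
    if (strlen(nonce2) != n2size * 2)
        return SUBMIT_BAD_NONCE2_SIZE;
    if (strcmp(job_id, sent_job_id) == 0 && strcmp(ntime, sent_ntime) == 0)
        return SUBMIT_OK;
    return SUBMIT_OTHER;
}
```

The swap checks run before the size check on purpose, so that a submission with ntime in the wrong slot is counted as a swap even when it also fails the extranonce2 size test.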