I believe I have identified the issue with sha256 miners on this pool.
My sha256 miners have been repeatedly crashing on this pool. Obviously I don't want that, but I like zpool, so I'm hoping I can help fix this if need be.
I've been dev'ing since the 80s so if I can lend a hand in that way, please let me know.
Forgive me for stinking at formatting. If there's something I can do to make what's coming more readable, again, please let me know.
So first, here is when everything looks fine:
[ 107.241573] 0x0000: 0x03 0x00 0x00 0x1a 0x86 0x10 0x03 0x13
[ 107.249376] send BC data:
[ 107.249376] 0x0000: 0x03 0x02 0x00 0x1a 0x86 0x10 0x03 0x13
[ 107.257189] send BC data:
[ 107.257189] 0x0000: 0x03 0x05 0x00 0x1a 0x86 0x10 0x03 0x13
[ 107.265002] send BC data:
[ 107.265002] 0x0000: 0x03 0x20 0x00 0x03 0x86 0x10 0x03 0x04
When the log is like that, hashrate stays up, everything is great, blah blah blah...
Next we come to this, where the extranonce comes into play:
[ 107.333369] Sending open core work
[ 107.336931] Send new block cmd
[ 107.349539] Timeout {1}
[ 107.352107] Snd Time Interval {21}ms
[ 107.355850] Set bitmain configure
[ 107.359321] clear FPGA nonce buffer
[ 107.363058] drv fifo empty
[ 108.944829] ch0-as50: dev->task_buffer_wr{0x00000000}rd{0x00000000}ret work_id{0xc36c} don't match task_buffer_match{0x436c}
[ 109.315866] drv fifo empty
[ 111.268995] drv fifo empty
[ 113.222124] drv fifo empty
[ 113.376239] get chain0 reg0x0
[ 113.379410] send BC data:
[ 113.379410] 0x0000: 0x03 0x00 0x00 0x03 0x84 0x00 0x00 0x11
[ 113.387640] Change diff to 10
[ 113.390795] diff fix to 6
[ 113.393538] Change net_diff to 23
[ 113.397913] timeout {0x542}
[ 113.400852] rev timeout {0x42050080}
We start getting timeouts. Notice we also start getting empty/garbage packets (the "drv fifo empty" lines).
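For anyone who wants to dig into the "don't match task_buffer_match" lines, here is a minimal Python sketch that pulls the returned and expected work IDs out of a log and XORs them to show which bits differ. The function name and the regex are my own, not anything from cgminer or the driver. In the two mismatch lines in these logs the values differ only in the 0x8000 bit; I can't say whether that is meaningful or a coincidence.

```python
import re

# Hypothetical helper, not part of cgminer: matches the mismatch lines
# exactly as they appear in the log excerpts in this post.
MISMATCH_RE = re.compile(
    r"work_id\{0x([0-9a-fA-F]+)\} don't match task_buffer_match\{0x([0-9a-fA-F]+)\}"
)

def mismatch_bits(log_text):
    """Return (returned_id, expected_id, differing_bits) per mismatch line."""
    results = []
    for m in MISMATCH_RE.finditer(log_text):
        returned = int(m.group(1), 16)
        expected = int(m.group(2), 16)
        results.append((returned, expected, returned ^ expected))
    return results

# The two mismatch lines quoted in this post.
SAMPLE = """\
[ 108.944829] ch0-as50: dev->task_buffer_wr{0x00000000}rd{0x00000000}ret work_id{0xc36c} don't match task_buffer_match{0x436c}
[ 193.054117] ch0-as33: dev->task_buffer_wr{0x000035c4}rd{0x000035c4}ret work_id{0xb4c6} don't match task_buffer_match{0x34c6}
"""

for returned, expected, diff in mismatch_bits(SAMPLE):
    print(f"returned={returned:#06x} expected={expected:#06x} xor={diff:#06x}")
```

Running the same function over a longer log would show whether every mismatch follows the same pattern.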
After this it's a long line of timeouts, new block commands, and ridiculously low diff:
[ 113.451896] Chain0 PLL: {0x13859805}
[ 113.463230] asic cmd return 0a988513
[ 113.466993] Chain0 PLL: {0x1385980a}
[ 113.470736] asic cmd return 0f988513
[ 113.474488] Chain0 PLL: {0x1385980f}
[ 113.478230] asic cmd return 14988513
[ 113.481976] Chain0 PLL: {0x13859814}
[ 125.781891] Change diff to 11
[ 125.785040] diff fix to 6
[ 125.874532] Send new block cmd
[ 135.585449] Send new block cmd
[ 139.394110] Send new block cmd
[ 147.415553] Send new block cmd
[ 150.136338] Send new block cmd
[ 153.515229] Send new block cmd
[ 163.875027] Change diff to 12
[ 163.878167] diff fix to 6
[ 163.970346] Send new block cmd
[ 164.849064] drv fifo empty
[ 170.919441] Send new block cmd
[ 193.048723] Send new block cmd
[ 193.054117] ch0-as33: dev->task_buffer_wr{0x000035c4}rd{0x000035c4}ret work_id{0xb4c6} don't match task_buffer_match{0x34c6}
[ 203.431180] Send new block cmd
[ 213.729965] Send new block cmd
[ 224.325686] Send new block cmd
[ 227.108914] Send new block cmd
[ 230.558142] Send new block cmd
[ 233.835482] Send new block cmd
[ 244.489735] Send new block cmd
[ 267.679208] Send new block cmd
[ 268.894085] Send new block cmd
[ 269.155778] Send new block cmd
[ 289.927276] Send new block cmd
[ 300.980012] Send new block cmd
[ 311.985830] Send new block cmd
[ 326.124586] Send new block cmd
[ 339.222172] Send new block cmd
[ 353.317892] Send new block cmd
[ 364.382414] Send new block cmd
[ 365.339300] drv fifo empty
[ 367.737935] Send new block cmd
[ 388.552791] Change diff to 13
[ 388.555931] diff fix to 6
[ 388.622595] Send new block cmd
[ 389.558107] Send new block cmd
[ 391.466324] Send new block cmd
[ 399.640137] Send new block cmd
Notice the very low diff? It hovers around there, along with an endless log of "Send new block cmd" issues.
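If anyone wants to compare their own logs against mine, here is a rough Python sketch that tallies the messages I'm pointing at. The function name and counter labels are made up for this post, and it assumes the log lines look exactly like the ones above (timestamp in square brackets, then the message).

```python
import re
from collections import Counter

# Assumed line shape: "[ 113.387640] Change diff to 10"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
DIFF_RE = re.compile(r"^Change diff to (\d+)$")

def tally(log_text):
    """Count the suspicious messages and collect the 'Change diff to' values."""
    counts = Counter()
    diffs = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip anything that is not a timestamped log line
        msg = m.group(2)
        if msg == "Send new block cmd":
            counts["new_block"] += 1
        elif msg == "drv fifo empty":
            counts["fifo_empty"] += 1
        elif "don't match task_buffer_match" in msg:
            counts["work_id_mismatch"] += 1
        else:
            d = DIFF_RE.match(msg)
            if d:
                diffs.append(int(d.group(1)))
    return counts, diffs

# A few lines copied from the log above.
SAMPLE = """\
[ 125.781891] Change diff to 11
[ 125.785040] diff fix to 6
[ 125.874532] Send new block cmd
[ 135.585449] Send new block cmd
[ 164.849064] drv fifo empty
[ 193.054117] ch0-as33: dev->task_buffer_wr{0x000035c4}rd{0x000035c4}ret work_id{0xb4c6} don't match task_buffer_match{0x34c6}
"""

counts, diffs = tally(SAMPLE)
print(dict(counts), diffs)
```

A healthy miner should show few or none of these in the same time window, so the counts give a quick way to tell the two states apart.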
This happens very consistently on both my Antminers and Spondoolies.
They're all running cgminer 4.8.0, so the log output is nearly identical for all of them.
This would account for what some of the posts in this thread are saying about the slowly reducing hashrate reported by the zpool website that results in 0 hash.
What is the problem?
Well, I would have to look at the zpool logs to identify the problem, since this is expected behavior from cgminer. Since that is unlikely, I'll take a stab. It's likely:
When the algo switches coins, one or more of the coins is sending improper packets (empty, headerless, or NULLs) to the miners. The sha256 miners and cgminer read these as empty/garbage, and the miner eventually dies. The pool software does not switch to a new coin at that point because it is not receiving the expected output from the miners either. Instead it tries to resend the same packet over and over, because the message the miner sends back tells the pool that it didn't receive the packet (hence the timeout). That leads to an infinite loop, or at least a failing one.
My suggestion:
Do a (relatively) quick look at the logs for the sha256 coins, see which ones have not created any blocks on this pool, and remove them. Or trim the sha256 coins down to only the ones with a regular hashrate on the site. It's likely one of the non-producing coins that is causing the issue.
If there's anything I can do to help, please let me know. I love using zpool.