Hi michelem,
hi all,
First I want to say that Minera is so far the most reliable/stable miner solution I've tried for my small Gridseed farm - great job, thanks michelem!
At the moment I'm using 2 Raspberry Pis, one managing 10 Gridseed Minis, the other one supporting 5 Gridseed Blades.
Even a good solution can be improved, which is why I would like to share some of my findings and ideas.
1. Stability
- With some hardware revisions and kernel versions there is a reported issue with systems freezing. As already mentioned in most of the topics covering Raspberry Pi mining solutions, there is an easy workaround: add an additional parameter to the "/boot/cmdline.txt" file:
dwc_otg.lpm_enable=0 console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline rootwait slub_debug=FP
The "slub_debug=FP" at the end will do the trick.
=> perhaps an option for the next release?
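For anyone who wants to check this before editing the file, here is a minimal sketch. The function name cmdline_with_slub_debug is made up for illustration. It only prints what the kernel command line would look like with the parameter added and does not modify anything. It assumes the file holds the whole command line on its first line.

```shell
# Hypothetical helper: print the kernel command line from the given file,
# with slub_debug=FP appended if it is not already present.
# The file itself is left untouched.
cmdline_with_slub_debug() {
    line=$(head -n 1 "$1")
    case " $line " in
        *" slub_debug=FP "*) printf '%s\n' "$line" ;;               # already set
        *)                   printf '%s slub_debug=FP\n' "$line" ;; # append it
    esac
}
```

Calling it as "cmdline_with_slub_debug /boot/cmdline.txt" shows the resulting line, so you can review it before changing the real file with sudo.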
- If this is not enough, or you are looking for an additional safety line, using the built-in hardware watchdog is an option. Unfortunately, most available sources do not cover a small issue with the defaults in the current version 5.12 of the watchdog daemon: the watchdog in the BCM2708 is limited to a timeout of ~16s, but the default of the watchdog daemon seems to be 60s. If you want to give it a try, the following steps will do it, incl. a workaround for the timeout issue:
$ sudo modprobe bcm2708_wdog
$ echo "bcm2708_wdog" | sudo tee -a /etc/modules
$ sudo apt-get update
$ sudo apt-get install watchdog
$ sudo chkconfig --add watchdog
$ sudo chkconfig watchdog on
- The configuration in "/etc/watchdog.conf" should look like this:
max-load-1 = 24
min-memory = 1
watchdog-device = /dev/watchdog
realtime = yes
priority = 1
watchdog-timeout = 10
The additional parameter "watchdog-timeout = 10" solved the problem on my systems.
- The last step is starting/restarting the daemon:
$ sudo /etc/init.d/watchdog restart
=> could this be added to the master image in one of the next releases?!?
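A quick sanity check for the timeout setting could look like the sketch below. The function names are made up. The fallback of 60 is the daemon default as it appeared on my systems, and the limit of 15 is just an assumed safe bound below the ~16s hardware limit.

```shell
# Hypothetical helper: print the watchdog-timeout value from a config file,
# or 60 (the apparent daemon default) if the parameter is not set.
watchdog_timeout() {
    value=$(sed -n 's/^[[:space:]]*watchdog-timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1)
    echo "${value:-60}"
}

# Succeeds only if the timeout stays within the assumed bound of 15 seconds.
timeout_ok() {
    [ "$(watchdog_timeout "$1")" -le 15 ]
}
```

For example, "timeout_ok /etc/watchdog.conf && echo fine" prints "fine" only when the configured value fits the hardware limit.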
- The last bit is an automatic reboot every 12 hours. For that I added a line to "/etc/crontab":
...
5 */12 * * * root /bin/sync && /sbin/reboot
...
This will sync the SD card and reboot the Pi at 0:05 and 12:05 every day.
=> configuration through the web ui would be nice!
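To see how a step value in the hour field maps to reboot times, here is a small sketch. The function name is made up, and it only imitates the "*/N" hour field together with a fixed minute, not the full crontab syntax.

```shell
# Hypothetical helper: list the times of day matched by an hour field of
# "*/STEP" combined with a fixed MINUTE, e.g. "5 */12 * * *".
cron_step_times() {
    step="$1"
    minute="$2"
    hour=0
    while [ "$hour" -lt 24 ]; do
        printf '%02d:%02d\n' "$hour" "$minute"
        hour=$((hour + step))
    done
}
```

"cron_step_times 12 5" lists 00:05 and 12:05, matching the crontab line above. A step of 8 would give three reboots a day.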
2. Pool Selection / NiceHash / Donation Pool
My two systems do NOT behave the same.
- the Minis have not yet hit the "race condition" that ends up on the last pool in the list
- for my Blades this has already happened several times. That was OK as a donation for michelem, but it should not be the default behaviour in the future.
Based on what I could track on my system, it's not just NiceHash, but NiceHash with the "p=x.y" option as password will more or less force the failure when the pools are changed too often. It also happened to me with ltcrabbit and a very low diff when switching to this pool, but only with the Blades, not with the Minis!
The "cpuminer.log" shows the following:
[2014-06-26 23:21:17] stratum_recv_line failed
[2014-06-26 23:21:17] Stratum connection interrupted
[2014-06-26 23:21:17] Starting Stratum on stratum+tcp://eu.ltcrabbit.com:3333
[2014-06-26 23:21:17] stratum_recv_line failed
[2014-06-26 23:21:17] ...retry after 5 seconds
[2014-06-26 23:21:22] submit_upstream_work stratum_send_line failed
[2014-06-26 23:21:22] ...retry after 5 seconds
[2014-06-26 23:21:23] New Job_id: 2f41 Diff: 32 Work_id: 8ed3f9d6
[2014-06-26 23:21:25] New Job_id: 2f42 Diff: 32 Work_id: 8ed3f9d6
[2014-06-26 23:21:27] submit_upstream_work stratum_send_line failed
[2014-06-26 23:21:27] ...retry after 5 seconds
[2014-06-26 23:21:27] Rejected 600a447b GSD 7@15
[2014-06-26 23:21:27] DEBUG: reject reason: job not found
...
[2014-06-26 23:21:27] DEBUG: reject reason: job not found
[2014-06-26 23:21:32] submit_upstream_work stratum_send_line failed
[2014-06-26 23:21:32] ...retry after 5 seconds
[2014-06-26 23:21:32] Rejected a6714685 GSD 0@26
[2014-06-26 23:21:32] DEBUG: reject reason: job not found
[2014-06-26 23:21:32] stratum_recv_line failed
[2014-06-26 23:21:32] Stratum connection interrupted
...
[2014-06-26 23:21:47] Rejected ecdabe3d GSD 3@37
[2014-06-26 23:21:47] DEBUG: reject reason: low difficulty share of 0.000028805203748063583
[2014-06-26 23:21:47] Starting Stratum on stratum+tcp://eu.ltcrabbit.com:3333
[2014-06-26 23:21:48] stratum_recv_line failed
[2014-06-26 23:21:48] ...retry after 5 seconds
[2014-06-26 23:21:52] submit_upstream_work stratum_send_line failed
[2014-06-26 23:21:52] ...retry after 5 seconds
[2014-06-26 23:21:53] New Job_id: 2f93 Diff: 32 Work_id: 8ef195ed
[2014-06-26 23:21:57] submit_upstream_work stratum_send_line failed
[2014-06-26 23:21:57] ...retry after 5 seconds
[2014-06-26 23:21:58] Rejected 8674e16b GSD 4@21
[2014-06-26 23:21:58] DEBUG: reject reason: low difficulty share of 0.00002044789273909275
...
[2014-06-26 23:21:58] Rejected f34240a1 GSD 0@38
[2014-06-26 23:21:58] DEBUG: reject reason: low difficulty share of 0.0008953760268679294
[2014-06-26 23:22:00] Checking main pool: stratum+tcp://stratum.nicehash.com:3333
[2014-06-26 23:22:00] Stratum authentication failed
[2014-06-26 23:22:03] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:03] Rejected 6cdbeab1 GSD 7@17
[2014-06-26 23:22:03] DEBUG: reject reason: low difficulty share of 0.00002086725782000612
[2014-06-26 23:22:03] ...retry after 5 seconds
[2014-06-26 23:22:03] Rejected 600f3044 GSD 6@15
[2014-06-26 23:22:03] DEBUG: reject reason: low difficulty share of 0.000026977634249723677
...
[2014-06-26 23:22:03] Rejected b9a9127e GSD 2@29
[2014-06-26 23:22:03] DEBUG: reject reason: low difficulty share of 0.000020940332663484334
[2014-06-26 23:22:03] stratum_recv_line failed
[2014-06-26 23:22:03] Stratum connection interrupted
[2014-06-26 23:22:03] Starting Stratum on stratum+tcp://eu.ltcrabbit.com:3333
[2014-06-26 23:22:03] New Job_id: 2f42 Diff: 32 Work_id: 8efb367d
[2014-06-26 23:22:08] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:08] ...retry after 5 seconds
[2014-06-26 23:22:08] Rejected e678590b GSD 1@36
[2014-06-26 23:22:08] DEBUG: reject reason: job not found
...
[2014-06-26 23:22:08] Rejected 93459d49 GSD 5@23
[2014-06-26 23:22:08] DEBUG: reject reason: job not found
[2014-06-26 23:22:13] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:13] ...retry after 5 seconds
[2014-06-26 23:22:13] Rejected 8678f100 GSD 6@21
[2014-06-26 23:22:13] DEBUG: reject reason: job not found
[2014-06-26 23:22:13] stratum_recv_line failed
[2014-06-26 23:22:13] Stratum connection interrupted
[2014-06-26 23:22:13] Starting Stratum on stratum+tcp://eu.ltcrabbit.com:3333
[2014-06-26 23:22:13] stratum_recv_line failed
[2014-06-26 23:22:13] ...retry after 5 seconds
[2014-06-26 23:22:18] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:18] ...retry after 5 seconds
[2014-06-26 23:22:18] New Job_id: 2f94 Diff: 32 Work_id: 8f0a8598
[2014-06-26 23:22:23] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:23] ...retry after 5 seconds
[2014-06-26 23:22:23] Rejected 59afe1e6 GSD 7@14
[2014-06-26 23:22:23] DEBUG: reject reason: low difficulty share of 0.000038679563917316584
...
[2014-06-26 23:22:23] Rejected 59b0570a GSD 4@14
[2014-06-26 23:22:23] DEBUG: reject reason: low difficulty share of 0.000015259335727904784
[2014-06-26 23:22:28] Stratum detected new block
[2014-06-26 23:22:28] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:28] New Job_id: 2f95 Diff: 32 Work_id: 8f1429f3
[2014-06-26 23:22:28] ...retry after 5 seconds
[2014-06-26 23:22:28] Rejected 867d2d1d GSD 0@21
[2014-06-26 23:22:28] DEBUG: reject reason: job not found
...
[2014-06-26 23:22:28] Rejected b3366217 GSD 8@28
[2014-06-26 23:22:28] DEBUG: reject reason: job not found
[2014-06-26 23:22:33] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:33] Rejected b99a43d0 GSD 1@29
[2014-06-26 23:22:33] ...retry after 5 seconds
[2014-06-26 23:22:33] DEBUG: reject reason: job not found
[2014-06-26 23:22:33] Rejected c667315c GSD 9@31
...
[2014-06-26 23:22:33] Rejected c0006c5f GSD 4@30
[2014-06-26 23:22:33] DEBUG: reject reason: job not found
[2014-06-26 23:22:33] stratum_recv_line failed
[2014-06-26 23:22:33] Stratum connection interrupted
[2014-06-26 23:22:33] Starting Stratum on stratum+tcp://eu.ltcrabbit.com:3333
[2014-06-26 23:22:33] stratum_recv_line failed
[2014-06-26 23:22:33] ...retry after 5 seconds
[2014-06-26 23:22:38] submit_upstream_work stratum_send_line failed
[2014-06-26 23:22:38] ...retry after 5 seconds
[2014-06-26 23:22:38] stratum_recv_line failed
[2014-06-26 23:22:38] ...retry after 5 seconds
I have also seen crap strings as pool names in the screen session during this race, but could not capture them.
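To put numbers on a log like the one above, the reject reasons can be counted with a few lines of awk. This is only a sketch. The function name is made up, and it assumes the "DEBUG: reject reason:" line format shown in the excerpt. The individual share value after "low difficulty share of" is cut off so that these rejects are grouped together.

```shell
# Hypothetical helper: count reject reasons in a cpuminer log,
# most frequent reason first.
reject_summary() {
    awk -F'reject reason: ' '
        /DEBUG: reject reason:/ {
            reason = $2
            sub(/ of [0-9.]+$/, "", reason)   # drop the per-share value
            count[reason]++
        }
        END {
            for (reason in count)
                printf "%d %s\n", count[reason], reason
        }' "$1" | sort -rn
}
```

Run it as "reject_summary cpuminer.log" (adjust the path to wherever your log lives) to compare "job not found" against "low difficulty share" rejects around a pool switch.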
Remarks:
- Gridseed Blades don't like very small diffs like 32; in my setup the best results are in a range from 256 to 1024.
- Gridseed Minis already work with a diff of 32, with better results in a range from 128 to 512, again in my setup.
- To me it seems to be a problem in cpuminer, not linked to the Minera UI and/or the donation pool idea.
It looks like the miners are not reset to a default state during a pool switch, so their results may not match the new jobs and produce a high number of rejects.
If I understood it right, cpuminer uses an array to store some additional data for each pool during runtime. Based on the crap strings I have seen, I would not be surprised if there is a "buffer overflow", a "pointer error" or some kind of type mismatch in this recently added part. When a pool switch is followed by a larger number of rejects, things will/may go wrong.
>> All mentioned as a person who can't really read and understand code, so maybe I'm totally wrong here. <<
Because of this problem I'm not using NiceHash with the "p=x.y" option, which means I have to monitor manually or use only pools where I can define the minimum diff for each worker - not good.
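As a small aid for the manual monitoring, the sketch below lists the jobs in a log whose diff is below a chosen minimum. The function name is made up, and it assumes the "New Job_id: ... Diff: N ..." line format from the log excerpt. The threshold is whatever works for your hardware, e.g. 256 for my Blades.

```shell
# Hypothetical helper: print all "New Job_id" lines from LOG whose
# Diff value is below MIN.
low_diff_jobs() {
    awk -v min="$2" '
        /New Job_id:/ {
            for (i = 1; i < NF; i++)
                if ($i == "Diff:" && $(i + 1) + 0 < min + 0)
                    print
        }' "$1"
}
```

For example, "low_diff_jobs cpuminer.log 256" shows every job a pool handed out with a diff that is too low for the Blades.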
Regards
MScFW