Starting "chop" test now. The "oh" version was still running fine with log/debug on after 4.25 hours. Looks like that pattern is holding. I love the names for the various test versions - fun to imagine what they might stand for.
Edit: just 5 minutes into the "chop" test one LED came on for a couple of seconds but it didn't go zombie. I don't think I've ever seen that before - maybe one second at the most. May not be relevant, but it was unusual.
Edit: Breakthrough? LEDs are going on all over the place but then they go out again, often after 10 seconds or more. No zombies flagged. Every one recovers so far. Run is about 14 minutes old.
Edit: AMU 5 finally went zombie about 35 minutes into the run.
Thanks a lot, it is suspicious that something's changing. Maybe we can combine some of these experiments then. More soon.
|
|
|
Thanks very much for trying all those, guys. I'll see what else I can come up with, but this will be the only way I'll be able to find it since you can reproduce it so easily.
|
|
|
I'm still running the previous one without any trouble, 7 hours now. 2x USB Erupter directly to PC + 30GH/s BFL. Should I try another one?
Nono, if it's working fine for you, stick with it. These are attempts to find the problem with aigeezer's
|
|
|
Failsafe worked. I'll stick with the old one, enough action for today :)
Sorry, no idea why it didn't work on yours.
|
|
|
Updated the firmware with "keep settings" and now the avalon is not responsive :( so be careful.
Same issue at the moment, cannot connect via 192... Any suggestions?
Did you use that latest firmware I posted? Resetting involves powering it off and on and waiting till the blue light (at the back inside, next to the network connector) starts flashing, then pushing in the tiny reset button next to the connector, which will make it flash much faster. After that it's ready to connect to at 192.168.1.1. Set your laptop to 192.168.1.2 and plug the ethernet cable into it. Then do the following:
ssh 192.168.1.1 -l root
mount_root
mtd -r erase rootfs_data
After it reboots it should be back in failsafe settings and you can log in via ethernet again on 192.168.0.100 and flash it with a good firmware.
|
|
|
Edit: AMU 8 went zombie 14 minutes into the run. I'll rerun with logging and see if anything shows up.
Nope, more important to try the other experiments please.
|
|
|
I had to go back and change all settings like an initial set up (using 192.168.1.100 etc), but I thought that was just me. So presumably you'll have to do the same!
|
|
|
3.6.6 reports two zombies (literally) after about 20 minutes. I'll restart it shortly with logging on.
3.6.6 is almost identical to the experimental .exe you last downloaded.
OK, thanks for the info. Behavior seems about the same. Working fine with logging, only running with log for about half an hour so far though.
It's still bizarre that it would not go zombie with logging on only. Either way it will be nice to know what error specifically causes cgminer to consider the device dead. All we can do is keep watching and maybe capture something on the log.
My 3.6.6 test with logging on has run for just under 24 hours - log size is 1.16GB. Nothing unusual happened - no errors of any kind that I can see, but I haven't read every line of the log. :) 13 AMUs and one BAL, Win 7 64. I'll shut it down shortly to try your new "little buglet" fix.
So I'm assuming that turning debugging on is having some subtle effect on timing, slowing down writes to the device or something like that, and is indirectly making your devices more stable. So that gives me some other avenues to check out. Give the bugfix a try and I'll make some other binaries trying other shit out, thanks.
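Since nobody is going to read a 1.16GB log line by line, here is a rough Python sketch of one way to stream through it and pull out the few lines just before anything of interest. The keyword "zombie" and the file name in the usage note are assumptions, since the exact message text cgminer writes isn't quoted in this thread, and the sample lines are made up for illustration.

```python
from collections import deque


def lines_before_matches(lines, keywords=("zombie",), context=5):
    """Yield (line_number, preceding_lines, line) for each line containing a keyword.

    `lines` can be any iterable of strings, including an open file, so the
    whole log never has to fit in memory. The keywords are guesses; adjust
    them to whatever text actually shows up in your log.
    """
    recent = deque(maxlen=context)  # rolling window of the last `context` lines
    lowered = tuple(k.lower() for k in keywords)
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if any(k in text.lower() for k in lowered):
            yield number, list(recent), text
        recent.append(text)


# Made-up lines purely for illustration; this is NOT real cgminer output.
sample = [
    "line a",
    "line b",
    "usb write error (made up)",
    "AMU 5 zombie (made up)",
    "line e",
]
hits = list(lines_before_matches(sample, context=2))
```

To run it over a real log, something like `with open("cgminer.log", errors="replace") as f:` followed by a loop over `lines_before_matches(f)` should do, where `cgminer.log` is a hypothetical file name.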
|
|
|
Hi all,
It's been a while since I've been able to do anything with the avalon code, thanks to having set up my house for sale recently and, before that, physically breaking the wrt703n in my avalon so I couldn't test any changes. Anyway, courtesy of the donated router and now that I have brought my mining rigs back home, I'm able to provide the first updated firmware in a while.
Now I'm out of touch, so it's the simplest of updates, just bringing cgminer up to the latest version, 3.6.6, which has had numerous fixes and improvements along the way. So far it's been working fine, EXCEPT that, presumably due to API updates, the SUMMARY box does not show any values in the avalon status. Obviously this isn't ideal, but I figured people would be itching to update their firmware to something newer.
Grab it here: http://ck.kolivas.org/apps/cgminer/avalon/20131027/
ALL THE USUAL WARNINGS APPLY!
EDIT: NOTE you may have to configure your avalon from scratch (using an ethernet cable and connecting to 192.168.1.100) to use this!
Looks like people are having trouble with this one, withdrawn :(
|
|
|
Started on 2x USB Erupters + 1x BFL 30GH/s. What should I expect? ;]
Thanks. Hopefully... nothing untoward \o/ Anyway, I sneaked up fresh binary packages with the -1 suffix since it's an obvious bugfix, but not big enough to warrant a new version.
|
|
|
It's the driver KFC wrote - it counts hardware errors as hashrate. None of the drivers in the mainline cgminer code do that.
|
|
|
I'm running 3.6.4 on win7 64 with about 45 sticks and I'm finding that sometimes when a stick goes zombie all the others go sick. Because of that I started running 4 instances with 12 sticks each so as not to affect the hash rate that much. Has this been reported? Is a zombie a real zombie, killing everyone? :-) Now seriously, could it be the thread manager going crazy?
We're not sure why some machines get zombified and some don't. I have one machine it works fine on with 36; on another machine (much newer and more powerful) I get errors and zombies left and right. If you are one of the unlucky ones, try an earlier version until you get one that works. M
Yah, I've been busting a gut left, right and centre trying to find a common reason and fix for all of them on all OSs. My most recent bugfix for this very problem was only uploaded 2 hours ago, along with an experimental .exe using the bugfix.
|
|
|
I know you're all tested out, but this one's a big fix for AMUs...
|
|
|
I've been noticing a lot of these duplicate shares lately (3.5.0 and 3.6.6).
[2013-10-26 13:59:00] Accepted 2aa206aa Diff 6/4 BAJ 0 pool 0
[2013-10-26 13:59:00] Rejected 2aa206aa Diff 6/4 BAJ 0 pool 0
Is there something wrong with my Jalapeno or is this just normal crappy BFL behavior? It's been throwing out about 1% hardware errors too.
Probably hardware related, but probably also nothing to worry about as it's just how a form of hw errors manifest on that particular device.
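For anyone wanting to count how often this happens rather than eyeballing it, here is a small Python sketch that finds share IDs appearing more than once in log text, like the Accepted/Rejected pair quoted above. The regex is inferred only from those two quoted lines (timestamp in brackets, status, then an 8-hex-digit share ID), so treat the format as an assumption; `find_duplicate_shares` is a hypothetical helper name, not anything from cgminer.

```python
import re
from collections import defaultdict

# Pattern inferred from the two log lines quoted in this thread, e.g.
# [2013-10-26 13:59:00] Accepted 2aa206aa Diff 6/4 BAJ 0 pool 0
# Other cgminer versions or drivers may format these lines differently.
SHARE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+"
    r"(?P<status>Accepted|Rejected)\s+(?P<share>[0-9a-f]{8})\b"
)


def find_duplicate_shares(text):
    """Return {share_id: [(timestamp, status), ...]} for IDs seen more than once."""
    seen = defaultdict(list)
    for m in SHARE_RE.finditer(text):
        seen[m.group("share")].append((m.group("ts"), m.group("status")))
    return {share: events for share, events in seen.items() if len(events) > 1}


sample = (
    "[2013-10-26 13:59:00] Accepted 2aa206aa Diff 6/4 BAJ 0 pool 0\n"
    "[2013-10-26 13:59:00] Rejected 2aa206aa Diff 6/4 BAJ 0 pool 0\n"
)
dupes = find_duplicate_shares(sample)
```

This reads the whole text at once, which is fine for a snippet; for a very large log you would want to feed it line by line instead.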
|
|
|
Following this thread with interest.
I have been running 3.3.1 for - ooh ages - without problems. I'm using a Windows 7 64bit core i7 cpu with 34 Erupters plugged in to 3 Anker 10-port hubs and a powered 4-port hub. The hubs are all plugged into motherboard usb slots.
No real problems with 3.3.1 apart from the odd zombie maybe every 10-14 days.
Upgraded to 3.6.4 and loads of timeouts and zombies almost instantly. Powered off hubs, restarted. Same problem, so went back to 3.3.1 - sweet as a nut!
Today I have tried 3.6.6 and it ran for about 15 minutes, then one zombie appeared and timeouts started to come up. Then another zombie and more timeouts. It seems to get worse after the first timeout appears, then it's pretty much regular timeout after timeout. So I've reverted to 3.3.1, powered the hubs off/on and everything is working smoothly again.
Do you see a message just before one of them goes zombie about any usb errors in particular?