Thanks a lot, Con, for the avalon-auto option. Is it possible to create something like an avalon-limit frequency option? One of my Avalons has a controller board that freezes at 350 MHz, but during avalon-auto it worked at around 330-340 without problems, so I'm currently locked at 325 ![Sad](https://bitcointalk.org/Smileys/default/sad.gif)
Maybe so.
|
|
|
To be able to average 5 blocks per diff change, I'd say; that mitigates the loss from missing a block across a diff rise, which you will never recover.
|
|
|
I can't find a Linux installation guide for this miner ![Embarrassed](https://bitcointalk.org/Smileys/default/embarrassed.gif)
Trust me, you don't need it. Plus, you are better off compiling cgminer with CPU support, as it has faster algorithms, I believe.
cgminer lost its CPU mining code a long time ago now.
|
|
|
Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?
So the HW errors shown are a multiple of the diff I'm mining at? I'm seeing a higher percentage. cgminer restarted in the night and the clock is at 341 MHz again. I'm getting a high rate of rejects from the pool, so cgminer shows me nearly 79 GHash/s, but on the pool (bitparking) it's only around 71 GHash/s, like it was at 300 MHz. Watching this.
The hardware errors need to be divided by the diff... I'm absolutely sure you're not at 10-20% errors.
|
|
|
Very much dependent on the chips, so this can only be a wild guess, but... 80+ degrees?
Thank you, CKolivas! 80C will cook me and my tiny apartment. I think I need to figure out a way to vent the heat directly outside without letting the rain and snow in.
Hah, well don't take my word for it; as I said, it's pure speculation.
|
|
|
I am seeing little or no improvement by cooling with a portable A/C.
Unit with A/C 1h 37m 58s 83896.29 temp3 43 freq(auto) 354
Unit without A/C 6h 54m 22s 83111.32 temp3 53 freq(auto) 353
I guessed this might be the case since the temperatures really aren't getting into the error range even with regular air cooling - especially since it's 3 degrees at my home overnight and the hashrate doesn't go up. I suspect the hashrate will only get higher with more voltage given to the chips.
Just curious, what might the "error range" be? It's getting rather hot in here and I think I might have to buy another A/C unit... summer is right around the corner...
Very much dependent on the chips, so this can only be a wild guess, but... 80+ degrees?
|
|
|
Thanks for the detailed info! I noticed that for the first 10 or so hours the overclocked Avalon was stable, but then it became more and more unstable. Even though the outside temperature dropped significantly during the night, cgminer restarted repeatedly. I feel the instability might come from the FPGA. What could be the cause of that? Have you observed the same accumulated instability over time? P.S. Also sent 1B to you, cgminer still rules ![Cool](https://bitcointalk.org/Smileys/default/cool.gif)
And thank you ![Wink](https://bitcointalk.org/Smileys/default/wink.gif) I'm sure instability can manifest in any number of ways, and it's probably either resetting the device regularly due to the chips failing, or idling frequently due to the PSU not keeping up, or something along those lines.
|
|
|
How does auto balance maximising clock speed against minimising fan speed? What is the hierarchy?
Unlike the GPU code, they're totally independent. Clock speed is determined solely by hardware errors, whereas fanspeed is determined by temperature. HW errors tend to run hand in hand with temperature rise on this sort of hardware, whereas GPUs are designed to be deterministic right up to failure, so HW errors are meant to almost never happen.
|
|
|
Hi ckolivas, thanks for everything. What I have noticed is that STROMBOM's firmware + 325 MHz gives about 1-2% HW errors. On the other hand, the latest you put out with --auto and temp targeting is throwing 15-20% HW errors at the same clock. Is there any way to combine the best of both worlds and get strombom's level of HW errors, but with the ability to control the temp? Would be greatly appreciated.
No idea why that would be the case. Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?
Right? No idea here either. I literally have the exact settings saved and switched between the two firmwares. Auto was setting the clock to 327, but even without Auto and manually set to 325, the HW error was still 15-20%, compared to strombom's 2%. So weird!
Try restarting it a few times from the interface perhaps? I find it a bit less reliable to start up normally. But yeah, I don't know why that would be the case...
Tried restarting multiple times from the interface, still seeing 15-20% HW errors. Soo weird.
Hmm... Auto won't start changing clocks unless the actual nonces returned are within 10% of expected, so perhaps try enabling auto and starting at lower clocks like 300.
|
|
|
The fans cycle heavily between min and around 3400 rpm when I set target-temp to 48 degrees. They influence power draw: the values above were taken at min speed; high speed draws about 15 W more.
Will try again with default temp ;)
Donated 1BTC to ckolivas 7a398e9723d533dfc13d99ec44e040645704f939e037851a84cddc430dab0d00-000
The rpm starts at min, rises to over 3000 rpm when target-temp is hit, and then slows down to min step by step. That doesn't seem the optimal strategy; I think it would be better to raise the fans slowly before the target is hit.
@ckolivas, and if I'm allowed to express a wish: 40% fanspeed as the minimum would also be fine for me ;) or make it configurable, as known from GPUs.
Appreciate the donation, thanks ![Smiley](https://bitcointalk.org/Smileys/default/smiley.gif)
In actual fact, the fans are told to slowly increase before the target is hit. The thing is, the fans don't really support such small increments in PWM settings and ignore them until certain thresholds. These fans don't support fine control like a GPU fan and really only have about 6 different speeds. Writing a true PID controller with the mathematics involved is truly overkill for this purpose, and the lack of granularity of fanspeed control would make it a futile exercise. The tiny overshoot followed by a huge fan boost you describe should only happen for a few mins when you first start your avalon, or if you set your temp very close to the minimum temp your hardware will run at (something like 35?). I'll look at further config options in the future, time permitting.
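To picture why a slow ramp still looks like a sudden jump, here is a minimal Python sketch of a fan that only honours a handful of PWM steps. The six threshold values and the function name are hypothetical, picked purely for illustration; the real thresholds of the Avalon fans are not given in this thread.

```python
# HYPOTHETICAL PWM thresholds (percent) at which the fan actually changes speed.
# The thread only says there are "about 6 different speeds"; these values are made up.
FAN_STEPS = [20, 35, 50, 65, 80, 100]


def effective_pwm(requested):
    """Return the step the fan would actually run at for a requested PWM percent."""
    effective = FAN_STEPS[0]
    for step in FAN_STEPS:
        if requested >= step:
            effective = step
    return effective


# The controller asks for a gentle ramp in 5% increments...
requested_ramp = list(range(20, 51, 5))
# ...but the fan only reacts when a threshold is crossed.
actual_ramp = [effective_pwm(p) for p in requested_ramp]
print(requested_ramp)
print(actual_ramp)
```

With so few distinct output levels, a finer control law would mostly produce requests that the fan ignores.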
|
|
|
A few notes about the auto-clocking approach.
First and foremost, you can fry your hardware, as you are running your avalon out of specification, especially if you try it on a batch 1 device with its lower-power, lower-quality PSU.
As is virtually always the case, manually fine-tuning the final result will be better than an automated process that guesses. With time I wish to get rid of the requirement to have fixed intervals and allow the user to specify any arbitrary value for the frequency, though the interface coping with it is a bit of an issue at the moment.
Ironically, some people are finding the frequency a little too high and others a little too low. I suspect everyone has a different endpoint in mind for the ideal frequency. The targets I've set are based on hardware errors as a percentage, with hysteresis of +/- 0.25%. This is because a 0.5% increase in hardware errors works out to the amount the hashrate would rise with a 2 MHz increment; i.e. if your hardware error count is going up at the same rate as the hashrate should rise, you are wasting energy. Ideally, what would be needed is a regression plot: measure the hashrate rise and the HW error percentage rise with each increment and see when one grows faster than the other. But those are absurd stats to go looking for, especially when the values fluctuate wildly even under normal circumstances.
By default with avalon-auto, you will get hardware errors of 1~1.5%. When looking at the hardware error count, make sure you compare it to the diff1 shares and not the accepted shares, since you will almost certainly be mining at a higher diff. Hardware errors are harmless in their own right but indicate how hard you're pushing the chips for their available voltage and cooling. It sounds like these chips are capable of much more with more voltage, but no one has done that mod yet.
The way to calculate hardware error percentage is: HW * 100 / (diff1 + HW)
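As a rough illustration of that formula, here is a minimal Python sketch. The function name and the sample counts are made up for the example and are not taken from cgminer. The second call shows how the figure gets inflated if accepted shares at a higher pool diff are used in place of diff1 shares.

```python
def hw_error_percent(hw_errors, diff1_shares):
    """HW * 100 / (diff1 + HW), as given in the post."""
    total = diff1_shares + hw_errors
    if total == 0:
        return 0.0  # nothing counted yet; avoid dividing by zero
    return hw_errors * 100.0 / total


# Illustrative counts only (not measurements from the thread).
hw = 150
diff1 = 10000
accepted_at_higher_diff = 1000  # far fewer shares get counted at a higher diff

correct = hw_error_percent(hw, diff1)
misleading = hw_error_percent(hw, accepted_at_higher_diff)
print(f"vs diff1 shares:    {correct:.2f}%")
print(f"vs accepted shares: {misleading:.2f}%")
```

With these sample numbers the same HW count reads as roughly 1.5% against diff1 shares but about 13% against accepted shares, which is the kind of gap behind the "15-20%" reports.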
It's also worth mentioning that, to simplify the calculation of different frequencies, the values passed to the avalon with this latest firmware for the "regular values", i.e. 300 and below, are slightly lower than the values that would previously have been passed to it. This should make only a negligible difference to hashrate, lost in the noise of normal hashrate variance. The "timeout" value passed is also smaller now, which means you may hit the limit at lower speeds than you used to. But the old timeouts were too high, and even if you apparently had a higher hashrate, if you go back and check your stats you may find you were getting more rejects. This is because the higher timeouts led to duplicate shares being generated, so a higher timeout is only a disadvantage.
A sure-fire sign that you're overdoing it is cgminer repeatedly being restarted by the avalon watchdog, or periods of hashrate dropping, or smoke coming out of your PSU.
|
|
|
My 2 Avalons are sitting at 355/352. Temps 35/41 and 35/39 with --avalon-temp 40
With strombom firmware they run @365, 86.9 GH after 20h
Strombom's firmware runs your fans at 100%. You can achieve the same fanspeed with my firmware by using --avalon-temp 0
Yes, I did that and they have been sitting at 354 for 5 hours. Can you make a binary only, to test with up to a 5% HW rate? I can't get it cross-compiled... ...can not find curses.h or ...../mips-openwrt-linux-uclibc/bin/ld: cannot find -lcurl
Not any time soon, sorry, but you are missing packages to build it yourself.
|
|
|
[timeout] => 33 [frequency] => 354
I think the timeout is too low. With 365 I had to change the timeout to 36 to avoid restarts.
36 will be generating dupes. Check your rejects.
|
|
|
Is there a relationship between HW errors and temperature? I.e. is it better to let the fans work harder and make more noise to reduce them, even though the machine can handle 48-50C well?
I have the same question.
There is a relationship, but you'll have to experiment and tell me what you find yourself. It's cold here so nothing runs hot even with fans at their lowest.
|
|
|
Yes, I like it very much ;) Be sure to get a little donation from me. In which frequency and time steps does --avalon-auto adjust the frequency?
Thanks, appreciate the donations ^_^ It goes up by 2 MHz and down by 1 at a time, and the timeout is automatically adjusted via my new algorithm based on the highest I could find that wouldn't create dupes (the previous values still would on occasion). I spent a few hours playing with what ratio of hardware errors to hashrate yielded close to the ideal amount. It will be interesting to see what speeds it tunes to for those with massive cooling, extra power, a lower target temp and so on. Mine sits at 351.
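For anyone trying to picture the stepping, here is a minimal Python sketch of one adjustment decision. It is not the cgminer code: the function name is invented, and the 1.0% and 1.5% bounds are an assumption taken from the "1~1.5%" default band mentioned elsewhere in this thread. Only the +2 MHz / -1 MHz step sizes come from the reply above.

```python
TARGET_LOW_PCT = 1.0    # assumed lower edge of the HW error band
TARGET_HIGH_PCT = 1.5   # assumed upper edge of the HW error band
STEP_UP_MHZ = 2         # from the post: goes up by 2 MHz
STEP_DOWN_MHZ = 1       # from the post: goes down by 1 MHz


def next_frequency(freq_mhz, hw_errors, diff1_shares):
    """Return the next clock given HW error and diff1 share counts for an interval."""
    total = diff1_shares + hw_errors
    if total == 0:
        return freq_mhz  # no data yet, leave the clock alone
    hw_pct = hw_errors * 100.0 / total
    if hw_pct > TARGET_HIGH_PCT:
        return freq_mhz - STEP_DOWN_MHZ  # too many errors: back off gently
    if hw_pct < TARGET_LOW_PCT:
        return freq_mhz + STEP_UP_MHZ    # headroom left: push a little harder
    return freq_mhz                      # inside the band: hold


print(next_frequency(350, 5, 995))    # 0.5% HW errors
print(next_frequency(350, 20, 980))   # 2.0% HW errors
print(next_frequency(350, 12, 988))   # 1.2% HW errors
```

The band between the two bounds is what keeps the clock from flapping up and down on every interval.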
|
|
|
New release: version 3.3.1, 26th June 2013
Hotfix release. Last minute bug went into 3.3.0 preventing BFL SC singles from working as planned ![Roll Eyes](https://bitcointalk.org/Smileys/default/rolleyes.gif)
Note now there are avalon firmware images in my directory as well, with the latest version.
Human readable changelog:
- Bugfix for BFL SC singles and minirigs which prevented them working.
- Bugfix for the usb wildcards 1:*
- Increased target temperature on Avalons to 50 since 45 was overkill
- Added overheat temperature cutoff to avalons
- Added dynamic automatic overclocking to avalons with --avalon-auto
Full changelog:
- Add an avalon-auto option which enables dynamic overclocking based on hardware error rate for maximum effective hashrate.
- Add an --avalon-cutoff feature which puts the avalon idle should it reach this temperature, defaulting to 60, re-enabling it when it gets to target temperature.
- Change default avalon target temperature to 50 degrees.
- usbutils - incorrect test for * in bus:dev
- Redo +1 fix in bflsc.
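The cutoff behaviour described in the changelog can be pictured with a small Python sketch. This is an illustration only, not the driver code: the function name and the sample temperature readings are made up, while the defaults of 60 (cutoff) and 50 (target) are the ones stated in the changelog.

```python
def update_idle_state(idle, temp_c, target_c=50, cutoff_c=60):
    """Return True if the device should be idle after this temperature reading."""
    if not idle and temp_c >= cutoff_c:
        return True    # reached the cutoff: go idle
    if idle and temp_c <= target_c:
        return False   # cooled back down to target: resume hashing
    return idle        # otherwise keep the current state


# Made-up readings: heats past the cutoff, then cools back to the target.
readings = [55, 60, 57, 52, 50, 55]
idle = False
states = []
for temp in readings:
    idle = update_idle_state(idle, temp)
    states.append(idle)
print(states)
```

Because the device only resumes at the target temperature rather than just below the cutoff, it does not bounce in and out of idle around 60 degrees.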
|
|
|
|