Bitcoin Forum
Warning: One or more bitcointalk.org users have reported that they strongly believe that the creator of this topic is a scammer. (Login to see the detailed trust ratings.) While the bitcointalk.org administration does not verify such claims, you should proceed with extreme caution.
Author Topic: Swedish ASIC miner company kncminer.com  (Read 3049457 times)
GenTarkin
August 12, 2015, 03:42:09 AM
Last edit: August 12, 2015, 04:21:28 AM by GenTarkin
 #40841

https://github.com/GenTarkin/Titan/releases/tag/v.93

New release published!!

---details---

Login through SSH & webgui is now: admin/admin (should be, anyway, LOL! Hope I updated it correctly) =P Test it, people! =)

If a soft die reset fails, the hard reset sequence is initiated (power off cube, restart bfgminer).
Instead of setting dies w/ overheating DCDCs to OFF, it now scales the clock down 25 MHz each check until the DCDCs are under the temp threshold. If the clock gets down to 100, it sets the die to OFF.
Added more temps to the temp threshold setting, including all numbers between 90 & 95.

*default DCDC temp monitoring settings are: ENABLED / 90C

---details---
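
A minimal sketch of the throttling rule described in the release notes above, for readers who want to see the stepping logic spelled out. This is not the firmware's actual code: the function name, the `OFF` sentinel, and the reading that a die is switched off once its clock would reach 100 MHz are assumptions made for illustration.

```python
OFF = 0  # hypothetical sentinel meaning "die disabled"


def next_clock_mhz(clock_mhz, dcdc_temp_c, threshold_c=90, step_mhz=25, floor_mhz=100):
    """Return the clock to apply to a die after one monitoring check.

    Defaults mirror the post: 90C threshold, 25 MHz steps, 100 MHz floor.
    """
    # Already off, or DCDC temperature under the threshold: nothing to do.
    if clock_mhz == OFF or dcdc_temp_c < threshold_c:
        return clock_mhz
    lowered = clock_mhz - step_mhz
    # Assumption: once the clock would reach the floor, turn the die off.
    return OFF if lowered <= floor_mhz else lowered
```

With these defaults, a die at 325 MHz whose DCDC reads 92C would be stepped to 300 MHz on the next check, and it would keep stepping down on later checks for as long as the reading stays at or above the threshold.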


Please test, everyone; I'll fix issues as they arise. Please, please donate =) It helps fuel my motivation to keep improving stuff =)

GenTarkin's MOD Kncminer Titan custom firmware! v1.0.4! -- !!NO LONGER AVAILABLE!!
Donations: bitcoin- 1Px71mWNQNKW19xuARqrmnbcem1dXqJ3At || litecoin- LYXrLis3ik6TRn8tdvzAyJ264DRvwYVeEw
Searing
August 12, 2015, 05:28:56 AM
 #40842

Heh, you keep adding stuff I need before I can get to installing it (work keeps me busy until the end of the month)... all suggestions I needed, heh :)

Anyway, I've got to buy some coin on Coinbase and send you some again once I can get off work long enough to test this, or at least get you some BTC
(main hoard is in a paper wallet in a safety deposit box) :)

I'm sure more than a few of us will trickle you some more BTC :)

Again, I'm sure I speak for everyone: we appreciate your efforts. (By the way, all my posts until the end of the month will be from work, away from the miners... no joy to play with toys.)


Old Style Legacy Plug & Play BBS System. Get it from www.synchro.net. Updated 1/1/2021. It also works with Windows 10 and likely 11 and allows 16 bit DOS game doors on the same Win 10 Machine in Multi-Node! Five Minute Install! Look it over it uninstalls just as fast, if you simply want to look it over. Freeware! Full BBS System! It is a frigging hoot!:)
xstr8guy
August 12, 2015, 08:42:30 AM
 #40843

Quote
I can check it out pretty quickly

btw, it doesn't seem like the max temp is working, I have it set for 90c and a die hit 92c, nothing happened


Testing a rearrangement & rewrite of hard reset detection. Will have to wait till mine actually needs resetting to see if it works. If it does, it should differentiate between soft reset success vs fail and then apply a hard power reset to the cube if needed.
My Titan doesn't experience successful soft resets. So, I will need someone to test it out once I verify that a failed soft reset followed by a hard reset works.
It doesn't happen instantly. Think it loops every 4 seconds. I'll test it here in a lil while. I don't see why it wouldn't work but I'll double check =)
Grr, never mind, somehow it stopped writing to the config file. I'll have to look into it later. Not sure how it broke =P
Probably a typo somewhere lol

well I had 1 hard reset work flawlessly :)

I'll pledge .5 btc for your efforts, if you can just drop the MHz from 325 to 300, instead of turning the die off. Also can you add another temp cut-off 93c -- I manually turn them down around 92/93 - it's usually at those temps for only a few hours, and haven't had any problems
[/quote]

Hrm... OK, well, I found an issue w/ the changes, but only as of this morning when I started editing the code again; these erroneous edits did not make it into my latest release. I fixed the issues, but the release you downloaded should still have worked properly.

Any chance you can paste the contents of /var/log/monitordcdc.log when the thermal trip doesn't work for you?
It would require ssh'ing into the pi and copying the contents of that file to a text file.

The test works perfectly on my box when I set the temp threshold to 70... (added as a testing temp =) )
I don't even have to hit refresh on the advanced settings page; I can see the dies that go over threshold get turned to 0's after bfgminer restarts.

Also, yes, when I have more time, now that I see how KNC updates clocks without needing to restart bfgminer, I will implement a soft clock scale-down w/o needing a bfgminer restart. =) I will also put 92/93C in there for ya =)
[/quote]
[/quote]

Fix the damn broken quotes! It's unreadable.
TXSteve
August 12, 2015, 09:41:23 AM
Last edit: August 12, 2015, 11:51:02 AM by TXSteve
 #40844

instead of burning the new img file I tried doing a git pull:

cd knc-asic
git stash save --keep-index                (didn't need this line on all rigs)
git pull
cd
./update-webgui.sh

seems to work but did I miss anything??


helmax
August 12, 2015, 11:23:11 AM
 #40845

Does anyone have a full 1.5 GB SD card image for the Neptune?

looking job
jelin1984
August 12, 2015, 01:13:00 PM
 #40846

Can you make your Titan firmware work with the Raspberry Pi 2 version?

That would be great.
GenTarkin
August 12, 2015, 02:36:16 PM
 #40847

You would want to do the git pull from /home/pi (the default dir you land in when logging in via ssh)... because otherwise the changed webpages won't download.
Then yeah, run update-webgui.sh and you should... *should* be set... haha
(Reapply your desired temp threshold settings via the webgui; mine defaults to ON/90.)

GenTarkin
August 12, 2015, 02:36:52 PM
 #40848

I only have a pi to work on, so no.

TXSteve
August 12, 2015, 02:57:52 PM
Last edit: August 12, 2015, 03:11:16 PM by TXSteve
 #40849

everything is upgraded and running fine so far,  just a couple observations:

-- the temp throttling is sweet, it doesn't even reboot bfgminer,  nice!!

-- you might want to implement a delay of a few min or so before triggering a hard reset because:
           a) a soft reset sometimes needs a few tries before it works, and a delay will minimize bfgminer restarts
               -- when the rig is rented and the customer is using an unstable pool that takes forever for vardiff to adjust & stabilize,
                  frequent restarts are particularly troublesome
           b) a delay will be needed to optimize voltages & MHz, and/or to monitor which die is triggering the resets
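
The delay suggested above could be sketched roughly as a grace period: a hard reset is only allowed once a die has been failing continuously for some minimum time. The function name, the timestamp arguments, and the 180-second default (one reading of "a few min") are illustrative assumptions, not anything from the firmware.

```python
def should_hard_reset(first_failure_ts, now_ts, grace_seconds=180):
    """Return True once a die has been failing for at least grace_seconds.

    first_failure_ts is the time (in seconds) the die was first seen failing,
    or None if it is currently healthy. Both names are hypothetical.
    """
    if first_failure_ts is None:
        return False
    return (now_ts - first_failure_ts) >= grace_seconds
```

A monitoring loop would clear `first_failure_ts` whenever a soft reset succeeds, so only failures that persist through the whole grace period escalate to a bfgminer restart.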
 
anyway I'll send .5 btc to 1Px71mWNQNKW19xuARqrmnbcem1dXqJ3At   (sent)

thx again, nice work :)


GenTarkin
August 12, 2015, 03:31:48 PM
Last edit: August 12, 2015, 03:44:38 PM by GenTarkin
 #40850

AWESOME !!! Thanks a ton!!!
I just uploaded another change for the webgui; it now shows the bfgminer version in the status screen.
I'll look into the delay for ya =)
I'll get to working on auto upscaling of cores that were previously downclocked. I have a busy schedule coming up, so it may not be released as quickly, and this is fairly complex =)

Regarding the soft reset, do you know where the soft reset actually fails? During the waas -s command, or when bfgminer is told to reconfigure...?
When you see this behaviour happen, can you post the relevant contents of /var/log/monitordcdc.log? That way I can see exactly what needs to be delayed (or retried a few times).

If I had to guess: for soft resets, I check whether the waas command fails, and I decide whether a hard reset needs to be issued based on that success/fail. So, I could do a timed loop of, say, up to 5 soft resets (on like a couple-second timer) via the waas command, and if they all fail, perform a hard reset. The first one that passes exits the loop and then tells BFGminer to do its die reconfigure. *NOTE: The waas command has to succeed before BFGminer will show a "die successfully configured" message.
How does that sound?
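
The retry loop proposed above might look roughly like this. It is only a sketch under stated assumptions: `soft_reset`, `hard_reset`, and `reconfigure` are hypothetical callables standing in for the waas call, the cube power cycle, and the bfgminer die reconfigure, and the 2-second delay is one reading of "a couple second timer".

```python
import time


def recover_die(soft_reset, hard_reset, reconfigure,
                attempts=5, delay_s=2.0, sleep=time.sleep):
    """Try up to `attempts` soft resets, then fall back to a hard reset.

    soft_reset() must return True on success. All three callables are
    placeholders for whatever the firmware really invokes.
    """
    for attempt in range(1, attempts + 1):
        if soft_reset():
            # First soft reset that passes: leave the loop and reconfigure.
            reconfigure()
            return "soft"
        if attempt < attempts:
            sleep(delay_s)
    # Every soft reset failed: power-cycle the cube.
    hard_reset()
    return "hard"
```

Passing `sleep` in as a parameter keeps the loop easy to exercise without waiting on real timers.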

TXSteve
August 12, 2015, 04:28:10 PM
 #40851

No, I don't know where the soft reset actually fails. I get the standard "die configuration failed" message and it tries again 20 or 30 sec later. If I can catch it I'll get the log file. I am using Awesome Miner to trigger hashrate alerts and can then monitor what's happening, but if the hard reset happens too quickly the hashrate alert doesn't trigger; the rig offline alert triggers instead, and by then it's too late to see what happened.

what you suggested sounds good to me
GenTarkin
August 12, 2015, 04:51:50 PM
 #40852

OK, cool. Yeah, "die configuration failed"... if I had to guess, it means the waas soft reset has failed. At least, when my rig requires a hard reset, that's the message I get until I do the hard reset. I'll implement the loop sometime either today or tonight =)
If other people could test out the firmware that would be great, and any donations help =)
Thanks again for your generous donation TXSteve, I really appreciate it =)

GenTarkin
August 12, 2015, 07:45:40 PM
Last edit: August 12, 2015, 08:14:15 PM by GenTarkin
 #40853

Came across a behaviour issue on my Titan: I just now noticed it issues a soft reset via waas and that returns success, yet the die still fails to RECONFIGURE successfully in bfgminer. So, a hard reset would be inevitable, and really there's no way at this point to differentiate between a fully successful soft reset and a failure =/
May have to reimplement hard reset no matter what.

EDIT: yeah, what a bummer, it's attempting multiple soft resets w/ no success, yet waas doesn't fail. I don't know if there is a way around that at this point; may have to revert to just hard resets.

EDIT: rethinking it, I may have another way to detect die status as a fallback. Coding that in will be tricky, so it will have to wait till later =)
Damn these things for failing so many different ways ROFL!

GenTarkin
August 12, 2015, 08:41:48 PM
 #40854

TXSteve, so the way these things work: on every loop of the monitoring script, it scans how much current is going through the DCDCs, and if under 5 amps is going through, it increments the value in /var/run/dieXX.
Once /var/run/dieXX goes over the threshold variable, it takes action to reset the die. So... basically, if the ASICs don't have work due to a shitty pool configuration, like when renting, /var/run/dieXX will get incremented; if the pool is shitty enough and the counter goes over the threshold, then no matter what, one of the reset actions will be taken.
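
As a rough illustration of the counter just described (not the monitoring script itself), the per-die bookkeeping could be sketched like this. The function names are made up, and clearing the counter on a healthy reading is an assumption; the post does not say what happens when current is at or above 5 A.

```python
def update_low_current_count(count, current_amps, min_amps=5.0):
    """Return the new per-die counter after one monitoring loop.

    `count` stands in for the value kept in /var/run/dieXX.
    """
    if current_amps < min_amps:
        return count + 1
    # Assumption: a healthy reading clears the counter.
    return 0


def needs_reset(count, threshold):
    """A reset action is taken once the counter goes over the threshold."""
    return count > threshold
```

This also shows why a bad pool can trip it: a die with no work draws little current, so each idle loop looks the same to the counter as a dead die.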
I'm not exactly sure what's a good way to recode this so it can differentiate between possible pool comm issues and an actual die needing a reset.
I could implement maybe 2 thresholds: a lower one for the pool issues, and another maybe double the first one, and if that one is reached it performs a hard reset.
I don't know, what do you think?
Or does anyone else care to chime in?
I'm just kinda doing a ton of trial and error here LOL!
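
One way to picture the two-threshold idea floated above is a small decision function. This is a sketch under assumptions: the name `choose_action` and the string results are invented, the upper threshold is taken as exactly double the lower one, and mapping the lower threshold to a soft reset is a guess, since the post only says what happens at the upper one.

```python
def choose_action(low_current_count, soft_threshold):
    """Pick a reset action from the per-die low-current counter."""
    hard_threshold = 2 * soft_threshold  # "maybe double the first one"
    if low_current_count > hard_threshold:
        return "hard_reset"
    if low_current_count > soft_threshold:
        # Assumption: the lower threshold only triggers the gentler action.
        return "soft_reset"
    return "none"
```

The gap between the two thresholds is what would give a flaky pool time to recover before the cube gets power-cycled.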


How this all worked originally (the way KNC set it up) was one threshold, and if that threshold was hit it just performed soft resets for an eternity.



In the short term, I was thinking, pool issues aside: now that we have this other stupid mode of failure where the waas command succeeds but doesn't really bring the die back to life... I could do another loop which would attempt soft resets up to, say, 5x, sleeping after each attempt, then run the dcdc function to see if current is over 5A, and each time it's not, increase /var/run/dieXX of course... then inside that same loop, if /var/run/dieXX goes over the threshold, start a hard reset. I think that would solve the issue I ran into just today =)
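
The short-term plan above judges a soft reset by the measured current rather than by the waas return code. A hedged sketch of that loop follows; `soft_reset`, `read_current_amps`, and `hard_reset` are hypothetical callables, `count` stands in for the /var/run/dieXX value, and the 2-second sleep is an assumed value since the post gives none.

```python
import time


def soft_reset_with_current_check(soft_reset, read_current_amps, hard_reset,
                                  count, threshold, attempts=5, delay_s=2.0,
                                  min_amps=5.0, sleep=time.sleep):
    """Attempt soft resets, judging success by DCDC current.

    Returns (outcome, new_count).
    """
    for _ in range(attempts):
        soft_reset()  # return value ignored: waas can "succeed" on a dead die
        sleep(delay_s)
        if read_current_amps() > min_amps:
            return "recovered", count
        count += 1
        if count > threshold:
            hard_reset()
            return "hard_reset", count
    return "still_failing", count
```

If the loop ends with "still_failing", the counter carries over to the next monitoring pass, so repeated failures still escalate to a hard reset eventually.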

TXSteve
August 12, 2015, 09:02:23 PM
 #40855

how about this? An "auto on/auto off" button & a "manual reset" button. On problematic rigs we turn auto reset off, & use manual reset as needed. It still beats physically powering down the rig, and restarting it ...just a thought

I only have one die with these flaky soft resets, so it may not be a huge overall problem
GenTarkin
August 12, 2015, 09:13:57 PM
 #40856

I'm trying to keep this as automated and as minimally invasive for the user as possible =). It's hard for me to code custom stuff for the webgui since I don't have much experience w/ all the crazy shit they have going on in it. LOL

TXSteve
August 13, 2015, 12:54:01 AM
 #40857

TXSteve, so the way these things work: on every loop of the monitoring script, it checks how much current is going through the DCDCs, and if under 5 amps is going through, it increments the /var/run/dieXX value.
Once /var/run/dieXX goes over the threshold variable, it takes action to reset the die. So... basically, if the ASICs don't have work due to a shitty pool configuration, like when renting, /var/run/dieXX still gets incremented. If the pool is shitty enough and the count goes over the threshold, then no matter what, one of the reset actions will be taken.
I'm not exactly sure what's a good way to recode this so it can differentiate between possible pool comm issues and an actual die needing a reset.
I could implement maybe 2 thresholds: a lower one for the pool issues, then another at maybe double the first one? If that one is reached, it performs a hard reset.
I don't know, what do you think?
Or anyone else care to chime in?
I'm just kinda doing a ton of trial and error here LOL!
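
To make the counter idea concrete, here is a rough Python sketch of the two-threshold proposal. It is not the firmware's actual script: the class name, the return strings and the choice to clear the count once current is back are all assumptions made for illustration. Only the 5 A limit, the "over the threshold" comparison and "second threshold is double the first" come from the post.

```python
MIN_CURRENT_AMPS = 5.0  # from the post: under 5 A counts as "die not working"


class DieCounter:
    """Hypothetical stand-in for the number kept in /var/run/dieXX."""

    def __init__(self, soft_threshold):
        self.soft_threshold = soft_threshold
        self.hard_threshold = 2 * soft_threshold  # "double of the first one"
        self.count = 0

    def observe(self, dcdc_amps):
        """Handle one pass of the monitoring loop and say what to do next."""
        if min(dcdc_amps) >= MIN_CURRENT_AMPS:
            # Assumption: a healthy reading clears the count.
            self.count = 0
            return "ok"
        self.count += 1
        if self.count > self.hard_threshold:
            self.count = 0  # assumption: start over after a hard reset
            return "hard_reset"
        if self.count > self.soft_threshold:
            return "soft_reset"
        return "wait"


counter = DieCounter(soft_threshold=2)
for _ in range(5):
    print(counter.observe([1.0, 9.0]))  # one DCDC stuck under 5 A
```

Note that this still cannot tell a bad pool from a dead die; it only delays the hard reset until the lower threshold has had a chance to fix things with soft resets.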


How this all was originally (the way KNC set it up): there was one threshold, and if that threshold was hit, it just performed soft resets for an eternity.



this v.93 seems to be running pretty good, even the flaky soft reset die seems to have stabilized after several hard resets

when bfgminer randomly shuts down I attribute that to hard resets

GenTarkin
Legendary
*
Offline

Activity: 2450
Merit: 1002


View Profile
August 13, 2015, 01:01:48 AM
 #40858

Yep, bfgminer can't be running for a proper dcdc power down / up... I don't know if it has to do w/ bus traffic or what, but it seems the dcdc power down / up does something, yet bfgminer will continually ignore them.
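
The ordering described here (miner stopped first, then the DCDC power cycle, then the miner back up) could be sketched like this in Python. The three callables are hypothetical placeholders, not real firmware commands; the sketch only shows the sequencing, plus a try/finally so the miner gets restarted even if the power cycle throws.

```python
def hard_reset(stop_miner, power_cycle_dcdc, start_miner):
    """Power-cycle a die's DCDCs while the miner is stopped, then restart it.

    All three arguments are hypothetical callables supplied by the caller.
    """
    stop_miner()
    try:
        power_cycle_dcdc()
    finally:
        # Restart the miner even if the power cycle raised an error.
        start_miner()


steps = []
hard_reset(
    stop_miner=lambda: steps.append("stop bfgminer"),
    power_cycle_dcdc=lambda: steps.append("dcdc off/on"),
    start_miner=lambda: steps.append("start bfgminer"),
)
print(steps)
```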

TXSteve
Sr. Member
****
Offline

Activity: 342
Merit: 250


View Profile
August 13, 2015, 01:08:45 AM
 #40859

If those loops are too big a pain in the neck to implement, I wouldn't worry about them
GenTarkin
Legendary
*
Offline

Activity: 2450
Merit: 1002


View Profile
August 13, 2015, 03:34:23 AM
 #40860

Well, I rewrote the soft / hard reset code =)
Basically, once it detects a die in error via /var/run/dieXX,
it calls the reset die function. It first attempts a soft reset... if that fails right off the bat via waas -s failing, then it performs a hard reset. (waas -s shouldn't fail because of pool comm errors)
If waas -s succeeds, then it calls bfgminer to perform its internal die reconfiguration update.
Then the script waits 30 seconds and measures the current output of the die in question.
If either of the DCDCs is below the current threshold, then it increments the error count.
It will loop through the soft die resets up to 10 times; if it fails 10x, then it performs a hard reset.
So, it gives the die roughly 5-6 mins to "work", meaning have current greater than the threshold flowing through it, via soft resets.
If that fails, it hard resets.
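
For anyone who wants to follow the flow without reading the script, the steps above can be sketched roughly like this in Python. This is not the code on github: every function name and parameter is a hypothetical placeholder, the 5 A default is carried over from the earlier posts, and stopping the loop as soon as the current looks healthy is my reading of the description rather than something stated outright.

```python
import time


def reset_die(soft_reset, reconfigure_miner, read_dcdc_amps, hard_reset,
              min_amps=5.0, max_attempts=10, settle_seconds=30,
              sleep=time.sleep):
    """Try soft resets first; fall back to a hard reset.

    soft_reset()        -> bool, False stands in for `waas -s` failing
    reconfigure_miner() -> asks the miner to redo its die configuration
    read_dcdc_amps()    -> list of currents, one per DCDC of the die
    hard_reset()        -> performs the hard reset sequence
    Returns "recovered" or "hard_reset".
    """
    for _ in range(max_attempts):
        if not soft_reset():
            # Soft reset command itself failed: go straight to hard reset.
            hard_reset()
            return "hard_reset"
        reconfigure_miner()
        sleep(settle_seconds)  # give the die time to start drawing current
        if min(read_dcdc_amps()) >= min_amps:
            return "recovered"  # assumption: stop once current is healthy
    hard_reset()  # every soft attempt left a DCDC under the limit
    return "hard_reset"


# Dry run with fake hardware and no real sleeping.
outcome = reset_die(
    soft_reset=lambda: True,
    reconfigure_miner=lambda: None,
    read_dcdc_amps=lambda: [1.2, 8.0],
    hard_reset=lambda: print("hard reset!"),
    sleep=lambda seconds: None,
)
print(outcome)
```

With 10 attempts and a 30 second wait each, the soft reset phase adds up to about 5 minutes of waiting, which lines up with the "roughly 5-6 mins" figure once the reset commands themselves are counted.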

What ya think?
I updated github w/ the changes if you wanna test.

I can say it seems the soft reset logic is working. I have yet to be able to see a hard reset take place, I have to actually wait till my unit acts up to confirm hard reset functionality LOL!
