Nexillus
|
|
August 02, 2017, 11:02:42 AM |
|
Quote:
Hi guys, I have two identical rigs. One of them is rock stable; the other freezes 0.5 to 2 times a day and I don't know why. Please help me find a solution. I have tried what you wrote before, but nothing changed. I set up the reboot script, but in this situation it is not working. Thank you in advance!
http://ballai.hu/image1.jpg
http://ballai.hu/image2.png

That seems like a hardware issue. When the watchdog reports which GPU the problem is with, it is usually the first one. Looking at the pictures, I would double-check the risers for each slot and maybe even re-mount the one in GPU slot 0 (logical slots do not always match the physical ones, depending on how the motherboard enumerated the PCI-e lanes at boot). If you are not sure which one is which, use nvidia-smi and change the fan speed to 0 (when not mining) to find out which slot belongs to which card. What is your overclock on the second rig for all the GPUs?
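To make the "which logical GPU is which physical card" step less of a guessing game, here is a minimal Python sketch that lists each GPU's index next to its PCI bus ID and current fan speed. It assumes nvidia-smi is on the PATH and uses its `--query-gpu` CSV output; the helper names and the SAMPLE text are made up for illustration and are not part of nvOC.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi's --query-gpu interface.
QUERY_FIELDS = "index,pci.bus_id,name,fan.speed"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader` output into dicts."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        index, bus_id, name, fan = (field.strip() for field in row)
        gpus.append({"index": int(index), "bus_id": bus_id, "name": name, "fan": fan})
    return gpus


def query_gpus():
    """Run nvidia-smi on the rig and return the parsed GPU list."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_gpu_csv(out)


# Hypothetical output, only to show the shape of the data.
SAMPLE = """0, 00000000:01:00.0, GeForce GTX 1070, 45 %
1, 00000000:02:00.0, GeForce GTX 1070, 0 %
"""
```

Calling `parse_gpu_csv(SAMPLE)` gives one dict per card; on a real rig, `query_gpus()` run after stopping one fan should show which index (and bus ID) belongs to the card you are looking at.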
|
|
|
|
ivoldemar
Newbie
Offline
Activity: 23
Merit: 0
|
|
August 02, 2017, 03:55:42 PM |
|
|
|
|
|
Nexillus
|
|
August 02, 2017, 05:18:12 PM |
|
On the road; I will respond to everyone fully when I get back to a desktop, and reliable internet in a few days.
For those asking where fullzero is: he will be back in a few days.
|
|
|
|
Bibi187
Full Member
Offline
Activity: 420
Merit: 106
https://steemit.com/@bibi187
|
|
August 02, 2017, 06:28:32 PM |
|
Three days without a post is not a drama. Maybe work on it yourself a little?
|
|
|
|
darkfortedx
Newbie
Offline
Activity: 9
Merit: 0
|
|
August 02, 2017, 08:29:20 PM |
|
I still get so many restarts on two of my rigs.
In the restart file it says:
Sun Jul 23 09:46:05 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:23:18 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:29:41 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 11:32:27 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:51:03 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 15:54:19 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:57:39 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:00:35 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:04:25 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:07:43 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:11:29 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:19:42 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:22:23 EDT 2017 - Starting miner restart script
Sun Jul 23 16:27:43 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:30:46 EDT 2017 - Starting miner restart script.
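A log like this is easier to reason about once it is reduced to numbers. The following is a rough Python sketch that parses the restart-file format shown in this post and reports, for each forced reboot, how many minutes passed since the most recent "Starting miner restart script" entry. The function names are hypothetical, and the timezone token (EDT) is simply dropped on the assumption that all entries share one timezone.

```python
import re
from datetime import datetime

# "Sun Jul 23 11:29:41 EDT 2017 - message": capture date/time, year, and message.
LINE_RE = re.compile(r"^(\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2}) \w+ (\d{4}) - (.*)$")


def parse_watchdog_log(text):
    """Return a list of (timestamp, message) tuples from the restart file."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore anything that is not a log entry
        stamp = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%a %b %d %H:%M:%S %Y")
        events.append((stamp, m.group(3)))
    return events


def minutes_before_reboot(events):
    """For each forced reboot, minutes since the most recent script start."""
    result = []
    last_start = None
    for stamp, message in events:
        if message.startswith("Starting miner restart script"):
            last_start = stamp
        elif message.startswith("Utilization is too low") and last_start is not None:
            result.append((stamp - last_start).total_seconds() / 60)
    return result


# First four entries from the post above.
SAMPLE = """Sun Jul 23 09:46:05 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:23:18 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:29:41 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 11:32:27 EDT 2017 - Starting miner restart script.
"""
```

Feeding the whole restart file through `minutes_before_reboot(parse_watchdog_log(text))` shows whether the rig tends to die shortly after the miner starts or only after hours of running, which may help separate an overclock that is unstable from the first minute from a slower hardware problem.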
|
|
|
|
Nexillus
|
|
August 02, 2017, 11:12:00 PM |
|
What are the OC settings? Without more information, it looks like a soft crash on the GPUs.
|
|
|
|
salfter
|
|
August 02, 2017, 11:55:18 PM |
|
Quote:
Amazing switch, thanks a lot. So you wrote how to add new coins and miners; I wanted to know if it's possible to add coins that don't have an MPH pool, like LBRY or Decred? Is it OK to add them to mph_conf.json with other pools like zpool or suprnova? Can the script get their updated profit ratio? Or does it work only for coins that have an MPH pool?

The NiceHash and MiningPoolHub switchers base their decisions on information provided by the respective pools. MiningPoolHub doesn't support LBRY, so they provide no information on its profitability compared to the coins that they do support. I also wanted to support pools that will exchange mined altcoins for Bitcoin automatically.

I wrote another switcher previously (https://gitlab.com/salfter/CryptoSwitcher) that worked with any pool, though I don't think I had exchanges fully automated (and Cryptsy and BTC-e have both fallen by the wayside) and I don't recall how well it would've handled multi-algorithm mining as I was using SHA256 and Scrypt ASIC miners at the time (this was back when an Antminer S1 was useful as more than just a space heater). Also, CryptoSwitcher used full-node coin daemons (bitcoind, litecoind, etc.) as an independent source of mining stats (and I might've had pools paying out to local wallets); having a bunch of those running chews up lots of RAM and disk I/O.

Quote:
Another question: when setting up server:port, should I set the auto-switch port or the normal port? For example, ethereum/ethash has a 20535 port and an auto-switch 17020 port; which one should be added to mph_conf.json?

The MPH switcher builds miner commands from information provided by their API, including host and port numbers. It should automatically pick normal ports (such as 20535 for Ethereum). The only configuration you should need to do is in the first few lines of the config file...things like your username and miner name. If you benchmark the different algorithms on your cards, you could tweak the speed and power-consumption figures to match your system, though (especially if you're running 1070s) the numbers I put in are probably a good start.

Quote:
It would be nice if you could give us a multi-pool / multi-coin profitability switch based on http://whattomine.com/coins.json so it switches to the best coin from the 1bash coins/pools/miners config file.

I'd then need to dig into the current exchanges' APIs and figure out how best to automate their usage in the current environment. I don't want a bunch of different altcoins hanging around. (...though I did find a non-trivial amount of NeosCoin in my wallet the other day that has shot up in value over the past couple of years or so since it was mined...might need to go trade it in.)
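To illustrate how per-algorithm speed and power-consumption figures can feed a switching decision, here is a minimal Python sketch. It is not the actual MPH or NiceHash switcher code: the `AlgoProfile` class, the function names, the rate units (BTC per unit of hashrate per day), and all numbers are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AlgoProfile:
    name: str
    hashrate: float      # rig speed, in the unit the pool quotes profitability for
    power_watts: float   # wall power while mining this algorithm


def net_profit_per_day(profile, btc_per_unit_day, btc_price, kwh_price):
    """Daily revenue minus electricity cost, in the currency of btc_price."""
    revenue = profile.hashrate * btc_per_unit_day * btc_price
    power_cost = profile.power_watts / 1000 * 24 * kwh_price
    return revenue - power_cost


def pick_best(profiles, pool_rates, btc_price, kwh_price):
    """Return (algorithm name, net profit) for the most profitable algorithm.

    Algorithms the pool reports no rate for are skipped, since there is
    nothing to compare them against.
    """
    best = None
    for p in profiles:
        rate = pool_rates.get(p.name)
        if rate is None:
            continue  # pool gives no data for this algorithm
        profit = net_profit_per_day(p, rate, btc_price, kwh_price)
        if best is None or profit > best[1]:
            best = (p.name, profit)
    return best
```

The skip for missing rates mirrors the point made above: a switcher that relies on one pool's figures cannot rank a coin that pool does not list, so supporting such coins would need a second data source.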
|
|
|
|
salfter
|
|
August 03, 2017, 12:12:36 AM |
|
In other news, the risers I ordered nearly a month ago finally arrived earlier this week. After fixing an unrelated power-supply issue, I rearranged my rig to use the risers, powered it up...and saw only about half of the hashrate I had previously been getting, with nvidia-smi indicating per-card power consumption fluctuating all over the place. BIOS settings are as recommended. I've updated the BIOS and redid the settings. I've tried slowing down the bus. I've tried plugging the risers into 1x slots instead of 16x. PCIe spread spectrum is still enabled; would disabling it likely help or hurt? Beyond that, the only thing I can think to try is to install Windows and see if it'll work any better, but then I'd have to port my mining switchers to it (if that's even possible...are the command-line overclocking tools used by nvOC supported on Windows?). I got so fed up with it that I just put everything aside late yesterday evening and have been mining sweet bugger-all since.

In other, other news, as an alternative to flaky USB PCIe risers that never seem to work right, I stumbled across this: http://shop.dmp.com.tw/INT/products/23

Depending on how the module's configured, it might need a jumper removed to change the PCIe 1x connector at the end from a target to a host, but once that's done, I'm thinking a simple adapter board might be possible that would hold a GPU in one slot, this card in another, and some power-supply circuitry as appropriate. Add a MicroSD card and an Ethernet jack, and it'd potentially turn a GPU into a standalone miner. The module's also available by itself for inclusion in your own designs, which might be more appropriate in the long run...but just to see if the idea would work? If 128MB RAM isn't likely to be enough, the 1GB version is $15 more. It should run Linux without issue, and as long as nVidia and AMD avoid certain unsupported instructions (someone said CMOV isn't supported), their drivers ought to work on it.

It's not particularly speedy, but it ought to be fast enough to keep a GPU fed with mining data.
|
|
|
|
TenaciousJ
|
|
August 03, 2017, 01:41:44 AM |
|
Hey FZ,
I've got a problem with log files: /var/log/kern.log and /var/log/syslog are filling up all available space (currently each file is 22.7 GB, eating up 45 GB of a 64 GB thumb drive). All the errors in kern.log and syslog seem to be of this type:
Aug 2 19:53:48 m1-desktop kernel: [ 103.261020] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261021] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261025] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261028] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261033] pcieport 0000:00:1d.0: can't find device of ID00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261060] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261065] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261066] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261067] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261070] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261075] pcieport 0000:00:1d.0: can't find device of ID00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261086] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261090] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261092] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261093] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261126] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261130] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261131] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261132] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261162] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261166] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261167] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261168] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261189] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261193] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261194] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261195] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261226] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261230] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261231] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261235] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261238] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261244] pcieport 0000:00:1d.0: can't find device of ID00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261264] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261269] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Aug 2 19:53:48 m1-desktop kernel: [ 103.261270] pcieport 0000:00:1d.0: device [8086:a299] error status/mask=00000001/00002000
Aug 2 19:53:48 m1-desktop kernel: [ 103.261271] pcieport 0000:00:1d.0: [ 0] Receiver Error
Aug 2 19:53:48 m1-desktop kernel: [ 103.261277] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261282] pcieport 0000:00:1d.0: can't find device of ID00e8
Aug 2 19:53:48 m1-desktop kernel: [ 103.261289] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
How can I either disable verbose logging for these errors or prevent the log files from getting so large? It's literally to the point where I can't even save oneBash changes because there isn't any room left.
|
|
|
|
salfter
|
|
August 03, 2017, 04:06:11 AM |
|
FWIW, I was sometimes seeing similar errors when I had my 1070s connected through risers. How's your rig set up?
|
|
|
|
TenaciousJ
|
|
August 03, 2017, 04:48:32 AM |
|
Quote:
FWIW, I was sometimes seeing similar errors when I had my 1070s connected through risers. How's your rig set up?

I'm running a TB250-Pro BTC 12-GPU board with mixed GPUs, and I can't figure out which one is throwing the error. I've got 2 1080 Tis, 1 980 Ti, and 5 1070s, all on risers. It may be the 980 Ti Hybrid: it's been failing occasionally while mining, causing all cards to start doing 0 sols... which creates another problem, in that the watchdog script doesn't seem to detect the 0 sols because the power usage is still at full, so it doesn't reset the system. I'm sure there's a command to find which PCIe device is throwing the error, and I'm equally sure I don't know what it is lol. Maybe lspci would give me the actual card name of the device attached to that PCIe slot ID, but with only numbers I'm not sure how to track down the right card other than by process of elimination, which is a pretty big time sink.
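As a starting point for tracking down the noisy slot, here is a small Python sketch that tallies the AER messages by reporting port and decodes the `id=` field. The decoding relies on the standard PCI requester ID layout (bus in the high byte, then 5 bits of device and 3 bits of function); the function names are made up for this example. With the log above, `id=00e8` decodes to 00:1d.0, which is the reporting root port itself, so the card to check is most likely the one on the link directly behind that port (something like `lspci -tv` should show what sits there).

```python
import re
from collections import Counter

# Matches the "PCIe Bus Error" summary line that starts each AER report.
AER_RE = re.compile(
    r"pcieport (\S+): PCIe Bus Error: severity=(\w+), type=([^,]+), id=(\w+)"
)


def tally_aer_errors(lines):
    """Count AER reports per (port address, severity, error layer)."""
    counts = Counter()
    for line in lines:
        m = AER_RE.search(line)
        if m:
            port, severity, layer, _requester_id = m.groups()
            counts[(port, severity, layer.strip())] += 1
    return counts


def decode_requester_id(hex_id):
    """Turn a 16-bit requester ID such as '00e8' into 'bus:device.function'."""
    value = int(hex_id, 16)
    bus = value >> 8
    device = (value >> 3) & 0x1F
    function = value & 0x7
    return f"{bus:02x}:{device:02x}.{function}"


# A few lines copied from the log in this thread.
SAMPLE = [
    "Aug 2 19:53:48 m1-desktop kernel: [ 103.261020] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)",
    "Aug 2 19:53:48 m1-desktop kernel: [ 103.261025] pcieport 0000:00:1d.0: [ 0] Receiver Error",
    "Aug 2 19:53:48 m1-desktop kernel: [ 103.261065] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)",
]
```

On a real rig the same function can be fed `open("/var/log/kern.log")` directly; if more than one port shows up in the tally, the counts indicate which riser to reseat first.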
|
|
|
|
papampi
Full Member
Offline
Activity: 686
Merit: 140
Linux FOREVER! Resistance is futile!!!
|
|
August 03, 2017, 05:11:55 AM Last edit: August 03, 2017, 11:07:14 AM by papampi |
|
Thanks a lot for all the detailed info. Then, if I'm not being rude, is it OK to ask for a zpool.ca switch?
|
|
|
|
od808
Newbie
Offline
Activity: 1
Merit: 0
|
|
August 03, 2017, 10:02:02 AM |
|
I'm using a TB250-Pro BTC with a 1050 Ti and a 1070 and found the same error. I fixed it with the steps below:
$ sudo cp -p /etc/default/grub /etc/default/grub.bk
$ sudo vi /etc/default/grub
Add "pci=noaer" to the GRUB command line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=noaer"
$ sudo update-grub
$ sudo reboot
The error messages were gone. It seems to have no impact; my rig is running normally.
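For anyone who has to apply this on several rigs, the manual edit can be sketched as a small Python helper that adds the parameter only if it is missing. The function name is hypothetical, and the sketch only transforms the file's text; backing up /etc/default/grub, writing the result, and running update-grub and reboot are still done as in the steps above. Note that pci=noaer only turns off the kernel's AER reporting: it stops the log flood but does not fix whatever is causing the link errors.

```python
import re


def add_kernel_param(grub_text, param="pci=noaer"):
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT.

    The edit is idempotent: if the parameter is already present,
    the text is returned unchanged.
    """
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return f'{match.group(1)}{" ".join(params)}{match.group(3)}'

    new_text, count = pattern.subn(repl, grub_text, count=1)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT line not found")
    return new_text
```

Keeping the function pure (text in, text out) makes it easy to try on a copy of the file before touching the real one.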
|
|
|
|
darkfortedx
Newbie
Offline
Activity: 9
Merit: 0
|
|
August 03, 2017, 02:56:33 PM |
|
Quote:
What are the OC settings? Seems like a softcrash on the GPUs without getting more information.

The OC settings I'm using are 100/1050. I also tried 0/900.
|
|
|
|
VoskCoin
|
|
August 03, 2017, 03:46:27 PM Last edit: August 03, 2017, 04:22:03 PM by VoskCoin |
|
Can you mine KMD with the current setup? I put the settings into the ZEC line, but it's not connecting to the pool.
Do you plan to add it soon?
|
|
|
|
TenaciousJ
|
|
August 03, 2017, 04:50:57 PM |
|
I'm using a TB250-Pro BTC with 1050 Ti and 1070 cards and hit the same error. I fixed it with the steps below:
$ sudo cp -p /etc/default/grub /etc/default/grub.bk
$ sudo vi /etc/default/grub
Add pci=noaer to the GRUB_CMDLINE_LINUX_DEFAULT line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=noaer"
$ sudo update-grub
$ sudo reboot
The error messages are gone. It seems to have no side effects; my rig is running normally.
Thanks for the assist... I updated grub as you suggested and it seems to be working normally.
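For anyone who would rather script the grub edit than do it by hand in vi, here is a minimal sketch. The function name add_kernel_param is made up for illustration (it is not part of nvOC or GRUB), and it assumes GNU sed and a GRUB_CMDLINE_LINUX_DEFAULT line in double quotes, as on a stock Ubuntu install. Keep in mind that pci=noaer only turns off PCIe error reporting; it does not fix whatever is causing the link errors.

```shell
#!/bin/sh
# Sketch only: add a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT if it is
# not already there. add_kernel_param is a hypothetical helper name.
# Assumes GNU sed (for -i) and a parameter without "/" characters.
add_kernel_param() {
    param="$1"   # e.g. pci=noaer
    file="$2"    # e.g. /etc/default/grub

    # Do nothing if the parameter is already present on the line.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi

    # Keep a backup next to the original, as in the manual steps.
    cp -p "$file" "$file.bk"

    # Append the parameter inside the existing quotes.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\([^\"]*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"/" "$file"
}
```

Run as root, this would be called as add_kernel_param pci=noaer /etc/default/grub, followed by update-grub and a reboot. Calling it twice is harmless because the second call finds the parameter and leaves the file alone.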
|
|
|
|
Nexillus
|
|
August 03, 2017, 06:59:23 PM |
|
I still get so many restarts on two of my rigs.
In the restart file it says
Sun Jul 23 09:46:05 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:23:18 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:29:41 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 11:32:27 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:51:03 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 15:54:19 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:57:39 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:00:35 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:04:25 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:07:43 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:11:29 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:19:42 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:22:23 EDT 2017 - Starting miner restart script
Sun Jul 23 16:27:43 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:30:46 EDT 2017 - Starting miner restart script.
What are the OC settings? Without more information, it looks like a soft crash on the GPUs.
The OC settings I'm using are 100/1050. I also tried 0/900.
What is the power level?
|
|
|
|
lards
Newbie
Offline
Activity: 50
Merit: 0
|
|
August 03, 2017, 07:06:07 PM |
|
Hey Guys,
I have a problem loading the OS: it tells me "xorg PROBLEM DETECTED", then reboots and shows:
error: unknown filesystem
grub rescue>
What could this be and how can I solve it? I used the flashing tools as described and tried at least twice. I am using an ASRock H110 and, at the moment, just one Manli P106-100 card so I can test whether the OS installs before installing all 13 cards.
|
|
|
|
darkfortedx
Newbie
Offline
Activity: 9
Merit: 0
|
|
August 03, 2017, 07:23:02 PM |
|
I still get so many restarts on two of my rigs.
In the restart file it says
Sun Jul 23 09:46:05 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:23:18 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:29:41 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 11:32:27 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:51:03 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 15:54:19 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:57:39 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:00:35 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:04:25 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:07:43 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:11:29 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:19:42 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:22:23 EDT 2017 - Starting miner restart script
Sun Jul 23 16:27:43 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:30:46 EDT 2017 - Starting miner restart script.
What are the OC settings? Without more information, it looks like a soft crash on the GPUs.
The OC settings I'm using are 100/1050. I also tried 0/900.
What is the power level?
I left it at 125, but I also tried 135 and I get the same issue.
|
|
|
|
Nexillus
|
|
August 03, 2017, 08:19:29 PM |
|
I still get so many restarts on two of my rigs.
In the restart file it says
Sun Jul 23 09:46:05 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:23:18 EDT 2017 - Starting miner restart script.
Sun Jul 23 11:29:41 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 11:32:27 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:51:03 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 15:54:19 EDT 2017 - Starting miner restart script.
Sun Jul 23 15:57:39 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:00:35 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:04:25 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:07:43 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:11:29 EDT 2017 - Starting miner restart script.
Sun Jul 23 16:19:42 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:22:23 EDT 2017 - Starting miner restart script
Sun Jul 23 16:27:43 EDT 2017 - Utilization is too low: reviving did not work so restarting system in 10 seconds
Sun Jul 23 16:30:46 EDT 2017 - Starting miner restart script.
What are the OC settings? Without more information, it looks like a soft crash on the GPUs.
The OC settings I'm using are 100/1050. I also tried 0/900.
What is the power level?
Hmm, I would physically check all the risers for both a solid PCIe connection and power. I had one riser that I hadn't seated all the way, and it caused a similar issue. Your OC and PL are either default or low enough that they shouldn't be causing an issue.
I left it at 125, but I also tried 135 and I get the same issue.
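While checking risers, it can help to see whether the reboots get less frequent after each change. A rough sketch that counts watchdog-triggered reboots per day from the restart log quoted above; count_reboots_per_day is a made-up helper name, the log path is whatever file your rig writes these lines to, and it assumes the date format shown in the quoted log (weekday, month, day, time, zone, year).

```shell
#!/bin/sh
# Sketch only: count "restarting system" entries per day in a restart log.
# count_reboots_per_day is a hypothetical helper; pass the log path as $1.
count_reboots_per_day() {
    log_file="$1"
    # Fields: $2 = month, $3 = day, $6 = year (e.g. "Sun Jul 23 11:29:41 EDT 2017 - ...")
    awk '/restarting system/ { days[$2 " " $3 " " $6]++ }
         END { for (d in days) print d ": " days[d] }' "$log_file" | sort
}
```

Only the "Utilization is too low ... restarting system" lines are counted; the "Starting miner restart script" lines are ignored, since they also appear after a normal boot.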
|
|
|
|
|