I am so desperate that I will donate ETH if I get this solved....
6x 1070, ASRock H81 BTC, Win 10, latest Nvidia drivers, Claymore 9.6, 1200 W PSU, virtual memory set to 16 GB.
Even at stock settings, the miner crashes, sometimes immediately and sometimes after 1-2 hours.
I have tried different Nvidia drivers, and even tried previous Claymore versions.
Tried swapping the cards around and setting Gen1/Gen2 in the BIOS (it should work out of the box).
I have connected the 2 Molex connectors on the mobo and am also using 6 powered USB risers.
The only time I can get it to run a bit longer is at stock clocks with the power target lowered to around 60% for all cards.
I have a wall power meter ("wattman") and it says the computer is taking 650 W from the wall, but that's with the low power target; otherwise it is around 900 W.
When it crashes, one card always disappears, both from the miner and from Afterburner (in AB it is still listed but grayed out, so I can't change anything), and I have to reset the PC.
Sometimes when it crashes, the monitor that is connected to one of the cards also shuts off.
Most of the time I get a BSOD after the crash, and if that doesn't happen, the whole computer freezes up.
I really want to think that 1200 W is enough power to drive this.
Edit: Just before it crashes, my mouse cursor is almost unmovable, really laggy. But both CPU and memory are far from maxed out.
My card temps are also not hitting the limit, just around 60 °C.
Any ideas?
My launcher:
timeout /t 60
setx GPU_FORCE_64BIT_PTR 0
setx GPU_MAX_HEAP_SIZE 100
setx GPU_USE_SYNC_OBJECTS 1
setx GPU_MAX_ALLOC_PERCENT 100
setx GPU_SINGLE_ALLOC_PERCENT 100
ethdcrminer64.exe -epool eu1.ethermine.org:4444 -ewal Adress.Miner01 -epsw x -mode 1 -tt 68 -allpools 1
pause
Log:
234 1678 NVML: cannot get current temperature, error 15
18:33:16:250 1678 NVML: cannot get fan speed, error 15
18:33:16:862 1780 GPU 1, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:447 33c GPU 5, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:455 167c GPU 0, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:463 f7c GPU 2, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:472 e60 GPU 3, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:481 10bc GPU 4, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:496 1780 GPU 1, GpuMiner kx failed 1
18:33:17:506 33c GPU 5, GpuMiner kx failed 1
18:33:17:516 167c GPU 0, GpuMiner kx failed 1
18:33:17:527 f7c GPU 2, GpuMiner kx failed 1
18:33:17:534 e60 GPU 3, GpuMiner kx failed 1
18:33:17:543 10bc GPU 4, GpuMiner kx failed 1
18:33:17:505 1780 Set global fail flag, failed GPU1
18:33:17:559 1780 GPU 1 failed
18:33:17:523 167c Set global fail flag, failed GPU0
18:33:17:582 167c GPU 0 failed
18:33:17:542 e60 Set global fail flag, failed GPU3
18:33:17:599 e60 GPU 3 failed
18:33:17:516 33c Set global fail flag, failed GPU5
18:33:17:627 33c GPU 5 failed
18:33:17:533 f7c Set global fail flag, failed GPU2
18:33:17:641 f7c GPU 2 failed
18:33:17:550 10bc Set global fail flag, failed GPU4
18:33:17:657 10bc GPU 4 failed
18:33:17:665 177c GPU 1, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:673 bc0 GPU 5, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:687 258 GPU 0, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:701 768 GPU 2, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:711 1170 GPU 3, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:718 e74 GPU 4, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:730 177c GPU 1, GpuMiner kx failed 1
18:33:17:738 bc0 GPU 5, GpuMiner kx failed 1
18:33:17:746 258 GPU 0, GpuMiner kx failed 1
18:33:17:754 768 GPU 2, GpuMiner kx failed 1
18:33:17:773 1170 GPU 3, GpuMiner kx failed 1
18:33:17:780 e74 GPU 4, GpuMiner kx failed 1
18:33:17:737 177c Set global fail flag, failed GPU1
18:33:17:795 177c GPU 1 failed
18:33:17:752 258 Set global fail flag, failed GPU0
18:33:17:821 258 GPU 0 failed
18:33:17:779 1170 Set global fail flag, failed GPU3
18:33:17:841 1170 GPU 3 failed
18:33:17:745 bc0 Set global fail flag, failed GPU5
18:33:17:856 bc0 GPU 5 failed
18:33:17:788 e74 Set global fail flag, failed GPU4
18:33:17:872 e74 GPU 4 failed
18:33:17:771 768 Set global fail flag, failed GPU2
18:33:17:887 768 GPU 2 failed
18:33:19:613 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:19:629 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:19:738 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:19:754 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:19:848 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:19:848 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:19:957 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:19:957 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:051 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:051 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:145 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:145 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:238 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:238 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:332 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:332 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:426 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:426 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:520 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:520 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:613 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:613 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
18:33:20:707 1678 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
18:33:20:707 1678 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
1200 W is definitely enough; a 1200 W PSU can easily pull 1300 W from the wall at full load.
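To put rough numbers on that, here is a small Python sketch using the wall readings from the post (650 W and 900 W). The 90% efficiency figure is an assumption, not something measured on this PSU, and the function names are made up for illustration; check the label or spec sheet of the actual unit.

```python
PSU_RATED_W = 1200   # rated DC output, from the post
EFFICIENCY = 0.90    # ASSUMPTION: plausible for a decent PSU, not measured here


def dc_load(wall_watts, efficiency=EFFICIENCY):
    """Estimate the DC load the PSU delivers for a given wall reading."""
    return wall_watts * efficiency


def wall_draw_at_full_load(rated_watts=PSU_RATED_W, efficiency=EFFICIENCY):
    """Estimate the wall draw when the PSU delivers its full rated output."""
    return rated_watts / efficiency


if __name__ == "__main__":
    # Wall readings reported in the post.
    for label, wall in (("60% power target", 650), ("stock", 900)):
        load = dc_load(wall)
        share = load / PSU_RATED_W
        print(f"{label}: ~{load:.0f} W DC, {share:.0%} of rated output")
    print(f"wall draw at full rated load: ~{wall_draw_at_full_load():.0f} W")
```

Under that assumption the rig sits well below the rated output even at stock, so total wattage is probably not the problem; how the load is spread over the individual cables is a separate question.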
I would suggest you double-check the basics:
- reset BIOS to default
- set PCI-E to Gen2 and disable the built-in video adapter
- do not use the Molex connectors on the MB if you have powered risers
- restore the default BIOS and default settings on all cards
- install all the latest updates for Windows (Run > winver; it must show 1703)
- uninstall with DDU and reinstall the Nvidia drivers recommended by Claymore ("10xx cards in Windows 10 x64: just use latest 372.54 drivers from Nvidia website, note that you must have Win10 Anniversary update")
- make sure you have more than 4 GB of RAM
- set virtual memory to more than 16 GB
- set up exclusions for the miner in your antivirus software
- make sure that you have no more than 2 cards per SATA PSU cable
- use only essential options in the config/bat file
- avoid temperature or overclocking options inside the config/bat file
- run 1 card only
- use MSI AB to apply minimal settings: decrease core clock and memory clock
- if stable, add a second card
- reinstall the driver
- apply the minimal settings again: decrease core clock and memory clock
- add more cards one by one
- if running stable on minimal settings, increase each parameter one by one
- if stable, you may try to overclock it
- use GPU-Z and HWiNFO to monitor the hardware
Post screenshots here. We might notice something odd.
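Along with screenshots, a quick pass over the miner log can show which card reported the kernel failure first. Below is a minimal Python sketch, assuming the log lines look like the ones posted above; the `summarize` function and the sample text are only for illustration. In the posted log, GPU 1 fails at 18:33:16:862, roughly half a second before the others, which may hint at that card or its riser, though a single crash proves little.

```python
import re
from collections import Counter

# Matches lines such as:
# 18:33:16:862 1780 GPU 1, GpuMiner cu_k1 failed 6, the launch timed out ...
KERNEL_FAIL = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}:\d{3})\s+(?P<thread>[0-9a-fA-F]+)\s+"
    r"GPU (?P<gpu>\d+), GpuMiner (?P<kernel>\w+) failed (?P<code>\d+)"
)


def summarize(log_text):
    """Return (first_failure, counts).

    first_failure is (timestamp, gpu, kernel) for the earliest kernel
    failure, or None if there is none; counts maps GPU index to the
    number of kernel failures logged for it.
    """
    first = None
    counts = Counter()
    for line in log_text.splitlines():
        m = KERNEL_FAIL.match(line.strip())
        if not m:
            continue  # NVML warnings, truncated lines, etc.
        gpu = int(m.group("gpu"))
        counts[gpu] += 1
        # Timestamps are fixed-width, so string comparison orders them
        # correctly as long as the log does not cross midnight.
        if first is None or m.group("time") < first[0]:
            first = (m.group("time"), gpu, m.group("kernel"))
    return first, counts


# A few lines copied from the posted log.
SAMPLE = """\
18:33:16:250 1678 NVML: cannot get fan speed, error 15
18:33:16:862 1780 GPU 1, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:447 33c GPU 5, GpuMiner cu_k1 failed 6, the launch timed out and was terminated
18:33:17:496 1780 GPU 1, GpuMiner kx failed 1
"""

if __name__ == "__main__":
    first, counts = summarize(SAMPLE)
    print("first failure:", first)
    print("failures per GPU:", dict(counts))
```

If the same GPU index shows up first across several crash logs, that card, its riser, or its power cable is a reasonable place to start swapping parts.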