nima
August 19, 2012, 08:35:26 AM |
Hi ckolivas, I have an issue with one of my rigs (2 x 5850), no matter which version of cgminer I use. GPU1 dies within a couple of hours, the rig stops responding, and sometimes a hard reset is the only way out. The system runs:
CentOS 6.3
kernel 2.6.32-279.5.1.el6.x86_64
glibc-2.12-1.80.el6_3.4.x86_64
glib2-2.22.5-7.el6.x86_64
gcc-4.4.6-4.el6.x86_64
AMD-APP-SDK-v2.5-lnx64
ati-driver-installer-11-12-x86.x86_64
I tried the latest cgminer-2.7.0, not using any optimizations at all except intensity set to 2. The following is grabbed from /var/log/messages.
--------------------------------------------------------------------------------
Aug 19 06:08:21 hostname kernel: [fglrx] ASIC hang happened
Aug 19 06:08:21 hostname kernel: Pid: 3430, comm: cgminer Tainted: P --------------- 2.6.32-279.5.1.el6.x86_64 #1
Aug 19 06:08:21 hostname kernel: Call Trace:
Aug 19 06:08:21 hostname kernel: [<ffffffffa0210c1e>] ? KCL_DEBUG_OsDump+0xe/0x10 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa021e24c>] ? firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa02b3ad9>] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa02b3a7c>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x9c/0xf0 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa02ae3bf>] ? _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xaf/0x170 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa023ae62>] ? firegl_trace+0x72/0x1e0 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa023ae62>] ? firegl_trace+0x72/0x1e0 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa02a77a3>] ? _ZN15QS_PRIVATE_CORE27multiVpuPM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0x33/0x50 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa02a0044>] ? _Z19uQSTimeStampRetiredmjj14_LARGE_INTEGER+0x74/0x80 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa029becd>] ? _Z8uCWDDEQCmjjPvjS_+0x54d/0x10c0 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffff81097ede>] ? down+0x2e/0x50
Aug 19 06:08:21 hostname kernel: [<ffffffffa023d432>] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa023bd60>] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa023bcf0>] ? firegl_cmmqs_CWDDE32+0x0/0x100 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffffa0219ded>] ? firegl_ioctl+0x1ed/0x250 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffff8104452c>] ? __do_page_fault+0x1ec/0x480
Aug 19 06:08:21 hostname kernel: [<ffffffffa020f93e>] ? ip_firegl_unlocked_ioctl+0xe/0x20 [fglrx]
Aug 19 06:08:21 hostname kernel: [<ffffffff8118dff2>] ? vfs_ioctl+0x22/0xa0
Aug 19 06:08:21 hostname kernel: [<ffffffff8118e194>] ? do_vfs_ioctl+0x84/0x580
Aug 19 06:08:21 hostname kernel: [<ffffffff8118e711>] ? sys_ioctl+0x81/0xa0
Aug 19 06:08:21 hostname kernel: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Aug 19 06:08:21 hostname kernel: pubdev:0xffffffffa049ed20, num of device:2 , name:fglrx, major 8, minor 92.
Aug 19 06:08:21 hostname kernel: device 0 : 0xffff88007a1b0000 .
Aug 19 06:08:21 hostname kernel: Asic ID:0x6899, revision:0x2, MMIOReg:0xffffc90000340000.
Aug 19 06:08:21 hostname kernel: FB phys addr: 0xd0000000, MC :0xf00000000, Total FB size :0x40000000.
Aug 19 06:08:21 hostname kernel: gart table MC:0xf0fb27000, Physical:0xdfb27000, size:0x1d8000.
Aug 19 06:08:21 hostname kernel: mc_node :FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf00000000, Physical:0xd0000000, size:0xfd00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0xfb27000, reference count:21, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xfb27000, size:0x1d9000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :INV_FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf0fd00000, Physical:0xdfd00000, size:0x30300000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_USWC, total 2 zones
Aug 19 06:08:21 hostname kernel: MC start:0x260c0000, Physical:0x0, size:0x24c00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x2000000, reference count:21, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_CACHEABLE, total 3 zones
Aug 19 06:08:21 hostname kernel: MC start:0x10400000, Physical:0x0, size:0x15cc0000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1500000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1400000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1300000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1200000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1100000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1000000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xf00000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xe00000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xd00000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xb00000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x800000, size:0x300000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x500000, size:0x300000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x200000, size:0x300000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x200000, reference count:6, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: GRBM : 0xb0633828, SRBM : 0x20000ec0 .
Aug 19 06:08:21 hostname kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x1920 , CP_RB_WPTR :0x1920.
Aug 19 06:08:21 hostname kernel: CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x2686f000.
Aug 19 06:08:21 hostname kernel: last submit IB buffer -- MC :0x2686f000,phys:0x7b4f5000.
Aug 19 06:08:21 hostname kernel: device 1 : 0xffff88003759c000 .
Aug 19 06:08:21 hostname kernel: Asic ID:0x6899, revision:0x2, MMIOReg:0xffffc90004980000.
Aug 19 06:08:21 hostname kernel: FB phys addr: 0xe0000000, MC :0xf00000000, Total FB size :0x40000000.
Aug 19 06:08:21 hostname kernel: gart table MC:0xf0fb27000, Physical:0xefb27000, size:0x1d8000.
Aug 19 06:08:21 hostname kernel: mc_node :FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf00000000, Physical:0xe0000000, size:0xfd00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0xfb27000, reference count:20, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xfb27000, size:0x1d9000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :INV_FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf0fd00000, Physical:0xefd00000, size:0x30300000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_USWC, total 2 zones
Aug 19 06:08:21 hostname kernel: MC start:0x260c0000, Physical:0x0, size:0x24c00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x2000000, reference count:21, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_CACHEABLE, total 3 zones
Aug 19 06:08:21 hostname kernel: MC start:0x10400000, Physical:0x0, size:0x15cc0000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2c00000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2b00000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2a00000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2900000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2800000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2600000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1d00000, size:0x900000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xb00000, size:0x900000, reference count:3, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x200000, reference count:5, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: GRBM : 0xb0633828, SRBM : 0x200006c0 .
Aug 19 06:08:21 hostname kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x6970 , CP_RB_WPTR :0x6990.
Aug 19 06:08:21 hostname kernel: CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x262e7000.
Aug 19 06:08:21 hostname kernel: last submit IB buffer -- MC :0x262e7000,phys:0x693c3000.
Aug 19 06:08:21 hostname kernel: Dump the trace queue.
Aug 19 06:08:21 hostname kernel: End of dump
--------------------------------------------------------------------------------
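To make the dump easier to compare between crashes, here is a rough Python sketch that pulls the command-processor ring pointers (CP_RB_RPTR / CP_RB_WPTR) for each device out of text like the above. The function name `ring_pointers` and the regular expressions are my own and only assume the line layout seen in this log; they are not part of cgminer or fglrx. I am also assuming, without confirmation from the driver documentation, that a read pointer that differs from the write pointer means commands were still queued when the hang was detected. In this dump the two pointers match on device 0 and differ on device 1.

```python
import re

# Matches either a "device N : 0x..." header or the ring-pointer line that
# follows it. The layout is inferred from this one log and is an assumption.
TOKEN_RE = re.compile(
    r"device (?P<dev>\d+) : 0x[0-9a-fA-F]+"
    r"|CP_RB_RPTR\s*:\s*(?P<rptr>0x[0-9a-fA-F]+)\s*,"
    r"\s*CP_RB_WPTR\s*:\s*(?P<wptr>0x[0-9a-fA-F]+)"
)


def ring_pointers(log_text):
    """Return {device_index: (rptr, wptr)} for each device in the dump."""
    pointers = {}
    current = None
    for match in TOKEN_RE.finditer(log_text):
        if match.group("dev") is not None:
            # Remember which device the following register lines belong to.
            current = int(match.group("dev"))
        elif current is not None:
            pointers[current] = (
                int(match.group("rptr"), 16),
                int(match.group("wptr"), 16),
            )
    return pointers


# Excerpt copied from the dump above.
SAMPLE = """
kernel: device 0 : 0xffff88007a1b0000 .
kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x1920 , CP_RB_WPTR :0x1920.
kernel: device 1 : 0xffff88003759c000 .
kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x6970 , CP_RB_WPTR :0x6990.
"""

if __name__ == "__main__":
    for dev, (rptr, wptr) in sorted(ring_pointers(SAMPLE).items()):
        state = "equal" if rptr == wptr else "differ"
        print(f"device {dev}: RPTR={rptr:#x} WPTR={wptr:#x} ({state})")
```

Running it on the whole of /var/log/messages would keep only the last dump per device, since later entries overwrite earlier ones in the dictionary.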
Can you see anything in there that would explain the problem? Thanks in advance.