sunbird (OP)
Newbie
Offline
Activity: 23
Merit: 0
October 27, 2011, 02:10:28 AM
Ah ha! So, after running for about 36 hours, I was going to tweak the settings on one of the miners, but CTRL-C didn't do anything. I noticed that all the Mhash/sec figures were frozen. I rebooted, tried to start the miner again, and the machine froze up for a bit, and then syslog had this to say:

Oct 26 15:34:03 kernel: [ 690.520008] [fglrx] ASIC hang happened
Oct 26 15:34:03 kernel: [ 690.520020] Pid: 9047, comm: clinfo Tainted: P 2.6.38-12-generic #51-Ubuntu
Oct 26 15:34:03 kernel: [ 690.520026] Call Trace:
Oct 26 15:34:03 kernel: [ 690.520117] [<ffffffffa076dd3e>] ? KCL_DEBUG_OsDump+0xe/0x10 [fglrx]
Oct 26 15:34:03 kernel: [ 690.520196] [<ffffffffa077b28c>] ? firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
Oct 26 15:34:03 kernel: [ 690.520325] [<ffffffffa0802519>] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
Oct 26 15:34:03 kernel: [ 690.520448] [<ffffffffa08024cc>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x6c/0xb0 [fglrx]
Oct 26 15:34:03 kernel: [ 690.520572] [<ffffffffa08081f2>] ? _ZN19mmEngineR600_DRMDMA4idleEv+0x72/0xc0 [fglrx]
Oct 26 15:34:03 kernel: [ 690.520693] [<ffffffffa07faf04>] ? _ZN14CMMHeapManager22freeAllExpiredTSMemoryEj+0x64/0xe0 [fglrx]
Oct 26 15:34:03 kernel: [ 690.520816] [<ffffffffa07fc3c6>] ? _ZN18mmEnginesContainer4idleEv+0x46/0x60 [fglrx]
Oct 26 15:34:03 kernel: [ 690.520935] [<ffffffffa07fa34d>] ? _ZN15QS_PRIVATE_CORE7idleAllE15idle_WaitMethod+0x2d/0x40 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521050] [<ffffffffa07e4bc5>] ? _ZN3MSF19doGarbageCollectionEv+0x35/0x260 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521061] [<ffffffff8108d81e>] ? down+0x2e/0x50
Oct 26 15:34:03 kernel: [ 690.521127] [<ffffffffa07698e6>] ? KCL_SPINLOCK_STATIC_Release+0x16/0x20 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521213] [<ffffffffa079a1c2>] ? firegl_cmmqs_ProcessTerminate+0x32/0xc0 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521287] [<ffffffffa0775068>] ? firegl_release_helper+0x3a8/0x6c0 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521362] [<ffffffffa0776b50>] ? firegl_release+0x60/0x1c0 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521426] [<ffffffffa0767d91>] ? ip_firegl_release+0x11/0x20 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521436] [<ffffffff811661ee>] ? __fput+0xbe/0x200
Oct 26 15:34:03 kernel: [ 690.521444] [<ffffffff81166355>] ? fput+0x25/0x30
Oct 26 15:34:03 kernel: [ 690.521451] [<ffffffff81162c30>] ? filp_close+0x60/0x90
Oct 26 15:34:03 kernel: [ 690.521461] [<ffffffff81069e08>] ? put_files_struct+0x88/0xf0
Oct 26 15:34:03 kernel: [ 690.521469] [<ffffffff81069f34>] ? exit_files+0x54/0x70
Oct 26 15:34:03 kernel: [ 690.521477] [<ffffffff8106a425>] ? do_exit+0x175/0x410
Oct 26 15:34:03 kernel: [ 690.521549] [<ffffffffa076f6e3>] ? drm_free+0xf3/0x180 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521558] [<ffffffff8106a878>] ? do_group_exit+0x58/0xd0
Oct 26 15:34:03 kernel: [ 690.521566] [<ffffffff8107abf7>] ? get_signal_to_deliver+0x247/0x410
Oct 26 15:34:03 kernel: [ 690.521650] [<ffffffffa0798170>] ? firegl_cmmqs_CWDDE32+0x0/0x100 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521658] [<ffffffff8100b936>] ? do_signal+0x56/0x180
Oct 26 15:34:03 kernel: [ 690.521723] [<ffffffffa0767d6e>] ? ip_firegl_unlocked_ioctl+0xe/0x20 [fglrx]
Oct 26 15:34:03 kernel: [ 690.521733] [<ffffffff811764ef>] ? do_vfs_ioctl+0x8f/0x360
Oct 26 15:34:03 kernel: [ 690.521740] [<ffffffff8100bae5>] ? do_notify_resume+0x65/0x80
Oct 26 15:34:03 kernel: [ 690.521748] [<ffffffff81176851>] ? sys_ioctl+0x91/0xa0
Oct 26 15:34:03 kernel: [ 690.521754] [<ffffffff8100c2d0>] ? int_signal+0x12/0x17
Oct 26 15:34:03 kernel: [ 690.521765] pubdev:0xffffffffa09b03c0, num of device:3 , name:fglrx, major 8, minor 86.
Oct 26 15:34:03 kernel: [ 690.521772] device 0 : 0xffff880144430000 .
Oct 26 15:34:03 kernel: [ 690.521778] Asic ID:0x689c, revision:0x2, MMIOReg:0xffffc90011140000.
Oct 26 15:34:03 kernel: [ 690.521784] FB phys addr: 0xc0000000, MC :0xf00000000, Total FB size :0x40000000.
Oct 26 15:34:03 kernel: [ 690.521791] gart table MC:0xf0f91f000, Physical:0xcf91f000, size:0x3e0000.
Oct 26 15:34:03 kernel: [ 690.521798] mc_node :FB, total 1 zones
Oct 26 15:34:03 kernel: [ 690.521803] MC start:0xf00000000, Physical:0xc0000000, size:0xfd00000.
Oct 26 15:34:03 kernel: [ 690.521811] Mapped heap -- Offset:0x0, size:0xf91f000, reference count:16, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521818] Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521825] Mapped heap -- Offset:0xf91f000, size:0x3e1000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521832] mc_node :INV_FB, total 1 zones
Oct 26 15:34:03 kernel: [ 690.521837] MC start:0xf0fd00000, Physical:0xcfd00000, size:0x30300000.
Oct 26 15:34:03 kernel: [ 690.521844] Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521851] mc_node :GART_USWC, total 2 zones
Oct 26 15:34:03 kernel: [ 690.521856] MC start:0x3e750000, Physical:0x0, size:0x4d800000.
Oct 26 15:34:03 kernel: [ 690.521863] Mapped heap -- Offset:0x30000, size:0x2000000, reference count:14, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521869] mc_node :GART_CACHEABLE, total 3 zones
Oct 26 15:34:03 kernel: [ 690.521875] MC start:0x10400000, Physical:0x0, size:0x2e350000.
Oct 26 15:34:03 kernel: [ 690.521881] Mapped heap -- Offset:0x2600000, size:0x100000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521889] Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521897] Mapped heap -- Offset:0xb00000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521904] Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521911] Mapped heap -- Offset:0x0, size:0x200000, reference count:7, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521919] Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.521928] GRBM : 0x3828, SRBM : 0x200000c0 .
Oct 26 15:34:03 kernel: [ 690.521937] CP_RB_BASE : 0x3e7800, CP_RB_RPTR : 0x19dc0 , CP_RB_WPTR :0x19dc0.
Oct 26 15:34:03 kernel: [ 690.521946] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x3eaa8000.
Oct 26 15:34:03 kernel: [ 690.521953] last submit IB buffer -- MC :0x3eaa8000,phys:0x131ebc000.
Oct 26 15:34:03 kernel: [ 690.521961] device 1 : 0xffff880145b14000 .
Oct 26 15:34:03 kernel: [ 690.521967] Asic ID:0x689c, revision:0x2, MMIOReg:0xffffc90011180000.
Oct 26 15:34:03 kernel: [ 690.521973] FB phys addr: 0xb0000000, MC :0xf00000000, Total FB size :0x40000000.
Oct 26 15:34:03 kernel: [ 690.521979] gart table MC:0xf0f91f000, Physical:0xbf91f000, size:0x3e0000.
Oct 26 15:34:03 kernel: [ 690.521985] mc_node :FB, total 1 zones
Oct 26 15:34:03 kernel: [ 690.521990] MC start:0xf00000000, Physical:0xb0000000, size:0xfd00000.
Oct 26 15:34:03 kernel: [ 690.521997] Mapped heap -- Offset:0x0, size:0xf91f000, reference count:10, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522004] Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522010] Mapped heap -- Offset:0xf91f000, size:0x3e1000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522017] mc_node :INV_FB, total 1 zones
Oct 26 15:34:03 kernel: [ 690.522022] MC start:0xf0fd00000, Physical:0xbfd00000, size:0x30300000.
Oct 26 15:34:03 kernel: [ 690.522028] Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522034] mc_node :GART_USWC, total 2 zones
Oct 26 15:34:03 kernel: [ 690.522039] MC start:0x3e750000, Physical:0x0, size:0x4d800000.
Oct 26 15:34:03 kernel: [ 690.522045] Mapped heap -- Offset:0x30000, size:0x2000000, reference count:10, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522052] mc_node :GART_CACHEABLE, total 3 zones
Oct 26 15:34:03 kernel: [ 690.522057] MC start:0x10400000, Physical:0x0, size:0x2e350000.
Oct 26 15:34:03 kernel: [ 690.522063] Mapped heap -- Offset:0x1d00000, size:0x900000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522070] Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522077] Mapped heap -- Offset:0xb00000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522084] Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522091] Mapped heap -- Offset:0x0, size:0x200000, reference count:4, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522098] Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522106] GRBM : 0x3828, SRBM : 0x20000ac0 .
Oct 26 15:34:03 kernel: [ 690.522113] CP_RB_BASE : 0x3e7800, CP_RB_RPTR : 0x5b0 , CP_RB_WPTR :0x5b0.
Oct 26 15:34:03 kernel: [ 690.522121] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x3e8df000
Oct 26 15:34:03 kernel: [ 690.522127] last submit IB buffer -- MC :0x3e8df000,phys:0x12ffd9000.
Oct 26 15:34:03 kernel: [ 690.522135] device 2 : 0xffff880145b08000 .
Oct 26 15:34:03 kernel: [ 690.522140] Asic ID:0x9440, revision:0x2, MMIOReg:0xffffc900111c0000.
Oct 26 15:34:03 kernel: [ 690.522146] FB phys addr: 0xd0000000, MC :0xf00000000, Total FB size :0x40000000.
Oct 26 15:34:03 kernel: [ 690.522152] gart table MC:0xf0fc1f000, Physical:0xdfc1f000, size:0x3e0000.
Oct 26 15:34:03 kernel: [ 690.522158] mc_node :FB, total 1 zones
Oct 26 15:34:03 kernel: [ 690.522162] MC start:0xf00000000, Physical:0xd0000000, size:0x10000000.
Oct 26 15:34:03 kernel: [ 690.522169] Mapped heap -- Offset:0x0, size:0xfc1f000, reference count:11, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522176] Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522183] Mapped heap -- Offset:0xfc1f000, size:0x3e1000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522189] mc_node :INV_FB, total 1 zones
Oct 26 15:34:03 kernel: [ 690.522194] MC start:0xf10000000, Physical:0xe0000000, size:0x30000000.
Oct 26 15:34:03 kernel: [ 690.522201] Mapped heap -- Offset:0x2fffd000, size:0x3000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522207] mc_node :GART_USWC, total 2 zones
Oct 26 15:34:03 kernel: [ 690.522211] MC start:0x3e750000, Physical:0x0, size:0x4d800000.
Oct 26 15:34:03 kernel: [ 690.522218] Mapped heap -- Offset:0x30000, size:0x2000000, reference count:6, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522224] mc_node :GART_CACHEABLE, total 3 zones
Oct 26 15:34:03 kernel: [ 690.522229] MC start:0x10400000, Physical:0x0, size:0x2e350000.
Oct 26 15:34:03 kernel: [ 690.522235] Mapped heap -- Offset:0x1d00000, size:0x900000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522242] Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522249] Mapped heap -- Offset:0xb00000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522256] Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522263] Mapped heap -- Offset:0x0, size:0x200000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522270] Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [ 690.522278] GRBM : 0x3028, SRBM : 0x200000c0 .
Oct 26 15:34:03 kernel: [ 690.522284] CP_RB_BASE : 0x3e7800, CP_RB_RPTR : 0x330 , CP_RB_WPTR :0x330.
Oct 26 15:34:03 kernel: [ 690.522291] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x3e8bc000.
Oct 26 15:34:03 kernel: [ 690.522297] last submit IB buffer -- MC :0x3e8bc000,phys:0x12daa5000.
Oct 26 15:34:03 kernel: [ 690.522303] Dump the trace queue.
Oct 26 15:34:03 kernel: [ 690.522307] End of dump

I've got a Gigabyte board with 4 GB of RAM, one 5970 and one 4830. The motherboard, RAM and CPU are all new, and I'm also using a new SSD for this box. I think I'll pull the 4830 and see if the problem persists.

I "fixed" it again by rebooting, running aticonfig -f --initial --adapter=all, and rebooting once more; it plugged away for a bit longer before freezing again. One other thought: I'm running this with the latest updates in the Ubuntu 11.04 tree, including the most recent kernel. Thoughts?

*Edit* I should have listed my clock settings, which I pulled from the mining hardware page. I have both the 5970 and the 4830 set to 850/300 for core/memory. In addition to pulling the 4830, I'll try mining at stock clocks to see if I can reproduce the hang.
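
For anyone who wants to check their own syslog for the same symptom, here is a minimal Python sketch that looks for the "ASIC hang happened" marker and lists which devices show up in the dump that follows it. The function names and the SAMPLE text are my own assumptions for illustration; the patterns are written only against the messages quoted in this post, not against any documented fglrx log format, so they may need adjusting for other driver versions.

```python
import re

# Patterns derived only from the syslog excerpt in this post (assumption:
# other fglrx versions may word these messages differently).
HANG_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\[fglrx\] ASIC hang happened")
DEVICE_RE = re.compile(r"device (\d+) : 0x[0-9a-fA-F]+")
ASIC_RE = re.compile(r"Asic ID:(0x[0-9a-fA-F]+)")

# A few lines copied from the dump above, used as a small test input.
SAMPLE = (
    "Oct 26 15:34:03 kernel: [ 690.520008] [fglrx] ASIC hang happened\n"
    "Oct 26 15:34:03 kernel: [ 690.521772] device 0 : 0xffff880144430000 .\n"
    "Oct 26 15:34:03 kernel: [ 690.521778] Asic ID:0x689c, revision:0x2, "
    "MMIOReg:0xffffc90011140000.\n"
)


def find_hangs(log_text):
    """Return the kernel uptime stamps (in seconds) of each hang report."""
    return [float(m.group(1)) for m in HANG_RE.finditer(log_text)]


def devices_in_dump(log_text):
    """Map each 'device N' entry to the first Asic ID logged after it."""
    result = {}
    for dev in DEVICE_RE.finditer(log_text):
        # Search forward from the end of the 'device N' match.
        asic = ASIC_RE.search(log_text, dev.end())
        if asic:
            result[int(dev.group(1))] = asic.group(1)
    return result


if __name__ == "__main__":
    print(find_hangs(SAMPLE))
    print(devices_in_dump(SAMPLE))
```

To run it against a real log, read the file into a string (for example the contents of /var/log/syslog) and pass that to both functions. It only reports what the driver logged; it does not say which card actually caused the hang.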