Bitcoin Forum
Author Topic: Ubuntu 11.04: miners hang without reporting any error (SOLVED)  (Read 2735 times)
sunbird (OP)
October 22, 2011, 09:32:01 PM
Last edit: November 01, 2011, 01:50:23 AM by sunbird
 #1

I've got an ubuntu 11.04 install with ati 11.3 drivers and a 5970 and 4830.

aticonfig shows all cards and I can interact with all of them.

However, when running any of the miners (poclbm, phoenix, cgminer) they just hang after pressing enter. No error is thrown. Sometimes they hang immediately, sometimes it'll connect to my localhost bitcoind and then hang.

I'm having trouble diagnosing the problem. I've tested the various components, ran memtest for days, ran cpuburn, etc.

Any ideas what could be wrong?

Thanks much!
grndzero
October 24, 2011, 06:00:15 PM
 #2

I've got an ubuntu 11.04 install with ati 11.3 drivers and a 5970 and 4830.

aticonfig shows all cards and I can interact with all of them.

However, when running any of the miners (poclbm, phoenix, cgminer) they just hang after pressing enter. No error is thrown. Sometimes they hang immediately, sometimes it'll connect to my localhost bitcoind and then hang.

I'm having trouble diagnosing the problem. I've tested the various components, ran memtest for days, ran cpuburn, etc.

Any ideas what could be wrong?

Thanks much!

You did not mention an SDK version.
Ubuntu 11.04 with 11.6 driver and 2.4 SDK are the sweet spot for me.

It does seem odd that the miner would hang instantly; there are some functions that run before mining starts, and they take a few seconds. Did you install pyopencl and python-jsonrpc?

https://bitcointalk.org/index.php?topic=7514.msg110334#msg110334
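
A quick way to answer the pyopencl / python-jsonrpc question is to ask Python whether it can find the modules at all. This is only a minimal sketch: it is written for a current Python 3 (the 2011 miners ran on Python 2), and the import names "pyopencl" and "jsonrpc" are assumptions about how those packages are exposed on your system.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names: pyopencl -> "pyopencl", python-jsonrpc -> "jsonrpc".
missing = missing_modules(["pyopencl", "jsonrpc"])
print("missing:", ", ".join(missing) if missing else "none")
```

If either name is printed as missing, the miner cannot even import its dependencies, which points to a setup problem rather than a hardware one.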

Ubuntu Desktop x64 -  HD5850 Reference - 400Mh/s w/ cgminer  @ 975C/325M/1.175V - 11.6/2.1 SDK
Donate if you find this helpful: 1NimouHg2acbXNfMt5waJ7ohKs2TtYHePy
sunbird (OP)
October 26, 2011, 04:48:41 AM
 #3


You did not mention an SDK version.
Ubuntu 11.04 with 11.6 driver and 2.4 SDK are the sweet spot for me.

It does seem odd that the miner would hang instantly; there are some functions that run before mining starts, and they take a few seconds. Did you install pyopencl and python-jsonrpc?

https://bitcointalk.org/index.php?topic=7514.msg110334#msg110334

Sorry, I should have put this in the OP. It is 11.6 and 2.4 SDK.

I took the machine apart entirely this past weekend and put it back together and now everything is working, at least for the moment. Hopefully it was just a loose connection somewhere in the box.
sunbird (OP)
October 27, 2011, 02:10:28 AM
 #4

Ah ha!

So, after running for about 36 hours, I was going to tweak the settings on one of the miners, but CTRL-C didn't do anything. I noticed that all the Mhash/sec figures were frozen. I rebooted, tried to start the miner again, and the machine froze up for a bit, and then syslog had this to say:

Code:
Oct 26 15:34:03 kernel: [  690.520008] [fglrx] ASIC hang happened
Oct 26 15:34:03 kernel: [  690.520020] Pid: 9047, comm: clinfo Tainted: P            2.6.38-12-generic #51-Ubuntu
Oct 26 15:34:03 kernel: [  690.520026] Call Trace:
Oct 26 15:34:03 kernel: [  690.520117]  [<ffffffffa076dd3e>] ? KCL_DEBUG_OsDump+0xe/0x10 [fglrx]
Oct 26 15:34:03 kernel: [  690.520196]  [<ffffffffa077b28c>] ? firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
Oct 26 15:34:03 kernel: [  690.520325]  [<ffffffffa0802519>] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
Oct 26 15:34:03 kernel: [  690.520448]  [<ffffffffa08024cc>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x6c/0xb0 [fglrx]
Oct 26 15:34:03 kernel: [  690.520572]  [<ffffffffa08081f2>] ? _ZN19mmEngineR600_DRMDMA4idleEv+0x72/0xc0 [fglrx]
Oct 26 15:34:03 kernel: [  690.520693]  [<ffffffffa07faf04>] ? _ZN14CMMHeapManager22freeAllExpiredTSMemoryEj+0x64/0xe0 [fglrx]
Oct 26 15:34:03 kernel: [  690.520816]  [<ffffffffa07fc3c6>] ? _ZN18mmEnginesContainer4idleEv+0x46/0x60 [fglrx]
Oct 26 15:34:03 kernel: [  690.520935]  [<ffffffffa07fa34d>] ? _ZN15QS_PRIVATE_CORE7idleAllE15idle_WaitMethod+0x2d/0x40 [fglrx]
Oct 26 15:34:03 kernel: [  690.521050]  [<ffffffffa07e4bc5>] ? _ZN3MSF19doGarbageCollectionEv+0x35/0x260 [fglrx]
Oct 26 15:34:03 kernel: [  690.521061]  [<ffffffff8108d81e>] ? down+0x2e/0x50
Oct 26 15:34:03 kernel: [  690.521127]  [<ffffffffa07698e6>] ? KCL_SPINLOCK_STATIC_Release+0x16/0x20 [fglrx]
Oct 26 15:34:03 kernel: [  690.521213]  [<ffffffffa079a1c2>] ? firegl_cmmqs_ProcessTerminate+0x32/0xc0 [fglrx]
Oct 26 15:34:03 kernel: [  690.521287]  [<ffffffffa0775068>] ? firegl_release_helper+0x3a8/0x6c0 [fglrx]
Oct 26 15:34:03 kernel: [  690.521362]  [<ffffffffa0776b50>] ? firegl_release+0x60/0x1c0 [fglrx]
Oct 26 15:34:03 kernel: [  690.521426]  [<ffffffffa0767d91>] ? ip_firegl_release+0x11/0x20 [fglrx]
Oct 26 15:34:03 kernel: [  690.521436]  [<ffffffff811661ee>] ? __fput+0xbe/0x200
Oct 26 15:34:03 kernel: [  690.521444]  [<ffffffff81166355>] ? fput+0x25/0x30
Oct 26 15:34:03 kernel: [  690.521451]  [<ffffffff81162c30>] ? filp_close+0x60/0x90
Oct 26 15:34:03 kernel: [  690.521461]  [<ffffffff81069e08>] ? put_files_struct+0x88/0xf0
Oct 26 15:34:03 kernel: [  690.521469]  [<ffffffff81069f34>] ? exit_files+0x54/0x70
Oct 26 15:34:03 kernel: [  690.521477]  [<ffffffff8106a425>] ? do_exit+0x175/0x410
Oct 26 15:34:03 kernel: [  690.521549]  [<ffffffffa076f6e3>] ? drm_free+0xf3/0x180 [fglrx]
Oct 26 15:34:03 kernel: [  690.521558]  [<ffffffff8106a878>] ? do_group_exit+0x58/0xd0
Oct 26 15:34:03 kernel: [  690.521566]  [<ffffffff8107abf7>] ? get_signal_to_deliver+0x247/0x410
Oct 26 15:34:03 kernel: [  690.521650]  [<ffffffffa0798170>] ? firegl_cmmqs_CWDDE32+0x0/0x100 [fglrx]
Oct 26 15:34:03 kernel: [  690.521658]  [<ffffffff8100b936>] ? do_signal+0x56/0x180
Oct 26 15:34:03 kernel: [  690.521723]  [<ffffffffa0767d6e>] ? ip_firegl_unlocked_ioctl+0xe/0x20 [fglrx]
Oct 26 15:34:03 kernel: [  690.521733]  [<ffffffff811764ef>] ? do_vfs_ioctl+0x8f/0x360
Oct 26 15:34:03 kernel: [  690.521740]  [<ffffffff8100bae5>] ? do_notify_resume+0x65/0x80
Oct 26 15:34:03 kernel: [  690.521748]  [<ffffffff81176851>] ? sys_ioctl+0x91/0xa0
Oct 26 15:34:03 kernel: [  690.521754]  [<ffffffff8100c2d0>] ? int_signal+0x12/0x17
Oct 26 15:34:03 kernel: [  690.521765] pubdev:0xffffffffa09b03c0, num of device:3 , name:fglrx, major 8, minor 86.
Oct 26 15:34:03 kernel: [  690.521772] device 0 : 0xffff880144430000 .
Oct 26 15:34:03 kernel: [  690.521778] Asic ID:0x689c, revision:0x2, MMIOReg:0xffffc90011140000.
Oct 26 15:34:03 kernel: [  690.521784] FB phys addr: 0xc0000000, MC :0xf00000000, Total FB size :0x40000000.
Oct 26 15:34:03 kernel: [  690.521791] gart table MC:0xf0f91f000, Physical:0xcf91f000, size:0x3e0000.
Oct 26 15:34:03 kernel: [  690.521798] mc_node :FB, total 1 zones
Oct 26 15:34:03 kernel: [  690.521803]     MC start:0xf00000000, Physical:0xc0000000, size:0xfd00000.
Oct 26 15:34:03 kernel: [  690.521811]     Mapped heap -- Offset:0x0, size:0xf91f000, reference count:16, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521818]     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521825]     Mapped heap -- Offset:0xf91f000, size:0x3e1000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521832] mc_node :INV_FB, total 1 zones
Oct 26 15:34:03 kernel: [  690.521837]     MC start:0xf0fd00000, Physical:0xcfd00000, size:0x30300000.
Oct 26 15:34:03 kernel: [  690.521844]     Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521851] mc_node :GART_USWC, total 2 zones
Oct 26 15:34:03 kernel: [  690.521856]     MC start:0x3e750000, Physical:0x0, size:0x4d800000.
Oct 26 15:34:03 kernel: [  690.521863]     Mapped heap -- Offset:0x30000, size:0x2000000, reference count:14, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521869] mc_node :GART_CACHEABLE, total 3 zones
Oct 26 15:34:03 kernel: [  690.521875]     MC start:0x10400000, Physical:0x0, size:0x2e350000.
Oct 26 15:34:03 kernel: [  690.521881]     Mapped heap -- Offset:0x2600000, size:0x100000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521889]     Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521897]     Mapped heap -- Offset:0xb00000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521904]     Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521911]     Mapped heap -- Offset:0x0, size:0x200000, reference count:7, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521919]     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.521928] GRBM : 0x3828, SRBM : 0x200000c0 .
Oct 26 15:34:03 kernel: [  690.521937] CP_RB_BASE : 0x3e7800, CP_RB_RPTR : 0x19dc0 , CP_RB_WPTR :0x19dc0.
Oct 26 15:34:03 kernel: [  690.521946] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x3eaa8000.
Oct 26 15:34:03 kernel: [  690.521953] last submit IB buffer -- MC :0x3eaa8000,phys:0x131ebc000.
Oct 26 15:34:03 kernel: [  690.521961] device 1 : 0xffff880145b14000 .
Oct 26 15:34:03 kernel: [  690.521967] Asic ID:0x689c, revision:0x2, MMIOReg:0xffffc90011180000.
Oct 26 15:34:03 kernel: [  690.521973] FB phys addr: 0xb0000000, MC :0xf00000000, Total FB size :0x40000000.
Oct 26 15:34:03 kernel: [  690.521979] gart table MC:0xf0f91f000, Physical:0xbf91f000, size:0x3e0000.
Oct 26 15:34:03 kernel: [  690.521985] mc_node :FB, total 1 zones
Oct 26 15:34:03 kernel: [  690.521990]     MC start:0xf00000000, Physical:0xb0000000, size:0xfd00000.
Oct 26 15:34:03 kernel: [  690.521997]     Mapped heap -- Offset:0x0, size:0xf91f000, reference count:10, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522004]     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522010]     Mapped heap -- Offset:0xf91f000, size:0x3e1000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522017] mc_node :INV_FB, total 1 zones
Oct 26 15:34:03 kernel: [  690.522022]     MC start:0xf0fd00000, Physical:0xbfd00000, size:0x30300000.
Oct 26 15:34:03 kernel: [  690.522028]     Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522034] mc_node :GART_USWC, total 2 zones
Oct 26 15:34:03 kernel: [  690.522039]     MC start:0x3e750000, Physical:0x0, size:0x4d800000.
Oct 26 15:34:03 kernel: [  690.522045]     Mapped heap -- Offset:0x30000, size:0x2000000, reference count:10, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522052] mc_node :GART_CACHEABLE, total 3 zones
Oct 26 15:34:03 kernel: [  690.522057]     MC start:0x10400000, Physical:0x0, size:0x2e350000.
Oct 26 15:34:03 kernel: [  690.522063]     Mapped heap -- Offset:0x1d00000, size:0x900000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522070]     Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522077]     Mapped heap -- Offset:0xb00000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522084]     Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522091]     Mapped heap -- Offset:0x0, size:0x200000, reference count:4, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522098]     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522106] GRBM : 0x3828, SRBM : 0x20000ac0 .
Oct 26 15:34:03 kernel: [  690.522113] CP_RB_BASE : 0x3e7800, CP_RB_RPTR : 0x5b0 , CP_RB_WPTR :0x5b0.
Oct 26 15:34:03 kernel: [  690.522121] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x3e8df000
Oct 26 15:34:03 kernel: [  690.522127] last submit IB buffer -- MC :0x3e8df000,phys:0x12ffd9000.
Oct 26 15:34:03 kernel: [  690.522135] device 2 : 0xffff880145b08000 .
Oct 26 15:34:03 kernel: [  690.522140] Asic ID:0x9440, revision:0x2, MMIOReg:0xffffc900111c0000.
Oct 26 15:34:03 kernel: [  690.522146] FB phys addr: 0xd0000000, MC :0xf00000000, Total FB size :0x40000000.
Oct 26 15:34:03 kernel: [  690.522152] gart table MC:0xf0fc1f000, Physical:0xdfc1f000, size:0x3e0000.
Oct 26 15:34:03 kernel: [  690.522158] mc_node :FB, total 1 zones
Oct 26 15:34:03 kernel: [  690.522162]     MC start:0xf00000000, Physical:0xd0000000, size:0x10000000.
Oct 26 15:34:03 kernel: [  690.522169]     Mapped heap -- Offset:0x0, size:0xfc1f000, reference count:11, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522176]     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522183]     Mapped heap -- Offset:0xfc1f000, size:0x3e1000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522189] mc_node :INV_FB, total 1 zones
Oct 26 15:34:03 kernel: [  690.522194]     MC start:0xf10000000, Physical:0xe0000000, size:0x30000000.
Oct 26 15:34:03 kernel: [  690.522201]     Mapped heap -- Offset:0x2fffd000, size:0x3000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522207] mc_node :GART_USWC, total 2 zones
Oct 26 15:34:03 kernel: [  690.522211]     MC start:0x3e750000, Physical:0x0, size:0x4d800000.
Oct 26 15:34:03 kernel: [  690.522218]     Mapped heap -- Offset:0x30000, size:0x2000000, reference count:6, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522224] mc_node :GART_CACHEABLE, total 3 zones
Oct 26 15:34:03 kernel: [  690.522229]     MC start:0x10400000, Physical:0x0, size:0x2e350000.
Oct 26 15:34:03 kernel: [  690.522235]     Mapped heap -- Offset:0x1d00000, size:0x900000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522242]     Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522249]     Mapped heap -- Offset:0xb00000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522256]     Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522263]     Mapped heap -- Offset:0x0, size:0x200000, reference count:2, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522270]     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Oct 26 15:34:03 kernel: [  690.522278] GRBM : 0x3028, SRBM : 0x200000c0 .
Oct 26 15:34:03 kernel: [  690.522284] CP_RB_BASE : 0x3e7800, CP_RB_RPTR : 0x330 , CP_RB_WPTR :0x330.
Oct 26 15:34:03 kernel: [  690.522291] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x3e8bc000.
Oct 26 15:34:03 kernel: [  690.522297] last submit IB buffer -- MC :0x3e8bc000,phys:0x12daa5000.
Oct 26 15:34:03 kernel: [  690.522303] Dump the trace queue.
Oct 26 15:34:03 kernel: [  690.522307] End of dump
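
For anyone chasing the same symptom, a dump like the one above can be reduced to the two facts that matter: how many times fglrx reported an ASIC hang, and which ASIC IDs were listed. The sketch below only matches the two line patterns visible in this log ("[fglrx] ASIC hang happened" and "Asic ID:0x..."); the function name and the sample list are made up for illustration.

```python
import re

# Patterns copied from the syslog excerpt in this post.
HANG_RE = re.compile(r"\[fglrx\] ASIC hang happened")
ASIC_RE = re.compile(r"Asic ID:(0x[0-9a-fA-F]+)")


def summarize_fglrx_hangs(lines):
    """Count ASIC hang reports and collect the ASIC IDs listed in the dump."""
    hangs = 0
    asic_ids = []
    for line in lines:
        if HANG_RE.search(line):
            hangs += 1
        match = ASIC_RE.search(line)
        if match:
            asic_ids.append(match.group(1))
    return hangs, asic_ids


# Four lines taken from the dump above, used as sample input.
SAMPLE = [
    "Oct 26 15:34:03 kernel: [  690.520008] [fglrx] ASIC hang happened",
    "Oct 26 15:34:03 kernel: [  690.521778] Asic ID:0x689c, revision:0x2, MMIOReg:0xffffc90011140000.",
    "Oct 26 15:34:03 kernel: [  690.521967] Asic ID:0x689c, revision:0x2, MMIOReg:0xffffc90011180000.",
    "Oct 26 15:34:03 kernel: [  690.522140] Asic ID:0x9440, revision:0x2, MMIOReg:0xffffc900111c0000.",
]
print(summarize_fglrx_hangs(SAMPLE))
```

To run it against a real log, pass the open file instead of SAMPLE, for example the lines of /var/log/syslog. In this dump the two 0x689c entries would be the two GPUs of the 5970 and 0x9440 the 4830, which matches the "num of device:3" line.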

I've got a gigabyte board with 4 GB ram and 1x 5970 and 1x4830. The MB, ram, CPU are all new. I am also using a new SSD drive for this box.

I think I'll pull the 4830 and see if the problem persists.

I "fixed" it again by rebooting, running aticonfig -f --initial --adapter=all, rebooting and it plugged away for a bit more before freezing again.

One other thought, I'm running this with the latest updates in the 11.04 ubuntu tree, including the most recent kernel.

Thoughts?

*Edit*

I should have listed my clock settings, which I pulled from the mining hardware page. I have both the 5970 and the 4830 set to 850/300 for core/memory. In addition to trying to pull the 4830, I'll try mining at stock clocks to see if I can reproduce.
grndzero
October 28, 2011, 04:40:31 PM
 #5

Some thoughts:

Post your motherboard, cpu, power supply specs, video card model so people with them can offer insight.

Check which screensaver Unity is trying to use, on 11.04 it should just be a blank screen, but if it's trying to use something else it could be locking up the driver when it kicks in.

Some people have said that leaving a mouse plugged in can cause problems if it is constantly seeking its position on the screen. Make sure the mouse isn't near the edge of the screen or try unplugging it.

Are you setting fan speed and checking temps?

What power supply are you using and what is it plugged into? You should probably have a minimum 650 watts for this machine.

phorensic
October 28, 2011, 05:17:04 PM
 #6

I found the guide in the cgminer documentation extremely useful for setting everything up.  If you follow it step by step you shouldn't have problems unless you have a hardware problem like the ones others are mentioning.  Oh, and make sure your fan speeds are set manually to above 65%.  Even if the card reports low temps, the VRMs could be frying from low air movement across the entire PCB.  My cards lock up within 5 minutes if I don't set the fan speed high enough.
sunbird (OP)
October 29, 2011, 03:43:07 AM
 #7

Some thoughts:

Post your motherboard, cpu, power supply specs, video card model so people with them can offer insight.

Definitely should've put this info up front. Here it is:

  • mb: GIGABYTE GA-990XA-UD3
  • cpu: AMD Sempron 145
  • cpu cooler: Cooler Master RR-B10-212P-G1
  • Antec HCG-750 750W
  • ram: 4GB crucial
  • ssd: patriot PT232GS25SSDR 32GB
  • 1x5970
  • 1x4830

All above except the 5970 and 4830 are new. I got the two cards from different people and I can reproduce the problem with either unplugged from the mb, so I don't think either card is defective.
 
Check which screensaver Unity is trying to use, on 11.04 it should just be a blank screen, but if it's trying to use something else it could be locking up the driver when it kicks in.

This is a fresh 11.04 install, fully updated. I've never logged in on the gdm, only on terminal.

Some people have said that leaving a mouse plugged in can cause problems if it is constantly seeking its position on the screen. Make sure the mouse isn't near the edge of the screen or try unplugging it.

Thanks for the suggestion, but no mouse here.

Are you setting fan speed and checking temps?

Before I attempt to start the miners, I set the fan at 50% and I have two 'watch' screen sessions running:

Code:
aticonfig --odgt --adapter=all

and

Code:
aticonfig --odgc --adapter=all

Both commands work fine. Temps have never been above 90 and are usually in the low 80s. When I start the miner, the loads on the GPUs are 0%. I have tried this on both phoenix 1.48 (using either the poclbm or phatk kernel) and poclbm; the result is the same. When the miner hangs, the rest of the machine works fine and the CPU shows minimal load. When it is running, I've gotten about 810 Mhash/s out of this setup.
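
Rather than eyeballing the 'watch' sessions, the temperature readout can be checked against a limit automatically. The sketch below is hedged: it assumes that `aticonfig --odgt --adapter=all` prints an "Adapter N - <name>" line followed by a "Sensor 0: Temperature - NN.NN C" line for each adapter. Compare that with what your driver version actually prints before relying on it. The function names and the 90-degree default are illustrative only.

```python
import re

# Assumed output layout, not verified against every driver release:
#   Adapter 0 - ATI Radeon HD 5900 Series
#               Sensor 0: Temperature - 81.00 C
ADAPTER_RE = re.compile(r"^Adapter (\d+) - ")
TEMP_RE = re.compile(r"Temperature - ([0-9.]+) C")


def parse_temps(text):
    """Map adapter index -> temperature in degrees C from aticonfig-style text."""
    temps = {}
    current = None
    for line in text.splitlines():
        adapter = ADAPTER_RE.match(line.strip())
        if adapter:
            current = int(adapter.group(1))
            continue
        temp = TEMP_RE.search(line)
        if temp and current is not None:
            temps[current] = float(temp.group(1))
    return temps


def too_hot(temps, limit_c=90.0):
    """Return the adapter indices at or above limit_c, in ascending order."""
    return sorted(idx for idx, value in temps.items() if value >= limit_c)
```

The text would come from something like `subprocess.run(["aticonfig", "--odgt", "--adapter=all"], capture_output=True, text=True).stdout`. Keep in mind the point made elsewhere in the thread: the core sensor can read fine while the VRMs overheat, so a passing check here does not rule out a cooling problem.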

What power supply are you using and what is it plugged into? You should probably have a minimum 650 watts for this machine.

I don't think it's a power supply issue. Today, I unplugged the 4830 but had the same issue with a single 5970. I've had the problem with the box plugged in at home and at work.

I found the guide in the cgminer documentation extremely useful for setting everything up.  If you follow it step by step you shouldn't have problems unless you have a hardware problem like the ones others are mentioning.  Oh, and make sure your fan speeds are set manually to above 65%.  Even if the card reports low temps, the VRMs could be frying from low air movement across the entire PCB.  My cards lock up within 5 minutes if I don't set the fan speed high enough.

Thanks for the suggestion. I have gotten this setup running well and have kept a pretty close eye on the temps (monitoring both the GPUs and the CPU/MB temps). When it runs, it runs fine (for days). I am going to try reinstalling everything next to make sure this isn't the problem.
DeathAndTaxes
Gerald Davis


October 29, 2011, 03:51:44 AM
 #8

I should have listed my clock settings, which I pulled from the mining hardware page. I have both the 5970 and the 4830 set to 850/300 for core/memory. In addition to trying to pull the 4830, I'll try mining at stock clocks to see if I can reproduce.

When dealing with instabilities, turn everything back to stock.  Mine with a single card for 4+ hours.  If it doesn't lock up, add a card, then the third card.  Give it plenty of time to test between changes.  If it works w/ all 3 @ stock, start overclocking.
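
The procedure above can be written down as an explicit checklist so that no step gets skipped. This is just a sketch of the idea; the function name, the wording of the steps, and the device labels in the example are made up for illustration.

```python
def stability_plan(cards, soak_hours=4):
    """Build a checklist: add one device at a time at stock, then overclock."""
    steps = []
    for count in range(1, len(cards) + 1):
        active = ", ".join(cards[:count])
        steps.append(f"Stock clocks, mine on {active} for {soak_hours}+ hours")
    steps.append("All cards stable at stock: begin overclocking in small steps")
    return steps


# Hypothetical labels for the three devices the driver reports in this thread.
for number, step in enumerate(stability_plan(["5970 GPU 0", "5970 GPU 1", "4830"]), 1):
    print(number, step)
```

Changing only one thing between soak runs is what makes a later lockup attributable to that change.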
grndzero
October 29, 2011, 08:24:32 AM
 #9


Check which screensaver Unity is trying to use, on 11.04 it should just be a blank screen, but if it's trying to use something else it could be locking up the driver when it kicks in.

This is a fresh 11.04 install, fully updated. I've never logged in on the gdm, only on terminal.


Have you tried logging in? I know gdm starts automatically, but I'm not sure if just having gdm running qualifies as having a sufficient X session running. My machine auto logs in my user through gdm, so I have never tried this.

Are you setting fan speed and checking temps?


Both commands work fine. Temps have never been above 90 and usually in the low 80s. When I start the miner, the loads on the GPU are 0%. I have tried this on both phoenix 1.48 (using either poclbm or phatk) and poclbm, result is the same. When the miner hangs, the machine works fine and CPU shows minimal load. When it is running, I've gotten about 810 M/hash out of this setup.


You could try ramping up the GPU fan to 80% and bring the temps down and see if that helps with stability. Yes, they will be noisy. Also consider adding an extra 120mm fan pointed at the input of the card.


What power supply are you using and what is it plugged into? You should probably have a minimum 650 watts for this machine.

I don't think it's a power supply issue. Today, I unplugged the 4830 but had the same issue with a single 5970. I've had the problem with the box plugged in at home and at work.


That power supply should be sufficient. I have run 3 x 5850's on it without a problem.

A 5970 and a 4830 really are kind of an odd combination. With moderate overclocking and good cooling you can get 800-825 Mh/s on just the 5970. I didn't find any specifics, but it looks like the 4830 would only run around 100-125 Mh/s.

sunbird (OP)
November 01, 2011, 01:49:10 AM
 #10

Thanks for all of the great suggestions.

I've ended up purging and reinstalling the 11.6 ATI drivers, pulling the 4830, and modding my overclocking settings to (so far) fix the problem. I'm going to run with a single card constantly for a week or so and then slowly mod the overclock to see what I can squeeze out of this config.

Right now, I'm just running the 5970 at 800 GPU / 350 mem. I had originally been running it at 850/300 and I suspect that may have been the culprit as once I changed the settings, it started working fine.
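
For reference, here is a small sketch that builds the command lines for a clock change like the one described above, so the values are written down in one place. It assumes the fglrx-era `aticonfig` options `--od-enable` and `--odsc=<core>,<memory>`; check `aticonfig --help` on your driver version before running anything, since this is from memory and not from the thread. The helper name is hypothetical.

```python
def clock_commands(adapter, core_mhz, mem_mhz):
    """Return the assumed aticonfig command lines for one adapter's clocks."""
    return [
        # Assumed flag: unlocks Overdrive so the clocks can be changed.
        "aticonfig --od-enable",
        # Assumed flag: --odsc takes core,memory in MHz.
        f"aticonfig --adapter={adapter} --odsc={core_mhz},{mem_mhz}",
    ]


# The settings that ended up stable in this thread: 800 core / 350 memory.
for command in clock_commands(0, 800, 350):
    print(command)
```

The script only prints the commands; running them is left to the operator, which keeps an unintended clock change from being applied by accident.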

Thanks again for the ideas and assistance.
DeathAndTaxes
Gerald Davis


November 01, 2011, 01:52:47 AM
 #11

Glad you got it stable.  Now working from stable base you can try modifying clocks, voltages, and adding cards.