Bitcoin Forum
Author Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.1  (Read 5805215 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic. (3 posts by 1+ user deleted.)
spiccioli
August 17, 2011, 08:27:37 AM
 #881

ckolivas,

sorry to report that on my system 1.5.6 declares my GPUs (a 5850 and a 5870) sick after a few minutes of work just as 1.5.3 did.

I'm on a linuxcoin 0.2a system booting from a USB key (with persistence). I don't think I have hardware problems, since other miners work without problems.

best regards.

spiccioli.
d3m0n1q_733rz
August 17, 2011, 08:50:00 AM
 #882

ckolivas,

sorry to report that on my system 1.5.6 declares my GPUs (a 5850 and a 5870) sick after a few minutes of work just as 1.5.3 did.

I'm on a linuxcoin 0.2a system booting from a USB key (with persistence). I don't think I have hardware problems, since other miners work without problems.

best regards.

spiccioli.

You know, this could actually be a problem with Linuxcoin.  It is still a beta, after all.  You may want to try another Linux distro on your USB key and see if the problem persists.  If it doesn't, we may need to look at the Linuxcoin repository and see which dependency is causing the issue before a workaround can be cooked up.

Funroll_Loops, the theoretically quicker breakfast cereal!
Check out http://www.facebook.com/JupiterICT for all of your computing needs.  If you need it, we can get it.  We have solutions for your computing conundrums.  BTC accepted!  12HWUSguWXRCQKfkPeJygVR1ex5wbg3hAq
-ck (OP)
August 17, 2011, 09:18:07 AM
 #883

The new kernel works things -even harder- so if you were getting sick GPUs before, you still will now. The difference is it should recover by itself. If you get sick GPUs all the time, I suggest dropping the intensity down till you find where they stop being sick.
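
The step-down search suggested here can be sketched roughly as follows. This is only an illustration: `is_stable` is a hypothetical callback (it would have to run the miner at the given intensity for a while and report whether any GPU went SICK), and the start and floor values are assumptions. Only the `-I` flag itself comes from this thread.

```python
from typing import Callable, Optional


def find_stable_intensity(
    is_stable: Callable[[int], bool],
    start: int = 8,
    floor: int = 0,
) -> Optional[int]:
    """Step the intensity down from `start` until `is_stable` accepts a value.

    `is_stable` is a hypothetical callback supplied by the caller.
    Returns the highest stable intensity found, or None if none was stable.
    """
    for intensity in range(start, floor - 1, -1):
        if is_stable(intensity):
            return intensity
    return None


def cgminer_args(intensity: int) -> list:
    """Build only the intensity part of the command line."""
    return ["-I", str(intensity)]
```

The search is linear rather than a bisection because a marginal setting may pass one run and fail the next, so each lower value is worth checking in turn.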

Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel
2% Fee Solo mining at solo.ckpool.org
-ck
spiccioli
August 17, 2011, 09:33:08 AM
 #884

You know, this could actually be a problem with Linuxcoin.  It is still a beta, after all.  You may want to try another Linux distro on your USB key and see if the problem persists.  If it doesn't, we may need to look at the Linuxcoin repository and see which dependency is causing the issue before a workaround can be cooked up.

Uhm,

I issued a

Code:
ldd cgminer

on my system and this is what I get, can this be of any help in figuring out what could be wrong?

Code:
linux-vdso.so.1 =>  (0x00007fffd67ff000)
libcurl.so.4 => /usr/lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f67745aa000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f677438e000)
libOpenCL.so.1 => /opt/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/libOpenCL.so.1 (0x00007f6774188000)
libncurses.so.5 => /lib/libncurses.so.5 (0x00007f6773f41000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6773bbe000)
libidn.so.11 => /usr/lib/libidn.so.11 (0x00007f677398a000)
libssh2.so.1 => /usr/lib/libssh2.so.1 (0x00007f6773765000)
liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x00007f6773557000)
libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x00007f6773308000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f6773100000)
libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x00007f6772ec1000)
libssl.so.1.0.0 => /usr/lib/libssl.so.1.0.0 (0x00007f6772c6e000)
libcrypto.so.1.0.0 => /usr/lib/libcrypto.so.1.0.0 (0x00007f67728a8000)
librtmp.so.0 => /usr/lib/librtmp.so.0 (0x00007f6772690000)
libz.so.1 => /usr/lib/libz.so.1 (0x00007f6772478000)
libgnutls.so.26 => /usr/lib/libgnutls.so.26 (0x00007f67721ce000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6774821000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f6771fca000)
libgcrypt.so.11 => /lib/x86_64-linux-gnu/libgcrypt.so.11 (0x00007f6771d50000)
libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f6771b3a000)
libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0x00007f677191f000)
libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00007f6771652000)
libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00007f6771429000)
libcom_err.so.2 => /lib/libcom_err.so.2 (0x00007f6771226000)
libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00007f677101d000)
libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f6770e1b000)
libtasn1.so.3 => /usr/lib/x86_64-linux-gnu/libtasn1.so.3 (0x00007f6770c0a000)
libgpg-error.so.0 => /lib/libgpg-error.so.0 (0x00007f6770a07000)
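
One quick way to read a listing like this is to flag libraries that resolve outside the distribution's own library directories. The sketch below is only that, a sketch: the set of "system" prefixes is an assumption, and the function name is made up for illustration.

```python
import re

# Matches lines such as "libcurl.so.4 => /usr/lib/.../libcurl.so.4 (0x...)".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-f]+\)\s*$")

# Assumed set of distribution library directories; adjust for your system.
SYSTEM_PREFIXES = ("/lib/", "/lib64/", "/usr/lib/")


def nonsystem_libs(ldd_output: str) -> dict:
    """Return {soname: path} for libraries resolved outside SYSTEM_PREFIXES."""
    found = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if not match:
            continue  # skips linux-vdso and the dynamic loader line
        soname, path = match.groups()
        if not path.startswith(SYSTEM_PREFIXES):
            found[soname] = path
    return found
```

Run against the listing above, the only entry outside those prefixes is libOpenCL.so.1, which comes from the AMD APP SDK directory under /opt.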

spiccioli

m3ta
August 17, 2011, 09:45:36 AM
 #885

@ck: Is anything like Phoenix's -s planned?

ie
 -s SOCKETNAME, --socketname=SOCKETNAME
                        full path to file for outputting current status.

That would be so delightfully cool... and useful. Smiley

Why the frell so many retards spell "ect" as an abbreviation of "Et Cetera"? "ETC", DAMMIT! http://en.wikipedia.org/wiki/Et_cetera

Host:/# rm -rf /var/forum/trolls
-ck (OP)
August 17, 2011, 09:49:08 AM
 #886

I've noticed something strange about the Windows version of cgminer.  With only GPU mining enabled, there are 6 instances of the program running; 5 of which are using 1,216K of memory and 0% CPU.  Is there a purpose for these threads or is your miner having accidental child processes?

They are most definitely there for a purpose. 2 per GPU, Workio thread, staging thread, longpoll thread, watchdog thread, gpu restart thread, cpu restart thread, and then there's a thread spawned (and then killed) for every getwork request and every submit work.
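
As a rough worked count of the long-lived threads in that list (a sketch only: it covers just the threads named above and leaves out the main thread, the short-lived getwork/submit threads, and anything else cgminer may create):

```python
# Long-lived helper threads named in the reply above.
FIXED_THREADS = (
    "workio",
    "staging",
    "longpoll",
    "watchdog",
    "gpu restart",
    "cpu restart",
)


def persistent_threads(gpu_count: int, threads_per_gpu: int = 2) -> int:
    """Count long-lived threads, excluding the short-lived getwork/submit ones."""
    if gpu_count < 0:
        raise ValueError("gpu_count must be non-negative")
    return gpu_count * threads_per_gpu + len(FIXED_THREADS)
```

With two GPUs this gives 2 * 2 + 6 = 10 long-lived threads, on top of whatever per-request threads happen to be alive at that moment.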

spiccioli
August 17, 2011, 09:50:11 AM
 #887

The new kernel works things -even harder- so if you were getting sick GPUs before, you still will now. The difference is it should recover by itself. If you get sick GPUs all the time, I suggest dropping the intensity down till you find where they stop being sick.

ckolivas,

I was running it with

Code:
-I 8 -Q 3 -w 256

I've started it again without any parameters; it has lost a few MH/s, but it has been running for nearly 20 minutes as I write this...

Let's see if it was just a too high intensity.

thanks.

spiccioli
spiccioli
August 17, 2011, 10:10:02 AM
 #888

ckolivas,

I was running it with

Code:
-I 8 -Q 3 -w 256

I've started it again without any parameters; it has lost a few MH/s, but it has been running for nearly 20 minutes as I write this...

Let's see if it was just a too high intensity.

thanks.

spiccioli


No, died again.

Code:
 cgminer version 1.5.6 - Started: [2011-08-17 11:35:55]
-------------------------------------------------------------------------------- 
 [(5s):447.3  (avg):682.5 Mh/s] [Q:132  A:177  R:1  HW:0  E:134%  U:9.09/m]
 TQ: 0  ST: 3  LS: 0  SS: 0  DW: 15  NB: 3  LW: 83  LO: 0  RF: 0  I: 6
 Connected to http://mineco.in:3000/ with LP as user goldcoin.m1
 Block: 000003e91bc67a52ae351870f47438e3...  Started: [11:51:44]
-------------------------------------------------------------------------------- 
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0: [SICK / 259.9 Mh/s] [Q:27  A:63  R:0  HW:0  E:233%  U:3.24/m]
 GPU 1: [446.2 / 424.4 Mh/s] [Q:97  A:116  R:1  HW:0  E:120%  U:5.96/m]
-------------------------------------------------------------------------------- 

[2011-08-17 11:55:06] Accepted d7f8941b GPU 1 thread 1 pool 0
[2011-08-17 11:55:15] Accepted d9550e07 GPU 1 thread 3 pool 0
[2011-08-17 11:55:16] Accepted 835c880f GPU 1 thread 3 pool 0
[2011-08-17 11:55:19] Accepted 8fe0cbcc GPU 1 thread 1 pool 0
[2011-08-17 11:55:23] Accepted 12e6a406 GPU 1 thread 1 pool 0
[2011-08-17 11:55:25] Accepted 3b704c87 GPU 1 thread 3 pool 0
[2011-08-17 11:55:27] Thread 0 idle for more than 5 minutes, GPU 0 declared DEAD!

[2011-08-17 11:55:27] Attempting to restart GPU
[2011-08-17 11:55:27] Thread 0 still exists, killing it off
[2011-08-17 11:55:27] Thread 2 still exists, killing it off
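
For watching a remote box, the per-GPU lines of a screen capture like the one above could be parsed with something along these lines. It is a sketch under assumptions: the line layout is taken only from this 1.5.6 capture, and it assumes any state word (SICK here) appears where the 5s hashrate normally is. The names are illustrative, not part of cgminer.

```python
import re
from typing import NamedTuple, Optional

# Layout taken from the capture above: "GPU 0: [SICK / 259.9 Mh/s] [...]".
GPU_LINE = re.compile(
    r"GPU\s+(?P<gpu>\d+):\s+\[(?P<current>\S+)\s*/\s*(?P<average>[\d.]+)\s+Mh/s\]"
)


class GpuStatus(NamedTuple):
    gpu: int
    current: Optional[float]  # None when the field holds a state word
    state: str                # "OK" or the word shown, e.g. "SICK"
    average: float


def parse_gpu_line(line: str) -> Optional[GpuStatus]:
    """Parse one per-GPU status line; return None for any other line."""
    match = GPU_LINE.search(line)
    if match is None:
        return None
    raw = match.group("current")
    try:
        current, state = float(raw), "OK"
    except ValueError:
        current, state = None, raw
    return GpuStatus(
        int(match.group("gpu")), current, state, float(match.group("average"))
    )
```

Log lines such as "Accepted ... GPU 1 thread 1" are ignored because they lack the "GPU n:" prefix with a colon.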

It stayed like this for a little while, then I saw a stack dump flash briefly on screen (I'm connected remotely).

/var/messages contains


Code:
[42564.060008] BUG: soft lockup - CPU#0 stuck for 67s! [cgminer:22186]
[42564.060012] Modules linked in: bnep bluetooth rfkill fuse dm_crypt dm_mod snd_hda_codec_hdmi fglrx(P) snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq snd_timer snd_seq_device psmouse pcspkr evdev snd edac_core serio_raw k10temp edac_mce_amd soundcore snd_page_alloc i2c_piix4 i2c_core processor wmi button thermal_sys ext4 mbcache jbd2 crc16 uhci_hcd squashfs loop aufs(C) nls_utf8 nls_cp437 vfat fat sd_mod crc_t10dif ide_generic ide_core sg usb_storage ata_generic uas pata_atiixp ahci libahci ohci_hcd xhci_hcd ehci_hcd libata usbcore scsi_mod r8169 mii [last unloaded: scsi_wait_scan]
[42564.060041] CPU 0
[42564.060042] Modules linked in: bnep bluetooth rfkill fuse dm_crypt dm_mod snd_hda_codec_hdmi fglrx(P) snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq snd_timer snd_seq_device psmouse pcspkr evdev snd edac_core serio_raw k10temp edac_mce_amd soundcore snd_page_alloc i2c_piix4 i2c_core processor wmi button thermal_sys ext4 mbcache jbd2 crc16 uhci_hcd squashfs loop aufs(C) nls_utf8 nls_cp437 vfat fat sd_mod crc_t10dif ide_generic ide_core sg usb_storage ata_generic uas pata_atiixp ahci libahci ohci_hcd xhci_hcd ehci_hcd libata usbcore scsi_mod r8169 mii [last unloaded: scsi_wait_scan]
[42564.060064]
[42564.060066] Pid: 22186, comm: cgminer Tainted: P         C O 2.6.39-2-amd64 #1 MSI MS-7660/870A Fuzion (MS-7660)
[42564.060070] RIP: 0010:[<ffffffffa069b011>]  [<ffffffffa069b011>] _ZN15ExecutableUnits18isTimeStampExpiredER14_LARGE_INTEGER12_QS_CP_RING_+0x71/0x90 [fglrx]
[42564.060138] RSP: 0018:ffff88005b91fbd8  EFLAGS: 00000293
[42564.060140] RAX: ffffc9001162c000 RBX: ffffffffa0714730 RCX: ffffc90000379000
[42564.060142] RDX: 000000000029b3ec RSI: 0000000000000000 RDI: ffffc9001162bf40
[42564.060144] RBP: ffffc9001162bf40 R08: ffffffffa0714730 R09: ffff8800793a6008
[42564.060146] R10: ffff88004ad4c2b0 R11: 000000000029b3ef R12: ffffffff8133978e
[42564.060148] R13: 000000000029b3ef R14: ffff88005a062b90 R15: ffffc9001162bf40
[42564.060150] FS:  00007fe261422700(0000) GS:ffff88007ee00000(0000) knlGS:0000000000000000
[42564.060152] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42564.060154] CR2: 00007fe261420e10 CR3: 000000005b97d000 CR4: 00000000000006f0
[42564.060156] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42564.060158] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[42564.060160] Process cgminer (pid: 22186, threadinfo ffff88005b91e000, task ffff88005b395820)
[42564.060162] Stack:
[42564.060163]  ffff88005b91fc58 0000000000000001 0000000000000000 0000000000000000
[42564.060167]  0000000000010000 ffffffffa069c53d 0000000000000202 ffff88005b91fc58
[42564.060169]  0000000100a1988c ffffffffa069c1ff ffff8800793a6008 0000000000000001
[42564.060172] Call Trace:
[42564.060218]  [<ffffffffa069c53d>] ? _ZN4Asic27ElapsedTS_PollingInfinitely19ConditionSuccessfulEv+0x2d/0x70 [fglrx]
[42564.060262]  [<ffffffffa069c1ff>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x1f/0xb0 [fglrx]
[42564.060306]  [<ffffffffa069a6bf>] ? _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xaf/0x160 [fglrx]
[42564.060350]  [<ffffffffa0694023>] ? _ZN15QS_PRIVATE_CORE27multiVpuPM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0x33/0x50 [fglrx]
[42564.060395]  [<ffffffffa068d944>] ? _Z19uQSTimeStampRetiredmjj14_LARGE_INTEGER+0x74/0x80 [fglrx]
[42564.060438]  [<ffffffffa068984d>] ? _Z8uCWDDEQCmjjPvjS_+0x54d/0x10c0 [fglrx]
[42564.060468]  [<ffffffffa0633632>] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
[42564.060498]  [<ffffffffa0631f60>] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
[42564.060528]  [<ffffffffa0631ef0>] ? firegl_cmmqs_createdriver+0x170/0x170 [fglrx]
[42564.060555]  [<ffffffffa0610c18>] ? firegl_ioctl+0x1e8/0x250 [fglrx]
[42564.060559]  [<ffffffff810d6fa6>] ? vma_merge+0x1ef/0x34a
[42564.060585]  [<ffffffffa0607352>] ? ip_firegl_unlocked_ioctl+0x9/0xd [fglrx]
[42564.060588]  [<ffffffff8110899d>] ? do_vfs_ioctl+0x445/0x492
[42564.060591]  [<ffffffff810d75ed>] ? do_brk+0x2ca/0x326
[42564.060595]  [<ffffffff813314f5>] ? schedule+0x5a8/0x5d5
[42564.060598]  [<ffffffff81108a35>] ? sys_ioctl+0x4b/0x72
[42564.060601]  [<ffffffff81338dd2>] ? system_call_fastpath+0x16/0x1b
[42564.060602] Code: 3b 16 7c 27 41 bd 01 00 00 00 41 0f b6 c5 48 8b 1c 24 48 8b 6c 24 08 4c 8b 64 24 10 4c 8b 6c 24 18 4c 8b 74 24 20 48 83 c4 28 c3 <4c> 8b 45 00 44 89 e6 48 89 ef 41 ff 50 48 48 8b b3 f0 00 00 00
[42564.060622] Call Trace:
[42564.060665]  [<ffffffffa069c53d>] ? _ZN4Asic27ElapsedTS_PollingInfinitely19ConditionSuccessfulEv+0x2d/0x70 [fglrx]
[42564.060708]  [<ffffffffa069c1ff>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x1f/0xb0 [fglrx]
[42564.060751]  [<ffffffffa069a6bf>] ? _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xaf/0x160 [fglrx]
[42564.060796]  [<ffffffffa0694023>] ? _ZN15QS_PRIVATE_CORE27multiVpuPM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0x33/0x50 [fglrx]
[42564.060839]  [<ffffffffa068d944>] ? _Z19uQSTimeStampRetiredmjj14_LARGE_INTEGER+0x74/0x80 [fglrx]
[42564.060882]  [<ffffffffa068984d>] ? _Z8uCWDDEQCmjjPvjS_+0x54d/0x10c0 [fglrx]
[42564.060912]  [<ffffffffa0633632>] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
[42564.060942]  [<ffffffffa0631f60>] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
[42564.060972]  [<ffffffffa0631ef0>] ? firegl_cmmqs_createdriver+0x170/0x170 [fglrx]
[42564.060998]  [<ffffffffa0610c18>] ? firegl_ioctl+0x1e8/0x250 [fglrx]
[42564.061001]  [<ffffffff810d6fa6>] ? vma_merge+0x1ef/0x34a
[42564.061026]  [<ffffffffa0607352>] ? ip_firegl_unlocked_ioctl+0x9/0xd [fglrx]
[42564.061029]  [<ffffffff8110899d>] ? do_vfs_ioctl+0x445/0x492
[42564.061031]  [<ffffffff810d75ed>] ? do_brk+0x2ca/0x326
[42564.061034]  [<ffffffff813314f5>] ? schedule+0x5a8/0x5d5
[42564.061036]  [<ffffffff81108a35>] ? sys_ioctl+0x4b/0x72
[42564.061039]  [<ffffffff81338dd2>] ? system_call_fastpath+0x16/0x1b
[42648.060005] BUG: soft lockup - CPU#0 stuck for 67s! [cgminer:22186]
[42648.060007] Modules linked in: bnep bluetooth rfkill fuse dm_crypt dm_mod snd_hda_codec_hdmi fglrx(P) snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq snd_timer snd_seq_device psmouse pcspkr evdev snd edac_core serio_raw k10temp edac_mce_amd soundcore snd_page_alloc i2c_piix4 i2c_core processor wmi button thermal_sys ext4 mbcache jbd2 crc16 uhci_hcd squashfs loop aufs(C) nls_utf8 nls_cp437 vfat fat sd_mod crc_t10dif ide_generic ide_core sg usb_storage ata_generic uas pata_atiixp ahci libahci ohci_hcd xhci_hcd ehci_hcd libata usbcore scsi_mod r8169 mii [last unloaded: scsi_wait_scan]
[42648.060029] CPU 0
[42648.060030] Modules linked in: bnep bluetooth rfkill fuse dm_crypt dm_mod snd_hda_codec_hdmi fglrx(P) snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq snd_timer snd_seq_device psmouse pcspkr evdev snd edac_core serio_raw k10temp edac_mce_amd soundcore snd_page_alloc i2c_piix4 i2c_core processor wmi button thermal_sys ext4 mbcache jbd2 crc16 uhci_hcd squashfs loop aufs(C) nls_utf8 nls_cp437 vfat fat sd_mod crc_t10dif ide_generic ide_core sg usb_storage ata_generic uas pata_atiixp ahci libahci ohci_hcd xhci_hcd ehci_hcd libata usbcore scsi_mod r8169 mii [last unloaded: scsi_wait_scan]
[42648.060051]
[42648.060053] Pid: 22186, comm: cgminer Tainted: P         C O 2.6.39-2-amd64 #1 MSI MS-7660/870A Fuzion (MS-7660)
[42648.060056] RIP: 0010:[<ffffffffa069af20>]  [<ffffffffa069af20>] _ZN15ExecutableUnits18submitListInternalEP9QS_CLIENTP13_QS_PARAM_WA_P10QS_IBUFFER20_QS_SUBMISSION_FLAGS+0x1a0/0x1a0 [fglrx]
[42648.060101] RSP: 0018:ffff88005b91fbd0  EFLAGS: 00000293
[42648.060103] RAX: ffffc9001162c000 RBX: ffffffffa0714730 RCX: ffffc90000379000
[42648.060105] RDX: 000000000029b3ec RSI: 0000000000000000 RDI: ffffc9001162bf40
[42648.060107] RBP: ffffc9001162bf40 R08: ffffffffa0714730 R09: ffff8800793a6008
[42648.060109] R10: ffff88004ad4c2b0 R11: 000000000029b3ef R12: ffffffff8133978e
[42648.060111] R13: 000000000029b3ef R14: ffff88005a062b90 R15: ffffc9001162bf40
[42648.060113] FS:  00007fe261422700(0000) GS:ffff88007ee00000(0000) knlGS:0000000000000000
[42648.060115] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42648.060117] CR2: 00007fe261420e10 CR3: 000000005b97d000 CR4: 00000000000006f0
[42648.060119] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42648.060121] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[42648.060123] Process cgminer (pid: 22186, threadinfo ffff88005b91e000, task ffff88005b395820)
[42648.060125] Stack:
[42648.060126]  ffffffffa069b01f ffff88005b91fc58 0000000000000001 0000000000000000
[42648.060129]  0000000000000000 0000000000010000 ffffffffa069c53d 0000000000000292
[42648.060131]  ffff88005b91fc58 0000000100a1988c ffffffffa069c1ff ffff8800793a6008
[42648.060134] Call Trace:
[42648.060177]  [<ffffffffa069b01f>] ? _ZN15ExecutableUnits18isTimeStampExpiredER14_LARGE_INTEGER12_QS_CP_RING_+0x7f/0x90 [fglrx]
[42648.060221]  [<ffffffffa069c53d>] ? _ZN4Asic27ElapsedTS_PollingInfinitely19ConditionSuccessfulEv+0x2d/0x70 [fglrx]
[42648.060265]  [<ffffffffa069c1ff>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x1f/0xb0 [fglrx]
[42648.060308]  [<ffffffffa069a6bf>] ? _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xaf/0x160 [fglrx]
[42648.060352]  [<ffffffffa0694023>] ? _ZN15QS_PRIVATE_CORE27multiVpuPM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0x33/0x50 [fglrx]
[42648.060396]  [<ffffffffa068d944>] ? _Z19uQSTimeStampRetiredmjj14_LARGE_INTEGER+0x74/0x80 [fglrx]
[42648.060439]  [<ffffffffa068984d>] ? _Z8uCWDDEQCmjjPvjS_+0x54d/0x10c0 [fglrx]
[42648.060469]  [<ffffffffa0633632>] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
[42648.060499]  [<ffffffffa0631f60>] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
[42648.060529]  [<ffffffffa0631ef0>] ? firegl_cmmqs_createdriver+0x170/0x170 [fglrx]
[42648.060556]  [<ffffffffa0610c18>] ? firegl_ioctl+0x1e8/0x250 [fglrx]
[42648.060558]  [<ffffffff810d6fa6>] ? vma_merge+0x1ef/0x34a
[42648.060583]  [<ffffffffa0607352>] ? ip_firegl_unlocked_ioctl+0x9/0xd [fglrx]
[42648.060586]  [<ffffffff8110899d>] ? do_vfs_ioctl+0x445/0x492
[42648.060589]  [<ffffffff810d75ed>] ? do_brk+0x2ca/0x326
[42648.060591]  [<ffffffff813314f5>] ? schedule+0x5a8/0x5d5
[42648.060594]  [<ffffffff81108a35>] ? sys_ioctl+0x4b/0x72
[42648.060596]  [<ffffffff81338dd2>] ? system_call_fastpath+0x16/0x1b
[42648.060598] Code: 4c 8b 27 41 ff 94 24 b0 01 00 00 eb 90 48 8b 7b 30 4c 8b 3f 41 ff 97 90 01 00 00 e9 61 ff ff ff 90 66 66 66 90 66 66 90 66 66 90
[42648.060611]  83 ec 18 48 89 5c 24 08 48 89 6c 24 10 48 89 fb 48 8b 4f 20
[42648.060618] Call Trace:
[42648.060661]  [<ffffffffa069b01f>] ? _ZN15ExecutableUnits18isTimeStampExpiredER14_LARGE_INTEGER12_QS_CP_RING_+0x7f/0x90 [fglrx]
[42648.060704]  [<ffffffffa069c53d>] ? _ZN4Asic27ElapsedTS_PollingInfinitely19ConditionSuccessfulEv+0x2d/0x70 [fglrx]
[42648.060748]  [<ffffffffa069c1ff>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x1f/0xb0 [fglrx]
[42648.060791]  [<ffffffffa069a6bf>] ? _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xaf/0x160 [fglrx]
[42648.060835]  [<ffffffffa0694023>] ? _ZN15QS_PRIVATE_CORE27multiVpuPM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0x33/0x50 [fglrx]
[42648.060879]  [<ffffffffa068d944>] ? _Z19uQSTimeStampRetiredmjj14_LARGE_INTEGER+0x74/0x80 [fglrx]
[42648.060922]  [<ffffffffa068984d>] ? _Z8uCWDDEQCmjjPvjS_+0x54d/0x10c0 [fglrx]
[42648.060952]  [<ffffffffa0633632>] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
[42648.060982]  [<ffffffffa0631f60>] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
[42648.061012]  [<ffffffffa0631ef0>] ? firegl_cmmqs_createdriver+0x170/0x170 [fglrx]
[42648.061038]  [<ffffffffa0610c18>] ? firegl_ioctl+0x1e8/0x250 [fglrx]
[42648.061041]  [<ffffffff810d6fa6>] ? vma_merge+0x1ef/0x34a
[42648.061066]  [<ffffffffa0607352>] ? ip_firegl_unlocked_ioctl+0x9/0xd [fglrx]
[42648.061069]  [<ffffffff8110899d>] ? do_vfs_ioctl+0x445/0x492
[42648.061071]  [<ffffffff810d75ed>] ? do_brk+0x2ca/0x326
[42648.061074]  [<ffffffff813314f5>] ? schedule+0x5a8/0x5d5
[42648.061077]  [<ffffffff81108a35>] ? sys_ioctl+0x4b/0x72
[42648.061079]  [<ffffffff81338dd2>] ? system_call_fastpath+0x16/0x1b
[42661.396178] [fglrx] ASIC hang happened
[42661.396180] Pid: 22186, comm: cgminer Tainted: P         C O 2.6.39-2-amd64 #1
[42661.396182] Call Trace:
[42661.396209]  [<ffffffffa061500c>] ? firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
[42661.396252]  [<ffffffffa069c299>] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
[42661.396296]  [<ffffffffa069c24c>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x6c/0xb0 [fglrx]
[42661.396339]  [<ffffffffa069a6bf>] ? _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xaf/0x160 [fglrx]
[42661.396383]  [<ffffffffa0694023>] ? _ZN15QS_PRIVATE_CORE27multiVpuPM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0x33/0x50 [fglrx]
[42661.396427]  [<ffffffffa068d944>] ? _Z19uQSTimeStampRetiredmjj14_LARGE_INTEGER+0x74/0x80 [fglrx]
[42661.396470]  [<ffffffffa068984d>] ? _Z8uCWDDEQCmjjPvjS_+0x54d/0x10c0 [fglrx]
[42661.396500]  [<ffffffffa0633632>] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
[42661.396530]  [<ffffffffa0631f60>] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
[42661.396560]  [<ffffffffa0631ef0>] ? firegl_cmmqs_createdriver+0x170/0x170 [fglrx]
[42661.396586]  [<ffffffffa0610c18>] ? firegl_ioctl+0x1e8/0x250 [fglrx]
[42661.396589]  [<ffffffff810d6fa6>] ? vma_merge+0x1ef/0x34a
[42661.396614]  [<ffffffffa0607352>] ? ip_firegl_unlocked_ioctl+0x9/0xd [fglrx]
[42661.396617]  [<ffffffff8110899d>] ? do_vfs_ioctl+0x445/0x492
[42661.396619]  [<ffffffff810d75ed>] ? do_brk+0x2ca/0x326
[42661.396622]  [<ffffffff813314f5>] ? schedule+0x5a8/0x5d5
[42661.396625]  [<ffffffff81108a35>] ? sys_ioctl+0x4b/0x72
[42661.396627]  [<ffffffff81338dd2>] ? system_call_fastpath+0x16/0x1b
[42661.396631] pubdev:0xffffffffa084fa70, num of device:2 , name:fglrx, major 8, minor 86.
[42661.396634] device 0 : 0xffff880079560000 .
[42661.396636] Asic ID:0x6898, revision:0x2, MMIOReg:0xffffc90010cc0000.
[42661.396638] FB phys addr: 0xd0000000, MC :0xf00000000, Total FB size :0x40000000.
[42661.396641] gart table MC:0xf0fb07000, Physical:0xdfb07000, size:0x1f8000.
[42661.396644] mc_node :FB, total 1 zones
[42661.396646]     MC start:0xf00000000, Physical:0xd0000000, size:0xfd00000.
[42661.396649]     Mapped heap -- Offset:0x0, size:0xfb07000, reference count:11, mapping count:0,
[42661.396652]     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
[42661.396655]     Mapped heap -- Offset:0xfb07000, size:0x1f9000, reference count:1, mapping count:0,
[42661.396658] mc_node :INV_FB, total 1 zones
[42661.396660]     MC start:0xf0fd00000, Physical:0xdfd00000, size:0x30300000.
[42661.396663]     Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
[42661.396665] mc_node :GART_USWC, total 2 zones
[42661.396667]     MC start:0x27a40000, Physical:0x0, size:0x27400000.
[42661.396670]     Mapped heap -- Offset:0x0, size:0x2000000, reference count:18, mapping count:0,
[42661.396672] mc_node :GART_CACHEABLE, total 3 zones
[42661.396675]     MC start:0x10400000, Physical:0x0, size:0x17640000.
[42661.396677]     Mapped heap -- Offset:0xb00000, size:0x200000, reference count:1, mapping count:0,
[42661.396681]     Mapped heap -- Offset:0xa00000, size:0x100000, reference count:1, mapping count:0,
[42661.396684]     Mapped heap -- Offset:0x900000, size:0x100000, reference count:1, mapping count:0,
[42661.396687]     Mapped heap -- Offset:0x800000, size:0x100000, reference count:2, mapping count:0,
[42661.396690]     Mapped heap -- Offset:0x700000, size:0x100000, reference count:1, mapping count:0,
[42661.396693]     Mapped heap -- Offset:0x600000, size:0x100000, reference count:2, mapping count:0,
[42661.396696]     Mapped heap -- Offset:0x500000, size:0x100000, reference count:1, mapping count:0,
[42661.396699]     Mapped heap -- Offset:0x400000, size:0x100000, reference count:2, mapping count:0,
[42661.396702]     Mapped heap -- Offset:0x300000, size:0x100000, reference count:1, mapping count:0,
[42661.396705]     Mapped heap -- Offset:0x200000, size:0x100000, reference count:2, mapping count:0,
[42661.396708]     Mapped heap -- Offset:0x0, size:0x200000, reference count:3, mapping count:0,
[42661.396711]     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
[42661.396716] GRBM : 0x3828, SRBM : 0x200000c0 .
[42661.396722] CP_RB_BASE : 0x27a400, CP_RB_RPTR : 0x3d330 , CP_RB_WPTR :0x3d330.
[42661.396727] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x28107000.
[42661.396730] last submit IB buffer -- MC :0x28107000,phys:0x6a155000.
[42661.396733] device 1 : 0xffff880079344000 .
[42661.396735] Asic ID:0x6899, revision:0x2, MMIOReg:0xffffc90010940000.
[42661.396737] FB phys addr: 0xc0000000, MC :0xf00000000, Total FB size :0x40000000.
[42661.396740] gart table MC:0xf0fb07000, Physical:0xcfb07000, size:0x1f8000.
[42661.396742] mc_node :FB, total 1 zones
[42661.396744]     MC start:0xf00000000, Physical:0xc0000000, size:0xfd00000.
[42661.396747]     Mapped heap -- Offset:0x0, size:0xfb07000, reference count:13, mapping count:0,
[42661.396750]     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
[42661.396753]     Mapped heap -- Offset:0xfb07000, size:0x1f9000, reference count:1, mapping count:0,
[42661.396755] mc_node :INV_FB, total 1 zones
[42661.396757]     MC start:0xf0fd00000, Physical:0xcfd00000, size:0x30300000.
[42661.396760]     Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
[42661.396763] mc_node :GART_USWC, total 2 zones
[42661.396765]     MC start:0x27a40000, Physical:0x0, size:0x27400000.
[42661.396767]     Mapped heap -- Offset:0x0, size:0x2000000, reference count:18, mapping count:0,
[42661.396770] mc_node :GART_CACHEABLE, total 3 zones
[42661.396772]     MC start:0x10400000, Physical:0x0, size:0x17640000.
[42661.396775]     Mapped heap -- Offset:0x1400000, size:0x200000, reference count:1, mapping count:0,
[42661.396778]     Mapped heap -- Offset:0xa00000, size:0x100000, reference count:1, mapping count:0,
[42661.396781]     Mapped heap -- Offset:0x900000, size:0x100000, reference count:2, mapping count:0,
[42661.396784]     Mapped heap -- Offset:0x800000, size:0x100000, reference count:1, mapping count:0,
[42661.396787]     Mapped heap -- Offset:0x700000, size:0x100000, reference count:2, mapping count:0,
[42661.396790]     Mapped heap -- Offset:0x600000, size:0x100000, reference count:2, mapping count:0,
[42661.396793]     Mapped heap -- Offset:0x500000, size:0x100000, reference count:1, mapping count:0,
[42661.396796]     Mapped heap -- Offset:0x400000, size:0x100000, reference count:1, mapping count:0,
[42661.396799]     Mapped heap -- Offset:0x200000, size:0x200000, reference count:1, mapping count:0,
[42661.396802]     Mapped heap -- Offset:0xb00000, size:0x900000, reference count:2, mapping count:0,
[42661.396805]     Mapped heap -- Offset:0x0, size:0x200000, reference count:7, mapping count:0,
[42661.396808]     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
[42661.396812] GRBM : 0xb0633828, SRBM : 0x20004ec0 .
[42661.396817] CP_RB_BASE : 0x27a400, CP_RB_RPTR : 0x11fb0 , CP_RB_WPTR :0x11fb0.
[42661.396822] CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x28086000.
[42661.396824] last submit IB buffer -- MC :0x28086000,phys:0x7aab7000.
[42661.396826] Dump the trace queue.
[42661.396827] End of dump



I don't know if this gives any hint. GPU 0 is now marked dead and GPU 1 is still mining.


spiccioli.
 
-ck (OP)
August 17, 2011, 10:18:08 AM
 #889

Now that one is a driver crash. Nothing to do with cgminer directly. It even caused a linux kernel warning about a prolonged soft lockup. Perhaps try a new kernel ati driver and ati sdk.
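
One way to see this from the log is to count which module each call-trace frame belongs to. The sketch below assumes the frame layout printed by the 2.6.39 kernel in the dump above ("[timestamp]  [<address>] ? symbol+0xOFF/0xLEN [module]"); the function name is made up for illustration.

```python
import re
from collections import Counter

# One stack frame per line; the trailing "[module]" is absent for built-in code.
FRAME = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+(?:\?\s+)?\S+?\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def frames_by_module(log_text: str) -> Counter:
    """Count call-trace frames per kernel module ("kernel" = built-in code)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAME.match(line)
        if match:
            counts[match.group("module") or "kernel"] += 1
    return counts
```

In the traces above, every frame from the ioctl entry point upward is tagged [fglrx], which is consistent with the hang sitting in the driver rather than in the miner.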

spiccioli
August 17, 2011, 10:40:43 AM
 #890

Now that one is a driver crash. Nothing to do with cgminer directly. It even caused a linux kernel warning about a prolonged soft lockup. Perhaps try a new kernel ati driver and ati sdk.

ckolivas,

the thing that baffles me is that this does not happen with the other available miners; even DiabloMiner, which is a single-miner, multi-GPU kind of miner, runs without problems.

I'm on SDK 2.4 with the Catalyst 11.6 driver, which I think are the latest available under Linux.

anyway, as soon as I have some more spare time I'll try to set up a USB key with a standard Linux distribution (not linuxcoin) and I'll try again.

thanks a lot.

spiccioli.

ovidiusoft
August 17, 2011, 11:04:44 AM
 #891

Returning with the ps snapshots I promised. I only left it running for a few hours because I wanted to upgrade to 1.5.6. The lines below were taken with 'ps aux' at 10-minute intervals. I only edited out my username and password. The full command line is:

Code:
./cgminer -o http://pit.deepbit.net:8332 -u XXXXX -p XXXXX -o http://uscentral.btcguild.com:8332 -u XXXXX -p XXXXX -o http://api2.bitcoin.cz:8332 -u XXXXX -p XXXXX --submit-stale -t 0 -Q 4 -g 2 -I 10 -k phatk -v 2 -w 256 -d 0 -d 1

I don't think this is enough data to draw a conclusion, but I suspect there is a memory leak somewhere, most likely in the dead GPU detection/restart code: the jumps in allocated memory come right after such events. I have already started taking ps snapshots of 1.5.6 (with the same -I option, to make the GPUs crash often). Let me know if you need any other data.
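
For anyone who wants to check the snapshots that follow for such jumps, here is a rough sketch. It assumes the standard `ps aux` column order (USER PID %CPU %MEM VSZ RSS ...), with RSS in KiB as the sixth column; the function name and the 10,000 KiB threshold are arbitrary choices for illustration.

```python
def rss_jumps(ps_lines, threshold_kib=10_000):
    """Return (snapshot_index, growth_kib) for each RSS jump above threshold.

    Expects `ps aux` rows: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND.
    Index 1 refers to the second snapshot, compared against the first.
    """
    rss = [int(line.split()[5]) for line in ps_lines if line.strip()]
    return [
        (index, current - previous)
        for index, (previous, current) in enumerate(zip(rss, rss[1:]), start=1)
        if current - previous > threshold_kib
    ]
```

Lining the reported indices up against the timestamps of the "declared DEAD" messages in the miner log would show whether the growth really follows the GPU restarts.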

Code:
root     16885  1.3 10.8 382588 110448 pts/1   Sl+  05:26   0:35 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 10.8 382588 110476 pts/1   Sl+  05:26   0:41 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 16.8 476560 171716 pts/1   Sl+  05:26   0:49 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 16.8 476560 171876 pts/1   Sl+  05:26   0:56 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 19.8 523532 202216 pts/1   Sl+  05:26   1:04 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 19.8 523532 202116 pts/1   Sl+  05:26   1:11 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 19.8 523532 202092 pts/1   Sl+  05:26   1:18 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 19.8 523532 202092 pts/1   Sl+  05:26   1:25 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 19.8 523532 202252 pts/1   Sl+  05:26   1:32 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 19.8 523532 202120 pts/1   Sl+  05:26   1:39 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 19.8 523532 202084 pts/1   Sl+  05:26   1:46 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 22.8 554360 232812 pts/1   Sl+  05:26   1:53 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 609636 263576 pts/1   Sl+  05:26   2:01 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 617832 263556 pts/1   Sl+  05:26   2:08 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 617832 263504 pts/1   Sl+  05:26   2:15 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 617832 263540 pts/1   Sl+  05:26   2:22 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 617832 263488 pts/1   Sl+  05:26   2:29 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 617832 263380 pts/1   Sl+  05:26   2:36 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 617832 263596 pts/1   Sl+  05:26   2:43 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 25.8 617832 263620 pts/1   Sl+  05:26   2:50 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 28.8 656772 294424 pts/1   Sl+  05:26   2:58 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 28.8 664968 294376 pts/1   Sl+  05:26   3:05 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 31.8 712032 324872 pts/1   Sl+  05:26   3:14 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 34.8 742484 355504 pts/1   Sl+  05:26   3:21 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 34.8 758876 355484 pts/1   Sl+  05:26   3:28 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 34.8 758876 355436 pts/1   Sl+  05:26   3:35 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 34.8 758876 355368 pts/1   Sl+  05:26   3:43 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 34.8 758876 355388 pts/1   Sl+  05:26   3:49 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
root     16885  1.2 34.8 758876 355352 pts/1   Sl+  05:26   3:49 ./cgminer -o http://pit.deepbit.net:8332 -u XXXXXX -p XXXXXX
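
For anyone who wants to check listings like this without eyeballing them, here is a minimal Python sketch (not part of cgminer) that pulls the RSS column out of `ps aux` lines and reports where it jumps. It assumes the usual `ps aux` column order, with RSS as the sixth field in KiB; the 10,000 KiB threshold and the function names are my own choices, the command column is abbreviated, and only the first five samples are included.

```python
# Sketch: find where resident memory (RSS) jumps between consecutive
# `ps aux` snapshots. Assumed column order:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

SNAPSHOTS = """\
root     16885  1.3 10.8 382588 110448 pts/1   Sl+  05:26   0:35 ./cgminer
root     16885  1.2 10.8 382588 110476 pts/1   Sl+  05:26   0:41 ./cgminer
root     16885  1.2 16.8 476560 171716 pts/1   Sl+  05:26   0:49 ./cgminer
root     16885  1.2 16.8 476560 171876 pts/1   Sl+  05:26   0:56 ./cgminer
root     16885  1.2 19.8 523532 202216 pts/1   Sl+  05:26   1:04 ./cgminer
"""


def rss_values(text):
    """Return the RSS column (KiB) of every non-empty ps line."""
    return [int(line.split()[5]) for line in text.splitlines() if line.strip()]


def find_jumps(rss, threshold_kib=10_000):
    """Return (sample_index, growth_kib) for each step above the threshold."""
    return [
        (i, cur - prev)
        for i, (prev, cur) in enumerate(zip(rss, rss[1:]), start=1)
        if cur - prev > threshold_kib
    ]


rss = rss_values(SNAPSHOTS)
jumps = find_jumps(rss)
for index, growth in jumps:
    print(f"sample {index}: RSS grew by {growth} KiB")
```

Run over the full listing, the same approach shows steps of roughly 30 MB each (the first is about 60 MB), which would be consistent with a fixed amount of memory being left behind at each restart event.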
-ck (OP)
Legendary
August 17, 2011, 11:11:01 AM
 #892

Oh yes, there is no way to clean up after a thread that was restarted because of a dead GPU. It's a dramatic manoeuvre where I disable the thread and create an entirely new context. See, the problem is that when code hits the GPU and NEVER RETURNS, there is no reliable way of resuscitating the code that called it. So I disable the thread (in case it ever comes back) and create an entirely new thread. If I free any of the RAM from the original thread, it will often crash if the thread returns and finds the memory has been freed from under it. This is all because it's done in low-level C.

Executive summary: If you get lots of GPU sick scenarios that restart the GPU, decrease the intensity level with -I till it stops happening.
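
To make the trade-off above concrete, here is a rough Python analogy of the disable-and-replace pattern. It is not cgminer's code (that is C with OpenCL calls); `Worker`, `restart_if_sick` and the `abandoned` list are hypothetical names, the hung GPU call is simulated with an event that is never set, and the timeout is arbitrary. Python has no manual free, so "never freeing" is modelled by keeping the old worker referenced.

```python
import threading
import time


class Worker:
    """One mining thread plus the context it owns (hypothetical names)."""

    def __init__(self, gpu_call):
        self.enabled = True                # cleared when the watchdog gives up on it
        self.last_seen = time.monotonic()  # heartbeat, refreshed after each GPU call
        self.context = bytearray(1024)     # stands in for buffers the thread uses
        self.results = 0
        self.thread = threading.Thread(target=self._run, args=(gpu_call,), daemon=True)
        self.thread.start()

    def _run(self, gpu_call):
        while self.enabled:
            gpu_call()                     # may block forever if the GPU hangs
            if not self.enabled:
                break                      # came back late: exit quietly
            self.context[0] ^= 1           # still safe, the context was never freed
            self.results += 1
            self.last_seen = time.monotonic()


def restart_if_sick(worker, new_gpu_call, abandoned, timeout=0.2):
    """Replace a worker whose heartbeat is stale; keep the old one around."""
    if time.monotonic() - worker.last_seen < timeout:
        return worker
    worker.enabled = False                 # disable it in case it ever returns
    abandoned.append(worker)               # deliberately keep its context alive
    return Worker(new_gpu_call)


hang = threading.Event()                   # never set: simulates a hung GPU call
abandoned = []
sick = Worker(hang.wait)
time.sleep(0.3)                            # let the heartbeat go stale
healthy = restart_if_sick(sick, lambda: time.sleep(0.01), abandoned)
time.sleep(0.1)                            # give the replacement time to do some work
healthy.enabled = False                    # stop the replacement worker
print(healthy is not sick, len(abandoned), sick.thread.is_alive())
```

Every restart adds one entry to `abandoned`, which has the same shape as the stepwise memory growth in the ps snapshots: it is the price of never touching memory that a stuck thread might still use.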

Developer/maintainer for cgminer, ckpool/ckproxy, and the -ck kernel
2% Fee Solo mining at solo.ckpool.org
-ck
d3m0n1q_733rz
Sr. Member
August 17, 2011, 11:11:25 AM
 #893

Actually, I would suspect it to be in the new phatk2 kernel.  There was a similar leak in the original phatk2 kernel (which was actually much faster before they fixed it) and they had to revert some changes to fix it.  It's probable that the source used for cgminer was of the problematic version and a similar reversion will need to be made for it as well.

Funroll_Loops, the theoretically quicker breakfast cereal!
Check out http://www.facebook.com/JupiterICT for all of your computing needs.  If you need it, we can get it.  We have solutions for your computing conundrums.  BTC accepted!  12HWUSguWXRCQKfkPeJygVR1ex5wbg3hAq
-ck (OP)
Legendary
August 17, 2011, 11:13:38 AM
 #894

No, the source for the new phatk kernel came from phateus himself and is as up to date as it gets.

d3m0n1q_733rz
Sr. Member
August 17, 2011, 11:17:37 AM
 #895

Cool.  I was just worried about it being the same issue as was found with Phoenix Miner...maybe that was with the miner...eh, it's too late at night/early in the morning for me to think about things.  So please disregard my confused thoughts.

ovidiusoft
Sr. Member
August 17, 2011, 11:22:19 AM
 #896

Quote
Actually, I would suspect it to be in the new phatk2 kernel.  There was a similar leak in the original phatk2 kernel (which was actually much faster before they fixed it) and they had to revert some changes to fix it.  It's probable that the source used for cgminer was of the problematic version and a similar reversion will need to be made for it as well.

No, I was running 1.5.5 with the old kernel. As Con said, if I still get a lot of crashes I will decrease intensity. But it appears that's not the case with 1.5.6 - no crashes, no hardware errors, which is consistent with my results with phoenix-1.5 + phatk-2.2. I might even try to increase the frequency a little and see if hardware errors went away at higher freqs too (also expected).

The other 2 notable differences are increased hashrate (hooray) and +2 degrees C heat per board (booo) - also expected.
Ali
Member
August 17, 2011, 11:52:49 AM
Last edit: August 17, 2011, 12:02:50 PM by Ali
 #897

Great work on the new windows-build.

However, on one machine I'm trying the cpu-only version on Windows XP (x86), and it crashes instantly.
I start it from a bat-file with the following options:

Quote
set http-proxy=http://127.0.0.1:8118
cgminer-cpuonly.exe -o http://mining.com/8332 -u user -p pass

What may be the cause?

EDIT: The benchmark-mode (--algo auto) works. After finishing the last benchmark the miner crashes.
d3m0n1q_733rz
Sr. Member
August 17, 2011, 12:03:11 PM
 #898

For one, you have a slash instead of a colon before your port number.  Also, try tagging on --algo auto for good measure.  It's a nice feature that will ensure you're using the best algorithm for your processor.  But yeah, http://mining.com:8332 not /8332.
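
A quick way to catch this kind of typo before launching the miner is to parse the URL and look at where the port ended up. This is a generic sanity check using Python's standard `urllib.parse`, not how cgminer itself parses `-o`; `check_pool_url` is a made-up helper, and treating a missing port as a problem is my assumption.

```python
from urllib.parse import urlsplit


def check_pool_url(url):
    """Return a list of problems found in a pool URL (empty if it looks fine)."""
    parts = urlsplit(url)
    problems = []
    if not parts.hostname:
        problems.append("missing host name")
    if parts.port is None:
        problems.append("no explicit port; expected host:port")
    # A path made only of digits usually means '/' was typed instead of ':'.
    if parts.path.strip("/").isdigit():
        problems.append(f"'{parts.path}' looks like a port written after '/' instead of ':'")
    return problems


for candidate in ("http://mining.com/8332", "http://mining.com:8332"):
    found = check_pool_url(candidate)
    print(candidate, "->", found if found else "looks fine")
```

With the URL from the bat file, the port comes back empty and `/8332` ends up in the path, which is exactly the mix-up described above.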

Tx2000
Full Member
August 17, 2011, 02:26:32 PM
 #899

Quote
Apologies if this was already addressed or noted somewhere, but I cannot seem to find it.  Is it possible to set intensity per GPU?  If not, would that be difficult to add?  I like being able to manage each GPU individually and, in my case, one of my GPUs isn't always a dedicated miner while the other is.

Quote
Unfortunately there is no way to do that at the moment. In the README you'll see it recommends running 2 instances of cgminer and selecting which GPUs each one uses. Use one instance in dynamic mode and the other with a set intensity.

Good idea.  I don't know why I didn't see that.... I'll give it a shot, thanks!
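
The two-instance workaround could be scripted along these lines. This is only a sketch: `cgminer_argv` is a hypothetical helper, the pool URL and credentials are placeholders, the intensity value 8 is arbitrary, and it assumes that leaving out `-I` keeps cgminer in dynamic mode (check the README for your version). The `-o`, `-u`, `-p`, `-d` and `-I` options are the ones used in the command lines earlier in this thread.

```python
def cgminer_argv(pool_url, user, password, device, intensity=None):
    """Build an argument list for one cgminer instance bound to one GPU."""
    argv = ["./cgminer", "-o", pool_url, "-u", user, "-p", password, "-d", str(device)]
    if intensity is not None:
        argv += ["-I", str(intensity)]  # fixed intensity for a dedicated card
    return argv                         # no -I: assumed to mean dynamic mode


POOL = "http://pool.example.com:8332"   # placeholder, not a real pool

# GPU 0 also drives the desktop, so leave it in (assumed) dynamic mode.
desktop_gpu = cgminer_argv(POOL, "XXXXX", "XXXXX", device=0)
# GPU 1 is a dedicated miner, so give it a set intensity.
dedicated_gpu = cgminer_argv(POOL, "XXXXX", "XXXXX", device=1, intensity=8)

print(" ".join(desktop_gpu))
print(" ".join(dedicated_gpu))
```

Each list can be passed to `subprocess.Popen` or pasted into its own terminal, so the two cards are managed independently.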
gigica viteazu`
Sr. Member
August 17, 2011, 02:59:19 PM
 #900

Code:
[2011-08-17 17:56:04] Failed to init GPU thread 0
[2011-08-17 17:56:07] OpenCL compiler generated a zero sized binary, may need to reboot!
[2011-08-17 17:56:07] Failed to init GPU thread 1
[2011-08-17 17:56:10] OpenCL compiler generated a zero sized binary, may need to reboot!
[2011-08-17 17:56:10] Failed to init GPU thread 2
[2011-08-17 17:56:13] OpenCL compiler generated a zero sized binary, may need to reboot!
[2011-08-17 17:56:13] Failed to init GPU thread 3


for me 1.5.6 seems to work only on single-card machines. I get the same messages on 2 different machines, one with 2x 5870 and one with 2x 5830.

any ideas?

P.S.
1.5.5 works just fine