Mobius
|
|
August 31, 2011, 12:58:50 PM |
|
A proper pools file example in the wiki for multiple mining rigs would also be helpful. I currently have a separate poolsX file for every GPU. This works but there must be a simpler method.
One simpler method is to just have one pools file and use it for all your GPUs. You only need separate files if you actually want each GPU going to a different pool.
@lodcrappo Feeling rather dense tonight. So if I create one pools file for 9 miners on Deepbit, will the three separate machines hosting the 9 GPUs read the file (on each machine, haven't done a single config file yet) and will all 9 find a miner account on Deepbit? I must be missing something.
You need one pools file per machine, though the managed config option can help you get a single central file onto all the machines. For instance, I have one pools file running 12 machines with a total of 30 GPUs. It is stored on a server, and all the rigs pull the file using rsync. This is done through the managed_config_command in bamt.conf. A simpler compromise would be to just copy the same pools file onto all your machines; if you don't change pools a lot, that will work fine.
Some pools have separate worker sub-accounts. I keep a directory on my workstation with one pools file for each GPU and then run scp manually (in a .sh with one scp entry per rig) to update all the rigs.
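The "copy the same pools file onto all your machines" approach could be scripted roughly as below. This is only a sketch: the rig hostnames, the root login, and the destination file name /etc/bamt/pools are assumptions for illustration (the thread only mentions poolsX files under /etc/bamt/). It prints the scp commands by default and only copies when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Push one central pools file to every rig with scp.
# Assumptions (not from the thread): the rig hostnames, the root login,
# and the destination path /etc/bamt/pools.

RIGS="rig1 rig2 rig3"          # hypothetical hostnames
SRC="./pools"                  # local master copy
DEST="/etc/bamt/pools"         # assumed location on each rig
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands

push_pools() {
    for rig in $RIGS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "scp $SRC root@$rig:$DEST"
        else
            scp "$SRC" "root@$rig:$DEST" || echo "copy to $rig failed" >&2
        fi
    done
}

push_pools
```

A loop over a host list keeps the script the same length no matter how many rigs are added, which is the main difference from one hand-written scp line per rig.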
|
|
|
|
kirax
|
|
August 31, 2011, 02:59:11 PM |
|
Some pools have separate worker sub-accounts. I keep a directory on my workstation with one pools file for each GPU and then run scp manually (in a .sh with one scp entry per rig) to update all the rigs.
Many pools have that... but in general, there is not a lot of need to use them. I have had 12 cards connecting to one account on btcguild, working fine.
|
VPS, shared, dedicated hosting at: electronstorm.ca. No bitcoin payment for that yet, but bitcoins possible for general IT, and mining/GPGPU rigs. PM for details.
|
|
|
blackhat
Newbie
Offline
Activity: 53
Merit: 0
|
|
August 31, 2011, 06:27:27 PM |
|
Anyone running 8 GPUs with bamt?
Yes. Well, uh .... no! I *tried* but got stuck where you are standing now. I've got 4x 6870x2 (dual-GPU) cards that I've been struggling with for some days. The goal is to put them all on a single board. When I try to boot BAMT (0.4b, all recent fixes), I get the same problem. This probably has nothing to do with BAMT directly, but with X, or more specifically with the AMD fglrx drivers that X calls. When I remove one of the dual-GPU cards, leaving only 6 GPUs in the system, BAMT boots and happily does its work. I've put some time into this issue today, because I'm very keen on ultimately running this rig with all FOUR cards. Operating with only three cards is simply not an option.
So, when i had 8 GPUs installed, BAMT died when it tried to display the GUI. So it made it through the BAMT startup screen and the initial startup.
Exactly the same thing here. What happens is this: X tries to start up and initialize all cards. With 8 GPUs present, the fglrx driver segfaults with the following error:
Backtrace:
0: /usr/bin/X (xorg_backtrace+0x3b) [0x80adedb]
1: /usr/bin/X (0x8048000+0x5aab5) [0x80a2ab5]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0xb778740c]
3: /usr/lib/xorg/modules/drivers/fglrx_drv.so (xdl_x750_atiddxPreInit+0x2554) [0xb6946d84]
4: /usr/bin/X (InitOutput+0x5c8) [0x80b09b8]
5: /usr/bin/X (0x8048000+0x1e7f0) [0x80667f0]
6: /lib/i686/cmov/libc.so.6 (__libc_start_main+0xe6) [0xb74bec76]
7: /usr/bin/X (0x8048000+0x1e5a1) [0x80665a1]
Segmentation fault at address 0x8
Fatal server error: Caught signal 11 (Segmentation fault). Server aborting
The full Xorg.0.log shows that all 8 GPUs (plus the primary iGPU) are detected correctly, and then the drivers are loaded. After this, the backtrace shows up and X can't start. You have the same issue, but your monitor is probably connected to the first (primary) GPU, so you don't see anything: when X bails out, the cards get reset and the system gets stuck in an unstable state. I could only see this after enabling the iGPU (onboard graphics) on the mobo and plugging the monitor in there. (That didn't change the outcome, though.) Unfortunately, I'm not sure whether upgrading to the latest 11.8 helps. Frankly, I doubt it.
I removed the GPU on the 1x extender and all is well. Is this a problem of me needing a MB with 4 16x PCIe slot or is it a problem with BAMT?
It's probably not the extender, and the mobo is fine too, as long as you get to boot into the kernel and it stops right before X starts and bails out with the segfault. If it stops well before that, i.e. throwing errors at you before booting into the FS and INIT, you're encountering a different problem.
Has anyone had a similar or the same problem while migrating to BAMT? I already found this one: http://blog.zorinaq.com/?e=46, but it says there that 8 GPUs work fine, whereas 10 GPUs are currently the limiting number.
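To check whether a rig is hitting the same failure, the X log can be searched for the segfault and for an fglrx frame in the backtrace. The sketch below assumes the log sits at /var/log/Xorg.0.log, the usual X.Org default; the thread only names the file Xorg.0.log, so pass another path if BAMT keeps it elsewhere. The function name is made up for this example.

```shell
#!/bin/sh
# Report whether an X log contains the fglrx segfault described above.
# The default path /var/log/Xorg.0.log is the usual X.Org location and
# is an assumption for BAMT; pass another path as the first argument.

check_x_segfault() {
    log="${1:-/var/log/Xorg.0.log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log"
        return 2
    fi
    if grep -q "Segmentation fault" "$log"; then
        # Backtrace frames look like "3: /path/fglrx_drv.so (...)".
        if grep -Eq '[0-9]+: .*fglrx_drv\.so' "$log"; then
            echo "segfault with an fglrx frame in the backtrace"
        else
            echo "segfault, but no fglrx frame in the backtrace"
        fi
        return 0
    fi
    echo "no segfault found"
    return 1
}

check_x_segfault "$@" || true
```

If the function reports no segfault, the machine is probably stopping for a different reason than the one discussed here.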
|
|
|
|
kirax
|
|
August 31, 2011, 07:33:24 PM |
|
Has anyone had a similar or the same problem while migrating to BAMT?
The extenders used, by the way, did they happen to be powered? Not related to BAMT, but I have seen people in the mining forum whose motherboards could not power all the cards when they put in too many. If X dies just as it tries to display the GUI, wouldn't that be right where it tries to kick them all to 3D mode and the power draw increases? You can have the biggest, baddest power supply in the world, but a motherboard can only supply so much power, and I bet 8 6870s suck a lot of power off the PCIe extender. Of course, this is just a thought; I might be totally wrong.
|
|
|
|
blackhat
|
|
August 31, 2011, 09:26:54 PM |
|
The extenders used, by the way, did they happen to be powered?
I don't use any extenders; gigavps uses one. The cards I'm talking about are 4 dual-GPU cards, with two GPUs per card, so there is no need for an extender if you use a board that can take 4 cards.
You can have the biggest, baddest power supply in the world, but a motherboard can only supply so much power,
I'm not quite sure which circuit we are talking about: 5V / 3.3V from the mobo? The GPU gets its main current from separate +12V feeds connected directly to the PSU, as you probably know. The power draw from the other circuits is much smaller, so the mobo won't get into trouble. Earlier graphics cards sucked the hell out of the slots (some early-day AGP-driven monster GPUs without additional +12V feeds), but today this is hardly an issue.
If X dies just as it tries to display the GUI, wouldn't that be right where it tries to kick them all to 3D mode and the power draw increases?
Yesterday, I suspected the same. But it's not logical. It may sound strange, but four dual-GPU 6870 cards pull just the same from the system as three dual-GPU 6990 cards. When the latter works (and there is wide proof that it does), there is no reason why the former shouldn't work too, provided you use the right PSU. Plus, I noticed the input load climbing up to ~600W shortly after starting X with three cards. Consequently, with 4 cards it should be anywhere around 800 to 900W, but no higher. Remember, when X starts, the cards just initialize in graphics mode and fall back to sleep soon. This short load spike should have been handled by the 1250W PSU that was already installed yesterday. But hey, just to be sure: today I installed a 1500W SilverStone PSU, with 120A available on +12V alone (!), and as soon as I placed the fourth card into the system, the dang thing died again. This can't be a power issue anymore.
But, aside from power issues, I figured out that the system didn't just lock up as it would with a hardware failure. As said, X segfaults after initializing the AMD FireGL driver. This is seconds before the load really climbs in the normal case, and a power failure on any circuit wouldn't make a driver segfault. It would shut down the board. Seriously, I doubt this is a power problem. I'm pretty sure that it is a bug or maybe a limitation in the current drivers. I appreciate any hint on this one.
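The power argument above can be checked with quick arithmetic. The 600W reading and the 120A figure come from the post; the assumption that draw scales linearly with the number of cards is mine, and it lands at the low end of the 800 to 900W estimate.

```shell
#!/bin/sh
# Back-of-the-envelope check of the power figures quoted above.
# Assumption: draw scales linearly with the number of cards.

watts_three_cards=600      # observed shortly after X starts (from the post)
rail_amps=120              # +12V capacity of the 1500W PSU (from the post)

per_card=$((watts_three_cards / 3))
four_cards=$((per_card * 4))
rail_watts=$((rail_amps * 12))

echo "estimated draw with 4 cards: ${four_cards}W"
echo "+12V rail capacity: ${rail_watts}W"
echo "headroom: $((rail_watts - four_cards))W"
```

Even with the higher 900W estimate, the +12V rail alone would leave several hundred watts of headroom, which supports the conclusion that the PSU is not the limiting factor.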
|
|
|
|
lodcrappo (OP)
|
|
August 31, 2011, 10:29:03 PM |
|
I'm pretty sure that it is a bug or maybe a limitation in the current drivers. I appreciate any hint on this one.
I will extend to you the standard BAMT support policy: I will try to make BAMT work on anything you buy for me. Since I don't have any of this hardware, there is really nothing I can do. If you guys find a solution I am happy to put it into the next version of BAMT.
|
|
|
|
lodcrappo (OP)
|
|
August 31, 2011, 10:34:37 PM |
|
Many pools have that... but in general, there is not a lot of need to use them. I have had 12 cards connecting to one account on btcguild, working fine.
Yeah, I have never needed to make more than one worker for any given pool, and frankly with the number of GPUs in my farm I just wouldn't mine at a pool that required such silliness. No way I'm managing a bunch of silly worker accounts.
|
|
|
|
jamesg
VIP
Legendary
Offline
Activity: 1358
Merit: 1000
AKA: gigavps
|
|
September 01, 2011, 01:33:16 AM |
|
Has anyone had a similar or the same problem while migrating to BAMT? I already found this one: http://blog.zorinaq.com/?e=46, but there's said that 8 GPU work fine whereas 10 GPU are currently the limiting number.
I am glad it isn't just me. Unfortunately I am not a rocket surgeon when it comes to Linux, so I think I'm just going to do 3 cards per box. If this is fixed, awesome; if not, it definitely won't derail my plans. Thanks again for BAMT, it's awesome.
|
|
|
|
blackhat
|
|
September 01, 2011, 12:43:06 PM |
|
Since I don't have any of this hardware, there is really nothing I can do. If you guys find a solution I am happy to put it into the next version of BAMT.
To my current knowledge and after the research I've done, there is nothing special about the hardware. Everything is fine until one puts 8 GPUs or more together. 6 GPUs work. Today I will find out if 7 GPUs will work, by replacing one of the 6870x2 cards with a 5850. I'll let you guys know.
|
|
|
|
mikeo
Full Member
Offline
Activity: 196
Merit: 100
Oikos.cash | Decentralized Finance on Tron
|
|
September 01, 2011, 01:29:03 PM Last edit: September 01, 2011, 01:51:25 PM by mikeo |
|
I've messed up a couple pools files in /etc/bamt/ and I want to delete them. However, from File Manager I get a permissions error that won't allow me to delete, rename, or overwrite the old poolsX files. Someone help this linux noob, please. I have changed the default P/W.
|
|
|
|
jamesg
|
|
September 01, 2011, 01:34:53 PM |
|
I am having an issue where mining on a rig will halt for no apparent reason. I can still ssh into the box, and if I reboot, everything starts up fine again. Also, when I try to access gpumon after ssh-ing into the box, the process seems to be hung.
Is there a way to monitor the phoenix processes and, if they become hung, restart them or the box itself?
|
|
|
|
kirax
|
|
September 01, 2011, 03:18:30 PM |
|
I've messed up a couple pools files in /etc/bamt/ and I want to delete them. However, from File Manager I get a permissions error that won't allow me to delete, rename, or overwrite the old poolsX files. Someone help this linux noob, please. I have changed the default P/W.
As long as you are comfortable with a little command line, that is the easiest way to do it. Go to your "start menu that we cannot call a start menu because microsoft trademarked it", whatever it is called these days. Under the top option (I think it is System?) there is a root terminal option. You'll have to enter your root password; the default is "changeme", but you did say you changed it. Once in there, use the following commands, without the quotes of course:
"cd /etc/bamt" brings you to the directory /etc/bamt, similar to cd on DOS systems.
"ls" is similar to the DOS "dir" command, in that it shows you all of the files in the directory.
"rm ./pool32" removes a file, in this case one named pool32, so you can put whatever else you want in its place. The "./" just tells it to look in the current directory and nowhere else for the file... Generally totally not needed, but safer :p
I usually remote in to my BAMT boxes from my desktop, so I do not remember how to do it in the GUI, although you might want to look for something like "root file manager" in the menu if you want to do it that way.
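The same steps can be tried as a small script before touching a real rig. This is a sketch only: it works in a scratch directory, the helper name remove_pools_file is invented for the example, pool32 is the example name from the post, and pools0 is a made-up second file that should survive.

```shell
#!/bin/sh
# Delete one unwanted pools file, following the cd / ls / rm steps above.
# remove_pools_file is a hypothetical helper, not part of BAMT.

remove_pools_file() {
    dir="$1"; file="$2"
    cd "$dir" || return 1      # go to the directory holding the pools files
    ls                         # list what is there before deleting
    rm "./$file"               # "./" limits the lookup to this directory
}

# Demo in a scratch directory; on a rig, as root, the call would be
#   remove_pools_file /etc/bamt pool32
scratch="$(mktemp -d)"
touch "$scratch/pool32" "$scratch/pools0"
( remove_pools_file "$scratch" pool32 )   # subshell keeps our own cwd
ls "$scratch"                             # only pools0 should remain
```

Running the commands from a root terminal is what gets around the permissions error that File Manager reports.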
|
|
|
|
gnar1ta$
Donator
Hero Member
Offline
Activity: 798
Merit: 500
|
|
September 01, 2011, 04:06:02 PM |
|
I am having an issue where mining on a rig will halt for no apparent reason. I can still ssh into the box, and if I reboot, everything starts up fine again. Also, when I try to access gpumon after ssh-ing into the box, the process seems to be hung.
Is there a way to monitor the phoenix processes and, if they become hung, restart them or the box itself?
Sounds like one of your cards is hanging. gpumon and atitweak don't work for me if one hangs (or crashes?? IDK the correct term). Type top and look for a phoenix instance at 100% CPU. I have been able to edit the delay times in the .conf file and run gpumon quickly enough to see which adapter is causing the issue.
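The "look for a phoenix instance at 100% CPU" check could be automated along these lines. It is a sketch under assumptions: the miner's command line is assumed to contain the word "phoenix", the 95% threshold is arbitrary, and the helper name is invented. Note that on Linux the pcpu column of ps is averaged over the life of the process, unlike the live figure in top, so a miner that only recently started spinning may not cross the threshold.

```shell
#!/bin/sh
# List PIDs of phoenix processes at or above a CPU threshold.
# Input format (stdin): "PID %CPU COMMAND...", as printed by
#   ps -eo pid=,pcpu=,args=
# busy_phoenix_pids is a hypothetical helper, not part of BAMT.

busy_phoenix_pids() {
    threshold="${1:-95}"
    awk -v t="$threshold" '$2 + 0 >= t && /phoenix/ { print $1 }'
}

ps -eo pid=,pcpu=,args= | busy_phoenix_pids 95
```

What to do with the reported PIDs (restart that miner, or reboot the box) is left open here, since the thread does not settle which recovery works.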
|
Losing hundreds of Bitcoins with the best scammers in the business - BFL, Avalon, KNC, HashFast.
|
|
|
jamesg
|
|
September 01, 2011, 04:14:31 PM |
|
Sounds like one of your cards is hanging. gpumon and atitweak don't work for me if one hangs (or crashes?? IDK the correct term). Type top and look for a phoenix instance at 100% CPU. I have been able to edit the delay times in the .conf file and run gpumon quickly enough to see which adapter is causing the issue.
I should have given more info. The computer ends up hanging randomly, hours or days after it was started. With the 5970s, I did need to increase the wait time between starting GPUs from 3 to 6, as the computer would sometimes hang when starting the mining process.
|
|
|
|
gnar1ta$
|
|
September 01, 2011, 06:19:05 PM |
|
I should have given more info. The computer ends up hanging randomly, hours or days after it was started. With the 5970s, I did need to increase the wait time between starting GPUs from 3 to 6, as the computer would sometimes hang when starting the mining process.
You can still use top to monitor the processes, or screen -r gpuX to see the individual miners. But it still sounds like a card crashing; mine sometimes takes hours or days.
|
|
|
|
jamesg
|
|
September 01, 2011, 06:37:19 PM |
|
You can still use top to monitor the processes, or screen -r gpuX to see the individual miners. But it still sounds like a card crashing; mine sometimes takes hours or days.
If one card crashes, does it take the rest of them down with it, or is this a driver issue where a card crashes and none of the other cards can function?
|
|
|
|
gnar1ta$
|
|
September 01, 2011, 07:22:54 PM |
|
If one card crashes, does it take the rest of them down with it, or is this a driver issue where a card crashes and none of the other cards can function?
I think they will all continue mining for some time, but eventually they will drop. I've had plenty of mornings where testing clocks overnight caused a card to hang and the rest had stopped mining.
|
|
|
|
blackhat
|
|
September 01, 2011, 08:58:12 PM |
|
To my current knowledge and after the research I've done, there is nothing special about the hardware. Everything is fine until one puts 8 GPUs or more together. 6 GPUs work. Today I will find out if 7 GPUs will work, by replacing one of the 6870x2 cards with a 5850. I'll let you guys know.
7 GPUs work without any hassle. As soon as it gets to 8, X punches out. It's irrelevant which card I remove; each of them works OK in combination with two others. With all 4 cards installed, they all show up in aticonfig --list-adapters, but starting X is impossible.
|
|
|
|
lodcrappo (OP)
|
|
September 01, 2011, 09:38:57 PM |
|
To my current knowledge and after the research I've done, there is nothing special about the hardware. Everything is fine until one puts 8 GPUs or more together. 6 GPUs work. Today I will find out if 7 GPUs will work, by replacing one of the 6870x2 cards with a 5850. I'll let you guys know.
I don't have any motherboards that take more than 3 GPUs. I went with "build your farm as cheap as possible", which meant lots of crap motherboards, crap power supplies, and tons of GPUs. Those massive motherboards that run tons of GPUs and monster power supplies just don't work economically. Neither do the dual GPU cards. So what I am saying is, I have no way to test 8 GPUs, or even 4 for that matter.
|
|
|
|
lodcrappo (OP)
|
|
September 01, 2011, 09:43:53 PM |
|
I should have given more info. The computer ends up hanging randomly, hours or days after it was started. With the 5970s, I did need to increase the wait time between starting GPUs from 3 to 6, as the computer would sometimes hang when starting the mining process.
Step 1: Remove all overclocking. Does the problem go away?
Step 2: Remove the GPUs one at a time. After each one is removed, does the problem go away?
|
|
|
|
|