The-Real-Link
|
|
February 22, 2012, 01:17:52 AM |
|
Awesome job there! Wow they're packed in tight (but that's the point)! Good to know they all work just fine.
|
|
|
|
grue
Legendary
Offline
Activity: 2058
Merit: 1446
|
|
February 22, 2012, 01:29:09 AM |
|
For power, I got myself a Dell 2360 watt power supply from their current-generation blade servers - it cost me $95
That's pretty cheap for a 2KW power supply. Any disadvantages in using this rather than a standard ATX power supply?
|
|
|
|
check_status
Full Member
Offline
Activity: 196
Merit: 100
Web Dev, Db Admin, Computer Technician
|
|
February 22, 2012, 01:44:47 AM |
|
Found some info on the 8 GPU limit, and part of the issue appears to be BIOS limitations.
Initially, when we first connected 13 GPUs, the system refused to boot. After discussing this with ASUS, we concluded that the problem was that the GPUs required a larger block of physical address space than the BIOS was able to provide. The 32 bit BIOS can only map the PCI devices (including PCI-E) below the 4GB boundary, so this meant there was at most roughly 3GB of address space available for the devices. Because each GPU requires a block of 16MB, a block of 32MB and a block of 256MB, only 8 or 9 GPUs worked, depending on how many on-board devices we disabled in the BIOS setup. Adding more cards than that caused a boot failure.
ASUS was extremely helpful with solving this, and they provided a custom BIOS for our motherboard that skipped the address space allocation of the GTX295 cards entirely. This is also the reason we have a single GTX275 card in the FASTRA II: it is the one card that is fully initialized by the BIOS and can provide graphics output to the monitor.
With this custom BIOS, the system booted successfully, but without working GTX295 cards since those were not initialized yet. To enable these cards, we modified a Linux 2.6.29.1 kernel (the latest at the time) to allocate physical address space to the GPUs manually. Since the kernel is 64-bit, we could map the large 256MB resource blocks above the 4GB boundary, thereby ensuring there was plenty of room for them. The smaller 16MB and 32MB blocks easily fit below 4GB, where the GPU required them.
The remaining problem was unexpected: each GPU requires a block of 4KB of I/O port space, for which only 64KB is reserved in total. Together with low-level system devices and devices like network and USB controllers also taking up I/O space this was a very tight fit. We needed to re-map inefficiently allocated system devices and disable as many devices as possible entirely, such as the RAID controller and the second network controller. From later experiments we suspect it might actually only be necessary to allocate this 4KB block of I/O ports for the primary VGA controller, but we haven’t verified that.
http://fastra2.ua.ac.be/?page_id=214
Would Infiniband allow access to more than 8 GPUs?
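The numbers in the FASTRA II quote can be sanity-checked with a little arithmetic. This is only a rough sketch: it simply sums the block sizes given in the quote and ignores PCI alignment rules, so the counts are optimistic upper bounds. The 256 MB and 512 MB "reserved for on-board devices" figures are made-up illustrations, not values from the quote.

```python
# Back-of-the-envelope check of the address-space figures quoted above.
# Assumption: blocks are simply summed; real PCI allocation also has to
# respect alignment, so these counts are optimistic upper bounds.

GPU_BLOCKS_MB = (16, 32, 256)   # per-GPU memory blocks from the quote
WINDOW_MB = 3 * 1024            # "roughly 3GB" below the 4GB boundary
IO_SPACE_KB = 64                # total I/O port space from the quote
IO_PER_GPU_KB = 4               # I/O port block per GPU from the quote


def gpu_footprint_mb():
    """Total memory-mapped address space one GPU asks for, in MB."""
    return sum(GPU_BLOCKS_MB)


def max_gpus(window_mb=WINDOW_MB, reserved_mb=0):
    """GPUs that fit in the window after other devices take reserved_mb."""
    return (window_mb - reserved_mb) // gpu_footprint_mb()


def max_io_blocks():
    """Number of 4KB I/O port blocks in the 64KB I/O space."""
    return IO_SPACE_KB // IO_PER_GPU_KB


if __name__ == "__main__":
    print(gpu_footprint_mb())          # 304 MB per GPU
    print(13 * gpu_footprint_mb())     # 3952 MB for 13 GPUs, over the window
    print(max_gpus())                  # 10 with nothing else mapped
    print(max_gpus(reserved_mb=256))   # 9 (256 MB reserved is hypothetical)
    print(max_gpus(reserved_mb=512))   # 8 (512 MB reserved is hypothetical)
    print(max_io_blocks())             # 16 blocks for every device combined
```

With a few hundred MB taken by on-board devices, the result lands on the "8 or 9 GPUs" mentioned in the quote, and 13 GPUs clearly do not fit below 4GB. The 16 I/O port blocks have to be shared with every other device in the system, which matches the "very tight fit" description.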
|
|
|
|
RandyFolds
|
|
February 22, 2012, 01:46:51 AM |
|
Nice to see this thing already pumping away...
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 22, 2012, 02:38:10 AM |
|
For power, I got myself a Dell 2360 watt power supply from their current-generation blade servers - it cost me $95
That's pretty cheap for a 2KW power supply. Any disadvantages in using this rather than a standard ATX power supply?
Yes, mainly that I will have to solder up some custom connections to it, and figure out how to turn it on. See the picture in the imgur album of the output connectors on the thing. Other than that though, not really - it has very high efficiency (80Plus Gold level), and is very compact in size (smaller than one of my PCP&C 1200 watt PSUs).
Well, the BIOS in this setup already has a 64-bit addressing option (currently disabled). I wasn't able to make BAMT boot properly with it enabled, and Windows wanted a driver (I didn't know what driver to provide, so I was never able to install Windows). I have no idea how Infiniband works, sorry, I can't say there.
-----------------------
In other news, this thing is gonna end up costing me. I just got a PM from the guy that makes awesome miner frames (Website: http://richchomiczewski.wordpress.com/), and since I am such a sucker for a well-designed frame/case, I'm getting him to quote me on a custom version to fit this board. I hate all these sigs that solicit donations for no reason, but I wonder if I oughta start posting one lol
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 22, 2012, 02:22:16 PM |
|
Huh, I just noticed that bumping the memory clocks across 7 cards by 100 MHz made the power draw jump from 11 to 13 amps. I think I'm getting pretty close to the limit of these PSUs and I need to get the server PSU working. Do any of you work for Dell, and if so can you get me the pinout on this thing? Otherwise, I'll have to start poking around and playing with the pins using a resistor until it turns on, and I'd rather just hook it up right the first time. In case I didn't post it before, here is a pic of the pins:
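To put the amp readings in watts, here is a rough sketch. The post does not say at what voltage the 11 and 13 amps were measured, so the 120 V figure below is purely an assumption (a wall-socket reading on a 120 V circuit); swap in the real voltage if it differs. Wall draw also includes PSU losses, so the cards themselves see somewhat less than this.

```python
def watts(amps, volts):
    """Power in watts for a given current and voltage."""
    return amps * volts


def added_watts_per_card(amps_before, amps_after, volts, cards):
    """Extra draw per card when the total current rises."""
    return (amps_after - amps_before) * volts / cards


if __name__ == "__main__":
    ASSUMED_VOLTS = 120  # assumption: the voltage is not stated in the post
    print(watts(11, ASSUMED_VOLTS))   # 1320
    print(watts(13, ASSUMED_VOLTS))   # 1560
    print(round(added_watts_per_card(11, 13, ASSUMED_VOLTS, 7), 1))  # 34.3
```

Under that assumption the memory clock bump costs about 240 W in total, or roughly 34 W per card across the 7 cards.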
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
February 22, 2012, 02:26:05 PM Last edit: February 25, 2012, 02:48:26 AM by DeathAndTaxes |
|
Found some info on the 8 GPU limit and part of the issue appears to be BIOS limitations. Initially, when we first connected 13 GPUs, the system refused to boot. After discussing this with ASUS, we concluded that the problem was that the GPUs required a larger block of physical address space than the BIOS was able to provide. The 32 bit BIOS can only map the PCI devices (including PCI-E) below the 4GB boundary, so this meant there was at most roughly 3GB of address space available for the devices. Because each GPU requires a block of 16MB, a block of 32MB and a block of 256MB, only 8 or 9 GPUs worked, depending on how many on-board devices we disabled in the BIOS setup. Adding more cards than that caused a boot failure.
ASUS was extremely helpful with solving this, and they provided a custom BIOS for our motherboard that skipped the address space allocation of the GTX295 cards entirely. This is also the reason we have a single GTX275 card in the FASTRA II: it is the one card that is fully initialized by the BIOS and can provide graphics output to the monitor.
With this custom BIOS, the system booted successfully, but without working GTX295 cards since those were not initialized yet. To enable these cards, we modified a Linux 2.6.29.1 kernel (the latest at the time) to allocate physical address space to the GPUs manually. Since the kernel is 64-bit, we could map the large 256MB resource blocks above the 4GB boundary, thereby ensuring there was plenty of room for them. The smaller 16MB and 32MB blocks easily fit below 4GB, where the GPU required them.
The remaining problem was unexpected: each GPU requires a block of 4KB of I/O port space, for which only 64KB is reserved in total. Together with low-level system devices and devices like network and USB controllers also taking up I/O space this was a very tight fit. We needed to re-map inefficiently allocated system devices and disable as many devices as possible entirely, such as the RAID controller and the second network controller. From later experiments we suspect it might actually only be necessary to allocate this 4KB block of I/O ports for the primary VGA controller, but we haven’t verified that.
There are three limits: 1) BIOS, 2) Linux kernel, 3) driver hard limit. AMD hasn't seen much need to fix their drivers because even if they did, you would still have the other two issues. Based on the FASTRA II results, it appears the other two aren't impossible, just difficult, and require workarounds.
My understanding is that the FASTRA II rig doesn't expose the other 12 GPUs as graphics cards to the OS. The OS (and BIOS) believe there is a single graphics card, and then using CUDA they are able to access those GPUs as compute devices (similar to how Tesla cards aren't seen by the OS as graphics/video cards).
This presents two unique problems with AMD devices:
a) AMD has a hard limit of 8 GPUs and has shown no interest in removing it.
b) AMD drivers require xorg, which would indicate a tight coupling as graphical devices (which just happen to be able to do other non-graphical stuff), so the FASTRA "workaround" of custom BIOS & kernel may not work.
Hopefully someday AMD decouples their drivers from xorg (this would allow linux distros with no GUI or windowing environment loaded at all) and removes the 8 GPU limit. Given AMD's track record with solving even simple bugs (100% CPU anyone) I am not holding my breath.
Would Infiniband allow access to more than 8 GPU's?
Physically sure, but logically the same problems above exist.
TL;DR version: AMD needs to improve their drivers so GPU compute cores can be exposed in a non-graphical manner. This would allow an unlimited number of GPUs per system, allowing even supercomputer-sized arrays involving hundreds of GPUs to be seen by the OS as a single logical system. For the record, all supercomputers to date which use GPUs use NVidia, even, ironically, those which use AMD CPUs.
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 22, 2012, 08:14:21 PM |
|
Here is a pic of the PSU that I need to mod, sitting on top of the GPUs to give an idea of the scale. It is almost exactly the same size (length, height, width) as two 5870s stacked next to each other. BTW, in cgminer I can set --failover-only, but how do I do this in BAMT (phoenix)?
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
February 22, 2012, 08:31:58 PM |
|
Here is a pic of the PSU that I need to mod, sitting on top of the GPUs to give an idea of the scale. It is almost exactly the same size (length, height, width) as two 5870s stacked next to each other. BTW, in cgminer I can set --failover-only, but how do I do this in BAMT (phoenix)?
Not sure but BAMT works w/ cgminer also.
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 22, 2012, 08:40:49 PM |
|
Not sure but BAMT works w/ cgminer also.
I'm such a linux noob that I would probably stuff something up modifying BAMT for cgminer. I guess I'll wait until lodcrappo releases a final version optimized with cgminer. It's coming soon, right?
|
|
|
|
hmblm1245
|
|
February 22, 2012, 08:48:03 PM |
|
Is that blue cable an anti static wrist strap?
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 22, 2012, 08:54:23 PM |
|
Is that blue cable an anti static wrist strap?
Yep!
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
February 22, 2012, 09:08:11 PM |
|
Not sure but BAMT works w/ cgminer also.
I'm such a linux noob that I would probably stuff something up modifying BAMT for cgminer. I guess I'll wait until lodcrappo releases a final version optimized with cgminer. It's coming soon, right?
I think you misunderstand. BAMT has cgminer. It ALREADY works with cgminer. Run fixer to ensure you are on the latest version and modify the bamt.conf to set cgminer = 1 and then configure cgminer.conf.
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 22, 2012, 09:11:28 PM |
|
Not sure but BAMT works w/ cgminer also.
I'm such a linux noob that I would probably stuff something up modifying BAMT for cgminer. I guess I'll wait until lodcrappo releases a final version optimized with cgminer. It's coming soon, right?
I think you misunderstand. BAMT has cgminer. It ALREADY works with cgminer. Run fixer to ensure you are on the latest version and modify the bamt.conf to set cgminer = 1 and then configure cgminer.conf.
Well. I didn't know that, and this is an interesting and positive development. Could you elaborate on how to "run fixer" - I had no idea that it already contained cgminer. I don't see an existing value in bamt.conf that says cgminer=0, so am I out of date?
|
|
|
|
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
|
|
February 22, 2012, 09:18:17 PM |
|
Not sure but BAMT works w/ cgminer also.
I'm such a linux noob that I would probably stuff something up modifying BAMT for cgminer. I guess I'll wait until lodcrappo releases a final version optimized with cgminer. It's coming soon, right?
I think you misunderstand. BAMT has cgminer. It ALREADY works with cgminer. Run fixer to ensure you are on the latest version and modify the bamt.conf to set cgminer = 1 and then configure cgminer.conf.
Well. I didn't know that, and this is an interesting and positive development. Could you elaborate on how to "run fixer" - I had no idea that it already contained cgminer. I don't see an existing value in bamt.conf that says cgminer=0, so am I out of date?
Probably. Run /opt/bamt/fixer when logged in as root. I don't think the patch which installs cgminer replaces bamt.conf, but it does put an updated version at /opt/bamt/examples/bamt.conf
A quick "cp /opt/bamt/examples/bamt.conf /etc/bamt/bamt.conf" will get you up to date (warning: this will replace your existing bamt.conf). Set cgminer=1 in the options section and set cgminer=1 for each card you want controlled by cgminer (likely all).
Even if you don't want to use cgminer you should always run fixer when using BAMT. Lots of bug fixes, enhanced features, improved reporting, etc.
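Since the cp above overwrites the existing bamt.conf, it may be worth keeping a copy of the old file first so pool and card settings can be carried over. A minimal sketch of that idea follows. The function name `refresh_config` and the `.bak` suffix are made up for illustration and are not part of BAMT; the only details taken from the post are the two file paths.

```python
import shutil
from pathlib import Path


def refresh_config(example, live, backup_suffix=".bak"):
    """Copy the example config over the live one, backing up the old file.

    Returns the backup path, or None if there was no live file to back up.
    (Hypothetical helper, not a BAMT tool.)
    """
    example = Path(example)
    live = Path(live)
    backup = None
    if live.exists():
        # Keep the old settings next to the live file, e.g. bamt.conf.bak
        backup = live.with_name(live.name + backup_suffix)
        shutil.copy2(live, backup)
    shutil.copy2(example, live)
    return backup
```

Run as root, it would be called as `refresh_config("/opt/bamt/examples/bamt.conf", "/etc/bamt/bamt.conf")`, after which the old settings can be compared against the new file before setting cgminer=1.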
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 22, 2012, 09:30:40 PM |
|
DeathAndTaxes, you are a hero. Just ran a skidload of updates, and I'll try getting cgminer to work now.
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 23, 2012, 02:14:26 AM |
|
So guys, what do you think of this?
|
|
|
|
BinaryMage
|
|
February 23, 2012, 02:18:15 AM |
|
So guys, what do you think of this?
Are you using extra long PCIe extenders? (I know the ones I have from Cablesaurus wouldn't be long enough) Other than that, looks promising!
|
|
|
|
rjk (OP)
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
February 23, 2012, 02:19:41 AM |
|
Are you using extra long PCIe extenders? (I know the ones I have from Cablesaurus wouldn't be long enough)
Other than that, looks promising!
Yes, I'll probably have to custom order some, unless I can find a place that makes good quality extra long ones.
|
|
|
|
the joint
Legendary
Offline
Activity: 1834
Merit: 1020
|
|
February 23, 2012, 02:26:32 AM |
|
This thing deserves a name when you're done with it.
|
|
|
|
|