Bitcoin Forum
Author Topic: Mining rig extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS]  (Read 169360 times)
The-Real-Link
February 22, 2012, 01:17:52 AM
 #41

Awesome job there! Wow, they're packed in tight (but that's the point)! Good to know they all work just fine.

Oh Loaded, who art up in Mt. Gox, hallowed be thy name!  Thy dollars rain, thy will be done, on BTCUSD.  Give us this day our daily 10% 30%, and forgive the bears, as we have bought their bitcoins.  And lead us into quadruple digits
grue
February 22, 2012, 01:29:09 AM
 #42

Quote
For power, I got myself a Dell 2360 watt power supply from their current-generation blade servers - it cost me $95
That's pretty cheap for a 2 kW power supply. Any disadvantages in using this rather than a standard ATX power supply?

It is pitch black. You are likely to be eaten by a grue.

check_status
February 22, 2012, 01:44:47 AM
 #43

Found some info on the 8 GPU limit and part of the issue appears to be BIOS limitations.
Quote
Initially, when we first connected 13 GPUs, the system refused to boot. After discussing this with ASUS, we concluded that the problem was that the GPUs required a larger block of physical address space than the BIOS was able to provide. The 32 bit BIOS can only map the PCI devices (including PCI-E) below the 4GB boundary, so this meant there was at most roughly 3GB of address space available for the devices. Because each GPU requires a block of 16MB, a block of 32MB and a block of 256MB, only 8 or 9 GPUs worked, depending on how many on-board devices we disabled in the BIOS setup. Adding more cards than that caused a boot failure.

ASUS was extremely helpful with solving this, and they provided a custom BIOS for our motherboard that skipped the address space allocation of the GTX295 cards entirely.  This is also the reason we have a single GTX275 card in the FASTRA II: it is the one card that is fully initialized by the BIOS and can provide graphics output to the monitor.

With this custom BIOS, the system booted successfully, but without working GTX295 cards since those were not initialized yet. To enable these cards, we modified a Linux 2.6.29.1 kernel (the latest at the time) to allocate physical address space to the GPUs manually. Since the kernel is 64-bit, we could map the large 256MB resource blocks above the 4GB boundary, thereby ensuring there was plenty of room for them. The smaller 16MB and 32MB blocks easily fit below 4GB, where the GPU required them.

The remaining problem was unexpected: each GPU requires a block of 4KB of I/O port space, for which only 64KB is reserved in total. Together with low-level system devices and devices like network and USB controllers also taking up I/O space this was a very tight fit. We needed to re-map inefficiently allocated system devices and disable as many devices as possible entirely, such as the RAID controller and the second network controller. From later experiments we suspect it might actually only be necessary to allocate this 4KB block of I/O ports for the primary VGA controller, but we haven’t verified that.
http://fastra2.ua.ac.be/?page_id=214
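
As a rough sanity check on the numbers in the quote above, here is a minimal Python sketch of the address-space budget. The per-GPU block sizes, the roughly 3GB window and the 64KB of I/O port space come from the FASTRA II write-up; the function names are made up for illustration, and the sketch ignores alignment and whatever on-board devices claim, so it only gives upper bounds.

```python
# Back-of-the-envelope budget for the limits described in the FASTRA II quote.
# Sizes are taken from the quote; alignment and on-board devices are ignored,
# so the results are upper bounds, not predictions.

MB = 1024 * 1024
KB = 1024

# Memory blocks each GPU asks for, per the quote: 16MB + 32MB + 256MB.
GPU_MEM_BLOCKS = (16 * MB, 32 * MB, 256 * MB)
MMIO_WINDOW_32BIT = 3 * 1024 * MB   # "roughly 3GB" below the 4GB boundary
IO_PORT_SPACE = 64 * KB             # total I/O port space
IO_PORTS_PER_GPU = 4 * KB           # I/O port block per GPU


def mem_per_gpu(blocks=GPU_MEM_BLOCKS):
    """Total memory-mapped address space one GPU needs, in bytes."""
    return sum(blocks)


def max_gpus_by_memory(window=MMIO_WINDOW_32BIT, blocks=GPU_MEM_BLOCKS):
    """Upper bound on GPUs that fit if every block must sit inside `window`."""
    return window // mem_per_gpu(blocks)


def max_gpus_by_io_ports(space=IO_PORT_SPACE, per_gpu=IO_PORTS_PER_GPU):
    """Upper bound on GPUs if each one needs its own I/O port block."""
    return space // per_gpu


if __name__ == "__main__":
    print("per GPU:", mem_per_gpu() // MB, "MB")
    print("13 GPUs need:", 13 * mem_per_gpu() // MB, "MB of",
          MMIO_WINDOW_32BIT // MB, "MB")
    print("memory bound:", max_gpus_by_memory(), "GPUs")
    # With the 256MB block moved above 4GB (the 64-bit kernel workaround),
    # only the 16MB and 32MB blocks still have to fit below 4GB.
    print("bound with 256MB blocks above 4GB:",
          max_gpus_by_memory(blocks=(16 * MB, 32 * MB)), "GPUs")
    print("I/O port bound:", max_gpus_by_io_ports(), "GPUs")
```

The memory bound comes out a little above the 8 or 9 GPUs the quote reports, which fits their remark that the real count depended on how many on-board devices were disabled.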

Would InfiniBand allow access to more than 8 GPUs?

For Bitcoin to be a true global currency the value of BTC needs always to rise.
If BTC became the global currency & money supply = 100 Trillion then ⊅1.00 BTC = $4,761,904.76.
P2Pool Server List | How To's and Guides Mega List |  1EndfedSryGUZK9sPrdvxHntYzv2EBexGA
RandyFolds
February 22, 2012, 01:46:51 AM
 #44

Nice to see this thing already pumping away...

rjk (OP)
February 22, 2012, 02:38:10 AM
 #45

Quote
For power, I got myself a Dell 2360 watt power supply from their current-generation blade servers - it cost me $95
That's pretty cheap for a 2 kW power supply. Any disadvantages in using this rather than a standard ATX power supply?
Yes, mainly that I will have to solder up some custom connections to it, and figure out how to turn it on. See the picture in the imgur album of the output connectors on the thing.

Other than that though, not really - it has very high efficiency (80Plus Gold level), and is very compact in size (smaller than one of my PCP&C 1200 watt PSUs).

Found some info on the 8 GPU limit and part of the issue appears to be BIOS limitations.

http://fastra2.ua.ac.be/?page_id=214

Would InfiniBand allow access to more than 8 GPUs?
Well, the BIOS in this setup already has a 64-bit addressing option (currently disabled). I wasn't able to make BAMT boot properly with it enabled, and Windows wanted a driver (I didn't know what driver to provide, so I was never able to install Windows). I have no idea how InfiniBand works, so I can't say there, sorry.

-----------------------

In other news, this thing is gonna end up costing me. I just got a PM from the guy that makes awesome miner frames (Website: http://richchomiczewski.wordpress.com/), and since I am such a sucker for a well-designed frame/case, I'm getting him to quote me on a custom version to fit this board. I hate all these sigs that solicit donations for no reason, but I wonder if I oughta start posting one, lol.

Firstbits: 1ngldh

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
rjk (OP)
February 22, 2012, 02:22:16 PM
 #46

Huh, I just noticed that bumping the memory clocks across 7 cards by 100 MHz made the power draw jump from 11 to 13 amps. I think I'm getting pretty close to the limit of these PSUs, and I need to get the server PSU working. Do any of you work for Dell, and if so, can you get me the pinout on this thing?

Otherwise, I'll have to start poking around and playing with the pins using a resistor until it turns on, and I'd rather just hook it up right the first time.
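
To put the 11 to 13 amp jump from this post in perspective, a small Python sketch follows. The two current readings are from the post; the 120 V figure is purely an assumption for illustration (the post does not say what voltage the current was measured at), and the helper names are hypothetical.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def watts(amps, volts):
    """Apparent power for a given current and voltage (power factor ignored)."""
    return amps * volts


if __name__ == "__main__":
    ASSUMED_VOLTS = 120.0  # assumption, not stated in the post
    print(f"increase: {percent_increase(11, 13):.1f}%")
    print(f"extra draw at {ASSUMED_VOLTS:.0f} V: "
          f"{watts(13 - 11, ASSUMED_VOLTS):.0f} W")
```

The percentage does not depend on the voltage; only the wattage figure changes if the assumed voltage is wrong.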

In case I didn't post it before, here is a pic of the pins:


DeathAndTaxes
February 22, 2012, 02:26:05 PM
Last edit: February 25, 2012, 02:48:26 AM by DeathAndTaxes
 #47

Found some info on the 8 GPU limit and part of the issue appears to be BIOS limitations.

There are three limits.
1) BIOS
2) Linux kernel
3) Driver hard limit.

AMD hasn't seen much need to fix their drivers, because even if the driver limit were removed you would still have the other two issues. Based on the FASTRA II results, the other two aren't impossible to overcome, just difficult, and they require workarounds.

My understanding is that the FASTRA II rig doesn't expose the other 12 GPUs as graphics cards to the OS. The OS (and BIOS) believe there is a single graphics card, and CUDA is then used to access the other GPUs as compute devices (similar to how Tesla cards aren't seen by the OS as graphics/video cards).

This presents two unique problems with AMD devices:
a) AMD has a hard limit of 8 GPUs and has shown no interest in removing it.
b) AMD drivers require xorg, which suggests the cards are tightly coupled to their role as graphical devices (which just happen to be able to do other non-graphical stuff), so the FASTRA "workaround" of a custom BIOS and kernel may not work.

Hopefully someday AMD decouples their drivers from xorg (this would allow Linux distros with no GUI or windowing environment loaded at all) and removes the 8 GPU limit. Given AMD's track record with solving even simple bugs (100% CPU, anyone?), I am not holding my breath.

Quote
Would InfiniBand allow access to more than 8 GPUs?

Physically, sure, but logically the same problems described above still exist.

TL;DR version:
AMD needs to improve their drivers so GPU compute cores can be exposed in a non-graphical manner. This would allow an effectively unlimited number of GPUs per system, so that even supercomputer-sized arrays involving hundreds of GPUs could be seen by the OS as a single logical system. For the record, all supercomputers to date which use GPUs use NVidia, even, ironically, those which use AMD CPUs.
rjk (OP)
February 22, 2012, 08:14:21 PM
 #48

Here is a pic of the PSU that I need to mod, sitting on top of the GPUs to give an idea of the scale. It is almost exactly the same size (length, height, width) as two 5870s stacked next to each other.



BTW, in cgminer I can set --failover-only, but how do I do this in BAMT (phoenix)?

DeathAndTaxes
February 22, 2012, 08:31:58 PM
 #49

BTW, in cgminer I can set --failover-only, but how do I do this in BAMT (phoenix)?

Not sure but BAMT works w/ cgminer also.
rjk (OP)
February 22, 2012, 08:40:49 PM
 #50

Not sure but BAMT works w/ cgminer also.
I'm such a Linux noob that I would probably stuff something up modifying BAMT for cgminer. I guess I'll wait until lodcrappo releases a final version optimized with cgminer. It's coming soon, right?

hmblm1245
February 22, 2012, 08:48:03 PM
 #51

Is that blue cable an anti static wrist strap?
rjk (OP)
February 22, 2012, 08:54:23 PM
 #52

Is that blue cable an anti static wrist strap?
Yep!

DeathAndTaxes
February 22, 2012, 09:08:11 PM
 #53

I'm such a Linux noob that I would probably stuff something up modifying BAMT for cgminer. I guess I'll wait until lodcrappo releases a final version optimized with cgminer. It's coming soon, right?

I think you misunderstand.  BAMT has cgminer.  It ALREADY works with cgminer.  Run fixer to ensure you are on the latest version and modify the bamt.conf to set cgminer = 1 and then configure cgminer.conf.
rjk (OP)
February 22, 2012, 09:11:28 PM
 #54

I think you misunderstand.  BAMT has cgminer.  It ALREADY works with cgminer.  Run fixer to ensure you are on the latest version and modify the bamt.conf to set cgminer = 1 and then configure cgminer.conf.
Well. I didn't know that, and this is an interesting and positive development. Could you elaborate on how to "run fixer" - I had no idea that it already contained cgminer. I don't see an existing value in bamt.conf that says cgminer=0, so am I out of date?

DeathAndTaxes
February 22, 2012, 09:18:17 PM
 #55

Well. I didn't know that, and this is an interesting and positive development. Could you elaborate on how to "run fixer" - I had no idea that it already contained cgminer. I don't see an existing value in bamt.conf that says cgminer=0, so am I out of date?

Probably.

Run /opt/bamt/fixer when logged in as root.

I don't think the patch which installs cgminer replaces bamt.conf, but it does put an updated version at /opt/bamt/examples/bamt.conf

A quick:
"cp /opt/bamt/examples/bamt.conf /etc/bamt/bamt.conf" will get you up to date (warning: this will replace your existing bamt.conf).

Set cgminer=1 in the options section and set cgminer=1 for each card you want controlled by cgminer (likely all).

Even if you don't want to use cgminer you should always run fixer when using BAMT.  Lots of bug fixes, enhanced features, improved reporting, etc.
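
Since the cp step in this post overwrites the existing bamt.conf, one cautious way to do the same thing is to keep a backup first. The Python sketch below is only an illustration: the two paths are the ones given in this post, while the function name and the .bak naming are assumptions made for the example, not part of BAMT.

```python
import shutil
from pathlib import Path

# Paths as given in the post; everything else here is illustrative.
EXAMPLE_CONF = Path("/opt/bamt/examples/bamt.conf")
LIVE_CONF = Path("/etc/bamt/bamt.conf")


def replace_conf_with_backup(example=EXAMPLE_CONF, live=LIVE_CONF):
    """Copy `example` over `live`, saving the old `live` as `<name>.bak` first.

    Returns the backup path, or None if there was no existing file to back up.
    """
    example, live = Path(example), Path(live)
    if not example.is_file():
        raise FileNotFoundError(f"example config not found: {example}")
    backup = None
    if live.exists():
        backup = live.with_name(live.name + ".bak")
        shutil.copy2(live, backup)   # keep the old settings around
    shutil.copy2(example, live)      # same effect as the cp command
    return backup


if __name__ == "__main__":
    # Needs root on a BAMT box, like the cp command it mirrors.
    saved = replace_conf_with_backup()
    print("old config saved to", saved)
```

The cgminer=1 settings still have to be edited into the new file by hand afterwards.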


rjk (OP)
February 22, 2012, 09:30:40 PM
 #56

DeathAndTaxes, you are a hero. Just ran a skidload of updates, and I'll try getting cgminer to work now.

rjk (OP)
February 23, 2012, 02:14:26 AM
 #57

So guys, what do you think of this?


BinaryMage
February 23, 2012, 02:18:15 AM
 #58

So guys, what do you think of this?



Are you using extra long PCIe extenders? (I know the ones I have from Cablesaurus wouldn't be long enough)

Other than that, looks promising!

-- BinaryMage -- | OTC | PGP
rjk (OP)
February 23, 2012, 02:19:41 AM
 #59

Are you using extra long PCIe extenders? (I know the ones I have from Cablesaurus wouldn't be long enough)

Other than that, looks promising!
Yes, I'll probably have to custom order some, unless I can find a place that makes good quality extra long ones.

the joint
February 23, 2012, 02:26:32 AM
 #60

This thing deserves a name when you're done with it.
