Is anyone here running multiple Nvidia cards on risers?
I am looking into running 5-6 cards per rig, but I think someone earlier in this thread mentioned that cudaMiner still has some trouble with high PCIe traffic, so risers could be a problem (1x-to-16x risers, 5 or 6 per rig).
I noticed a severe slowdown, from 340 kH/s to ~280 kH/s, when moving to a 1x-to-16x riser. I was able to recover some speed at Christian's suggestion to use the -H 2 switch, which got me back up to the 320 kH/s level, but nothing I do gets me back to 340. I'm getting 16x-to-16x risers in tomorrow for testing this weekend. I'm not sure anyone has done any serious testing of 1x-to-16x risers with Nvidia cards and cudaminer; I'll try with the cards I have (750 Ti, 680, 780 Ti), but it will have to wait until this weekend. I will post what I find. If anyone has any special requests, just post 'em here. I also have a motherboard with 4 x1 and 3 x16 slots coming in tomorrow.
I'm running into the same issue.
Setup:
Intel Pentium G3220
Asus Z87-Pro
4GB RAM
1kW Seasonic Modular PSU
7x x1-x16 Riser Assemblies
2x Zotac 750 Ti, Reference
5x ASUS 750 Ti, OC
So far I've only messed with the Zotac reference cards.
I have OC'd core +135 MHz, VRAM +700 MHz, and modified the VBIOS to lift the TDP limit.
On an x16 slot: 320-330 kH/s (with the -H 2 flag). On an x1-to-x16 riser: ~280 kH/s (with the -H 1 flag).
The -H 2 flag on an x1 riser causes unusually low performance. I believe the traffic from offloading part of the hashing work to the CPU saturates the x1 link, artificially limiting the card. Even with -H 2, I suspect we're still hitting some limitation in either the way cudaminer works or the x1 bandwidth itself.
My theory is that the only fix for the riser use case is to use x16-to-x16 riser assemblies. Otherwise, you're throwing away ~50 kH/s per card attached to an x1 riser. As an electrical design engineer, this hurts me.
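As a back-of-the-envelope check of that theory (my numbers, not measurements from this rig): PCIe 2.0 signals at 5 GT/s per lane with 8b/10b encoding, which works out to roughly 500 MB/s of usable bandwidth per lane per direction. The x1-vs-x16 gap is then:

```shell
# Rough PCIe 2.0 bandwidth comparison (assumed ~500 MB/s usable per lane
# per direction after 8b/10b encoding overhead)
lane_mb=500
x1_mb=$((lane_mb * 1))
x16_mb=$((lane_mb * 16))
echo "x1:  ${x1_mb} MB/s per direction"
echo "x16: ${x16_mb} MB/s per direction"
```

So an x1 riser leaves 1/16th of the slot bandwidth, which would line up with the -H 2 offload traffic choking on it.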
I've now got 6 750 Ti's running on Win8 x64. I'm pretty disappointed.
1 card, OC'd, in an x16 slot (running at x16): 320-330 kH/s [total: 320-330 kH/s]
2 cards, OC'd, in 2 x16 slots (each running at x8): 300 kH/s each [total: ~600 kH/s]
4 cards, OC'd, on x1-to-x16 risers (each running at x1): ~270-280 kH/s each [total: ~1100 kH/s]
6 cards, stock, on x1-to-x16 risers (each running at x1): 230-240 kH/s each (one runs at ~210 kH/s and is operating at PCIe 1.1 x1, whereas all the other cards run at PCIe 2.0 x1) [total: ~1430 kH/s]
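If anyone wants to check which link mode each card actually negotiated: on Windows, GPU-Z's Bus Interface field shows it; on Linux, a query along these lines should work (assuming a reasonably recent nvidia-smi):

```shell
# Report the current PCIe generation and link width per GPU
nvidia-smi --query-gpu=index,name,pcie.link.gen.current,pcie.link.width.current --format=csv
```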
I haven't been able to get 7 cards to show up in Windows; I figure it could be a MoBo/BIOS address space limitation.
I can't really apply any overclock at all with 6 cards loaded. MSI AB, GPU-Z, etc. all feel a little unstable in use. Applying even just +100 MHz to MEMCLK will cause cudaminer to crash.
I have also noticed that my little G3220 is way too puny to handle the -H 1 flag.
Here are the flags I'm running: -H 2 -i 0 -l T25x16
Will I see improved scaling in Ubuntu with cudaminer?
I'm thinking of returning all of these cards, because 230 kH/s per card is not so great. If I were even hitting 280, I could maybe swallow this.
What is the primary factor here causing a scalability issue? Is it the PCIe operating mode? Is it Windows 8? Is it the Nvidia Driver? Is cudaminer not written to scale well past 4 GPUs?
Yes, I did try running 2 or 4 cards each in a separate cudaminer instance. While system stability improved, the aggregate hash rate was the same, and any attempt to boost clocks caused a quick cudaminer crash.
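For anyone wanting to reproduce the multi-instance test: cudaminer's -d flag pins an instance to a single device, so the setup looks roughly like this (flags copied from my post above; pool/credential arguments omitted; on Windows use `start cudaminer.exe ...` in a batch file instead of the trailing `&`):

```shell
# One cudaminer instance per GPU, pinned with -d
./cudaminer -d 0 -H 2 -i 0 -l T25x16 &
./cudaminer -d 1 -H 2 -i 0 -l T25x16 &
```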
If I can't get more out of these cards by Monday, I'll have to send them back to Newegg.