I only looked at the XC6SLX75 and XC6SLX150, and both fit the board. I expect the XC6SLX100 will also fit, but I didn't check. The XC6SLX??-?N parts will also fit; they just lack some functionality that isn't needed here.
How about, for example, the XC6SLX25-N3FGG484C? It's ~$50 USD, about one third the price of an LX150. If you can verify that both chips will happily use the same landing, then the first prototype can have that LX25 soldered on. Full testing of the board, FPGA, power supply, and FPGA with firmware can then be performed. If everything looks good, the remaining boards can be soldered with LX150s.
It may also be good for those who wish to sample the product before buying the more expensive LX150 version, or just generally play around with the board without feeling like they are risking destroying an expensive device.
For the MSP430 code - what functions will need to be written? (I have not looked at the updated schematic yet.) What clock rates will the communication bus need to run at? Do I need to come up with a protocol, or will this be a community effort? How does the community envision the USB connection (CDC, HID, etc)? For the connection to the mainboard - what will the packet look like?
I'm not involved in this design enough to give a perfectly accurate answer, but since no one else has chimed in yet I'll make some quick comments. I've written controllers for my FPGA firmware before, so I have a bit of experience with it already.
Communications between uC and FPGA:
The uC needs to deliver Work to the FPGA at least every 2^32/(hash rate) seconds, since each Work unit covers the full 32-bit nonce range. Supposing FPGA performance of 200MH/s, that is roughly every 21 seconds. However, Long Polling will result in massive spikes of bus bandwidth, requiring ASAP delivery of Work to every FPGA.
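As a quick sanity check of the 21 second figure, here is a minimal sketch of that arithmetic. The function name is just for illustration and is not part of any existing firmware.

```python
NONCE_SPACE = 2 ** 32  # one Work unit covers the full 32-bit nonce range


def work_interval_seconds(hashrate_mhs: float) -> float:
    """Seconds one FPGA needs to exhaust a Work unit at the given MH/s."""
    return NONCE_SPACE / (hashrate_mhs * 1e6)


print(round(work_interval_seconds(200), 1))  # 21.5
```

A faster FPGA shortens the interval proportionally, so the uC's delivery deadline scales with whatever hash rate the firmware actually reaches.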
Work going to the FPGA is a data packet with a raw size of 96+256 (352) bits of data, but could go up to 608 bits in highly optimized FPGA firmware.
With clever optimization it would be possible to eliminate the peak bandwidth issues caused by a Long Polling event. Hence I would suggest that delivering ~608 bits within a quarter of a second should be sufficient in the worst case. That's about 2.5 Kbit/s of bandwidth. Anything better than that increases the MH/s performance of the system as a whole during Long Poll events, which are typically devastating regardless.
The delivery of 608-bit Work units to at most 64 FPGAs over a 20 second period would also give under 2.5Kbit/s. So 2.5Kbit/s seems like a good target. Double or triple that for communication fluff.
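The two bandwidth estimates above can be reproduced with a few lines. This is only a rough sketch using the numbers from this post (608-bit Work, a quarter-second burst deadline, 64 FPGAs refreshed over 20 seconds); the function names are made up for the example.

```python
WORK_BITS = 608  # worst-case Work size from the estimate above


def burst_bps(work_bits: int, deadline_s: float) -> float:
    """Bandwidth needed to push one Work unit within a deadline."""
    return work_bits / deadline_s


def steady_bps(work_bits: int, n_fpgas: int, interval_s: float) -> float:
    """Bandwidth needed to refresh every FPGA once per interval."""
    return work_bits * n_fpgas / interval_s


print(burst_bps(WORK_BITS, 0.25))     # 2432.0 bits/s
print(steady_bps(WORK_BITS, 64, 20))  # 1945.6 bits/s
```

Both cases land under the 2.5 Kbit/s target before any framing overhead is added.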
The uC needs to pull successful calculations from the FPGA on average every 21 seconds. These Results will be buffered in a FIFO on the FPGA, so you do not need to worry about peak bandwidth here. As long as you can pull the data out a little bit faster than one per 21 seconds per FPGA then that is sufficient.
Results coming from the FPGA will have a raw size of 96+32 (128) bits. This is unlikely to be larger, but could potentially be up to 1024 bits.
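To make the raw sizes concrete, here is a rough sketch of how the 352-bit Work and 128-bit Result could be packed on the PC side. The field order, the big-endian byte order, and the reading of the 32-bit Result field as a nonce are all assumptions for illustration; nothing about the actual wire format has been decided.

```python
import struct

# Assumed layouts, not an agreed protocol:
WORK_FORMAT = ">12s32s"  # 96 bits of data + 256 bits of state = 352 bits
RESULT_FORMAT = ">12sI"  # 96 bits of data + one 32-bit value = 128 bits


def pack_work(data96: bytes, state256: bytes) -> bytes:
    """Pack one raw Work unit (44 bytes)."""
    if len(data96) != 12 or len(state256) != 32:
        raise ValueError("expected 12 bytes of data and 32 bytes of state")
    return struct.pack(WORK_FORMAT, data96, state256)


def unpack_result(packet: bytes):
    """Split one raw Result (16 bytes) into its 96-bit and 32-bit fields."""
    data96, value32 = struct.unpack(RESULT_FORMAT, packet)
    return data96, value32


work = pack_work(bytes(12), bytes(32))
print(len(work) * 8)                          # 352
print(struct.calcsize(RESULT_FORMAT) * 8)     # 128
```

Fixed-size packets like these keep the uC side simple, since it never has to parse a length field for the common case.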
The uC may want to send various commands to the FPGA and pull back some information. For example:
Commands: Change Clock Frequency, Self Test
Queries: Version, Self Test Result, Current Nonce, Estimated Temperature, Current Clock Frequency, Fault Error
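One possible way to encode the commands and queries listed above is sketched here. The opcode values, the one-byte length field, and the 16-bit MHz payload are all hypothetical; they only show the general shape such a protocol could take.

```python
import struct
from enum import IntEnum


class Opcode(IntEnum):
    # Commands (opcode values are hypothetical)
    SET_CLOCK_FREQUENCY = 0x01
    SELF_TEST = 0x02
    # Queries
    GET_VERSION = 0x10
    GET_SELF_TEST_RESULT = 0x11
    GET_CURRENT_NONCE = 0x12
    GET_ESTIMATED_TEMPERATURE = 0x13
    GET_CLOCK_FREQUENCY = 0x14
    GET_FAULT_ERROR = 0x15


def encode_request(opcode: Opcode, payload: bytes = b"") -> bytes:
    """Frame a request as: 1-byte opcode, 1-byte payload length, payload."""
    if len(payload) > 255:
        raise ValueError("payload too long for a 1-byte length field")
    return struct.pack(">BB", opcode, len(payload)) + payload


def set_clock_request(mhz: int) -> bytes:
    """Build a Change Clock Frequency command (assumed 16-bit MHz payload)."""
    return encode_request(Opcode.SET_CLOCK_FREQUENCY, struct.pack(">H", mhz))


print(set_clock_request(200).hex())               # 010200c8
print(encode_request(Opcode.GET_VERSION).hex())   # 1000
```

Keeping commands and queries in separate opcode ranges makes it easy for the FPGA side to tell whether a reply is expected.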
Communications between uC and PC:
The uC needs to talk to the PC to give information about the current status of the FPGAs, update stored firmware, report current hashing rate, enable/disable FPGAs, etc. The PC needs to give the uC new work, and indicate Long Polling events. The uC needs to give the PC results from the FPGAs.
If the uC is responsible for programming the FPGAs, it needs to handle that as well. Hopefully the FPGAs will have fault detection and downclocking built in. If that is the case, the uC needs to report faults, and possibly re-enable FPGAs at lower clock speeds. This assumes the FPGAs can adjust their internal clock speeds. If they can't, then they'll only have two clock speeds: MAX and OFF, possibly more with the aid of alternate firmware.
... That's all I can think of off the top of my head. I hope it is useful.