Is there some mechanism for preventing a single node storing a single copy of the data, and then spoofing multiple identities and claiming payment as-if the data were stored across multiple nodes?
That's impossible unfortunately - no amount of math can prevent a node from outsourcing the actual storage to a central location that is a single-point-of-failure.
The best you'll ever be able to do is pay enough that there is incentive to do the job right, and/or use social/legal mechanisms to audit what node operators do and then arrange these transactions with specific operators. Not terribly exciting solutions unfortunately...
The obvious solutions are proof-of-work or proof-of-stake, but I can't see how proof-of-work alone would suffice.
Yeah, proof-of-work can only prove IO bandwidth, not where the data is located.
However... interactive latency measurements
can prove data within some sphere of radius t*c. With multiple trusted challenge servers around the world located in co-location centers one could easily prove that the data must be physically present in multiple locations.
You can even do in a fully decentralized way with some extensions to the Bitcoin scripting language by creating a txout that can only be spent by proving you have some data fragment, where the fragment is chosen randomly based on the previous block hash.
Because the fragment is based on the previous block hash, there is a time limit to how quickly the fragment must be retrieved, thereby proving (after sufficient trials) that the data is physically located within a sphere of radius 10minutes * the speed of light - currently this would prove the data may be physically located on Earth, the Moon and Venus, but no other planet. With a second proof-of-work blockchain established on, say, Pluto, we could then easily prove a similar result for data located on or nearby Pluto. (proving the Pluto proof-of-work blockchain is in fact located on Pluto is left as an exercise for the reader)
I think it's much slower than the speed of light unless you are sending only 1 or a few bits. The higher bandwidth desired, the shorter distance is possible. And the costs increase exponentially with distance, surpassing the mining reward. So your method may be good enough, with random fragments calculated based on a previous hash. The miners may have to store all the data in a nearby storage in order to compete with other miners.
I've been looking for something like this. What I recently read was PAST p2p storage developed in Microsoft Research a long time ago:
http://research.microsoft.com/en-us/um/people/antr/past/ It doesn't have currency implemented as incentives. Each file has K copies stored on the p2p network. The problem with this approach is that POW can only be done on the K nodes which contains a fragment. We seem to need every miner to have full access to all the data in order to calculate the hash of a random fragment. But that will bloat the block chain. How do we solve this problem?