Would you be including the prefixes of the merkle tree nodes too? These would serve as a checksum, so the receiver would at least know which transaction has the colliding prefix. In fact, with the tree nodes included, you could be much less conservative with the prefix length, sending 2 or event 1 byte(s).
Collisions could also be handled by having a re-send system.
Hashes were grouped together into 32 hash groups and the merkle root for those hashes.
The sender could estimate how much of a prefix is required. If 4 bytes was required, then a group of 32 hashes would require (32 * 4 + 32) bytes = 5 bytes per hash.
The receiver could replace the 32 hashes with its best guess and then check the CRC/Merkle root. If it doesn't match, then it could ask for those hashes in expanded form.
Assuming a block with 4000 transactions in a block.
Peer 1: Sends block with 5 bytes per hash.
This works out at 20kB
Peer 2: Checks block and finds 2 groups with a mismatch
Peer 2: sends tx request
byte[32]: Block hash
varInt: group count
int: index 1
int: index 2
This works out as 41 bytes
Peer 1: Sends full transaction hashes for those blocks
This works out at 64 * 32 = 2048 bytes
This is much less than 4000 * 32 = 128kB. However, it is at the cost of an additional round trip delay.
If each compressed hash was prefixed by a length, then the sender could try to estimate which hashes require longer prefixes.
The birthday paradox may cause potential issues here.
This "groups" idea is essentially the same thing as the merkle tree, if you make the group size 2. With smaller group size, you have to resend less transaction hashes in the case of collision. Also, the birthday paradox is less of an issue; you could use smaller prefixes.
Also, consider that we can compress the merkle nodes. Collisions in the nodes can be checked with the parent nodes all the way up the tree, but it becomes exponentially less likely that you'll need to resend a branch as you move up the tree. The advantage of this hierarchical checksum is that you can make the prefixes, of the nodes and the transaction hashes,
much smaller.
I think that re-requesting any transactions is too costly, so we should work out the prefix size so that there are "probably" no collisions, but only just. Someone needs to do the math.