what else can I do to either pin it down that there is an actual hardware error
Make sure you stress simultaneous reading and writing to the same drive, and that the drive is the bottleneck in both directions, not the network transfer.
My general (not bitcoin specific) recipe for triggering errors in disk controllers is as follows:
1) keep the source tree (e.g. ~/.bitcoin) on a local disk that is mounted read-only.
2) it is acceptable to read the source over the network only if the network is gigabit
3) recursively compute MD5 checksums of the source tree and save them for step (5)
cd ~/.bitcoin; find . -depth -type f -exec md5sum "{}" \;
4) start filling the target disk with recursive copies, not with "cp -r" or "scp -r" but with piped tar "tar cf - . | tar xvf -" or cpio "find . -depth | cpio -o | cpio -i" (with the extracting side run in the target directory)
5) as soon as the first copy is complete, start verifying it in parallel against the MD5 sums computed in step (3)
6) continue running steps (4) and (5) until the last copy fails because the target disk is nearly 100% full
7) remove exactly one target copy, making room for exactly one more copy of the source
8) keep doing the above over a weekend
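For what it's worth, steps (3) to (7) could be scripted roughly like this. It is only a sketch, not the exact script we used: the function names and the "copy.N" directory naming are made up for illustration, it assumes GNU md5sum (for "-c --quiet"), and it treats any failed copy as "disk full".

```shell
#!/bin/sh
# Hypothetical helpers for the recipe above; all names are illustrative.

# Step 3: record the MD5 sum of every file under SRC into the file SUMS.
make_sums() {  # usage: make_sums SRC SUMS
    (cd "$1" && find . -depth -type f -exec md5sum {} \;) > "$2"
}

# Step 4: make one copy of SRC at DEST with piped tar.
# The exit status is that of the extracting tar, so a full disk shows up here.
copy_tree() {  # usage: copy_tree SRC DEST
    mkdir -p "$2" && (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}

# Step 5: check one copy against the saved sums (GNU md5sum assumed).
# --quiet prints only the files that fail verification.
verify_tree() {  # usage: verify_tree DEST SUMS
    (cd "$1" && md5sum -c --quiet) < "$2"
}

# Steps 4-7 in a loop; stop it by hand after the weekend.
torture() {  # usage: torture SRC DST SUMS
    n=1 oldest=1
    while :; do
        if copy_tree "$1" "$2/copy.$n"; then
            # Verify in the background while the next copy is being written.
            { verify_tree "$2/copy.$n" "$3" || echo "MISMATCH in copy.$n" >&2; } &
            n=$((n + 1))
        else
            # Assume the disk is full: let pending verifications finish,
            # drop the partial copy, then remove exactly one complete copy.
            wait
            rm -rf "$2/copy.$n"
            [ "$oldest" -lt "$n" ] || return 1   # not even one copy fits
            rm -rf "$2/copy.$oldest"
            oldest=$((oldest + 1))
        fi
    done
}
```

Usage would be something like "make_sums ~/.bitcoin /tmp/source.md5" followed by "torture ~/.bitcoin /mnt/target /tmp/source.md5", where /mnt/target and /tmp/source.md5 are placeholder paths. Any "MISMATCH" line on stderr means a copy read back differently from the source.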
I'm not up to speed on the current SSD market, but in the past only Intel SSDs survived this kind of torture. That is, only Intel from the "reasonably priced" segment; we also tested some models from the "ridiculously expensive" segment. Our tests also involved doing very similar things through commercial database engines over many drives in parallel (not as RAID but as FILEGROUP).