I got the Mach-O x86_64 SSE2 build working (Mac OS X 10.6.7). Here's everything I did:
1. I got the file libcurl.m4 from a curl source tarball and stuck it in an m4 directory.
2. I installed the gnulib byteswap module into the cpuminer directory and updated build scripts as necessary.
3. When configuring, I had to use both "-arch x86_64" in CFLAGS and the "--build=x86_64-apple-darwin10.7.0" flag, since configure seems to detect my 32-bit kernel rather than my 64-bit processor.
4. To assemble sha256_xmm_amd64.asm, I set "-f macho64" in the Makefile (todo: integrate this into configure)
5. To fix the assembly failures, I applied the patch here: <https://gist.github.com/971291>.
The patch fixes the 32-bit offset problem by keeping the current target in another register, which is then used directly and updated as necessary. This adds one extra line of assembly to the main loop (I'm not familiar enough with x86_64 assembly to know whether it actually adds an extra _instruction_, but it probably does), plus a push and a pop to save and restore the value of the register I used (r8). The assembly already uses r10 without a save and restore, and I'm not sure how to check that r8 is unused, so the push/pop might be unnecessary. (r8 through r11 are caller-saved scratch registers in the System V AMD64 calling convention, which suggests the push/pop can go.) The patch also fixes a problem where my gcc apparently adds leading underscores to all of the names.
Step 5 is the least clean part, because the patch probably breaks the build on Linux. Nobody said assembly was supposed to be portable...