Thanks a lot, this is very interesting information.
1) With regard to "Mining tools are specifically written to speak the protocol the ASIC units expect", how would that "usually" be done?
Since many mining clients appear to be written in Python (well, this is my personal impression), let's stick with that as an example. Let's also assume we have an ASIC which "speaks" the UART (or I2C, or SPI, or ...) protocol, and is designed in such a way that all it does is read the 80-byte block header, the current target, and a nonce from a "load register", do its "double-SHA256 magic", and compare the result (the computed hash) against the current target. If result > current target, the nonce is incremented by 1, and the double-SHA256 magic is done again. If result < current target, the result is written to a "store register" and the ASIC raises some kind of success flag.
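For concreteness, the hypothetical ASIC's inner loop can be modeled in a few lines of Python (function names and the 76-byte-header-plus-4-byte-nonce split are my own illustration; only the double-SHA256 and the compare-against-target are actual Bitcoin semantics):

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Bitcoin's proof-of-work hash: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def asic_search(header76: bytes, target: int, nonce: int) -> int:
    """Software model of the hypothetical ASIC's loop: append the 4-byte
    little-endian nonce to the first 76 header bytes, double-SHA256, and
    increment the nonce until the hash falls below the target."""
    while True:
        h = double_sha256(header76 + nonce.to_bytes(4, "little"))
        # The digest is interpreted as a little-endian 256-bit integer.
        if int.from_bytes(h, "little") < target:
            return nonce  # the "store register" value; success flag raised here
        nonce = (nonce + 1) & 0xFFFFFFFF  # nonce wraps at 32 bits

# Toy run with an absurdly easy target so the loop terminates quickly:
found = asic_search(b"\x00" * 76, 1 << 252, 0)
```

A real chip would do this in massively parallel hardware, of course; the point is only that the contract between client and ASIC is this small.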
In the given example, am I right to assume that it would be sufficient to extend the mining client's code so that it can communicate with the ASIC using said UART (I2C/SPI) protocol? If so, it's a trivial task, because this can easily be done with e.g. pyserial (or the MPSSE wrapper in the case of I2C/SPI).
It would boil down to something like:
* mining client gets a block from the network (getwork or getblocktemplate)
* mining client strips the block header and the target from the block (plus some little/big-endian byte-order shuffling)
* mining client writes the block header, the target, and the desired start value of the nonce to the ASIC's load register
* mining client signals the ASIC to start processing
* mining client waits for the ASIC to signal success
* mining client reads the result from the ASIC's store register, and sends it to the network
OK, I left out some exception handling stuff like stale block handling and such, but in principle ...
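The steps above can be sketched end to end. The serial link is mocked out here with a plain Python object (the register names and method calls are invented for illustration); in a real setup each of those calls would be a pyserial `write()`/`read()` transaction:

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

class MockAsic:
    """Stand-in for the hypothetical ASIC behind a UART/I2C/SPI link."""
    def __init__(self):
        self.success = False
    def write_load_register(self, header76: bytes, target: int, nonce: int):
        self.header76, self.target, self.nonce = header76, target, nonce
    def start(self):
        # The real chip would grind in hardware; we grind in software.
        while True:
            h = double_sha256(self.header76 + self.nonce.to_bytes(4, "little"))
            if int.from_bytes(h, "little") < self.target:
                self.store_register = self.nonce
                self.success = True
                return
            self.nonce = (self.nonce + 1) & 0xFFFFFFFF

def mine_block(asic, header76: bytes, target: int) -> int:
    """The client-side steps from the list above."""
    asic.write_load_register(header76, target, 0)  # write header/target/start nonce
    asic.start()                                   # signal the ASIC to start
    while not asic.success:                        # wait for the success flag
        pass
    return asic.store_register                     # read the winning nonce back

# Demo with a dummy header and an easy target:
winning_nonce = mine_block(MockAsic(), b"\xab" * 76, 1 << 250)
```

Everything network-facing (getwork/getblocktemplate, submitting the solved block, stale-block handling) is left out, exactly as in the list above.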
My problem: it's hard for me to believe that it's really that simple, so the question is, where am I wrong, and what am I missing?
2) full verification vs. simplified verification
> although existing implementations expect to be able to load the wallet into RAM
OK, do I understand it right: it's sufficient to keep a single instance of the wallet in RAM -- it's not required to keep an instance of the wallet in RAM _per_SHA256_worker_thread_ -- is this correct? If so, it shouldn't be a problem at all, since gigabytes of memory are a dime a dozen these days (well, almost).
Again, thanks a lot in advance for your help!