Canonical encoding means a numbering system for each block, tx, vin and vout, so that the same number always references the same item. Since the blocks are ordered, the txs are ordered within each block, and the vins and vouts are ordered within each tx, this is just a matter of iterating through the blockchain in a deterministic way.
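To make the iteration concrete, here is a minimal sketch of the numbering. The `Tx` class and `number_chain` function are made up for illustration and are not taken from any real codebase; the toy chain also assumes txids are unique.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Tx:
    """Toy transaction (hypothetical structure, for illustration only)."""
    txid: str
    vins: List[Tuple[str, int]] = field(default_factory=list)  # (prev txid, prev vout n)
    vouts: List[int] = field(default_factory=list)             # output values


def number_chain(blocks: List[List[Tx]]):
    """Walk blocks, txs, vouts and vins in chain order and number each one.

    Every counter only depends on the order of the chain, so any node that
    sees the same chain assigns the same numbers.
    Assumption: txids are unique in this toy chain.
    """
    txid_ind: Dict[str, int] = {}
    vout_ind: Dict[Tuple[str, int], int] = {}
    vin_ind: Dict[Tuple[str, int], int] = {}
    for block in blocks:                      # blocks are already ordered
        for tx in block:                      # txs are ordered within a block
            txid_ind[tx.txid] = len(txid_ind)
            for n in range(len(tx.vouts)):    # vouts are ordered within a tx
                vout_ind[(tx.txid, n)] = len(vout_ind)
            for n in range(len(tx.vins)):     # vins are ordered within a tx
                vin_ind[(tx.txid, n)] = len(vin_ind)
    return txid_ind, vout_ind, vin_ind


chain = [
    [Tx("a", vouts=[50])],
    [Tx("b", vouts=[50]), Tx("c", vins=[("a", 0)], vouts=[20, 30])],
]
txid_ind, vout_ind, vin_ind = number_chain(chain)
```

Running `number_chain` twice on the same chain gives identical tables, which is all "canonical" needs to mean here.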
Is it just a count of outputs, or <block-height | transaction index | output index>?
I don't see how this would cause any privacy loss, as it just uses 32-bit integers as pointers to the hashes. Privacy was somehow raised as a reason not to use the efficient encoding.
Ahh ok, I guess the confusion is due to the thread split. I agree, I see no loss in privacy from referring to transactions by their historical positions.
With a hard fork, transactions could have both options. If you want to spend a recent transaction, you could refer to the output by hash. For transactions that are buried deeply, you could use the index. With reorgs, the indexes could be invalidated, but that is a very low risk at 100+ confirms.
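A rough sketch of that choice, assuming a 100-confirmation cutoff taken from the "100+ confirms" figure above. `HashRef`, `IndexRef`, `make_ref` and the constant are hypothetical names; nothing like this exists in the current protocol.

```python
from typing import NamedTuple, Union

# Assumed cutoff: below this depth a reorg could still renumber the output.
MIN_CONFIRMS_FOR_INDEX = 100


class HashRef(NamedTuple):
    """Reference by txid and output number (safe for recent transactions)."""
    txid: str
    vout: int


class IndexRef(NamedTuple):
    """Reference by canonical output index (compact, for deeply buried outputs)."""
    unspentind: int


def make_ref(tip_height: int, tx_height: int, txid: str, vout: int,
             unspentind: int) -> Union[HashRef, IndexRef]:
    """Pick the reference form based on how deeply the output is buried."""
    confirms = tip_height - tx_height + 1  # a tx in the tip block has 1 confirm
    if confirms >= MIN_CONFIRMS_FOR_INDEX:
        return IndexRef(unspentind)
    return HashRef(txid, vout)
```

The cutoff is a policy knob: a higher value trades some space savings for more safety against reorgs.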
I used to use a 32-bit index for the entire chain, but that doesn't work for parallel sync, plus in a few years it would actually overflow.
Now it is a 32-bit index each for txidind, unspentind and spendind, which number the txids, vouts and vins within each bundle of 2000 blocks,
and an index for the bundle, which is less than 16 bits.
So (bundle, txidind), (bundle, unspentind) and (bundle, spendind) identify the corresponding txid, vout and vin within each bundle.
But yes, use hashes for non-permanent data and indexes for permanent data.
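For illustration, the (bundle, index) pairs could be handled roughly like this. The bundle size of 2000 blocks and the bit widths come from the description above; packing both fields into a single integer, and the helper names, are assumptions made for this sketch.

```python
BUNDLE_SIZE = 2000   # blocks per bundle, as described above
BUNDLE_BITS = 16     # the bundle index is said to fit in under 16 bits
IND_BITS = 32        # txidind / unspentind / spendind are 32-bit


def bundle_of(height: int) -> int:
    """Bundle number that contains the block at this height."""
    return height // BUNDLE_SIZE


def pack(bundle: int, ind: int) -> int:
    """Pack (bundle, ind) into one integer (assumed layout: bundle in the high bits)."""
    if not 0 <= bundle < (1 << BUNDLE_BITS):
        raise ValueError("bundle out of range")
    if not 0 <= ind < (1 << IND_BITS):
        raise ValueError("index out of range")
    return (bundle << IND_BITS) | ind


def unpack(packed: int):
    """Inverse of pack: recover (bundle, ind)."""
    return packed >> IND_BITS, packed & ((1 << IND_BITS) - 1)
```

Because each bundle keeps its own counters, bundles can be numbered independently of one another, which is what lets them be synced in parallel.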