Back in 2013 an Anonymous Author wrote a brief paper on using BLS signatures to achieve a kind of non-interactive coinjoin. I jumped on it right away, pointing out that it also could have useful anti-censorship properties. (Edit: Paper link is down, a copy is available from Andrew Poelstra.)

The core property it depends on is that fact that for a BLS signature one can take a collection of {message, pubkey, signature} tuples and merge their signatures: {[message1, message2, ...], [pubkey1, pubkey2, ...], signature}, in a way which is infeasible to unmerge them without knowing the components, and still be able to verify the result. Via a clever trick that involves signing per input and per output, the authorship of inputs and outputs can be unlinked. I have a sidechain implementation of this that I've been working on-- like coinjoin generally it composes well with CT, but I'm unsure of what to do about the performance (more below).

Another application of these aggregatable signatures which has come up from time to time in #bitcoin-wizards is using them to reduce blocksize ("2013-09-13 21:38 < gmaxwell> sipa: I'm mildly excited about this pairing crypto aggregate signature idea. Not because of the anonymity stuff, but because it makes scalable relay fees viable. and can also reduce transaction sizes in the non-anonymous case"), since you'd save the sizes of the free standing signatures. Aggregate BLS signatures could even be made compatible with 'efficient' fraud proofs, though the cost of doing probably makes it not a win. (You would do a merkel sum tree over the interior computation, which aggregates the partial signatures as products in a 3kbit field; which would make it possible to show a block had an invalid aggregate signature with a compact proof).

The main downside of this are two fold-- the BLS signatures involve Pairing Cryptography-- less mature security assumptions than plain ECC signatures, and more importantly they're slow to verify: On hardware that can verify 70,000 secp256k1 signatures per second (or 140,000/s with batched Schnorr) they could only process about 8,000 BLS signature per second, with state of the art parameter selection and software. So, while bandwidth would be reduced, the capacity of most nodes wouldn't be increased much because they'd be bottlenecked on signatures. (And, right now, signature validation is extensively cached in Bitcoin-- and the aggregation would move validation back on the critical path).

[I've had some conversations about Dan Boneh about using multi-exponentiation like optimizations in computing the required product of parings here-- but it's likely that the performance improvement that this would get would only be on the order of 2x.]

Adam Back was asking me about this again today and I pointed him to the prior discussions regarding the performance limits. Then he suggested just using Schnorr multi-signature instead of the BLS signatures. The idea is that if you have inputs with pubkeys P and P2 you can combine them and to form a P+P2 pubkey and sign with that to prove authorization with both keys. When this had previously been discussed the fact that someone malicious could set their P2 to P3-P and then sign the aggregate with just P3 seemed to kill it.

But actually a few months ago Pieter Wuille solved this as a side effect of his key tree multi-signature work, He suggested instead that to combine P and P2, you use P*H(P) + P2*H(P2); this doesn't inhibit signing, but it makes it impossible to create a pubkey as a linear combination of other pre-existing pubkeys. Pieter has an implementation of this as part of the Libsecp256k1 multi-signature work: https://github.com/bitcoin/secp256k1/pull/322

So the way we could use this to improve scalability in Bitcoin would be to add a OP_GROUPCHECKSIGVERIFY (and perhaps a multi-version of it) as part of a new segwit script typecode that takes as input sighash flags and a pubkey. The script operation would push a the pubkey onto a transaction scoped 'signature stack' and the sighash-masked transaction hash onto a transaction scoped hash stack.

After all inputs are processed the pubkeys on the stack would be combined to a new Pubkey as P = P1*H(H(all_Pn) || P1) + P2*H(H(all_Pn) || P2) + ..., and a root message hash computed-- then an additional group signature witness at the end of the transaction would contain a signature of the root with P.

Verification speed would be similar to individual inputs with batch schnorr (the multiplication with H() can be folded into the multi-exp in verification), so this is more of a space savings than a speedup; but the size of the transaction would be reduced by 41% asymptotically-- without any new security assumptions or performance hit. The main limit is that a transaction must have many inputs to fully realize that benefit.

Unlike other past shared signature proposals based on exploiting pubkey reuse, there is no advantage in this design in reusing public keys; so it also doesn't create a new privacy moral hazard for the network.

Critically, multiple distinct users could cooperate to securely produce a single large transaction with a two round protocol which may make it reasonable to get _many_ inputs in a single transaction-- but unlike the BLS signatures they must interact in order to merge their signatures. Effectively this would be a CoinJoin, with a big space (and thus fee) savings for the CoinJoin users.

Even without combining multiple users this approach would make it much more economical for a single user to sweep up txout dust which would be an important improvement on its own. For block 400083 if all transactions used this (but with no increase in CoinJoin, or even behavioral changes to prefer sendmany over chains of transactions which we would expect if this actually existed; and also ignoring multisig use, which would further improve this figure) I compute this would reduce the size of the block by 172050 bytes out of 897956 or a 19% reduction which is substantial fraction of the asymptotic result. [Edit: A scan of 2016 recent blocks with accurate counting for multisig, shows the actual reduction for existing usage patterns would be 30%]

In effect this could achieve at least half the asymptotic gain while having no negative performance impact (10x-17x better performance than BLS) and no new strong cryptographic security assumptions. To me it seems like a no brainer improvement that Bitcoin is almost sure to eventually adopt once implemented.