Thought I'd chime in here as I've spent a number of years now investigating possible solutions to allow a distributed ledger to process high transaction throughput (VISA+ scale), yet remain trustless and decentralized (no super nodes, witnesses, or any of the myriad other semi-centralization tricks used to scale). I'm not going to delve too much into the technical side in this post, just share some of the ideas and philosophies that I had and where I ultimately settled. Perhaps it can give others some ideas, inspiration, etc.
First though, a quick recap...
Way back when (late 2012), the question I wanted an answer to was: at what TPS does a pedigree Satoshi block chain secured with POW start to become problematic?
I performed a number of tests which ultimately arrived at a figure of 150-300 TPS, depending on the topology of the network and the average performance of nodes within the network graph. Past that point orphan thrashing began, and the performance of the network in general and the efficiency of POW mining deteriorated rapidly (using efficient and POW in the same sentence seems a real oxymoron now!). These days, with higher average node specs and faster internet connections, I'd wager ~500 TPS would be possible before any headaches (a block size of about 300MB if anyone is wondering).
After that (2013) I started to experiment with different ledger architectures, the first of which was what I called a Block Tree (it was really more akin to a DAG). Without getting into too much detail, the premise was that at times of high load the "tree" could widen, and portions of the network could be in varying states of total consensus (with parts of the tree missing, for example) while still ensuring a correct consensus for the parts of the tree they had. With a large enough portion of the correct tree, nodes could estimate the chances of being out of consensus before the fact. When load decreased, the tree would narrow again and lagging nodes would eventually catch up.
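For anyone wanting to sanity check that block size, here is a rough sketch. It assumes a 10 minute block interval and about 1KB per transaction on average; neither number is stated above, they are simply values under which the arithmetic lands on roughly 300MB at 500 TPS.

```python
# Back-of-envelope block size for a given sustained transaction rate.
BLOCK_INTERVAL_S = 600   # assumption: ~10 minute block interval
AVG_TX_BYTES = 1_000     # assumption: average transaction size, not from the post


def block_size_mb(tps, interval_s=BLOCK_INTERVAL_S, tx_bytes=AVG_TX_BYTES):
    """Size in MB (10^6 bytes) of a block holding every transaction from one interval."""
    return tps * interval_s * tx_bytes / 1_000_000


for tps in (150, 300, 500):
    print(f"{tps} tps -> {block_size_mb(tps):.0f} MB per block")
```

With a smaller average transaction size the block shrinks proportionally, so treat the output as an order of magnitude rather than a measurement.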
There was some improvement (especially with regard to load spikes), but ultimately the same issues as a block chain surfaced at a higher load, and with extreme continuous load the whole thing fell on its ass.
I then went "full DAG" and dropped the blocks, which resulted in further improvement, but traditional consensus algorithms (POW, POS, etc) again led to hard upper limits and various new problems, such as the lack of the true global state that a block chain based approach provides. A DAG also couldn't support a large number of other features that were determined as "must have" for a real mass market targeted product.
That was the end of 2014, and I went back to the drawing board completely and developed a ledger architecture called CAST (Channeled Asynchronous State Tree) and a consensus mechanism called EVEI (Evolving Voters via Endorseable Interactions). Together they allow scaling to VERY high throughput and meet all the necessary requirements.
The eureka moment was the realization that it is possible to split the data from the state, yet ensure that the data determines the state. This yields a number of very important properties when considering scalability:
1. The states are small (2000 tps consumes around 50kb per second)
2. The states have multiple points of origin
3. The states can be split into sub-states that reference a sub-set of the total transactions
First let's look at blocks and block chains with regard to the above points:
In a block chain the block is the state AND the data. This is required due to how the consensus operates with mining: the miner of the next block may have transactions that others do not know about, so the transaction data has to be packaged inside the state itself (this is true no matter the algorithm, POW, POS, DPOS etc).
This in turn leads to there being only a single point of origin for the next valid block and so it has to propagate over the network. This leads to the inevitable latency and CAP considerations. If the block is too large and takes too long to fully propagate, orphan thrashing begins to occur and reduces overall performance and efficiency. Another side effect is that ALL transactions are broadcast twice, once when the transaction is created, and later within the block itself further adding to network and bandwidth overheads.
Finally, a block obviously cannot be split into sub-blocks once it has been mined to mitigate any of the above.
Going back to CAST and EVEI. In a gossip driven P2P network it can be assumed that the majority of nodes will always know about the majority of transactions, therefore the majority of nodes will output the same state independently, without any specific state communication with each other. This covers points #1 and #2: the states can be small because the data no longer needs to be embedded in the state, and the state has multiple points of origin, greatly reducing propagation time (the majority of nodes already have the state, so in a healthy network propagation time is practically zero).
This greatly increases the performance of the network and its efficiency. I've witnessed continuous loads of > 500 tps over long periods of time and short term spikes of > 2,500 tps in both small and large networks consisting of hardware ranging from Pis to enterprise servers with no issues.
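To make the "data determines the state" idea a bit more concrete, here is a minimal sketch. It is not the actual CAST/EVEI implementation; the hashing scheme, the function name state_digest and the transaction ids are all assumptions picked for illustration. The point is only that nodes holding the same set of transactions can derive the same small state without sending it to each other.

```python
import hashlib


def state_digest(tx_ids):
    """Derive a compact state from a set of transaction ids (illustrative scheme only).

    Sorting first makes the result independent of the order in which
    transactions arrived over gossip.
    """
    h = hashlib.sha256()
    for tx_id in sorted(set(tx_ids)):
        # Hash each id separately so that concatenation is unambiguous.
        h.update(hashlib.sha256(tx_id.encode()).digest())
    return h.hexdigest()


# Two nodes hear the same transactions via gossip, in different orders.
node_a = ["tx3", "tx1", "tx2"]
node_b = ["tx2", "tx3", "tx1"]
assert state_digest(node_a) == state_digest(node_b)

# A node that missed a transaction derives a different state, so it can
# tell it is out of step once it compares digests with its peers.
node_c = ["tx1", "tx2"]
assert state_digest(node_c) != state_digest(node_a)
```

In this sketch the state is a fixed 32 bytes however many transactions it covers, whereas a block has to carry every transaction it commits to.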
Furthermore, having a global state of the ledger with consensus mitigates a lot of the problems associated with a DAG and its progressive state mechanics.
Some might argue that CAST + EVEI is then a block chain, and yes there are some similarities and overlap, but the principles and operational functionality underpinning it are radically different, so I consider it to be in a different camp. Either way, call it what you will.
Moving on, 500-2,500+ tps is pretty good, especially when hardware such as a Pi is able to keep pace most of the time with minimal issues, but it's not enough. VISA alone on Black Friday reportedly processes peaks of 40,000 tps, but even when discarding Black Friday, adding MasterCard, Amex, PayPal, and all the banking payments into the mix, it quickly becomes obvious that a couple of thousand tps is not enough for a global payments system. Throw IoT in the bag too and the requirements roll into the 100,000+ range very quickly. Which is where #3 comes into play.
Block chains are generally unstructured, with the block containing a soup of transactions from various addresses. CAST on the other hand is very structured, with addresses owning one or more channels and each transaction having at least 2 components...a spend and a claim. The spend lives in the spender's channel and the claim lives in the receiver's. With this structuring it is very easy to chop the ledger up into more manageable partitions.
This then leads to the conclusion that, with a structured ledger and compact states that are determined by the data itself, it should be possible for the global ledger state to also be split into sub-states according to each data partition. WIN!
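A rough sketch of that structure, under heavy assumptions: one channel per address (CAST allows more), partitions assigned by hashing the address, and no validation of balances or signatures. The names Entry, Ledger and partition_of are made up for this example and are not taken from the real implementation.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass

NUM_PARTITIONS = 4  # hypothetical; kept small for the example


@dataclass(frozen=True)
class Entry:
    kind: str    # "spend" or "claim"
    tx_id: str
    amount: int


def partition_of(address, num_partitions=NUM_PARTITIONS):
    """Map an address to a partition by hashing it (one possible scheme)."""
    digest = hashlib.sha256(address.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class Ledger:
    def __init__(self):
        # partition number -> channel owner address -> list of entries
        self.partitions = defaultdict(lambda: defaultdict(list))

    def _channel(self, address):
        return self.partitions[partition_of(address)][address]

    def transfer(self, tx_id, sender, receiver, amount):
        """Record one transaction as a spend in the sender's channel
        and a claim in the receiver's channel."""
        self._channel(sender).append(Entry("spend", tx_id, amount))
        self._channel(receiver).append(Entry("claim", tx_id, amount))

    def balance(self, address):
        """Sum of claims minus sum of spends in the address's channel."""
        return sum(
            e.amount if e.kind == "claim" else -e.amount
            for e in self._channel(address)
        )


ledger = Ledger()
ledger.transfer("tx1", "alice", "bob", 10)
ledger.transfer("tx2", "bob", "carol", 4)
print(ledger.balance("bob"))  # 6
```

Because each half of a transaction lives in exactly one channel, a node holding only some partitions still has everything it needs to work out the channels inside them.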
Nodes can configure according to their performance and support n partitions rather than having to upgrade or even go offline to stay in the game as load increases over time.
EVEI consensus operates at a partition level, and the global state is simply the combination of all partition-level state consensus outcomes. This functions reliably because most nodes will operate more than a single partition, and the variance of node partition configurations in the network will lead to an amount of overlap. This overlap provides an auditable causality of the global state from current and past partition states.
Partitioning the data does bring with it some overhead, and presently the sweet spot seems to be about 1000 partitions before the overhead grows too steeply. This can probably be improved, but even if not, 1000 partitions each with the ability to process ~500 tps should be more than enough scale for now!
Some might be thinking, "hmm that partitioning thing sounds awfully similar to Ethereum's sharding" and it does because it is. However, Ethereum's partitioning/sharding implementation is inferior due to 3 points:
1. It uses a block chain (or several) and is more akin to a set of side chains, which means there can't be a true consensus on global state
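As a toy illustration of partition-level states rolling up into a global state, consider the sketch below. It is not EVEI: there is no voting or endorsement here, only a check that nodes sharing a partition derive the same partition state. The node names, partition assignments and hashing scheme are all hypothetical.

```python
import hashlib


def digest(parts):
    """Hash a sequence of strings into one hex digest."""
    h = hashlib.sha256()
    for p in parts:
        h.update(hashlib.sha256(p.encode()).digest())
    return h.hexdigest()


def partition_state(tx_ids):
    """State of one partition, derived only from its transactions."""
    return digest(sorted(tx_ids))


def global_state(partition_states):
    """Combine per-partition states, ordered by partition number."""
    return digest(partition_states[p] for p in sorted(partition_states))


# Transactions per partition (hypothetical ids).
ledger = {0: ["a1", "a2"], 1: ["b1"], 2: ["c1", "c2", "c3"]}

# Each node only serves the partitions it can keep up with.
node_partitions = {"pi": {0}, "laptop": {0, 1}, "server": {1, 2}}

# Every node computes states only for its own partitions...
reported = {
    node: {p: partition_state(ledger[p]) for p in parts}
    for node, parts in node_partitions.items()
}

# ...and nodes that overlap on a partition should agree on its state.
agreed = {}
for node, states in reported.items():
    for p, s in states.items():
        assert agreed.setdefault(p, s) == s, f"partition {p} disagrees at {node}"

print(global_state(agreed))
```

No single node in this example holds all three partitions, yet the overlaps (partition 0 on the Pi and the laptop, partition 1 on the laptop and the server) are enough to cross-check each partition state before combining them.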
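Putting rough numbers on that, using only the figures quoted in this post. This is an upper bound that ignores the partitioning overhead just mentioned, so the real ceiling would be lower by some amount not quantified here.

```python
# Theoretical ceiling from the figures quoted in the post.
PARTITIONS = 1_000
TPS_PER_PARTITION = 500        # the "~500 tps" per partition figure
VISA_PEAK_TPS = 40_000         # the reported Black Friday peak
TARGET_TPS = 100_000           # the "100,000+" requirement once IoT is included

ceiling = PARTITIONS * TPS_PER_PARTITION
print(ceiling)                   # 500000
print(ceiling / VISA_PEAK_TPS)   # 12.5
print(ceiling / TARGET_TPS)      # 5.0
```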
2. It is difficult and inefficient for shards to communicate due to the architecture of its smart contract VM and ambiguous state data
3. It's at least 2 years out; EVEI and CAST are not
Conclusion and TL;DR: To scale, remove the block chain, replace it with a structured ledger and states that are decoupled from data, use consensus that embraces determinism...then chop the ledger into smaller chunks.