Even if the download speed is the bottleneck, it still seems like sharing all the information across the network will eventually be too slow. I know that some blocks contain more transactions than others and that the difficulty changes every 2016 blocks, but more transactions means more data to transfer. That's why I did all my calculations in terms of transactions per second, not blocks per second.
How are you coming to this conclusion? The wiki you linked already contains a section on bandwidth, and it contradicts what you're saying:
Let's assume an average rate of 2000tps, so just VISA. Transactions vary in size from about 0.2 kilobytes to over 1 kilobyte, but it's averaging half a kilobyte today.
That means that you need to keep up with around 8 megabits/second of transaction data: (2000 tps * 512 bytes) / 1024 bytes in a kilobyte / 1024 kilobytes in a megabyte = 0.97 megabytes per second, and 0.97 * 8 = 7.8 megabits/second.
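If you want to check that arithmetic or plug in your own numbers, here is a minimal sketch. The function name `required_megabits_per_second` is made up for illustration, and it uses the same binary units (1024 bytes per kilobyte) as the quoted calculation.

```python
def required_megabits_per_second(tps: float, avg_tx_bytes: float) -> float:
    """Bandwidth needed to keep up with a given transaction rate.

    Uses binary units (1024 bytes per kilobyte, 1024 kilobytes per
    megabyte), matching the calculation quoted above.
    """
    bytes_per_second = tps * avg_tx_bytes
    megabytes_per_second = bytes_per_second / 1024 / 1024
    return megabytes_per_second * 8  # 8 bits per byte


if __name__ == "__main__":
    # 2000 tps at an average of 512 bytes per transaction
    print(required_megabits_per_second(2000, 512))  # 7.8125
```

Doubling either the transaction rate or the average transaction size doubles the result, so the estimate scales linearly with load.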
This sort of bandwidth is already common for even residential connections today, and is certainly at the low end of what colocation providers would expect to provide you with.
When blocks are solved, the current protocol will send the transactions again, even if a peer has already seen them at broadcast time. Fixing this to make blocks just a list of hashes would resolve the issue and make the bandwidth needed for block broadcast negligible. So whilst this optimization isn't fully implemented today, we do not consider block transmission bandwidth here.
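To get a rough feel for how much the list-of-hashes idea would save, here is a small sketch under some assumptions that are not in the quoted text: each transaction hash is 32 bytes, a block covers about 600 seconds of transactions (the 10-minute target interval), and block headers and other overhead are ignored. The helper `block_payload_bytes` is hypothetical.

```python
HASH_BYTES = 32  # assumed size of one transaction hash (double SHA-256)


def block_payload_bytes(tx_count: int, avg_tx_bytes: int, hashes_only: bool) -> int:
    """Approximate bytes sent per block, ignoring header overhead."""
    per_tx = HASH_BYTES if hashes_only else avg_tx_bytes
    return tx_count * per_tx


if __name__ == "__main__":
    # Assumption: 2000 tps sustained over a 600-second block interval
    tx_per_block = 2000 * 600
    full = block_payload_bytes(tx_per_block, 512, hashes_only=False)
    compact = block_payload_bytes(tx_per_block, 512, hashes_only=True)
    print(compact / full)  # 0.0625
```

Under these assumptions the hash list is one sixteenth the size of resending every transaction, which is small next to the transaction broadcast traffic that nodes already have to carry.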