About a year ago I created my own homebrew DDoS protection. Here's how it worked:
Set up several smallish EC2 instances. Each one acts as a reverse proxy to the origin server:
client1 --> gate1 --\
client2 --> gate2 ----> origin server
client3 --> gate3 --/
It's just an nginx reverse proxy. Pass the real IP in the X-Real-IP header, etc. Easy.
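One thing to watch on the origin side: X-Real-IP is only meaningful when the connection actually came from one of your gates, since anyone else can send that header too. A minimal sketch of that check, assuming a Python origin app; the gate addresses and the function name are made up for illustration:

```python
import ipaddress

# Hypothetical gate addresses; replace with the real private IPs of the gates.
TRUSTED_GATES = {
    ipaddress.ip_address("10.0.0.11"),
    ipaddress.ip_address("10.0.0.12"),
}

def real_client_ip(peer_ip: str, headers: dict) -> str:
    """Return the client IP, trusting X-Real-IP only when the peer is a gate."""
    peer = ipaddress.ip_address(peer_ip)
    forwarded = headers.get("X-Real-IP")
    if peer in TRUSTED_GATES and forwarded:
        try:
            # Normalise and validate whatever the gate sent.
            return str(ipaddress.ip_address(forwarded.strip()))
        except ValueError:
            pass  # malformed header: fall back to the peer address
    return str(peer)
```

If the origin only accepts connections from the gates at the firewall level, this check is a second line of defence rather than the only one.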
Each gate has iptables and nginx rules to detect easy attacks (e.g. rate limiting). Importantly, they must all have SYNPROXY rules, an iptables feature of modern Linux kernels. In my experience, having SYNPROXY rules across several gateways like this completely defeats SYN flood attacks.
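To make the rate limiting idea concrete, here is a rough token-bucket sketch of what "limit each IP to N requests per second with some burst" means. This is only an illustration of the concept in Python, not how nginx or iptables implement it internally, and the class name and parameters are invented for the example:

```python
import time

class PerIpRateLimiter:
    """Token bucket per IP: `rate` tokens refill per second, up to `burst`."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self._buckets = {}  # ip -> (tokens, time of last update)

    def allow(self, ip: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # A new IP starts with a full bucket.
        tokens, last = self._buckets.get(ip, (self.burst, now))
        # Refill for the time elapsed since the last request, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[ip] = (tokens, now)
        return allowed
```

A real gate would also need to expire old entries from the table, which this sketch leaves out.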
The gateways need to be in an AWS VPC set to block all UDP traffic in the VPC's stateless traffic settings (the network ACLs). This completely blocks all UDP flood attacks. If you instead block UDP traffic in the gates' security groups, then very large UDP floods can still affect you. It has to be at the VPC level.
I found that the best way to set up the DNS to distribute traffic was like this. Assume that you have 4 gates, g1 through g4. Then using Route 53's weighted record feature, you would have the DNS return at random one of the following 4 pairs of IPs, each with a TTL of 5 minutes:
g1&g2
g2&g3
g3&g4
g4&g1
This seems to work better than just putting all of the gate IPs into one A record. I think that the randomization that should happen in that case actually gets cached at some point along the way, and so whichever record comes first in the cached answer gets hit harder.
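The pairing is just each gate together with its neighbour in a ring, so every gate appears in exactly two records. A small sketch of generating the pairs for any number of gates; the record dictionaries are a made-up shape for illustration, not the Route 53 API format:

```python
def ring_pairs(gates):
    """Pair each gate with the next one, wrapping around at the end."""
    if len(gates) < 2:
        raise ValueError("need at least two gates to form pairs")
    n = len(gates)
    return [(gates[i], gates[(i + 1) % n]) for i in range(n)]

def weighted_records(gates, ttl=300):
    """Build one equally weighted record per pair (hypothetical structure)."""
    return [
        {"set_id": f"{a}-{b}", "weight": 1, "ttl": ttl, "values": [a, b]}
        for a, b in ring_pairs(gates)
    ]
```

With 4 gates this gives the 4 pairs listed above, and the 5-minute TTL corresponds to `ttl=300`.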
Additionally, I had a system of classifying and blocking malicious-looking IPs, but it failed to work well enough in the end, so I'm not going to describe it in detail.
So that's my homebrew DDoS protection that we were using for the last year or so. It worked impressively well against many attacks which you might think would require something like Cloudflare, but failed in the end against attackers with thousands of IPs, making full TCP connections, who can blend into the legit traffic too well. A more complete solution which could replace Cloudflare etc. in many ways would look more like this:
-----
The first major flaw with my setup is that it wasn't easy to change. My setup would grab a few configuration details (e.g. the origin server IP) from VPC-local DNS records that I would set, but if I wanted to make deeper changes, I'd have to modify one of the instances, convert that into a new AMI, terminate all of the other instances, and then start new instances again. If I wanted to change the number of gates, I'd have to start/stop them manually and change the DNS records myself. A good solution would never require this much manual work, and would use things like auto scaling groups and CloudFormation to simplify it. It should only take a couple of minutes to add a new iptables rule, for example.
The second major flaw with my setup is that it lacked a good, systematic way of classifying IPs as good/bad/neutral. All of the gates should collect long-term stats on every IP which connects to them and contribute them to a central database. Using some sort of model over the data in the central IP database, the system should then be able to determine whether an IP address is probably good (because it's been acting like a normal person browsing the site for a long time), probably bad (e.g. because it just started requesting tons of pages), or unknown/neutral. Then based on that classification plus an idea of how busy the site currently is, it can block an IP, allow an IP, or insert a Cloudflare-style captcha challenge for an IP. If you pass the challenge, the system sets a cookie on you which whitelists you for several days.
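As a rough sketch of the shape this could take: one function turns per-IP stats into good/bad/neutral, and another combines that with current load to pick allow/challenge/block. Everything here is an assumption for illustration, including the stat fields, the thresholds, and the policy; a real model over the central database would be much more involved:

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    GOOD = "good"
    BAD = "bad"
    NEUTRAL = "neutral"

class Action(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"
    BLOCK = "block"

@dataclass
class IpStats:
    days_seen: int                      # distinct days with normal activity
    requests_last_minute: int           # current request rate
    typical_requests_per_minute: float  # long-term average while active

def classify(stats: IpStats) -> Verdict:
    # Placeholder thresholds; tune against real traffic.
    baseline = max(stats.typical_requests_per_minute, 1.0)
    if stats.requests_last_minute > 120 and stats.requests_last_minute > 10 * baseline:
        return Verdict.BAD       # sudden burst far above its own history
    if stats.days_seen >= 14:
        return Verdict.GOOD      # long record of normal browsing
    return Verdict.NEUTRAL

def decide(verdict: Verdict, load: float, has_pass_cookie: bool = False) -> Action:
    """load runs from 0.0 (idle) to 1.0 (saturated)."""
    if verdict is Verdict.BAD:
        return Action.BLOCK if load >= 0.5 else Action.CHALLENGE
    if verdict is Verdict.GOOD or has_pass_cookie:
        return Action.ALLOW      # cookie means a challenge was passed recently
    return Action.CHALLENGE if load >= 0.8 else Action.ALLOW
```

The point of taking load into account is that neutral IPs only get bothered with a challenge when the site is actually under pressure.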
For the forum to go back to a homebrew solution from Cloudflare, the above two pieces would need to be in place and working very well.
Another point is that you could design the system such that it does not require looking into HTTPS traffic. It can just work at the TCP layer and pass the encrypted HTTPS traffic verbatim. I'm not sure how exactly you would tunnel the real data to the real server (I previously thought that GRE tunnels would work, but somebody told me that this might not be the appropriate tool), but it should definitely be possible. The upside to this is that you can use a very powerful service like AWS without trusting them too much. The downside is that you cannot use layer 7 data for IP classification, and you cannot insert a challenge; it's either block or allow. The ideal anti-DDoS solution would give you the option of whether you want to give the gates access to your HTTPS or not.
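To illustrate what "pass the encrypted traffic verbatim" means, here is a minimal TCP relay sketch using Python's asyncio. It copies bytes in both directions without ever decrypting them. Note that it does not solve the tunneling question above: with a plain relay like this the origin sees the gate's IP, not the client's. The function names are invented for the example:

```python
import asyncio

async def pipe(reader, writer):
    """Copy bytes one way until EOF, then close the destination."""
    try:
        while True:
            chunk = await reader.read(65536)
            if not chunk:
                break
            writer.write(chunk)  # bytes are forwarded untouched
            await writer.drain()
    except ConnectionError:
        pass  # one side went away; just stop copying
    finally:
        writer.close()

async def handle_client(client_reader, client_writer, origin_host, origin_port):
    try:
        origin_reader, origin_writer = await asyncio.open_connection(
            origin_host, origin_port
        )
    except OSError:
        client_writer.close()
        return
    # Relay both directions at once until either side finishes.
    await asyncio.gather(
        pipe(client_reader, origin_writer),
        pipe(origin_reader, client_writer),
    )

async def start_relay(listen_host, listen_port, origin_host, origin_port):
    async def on_connect(reader, writer):
        await handle_client(reader, writer, origin_host, origin_port)
    return await asyncio.start_server(on_connect, listen_host, listen_port)
```

Since the gate only ever sees ciphertext here, any blocking decision has to be made from connection-level information such as the source IP and connection rate.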