In this case it may not have mattered. Cloudflare was having a really bad morning today. Got calls from a few clients that they couldn't reach their own sites from their own offices, even though those offices are in the IP whitelist.
Having your IP(s) in it can't hurt, but when Cloudflare themselves are having problems I don't think it's going to matter.
If it's a large problem on Cloudflare's side, I would have expected them to fix it in much less than 8.5 hours. Considering how widely used they are, that's terrible.
I think a more reliable solution would be to have purpose-built scraping endpoints that take an API key. The endpoints could be defined in such a way as to allow checksum-based data reconciliation. That way, whenever there are outages (Cloudflare or otherwise), the missing data could be correctly retrieved after the fact, and there would never be any doubt as to whether you're one-to-one with the master data.
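A minimal sketch of how that reconciliation could work, assuming a hypothetical /posts/checksums endpoint that returns a map of post IDs to SHA-256 digests (the base URL, parameter names, and auth scheme here are all invented for illustration):

```python
# Sketch: checksum-based reconciliation against a hypothetical scraping API.
# The server returns {post_id: sha256(post_text)} for a range of IDs; the
# client re-fetches only the posts whose digests differ from its local copy.
import hashlib
import requests

API_KEY = "your-key-here"              # hypothetical per-scraper API key
BASE = "https://forum.example/api"     # placeholder, not a real endpoint

def local_checksum(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def stale_post_ids(local_posts: dict[int, str]) -> list[int]:
    """Return IDs that are missing locally or whose checksum mismatches."""
    resp = requests.get(
        f"{BASE}/posts/checksums",
        params={"from": min(local_posts), "to": max(local_posts)},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    resp.raise_for_status()
    remote = resp.json()               # e.g. {"12345": "ab12...", ...}
    return [
        int(pid) for pid, digest in remote.items()
        if local_checksum(local_posts.get(int(pid), "")) != digest
    ]
```

After an outage, the scraper would run this over the affected ID range and re-fetch just the mismatches, so there's never any guesswork about whether the mirror matches the forum.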
Currently, some of the "master data" doesn't exist: if a user edits their post within 10 minutes, the original is lost from the forum.
Allowing scrapers has already made all forum posts and other data freely searchable, so unless that changes, I see no reason not to implement an API for people like loyce. It would probably make sense to charge for said access.
New access structure for loyce.club: unedited post, $2; viewing a Trust list, $2; notifications, $0.50 each. Somehow, charging fees for non-commercial community projects doesn't seem right.
My scraper got whitelisted! Thanks theymos
Whitelisting is, as theymos called it, a "janky" solution. Downloading "All", for instance, still gives:
ERROR 503: Service Temporarily Unavailable.
I don't need "All", so that's okay. Let's see if this works the next time Cloudflare kills bots.
Ninjastic.space is whitelisted too.