Community: What is the fastest way to scan the forum to check for deleted posts given a set of post & topic ID pairs?
First scrape all board lists; that gives you a list of all topics that haven't been deleted yet. Then scrape all topics.
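A rough sketch of how the board lists save work, assuming you already have the set of topic IDs seen on them and that a topic missing from every board list really is gone. The function and variable names are made up for illustration. Pairs in missing topics are marked deleted without any request, and the rest are grouped so each topic only gets scraped once.

```python
from collections import defaultdict

def triage_pairs(pairs, live_topic_ids):
    """Split (post_id, topic_id) pairs using the scraped board lists.

    Hypothetical helper: live_topic_ids is assumed to be the set of topic
    IDs collected while scraping the board lists.
    """
    gone = []                     # posts whose topic is on no board list
    to_scrape = defaultdict(set)  # topic_id -> post IDs still to check
    for post_id, topic_id in pairs:
        if topic_id in live_topic_ids:
            to_scrape[topic_id].add(post_id)
        else:
            gone.append((post_id, topic_id))
    return gone, dict(to_scrape)

# Made-up IDs, purely for illustration.
pairs = [(101, 1), (102, 1), (205, 2), (309, 3)]
gone, to_scrape = triage_pairs(pairs, live_topic_ids={1, 3})
print(gone)       # [(205, 2)]
print(to_scrape)  # {1: {101, 102}, 3: {309}}
```

Only the topics left in to_scrape need page requests, and each fetched page can be checked against all of that topic's post IDs at once.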
To check over 50 million posts, at an average speed of 1 post per 2.64 seconds, it will take 132 million seconds or over four years.
You're allowed one page request per second, so waiting 2.64 seconds isn't necessary.
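A minimal sketch of pacing requests at one per second instead of a fixed 2.64-second wait. The Throttle class is a hypothetical helper, not part of any library; it only sleeps for whatever is left of the interval, so time spent downloading and parsing counts toward the second.

```python
import time

class Throttle:
    """Allow at most one call per `interval` seconds (illustrative helper)."""

    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next request may start

    def wait(self):
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            # Sleep only for the remainder of the interval.
            self.sleep(self.next_allowed - now)
            now = max(self.clock(), self.next_allowed)
        self.next_allowed = now + self.interval

# Usage (fetch_page is a placeholder for your own request code):
#   throttle = Throttle(1.0)
#   for url in urls:
#       throttle.wait()
#       fetch_page(url)
```

If a page takes 0.4 seconds to fetch and parse, the next wait() sleeps about 0.6 seconds rather than a full second.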
Maybe you can use "All" for topics with no more than 26 pages to get up to 500 posts at once, but Cloudflare will probably stop you from doing that.
Even if I checked 20 posts per page, that would still take 2.5 months, assuming no downtime.
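The estimates in this thread can be reproduced with a few lines of Python. This is a best-case sketch: it assumes exactly 50 million posts, that every fetched page holds 20 posts you need, and no downtime. The last figure applies the one-request-per-second limit mentioned above.

```python
POSTS = 50_000_000       # "over 50 million posts" from this thread
POSTS_PER_PAGE = 20      # assumed: every page is full of posts you need
SECONDS_PER_DAY = 86_400

def days_needed(requests, seconds_per_request):
    """Days of continuous scraping for a given number of requests."""
    return requests * seconds_per_request / SECONDS_PER_DAY

one_by_one = days_needed(POSTS, 2.64)                      # one post per request
per_page_slow = days_needed(POSTS / POSTS_PER_PAGE, 2.64)  # 20 posts per request
per_page_fast = days_needed(POSTS / POSTS_PER_PAGE, 1.0)   # one request per second

print(f"{one_by_one / 365:.1f} years")  # 4.2 years
print(f"{per_page_slow:.0f} days")      # 76 days
print(f"{per_page_fast:.0f} days")      # 29 days
```

So page-level checking at one request per second lands at roughly a month in the best case, before skipping any boards.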
It took me several months (years back), but that included scraping non-existing topics because I hadn't thought of scraping the boards first.
I would like to be able to query all this information in only several days, or at worst a few weeks.
Does it help to prioritize boards? Forget about the altcoin bounty boards; that should take off millions, if not tens of millions, of posts.