Bitcoin Forum
November 25, 2025, 04:48:58 AM *
Poll
Question: Should I create translations of the website?
Yes - 25 (73.5%)
No - 9 (26.5%)
Total Voters: 34

Pages: « 1 2 3 4 5 6 7 8 9 [10]  All
Author Topic: Talksearch.io - Advanced Bitcointalk Search Engine  (Read 4304 times)
NotATether (OP)
Legendary
*
Offline Offline

Activity: 2156
Merit: 9108


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
October 20, 2025, 06:52:39 PM
Merited by joker_josue (1)
 #181

PSA: We are not affected by the Amazon Web Services global outage. Our infrastructure is hosted on Google Cloud and on dedicated providers.

Cheers!

.
 betpanda.io 
 
ANONYMOUS & INSTANT
.......ONLINE CASINO.......
▄███████████████████████▄
█████████████████████████
█████████████████████████
████████▀▀▀▀▀▀███████████
████▀▀▀█░▀▀░░░░░░▄███████
████░▄▄█▄▄▀█▄░░░█▄░▄█████
████▀██▀░▄█▀░░░█▀░░██████
██████░░▄▀░░░░▐░░░▐█▄████
██████▄▄█░▀▀░░░█▄▄▄██████
█████████████████████████
█████████████████████████
█████████████████████████
▀███████████████████████▀
▄███████████████████████▄
█████████████████████████
██████████▀░░░▀██████████
█████████░░░░░░░█████████
███████░░░░░░░░░███████
████████░░░░░░░░░████████
█████████▄░░░░░▄█████████
███████▀▀▀█▄▄▄█▀▀▀███████
██████░░░░▄░▄░▄░░░░██████
██████░░░░█▀█▀█░░░░██████
██████░░░░░░░░░░░░░██████
█████████████████████████
▀███████████████████████▀
▄███████████████████████▄
█████████████████████████
██████████▀▀▀▀▀▀█████████
███████▀▀░░░░░░░░░███████
██████░░░░░░░░░░░░▀█████
██████░░░░░░░░░░░░░░▀████
██████▄░░░░░░▄▄░░░░░░████
████▀▀▀▀▀░░░█░░█░░░░░████
████░▀░▀░░░░░▀▀░░░░░█████
████░▀░▀▄░░░░░░▄▄▄▄██████
█████░▀░█████████████████
█████████████████████████
▀███████████████████████▀
.
SLOT GAMES
....SPORTS....
LIVE CASINO
▄░░▄█▄░░▄
▀█▀░▄▀▄░▀█▀
▄▄▄▄▄▄▄▄▄▄▄   
█████████████
█░░░░░░░░░░░█
█████████████

▄▀▄██▀▄▄▄▄▄███▄▀▄
▄▀▄█████▄██▄▀▄
▄▀▄▐▐▌▐▐▌▄▀▄
▄▀▄█▀██▀█▄▀▄
▄▀▄█████▀▄████▄▀▄
▀▄▀▄▀█████▀▄▀▄▀
▀▀▀▄█▀█▄▀▄▀▀

Regional Sponsor of the
Argentina National Team
examplens
Legendary
*
Offline Offline

Activity: 3836
Merit: 4193


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
November 06, 2025, 11:34:07 AM
 #182

I've used TalkSearch a little and would suggest two improvements, if possible. Both relate to the search results.
First, I miss being able to sort search results by creation date and by the date of the last post in the topic. I'm not sure how the algorithm decides the order of the results, but they seem to be returned in random order.

Second, if possible, separate or at least mark archived or locked threads. It took me a long time to check each link, and many pointed to archived or locked threads, which were useless to me.
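
For illustration, the two requests above could be sketched roughly as follows. This is only a sketch: the `Result` class and its fields (`created`, `last_post`, `locked`) are hypothetical names chosen for the example, not TalkSearch's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Result:
    # Hypothetical search-result record; field names are assumptions.
    title: str
    created: datetime      # when the topic was started
    last_post: datetime    # when the newest post in the topic was made
    locked: bool = False   # True for locked or archived topics


def sort_results(results, key="last_post", newest_first=True):
    """Return results ordered by 'created' or 'last_post'."""
    if key not in ("created", "last_post"):
        raise ValueError(f"unsupported sort key: {key}")
    return sorted(results, key=lambda r: getattr(r, key), reverse=newest_first)


def split_open_and_locked(results):
    """Separate usable topics from locked/archived ones, keeping their order."""
    open_topics = [r for r in results if not r.locked]
    locked_topics = [r for r in results if r.locked]
    return open_topics, locked_topics
```

Since the whole post record is already available to the frontend, both functions could in principle run client-side without touching the index.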

NotATether (OP)
Legendary
*
Offline Offline

Activity: 2156
Merit: 9108


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
November 09, 2025, 11:20:33 AM
Last edit: November 09, 2025, 02:39:05 PM by NotATether
 #183

Quote
I've used TalkSearch a little and would suggest two improvements, if possible. Both relate to the search results.
First, I miss being able to sort search results by creation date and by the date of the last post in the topic. I'm not sure how the algorithm decides the order of the results, but they seem to be returned in random order.

Second, if possible, separate or at least mark archived or locked threads. It took me a long time to check each link, and many pointed to archived or locked threads, which were useless to me.


I hope to finally add vector embedding search within the next few days. It would require a complete re-index though since I have to add new fields.

Also the second request might be a good idea. I return the entire post data to the frontend anyway so it would be possible to do such a thing.



The Bitcointalk scraper is being restarted to use a lower post search threshold, which is expected to index posts twice as fast. There may be a minor disruption in indexing during the restart, but it should be brief.

Update 14:08:00 UTC: as expected, the ML server is causing a few errors with Talksearch that are preventing searches from completing successfully. Please be patient while I fix these.

Update 14:37:00 UTC: All errors have now been fixed. Working on vectorizing all the posts now. I really should've ordered this thing with 64GB memory.

NotATether (OP)
Legendary
*
Offline Offline

Activity: 2156
Merit: 9108


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
November 15, 2025, 06:56:10 AM
 #184

I'm still working on adding ML features to the search queries, but first I am scanning through the entire post index to identify and trash deleted posts. Talksearch does not currently detect when a post is deleted so I have to do this manually.

NotATether (OP)
Legendary
*
Offline Offline

Activity: 2156
Merit: 9108


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
November 20, 2025, 05:45:44 AM
 #185

Community: What is the fastest way to scan the forum to check for deleted posts given a set of post & topic ID pairs?

I'm not interested in edited posts, only deleted posts.

To check over 50 million posts, at an average speed of 1 post per 2.64 seconds, it will take 132 million seconds or over four years.

I need to do a one-time scan because, some time ago, I downloaded a post set that may or may not include posts that have since been deleted, and obviously I can't wait that long.

Even if I checked 20 posts per page, that would still take 2.5 months assuming no downtime.

I would like to be able to query all this information in only several days, or at worst a few weeks. The estimates above are way too long for me.
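
The estimates in this post can be reproduced with a few lines of arithmetic. This is just a sketch: `scan_time_days` is a made-up helper, and the inputs (50 million posts, 2.64 seconds per request, 20 posts per page) are the figures quoted above.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25


def scan_time_days(posts, seconds_per_request, posts_per_request=1):
    """Days needed to check `posts` posts at the given request rate."""
    requests = -(-posts // posts_per_request)  # ceiling division
    return requests * seconds_per_request / SECONDS_PER_DAY


# One request per post, as in the first estimate.
one_by_one = scan_time_days(50_000_000, 2.64)
# One request per 20-post page, as in the second estimate.
by_page = scan_time_days(50_000_000, 2.64, posts_per_request=20)

print(f"post by post: {one_by_one / DAYS_PER_YEAR:.1f} years")
print(f"page by page: {by_page:.0f} days")
```

Checking whole pages instead of single posts divides the request count by 20, which is where the drop from roughly four years to roughly two and a half months comes from.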

LoyceV
Legendary
*
Offline Offline

Activity: 3864
Merit: 20459


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
November 20, 2025, 07:45:31 AM
Merited by vapourminer (1)
 #186

Quote
Community: What is the fastest way to scan the forum to check for deleted posts given a set of post & topic ID pairs?
First scrape all board lists, that gives you a list of all topics that haven't been deleted yet. Then scrape all topics.

Quote
To check over 50 million posts, at an average speed of 1 post per 2.64 seconds, it will take 132 million seconds or over four years.
You're allowed one page request per second, so waiting 2.64 seconds isn't necessary. Maybe you can use "All" for topics with no more than 26 pages to get up to 500 posts at once, but Cloudflare will probably stop you from doing that.

Quote
Even if I checked 20 posts per page, that would still take 2.5 months assuming no downtime.
It took me several months (years back), but that included scraping non-existing topics because I hadn't thought of scraping the boards first.

Quote
I would like to be able to query all this information taking only several days, or at worst a few weeks.
Does it help to prioritize boards? Forget about the altcoin bounty boards, that should take off millions if not tens of millions of posts.
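
A minimal sketch of the boards-first approach combined with the one-request-per-second limit mentioned above. The `list_topics` and `fetch_topic` callbacks are hypothetical placeholders for whatever does the actual HTTP work; they are not part of any real Bitcointalk API.

```python
import time


class Throttle:
    """Space out calls so consecutive calls start at least `interval` seconds apart."""

    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been made yet

    def wait(self):
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()  # re-read in case the sleep overshot
        self._next_allowed = now + self.interval


def crawl(board_ids, list_topics, fetch_topic, throttle):
    """Visit the boards first, then only the topics those boards still list."""
    live_topics = []
    for board_id in board_ids:
        throttle.wait()
        live_topics.extend(list_topics(board_id))  # hypothetical callback

    pages = {}
    for topic_id in live_topics:
        throttle.wait()
        pages[topic_id] = fetch_topic(topic_id)    # hypothetical callback
    return pages
```

Because topics that no longer appear on any board are never requested, no time is spent on topics that have already been removed.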

¡uʍop ǝpᴉsdn pɐǝɥ ɹnoʎ ɥʇᴉʍ ʎuunɟ ʞool no⅄
NotATether (OP)
Legendary
*
Offline Offline

Activity: 2156
Merit: 9108


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
November 20, 2025, 10:37:08 AM
 #187

Quote
You're allowed one page request per second, so waiting 2.64 seconds isn't necessary. Maybe you can use "All" for topics with no more than 26 pages to get up to 500 posts at once, but Cloudflare will probably stop you from doing that.

It appears twice as slow because the actual indexer is also running in parallel. I just ran the deletion script on top of it and it's also running at the same speed.

Cloudflare would work, but I would have to embed a token of some kind.

Quote
Does it help to prioritize boards? Forget about the altcoin bounty boards, that should take off millions if not tens of millions of posts.

Not really. The deleted posts occur randomly across any board (though I suspect the gambling discussion board has a higher proportion of deleted posts).

LoyceV
Legendary
*
Offline Offline

Activity: 3864
Merit: 20459


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
November 20, 2025, 11:57:39 AM
 #188

Quote
The deleted posts occur randomly across any board (though I suspect the gambling discussion board has a higher proportion of deleted posts).
Have you considered modlog as a first hint of where to look for deleted posts? I don't think there's a foolproof way to catch all deleted posts. You could also prioritize more recent topics over older ones: there's no need to check this topic from 2010 every few weeks for deleted posts.

NotATether (OP)
Legendary
*
Offline Offline

Activity: 2156
Merit: 9108


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
November 23, 2025, 08:41:31 AM
Merited by LoyceV (4)
 #189

Quote
Have you considered modlog as a first hint of where to look for deleted posts? I don't think there's a foolproof way to catch all deleted posts. You could also prioritize more recent topics over older ones: there's no need to check this topic from 2010 every few weeks for deleted posts.

That may not be necessary anymore - theymos suggested I should use the sitemap.xml page. It's a little convoluted, but I managed to hack together a script that will check for deleted/updated posts.

There is one big pass I have to make first, before I can switch to small passes covering only the past day or two.

LoyceV
Legendary
*
Offline Offline

Activity: 3864
Merit: 20459


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
November 23, 2025, 09:20:34 AM
 #190

Quote
theymos suggested I should use the sitemap.xml page.
Is that https://bitcointalk.org/sitemap.php? I've seen it before, but wasn't sure how to use it (and later on couldn't find it again).

Note:
I can't post this the way I wanted, Bitcointalk turns it into the above:
[url=https://bitcointalk.org/sitemap.php]bitcointalk.org/sitemap.php[/url]?

Even code tags can't post the nobbc-code correctly, the above turns into this:
Code:
Is that https://bitcointalk.org/sitemap.php?

NotATether (OP)
Legendary
*
Offline Offline

Activity: 2156
Merit: 9108


Trêvoid █ No KYC-AML Crypto Swaps


View Profile WWW
November 23, 2025, 12:06:55 PM
Merited by LoyceV (12), joker_josue (2)
 #191

Quote
theymos suggested I should use the sitemap.xml page.
Is that https://bitcointalk.org/sitemap.php? I've seen it before, but wasn't sure how to use it (and later on couldn't find it again).

Yep, that's the one. Basically, he explained to me that the sitemap is organized into many smaller sitemaps, presumably to avoid generating a single giant sitemap. The main sitemap.php contains topic/page boundaries, where p= and o= denote the topic and the page offset inside the topic (such as .20), respectively.


The inner sitemap XML looks like this:

Code:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://bitcointalk.org/index.php?topic=178336.703420</loc>
<lastmod>2025-11-18T20:07:14+00:00</lastmod>
<changefreq>hourly</changefreq>
<priority>0.568</priority>
</url>
<url>
<loc>https://bitcointalk.org/index.php?topic=178336.703440</loc>
<lastmod>2025-11-19T02:41:11+00:00</lastmod>
<changefreq>hourly</changefreq>
<priority>0.412</priority>
</url>
<url>
<loc>https://bitcointalk.org/index.php?topic=178336.703460</loc>
<lastmod>2025-11-19T08:24:00+00:00</lastmod>
<changefreq>hourly</changefreq>
<priority>0.418</priority>
</url>
<url>
<loc>https://bitcointalk.org/index.php?topic=178336.703480</loc>
<lastmod>2025-11-19T15:14:25+00:00</lastmod>
<changefreq>always</changefreq>
<priority>0.478</priority>
</url>
<url>
<loc>https://bitcointalk.org/index.php?topic=178336.703500</loc>
<lastmod>2025-11-19T17:59:25+00:00</lastmod>
<changefreq>always</changefreq>
<priority>0.7</priority>
</url>
...

From this data, the most important fields are loc and lastmod. Together they tell you whether a page has changed because of an edit or a deletion. Deleted posts will cause subsequent pages to appear modified as well.

I guess the other two fields, changefreq and priority, can be used to gauge how frequently to check for updates, but I don't use them in my current implementation, since I am performing a full sweep over the sitemap first.
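
A rough sketch of how loc and lastmod could be used to pick pages for re-checking. This is not the actual Talksearch script. It assumes the inner sitemap is a complete <urlset> document shaped like the excerpt above, and `last_seen` is a hypothetical mapping from (topic, offset) to the lastmod value recorded on a previous pass.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from urllib.parse import urlparse, parse_qs

# Namespace declared on the <urlset> element of the sitemap.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_inner_sitemap(xml_text):
    """Yield (topic_id, offset, lastmod) for each <url> entry."""
    root = ET.fromstring(xml_text)
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        # loc looks like ...index.php?topic=178336.703420
        topic_value = parse_qs(urlparse(loc).query)["topic"][0]
        topic_id, _, offset = topic_value.partition(".")
        yield int(topic_id), int(offset or 0), datetime.fromisoformat(lastmod)


def pages_to_recheck(entries, last_seen):
    """Return (topic_id, offset) pairs that are new or modified since last_seen."""
    stale = []
    for topic_id, offset, lastmod in entries:
        known = last_seen.get((topic_id, offset))  # hypothetical stored timestamp
        if known is None or lastmod > known:
            stale.append((topic_id, offset))
    return stale
```

On the first full sweep `last_seen` would be empty, so every page is returned; later passes would only return pages whose lastmod has moved forward.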



This process is also slowing down regular indexing, because I am throttling the indexer to avoid breaching the rate limit. New posts should now appear instantly, though (only because I am using the recentposts page; this has nothing to do with the sitemap). Edits, however, will take much longer to show up until I finish crawling the pages linked by the sitemap.
