Bitcoin Forum
January 29, 2020, 02:10:25 PM *
Author Topic: How to get all posts through "recent"?  (Read 272 times)
LoyceV
Legendary
*
Offline Offline

Activity: 1750
Merit: 5641


Most of loyce.club works again


View Profile WWW
October 01, 2019, 02:05:33 PM
Last edit: October 01, 2019, 02:38:36 PM by LoyceV
Merited by redsn0w (2), hatshepsut93 (1)
 #1

While scraping recent, I noticed I missed some posts. My logs show this:
Quote
Downloading recent.html
1. userID: 819696 - username: Hypnosis00 - msgID: 52615289
2. userID: 2286354 - username: FrequencyRules058 - msgID: 52615288
3. userID: 662400 - username: kzv - msgID: 52615287
4. userID: 1226689 - username: phoen - msgID: 52615285
5. userID: 93751 - username: ltcdice - msgID: 52615284
6. userID: 947291 - username: Polar91 - msgID: 52615283
7. userID: 2480302 - username: Bullrunking - msgID: 52615282
8. userID: 543165 - username: citronick - msgID: 52615281
9. userID: 2294946 - username: reena024 - msgID: 52615280
10. userID: 1000199 - username: krogothmanhattan - msgID: 52615279
The post ending on 86 (this post) is missing. I missed another post from the same thread too. I don't have the board on ignore, and some other posts in the same thread show up as expected.

It's missing from halfway down the recent page, and the same post was also missing when I downloaded the page a few seconds earlier and later. That means the post really was missing from the page, which makes me think it's a bug in "recent".
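
The gap in the log above can be found mechanically. This is only a minimal sketch, assuming msgIDs are handed out as consecutive integers; a missing ID can also mean the post was deleted or sits on a board the scraper cannot see. The function name `find_missing_ids` is made up for illustration.

```python
def find_missing_ids(seen_ids):
    """Return the msgIDs between the lowest and highest seen ID that never showed up."""
    seen = set(seen_ids)
    if not seen:
        return []
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


# msgIDs copied from the log above
logged = [52615289, 52615288, 52615287, 52615285, 52615284,
          52615283, 52615282, 52615281, 52615280, 52615279]

print(find_missing_ids(logged))  # [52615286]
```

Running this over each downloaded copy of the page gives a list of candidate IDs to look up by hand.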

suchmoon
Legendary
*
Offline Offline

Activity: 2212
Merit: 4591


https://bpip.org


View Profile WWW
October 01, 2019, 02:32:37 PM
 #2

Doesn't "recent" show only the most recent post in each thread? Not sure where I got that from but I thought that's how it worked.

LoyceV
Legendary
*
Offline Offline

Activity: 1750
Merit: 5641


Most of loyce.club works again


View Profile WWW
October 01, 2019, 02:34:58 PM
Last edit: October 03, 2019, 02:31:49 PM by LoyceV
 #3

Doesn't "recent" show only the most recent post in each thread?
You're right! Mind blown :O

I never knew that. I'll edit the title to my new question: how do I get all posts? This messes up my data projects.

o_e_l_e_o
Legendary
*
Offline Offline

Activity: 826
Merit: 3449


Decent


View Profile
October 01, 2019, 03:25:15 PM
 #4

Are you sure? See posts 26 and 30 in the screenshot below - both show up for me in recent, both in the same thread (Here: https://bitcointalk.org/index.php?topic=5174107.msg52617047#msg52617047).





Edit: Confirmed I can see this post and suchmoon's test post below both on recent at the same time, albeit on different pages.

suchmoon
Legendary
*
Offline Offline

Activity: 2212
Merit: 4591


https://bpip.org


View Profile WWW
October 01, 2019, 03:26:55 PM
 #5

test

Edit: it looks like I was wrong, sorry LoyceV for confusing you. I got the same result as o_e_l_e_o. No idea then why you missed those posts.

hosseinimr93
Hero Member
*****
Offline Offline

Activity: 868
Merit: 584


First 100% Liquid Stablecoin Backed by Gold


View Profile
October 01, 2019, 03:42:48 PM
 #6

As far as I know, all posts should be shown in "recent".
These numbers are the IDs of missed posts in http://loyce.club/archive/posts/5259/ and http://loyce.club/archive/posts/5260/
52590233
52591100
52591174
52591179
52591311
52591721
52592748
52597731
52598319
52598892
52602024
52602357
52604597
52607589
It seems that there is a bug. It could be in Loyce.club or in Bitcointalk.

suchmoon
Legendary
*
Offline Offline

Activity: 2212
Merit: 4591


https://bpip.org


View Profile WWW
October 01, 2019, 03:49:38 PM
 #7

These numbers are the IDs of missed posts in http://loyce.club/archive/posts/5259/ and http://loyce.club/archive/posts/5260/

Some of those might be missing legitimately - e.g. quickly deleted, or posted on an invisible board, for example:

52591179
52591721
52592748
52598319
52598892
52602024
52602357

All others seem to exist in the WO thread, except this one in a different thread:

https://bitcointalk.org/index.php?topic=5026942.msg52597731#msg52597731
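
The split described above is plain set arithmetic. A small sketch using only the IDs listed in this thread; the variable names are made up for illustration, and "not publicly visible" covers both the quickly deleted and the invisible-board cases without telling them apart.

```python
# IDs reported as missing from the loyce.club archive
reported_missing = {
    52590233, 52591100, 52591174, 52591179, 52591311, 52591721, 52592748,
    52597731, 52598319, 52598892, 52602024, 52602357, 52604597, 52607589,
}

# IDs that may be missing legitimately (deleted or on an invisible board)
not_publicly_visible = {
    52591179, 52591721, 52592748, 52598319, 52598892, 52602024, 52602357,
}

# The one remaining post that is in a different thread
other_thread = {52597731}

# Whatever is left should be the posts in the WO thread
wall_observer = reported_missing - not_publicly_visible - other_thread

print(sorted(wall_observer))
print(len(reported_missing), len(not_publicly_visible), len(wall_observer))  # 14 7 6
```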

hosseinimr93
Hero Member
*****
Offline Offline

Activity: 868
Merit: 584


First 100% Liquid Stablecoin Backed by Gold


View Profile
October 01, 2019, 04:17:11 PM
 #8

Some of those might be missing legitimately - e.g. quickly deleted, or posted on an invisible board, for example:
Invisible boards?
Which boards are invisible? Are there some boards that are only visible to moderators?

All others seem to exist in the WO thread, except this one in a different thread:
Do you mean this thread?
So, there is a bug. Am I right?
All of the posts in this thread should be shown in "Recent" too.

May I know how you found this post knowing only the msgID?
Post links contain the topic number too.

Halab
Staff
Hero Member
*****
Offline Offline

Activity: 840
Merit: 562



View Profile
October 01, 2019, 05:07:33 PM
Merited by LoyceV (1), hosseinimr93 (1)
 #9

Some of those might be missing legitimately - e.g. quickly deleted, or posted on an invisible board, for example:
[...]
52602024
52602357

I didn't check the other ids, but these two are the last 2 posts in the Staff forum.

Invisible boards?
Which boards are invisible? Are there some boards that are only visible to moderators?

Yes, there is a special board for the Staff. Another one for the VIPs. And maybe other boards, but I don't have access to these ones :)
And a special one for the April Fool's Day ideas, Theymos takes this very seriously :)

Ok, I'm lying for the last one.

suchmoon
Legendary
*
Offline Offline

Activity: 2212
Merit: 4591


https://bpip.org


View Profile WWW
October 01, 2019, 05:46:59 PM
Merited by LoyceV (1), hosseinimr93 (1)
 #10

May I know how you found this post knowing only the msgID?

Quote a post - any post, doesn't matter. Then in the URL that looks like this:

Code:
https://bitcointalk.org/index.php?action=post;quote=52617659;topic=5189156.0;num_replies=8;sesc=...

Replace the number after "quote=" with the ID of the post you're looking for. You'll get the post quoted in the text box and then you can use the user's post history and the contents of the post to find it. Note that the link in the quote is not valid, e.g. if you click Preview and click the link it won't go to the correct thread.
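
The URL edit itself is easy to script. A minimal sketch, assuming the URL keeps the `quote=<number>` form shown above; the helper name `quote_url_for` is made up, and the elided `sesc=...` value is left exactly as it is, since it comes from your own logged-in session and is not something to guess.

```python
import re


def quote_url_for(template_url, msg_id):
    """Swap the number after 'quote=' for msg_id and leave the rest of the URL untouched."""
    new_url, count = re.subn(r"(?<=quote=)\d+", str(msg_id), template_url, count=1)
    if count == 0:
        raise ValueError("no 'quote=<number>' parameter found in URL")
    return new_url


# Template copied from the post above; 'sesc=...' is elided in the original too
template = ("https://bitcointalk.org/index.php?action=post;quote=52617659;"
            "topic=5189156.0;num_replies=8;sesc=...")

print(quote_url_for(template, 52597731))
```

This only builds the link. Opening it still needs a logged-in browser session, and finding the thread from the quoted text remains a manual step.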

theymos
Administrator
Legendary
*
Offline Offline

Activity: 3640
Merit: 7423


View Profile
October 01, 2019, 10:47:49 PM
 #11

All posts you can see should be listed there, though due to database concurrency limitations, ones made in the last few seconds might not show up, even if others before/after them do.

Note that if you don't need to get posts ASAP, it may be more easy and efficient for you to use https://bitcointalk.org/sitemap.php. All of the last-modification times are accurate to within a couple of hours.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
LoyceV
Legendary
*
Offline Offline

Activity: 1750
Merit: 5641


Most of loyce.club works again


View Profile WWW
October 03, 2019, 03:43:47 PM
 #12

Are you sure?
No, apparently I wasn't sure. I tested it again, and all 5 test-posts in this thread showed up in "recent".

I checked 40,000 of my scraped posts, and I have:
9990/10000
9996/10000
9996/10000
9993/10000

That means it must have been a coincidence that I missed 2 posts in the Wall Observer thread in a short time span, right at the moment I was testing my scraper there.

I can live with missing less than 0.1% of all posts (some of the missing posts are on hidden boards; I only know of the VIP and Staff boards), and Investigations is excluded.
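
For reference, the "less than 0.1%" figure follows directly from the four batch counts above. A quick sketch of the arithmetic, with made-up variable names:

```python
# (posts found, posts expected) for each batch of 10,000 checked above
batches = [(9990, 10000), (9996, 10000), (9996, 10000), (9993, 10000)]

found = sum(f for f, _ in batches)
total = sum(t for _, t in batches)
missed = total - found

print(missed, total)            # 25 40000
print(f"{missed / total:.4%}")  # 0.0625%
```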

May I know how you found this post knowing only the msgID?
Quote a post - any post, doesn't matter. Then in the URL that looks like this:

Code:
https://bitcointalk.org/index.php?action=post;quote=52617659;topic=5189156.0;num_replies=8;sesc=...

Replace the number after "quote=" with the ID of the post you're looking for.
That's a neat trick! But it's difficult to automate, so I can't really use it to check for missing posts.

All posts you can see should be listed there, though due to database concurrency limitations, ones made in the last few seconds might not show up, even if others before/after them do.
Thanks. If it's a known limitation, I'll just let it be :)

Quote
Note that if you don't need to get posts ASAP, it may be more easy and efficient for you to use https://bitcointalk.org/sitemap.php. All of the last-modification times are accurate to within a couple of hours.
That page looks different in Firefox and in Chrome, but I can't really figure out what I'm looking at.

hosseinimr93
Hero Member
*****
Offline Offline

Activity: 868
Merit: 584


First 100% Liquid Stablecoin Backed by Gold


View Profile
October 03, 2019, 06:31:28 PM
 #13

Most of the missed posts are in the Wall Observer thread.
25 out of those 40,000 posts were missed.
14 of those 25 posts are in hidden threads, so we can say that 11 out of 40,000 posts were missed. 9 of those 11 missed posts are in the Wall Observer thread. That's 82% of the missed posts.
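
The breakdown above, worked through as a short sketch. The three input counts are the ones stated in this post; the variable names are made up for illustration.

```python
missed_total = 25          # missed out of the 40,000 checked posts
missed_hidden = 14         # of those, posts in hidden threads
missed_wall_observer = 9   # visible missed posts in the Wall Observer thread

missed_visible = missed_total - missed_hidden
share = missed_wall_observer / missed_visible

print(missed_visible)    # 11
print(f"{share:.0%}")    # 82%
```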

theymos
Administrator
Legendary
*
Offline Offline

Activity: 3640
Merit: 7423


View Profile
October 03, 2019, 09:14:05 PM
 #14

That page looks different in Firefox and in Chrome, but I can't really figure out what I'm looking at.

It's an XML sitemap file. Search engines use that file to keep up-to-date on forum posts. It's designed for computers to process, not humans; different browsers display it differently.
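
For a scraper, reading such a file could look roughly like this. It is a sketch under assumptions: it uses the standard sitemaps.org `urlset` layout, and the sample entry (including its `lastmod` value) is made up for illustration. The real output of sitemap.php may instead be a sitemap index that points to further files, which would need one more level of fetching.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample, NOT copied from the real sitemap
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://bitcointalk.org/index.php?topic=5189156.0</loc>
    <lastmod>2019-10-03T21:14:05+00:00</lastmod>
  </url>
</urlset>
"""


def parse_sitemap(xml_text):
    """Return a list of (url, lastmod) pairs; lastmod is None when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries


for loc, lastmod in parse_sitemap(SAMPLE):
    print(lastmod, loc)
```

Comparing each `lastmod` against the time of the previous run would show which topics changed since then, within the couple of hours of accuracy mentioned earlier in the thread.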

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD