Bitcoin Forum
May 05, 2024, 02:19:37 PM *
Topic: How to get all posts through "recent"?
LoyceV (OP)
Legendary
Activity: 3304
Merit: 16596
Thick-Skinned Gang Leader and Golden Feather 2021
October 01, 2019, 02:05:33 PM
Last edit: October 01, 2019, 02:38:36 PM by LoyceV
Merited by redsn0w (2), hatshepsut93 (1)
 #1

While scraping recent, I noticed I missed some posts. My logs show this:
Quote
Downloading recent.html
1. userID: 819696 - username: Hypnosis00 - msgID: 52615289
2. userID: 2286354 - username: FrequencyRules058 - msgID: 52615288
3. userID: 662400 - username: kzv - msgID: 52615287
4. userID: 1226689 - username: phoen - msgID: 52615285
5. userID: 93751 - username: ltcdice - msgID: 52615284
6. userID: 947291 - username: Polar91 - msgID: 52615283
7. userID: 2480302 - username: Bullrunking - msgID: 52615282
8. userID: 543165 - username: citronick - msgID: 52615281
9. userID: 2294946 - username: reena024 - msgID: 52615280
10. userID: 1000199 - username: krogothmanhattan - msgID: 52615279
The post ending in 86 (this post) is missing. I missed another post from the same thread too. I don't have the board on ignore, and some other posts in the same thread show up as expected.

It's missing from halfway down the recent page, and the same post is also missing from the copies I downloaded a few seconds earlier and later. That means the post was really missing from the page, which makes me think it's a bug in "recent".
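
The gap in the log above can also be found mechanically. This is a minimal sketch, assuming msgIDs are handed out sequentially across the forum (the log suggests this, but nothing in the thread confirms it); `missing_msg_ids` is a hypothetical helper name. A gap is only a candidate: the post may have been deleted or made on a board the scraper cannot see.

```python
import re

# Log lines copied from the quoted scraper output above.
LOG = """\
1. userID: 819696 - username: Hypnosis00 - msgID: 52615289
2. userID: 2286354 - username: FrequencyRules058 - msgID: 52615288
3. userID: 662400 - username: kzv - msgID: 52615287
4. userID: 1226689 - username: phoen - msgID: 52615285
5. userID: 93751 - username: ltcdice - msgID: 52615284
6. userID: 947291 - username: Polar91 - msgID: 52615283
7. userID: 2480302 - username: Bullrunking - msgID: 52615282
8. userID: 543165 - username: citronick - msgID: 52615281
9. userID: 2294946 - username: reena024 - msgID: 52615280
10. userID: 1000199 - username: krogothmanhattan - msgID: 52615279
"""

MSG_ID = re.compile(r"msgID:\s*(\d+)")


def missing_msg_ids(log_text):
    """Return the msgIDs absent from the contiguous range the log covers."""
    seen = {int(m) for m in MSG_ID.findall(log_text)}
    if not seen:
        return []
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


print(missing_msg_ids(LOG))  # [52615286]
```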

suchmoon
Legendary
Activity: 3654
Merit: 8922
https://bpip.org
October 01, 2019, 02:32:37 PM
 #2

Doesn't "recent" show only the most recent post in each thread? Not sure where I got that from but I thought that's how it worked.
LoyceV (OP)
Legendary
Activity: 3304
Merit: 16596
October 01, 2019, 02:34:58 PM
Last edit: October 03, 2019, 02:31:49 PM by LoyceV
 #3

Quote
Doesn't "recent" show only the most recent post in each thread?
You're right! Mind blown :O

I never knew that. I'll edit the title to my new question: how do I get all posts? This messes up my data projects.

o_e_l_e_o
In memoriam
Legendary
Activity: 2268
Merit: 18509
October 01, 2019, 03:25:15 PM
 #4

Are you sure? See posts 26 and 30 in the screenshot below - both show up for me in recent, both in the same thread (Here: https://bitcointalk.org/index.php?topic=5174107.msg52617047#msg52617047).





Edit: Confirmed I can see this post and suchmoon's test post below both on recent at the same time, albeit on different pages.
suchmoon
Legendary
Activity: 3654
Merit: 8922
October 01, 2019, 03:26:55 PM
 #5

test

Edit: it looks like I was wrong, sorry LoyceV for confusing you. I got the same result as o_e_l_e_o. No idea then why you missed those posts.
hosseinimr93
Legendary
Activity: 2394
Merit: 5235
October 01, 2019, 03:42:48 PM
 #6

As far as I know all posts should be shown in "recent".
These numbers are the IDs of missed posts in http://loyce.club/archive/posts/5259/ and http://loyce.club/archive/posts/5260/
52590233
52591100
52591174
52591179
52591311
52591721
52592748
52597731
52598319
52598892
52602024
52602357
52604597
52607589
It seems that there is a bug. It could be in Loyce.club or in Bitcointalk.

suchmoon
Legendary
Activity: 3654
Merit: 8922
October 01, 2019, 03:49:38 PM
 #7

Quote
These numbers are the IDs of missed posts in http://loyce.club/archive/posts/5259/ and http://loyce.club/archive/posts/5260/

Some of those might be missing legitimately - e.g. quickly deleted, or posted on an invisible board, for example:

52591179
52591721
52592748
52598319
52598892
52602024
52602357

All others seem to exist in the WO thread, except this one in a different thread:

https://bitcointalk.org/index.php?topic=5026942.msg52597731#msg52597731
hosseinimr93
Legendary
Activity: 2394
Merit: 5235
October 01, 2019, 04:17:11 PM
 #8

Quote
Some of those might be missing legitimately - e.g. quickly deleted, or posted on an invisible board, for example:
Invisible boards?
Which boards are invisible? Are there some boards that are only visible to moderators?

Quote
All others seem to exist in the WO thread, except this one in a different thread:
Do you mean this thread?
So, there is a bug. Am I right?
All of the posts in this thread should be shown in "Recent" too.

May I ask how you found this post knowing only the msgID?
Post links contain the topic number too.

Halab
Staff
Legendary
Activity: 2408
Merit: 2021
I find your lack of faith in Bitcoin disturbing.
October 01, 2019, 05:07:33 PM
Merited by LoyceV (1), hosseinimr93 (1)
 #9

Quote
Some of those might be missing legitimately - e.g. quickly deleted, or posted on an invisible board, for example:
[...]
52602024
52602357

I didn't check the other IDs, but these two are the last 2 posts in the Staff forum.

Quote
Invisible boards?
Which boards are invisible? Are there some boards that are only visible to moderators?

Yes, there is a special board for the Staff, and another one for the VIPs. There may be other boards too, but I don't have access to those :)
And a special one for April Fool's Day ideas; Theymos takes this very seriously :)

OK, I'm lying about the last one.

suchmoon
Legendary
Activity: 3654
Merit: 8922
October 01, 2019, 05:46:59 PM
Merited by LoyceV (1), hosseinimr93 (1)
 #10

Quote
May I ask how you found this post knowing only the msgID?

Quote a post - any post, doesn't matter. Then in the URL that looks like this:

Code:
https://bitcointalk.org/index.php?action=post;quote=52617659;topic=5189156.0;num_replies=8;sesc=...

Replace the number after "quote=" with the ID of the post you're looking for. You'll get the post quoted in the text box and then you can use the user's post history and the contents of the post to find it. Note that the link in the quote is not valid, e.g. if you click Preview and click the link it won't go to the correct thread.
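
The substitution step can be sketched in a few lines. This only rewrites the URL text; `quote_url_for` is a hypothetical helper name, the elided `sesc=...` value is left exactly as posted, and actually opening the result presumably still requires a logged-in browser session.

```python
import re


def quote_url_for(template_url, msg_id):
    """Swap the number after 'quote=' in a quote-reply URL for msg_id."""
    new_url, count = re.subn(r"(?<=quote=)\d+", str(msg_id), template_url, count=1)
    if count == 0:
        raise ValueError("no 'quote=<number>' parameter found in URL")
    return new_url


# URL as posted above; the sesc value was elided in the original post.
template = ("https://bitcointalk.org/index.php?action=post;quote=52617659;"
            "topic=5189156.0;num_replies=8;sesc=...")

# 52597731 is one of the msgIDs discussed earlier in the thread.
print(quote_url_for(template, 52597731))
```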
theymos
Administrator
Legendary
Activity: 5194
Merit: 12972
October 01, 2019, 10:47:49 PM
 #11

All posts you can see should be listed there, though due to database concurrency limitations, ones made in the last few seconds might not show up, even if others before/after them do.
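
If that limitation is the cause, one possible workaround (an assumption, not something proposed in this thread) is to scrape overlapping snapshots of "recent" and merge them by msgID, so that a post absent from one snapshot can still be picked up by a later one. The `(msg_id, username)` tuple shape, the function name, and the sample IDs and usernames below are all made up for illustration.

```python
def merge_snapshots(snapshots):
    """Union several scrapes of "recent", keyed by msgID.

    Each snapshot is a list of (msg_id, username) tuples. The first
    occurrence of a msgID wins; later duplicates are ignored.
    """
    merged = {}
    for snapshot in snapshots:
        for msg_id, username in snapshot:
            merged.setdefault(msg_id, username)
    # Return the posts ordered by msgID.
    return dict(sorted(merged.items()))


# Made-up data: post 102 is absent from the first scrape but present later.
first = [(103, "alice"), (101, "bob")]
later = [(104, "carol"), (103, "alice"), (102, "dave"), (101, "bob")]

print(list(merge_snapshots([first, later])))  # [101, 102, 103, 104]
```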

Note that if you don't need to get posts ASAP, it may be easier and more efficient for you to use https://bitcointalk.org/sitemap.php. All of the last-modification times are accurate to within a couple of hours.

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
LoyceV (OP)
Legendary
Activity: 3304
Merit: 16596
October 03, 2019, 03:43:47 PM
 #12

Quote
Are you sure?
No, apparently I wasn't sure. I tested it again, and all 5 test posts in this thread showed up in "recent".

I checked 40,000 of my scraped posts, and I have:
9990/10000
9996/10000
9996/10000
9993/10000

That means it must have been a coincidence that I missed 2 posts in the Wall Observer thread in a short time span, right at the moment I was testing my scraper there.

I can live with missing less than 0.1% of all posts (some of the missing posts are on hidden boards, of which I only know the VIP and Staff boards, and Investigations is excluded).

Quote
May I ask how you found this post knowing only the msgID?
Quote a post - any post, doesn't matter. Then in the URL that looks like this:

Code:
https://bitcointalk.org/index.php?action=post;quote=52617659;topic=5189156.0;num_replies=8;sesc=...

Replace the number after "quote=" with the ID of the post you're looking for.
That's a neat trick! But it's difficult to automate, so I can't really use it to check for missing posts.

Quote
All posts you can see should be listed there, though due to database concurrency limitations, ones made in the last few seconds might not show up, even if others before/after them do.
Thanks. If it's a known limitation, I'll just let it be :)

Quote
Note that if you don't need to get posts ASAP, it may be easier and more efficient for you to use https://bitcointalk.org/sitemap.php. All of the last-modification times are accurate to within a couple of hours.
That page looks different in Firefox and in Chrome, but I can't really figure out what I'm looking at.

hosseinimr93
Legendary
Activity: 2394
Merit: 5235
October 03, 2019, 06:31:28 PM
 #13

Most of the missed posts are in the Wall Observer thread.
25 out of those 40,000 posts have been missed.
14 out of those 25 posts are in hidden threads, so we can say that 11 out of 40,000 posts have been missed. 9 out of those 11 missed posts are in the Wall Observer thread. That's 82% of the missed posts.
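
These figures can be re-derived from the per-batch counts LoyceV posted in #12. A small sketch of the arithmetic; the variable names are illustrative, and the hidden and Wall Observer counts are taken from this post as stated.

```python
# Posts found per batch of 10,000, as reported in post #12.
found_per_batch = [9990, 9996, 9996, 9993]
batch_size = 10_000

total = batch_size * len(found_per_batch)
missed = total - sum(found_per_batch)

hidden = 14         # missed posts in hidden threads (figure from this post)
wall_observer = 9   # remaining missed posts in the Wall Observer thread

unexplained = missed - hidden

print(missed, f"{missed / total:.4%}")                     # 25 0.0625%
print(unexplained, f"{wall_observer / unexplained:.0%}")   # 11 82%
```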

theymos
Administrator
Legendary
Activity: 5194
Merit: 12972
October 03, 2019, 09:14:05 PM
 #14

Quote
That page looks different in Firefox and in Chrome, but I can't really figure out what I'm looking at.

It's an XML sitemap file. Search engines use that file to keep up-to-date on forum posts. It's designed for computers to process, not humans; different browsers display it differently.
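
For scripted use, an XML sitemap can be read with Python's standard library. This is a rough sketch that assumes the file follows the standard sitemaps.org `<urlset>` layout; the thread does not show its actual structure, and it could instead be a sitemap index pointing at further files. The sample XML and the name `parse_sitemap` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample in the standard layout, not the forum's real data.
SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/index.php?topic=1.0</loc>
    <lastmod>2019-10-01T14:05:33+00:00</lastmod>
  </url>
  <url>
    <loc>https://example.com/index.php?topic=2.0</loc>
  </url>
</urlset>
"""


def parse_sitemap(xml_text):
    """Return (loc, lastmod) pairs; lastmod is None when the tag is absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries


for loc, lastmod in parse_sitemap(SAMPLE):
    print(lastmod, loc)
```

Comparing each `lastmod` value against the time of the previous run would show which topics changed since then, within the couple-of-hours accuracy mentioned earlier.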
