Bitcoin Forum
April 26, 2024, 04:59:41 PM *
Author Topic: Users that wiped all their posts. Or another project for LoyceV  (Read 463 times)
DaveF (OP)
Legendary
*
Offline Offline

Activity: 3458
Merit: 6235


Crypto Swap Exchange


View Profile WWW
November 07, 2019, 03:52:22 PM
 #1

So I noticed yesterday or the day before that some posts were missing in one of the threads I was in.
Did not think much about it, could have been the user, could have been a mod.

Today I noticed that another thread was shorter and it looks like the user deleted a bunch of their posts.

https://bitcointalk.org/index.php?action=profile;u=2668562;sa=showPosts

So, is there a way to generate a list of what was scraped vs. what is there?

Not important, just curious.

-Dave

JohnBitCo
Sr. Member
****
Offline Offline

Activity: 2030
Merit: 356


View Profile
November 07, 2019, 04:36:11 PM
 #2

So I noticed yesterday or the day before some posts were missing in one of threads I was in.
Did not think much about it, could have been the user, could have been a mod.

Today I noticed that another thread was shorter and it looks like the user deleted a bunch of their posts.

https://bitcointalk.org/index.php?action=profile;u=2668562;sa=showPosts

So, is there a way to generate a list of what was scraped vs. what is there?

Not important, just curious.

-Dave

This user, HardwalletAttacker1, recently changed the password and deleted all of his previous posts. I think he is someone who bought this account and is trying to clear its previous post history.
However, the account is already tagged for giving false technical explanations.
Quote
Do not trust this user's explanation of technical details. He prefers to spew nonsense rather than learn how things actually work.
Deathwing
Legendary
*
Offline Offline

Activity: 1638
Merit: 1328


Stultorum infinitus est numerus


View Profile WWW
November 07, 2019, 05:06:33 PM
 #3

Unless there is a public database that logs every edit to every post, I don't think it's really possible. (The forum does record that a post has been edited, hence the "Last edit" timestamp.) The alternative, of course, is to literally archive every single post ever made on Bitcointalk. If anyone has several terabytes of storage lying around, that might actually be cool. If anyone codes a system like this, I might be able to provide some storage.
DaveF (OP)
Legendary
*
Offline Offline

Activity: 3458
Merit: 6235


Crypto Swap Exchange


View Profile WWW
November 07, 2019, 05:09:48 PM
 #4

Unless there is a public database where it logs every edit for every post. (While it does check if it has been edited, hence the last edit timestamp) I don't think it's really that possible. Unless, of course, you literally archive every single post ever on Bitcointalk. If anyone has several terabytes of storage lying around, that might actually be cool. If anyone codes a system like this, I might be able to provide some storage though.

Loyce is trying to:

https://bitcointalk.org/index.php?topic=5167469.0

Which was why I was asking him about the scrapes.

-Dave

Deathwing
Legendary
*
Offline Offline

Activity: 1638
Merit: 1328


Stultorum infinitus est numerus


View Profile WWW
November 07, 2019, 05:19:24 PM
Merited by LoyceV (1)
 #5

Unless there is a public database where it logs every edit for every post. (While it does check if it has been edited, hence the last edit timestamp) I don't think it's really that possible. Unless, of course, you literally archive every single post ever on Bitcointalk. If anyone has several terabytes of storage lying around, that might actually be cool. If anyone codes a system like this, I might be able to provide some storage though.

Loyce is trying to:

https://bitcointalk.org/index.php?topic=5167469.0

Which was why I was asking him about the scrapes.

-Dave

Oh, I did not know this. Looking into it, I see that Loyce already has a somewhat working system in place. If you check this link you should be able to access all the posts of that specific account. Loyce mentioned that his webhost ran out of space, though, which might be the reason there are gaps between the time periods for that account. You can't see whether a post has been edited or what it was changed to, but you can see the original post.
The Sceptical Chymist
Legendary
*
Online Online

Activity: 3318
Merit: 6800


Cashback 15%


View Profile
November 07, 2019, 05:28:27 PM
 #6

Don't know if it's possible (though I'm pretty sure LoyceV will prevail), but I'd be interested to see the results.  There would probably be a lot of Newbie accounts that tried to scam and then tried to cover their tracks, but I'd like to see if there are any older members whose names I haven't seen in a long time.  I think there were a couple of them a while back who were deleting most if not all of their posts.

Too bad there's no option on the forum to delete your account.  I'm not sure why that is.

LoyceV
Legendary
*
Offline Offline

Activity: 3290
Merit: 16555


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
November 07, 2019, 05:54:26 PM
 #7

Loyce is trying to:

https://bitcointalk.org/index.php?topic=5167469.0

Which was why I was asking him about the scrapes.
As shown in that link, I have 3 versions: all posts (updated every minute), per topic (no automated updates yet due to lack of time for testing) and per user (also no automated topics yet). I'm currently running an update for both.

You'll have to check the post numbers to be sure, but I think my data is pretty complete for the past couple of months.

If you check this link you should be able to access all the posts of that specific account. Although Loyce mentioned that his webhost ran out of space. That might be the reason there are gaps between the time periods of that acc. Although you can't see whether it has been edited and changed into what, you can see the original post.
In the "early days" of my scraper, I may have missed some posts, but I think the user just didn't post during those gaps (for instance, from Oct 9 to Oct 16). If he made any posts after Oct 25, they'll show up in a while: I think it takes around 12 hours to update.

Unfortunately it's not possible to detect deleted posts, so I can't highlight them without scraping everything again.

Pmalek
Legendary
*
Offline Offline

Activity: 2744
Merit: 7104



View Profile
November 09, 2019, 01:37:12 PM
 #8

His profile shows that he has 35 posts, but when you check his latest posts only 6 are still there. Those are from threads he created, and he can't delete those. BPIP, on the other hand, shows he has 0 posts. Why the difference on BPIP?

Most of the content he posted in his OPs is still visible because it was quoted somewhere in the threads.
I remember his name, he was very active in Development & Technical Discussion and Bitcoin Technical Support.


PrimeNumber7
Copper Member
Legendary
*
Offline Offline

Activity: 1610
Merit: 1899

Amazon Prime Member #7


View Profile
November 10, 2019, 12:39:48 AM
 #9

Unless there is a public database where it logs every edit for every post. (While it does check if it has been edited, hence the last edit timestamp) I don't think it's really that possible. Unless, of course, you literally archive every single post ever on Bitcointalk. If anyone has several terabytes of storage lying around, that might actually be cool. If anyone codes a system like this, I might be able to provide some storage though.
You could log every post up to some time x and scrape every account's profile, then move x forward and re-scrape any profile whose post count doesn't match the number of posts you have scraped for it, until the two numbers match. This calibrates how many posts each person should have against their actual posts.

After the above is done, you can continuously scrape new posts and profiles to confirm that a user's post count has increased by one for each new post they have made. If not, their post history can be scraped and checked against their existing posts in your DB. This subsequent scrape doesn't need to be saved; you only need to update your DB with which of their posts were deleted when you find them.
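
The reconciliation described above could be sketched roughly as follows. This is only an illustration in Python: `UserArchive`, `needs_rescrape` and `mark_deleted` are hypothetical names, and the sketch assumes the scraper already has the post IDs from a user's post history and the post count shown on their profile.

```python
from dataclasses import dataclass, field


@dataclass
class UserArchive:
    """Locally archived state for one user (hypothetical structure)."""
    post_ids: set = field(default_factory=set)     # every post ID ever scraped
    deleted_ids: set = field(default_factory=set)  # IDs later found to be missing


def needs_rescrape(archive, profile_post_count):
    """True when the profile count disagrees with the posts we believe are live."""
    live = len(archive.post_ids - archive.deleted_ids)
    return live != profile_post_count


def mark_deleted(archive, current_ids):
    """Compare a fresh post-history scrape against the archive.

    Newly seen IDs are added to the archive; archived IDs that no longer
    appear are flagged as deleted. Only IDs are compared here, so the fresh
    scrape's post bodies do not need to be stored.
    """
    archive.post_ids |= current_ids
    newly_deleted = archive.post_ids - current_ids - archive.deleted_ids
    archive.deleted_ids |= newly_deleted
    return newly_deleted
```

A mismatch reported by `needs_rescrape` would trigger one post-history scrape for that user, and `mark_deleted` would then record which archived posts have disappeared.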


As shown in that link, I have 3 versions: all posts (updated every minute), per topic (no automated updates yet due to lack of time for testing) and per user (also no automated topics yet). I'm currently running an update for both.
This is horribly inefficient. You should be able to scrape once, and run various queries for each of your three copies of what you are saving.

I believe I remember reading that you are not a programmer, nor know any programming languages so I am curious who is helping you with your project.
LoyceV
Legendary
*
Offline Offline

Activity: 3290
Merit: 16555


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
November 10, 2019, 06:53:58 AM
 #10

As shown in that link, I have 3 versions: all posts (updated every minute), per topic (no automated updates yet due to lack of time for testing) and per user (also no automated topics yet). I'm currently running an update for both.
This is horribly inefficient. You should be able to scrape once, and run various queries for each of your three copies of what you are saving.
I only scrape the data once, but have separate processes for each "category".

Quote
I believe I remember reading that you are not a programmer, nor know any programming languages so I am curious who is helping you with your project.
Just me :)

suchmoon
Legendary
*
Offline Offline

Activity: 3654
Merit: 8909


https://bpip.org


View Profile WWW
November 10, 2019, 05:54:41 PM
Merited by Deathwing (1)
 #11

Unless there is a public database where it logs every edit for every post. (While it does check if it has been edited, hence the last edit timestamp) I don't think it's really that possible. Unless, of course, you literally archive every single post ever on Bitcointalk. If anyone has several terabytes of storage lying around, that might actually be cool. If anyone codes a system like this, I might be able to provide some storage though.

Average post length is ~1000 bytes. At 50 million posts it would be around 50GB, maybe up to 100GB if you want to go crazy with fancy text indexing. Not terabytes.
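
The arithmetic behind that estimate, written out as a quick Python check. The two input figures are the ones given above; the factor of 2 for indexing overhead is an assumption chosen only to reproduce the "up to 100GB" upper bound.

```python
AVG_POST_BYTES = 1_000     # ~1000 bytes per post, as estimated above
TOTAL_POSTS = 50_000_000   # 50 million posts, as stated above
INDEX_OVERHEAD = 2.0       # assumption: text indexing roughly doubles the size

raw_gb = AVG_POST_BYTES * TOTAL_POSTS / 1e9   # bytes -> GB (decimal)
indexed_gb = raw_gb * INDEX_OVERHEAD

print(f"raw text: {raw_gb:.0f} GB, with indexing: {indexed_gb:.0f} GB")
```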

However, it is technically impossible to capture edits without continuously scraping the whole post history of every active user. AFAIK there is no public indication of an edit anywhere except the timestamp on a post inside a thread, so using that timestamp would mean rescraping every thread. Assuming that users have to log in to edit their posts, it would probably be easier to go by their profile "last active" timestamp and scrape only the post histories of active users. This could miss some moderator edits, though.

Determining which posts have been deleted is also impossible without massive rescraping. Post counts can't be relied upon due to some boards that don't count posts (SD/IT).

Although most of the time that's not really an issue. Usually a question about deleted (or edited) posts arises when there's a suspicion about a specific user and then that user can be checked e.g. against LoyceV's archive.
Deathwing
Legendary
*
Offline Offline

Activity: 1638
Merit: 1328


Stultorum infinitus est numerus


View Profile WWW
November 10, 2019, 06:23:19 PM
 #12

Unless there is a public database where it logs every edit for every post. (While it does check if it has been edited, hence the last edit timestamp) I don't think it's really that possible. Unless, of course, you literally archive every single post ever on Bitcointalk. If anyone has several terabytes of storage lying around, that might actually be cool. If anyone codes a system like this, I might be able to provide some storage though.

Average post length is ~1000 bytes. At 50 million posts it would be around 50GB, maybe up to 100GB if you want to go crazy with fancy text indexing. Not terabytes.

However it is technically impossible to capture edits without continuously scraping the whole post history of every active user. AFAIK there is no public indication of an edit anywhere except the timestamp on a post inside a thread, so using that timestamp would mean rescraping every thread. Assuming that users have to log in order to edit their posts it would probably be easier to go by their profile "last active" timestamp and scrape only post histories of active users. This could miss some moderator edits though.

Determining which posts have been deleted is also impossible without massive rescraping. Post counts can't be relied upon due to some boards that don't count posts (SD/IT).

Although most of the time that's not really an issue. Usually a question about deleted (or edited) posts arises when there's a suspicion about a specific user and then that user can be checked e.g. against LoyceV's archive.

Aren't moderator edits rare anyway? I honestly think "Assuming that users have to log in order to edit their posts it would probably be easier to go by their profile "last active" timestamp and scrape only post histories of active users" is a very viable option. Assuming that every post is recorded, when a user's last activity changes, the scraper can fetch that user's recent posts (let's say the last 25) and compare them to the copies already stored in the database. Any difference means an edit. However, there would probably need to be some sort of "edited post" call for this, since otherwise just refreshing a page pretty much updates last activity too.
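
A minimal Python sketch of that comparison step, assuming the archive stores one SHA-256 fingerprint per post ID and that the scraper has just fetched the current text of a user's most recent posts. The function names and the dictionary layout are hypothetical; fetching the posts is left out entirely.

```python
import hashlib


def fingerprint(text):
    """Stable fingerprint of a post body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_recent_posts(archived, fetched):
    """Compare freshly fetched posts against archived fingerprints.

    archived: dict mapping post ID -> fingerprint stored earlier
    fetched:  dict mapping post ID -> current post text (e.g. the last 25 posts)
    Returns the IDs whose text changed and the IDs not archived yet.
    """
    edited, new = [], []
    for post_id, text in fetched.items():
        old = archived.get(post_id)
        if old is None:
            new.append(post_id)
        elif old != fingerprint(text):
            edited.append(post_id)
    return {"edited": sorted(edited), "new": sorted(new)}
```

Storing fingerprints instead of comparing full texts keeps the check cheap, but it only says that a post changed, not what it changed from, so the original text still has to be kept in the archive.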
suchmoon
Legendary
*
Offline Offline

Activity: 3654
Merit: 8909


https://bpip.org


View Profile WWW
November 10, 2019, 07:03:32 PM
 #13

viable option

Well, it's better than constantly rescraping 50 million posts but you would still need to rescrape nearly 3 million user profiles.

Another shortcut would be to rescrape only users who post something new (going by the "recent" page) but that would miss a few if someone edits something and never posts again. Probably rare though.
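
That shortcut amounts to building a rescrape queue from the "recent" page. A rough Python sketch, assuming the scraper can already turn that page into `(post_id, user_id)` pairs (the function name and the pair format are hypothetical):

```python
def users_to_rescrape(recent_posts, seen_post_ids):
    """Pick the users whose post history should be rescraped.

    recent_posts:  iterable of (post_id, user_id) pairs from the "recent" page
    seen_post_ids: set of post IDs handled on earlier runs (updated in place)
    Returns user IDs in first-seen order, without duplicates.
    """
    queue = []
    queued = set()
    for post_id, user_id in recent_posts:
        if post_id in seen_post_ids:
            continue  # already handled on a previous pass
        seen_post_ids.add(post_id)
        if user_id not in queued:
            queued.add(user_id)
            queue.append(user_id)
    return queue
```

As noted above, a user who edits or deletes something and never posts again would never enter this queue.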
Deathwing
Legendary
*
Offline Offline

Activity: 1638
Merit: 1328


Stultorum infinitus est numerus


View Profile WWW
November 10, 2019, 07:11:53 PM
 #14

viable option

Well, it's better than constantly rescraping 50 million posts but you would still need to rescrape nearly 3 million user profiles.

Another shortcut would be to rescrape only users who post something new (going by the "recent" page) but that would miss a few if someone edits something and never posts again. Probably rare though.


I mean it would probably be easier to implement something similar to seclog that just tracks every edit: if a person edits one of their posts, it gets posted there. If not that, there could be a similar log that just tracks "post" calls, and someone who's scraping could monitor it instead.
friends1980
Legendary
*
Offline Offline

Activity: 1582
Merit: 1059


nutildah-III / NFT2021-04-01


View Profile
November 11, 2019, 11:12:34 AM
Last edit: November 11, 2019, 01:00:27 PM by friends1980
 #15

I guess it's better when they delete the posts themselves. It means that we don't have to waste our time and energy reporting them, and the result is the same.

I mentioned before that the sad thing about reporting and "cleaning up" someone's spammy mess, is the fact that after everything has been reported and deleted, those guys' profiles look like they've never spammed in their life. Roll Eyes

hosseinimr93
Legendary
*
Online Online

Activity: 2380
Merit: 5213



View Profile
November 11, 2019, 11:29:01 AM
Merited by friends1980 (1)
 #16

I mean it would probably be easier to implement something similar to seclog where it just tracks every edit. If a person edits one of their posts, it just gets posted there. If not, there can be a similar thing that just tracks "post" calls and someone whose scraping can just monitor it instead.
Theymos is the only person who can implement this. I don't think LoyceV or any other user can provide such data unless all posts are tracked one by one.

I mentioned before that the sad thing about reporting and "cleaning up" someone's spammy mess, is the fact that after everything has been reported and deleted, those guys' profiles look like they've never spammed in their life. Roll Eyes
That's why I already suggested, in another thread, showing the number of recent posts deleted by moderators on profiles.

LoyceV
Legendary
*
Offline Offline

Activity: 3290
Merit: 16555


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
November 11, 2019, 11:31:03 AM
 #17

those guys' profiles look like they've never spammed in their life. Roll Eyes
BPIP has a wall of shame.

Deathwing
Legendary
*
Offline Offline

Activity: 1638
Merit: 1328


Stultorum infinitus est numerus


View Profile WWW
November 11, 2019, 01:36:46 PM
 #18

I mean it would probably be easier to implement something similar to seclog where it just tracks every edit. If a person edits one of their posts, it just gets posted there. If not, there can be a similar thing that just tracks "post" calls and someone whose scraping can just monitor it instead.
Theymos is the only person who can implement this. I don't think LoyceV or any other user can provide such data unless all posts are tracked one by one.

That is what I meant. It is probably easier for theymos to code and create this, so that all the patrollers and scripters can access it and use it to locate edits faster and more efficiently. You can scrape posts, but that's not going to scale well.
friends1980
Legendary
*
Offline Offline

Activity: 1582
Merit: 1059


nutildah-III / NFT2021-04-01


View Profile
November 12, 2019, 09:20:21 PM
 #19

those guys' profiles look like they've never spammed in their life. Roll Eyes
BPIP has a wall of shame.

I had no idea. This is damned brilliant, mate :D I think I've noticed some names I've been reporting in the last months, which is quite funny. :P
