Title: [Active] Finding spam and scams by keyword
Post by: LoyceV on August 13, 2020, 02:42:47 PM

The data
See loyce.club/badposts (https://loyce.club/badposts/)
All categories (https://loyce.club/badposts/) / spam (https://loyce.club/badposts/spam.html) / scam (https://loyce.club/badposts/scam.html) / other (https://loyce.club/badposts/other.html) / advertising (https://loyce.club/badposts/advertising.html) / email (https://loyce.club/badposts/email.html)
Most recent posts are shown first.

Whitelisted users
Code: LoyceV

Post keywords
Whitelisted users can post keywords. Please don't use very common words (such as "scam") that trigger too many false positives. And please keep all keywords in one code tag in one post: edit it to add/remove keywords.
Other users can also post, with 2 possible outcomes:

Tips
Leave out "https" from scam links.

Format
Within code-tags, post either "scam:thiswebsiteisascam" or "spam:Ikeeppostingthesamecrapeverywhere". One phrase per line. See this example (https://bitcointalk.org/index.php?topic=5268608.msg54988216#msg54988216).
Remove the line to exclude the keyword in the next update.
Only use a space after "scam:" if you want to include a space in your search.
Keyword "other:" is meant for text that isn't necessarily spam or a scam, but needs highlighting nonetheless.
Keyword "advertising" is meant for sites that are often used to copy an article and create a backlink.
Keyword categories:
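
The format rules in this post (one "category:phrase" per line, a space after the colon counts as part of the phrase, "https" left out of links) could be sketched as a small parser. This is only an illustration in Python, not the script that actually runs on loyce.club: the function name `parse_keywords`, the hard-coded category set, and the choice to strip a leading "http(s)://" automatically are all assumptions made for the example.

```python
import re

# Categories as listed on loyce.club/badposts; hard-coding them is an assumption.
CATEGORIES = {"scam", "spam", "other", "advertising", "email"}


def parse_keywords(code_block: str) -> list[tuple[str, str]]:
    """Parse 'category:phrase' lines, one phrase per line (hypothetical helper)."""
    keywords = []
    for line in code_block.splitlines():
        if ":" not in line:
            continue  # not a keyword line
        category, phrase = line.split(":", 1)
        category = category.strip().lower()
        if category not in CATEGORIES:
            continue  # unknown category, skip the line
        # A space after the colon is kept on purpose: it is part of the search.
        # Assumption: drop a leading scheme, following the "leave out https" tip.
        phrase = re.sub(r"^https?://", "", phrase, flags=re.IGNORECASE)
        if phrase.strip():
            keywords.append((category, phrase))
    return keywords
```

For example, a line such as "other: moran" keeps its leading space in the phrase, while "scam:https://github.com/pillforeth" is stored without the scheme.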

Features
I search the last ~200,000 posts for new keywords. This covers approximately 1 month.
I update all keywords every 20 minutes.
I show all matches for all usernames and ranks. No exceptions. Not all of them are bad.
Whitelisted users are shown in green.

Limitations
There's a maximum of 60 keywords per post (if you add more, those will be ignored). I can increase this if needed.
There's a maximum of 4000 matches per category. If there are more, the oldest are removed.
The minimum keyword length is 4 (for other) or 5 (for spam/scam) characters.
I also search quotes.

Report posts!
This list is only useful if someone actually reports the bad posts :)

Post removal
I want to keep this topic compact (so I can quickly scrape it many times). That means I'll delete almost all posts that don't contain a list from a Whitelisted user.
I would say "I hope I'm not offending anyone", but really, I'm okay with that (https://bitcointalk.org/index.php?topic=5213722.msg53482650#msg53482650) :P

Q&A
What are you trying to accomplish with this thread?
See around here (https://bitcointalk.org/index.php?topic=4720640.msg54980111#msg54980111) :)

Will this look into titles?
For now: nope :(

Some types of scams involve posting very little content in the OP body but then go on to include important keywords in the title.
I don't keep track of titles with this data.

This could be cross-checked with your list of banned accounts.
Thanks, I've added it.

Can I use it for searching for alts? Like links to twitter, facebook, telegram usernames, etc?
My plan was to only look back about 100k posts (currently just over 2 weeks), so it won't really help you here. But it would (near) instantly add new posts, and that's what I'm aiming for here.

Can I use it for catching plagiarism, like searching for a whole sentence, or will this clog the server? Or maybe only phrases and not so common words, like we did in the SpamBuster club with suchmoon?
Searching all my data without a database takes too long to do on a regular basis. You should try TryNinja's database (https://bitcointalk.org/index.php?topic=5248878.0) though!

Would it catch spoofed urls? www.bitcointalk.org (https://9gag.com)
Without "www", the url turns into 9gag.com. I think theymos should give "www.bitcointalk.org" the same treatment.
I've added category "other" for things like "www.bitcointalk.org".

It shows every word which has keyword "moron" or "moran" inside:
I search for the exact phrase (case insensitive), so it matches anything. You can add a space in front of it (as you did already), but that might miss some matches too. It's more or less as intended: if I change this, it might overlook other matches.
umoran
Is it supposed to work like this?

Quote
https://prnt.sc/u0l039
I search my deleted posts (https://bitcointalk.org/index.php?topic=5167469.0), I can't search live posts this way.
https://bitcointalk.org/index.php?topic=5260507.msg54802273#msg54802273
I believe topic is trashed.

Can we also include the reasoning behind why something is a scam? Perhaps a link to a thread that explains it?
It can be explained in the post in this topic. I don't want to add repetitive explanations to my badposts (https://loyce.club/badposts/) page.

"the eth pill stuff" has malware https://bitcointalk.org/index.php?topic=5182222.msg54876299#msg54876299
It would be great if you can add this to your earlier post :)

Please don't quote code-tags.
Please don't quote code-tags.
You should write that on the first row of the OP.

Each 15 minute update takes about 1 second to process. Each new keyword takes a few minutes to process: reading all 200,000 posts is slow. Processing several new keywords at once is more efficient, so feel free to add them :)

spam:minepi.com/
I'm trying to improve searching for whole words only. I now remove the trailing slash ("/") from the keyword before searching.
I don't think it matters for your keywords, but it can improve other strings.

scam:github.com/pillforeth/
other: moran
Try without the spaces now :)

other: moron
other: moron
I've tried with a space in front and at the end, it didn't find anything. I have also tried with a space in front, it also didn't find anything.
You were trying to adjust for my old search, right while I was adjusting it to improve matching complete words only.

Quote
This is too much for me...I am going to drink a beer and think about "drink another beer finding " "ej" space in the back" for another beer.
Enjoy!

scam:https://github.com/pillforethereum/ETHpillAN/
There are many of those altcoin pills nowadays.
You should probably omit the "https://"-part, a scammer can do the same.

How far back does your search go?
See:
I search the last ~200,000 posts for new keywords. This covers approximately 1 month.
This search takes about 2 minutes for new keywords. It's mainly meant to catch new posts, older posts can be found through other means.

Quote
My examples come up multiple times in the BTC search box, but not via this thread's search?
It might also be because I search unedited posts.

Any thoughts why this post and user weren't caught today?
The post was edited, see the unedited post (https://loyce.club/archive/posts/5503/55039179.html).
Unfortunately, I can't know which posts have been edited, so this is a loophole to escape my badposts (https://loyce.club/badposts/) list.

I think you should just make another link/html file that contains all cointelegraph spammers.
I did already, see loyce.club/badposts/advertising.html (https://loyce.club/badposts/advertising.html).

Quote
I don't know if it would be a hard task to migrate or create another file
I can easily add new categories.

Quote
Or should I suggest that each spam website should have its own html file
That's too much work to check. I now just quickly check a page once in a while, and report a few posts.

@Loyce can you please remove one of these keywords :
Thanks, done:
github_com/ProjectEthereumPill
github_com/ProjectEthereumPill/EthereumPill/
The following overlapping keywords have been removed:
github.com/ProjectEthereumPill/EthereumPill
github.com/pillforethereum/ETHpillAN (because of github.com/pillforeth)
I also noticed the "Banned" notification is only useful for new keywords, because my banned list (https://bitcointalk.org/index.php?topic=5092983.0) is only updated once a day. I ran a one-time update from scratch, searching the last ~800,000 posts. This updates the banned-status on older posts.

Hope I did it right.
HYIP and MLM are too short, the minimum word length is 5 for scams.

@LoyceV did you delete the old info? The scam link (https://loyce.club/badposts/scam.html) displays only 22 archived posts.
I do have logs, but it's too many lines to search now. I think someone must have entered a keyword with many hits, then removed the keyword again. My "badposts" only shows the latest 4000 posts each time I update it, but when a keyword is removed, all those entries are removed too.
I've manually reset it to re-check all keywords in the last ~200k posts. This restored a longer list again.

Is ETC officially a scam? I see it in the blacklist words as a scam.
That is debatable... However, the "scam : ETChash" keyword just helps catch posts like this one (https://loyce.club/archive/posts/5680/56802387.html) that have malware download links.

I think the keyword "PhoenixMiner" is too general and gives a lot of false positives...
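
The matching behaviour described in this Q&A (case-insensitive search for the exact phrase, the trailing slash removed from the keyword, the minimum keyword lengths from the Limitations section, and overlapping keywords dropped in favour of the shorter one) could be sketched roughly as follows. This is a hedged Python illustration rather than the actual script behind loyce.club/badposts: all function names are hypothetical, the default minimum length of 5 for categories not mentioned in the post is an assumption, and so is measuring the length after the slash has been removed.

```python
# Minimum keyword lengths from the Limitations section; the default of 5
# for categories not mentioned there is an assumption.
MIN_LENGTH = {"other": 4, "spam": 5, "scam": 5}
DEFAULT_MIN_LENGTH = 5


def normalize(phrase: str) -> str:
    """Remove trailing slashes and lowercase for case-insensitive search."""
    return phrase.rstrip("/").lower()


def long_enough(category: str, phrase: str) -> bool:
    """Check the keyword against the minimum length for its category."""
    return len(normalize(phrase)) >= MIN_LENGTH.get(category, DEFAULT_MIN_LENGTH)


def drop_overlapping(phrases: list[str]) -> list[str]:
    """Keep only phrases that do not contain another, shorter phrase."""
    kept: list[str] = []
    # Shortest first, so a short keyword is kept before the longer ones it covers.
    for phrase in sorted({normalize(p) for p in phrases}, key=lambda p: (len(p), p)):
        if not any(shorter in phrase for shorter in kept):
            kept.append(phrase)
    return kept


def matches(phrase: str, post_text: str) -> bool:
    """Plain substring match, so "moran" also matches inside "umoran"."""
    return normalize(phrase) in post_text.lower()
```

With the keywords from this thread, `drop_overlapping` keeps "github.com/pillforeth" and drops "github.com/pillforethereum/ETHpillAN", and `long_enough("scam", "HYIP")` is false. The substring match is the simple variant; matching whole words only, which the thread mentions as work in progress, is not covered here.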

I made a new toy: [Newbie scrutiny instead of jail] Every new user's first post: loyce.club/patrol (https://bitcointalk.org/index.php?topic=5298940.0):
See loyce.club/patrol/ (https://loyce.club/patrol/)
Please Report (or Merit) the posts when needed ;)
It's updated once a minute.
Sample: https://loyce.club/other/patrol.png

Title: Re: Work in progress: finding spam and scams by keyword
Post by: LoyceV on August 13, 2020, 02:44:12 PM
Code: scam:trustcoin.exchange
Feel free to explain the reason why you add a certain phrase in your post. Just don't put it inside the code-tag.
Explanation provides a free bonus of (https://archive.is/fi0zw)
For advertising, see this user's post history (https://bitcointalk.org/index.php?action=profile;u=9645;sa=showPosts).
For 1HZwkjkeaoZfTSaJxDw6aKkxp45agDiEzN, see this topic (https://bitcointalk.org/index.php?topic=5300127.0).
Binarium, see this post (https://bitcointalk.org/index.php?topic=5182222.msg63661587#msg63661587).
Removed from my list:
Quote
advertising:cointelegraph.com/news
advertising:coindesk.com

Title: Re: Work in progress: finding spam and scams by keyword
Post by: Rizzrack on August 13, 2020, 03:47:41 PM
Code: spam:kintum.io/
github.com/ethpillan => was malware (https://bitcointalk.org/index.php?topic=5182222.msg54876299#msg54876299) (I said "was" because the repo is currently deleted, but maybe it can find some deleted posts with that bad link)

Title: Re: Work in progress: finding spam and scams by keyword
Post by: Lafu on August 13, 2020, 08:43:46 PM
Code: spam:unlimited-hash.com
Edited

Title: Re: Work in progress: finding spam and scams by keyword
Post by: Timelord2067 on August 15, 2020, 01:43:21 AM
I'm just trying to get this clear in my head, so if you will permit me to try out a couple of test samples?
Code:
Thanks. (If I'm reading this right, I just edit this one post - yes?)
Timeline:
Other:
20/08/20 Initial list Humbertin + hellow (the latter used by the former)
06/10/20 Added grunch while investigating spike420211/TrevorS/bitcoinst
21/10/2020 Changed category to "email" for a few entries.

Title: Re: [Active] Finding spam and scams by keyword
Post by: morvillz7z on August 19, 2020, 05:30:21 PM
Code: scam:github.com/pillforethereum/ETHpillAN/
Changelog:
10/3/2020 - https_://github.com/EthereumPill/PillForETH/
10/8/2020 - https_://github.com/EthereumPillProject/EthereumPill/
10/12/2020 - https_://github.com/ProjectEthereumPill/EthereumPill/
10/19/2020 - https_://github.com/ProjectEthereum/EthereumPill/
10/22/2020 - https_://github.com/ProjectPill/

Title: Re: [Active] Finding spam and scams by keyword
Post by: TheBeardedBaby on August 26, 2020, 07:55:36 AM
Let's see if we can catch some word spinners with this tool :)
Code: other:conversant

Title: Re: [Active] Finding spam and scams by keyword
Post by: NotATether on September 05, 2020, 12:34:13 AM
I want to test this tool against a known (https://bitcointalk.org/index.php?topic=5271084.msg55132912#msg55132912) spammer (https://bitcointalk.org/index.php?topic=5272976.msg55132826#msg55132826).
Code: spam:TradingView Social Bump: should I remove the cointelegraph link? There are so many of those, it fills the list, and I haven't reported them anyway.
I think you should, in order to improve the signal-to-noise ratio for the spam list. It makes it harder to search for posts with the other keywords.

Title: Re: [Active] Finding spam and scams by keyword
Post by: actmyname on September 05, 2020, 10:09:11 AM
Code: spam:>agree,

Title: Re: [Active] Finding spam and scams by keyword
Post by: nutildah on September 16, 2020, 09:07:02 PM
Hope I did it right.
Code: scam:HYIP
I noticed that the news-related tags get used far more often than the others... my suggestion would be to take off coindesk and cointelegraph, but leave on the ones that come across as desperate spammers, like coinidol. There are probably a lot of people posting coindesk links out of genuine interest who aren't advertising for them on purpose.

Title: Re: [Active] Finding spam and scams by keyword
Post by: hosseinimr93 on November 14, 2020, 02:28:26 PM
Recently, I've reported more than 30 posts including the links below.
Their posts get deleted. They make new accounts and spam again and again.
Code: spam:247sports.com

Title: Re: [Active] Finding spam and scams by keyword
Post by: Mitchell on November 15, 2020, 06:35:43 PM
Code: spam:coca-colascholarsfoundation.org

Title: Re: [Active] Finding spam and scams by keyword
Post by: Bthd on November 19, 2020, 08:59:49 PM
Code: spam:fairspin

Title: Re: [Active] Finding spam and scams by keyword
Post by: NotATether on September 24, 2021, 08:10:50 AM
Code: spam:PhoenixMiner

Title: Re: [Active] Finding spam and scams by keyword
Post by: LoyceV on March 26, 2022, 08:21:19 AM
2 month bump
3 month bump
Code: spam:-fortnite-
@NotATether: can you merge your 2 posts in this topic into one?

Title: Re: [Active] Finding spam and scams by keyword
Post by: light_warrior on July 17, 2023, 03:56:32 PM
Several users are spamming ShibaMemu. I found 24 posts containing ShibaMemu spam. Some posts have already been removed using reports to the moderators. The user wrangler26 (https://bitcointalk.org/index.php?action=profile;u=2882623) is spamming the most.
Code: spam:ShibaMemu
P.S. Sorry if I wrote something wrong.

Title: Re: [Active] Finding spam and scams by keyword
Post by: LoyceV on July 17, 2023, 04:25:39 PM
P.S. Sorry if I wrote something wrong.
Before I whitelist you: are you sure you want a space in front of the keywords?

Title: Re: [Active] Finding spam and scams by keyword
Post by: light_warrior on July 17, 2023, 05:49:34 PM
P.S. Sorry if I wrote something wrong.
Before I whitelist you: are you sure you want a space in front of the keywords?
Oops, my mistake. I read your advice from the first post and realized my mistake. No, of course, I don't need to add a space in front of the keywords. I corrected my mistake. I apologize again for not reading the first post carefully.

Title: Re: [Active] Finding spam and scams by keyword
Post by: LoyceV on July 18, 2023, 12:26:06 PM
@light_warrior: I've whitelisted you. Your keywords should show up in the next update.

Title: Re: [Active] Finding spam and scams by keyword
Post by: LoyceMobile on March 14, 2024, 02:43:11 PM
Code: spam:https://tabi.foundation