Bitcoin Forum
Author Topic: "Multiple Accounts" / Copy-pasta detection scripts/bots  (Read 872 times)
Initscri (OP)
Hero Member
*****
Activity: 1540
Merit: 759
September 18, 2018, 06:31:59 PM
Last edit: September 20, 2018, 06:36:49 PM by Initscri
Merited by suchmoon (4), bones261 (4), dbshck (3), k0er (3), qwk (1), LoyceV (1), Piggy (1)
 #1

Hey all,

I've been planning to write a few scripts relating to BitcoinTalk. It's been on my "developer bucket list" to write something to detect users who have multiple accounts. To do that and end up with a reliable list, I first need to settle on the logic the detection will be based on.

Content within HR tags will be updated as the thread goes along.



I have a few things in mind (and I'll be updating this as the thread goes along - adding new ideas & such):
When and if topics are created for the data (which will either be by me, or others): I'll post the links here under the respective categories.

For account quality detection:
- Looking at # of words, paragraphs, sentences, etc., and averaging each per user to get an account quality number. This number can be used in tandem with other scripts to decide whether an account gets reported. Obviously this isn't enough to report on by itself, but usernames w/ low quality could be sent into a spreadsheet of some sort for manual lookup.
- [your/others ideas here]
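
A minimal sketch of the per-user averages from the first bullet, assuming posts are already available as plain text. The function names and the crude sentence/paragraph splitting are placeholders for illustration, not a final scoring formula.

```python
import re
from statistics import mean


def post_stats(text):
    """Return (words, sentences, paragraphs) for one plain-text post."""
    words = len(text.split())
    # Crude sentence split on runs of ., ! or ? (assumption: good enough for averages).
    sentences = len([s for s in re.split(r"[.!?]+", text) if s.strip()])
    # Paragraphs are separated by at least one blank line.
    paragraphs = len([p for p in re.split(r"\n\s*\n", text) if p.strip()])
    return words, sentences, paragraphs


def user_quality(posts):
    """Average the per-post counts over all of a user's posts (None if no posts)."""
    if not posts:
        return None
    stats = [post_stats(p) for p in posts]
    return {
        "avg_words": mean(s[0] for s in stats),
        "avg_sentences": mean(s[1] for s in stats),
        "avg_paragraphs": mean(s[2] for s in stats),
    }
```

Low averages would only put a username on the manual-lookup spreadsheet; they are not evidence of anything by themselves.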

For multiple account detection:

- Look for same address usage between posts (BTC, ETH, etc)
- Look for same account usage between posts (telegram, skype, emails, etc)
- [your/others ideas here]

For copy-pasta detection:
- Write a script to determine copy-pasta from accounts by matching the text of posts to similar text of other sites in order to return a probability percentage of the user copy/pasting (including src for manual analysis). Users w/ percentage points above a certain number will be put into a list & potentially reported to threads/mods. IE: external plagiarism detection
- Write a script to determine copy-pasta by matching post content against other users' post content. High similarities will raise red flags. IE: internal plagiarism detection *note: suchmoon mentioned working on something similar, so other scripts may take precedence*
- The original script may want to ignore quote tags. If so, depending on how it's built (matching on full text, or word by word), another side-script would be needed to stop users from dodging detection by just wrapping their messages in quote tags.
- [your/others ideas here]

For trust abuse/merit abuse:
- Detecting trust abuse (users who send out a large amount of negative trusts, using the same text). This would obviously avoid trusted members (as some good campaign managers send out trusts w/ same text). This is mostly targeted towards members w/ no trust, or negative trust (ie: newer members, no trade history, etc). Results would be posted in a thread in a list format using tildes "~" so people can copy/paste the list of abusers into their trust lists. Users could request removal from this list by public poll within the thread (this should probably be handled manually)
- [your/others ideas here]

General ideas for all scripts
- Automatic posting to anti-spam threads w/ results (in such a way as to not create more spam though)
- Platform where users which have been reported by scripts can be documented, with automatic ban detection. That way scripts aren't looking into users if they have already been reported/banned.
- [your/others ideas here]



Results would be posted here for mods to look at (if need be), or just to keep a record of such a connection. I'd also probably link to results in this topic and maybe load it up on a website of mine.

I wanted to post this thread in advance to see if anyone else had any other logic / ideas in mind for these scripts/bots? This will solely be when I have the time to create this (which won't be for a couple of weeks), so I thought I'd post this well in advance. I'll update the above list with approved suggestions that I plan to work on.

Thanks!

P.S: If any mods/admins aren't ok with me scraping the site, by all means let me know. I'd obviously write the bot/script in such a way that it doesn't slam the server & only send a certain amount of requests per second/minute (more or less like a Google bot). I know other users have written similar bots/scraping tools, so I thought it'd be ok. But if not, just let me know Smiley



Change log:

Code:
Edit (September 19th, 2018): I'll be updating this thread (see under bolds) with new ideas as this thread progresses. Also, if anyone else wishes to contribute to my scripts (or even build their own one-offs targeting the ideas above), just let me know that you're working on it, and I'll mark it in the thread. While I agree different scripts/algorithms would be harder to avoid/abuse, obviously I'd want all of the scripts to be developed in a timely manner, so duplicating work probably isn't a good idea as of this moment.
Edit (September 20th, 2018): Adding trust/merit abuse columns - automatic detection of users abusing the trust/merit system.

----------------------------------
Web Developer. PM for details.
----------------------------------
Piggy
Hero Member
*****
Activity: 784
Merit: 1416
September 18, 2018, 07:28:09 PM
 #2

Good idea, I have been thinking about doing something like that myself, but at the moment I got busy with other things. Automated checks are the way to go for the spam problems, plagiarism and so on.

A trivial check I was experimenting with is getting the hash of each message posted, saving it in a dictionary and seeing if the same hash comes up again.

Another simple check could be done for monitoring activity on threads: using the global average of posts per thread and then calculating the variance for a thread, you should be able to spot a spam spree. The same can be applied to user posting.

There are other more complex techniques out there, but it's better to start with something simple.
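
Both checks can be sketched in a few lines. This is only an illustration under assumptions: posts arrive as (post_id, text) pairs, text is lower-cased and whitespace-collapsed before hashing, and the activity check uses a z-score against the global mean as a stand-in for the variance idea. The function names and the threshold of 3 are made up.

```python
import hashlib
from collections import defaultdict
from statistics import mean, pstdev


def normalise(text):
    """Lower-case and collapse whitespace so trivial edits hash the same."""
    return " ".join(text.lower().split())


def find_exact_duplicates(posts):
    """posts: iterable of (post_id, text). Returns {digest: [post_ids]} for repeated texts."""
    seen = defaultdict(list)
    for post_id, text in posts:
        digest = hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()
        seen[digest].append(post_id)
    return {d: ids for d, ids in seen.items() if len(ids) > 1}


def unusually_busy(counts, z_threshold=3.0):
    """counts: {thread_id: posts in the period}. Flags threads far above the global mean."""
    values = list(counts.values())
    if not values:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every thread has the same count, nothing stands out
    return [t for t, n in counts.items() if (n - mu) / sigma > z_threshold]
```

The same `unusually_busy` function works for users if the keys are user IDs instead of thread IDs. Anything it flags still needs a manual look.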
Initscri (OP)
Hero Member
*****
Activity: 1540
Merit: 759
September 18, 2018, 07:33:56 PM
 #3

Good idea, i have been thinking about doing something like that myself but at the moment got busy with other things. Automated checks is the way to go for the spam problems, plagiarism and so on.

A trivial check i was experimenting with, is getting the hash of the messages posted, save it in a dictionary and see if the same hash comes up again.

Another simple check which could be done for monitoring activity on threads. Using the global average of posts per thread and calculating then the variance for a thread you should be able to spot spam-spree. The same can be applied to user posting.

There are other more complex techniques out there, but better start with something simple at start.


Not bad ideas. I like the idea of monitoring threads for abnormal posting frequencies/amount of posts. OFC these threads would have to be manually checked through (as there may be extenuating circumstances where a thread may require a higher post frequency).

TBH, If I do create this, I may just create a repo so others can contribute.
My only fear is that others will run the script (which is okay, unless many users run it. I don't want to add unnecessary load to BitcoinTalk servers unintentionally)

TheBeardedBaby
Legendary
*
Activity: 2184
Merit: 3134
₿uy / $ell
September 18, 2018, 07:34:57 PM
 #4

I will closely follow this project. We've been waiting for such a thing for a very long time. Most of the bots now use word spinners to hide the copy-pasting; it's not easy to detect them, but it's not impossible either.

LoyceV
Legendary
*
Activity: 3290
Merit: 16545
Thick-Skinned Gang Leader and Golden Feather 2021
September 18, 2018, 07:35:59 PM
Merited by dbshck (1)
 #5

P.S: If any mods/admins aren't ok with me scraping the site, by all means let me know. I'd obviously write the bot/script in such a way that it doesn't slam the server & only send a certain amount of requests per second/minute (more or less like a Google bot). I know other users have written similar bots/scraping tools, so I thought it'd be ok. But if not, just let me know Smiley
I've recently started scraping recent posts. My script saves the first unedited version of each post in raw HTML, excluding quotes. Your post for example looks like this:
Code:
Initscri
186520
45883661
<a href="https://bitcointalk.org/index.php#4">Other</a> / <a href="https://bitcointalk.org/index.php?board=24.0">Meta</a> / <b><a href="https://bitcointalk.org/index.php?topic=5032322.msg45883661#msg45883661">&quot;Multiple Accounts&quot; / Copy-pasta detection scripts/bots</a></b>

Hey all,<br /><br />I&#039;ve been planning to write a few scripts relating to BitcoinTalk. It&#039;s been on my &quot;developer bucket list&quot; to write something to detect users who have multiple accounts. In order to accomplish this, and have a reliable list, I&#039;d have to determine some logic in order to base this.<br /><br />I have a few things in mind:<br /><br />Index/scrape posts &amp;:<br /><br />For <b>multiple account detection</b>:<br /><br />- Look for same address usage between posts (BTC, ETH, etc)<br />- Look for same account usage between posts (telegram, skype, etc)<br />- [other ideas here]<br /><br />For <b>copy-pasta detection</b>:<br />- write a script to determine copy-pasta from accounts by matching the text of posts to similar text of other sites in order to return a probability percentage of the user copy/pasting (including src for manual analysis)<br />- [other ideas here]<br /><br />Results would be posted here for mods to look at (if need be), or just to keep a record of such a connection. I&#039;d also probably link to results in <a class="ul" href="https://bitcointalk.org/index.php?topic=1926895.0">this topic</a><br /><br />I wanted to post this thread in advance to see if anyone else had any other logic / ideas in mind for these scripts/bots? This will solely be when I have the time to create this (which won&#039;t be for a couple of weeks), so I thought I&#039;d post this well in advance.<br /><br />Thanks!
The first line is your Username, then userID, post number, some raw headers, and the last line is the post itself.
In compressed format, it takes about 10 MB per day. Instead of scraping the same data again, I could easily send it to you, and a few days' worth of data should be enough for you to start testing. If interested, let me know.
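
For anyone consuming this data, a record in the layout described above could be read with something like the sketch below. It assumes exactly what the example shows: username, user ID and post number on the first three lines, the header on the fourth, and the post body as the last line. The `Record` class, the function names and the regex-based tag stripping are illustrative only.

```python
import html
import re
from dataclasses import dataclass


@dataclass
class Record:
    username: str
    user_id: int
    post_id: int
    header: str
    body_html: str


def parse_record(raw):
    """Split one scraped record into its fields (layout assumed from the example)."""
    lines = raw.strip("\n").split("\n")
    if len(lines) < 5:
        raise ValueError("record too short")
    username, user_id, post_id, header = lines[:4]
    return Record(username.strip(), int(user_id), int(post_id), header, lines[-1])


def to_text(body_html):
    """Very rough HTML-to-text: <br /> becomes a newline, other tags are dropped."""
    text = re.sub(r"<br\s*/?>", "\n", body_html)
    text = re.sub(r"<[^>]+>", "", text)
    return html.unescape(text)
```

A real HTML parser would be more robust than the two regexes; they are just enough to turn the sample body into comparable plain text.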


You'll be in for a surprise if you start looking for plagiarism! I sometimes sort a day's worth of posts and search for exact duplicates. This typically gives a few dozen posts that are posted a few dozen times. Most of them are spam, many of them are just spammers posting the same useless "proof of authentication" and more crap like that.
Detecting the text spinners will be a whole different level!

Initscri (OP)
Hero Member
*****
Activity: 1540
Merit: 759
September 18, 2018, 07:42:03 PM
 #6

P.S: If any mods/admins aren't ok with me scraping the site, by all means let me know. I'd obviously write the bot/script in such a way that it doesn't slam the server & only send a certain amount of requests per second/minute (more or less like a Google bot). I know other users have written similar bots/scraping tools, so I thought it'd be ok. But if not, just let me know Smiley
I've recently started scraping recent. It saves the first unedited version of the post in raw HTML, excluding quotes. Your post for example looks like this:
Code:
Initscri
186520
45883661
<a href="https://bitcointalk.org/index.php#4">Other</a> / <a href="https://bitcointalk.org/index.php?board=24.0">Meta</a> / <b><a href="https://bitcointalk.org/index.php?topic=5032322.msg45883661#msg45883661">&quot;Multiple Accounts&quot; / Copy-pasta detection scripts/bots</a></b>

Hey all,<br /><br />I've been planning to write a few scripts relating to BitcoinTalk. It's been on my &quot;developer bucket list&quot; to write something to detect users who have multiple accounts. In order to accomplish this, and have a reliable list, I'd have to determine some logic in order to base this.<br /><br />I have a few things in mind:<br /><br />Index/scrape posts &amp;:<br /><br />For <b>multiple account detection</b>:<br /><br />- Look for same address usage between posts (BTC, ETH, etc)<br />- Look for same account usage between posts (telegram, skype, etc)<br />- [other ideas here]<br /><br />For <b>copy-pasta detection</b>:<br />- write a script to determine copy-pasta from accounts by matching the text of posts to similar text of other sites in order to return a probability percentage of the user copy/pasting (including src for manual analysis)<br />- [other ideas here]<br /><br />Results would be posted here for mods to look at (if need be), or just to keep a record of such a connection. I'd also probably link to results in <a class="ul" href="https://bitcointalk.org/index.php?topic=1926895.0">this topic</a><br /><br />I wanted to post this thread in advance to see if anyone else had any other logic / ideas in mind for these scripts/bots? This will solely be when I have the time to create this (which won't be for a couple of weeks), so I thought I'd post this well in advance.<br /><br />Thanks!
The first line is your Username, then userID, post number, some raw headers, and the last line is the post itself.
In compressed format, it takes about 10 MB per day. Instead of scraping the same data again, I could easily send it to you, and a few day's worth of data should be enough for you to start testing. If interested, let me know.

Not a bad idea. I'll take that into account. I probably won't be starting for a little while, but I'll send you a message in a little while if I need it.

I like the idea of scraping recent & just grabbing raw HTML to compare.

In order to minimize requests but allow multiple filtering scripts to parse the data separately, I'll probably end up scraping recent with 1 bot, caching that for a set time period (sort of like a mirror), and then using multiple other scripts to parse the data on the caching server/mirror.

What I might end up doing is creating the server that stores the cache & keeping it closed source. But I'll release the scripts that parse the data / determine abusers as open source. These scripts would connect back to the mirror server/site instead of BitcoinTalk. That way, if others wish to volunteer some computational power to run those scripts, they can do so, and others can contribute code without slamming BitcoinTalk with a massive amount of requests while testing.
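
Roughly, the single scraping bot would sit behind something like the sketch below: one throttled fetcher with a short-lived cache that the parsing scripts read from. Everything here is an assumption for illustration (the `ThrottledCache` name, the one-second spacing, the five-minute cache lifetime); the actual download function is passed in, so nothing in this block talks to BitcoinTalk.

```python
import time


class ThrottledCache:
    """Fetch each URL at most once per `ttl` seconds, spacing requests by `min_interval`."""

    def __init__(self, fetch, min_interval=1.0, ttl=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch              # callable(url) -> page body, supplied by the caller
        self._min_interval = min_interval
        self._ttl = ttl
        self._clock = clock
        self._sleep = sleep
        self._cache = {}                 # url -> (fetched_at, body)
        self._last_request = None

    def get(self, url):
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]             # still fresh, no request made
        if self._last_request is not None:
            wait = self._min_interval - (now - self._last_request)
            if wait > 0:
                self._sleep(wait)        # keep requests spaced out
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = (self._last_request, body)
        return body
```

The parsing scripts would then call `get()` on the mirror rather than hitting the forum directly.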

I will closely follow this project. We've been waiting for such thing for a very long time. Most of the bots are using now word spinner to hide the copy-pasting, it's not easy to detect them but it's not impossible either.

Thanks! I'll try to keep this thread updated as much as I can.

mr_smith99
Copper Member
Newbie
*
Activity: 168
Merit: 0
September 18, 2018, 07:52:04 PM
 #7

That's a nice idea. And you should run the script for copy-paste eth accounts on the registry forum on the bounty thread. They have a Google sheet with all the ETH addresses
my1st
Copper Member
Jr. Member
*
Activity: 350
Merit: 1
September 18, 2018, 09:19:15 PM
 #8

If your script catches multi-accounts that don't hesitate to write proof of authentication in the bounty threads one after another with the same error, like in the screenshot, it will be an excellent trap for them!




Initscri (OP)
Hero Member
*****
Activity: 1540
Merit: 759
September 18, 2018, 09:48:44 PM
 #9

If your script will be catching multi-accounts that do not hesitate to write proof of authentication in the bounty threads one after one with the same error like at screenshot, it will be an excellent trap for them!



I probably wouldn't worry too much about misspellings of "address", considering for Ethereum addresses I would just look for strings starting with 0x (unless I'm wrong on this, I'm more familiar with Bitcoin) and then just gather the entire address until the next space.

Not to mention, not all people start off with "Ethereum Address:", some threads may require other formats, so it's better to go off the string itself.
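
A sketch of going off the string itself: an Ethereum address is "0x" followed by 40 hex characters, so a regex is stricter than reading up to the next space. The function name and the (username, text) input shape are assumptions for illustration.

```python
import re
from collections import defaultdict

# "0x" plus exactly 40 hex digits; the trailing \b keeps 64-digit transaction hashes out.
ETH_ADDRESS = re.compile(r"\b0x[0-9a-fA-F]{40}\b")


def shared_addresses(posts):
    """posts: iterable of (username, text).

    Returns {address: set of usernames} for addresses posted by more than one user.
    """
    owners = defaultdict(set)
    for username, text in posts:
        for addr in ETH_ADDRESS.findall(text):
            owners[addr.lower()].add(username)  # compare case-insensitively
    return {a: users for a, users in owners.items() if len(users) > 1}
```

Two accounts sharing an address is only a lead for manual review, not proof that they are alts.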

Also, unrelated to the above quote: I posted the following response on Theymos' announcement the other day: https://bitcointalk.org/index.php?topic=5030366.msg45889515#msg45889515

If the merit requirement is raised above 1 merit, I'll probably introduce a feature into my script that looks for random merit sends of whatever the amount may be. Unfortunately, with the merit requirement only being 1, it would be much more difficult to detect abuse of this from a programming perspective.

suchmoon
Legendary
*
Activity: 3654
Merit: 8909
https://bpip.org
September 19, 2018, 04:07:13 AM
Merited by dbshck (1)
 #10

Once you get it running to some meaningful extent, I'd suggest posting the scope you're working on (set of users, threads) in iasenko's thread here:

https://bitcointalk.org/index.php?topic=4720640.0

So that we don't duplicate the effort.

I'm experimenting with some NLP techniques for plagiarism detection and the results are promising although scalability is a bit of an issue. Currently working just on comparing Bitcointalk posts (not to outside sources).

Perhaps it's better not to publicize too many specific details on how the scripts work - might inadvertently help bot-farmers. I wish there was a section of the forum designated for spam-busting efforts, I believe hilarious has suggested this.
Initscri (OP)
Hero Member
*****
Activity: 1540
Merit: 759
September 19, 2018, 05:16:25 AM
 #11

Once you get it running to some meaningful extent I would suggest to post the scope you're working on (set of users, threads) in iasenko's thread here:

https://bitcointalk.org/index.php?topic=4720640.0

So that we don't duplicate the effort.

I'm experimenting with some NLP techniques for plagiarism detection and the results are promising although scalability is a bit of an issue. Currently working just on comparing Bitcointalk posts (not to outside sources).

Perhaps it's better not to publicize too many specific details on how the scripts work - might inadvertently help bot-farmers. I wish there was a section of the forum designated for spam-busting efforts, I believe hilarious has suggested this.

Good point, I'll post within that thread once completed. I'm also hoping to hook it into BitcoinTalk & have it automatically update threads, but we'll see. Very much in the planning stage TBH

That's the thing, I've been debating closed source vs open source and the perks of both. What I might end up doing is creating a repo for these scripts, but keeping it private (I have an account on Github I can do this with), and then just inviting users who wish to contribute. Might just leave this to "if there's interest", but thanks for the flag on the potentials of abuse if open sourcing it. I didn't clue into that until now.

+1 for the forum section for spam busting, it'd be easier to keep lists of reported within.

If you're working on plagiarism detection already, I'll probably work on multiple account detection first. Granted, multiple bots running from different developers with different sets of algorithms probably isn't a bad idea (will make it harder for bots to avoid)

Jet Cash
Legendary
*
Activity: 2688
Merit: 2449
https://JetCash.com
September 19, 2018, 08:52:33 AM
 #12

If it helps you guys to know about declared alts, here are mine.

Talk Merit
JetAid


Offgrid campers allow you to enjoy life and preserve your health and wealth.
Save old Cars - my project to save old cars from scrapage schemes, and to reduce the sale of new cars.
My new Bitcoin transfer address is - bc1q9gtz8e40en6glgxwk4eujuau2fk5wxrprs6fys
suchmoon
Legendary
*
Activity: 3654
Merit: 8909
https://bpip.org
September 19, 2018, 01:29:12 PM
Merited by Initscri (1)
 #13

If you're working on plagiarism detection already, I'll probably work on multiple account detection first. Granted, multiple bots running from different developers with different sets of algorithms probably isn't a bad idea (will make it harder for bots to avoid)

I think we can certainly run multiple attacks on plagiarism as long as we coordinate to reduce overlap in which users we've reported etc, e.g. using the thread I mentioned and also https://bpip.org to check for bans.

With the little time I have available I'm still probably weeks away from a reasonably usable product and even then it would cover only a relatively small set of potential plagiarism. LoyceV mentioned that forum gets ~50k posts a day - many of which can be ignored or whitelisted but still that's a lot of garbage to sift through.
khaled0111
Legendary
*
Activity: 2506
Merit: 2832
Top Crypto Casino
September 19, 2018, 03:53:58 PM
 #14

I don't know how it works, but I think there is a bot on Steemit, "@cheetah", that detects plagiarism, so developing a similar bot won't be a problem (there are many senior developers in this forum).

It will be great if you succeed in writing a script that detects members sending Merits to each other.

I don't think it is going to be hard to code such a script, but you will need access to the Merit database.

coinlocket$
Legendary
*
Activity: 2352
Merit: 1512
#1 VIP Crypto Casino
September 19, 2018, 03:55:28 PM
 #15

It will be great if you succeed to write a script that detects members sendig Merits to each others.

I don't think it is going to be hard to code such script but you will need an access to the Merit database.

We already have several tools for this purpose, you can see one here done by @DdmrDdmr

Code:
https://public.tableau.com/profile/ddmrddmr#!/vizhome/BitcointalkMeritDashboard/GlobalSummary

manfredmann
Member
**
Activity: 518
Merit: 21
September 19, 2018, 04:10:15 PM
 #16

We already have several tools for this purpose, you can see one here done by @DdmrDdmr

Code:
https://public.tableau.com/profile/ddmrddmr#!/vizhome/BitcointalkMeritDashboard/GlobalSummary
This forum is full of enthusiastic people working together for the betterment of the forum. I do believe this can be achieved with members collaborating with each other; collaboration will help get the job done more easily. If only I had this kind of expertise, I'd definitely be more than willing to help you guys. Sad to say, I am just following along and taking down important details for future implementations and updates to this forum. GO! GO! GO!
qwk
Donator
Legendary
*
Activity: 3542
Merit: 3411
Shitcoin Minimalist
September 19, 2018, 04:27:46 PM
Merited by Initscri (1)
 #17

Detecting the text spinners will be a whole different level!
I guess a quick and dirty approach could be something like this:
1. take samples of all occurrences of 4 consecutive words
2. create their md5 (or whatever you prefer) hashes
3. store those hashes in a database
4. count number of hash collisions with other posts

So, a simple text like:
The quick brown fox jumps over the lazy dog

would result in 6 individual hashes:
The quick brown fox
quick brown fox jumps
brown fox jumps over
fox jumps over the
jumps over the lazy
over the lazy dog

Tinker a little with the number of words and the threshold for detection of duplicates, and you're probably almost there for a large share of the copy-pasta spam.
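
The four steps above can be sketched as follows, with an in-memory set standing in for the database. Lower-casing the words before hashing is an extra assumption, and the function names are made up for illustration.

```python
import hashlib


def shingle_hashes(text, n=4):
    """Return the set of md5 hashes of every run of n consecutive words."""
    words = text.lower().split()
    return {
        hashlib.md5(" ".join(words[i:i + n]).encode("utf-8")).hexdigest()
        for i in range(len(words) - n + 1)
    }


def shared_shingles(a, b, n=4):
    """Count the n-word hashes that two posts have in common."""
    return len(shingle_hashes(a, n) & shingle_hashes(b, n))
```

For the fox sentence this gives the 6 hashes listed above. Compared against "A quick brown fox jumps over a lazy dog", 2 of them match ("quick brown fox jumps" and "brown fox jumps over"), which is the kind of count the detection threshold would be applied to.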

Yeah, well, I'm gonna go build my own blockchain. With blackjack and hookers! In fact forget the blockchain.
Initscri (OP)
Hero Member
*****
Activity: 1540
Merit: 759
September 19, 2018, 04:32:11 PM
Last edit: September 19, 2018, 04:44:35 PM by Initscri
 #18

I don't know how it works but I think there is a bot on Steemit "@cheetah" that detect plagiarism, thus developing a similar bot wont be a problem (there are many senior developers in this forum).

It will be great if you succeed to write a script that detects members sendig Merits to each others.

I don't think it is going to be hard to code such script but you will need an access to the Merit database.

There's plenty of paid APIs to support plagiarism detection externally, so if I was lazy and rich I'd use those lol. Although, I'm uncertain of their reliability.

But realistically, external plagiarism detection isn't super difficult; although it may be more difficult than internal detection. I won't go too far into details (hashing, storage methods, etc), but essentially you're taking the copy of the text (or portions of it) & matching it against search engine results / meta descriptions.
I'm sure there's plenty of other methods as well.

The difficulty will be to find sources to match against (unsure if scraping Google will be permitted, we'll see).

Point is though: if 3 different developers develop it 3 different ways (using different sources) it will be far more difficult for bots/spammers to reverse engineer/abuse.

If you're working on plagiarism detection already, I'll probably work on multiple account detection first. Granted, multiple bots running from different developers with different sets of algorithms probably isn't a bad idea (will make it harder for bots to avoid)

I think we can certainly run multiple attacks on plagiarism as long as we coordinate to reduce overlap in which users we've reported etc, e.g. using the thread I mentioned and also https://bpip.org to check for bans.

With the little time I have available I'm still probably weeks away from a reasonably usable product and even then it would cover only a relatively small set of potential plagiarism. LoyceV mentioned that forum gets ~50k posts a day - many of which can be ignored or whitelisted but still that's a lot of garbage to sift through.


Maybe we can create some sort of central location for defining which users have been reported by bots.
If I have time, maybe I'll create something web-based, and just give out API keys to users who can prove they have an operating script.

Would just sort of be a web-based platform to set which users are reported by scripts/bots, and then it would track if those users actually have a ban through the use of BPIP (If Vod permits)

Dumping the info into a thread probably isn't ideal, but worst comes to worst we can rely on that until a more advanced system is produced.

If it helps you guys to know about declared alts, here are mine.

Talk Merit
JetAid



Thanks Jet Cash, if I do implement an alt detection system, I'd make the reporting of users more manual than automated.
I'm sure there's many users (such as yourself) who have alts for various reasons and aren't being nefarious and don't deserve a report.

If anyone has any further ideas for methods, keep em comin' Smiley

suchmoon
Legendary
*
Activity: 3654
Merit: 8909
https://bpip.org
September 19, 2018, 05:01:24 PM
 #19

Tinker a little with the number of words and the threshold for detection of duplicates, and you're probably almost there for a large share of the copy-pasta spam.

I experimented with n-grams a little bit and couldn't find a good value. Low n yields too many false positives, high n doesn't detect spinners, etc. So I'm using a mixture of algorithms and base the decision on the pattern of the results of those algorithms - e.g. if the similarity of two texts using algorithm A is 70%, then union/intersect/otherwise manipulate the texts, run algorithm B, if it scores 90% then run algorithm C to eliminate false positives - made up numbers but you get the idea. Works ok-ish, but as I mentioned it doesn't scale well and I need to do more testing on larger samples.
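
A staged check of this kind could be sketched as follows. The two measures (word-set Jaccard as a cheap first pass, then the ratio from Python's `difflib.SequenceMatcher`) and both thresholds are placeholders chosen for illustration, not the algorithms referred to above.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Word-set overlap between two texts, from 0.0 (disjoint) to 1.0 (same words)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def looks_copied(a, b, coarse=0.5, fine=0.8):
    """Run the cheap check first; only pairs that pass it get the expensive one."""
    if jaccard(a, b) < coarse:
        return False
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= fine
```

Running the cheap measure first is what keeps the expensive comparison off most pairs, which matters given the scaling problem mentioned above.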

The difficulty will be to find sources to match against (unsure if scraping Google will be permitted, we'll see).

Google has a search API. Not sure if there is a free tier though.
LoyceV
Legendary

Activity: 3290
Merit: 16545


Thick-Skinned Gang Leader and Golden Feather 2021


September 19, 2018, 05:12:52 PM
 #20

Tinker a little with the number of words and the threshold for detection of duplicates, and you're probably almost there for a large share of the copy-pasta spam.
I'm more worried about the very high number of positive results. Let me play around a bit with yesterday's data, from post 45850092 up to post 45893434. My scraper caught 43184 out of 43343 posts (it misses some burst posts). This is after the new Merit requirements, so there's less spam already.
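
A quick check of those numbers, assuming every ID in the range corresponds to a forum post (which is what the 43343 total implies):

```python
first_id, last_id = 45850092, 45893434  # post IDs quoted above
caught = 43184                          # posts the scraper saved

total = last_id - first_id + 1          # inclusive range of IDs
missed = total - caught
coverage = caught / total

print(total, missed, f"{coverage:.2%}")  # 43343 159 99.63%
```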

Here are the 50 most frequently repeated posts (raw HTML, excluding quotes; the number at the start of each line shows how often the post appears). Each of these posts was exactly the same every time it was posted:
Code:
    288 (post was empty or only a quote)
    162 Do you have a telegram channel?
     91 Proof of Authentication:<br />Joined Telegram Campaign
     45 Bump
     25 bump
     24 microguy talks to himself just like he trades himself just like he lol himself <img src="https://bitcointalk.org/Smileys/default/grin.gif" alt="Grin" border="0" /> <img src="https://bitcointalk.org/Smileys/default/grin.gif" alt="Grin" border="0" /> <img src="https://bitcointalk.org/Smileys/default/grin.gif" alt="Grin" border="0" /><br /><br />sounds like the shytcoin is showing its age like microguy is <span style="font-size: 99pt !important; line-height: 1.3em;">&#129300;</span><br /><br />sounds like a igotspots shytcoin scam checkpoint dysfunction&nbsp; still better than btc right<br /><br /><span style="color: blue;"><span style="font-size: 90pt !important; line-height: 1.3em;">Whats in your wallet</span></span><br /><br /><span style="font-size: 90pt !important; line-height: 1.3em;"><span style="color: brown;"><a class="ul" href="https://imgur.com/rPLBZVM">https://imgur.com/rPLBZVM</a></span></span>
     23 <div align="center"><b><br />For a more general context on our seed round, and the reasons for this funding round please read our <a class="ul" href="https://[Suspicious link removed">/i3ufCd]medium article</a></b>
     20 <div align="center"><span style="font-size: 20pt !important; line-height: 1.3em;"><b>Hello Everyone, GOeureka are live now with Bounty Campaign.<br />&nbsp;Please follow given link to participate</b></span>
     19 hi<br />i noticed you deleted you telegram account recently<br />why?<br />i am still waiting the letter and when it arrives how can i contact you?<br />please contact me at @AmbrogioOrfeu on telegram
     18 IMPORTANT ANNOUNCEMENTS ABOUT INBOT FUTURE :<br /><br />1. Our revenue for first 6 months was more than whole 2017!<br />2. We are hiring Partner Managers and Business Operations Managers.<br />3. We are moving InToken from Ethereum to Stellar blockchain.<br />4. We will list InToken without an ICO.
     17 hello everyone <br />here im talking about a new cryptocurrency which is THUNDERSTAKE (TSC) .TSC PoS staking rewards: 900% APR fixed, every block number dividable by 10 is a superblock with double APR (1800 %) .<br />we have made products with TSC logo which you can buy from our website with TSC coin as payment.TSC is live on CMC and 5 exchanges, Cyptobridge,mercatox,Stokes.exchange, bitrex and escodex .<br />here is our website link <a class="ul" href="https://thunderstake.com">https://thunderstake.com</a> and discord link :&nbsp; &nbsp;<a class="ul" href="https://discord.gg/wmu9Zcx">https://discord.gg/wmu9Zcx</a> you can get everything from here have a look
     16 Up
     16 Proof of Authentication:<br />Joined Telegram Campaign<br />
     15 up
     14 week #1<br />Reddit Campaign<br />Reddit name: <br />Reddit user Url: <br />Like any post on Subreddit (list with links to post):<br />1.<br />
     14 #proof:<br />Twitter username:@cryptonerdd<br />Telegram username:@cryptonerdd<br />ERC20 address:0x51494b94939D2C8353d069206887687C40eD92B9<br />
     13 microguy talks to himself just like he trades himself just like he lol himself <img src="https://bitcointalk.org/Smileys/default/grin.gif" alt="Grin" border="0" /> <img src="https://bitcointalk.org/Smileys/default/grin.gif" alt="Grin" border="0" /> <img src="https://bitcointalk.org/Smileys/default/grin.gif" alt="Grin" border="0" /><br /><br />sounds like the shytcoin is showing its age like microguy is <span style="font-size: 99pt !important; line-height: 1.3em;">&#129300;</span><br /><br />sounds like a igotspots shytcoin scam checkpoint dysfunction&nbsp; still better than btc right<br /><br /><span style="color: blue;"><span style="font-size: 70pt !important; line-height: 1.3em;">Whats in your wallet</span></span><br /><br /><span style="font-size: 90pt !important; line-height: 1.3em;"><span style="color: brown;"><a class="ul" href="https://imgur.com/rPLBZVM">https://imgur.com/rPLBZVM</a></span></span>
     12 Bitcointalk username: aloha0001<br />Forum rank: member<br />Posts count:&nbsp; 255<br />ETH address: 0x04ddhA7Bb8b08af5E6866C1efc3rehe54a2859E6<br />
     12 <div align="center"><b><span style="font-size: 15pt !important; line-height: 1.3em;"><span style="color: orange;">Update</span></span></b>
     11 reserved
     11 Twitter<br /><br />Retweets<br />1.<a class="ul" href="https://mobile.twitter.com/MaestroProject1/status/1003536243370545152">https://mobile.twitter.com/MaestroProject1/status/1003536243370545152</a><br />2.<a class="ul" href="https://mobile.twitter.com/MaestroProject1/status/1003824945430843393">https://mobile.twitter.com/MaestroProject1/status/1003824945430843393</a><br />3.<a class="ul" href="https://mobile.twitter.com/MaestroProject1/status/1004547290063765508">https://mobile.twitter.com/MaestroProject1/status/1004547290063765508</a><br />4.<br />5.<br /><br />Tweets<br />1.<a class="ul" href="https://mobile.twitter.com/amanda_septiasa/status/1003681341915869184">https://mobile.twitter.com/amanda_septiasa/status/1003681341915869184</a><br />2.<br />
     10 Week #1<br />Facebook<br /><br />Shares + Likes<br /><br />1. <a class="ul" href="https://www.facebook.com/amro.trikid/posts/10212205125466225">https://www.facebook.com/amro.trikid/posts/10212205125466225</a><br />2. <a class="ul" href="https://www.facebook.com/amro.trikid/posts/10212210015988485">https://www.facebook.com/amro.trikid/posts/10212210015988485</a><br />3. <a class="ul" href="https://www.facebook.com/amro.trikid/posts/10212219066614745">https://www.facebook.com/amro.trikid/posts/10212219066614745</a><br />4. <a class="ul" href="https://www.facebook.com/amro.trikid/posts/10212228785097701">https://www.facebook.com/amro.trikid/posts/10212228785097701</a><br />5. <a class="ul" href="https://www.facebook.com/amro.trikid/posts/10212228787657765">https://www.facebook.com/amro.trikid/posts/10212228787657765</a><br />
     10 WEEK#1<br />Facebook Campaign<br />Facebook Link: <a class="ul" href="https://facebook.com/deerey.area">https://facebook.com/deerey.area</a><br />Friends: 1100<br /><br />Post:<br /><br />Shared:<br />
     10 Twitter Campaign&nbsp; &nbsp; &nbsp; <br />Twitter user Url:&nbsp; &nbsp;<a class="ul" href="https://twitter.com/4LUtr1qGRLB">https://twitter.com/4LUtr1qGRLB</a>&nbsp; &nbsp;<br />Repost and Like any post on Twitter (list with links):&nbsp; &nbsp; &nbsp; <br /><a class="ul" href="https://twitter.com/bitflipcc/status/10101578403">https://twitter.com/bitflipcc/status/10101578403</a><br />
     10 Bitcointalk account URL : <br />TELEGRAM username: @zlo2323<br />language: Korean<br />Rank: Jr.Member<br />Eth address: 0xaE0304fd2b399c790170aA6Ea6A1d6E78713f96<br />
     10 <br />test
     10 <br />
     10 #PROOF OF AUTHENTICATION POST<br />Joined Twitter Campaign<br />Bitcointalk Username: Dollar1980<br />Telegram Username: @TahsibGhurair<br />Twitter Username: @Tahsib_Ghurair<br />Twitter Account Url: <a class="ul" href="https://twitter.com/Tahsib_Ghurair">https://twitter.com/Tahsib_Ghurair</a><br />
      9 Native language: Russian&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />Bitcointalk username: Sabergas1w7&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />Profile link: <a class="ul" href="https://bitcointalk.org/index.php?action=profile;u=161465763">https://bitcointalk.org/index.php?action=profile;u=161465763</a>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />Part of the bounty you apply for: ANN&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />Experience: NO&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />Telegram: <a class="ul" href="https://t.me/Sadbis1g7">https://t.me/Sadbis1g7</a>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />Email: <a href="mailto:gaerhe5ra@mail.ru">gaerhe5ra@mail.ru</a>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />Ethereum address: 0x91D8f2e4hjdEC122568f4c2cd5D14a362glk561F&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />Please PM me if you accept. <br />
      9 #Proof of Authentication<br /><br />Campaign : Telegram &amp; Twitter <br />Bitcointalk Username: notnotok<br />Telegram Username : @khalidalbudoor<br />Twitter Account Link: <a class="ul" href="https://twitter.com/khalidalbudoor7">https://twitter.com/khalidalbudoor7</a><br />Twitter Username: @khalidalbudoor7<br />
      9 #PROOF OF AUTHENTICATION POST<br />Joined Twitter Campaign<br />Bitcointalk Username: ExcellentOffer86<br />Twitter Account Url: <a class="ul" href="https://twitter.com/Saeed_Imtiaz1">https://twitter.com/Saeed_Imtiaz1</a><br />Telegram Username: @Saeed_Imtiaz1<br />
      8 Week #1<br />Twitter<br /><br />Retweets<br />1. <a class="ul" href="https://twitter.com/MaestroProject1/status/998832211800412160">https://twitter.com/MaestroProject1/status/998832211800412160</a><br />2. <a class="ul" href="https://twitter.com/MaestroProject1/status/998839809895350272">https://twitter.com/MaestroProject1/status/998839809895350272</a><br />3. <a class="ul" href="https://twitter.com/MaestroProject1/status/999005931881906176">https://twitter.com/MaestroProject1/status/999005931881906176</a><br />4. <a class="ul" href="https://twitter.com/MaestroProject1/status/999036079238868992">https://twitter.com/MaestroProject1/status/999036079238868992</a><br />5. <a class="ul" href="https://twitter.com/MaestroProject1/status/999043596345950208">https://twitter.com/MaestroProject1/status/999043596345950208</a><br /><br />Tweets<br />1. <a class="ul" href="https://twitter.com/hellofancydei/status/1004721044937105413">https://twitter.com/hellofancydei/status/1004721044937105413</a><br />2. <a class="ul" href="https://twitter.com/hellofancydei/status/1004721411192049667">https://twitter.com/hellofancydei/status/1004721411192049667</a>&nbsp;
      8 Facebook<br />Week #1<br /><br />Twitter Profile Link: <a class="ul" href="https://twitter.com/CREoday_ru">https://twitter.com/CREoday_ru</a><br />Like and Retweet:<br />1. <a class="ul" href="https://twitter.com/medXe1/status/961630808724459520">https://twitter.com/medXe1/status/961630808724459520</a><br />2. <a class="ul" href="https://twitter.com/medXe1/status/962393102601412608">https://twitter.com/medXe1/status/962393102601412608</a><br />3. <a class="ul" href="https://twitter.com/medXe1/status/962767627113455616">https://twitter.com/medXe1/status/962767627113455616</a><br />4. <a class="ul" href="https://twitter.com/medXe1/status/962768328770146309">https://twitter.com/medXe1/status/962768328770146309</a><br />5. <a class="ul" href="https://twitter.com/medXe1/status/975583417281712128">https://twitter.com/medXe1/status/975583417281712128</a><br /><br />Facebook Profile Link: <a class="ul" href="https://www.facebook.com/ar.amur.ru">https://www.facebook.com/ar.amur.ru</a><br />Like and Share:<br />1. <a class="ul" href="https://www.facebook.com/ar.amur.ru/posts/597475630588609">https://www.facebook.com/ar.amur.ru/posts/597475630588609</a><br />2. <a class="ul" href="https://www.facebook.com/ar.amur.ru/posts/597613640574808">https://www.facebook.com/ar.amur.ru/posts/597613640574808</a><br />3. <a class="ul" href="https://www.facebook.com/ar.amur.ru/posts/597994343870071">https://www.facebook.com/ar.amur.ru/posts/597994343870071</a><br />4. <a class="ul" href="https://www.facebook.com/ar.amur.ru/posts/598519390484233">https://www.facebook.com/ar.amur.ru/posts/598519390484233</a><br />5. <a class="ul" href="https://www.facebook.com/ar.amur.ru/posts/599002237102615">https://www.facebook.com/ar.amur.ru/posts/599002237102615</a><br />
      8 <a href="https://i.imgur.com/QBgno2y.png">https://i.imgur.com/QBgno2y.png</a><br /><br />We invite you to bring your project to <b><a class="ul" href="http://Altmarkets.cc">Altmarkets.cc</a></b>,<br /><br /><br />Add your coin to our exchange by requesting <b><a class="ul" href="https://docs.google.com/forms/d/e/1FAIpQLSejTGyelV8OleqYbGqscdvWrMKsXOp8bCvO4VCtkFqAAJctcg/viewform?usp=send_form">Here</a></b><br /><br /><br />(OPTIONAL) Join us on Discord to speak directly to us about your listing request : <b><a class="ul" href="https://discord.gg/ZhQzy5f">https://discord.gg/ZhQzy5f</a></b><br /><br />Our Fees - <a class="ul" href="https://altmarkets.cc/fees">https://altmarkets.cc/fees</a><br />Listing Policy: <a class="ul" href="https://altmarkets.cc/add_coin">https://altmarkets.cc/add_coin</a>
      7 week 1<br /><br />Tweet link : <br />1. <br />2. <br />3. <br /><br />Retweet link : <br />1. <a class="ul" href="https://twitter.com/MaestroProject1/status/10016030348670208">https://twitter.com/MaestroProject1/status/10016030348670208</a><br />2. <br />3. <br />4. <br />5. <br /><br />LIke &amp; share link : <br />1. <a class="ul" href="https://web.facebook.com/coinhunt1/posts/28284343478955285">https://web.facebook.com/coinhunt1/posts/28284343478955285</a><br />2. <br />3. <br />4. <br />5. <br />
      7 Proof of joined post<br />Campaign in which you participate: Linkedin campaign<br />ETH address: 0x02Aft679fd80E9dD51cac1dc5se45f42578fhj64<br />
      7 I want to reserve a signature campaign.<br />BitcoinTalk name: jordarheje89<br />BitcoinTalk profile link: <a class="ul" href="https://bitcointalk.org/index.php?action=profile;u=1866560678;sa=summary">https://bitcointalk.org/index.php?action=profile;u=1866560678;sa=summary</a><br />Eth Address: 0xCd332c24rhehBfa3A9d658D2F33Aheh2eF5689<br />
      7 Bump.
      7 <div align="center"><b><span style="font-size: 15pt !important; line-height: 1.3em;"><span style="color: #7e0dbd;">RainCheck | Update</span></span></b>
      7 +12000 subcribers on Telegram<br />Come and chat with the Team<br /><a class="ul" href="https://t.me/brodweyrealteam">https://t.me/brodweyrealteam</a><br />
      7 #proof:<br />Twitter username:@cryptonerdd<br />Telegram username:@cryptonerdd<br />ERC20 address:0x51494b94939D2C8353d069206887687C40eD92B9
      7 #Proof of Authentication Post Link<br /><br />Twitter Campaign<br />Twitter Account : <a class="ul" href="https://twitter.com/DarinaBovsiktak">https://twitter.com/DarinaBovsiktak</a><br />Facebook Campaign<br />Facebook: <a class="ul" href="https://www.facebook.com/DorianTopz">https://www.facebook.com/DorianTopz</a>
      7 #PROOF OF AUTHENTICATION POST<br />Joined Twitter Campaign<br />Bitcointalk Username: ExcellentOffer86<br />Twitter Username: @Saeed_Imtiaz1<br />Twitter Account Url: <a class="ul" href="https://twitter.com/Saeed_Imtiaz1">https://twitter.com/Saeed_Imtiaz1</a><br />Telegram Username: @Saeed_Imtiaz1<br />
      7 ##PROOF OF AUTHENTICATION##<br />Bitcointalk Username: trishaanywhite<br /><br /><br />Joined Campaigns: Twitter<br />Twitter User Name: trishaanywhite<br />Twitter Account Url&nbsp; : <a class="ul" href="https://twitter.com/trishaanywhite">https://twitter.com/trishaanywhite</a><br /><br /><br />Joined Campaigns: Telegram<br />Telegram user Name: @trishaany<br />Telegram Url: <a class="ul" href="https://t.me/trishaany">https://t.me/trishaany</a><br /><br />
      6 TRANSLATION IN INDONESIAN<br />Bitcointalk username: adelaisav <br />Native language: indonesia<br />Email: <a href="mailto:dancukbanget@gmail.com">dancukbanget@gmail.com</a> <br />Telegram: @filarisdianto <br />Part of bounty you apply for : ALL<br />Translation/moderation experience: <a class="ul" href="https://docs.google.com/spreadsheets/d/1Ltym_vuCnAvpGD7F7KnldJtm7wYP8S3sdZ7pdRaK8Jg/htmlview">https://docs.google.com/spreadsheets/d/1Ltym_vuCnAvpGD7F7KnldJtm7wYP8S3sdZ7pdRaK8Jg/htmlview</a><br />ETH address: 0xb02518F08daeb2Ef11a50edB152C59507D0EB2F5<br />Pm me if you need sir
      6 Reserve
      6 Project looks great but there are tons of projects like this and my question is, how can you be a bit defirrent than other payment system?
      6 IMPORTANT ANNOUNCEMENTS ABOUT INBOT FUTURE :<br /><br />1. Our revenue for first 6 months was more than whole 2017!<br />2. We are hiring Partner Managers and Business Operations Managers.<br />3. We are moving InToken from Ethereum to Stellar blockchain.<br />4. We will list InToken without an ICO.<br />
      6 Hi dev,<br />I&#039;m writing to you with an offer of listing at one of the major masternodes monitoring website - <a class="ul" href="http://masternodes.plus">http://masternodes.plus</a> (MasterNodesPlus).<br />You have been selected and approved for listing as recommended masternode coin.<br />To be listed at the website, you can use one of the three offers:<br /><br />Normal listing-up to 24 hours: 0.1BTC<br />Listing an ICO (coin not available on any exchange) up to 6 hours: 0,3BTC<br /><br />You can make your request for the lisitng here:<br /><a class="ul" href="https://masternodes.plus/contact.html">https://masternodes.plus/contact.html</a><br /><br /><br />Regards,<br />Timothy James-Quill<br />\93MNP\94<br />
      6 A request to prospective clients, please post a message on the forum thread first to keep the thread alive and then make a contact using above mentioned contacts for prompt response.<br /><br />--------------------------------------------------------------<br />For users in China/Hong, they can also contact via QQ.<br /><br />QQ: 256447418
The first line is my own description. It's mainly caused by bounty spammers: they quote their own old post, then edit it to add their latest bounty report spam. My scraper catches the posts before they're edited.

This doesn't really catch plagiarism, but it catches spam. When you're looking for word phrases to detect plagiarism, you're likely to get even more hits than this.
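
A list like the one above can be produced by counting identical post bodies. The sketch below is one way to do it in Python, not the script that generated the list. The `normalize` helper is an added assumption: several entries above differ only by a trailing `<br />`, so it strips trailing line breaks before counting. Case is left alone, so "Bump" and "bump" stay separate, as they do in the list.

```python
import re
from collections import Counter

# Hypothetical helper pattern: trailing <br /> tags and whitespace, so posts
# that differ only in a trailing line break are counted together.
TRAILING_BR = re.compile(r"(?:\s|<br\s*/?>)+$", re.IGNORECASE)


def normalize(html):
    """Strip surrounding whitespace and any trailing <br /> tags."""
    return TRAILING_BR.sub("", html.strip())


def most_repeated(posts, top=50, min_count=2):
    """Return (count, body) pairs for the most frequently repeated bodies."""
    counts = Counter(normalize(p) for p in posts)
    return [(c, body) for body, c in counts.most_common(top) if c >= min_count]
```

Calling `most_repeated(bodies)` on a day's worth of scraped post bodies (raw HTML with quotes already removed) returns the repeated posts in descending order of frequency.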

The second entry came from Cidonar, who bumped this thread 162 times. That board shouldn't allow deleting posts within 24 hours, but it does.
The user isn't banned, as he deleted the evidence.

The third entry ("Proof of Authentication") came from many different users in this thread. I've just reported a few of them, asking moderators to check the thread.

The sixth entry ("microguy talks to himself") came from BitCoin ranger, who had 24 posts deleted by moderators.

Manually going through this list is a lot of work for the few reportable posts it turns up, so it isn't a very effective approach.
