Bitcoin Forum
September 25, 2018, 09:58:48 PM *
Poll
Question: Should the use of homographs be forbidden on the forum? What do you think?
Yes, I support the idea
No, it's fine like this
What are these "homographs"? (Read below to find out.)
It's up to theymos, he has to decide
I don't really care.

Author Topic: VOTE PLEASE > [Request]Use of Homographs to be forbidden.  (Read 659 times)
actmyname
Copper Member
Legendary
*
Online Online

Activity: 1176
Merit: 1270


View Profile WWW
June 13, 2018, 08:23:49 PM
 #21

Useful link to look at when checking for these: http://unicode.org/cldr/utility/confusables.jsp

iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
June 20, 2018, 08:14:52 AM
 #22

I'll keep bumping this thread until there is some reaction on the case.

What can be done >
  • Theymos adds a feature that automatically converts homographs to Latin outside the Local section
  • We stop this madness together by listing the use of homographs in the rules

The spammers have already adapted. Instead of replacing all the letters with homographs, they now change only one letter, which is more difficult to detect (or at least they think so).

Here is an example from the last few days. The letter marked in yellow is a Cyrillic "а" (U+0430).
... image loading
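
A single swapped letter like the one above can be caught by comparing scripts rather than by eye. The sketch below is only an illustration, not a tool anyone in this thread uses: it takes the first word of each character's Unicode name as a rough script label, finds the dominant script in the text, and lists every word containing a letter from another script. The function names are made up for this example.

```python
import re
import unicodedata
from collections import Counter


def script_of(ch):
    """Rough script label: first word of the Unicode name, e.g. 'LATIN' or 'CYRILLIC'."""
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Character has no Unicode name
        return "UNKNOWN"


def off_script_words(text):
    """Return words containing at least one letter outside the text's dominant script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return []
    dominant = Counter(script_of(ch) for ch in letters).most_common(1)[0][0]
    flagged = []
    for word in re.findall(r"\w+", text):
        if any(ch.isalpha() and script_of(ch) != dominant for ch in word):
            flagged.append(word)
    return flagged


# "\u0430" is the Cyrillic small letter a, which renders like the Latin "a"
sample = "This is \u0430 very safe w\u0430llet"
print(off_script_words(sample))
```

Because the check is made against the dominant script of the whole text, it flags both a word with one swapped letter and a standalone Cyrillic "а". A post written entirely in Cyrillic would have Cyrillic as its dominant script and would not be flagged.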




Useful link to look at when checking for these: http://unicode.org/cldr/utility/confusables.jsp

Thanks man for helping, this is a good tool to see what type of characters to look for.

o_e_l_e_o
Sr. Member
****
Offline Offline

Activity: 364
Merit: 589



View Profile
June 20, 2018, 01:50:14 PM
 #23

Here's an even better tool for checking for homographs: https://www.textmagic.com/free-tools/unicode-detector

Just copy and paste the text in, and any Unicode character will be highlighted in red. If you suspect someone of using homograph plagiarism, you can go to their profile and copy in an entire page of recent posts to check them all in about 10 seconds.
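
For anyone who would rather run the check locally, here is a rough stand-in for that kind of highlighting. It assumes "Unicode character" means anything outside plain ASCII; the exact rules the linked tool applies are not stated here, so treat this as a sketch with made-up function names, not a copy of the tool.

```python
import unicodedata


def mark_non_ascii(text):
    """Replace every non-ASCII character with a visible [U+XXXX] marker."""
    return "".join(ch if ord(ch) < 128 else f"[U+{ord(ch):04X}]" for ch in text)


def non_ascii_report(text):
    """List (position, code point, Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) >= 128
    ]


post = "my w\u0430llet"  # contains a Cyrillic small letter a
print(mark_non_ascii(post))
for entry in non_ascii_report(post):
    print(entry)
```

Pasting a whole page of posts into `mark_non_ascii` makes the swapped letters stand out, though it also marks legitimate non-ASCII such as curly quotes or currency symbols.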


iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
June 25, 2018, 09:58:06 AM
 #24

Here's an even better tool for checking for homographs: https://www.textmagic.com/free-tools/unicode-detector

Just copy and paste the text in, and any Unicode character will be highlighted in red. If you suspect someone of using homograph plagiarism, you can go to their profile and copy in an entire page of recent posts to check them all in about 10 seconds.

This is a great tool, thanks.

The problem is not detecting them but reporting them. Let me explain.
Using homographs alone does not break any rule; the problem is that those who use this technique are trying to hide copy-pasting.
But to accuse someone of copy-pasting, you first have to convert the post back to normal Latin characters and then search for the original post. That takes time, even if you use Word's "replace all" option.

I just want to add homographs to the rules, because using them brings no benefit to the forum at all.
That way, you can report the homographs directly and skip the plagiarism part.
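
The "replace all" step can be scripted. The sketch below assumes a small hand-picked table of Cyrillic lookalikes, which is an illustration only and far from the full confusables list; it maps each one back to its Latin twin so the cleaned text can be pasted into a search for the original post.

```python
# Hypothetical, partial lookalike table: Cyrillic letter -> Latin twin
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
}
TABLE = str.maketrans(CONFUSABLES)


def to_latin(text):
    """Convert known lookalike characters back to Latin; everything else is untouched."""
    return text.translate(TABLE)


suspect = "w\u0430ll\u0435t"  # looks like "wallet" on screen
print(to_latin(suspect))
```

A fuller table could be built from the confusables data linked earlier in the thread.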



Bump, last 3 days history with more than 70 cases of using homographs:
https://i.imgur.com/F9np4wB.jpg

o_e_l_e_o
Sr. Member
****
Offline Offline

Activity: 364
Merit: 589



View Profile
June 25, 2018, 10:34:15 AM
 #25

-snip-

Are there any legitimate reasons for needing to use homographs in a forum post? I can think of none. Can anyone correct me?

If there are none, then the only reason to use them is to hide plagiarism, in which case they should be banned.


iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
July 01, 2018, 09:16:04 PM
 #26

Are there any legitimate reasons for needing to use homographs in a forum post? I can think of none. Can anyone correct me?

If there are none, then the only reason to use them is to hide plagiarism, in which case they should be banned.

No reasons whatsoever..

OK then, a question:
What does it take to add something to the rules, when it comes to something as abusive as homographs?



Another bump. I'll do this until there is a reaction on the case.
Already 22 cases today alone (one of them is actually a copy/paste report).

And a new bump, with another 24 cases today alone...


Quickseller
Copper Member
Legendary
*
Offline Offline

Activity: 1540
Merit: 1160


Hire BOUNTYPORTALS>Bounty management goo.gl/pSzJuA


View Profile WWW
July 01, 2018, 11:07:39 PM
 #27

I am not sure if there might ever be a legitimate need for the use of these symbols. If not, we may want to change SMF settings so that users cannot use these symbols, or that these symbols are automatically changed to the letter they are designed to look like.

iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
July 02, 2018, 10:14:01 AM
 #28

I am not sure if there might ever be a legitimate need for the use of these symbols. If not, we may want to change SMF settings so that users cannot use these symbols, or that these symbols are automatically changed to the letter they are designed to look like.

We are waiting for a reaction from theymos, and I know it can take a few more months. In the meantime I just want to report everyone using homographs, but I have to find another reason to report them, because using homographs is not against the rules... yet.

LoyceV
Legendary
*
Offline Offline

Activity: 1246
Merit: 1995


Let's make Bitcointalk great again!


View Profile WWW
July 02, 2018, 10:30:33 AM
 #29

The problem is not detecting them but reporting them. Let me explain.
Using homographs alone does not break any rule; the problem is that those who use this technique are trying to hide copy-pasting.
You could argue it's not English, which isn't allowed on the English boards, but it's a bit far-fetched.

Quote
I just want to add homographs to the rules, because using them brings no benefit to the forum at all.
That way, you can report the homographs directly and skip the plagiarism part.
Since there is no legitimate use for it, they should just be banned. It's clearly abuse.

I gave up looking for them though, because I can't see which accounts are banned already.

iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
July 05, 2018, 07:47:40 PM
 #30


The problem is not detecting them but reporting them. Let me explain.
Using homographs alone does not break any rule; the problem is that those who use this technique are trying to hide copy-pasting.
You could argue it's not English, which isn't allowed on the English boards, but it's a bit far-fetched.

Quote
I just want to add homographs to the rules, because using them brings no benefit to the forum at all.
That way, you can report the homographs directly and skip the plagiarism part.
Since there is no legitimate use for it, they should just be banned. It's clearly abuse.

I gave up looking for them though, because I can't see which accounts are banned already.

I have had around 50 homograph reports sitting unhandled for more than a month now, so I stopped reporting them. I don't have enough time to waste on adding them to my rule-breakers list, so I'll push this thread a bit until we have a clear solution to the homograph problem.
Actually, I don't see as many now, just 30-40 per day at most; before, they filled a few pages of search results.

Bump, another 20-ish for the past day. I'll start reporting them as soon as they are forbidden.


Bump. I have already reported a few from today; let's see what the mods will do.

LoyceV
Legendary
*
Offline Offline

Activity: 1246
Merit: 1995


Let's make Bitcointalk great again!


View Profile WWW
July 14, 2018, 01:42:07 PM
 #31

This topic requires a higher priority. While it used to be a relatively small problem, I'm afraid I've pushed massive spammers to using homograph attacks by getting many of their spambots banned.
See here for many examples of spambots who started using homograph attacks today, which they didn't do yesterday.

iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
July 18, 2018, 06:21:10 AM
 #32

This topic requires a higher priority. While it used to be a relatively small problem, I'm afraid I've pushed massive spammers to using homograph attacks by getting many of their spambots banned.
See here for many examples of spambots who started using homograph attacks today, which they didn't do yesterday.

I check for homographs almost every day, or at worst every other day. I can tell you that almost all of the "single character" homographs popping up in the search results (I only check posts from the past day or two) are made by newbies posting for the first time; I'm referring to the "a very" case I asked you to check a few days ago.

Let's be honest, all those are bots, it is known.
From time to time I spot regular homographs with many vowels replaced, but those are very rare now, and mostly posted by the "usual suspects".
I stopped reporting them as I got a bad report on one case - this one below, and I still have 49 hanging.

you say in your whitepaper and that the traditional digital advertising has a lot of issues right now which is not good I suppose... But do these issues really influence market in some bad way? I mean, I think it is alright in its current state and don't necessarily require radical changes. I still think your solution is great though, I just think I doesn’t worth it

So, I know this is getting bigger. It was big before too, but once I started reporting, things started to look better.
It seems the time spent reporting is a bit wasted, as the banned accounts are easily replaced by new ones, and everything is done with a script.

bump

iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
July 24, 2018, 12:54:34 PM
 #33

Just a few from the latest ones,
Please, tell me again that these homographs are not used to hide plagiarism/copy-pasting?

Big drawback is that LoyceV's script cannot search for homographs. I have to hunt them manually.

C'mon let's add them to the rules people!!

I've written a inquiry in telegram, waiting for your return!
Riveting job, cognitive idea!  Good luck guys.

I have sent a inquiry in telegram, waiting for your answer. Really good vision, noticing approach, unblemished website!

I have sent a inquiry in telegram, waiting for your reply.
Very nice project, percipient design, excellence project!

I've sent a request in telegram, waiting for your return!
Spotless project, very nice logo, cognitive design.

I've sent a inquiry in telegram, waiting for your answer!
Riveting business, aesthetic project, impeccable plan!


LoyceV
Legendary
*
Offline Offline

Activity: 1246
Merit: 1995


Let's make Bitcointalk great again!


View Profile WWW
July 24, 2018, 07:52:26 PM
 #34

Big drawback is that LoyceV's script cannot search for homographs. I have to hunt them manually.
I can search for them, but only for Newbies and it's more work.
If you want me to search for them, can you make a list of common words and show both the homograph version and the HTML equivalent?
Example:

wallet
Code:
w&#1072;ll&#1077;t

If one word has different variations, post them all. I'll continue downloading new patrol pages next week, I'm not online much at the moment. But feel free to use this time to collect them for me.
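
A list like that could be generated rather than typed by hand. The sketch below is only an illustration: the lookalike table is a small assumed sample, and the function names are made up. It builds every homograph variant of a word and prints each one as HTML numeric entities, which is the same form as the "wallet" example above.

```python
import html
from itertools import product

# Hypothetical, partial table: Latin letter -> Cyrillic lookalike
LOOKALIKES = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}


def to_html_entities(text):
    """Write every non-ASCII character as a decimal HTML entity (&#NNNN;)."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


def variants(word):
    """All spellings of `word` with one or more letters swapped for a lookalike."""
    options = [(ch, LOOKALIKES[ch]) if ch in LOOKALIKES else (ch,) for ch in word]
    # The first combination is the untouched Latin word, so it is skipped
    return ["".join(combo) for combo in product(*options)][1:]


for variant in variants("wallet"):
    print(to_html_entities(variant))

# html.unescape goes the other way, from the entity form back to the characters
assert html.unescape("w&#1072;ll&#1077;t") == "w\u0430ll\u0435t"
```

The number of variants doubles with every replaceable letter, so for long words it is probably more practical to search only for the single-letter swaps.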

iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
September 08, 2018, 08:27:17 PM
 #35

Finally the homographs problem is solved and I'm locking this thread.
I can reopen it if the situation gets worse.
Thanks, everyone, for the support. :)



150 new homographs are posted every day; I have been monitoring this for the past 6 days.
I think it's time to reopen this thread again.

I want to add them to the rules so we can report them directly without looking for plagiarism.

r1s2g3
Member
**
Offline Offline

Activity: 294
Merit: 80


View Profile
September 08, 2018, 09:35:11 PM
 #36

I guess that if support for the homograph character set (Cyrillic characters) is disabled on this forum, where the primary posting language is English, it will be very easy to detect the homographs. They will convert to gibberish, and everybody will understand that Cyrillic characters are being used to hide copy-pasting. I guess it will reduce the work for everybody.

lord munchkin
Jr. Member
*
Offline Offline

Activity: 48
Merit: 7


View Profile
September 08, 2018, 10:50:58 PM
 #37

I believe that the use of homographs definitely suggests the intention to hide plagiarism and so a form of ASCII binding could be set up to deny the use of homographs. This would prevent plagiarized character strings appearing identical to the original if a web scraper or bot was to view the strings, but the scraper should still be able to tell that 99% of the text is copied.

I'm not sure if Bitcointalk has a bot that checks for plagiarism, but I assume it does (or else the use of homographs here wouldn't make a difference), so another alternative would be to check any text that uses 'mimic characters' with more scrutiny. For example, a string without homographs could clear a plagiarism check if 70% of the text is original, but a text with homographs might only pass if 90% is original, or not pass at all, because the only logical reason for the use of homographs is to evade recognition.
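
To make the two-threshold idea concrete, here is a rough sketch. Everything in it is an assumption for illustration: nothing here confirms that Bitcointalk runs such a check, the lookalike table is a tiny sample, and `difflib.SequenceMatcher` simply stands in for whatever similarity measure a real checker would use. The 70% and 90% figures are the ones from the paragraph above.

```python
from difflib import SequenceMatcher

# Hypothetical, partial table: Cyrillic lookalike -> Latin letter
LOOKALIKES = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})


def passes_check(post, known_text, base=0.70, strict=0.90):
    """Return True if the post is original enough compared with a known text.

    Posts containing lookalike characters must reach the stricter threshold.
    """
    normalized = post.translate(LOOKALIKES)
    has_homographs = normalized != post
    # Similarity is measured after normalizing, so swapped letters no longer hide a copy
    similarity = SequenceMatcher(None, normalized.lower(), known_text.lower()).ratio()
    originality = 1.0 - similarity
    required = strict if has_homographs else base
    return originality >= required


known = "Very nice project, great design"
copied = "Very nice pr\u043eject, gre\u0430t design"  # same text with two swapped letters
print(passes_check(copied, known))  # False: identical once normalized
```

The important step is normalizing before comparing. Without it, each swapped letter counts as a difference and makes the copy look more original than it is.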

A third option would be to just report similar looking posts to moderators, but even here, the use of homographs to conceal plagiarism is negligible.

So the use of homographs logically means that the poster intends to hide plagiarism from bots, and the best course of action would probably be to implement some function in the web crawler that checks Bitcointalk and give it permission to delete all posts and ban all accounts that use homographs. The Armenian character set has characters identical to the Latin character set, eg. o, n, u, S and Լ. There isn't a Latin board here, so removing Latin characters from this forum could work.
TECSHARE
Legendary
*
Offline Offline

Activity: 2632
Merit: 1023


Welcome to Bitcoin Stalk


View Profile WWW
September 09, 2018, 10:07:03 AM
 #38

Instead of trying to make new rules to stop everything which will likely not be enforced any more heavily than it already is, use this discovery to build tools to find these people faster and report them.

iasenko
Sr. Member
****
Offline Offline

Activity: 322
Merit: 503


Vod's BTT Public Information Project - bpip.org


View Profile WWW
September 09, 2018, 12:17:20 PM
 #39

I guess that if support for the homograph character set (Cyrillic characters) is disabled on this forum, where the primary posting language is English, it will be very easy to detect the homographs. They will convert to gibberish, and everybody will understand that Cyrillic characters are being used to hide copy-pasting. I guess it will reduce the work for everybody.

No, the characters are converted automatically to Latin, but searching for them gets them listed; see here for an example:
https://i.imgur.com/j0CPpxA.mp4

I want to add them to the rules so they can be easily reported.

Instead of trying to make new rules to stop everything which will likely not be enforced any more heavily than it already is, use this discovery to build tools to find these people faster and report them.

If using homographs gets banned, then I can easily report them, and even list them automatically.
