Bitcoin Forum
November 16, 2018, 11:15:13 PM *
Author Topic: [FIXED] Homographs are fixed. Thank you theymos, again. See my report.  (Read 510 times)
iasenko
Sr. Member
****
Offline Offline

Activity: 378
Merit: 620


Bitcointalk Memes,post yours here> topic=4937275.0


View Profile WWW
August 27, 2018, 08:41:23 PM
 #1

The problem with the homographs is finally solved.

Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.

I've tested it and it's better than ever. See my conclusion here > https://bitcointalk.org/index.php?topic=4967143.msg44859677#msg44859677
It won't affect the legit Cyrillic posts outside the local section, with the only exception that if you copy/quote the text from a Cyrillic post the changed/fixed letters will remain in Latin. See an example in the conclusion.


So the homographs are back and more active than before.
In the last 24 hours alone I counted 82 cases of homograph attacks.

Homographs from the last 24 hours:

Had a look through the whitepaper and have to say it looks great. It explained a few things around masternodes as well.
~
... And many more, up to 82 cases.

What we need to do is to make one simple list of all the characters and theymos will fix it.

~
BTW, the main blocker for me taking action was that I never got around to compiling the table of homographic characters and their ASCII counterparts. If this crops up again, it'd be helpful if someone would compile a nice plaintext "<char_from> -> <char_to>" table.
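
A table in that "<char_from> -> <char_to>" format is easy to consume programmatically. Here is a minimal Python sketch, assuming one mapping per line; the function name `parse_homograph_table` and the sample rows are illustrative and not anything the forum software is known to use:

```python
def parse_homograph_table(text):
    """Parse lines of the form '<char_from> -> <char_to>' into a dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "->" not in line:
            continue  # skip blank lines and headings such as "Uppercase"
        src, dst = (part.strip() for part in line.split("->", 1))
        if src and dst:
            table[src] = dst
    return table


# Sample rows written with escapes so the source code points are unambiguous:
# Cyrillic a (U+0430), e (U+0435) and o (U+043E) mapped to their Latin twins.
SAMPLE = "\u0430 -> a\n\u0435 -> e\n\u043e -> o\n"
table = parse_homograph_table(SAMPLE)
```

Lines without an arrow are skipped, so headings between groups of mappings do no harm.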

Does anyone have the time to do it?
I'm on mobile and it's awful.



[UPDATE] We have some finished lists below; now it's just a matter of deciding which one to use.

I kind of liked the homographs, it's pretty easy to spot the plagiarism, maybe it will be a good idea to just color them in red or put a dot after each homograph so we can see them easily.

mdayonliner
Sr. Member
****
Offline Offline

Activity: 336
Merit: 298

Loading... & http://bit.ly/reLoaded_


View Profile
August 27, 2018, 08:49:59 PM
 #2

Anyone, who had the time to do it?
I'm on the mobile and it's awful.
Give me some resources to start with (URLs, keywords, and so on). I have time to make the list if it doesn't take longer than a day or two.

iasenko
Sr. Member
****
Offline Offline

Activity: 378
Merit: 620


Bitcointalk Memes,post yours here> topic=4937275.0


View Profile WWW
August 27, 2018, 08:55:38 PM
 #3

See here :

http://sites.psu.edu/symbolcodes/languages/europe/cyrillic/cyrillicchart/

I guess there are other alphabets that can be used too, but Cyrillic is what is mainly used in the homograph attacks here.

There are some more resources and info in the quoted thread :

Does someone have a table of these characters? I can automatically convert non-standard characters to ASCII.

This is the one I use:
http://sites.psu.edu/symbolcodes/languages/europe/cyrillic/cyrillicchart/
~
There are more characters than just the main Cyrillic ones (see the link I posted), for example:
CYRILLIC CAPITAL LETTER DZE   S   &‌#1029;   &‌#x0405;
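
The two numeric references in that row are the decimal and hexadecimal forms of the same code point. A small Python sketch using only the standard library shows the relationship, and shows that the character is not the Latin S it resembles (the variable names are illustrative):

```python
import html
import unicodedata

dze = html.unescape("&#1029;")     # decimal numeric character reference
same = html.unescape("&#x0405;")   # hexadecimal form of the same code point

name = unicodedata.name(dze)       # official Unicode character name
is_same = dze == same == "\u0405"  # both references decode to U+0405
is_latin_s = dze == "S"            # Latin S is U+0053, a different code point
```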



mdayonliner
Sr. Member
****
Offline Offline

Activity: 336
Merit: 298

Loading... & http://bit.ly/reLoaded_


View Profile
August 27, 2018, 09:17:33 PM
 #4

Reference: http://jrgraphix.net/r/Unicode/0400-04FF

Version one:
Ѐ = 0400
Ё = 0401
Ђ = 0402
Ѓ = 0403
Є = 0404
S = 0405
I = 0406
Ї = 0407
J = 0408
Љ = 0409
Њ = 040a
Ћ = 040b
Ќ = 040c
Ѝ = 040d
Ў = 040e
Џ = 040f
A = 0410
Б = 0411
B = 0412
Г = 0413
Д = 0414
E = 0415
Ж = 0416
З = 0417
И = 0418
Й = 0419
К = 041a
Л = 041b
M = 041c
H = 041d
O = 041e
П = 041f
P = 0420
C = 0421
T = 0422
У = 0423
Ф = 0424
X = 0425
Ц = 0426
Ч = 0427
Ш = 0428
Щ = 0429
Ъ = 042a
Ы = 042b
Ь = 042c
Э = 042d
Ю = 042e
Я = 042f
a = 0430
б = 0431
в = 0432
г = 0433
д = 0434
e = 0435
ж = 0436
з = 0437
и = 0438
й = 0439
к = 043a
л = 043b
м = 043c
н = 043d
o = 043e
п = 043f
p = 0440
c = 0441
т = 0442
y = 0443
ф = 0444
x = 0445
ц = 0446
ч = 0447
ш = 0448
щ = 0449
ъ = 044a
ы = 044b
ь = 044c
э = 044d
ю = 044e
я = 044f
ѐ = 0450
ё = 0451
ђ = 0452
ѓ = 0453
є = 0454
s = 0455
i = 0456
ї = 0457
j = 0458
љ = 0459
њ = 045a
ћ = 045b
ќ = 045c
ѝ = 045d
ў = 045e
џ = 045f
Ѡ = 0460
ѡ = 0461
Ѣ = 0462
ѣ = 0463
Ѥ = 0464
ѥ = 0465
Ѧ = 0466
ѧ = 0467
Ѩ = 0468
ѩ = 0469
Ѫ = 046a
ѫ = 046b
Ѭ = 046c
ѭ = 046d
Ѯ = 046e
ѯ = 046f
Ѱ = 0470
ѱ = 0471
Ѳ = 0472
ѳ = 0473
Ѵ = 0474
ѵ = 0475
Ѷ = 0476
ѷ = 0477
Ѹ = 0478
ѹ = 0479
Ѻ = 047a
ѻ = 047b
Ѽ = 047c
ѽ = 047d
Ѿ = 047e
ѿ = 047f
Ҁ = 0480
ҁ = 0481
҂ = 0482
҃ = 0483
҄ = 0484
҅ = 0485
҆ = 0486
҇ = 0487
҈ = 0488
҉ = 0489
Ҋ = 048a
ҋ = 048b
Ҍ = 048c
ҍ = 048d
Ҏ = 048e
ҏ = 048f
Ґ = 0490
ґ = 0491
Ғ = 0492
ғ = 0493
Ҕ = 0494
ҕ = 0495
Җ = 0496
җ = 0497
Ҙ = 0498
ҙ = 0499
Қ = 049a
қ = 049b
Ҝ = 049c
ҝ = 049d
Ҟ = 049e
ҟ = 049f
Ҡ = 04a0
ҡ = 04a1
Ң = 04a2
ң = 04a3
Ҥ = 04a4
ҥ = 04a5
Ҧ = 04a6
ҧ = 04a7
Ҩ = 04a8
ҩ = 04a9
Ҫ = 04aa
ҫ = 04ab
Ҭ = 04ac
ҭ = 04ad
Y = 04ae
ү = 04af
Ұ = 04b0
ұ = 04b1
Ҳ = 04b2
ҳ = 04b3
Ҵ = 04b4
ҵ = 04b5
Ҷ = 04b6
ҷ = 04b7
Ҹ = 04b8
ҹ = 04b9
Һ = 04ba
h = 04bb
Ҽ = 04bc
ҽ = 04bd
Ҿ = 04be
ҿ = 04bf
Ӏ = 04c0
Ӂ = 04c1
ӂ = 04c2
Ӄ = 04c3
ӄ = 04c4
Ӆ = 04c5
ӆ = 04c6
Ӈ = 04c7
ӈ = 04c8
Ӊ = 04c9
ӊ = 04ca
Ӌ = 04cb
ӌ = 04cc
Ӎ = 04cd
ӎ = 04ce
ӏ = 04cf
Ӑ = 04d0
ӑ = 04d1
Ӓ = 04d2
ӓ = 04d3
Ӕ = 04d4
ӕ = 04d5
Ӗ = 04d6
ӗ = 04d7
Ә = 04d8
ә = 04d9
Ӛ = 04da
ӛ = 04db
Ӝ = 04dc
ӝ = 04dd
Ӟ = 04de
ӟ = 04df
Ӡ = 04e0
ӡ = 04e1
Ӣ = 04e2
ӣ = 04e3
Ӥ = 04e4
ӥ = 04e5
Ӧ = 04e6
ӧ = 04e7
Ө = 04e8
ө = 04e9
Ӫ = 04ea
ӫ = 04eb
Ӭ = 04ec
ӭ = 04ed
Ӯ = 04ee
ӯ = 04ef
Ӱ = 04f0
ӱ = 04f1
Ӳ = 04f2
ӳ = 04f3
Ӵ = 04f4
ӵ = 04f5
Ӷ = 04f6
ӷ = 04f7
Ӹ = 04f8
ӹ = 04f9
Ӻ = 04fa
ӻ = 04fb
Ӽ = 04fc
ӽ = 04fd
Ӿ = 04fe
ӿ = 04ff

without the equal sign and the new line symbol:
Code:
Ѐ 0400
Ё 0401
Ђ 0402
Ѓ 0403
Є 0404
S 0405
I 0406
Ї 0407
J 0408
Љ 0409
Њ 040a
Ћ 040b
Ќ 040c
Ѝ 040d
Ў 040e
Џ 040f
A 0410
Б 0411
B 0412
Г 0413
Д 0414
E 0415
Ж 0416
З 0417
И 0418
Й 0419
К 041a
Л 041b
M 041c
H 041d
O 041e
П 041f
P 0420
C 0421
T 0422
У 0423
Ф 0424
X 0425
Ц 0426
Ч 0427
Ш 0428
Щ 0429
Ъ 042a
Ы 042b
Ь 042c
Э 042d
Ю 042e
Я 042f
a 0430
б 0431
в 0432
г 0433
д 0434
e 0435
ж 0436
з 0437
и 0438
й 0439
к 043a
л 043b
м 043c
н 043d
o 043e
п 043f
p 0440
c 0441
т 0442
y 0443
ф 0444
x 0445
ц 0446
ч 0447
ш 0448
щ 0449
ъ 044a
ы 044b
ь 044c
э 044d
ю 044e
я 044f
ѐ 0450
ё 0451
ђ 0452
ѓ 0453
є 0454
s 0455
i 0456
ї 0457
j 0458
љ 0459
њ 045a
ћ 045b
ќ 045c
ѝ 045d
ў 045e
џ 045f
Ѡ 0460
ѡ 0461
Ѣ 0462
ѣ 0463
Ѥ 0464
ѥ 0465
Ѧ 0466
ѧ 0467
Ѩ 0468
ѩ 0469
Ѫ 046a
ѫ 046b
Ѭ 046c
ѭ 046d
Ѯ 046e
ѯ 046f
Ѱ 0470
ѱ 0471
Ѳ 0472
ѳ 0473
Ѵ 0474
ѵ 0475
Ѷ 0476
ѷ 0477
Ѹ 0478
ѹ 0479
Ѻ 047a
ѻ 047b
Ѽ 047c
ѽ 047d
Ѿ 047e
ѿ 047f
Ҁ 0480
ҁ 0481
҂ 0482
҃ 0483
҄ 0484
҅ 0485
҆ 0486
҇ 0487
҈ 0488
҉ 0489
Ҋ 048a
ҋ 048b
Ҍ 048c
ҍ 048d
Ҏ 048e
ҏ 048f
Ґ 0490
ґ 0491
Ғ 0492
ғ 0493
Ҕ 0494
ҕ 0495
Җ 0496
җ 0497
Ҙ 0498
ҙ 0499
Қ 049a
қ 049b
Ҝ 049c
ҝ 049d
Ҟ 049e
ҟ 049f
Ҡ 04a0
ҡ 04a1
Ң 04a2
ң 04a3
Ҥ 04a4
ҥ 04a5
Ҧ 04a6
ҧ 04a7
Ҩ 04a8
ҩ 04a9
Ҫ 04aa
ҫ 04ab
Ҭ 04ac
ҭ 04ad
Y 04ae
ү 04af
Ұ 04b0
ұ 04b1
Ҳ 04b2
ҳ 04b3
Ҵ 04b4
ҵ 04b5
Ҷ 04b6
ҷ 04b7
Ҹ 04b8
ҹ 04b9
Һ 04ba
h 04bb
Ҽ 04bc
ҽ 04bd
Ҿ 04be
ҿ 04bf
Ӏ 04c0
Ӂ 04c1
ӂ 04c2
Ӄ 04c3
ӄ 04c4
Ӆ 04c5
ӆ 04c6
Ӈ 04c7
ӈ 04c8
Ӊ 04c9
ӊ 04ca
Ӌ 04cb
ӌ 04cc
Ӎ 04cd
ӎ 04ce
ӏ 04cf
Ӑ 04d0
ӑ 04d1
Ӓ 04d2
ӓ 04d3
Ӕ 04d4
ӕ 04d5
Ӗ 04d6
ӗ 04d7
Ә 04d8
ә 04d9
Ӛ 04da
ӛ 04db
Ӝ 04dc
ӝ 04dd
Ӟ 04de
ӟ 04df
Ӡ 04e0
ӡ 04e1
Ӣ 04e2
ӣ 04e3
Ӥ 04e4
ӥ 04e5
Ӧ 04e6
ӧ 04e7
Ө 04e8
ө 04e9
Ӫ 04ea
ӫ 04eb
Ӭ 04ec
ӭ 04ed
Ӯ 04ee
ӯ 04ef
Ӱ 04f0
ӱ 04f1
Ӳ 04f2
ӳ 04f3
Ӵ 04f4
ӵ 04f5
Ӷ 04f6
ӷ 04f7
Ӹ 04f8
ӹ 04f9
Ӻ 04fa
ӻ 04fb
Ӽ 04fc
ӽ 04fd
Ӿ 04fe
ӿ 04ff


Version two:

&#1024; = 0400
&#1025; = 0401
&#1026; = 0402
&#1027; = 0403
&#1028; = 0404
&#1029; = 0405
&#1030; = 0406
&#1031; = 0407
&#1032; = 0408
&#1033; = 0409
&#1034; = 040a
&#1035; = 040b
&#1036; = 040c
&#1037; = 040d
&#1038; = 040e
&#1039; = 040f
&#1040; = 0410
&#1041; = 0411
&#1042; = 0412
&#1043; = 0413
&#1044; = 0414
&#1045; = 0415
&#1046; = 0416
&#1047; = 0417
&#1048; = 0418
&#1049; = 0419
&#1050; = 041a
&#1051; = 041b
&#1052; = 041c
&#1053; = 041d
&#1054; = 041e
&#1055; = 041f
&#1056; = 0420
&#1057; = 0421
&#1058; = 0422
&#1059; = 0423
&#1060; = 0424
&#1061; = 0425
&#1062; = 0426
&#1063; = 0427
&#1064; = 0428
&#1065; = 0429
&#1066; = 042a
&#1067; = 042b
&#1068; = 042c
&#1069; = 042d
&#1070; = 042e
&#1071; = 042f
&#1072; = 0430
&#1073; = 0431
&#1074; = 0432
&#1075; = 0433
&#1076; = 0434
&#1077; = 0435
&#1078; = 0436
&#1079; = 0437
&#1080; = 0438
&#1081; = 0439
&#1082; = 043a
&#1083; = 043b
&#1084; = 043c
&#1085; = 043d
&#1086; = 043e
&#1087; = 043f
&#1088; = 0440
&#1089; = 0441
&#1090; = 0442
&#1091; = 0443
&#1092; = 0444
&#1093; = 0445
&#1094; = 0446
&#1095; = 0447
&#1096; = 0448
&#1097; = 0449
&#1098; = 044a
&#1099; = 044b
&#1100; = 044c
&#1101; = 044d
&#1102; = 044e
&#1103; = 044f
&#1104; = 0450
&#1105; = 0451
&#1106; = 0452
&#1107; = 0453
&#1108; = 0454
&#1109; = 0455
&#1110; = 0456
&#1111; = 0457
&#1112; = 0458
&#1113; = 0459
&#1114; = 045a
&#1115; = 045b
&#1116; = 045c
&#1117; = 045d
&#1118; = 045e
&#1119; = 045f
&#1120; = 0460
&#1121; = 0461
&#1122; = 0462
&#1123; = 0463
&#1124; = 0464
&#1125; = 0465
&#1126; = 0466
&#1127; = 0467
&#1128; = 0468
&#1129; = 0469
&#1130; = 046a
&#1131; = 046b
&#1132; = 046c
&#1133; = 046d
&#1134; = 046e
&#1135; = 046f
&#1136; = 0470
&#1137; = 0471
&#1138; = 0472
&#1139; = 0473
&#1140; = 0474
&#1141; = 0475
&#1142; = 0476
&#1143; = 0477
&#1144; = 0478
&#1145; = 0479
&#1146; = 047a
&#1147; = 047b
&#1148; = 047c
&#1149; = 047d
&#1150; = 047e
&#1151; = 047f
&#1152; = 0480
&#1153; = 0481
&#1154; = 0482
&#1155; = 0483
&#1156; = 0484
&#1157; = 0485
&#1158; = 0486
&#1159; = 0487
&#1160; = 0488
&#1161; = 0489
&#1162; = 048a
&#1163; = 048b
&#1164; = 048c
&#1165; = 048d
&#1166; = 048e
&#1167; = 048f
&#1168; = 0490
&#1169; = 0491
&#1170; = 0492
&#1171; = 0493
&#1172; = 0494
&#1173; = 0495
&#1174; = 0496
&#1175; = 0497
&#1176; = 0498
&#1177; = 0499
&#1178; = 049a
&#1179; = 049b
&#1180; = 049c
&#1181; = 049d
&#1182; = 049e
&#1183; = 049f
&#1184; = 04a0
&#1185; = 04a1
&#1186; = 04a2
&#1187; = 04a3
&#1188; = 04a4
&#1189; = 04a5
&#1190; = 04a6
&#1191; = 04a7
&#1192; = 04a8
&#1193; = 04a9
&#1194; = 04aa
&#1195; = 04ab
&#1196; = 04ac
&#1197; = 04ad
&#1198; = 04ae
&#1199; = 04af
&#1200; = 04b0
&#1201; = 04b1
&#1202; = 04b2
&#1203; = 04b3
&#1204; = 04b4
&#1205; = 04b5
&#1206; = 04b6
&#1207; = 04b7
&#1208; = 04b8
&#1209; = 04b9
&#1210; = 04ba
&#1211; = 04bb
&#1212; = 04bc
&#1213; = 04bd
&#1214; = 04be
&#1215; = 04bf
&#1216; = 04c0
&#1217; = 04c1
&#1218; = 04c2
&#1219; = 04c3
&#1220; = 04c4
&#1221; = 04c5
&#1222; = 04c6
&#1223; = 04c7
&#1224; = 04c8
&#1225; = 04c9
&#1226; = 04ca
&#1227; = 04cb
&#1228; = 04cc
&#1229; = 04cd
&#1230; = 04ce
&#1231; = 04cf
&#1232; = 04d0
&#1233; = 04d1
&#1234; = 04d2
&#1235; = 04d3
&#1236; = 04d4
&#1237; = 04d5
&#1238; = 04d6
&#1239; = 04d7
&#1240; = 04d8
&#1241; = 04d9
&#1242; = 04da
&#1243; = 04db
&#1244; = 04dc
&#1245; = 04dd
&#1246; = 04de
&#1247; = 04df
&#1248; = 04e0
&#1249; = 04e1
&#1250; = 04e2
&#1251; = 04e3
&#1252; = 04e4
&#1253; = 04e5
&#1254; = 04e6
&#1255; = 04e7
&#1256; = 04e8
&#1257; = 04e9
&#1258; = 04ea
&#1259; = 04eb
&#1260; = 04ec
&#1261; = 04ed
&#1262; = 04ee
&#1263; = 04ef
&#1264; = 04f0
&#1265; = 04f1
&#1266; = 04f2
&#1267; = 04f3
&#1268; = 04f4
&#1269; = 04f5
&#1270; = 04f6
&#1271; = 04f7
&#1272; = 04f8
&#1273; = 04f9
&#1274; = 04fa
&#1275; = 04fb
&#1276; = 04fc
&#1277; = 04fd
&#1278; = 04fe
&#1279; = 04ff


without the equal sign and the new line symbol:
Code:
&#1024; 0400
&#1025; 0401
&#1026; 0402
&#1027; 0403
&#1028; 0404
&#1029; 0405
&#1030; 0406
&#1031; 0407
&#1032; 0408
&#1033; 0409
&#1034; 040a
&#1035; 040b
&#1036; 040c
&#1037; 040d
&#1038; 040e
&#1039; 040f
&#1040; 0410
&#1041; 0411
&#1042; 0412
&#1043; 0413
&#1044; 0414
&#1045; 0415
&#1046; 0416
&#1047; 0417
&#1048; 0418
&#1049; 0419
&#1050; 041a
&#1051; 041b
&#1052; 041c
&#1053; 041d
&#1054; 041e
&#1055; 041f
&#1056; 0420
&#1057; 0421
&#1058; 0422
&#1059; 0423
&#1060; 0424
&#1061; 0425
&#1062; 0426
&#1063; 0427
&#1064; 0428
&#1065; 0429
&#1066; 042a
&#1067; 042b
&#1068; 042c
&#1069; 042d
&#1070; 042e
&#1071; 042f
&#1072; 0430
&#1073; 0431
&#1074; 0432
&#1075; 0433
&#1076; 0434
&#1077; 0435
&#1078; 0436
&#1079; 0437
&#1080; 0438
&#1081; 0439
&#1082; 043a
&#1083; 043b
&#1084; 043c
&#1085; 043d
&#1086; 043e
&#1087; 043f
&#1088; 0440
&#1089; 0441
&#1090; 0442
&#1091; 0443
&#1092; 0444
&#1093; 0445
&#1094; 0446
&#1095; 0447
&#1096; 0448
&#1097; 0449
&#1098; 044a
&#1099; 044b
&#1100; 044c
&#1101; 044d
&#1102; 044e
&#1103; 044f
&#1104; 0450
&#1105; 0451
&#1106; 0452
&#1107; 0453
&#1108; 0454
&#1109; 0455
&#1110; 0456
&#1111; 0457
&#1112; 0458
&#1113; 0459
&#1114; 045a
&#1115; 045b
&#1116; 045c
&#1117; 045d
&#1118; 045e
&#1119; 045f
&#1120; 0460
&#1121; 0461
&#1122; 0462
&#1123; 0463
&#1124; 0464
&#1125; 0465
&#1126; 0466
&#1127; 0467
&#1128; 0468
&#1129; 0469
&#1130; 046a
&#1131; 046b
&#1132; 046c
&#1133; 046d
&#1134; 046e
&#1135; 046f
&#1136; 0470
&#1137; 0471
&#1138; 0472
&#1139; 0473
&#1140; 0474
&#1141; 0475
&#1142; 0476
&#1143; 0477
&#1144; 0478
&#1145; 0479
&#1146; 047a
&#1147; 047b
&#1148; 047c
&#1149; 047d
&#1150; 047e
&#1151; 047f
&#1152; 0480
&#1153; 0481
&#1154; 0482
&#1155; 0483
&#1156; 0484
&#1157; 0485
&#1158; 0486
&#1159; 0487
&#1160; 0488
&#1161; 0489
&#1162; 048a
&#1163; 048b
&#1164; 048c
&#1165; 048d
&#1166; 048e
&#1167; 048f
&#1168; 0490
&#1169; 0491
&#1170; 0492
&#1171; 0493
&#1172; 0494
&#1173; 0495
&#1174; 0496
&#1175; 0497
&#1176; 0498
&#1177; 0499
&#1178; 049a
&#1179; 049b
&#1180; 049c
&#1181; 049d
&#1182; 049e
&#1183; 049f
&#1184; 04a0
&#1185; 04a1
&#1186; 04a2
&#1187; 04a3
&#1188; 04a4
&#1189; 04a5
&#1190; 04a6
&#1191; 04a7
&#1192; 04a8
&#1193; 04a9
&#1194; 04aa
&#1195; 04ab
&#1196; 04ac
&#1197; 04ad
&#1198; 04ae
&#1199; 04af
&#1200; 04b0
&#1201; 04b1
&#1202; 04b2
&#1203; 04b3
&#1204; 04b4
&#1205; 04b5
&#1206; 04b6
&#1207; 04b7
&#1208; 04b8
&#1209; 04b9
&#1210; 04ba
&#1211; 04bb
&#1212; 04bc
&#1213; 04bd
&#1214; 04be
&#1215; 04bf
&#1216; 04c0
&#1217; 04c1
&#1218; 04c2
&#1219; 04c3
&#1220; 04c4
&#1221; 04c5
&#1222; 04c6
&#1223; 04c7
&#1224; 04c8
&#1225; 04c9
&#1226; 04ca
&#1227; 04cb
&#1228; 04cc
&#1229; 04cd
&#1230; 04ce
&#1231; 04cf
&#1232; 04d0
&#1233; 04d1
&#1234; 04d2
&#1235; 04d3
&#1236; 04d4
&#1237; 04d5
&#1238; 04d6
&#1239; 04d7
&#1240; 04d8
&#1241; 04d9
&#1242; 04da
&#1243; 04db
&#1244; 04dc
&#1245; 04dd
&#1246; 04de
&#1247; 04df
&#1248; 04e0
&#1249; 04e1
&#1250; 04e2
&#1251; 04e3
&#1252; 04e4
&#1253; 04e5
&#1254; 04e6
&#1255; 04e7
&#1256; 04e8
&#1257; 04e9
&#1258; 04ea
&#1259; 04eb
&#1260; 04ec
&#1261; 04ed
&#1262; 04ee
&#1263; 04ef
&#1264; 04f0
&#1265; 04f1
&#1266; 04f2
&#1267; 04f3
&#1268; 04f4
&#1269; 04f5
&#1270; 04f6
&#1271; 04f7
&#1272; 04f8
&#1273; 04f9
&#1274; 04fa
&#1275; 04fb
&#1276; 04fc
&#1277; 04fd
&#1278; 04fe
&#1279; 04ff


Let me know how much it helps.
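
For anyone who wants to regenerate these lists instead of copying them, here is a short Python sketch that rebuilds both versions for the 0400-04FF block. The function name `cyrillic_block_rows` is made up for illustration. Generating the rows from code points also sidesteps the issue that some look-alike entries in Version one (for example the 0405 row) now show up as Latin letters, presumably because the forum rewrites them at display time.

```python
def cyrillic_block_rows(start=0x0400, end=0x04FF):
    """Yield (character, decimal entity, hex code) for each code point."""
    for cp in range(start, end + 1):
        yield chr(cp), f"&#{cp};", f"{cp:04x}"


rows = list(cyrillic_block_rows())

# "Version one": the literal character followed by its hex code point.
version_one = [f"{ch} = {hexcode}" for ch, _, hexcode in rows]

# "Version two": the decimal HTML entity followed by the hex code point.
version_two = [f"{entity} = {hexcode}" for _, entity, hexcode in rows]
```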

nkampala
Member
**
Offline Offline

Activity: 68
Merit: 58


View Profile
August 27, 2018, 10:15:21 PM
Merited by theymos (10), Quickseller (1), LoyceV (1), iasenko (1)
 #5

I think I got most of them. They come from the Cyrillic, Greek, and Armenian alphabets. Info from this wikipedia page: https://en.wikipedia.org/wiki/IDN_homograph_attack

Homograph Character -> Regular Latin Character

Uppercase

Code:
A -> A
A -> A
B -> B
B -> B
C -> C
E -> E
E -> E
Ғ -> F
G -> G
H -> H
H -> H
I -> I
I -> I
J -> J
К -> K
K -> K
Լ -> L
M -> M
M -> M
N -> N
O -> O
O -> O
O -> O
P -> P
P -> P
S -> S
S -> S
T -> T
T -> T
U -> U
X -> X
X -> X
Y -> Y
Y -> Y
Z -> Z

Lowercase

Code:
a -> a
c -> c
d -> d
e -> e
ε -> e
g -> g
h -> h
h -> h
h -> h
i -> i
ι -> i
j -> j
κ -> k
Ӏ -> l
յ -> j
n -> n
η -> n
n -> n
o -> o
o -> o
o -> o
o -> o
p -> p
ρ -> p
q -> q
զ -> q
s -> s
τ -> t
υ -> u
u -> u
u -> u
ѵ -> v
ν -> v
w -> w
ω -> w
x -> x
χ -> x
y -> y
γ -> y

Accents & Other Marks

Code:
Ӓ -> Ä
Ё -> Ë
Ї -> Ï
Ӧ -> Ö
ӓ -> ä
ё -> ë
ї -> ï
ӧ -> ö

Numbers

Code:
Ձ -> 2
շ -> 2
З -> 3
Յ -> 3
Ч -> 4
б -> 6

CJK Compatibility (not used as much because it doesn't look as similar, but might as well add it to the list anyway)
https://en.wikipedia.org/wiki/CJK_Compatibility

Code:
㍲ -> da
㍳ -> AU
㍴ -> bar
㍶ -> pc
㍷ -> dm
㍺ -> IU
㎅ -> KB
㎆ -> MB
㎇ -> GB
㎎ -> mg
㎏ -> kg
㎙ -> fm
㎚ -> nm
㎜ -> mm
㎝ -> cm
㎞ -> km
㎩ -> Pa
㎭ -> rad
㎰ -> ps
㎱ -> ns
㎳ -> ms
㎹ -> MV
㎿ -> MW
㏄ -> cc
㏅ -> cd
㏊ -> ha
㏌ -> in
㏐ -> lm
㏑ -> ln
㏒ -> log
㏓ -> lx
㏕ -> mil
㏖ -> mol
㏚ -> PR
㏛ -> sr
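
The CJK Compatibility entries are a special case: Unicode itself defines compatibility decompositions for these squared abbreviations, so NFKC normalization should expand them without a hand-made table. A minimal Python sketch follows (the helper name `fold_compat` is illustrative). Note that NFKC does not touch the Cyrillic, Greek, or Armenian look-alikes, which still need an explicit mapping like the lists above.

```python
import unicodedata


def fold_compat(text):
    """Apply NFKC normalization, which expands compatibility characters."""
    return unicodedata.normalize("NFKC", text)


kg = fold_compat("\u338f")     # U+338F SQUARE KG
km = fold_compat("\u339e")     # U+339E SQUARE KM
cyr_a = fold_compat("\u0430")  # CYRILLIC SMALL LETTER A has no decomposition
```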
Xal0lex
Staff
Sr. Member
****
Offline Offline

Activity: 448
Merit: 308


View Profile
August 27, 2018, 11:05:39 PM
 #6

Characters that look the same in Latin and Cyrillic:

Code:
a, A, c, C, e, E, p, P, o, O, y, x, X, B, H, K, T, M


Latin

a  &#97;  -->
A  &#65;  -->
c  &#99;  -->
C  &#67;  -->
e  &#101; -->
E  &#69;  -->
K  &#75;  -->
p  &#112; -->
P  &#80;  -->
o  &#111; -->
O  &#79;  -->
y  &#121; -->
x  &#120; -->
X  &#88;  -->
B  &#66;  -->
H  &#72;  -->
T  &#84;  -->
M  &#77;  -->
....................
Cyrillic

a  &‌#1072;
A  &‌#1040;
c  &‌#1089;
C  &‌#1057;
e  &‌#1077;
E  &‌#1045;
К  &‌#1050;
p  &‌#1088;
P  &‌#1056;
o  &‌#1086;
O  &‌#1054;
y  &‌#1091;
x  &‌#1093;
X  &‌#1061;
B  &‌#1042;
H  &‌#1053;
T  &‌#1058;
M  &‌#1052;
....................
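
The two lists above pair up one to one, so they can be turned directly into a translation table. A minimal Python sketch, using the decimal code points exactly as listed; the names `PAIRS`, `CYR_TO_LATIN` and `to_latin` are illustrative:

```python
# (Cyrillic code point, Latin code point) pairs taken from the two lists above.
PAIRS = [
    (1072, 97), (1040, 65), (1089, 99), (1057, 67), (1077, 101), (1045, 69),
    (1050, 75), (1088, 112), (1056, 80), (1086, 111), (1054, 79), (1091, 121),
    (1093, 120), (1061, 88), (1042, 66), (1053, 72), (1058, 84), (1052, 77),
]

# str.translate accepts a dict that maps ordinals to ordinals.
CYR_TO_LATIN = dict(PAIRS)


def to_latin(text):
    """Replace the listed Cyrillic look-alikes with their Latin twins."""
    return text.translate(CYR_TO_LATIN)


spoofed = "gre\u0430t proj\u0435ct"  # contains Cyrillic a and e
fixed = to_latin(spoofed)
```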






Quickseller
Copper Member
Legendary
*
Offline Offline

Activity: 1596
Merit: 1206

in 2 min-groin injury, dildo on field, & 6-9 score


View Profile WWW
August 28, 2018, 02:52:03 AM
 #7

Some of these are "legit" symbols in various languages, correct? For example Russian and I believe Hebrew use different symbols than English does.

Maybe someone can compile a list of symbols used in each language in the local section (along with English), and those symbols can be all that is allowed to be used.
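
A rough Python sketch of that allow-list idea. The board names and code point ranges below are assumptions chosen for illustration, not actual forum settings, and a real list would need to be wider (emoji, for instance, would be flagged by these ranges):

```python
# Hypothetical per-board allow lists, expressed as inclusive code point ranges.
ALLOWED_RANGES = {
    "english": [(0x0000, 0x024F), (0x2000, 0x206F)],  # Latin + punctuation
    "russian": [(0x0000, 0x007F), (0x0400, 0x04FF), (0x2000, 0x206F)],
}


def disallowed_chars(text, board):
    """Return the characters in text that fall outside the board's ranges."""
    ranges = ALLOWED_RANGES[board]
    return [
        ch for ch in text
        if not any(lo <= ord(ch) <= hi for lo, hi in ranges)
    ]


flagged = disallowed_chars("a v\u0435ry good project", "english")
```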

Edit : 🔑

3PjXm2XYDKLV5mN3oiKzNTyVvSkqP3ujeq <-- tipping address
Piggy
Hero Member
*****
Offline Offline

Activity: 546
Merit: 995



View Profile WWW
August 28, 2018, 04:32:51 AM
 #8

If I understood correctly, they are mixing letters from different alphabets. This could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then checking the message manually

I mean, I don't see them getting very far with this trick.



     ▄██    ▐███████▄▄▄       ▄▄█████▄▄      ▄██▄      ▐██▄    ▒▓▓▄      ▄▓▓▒
     ███    ▐██▌▀▀▀▀▀███▄    ███▀▀▀▀▀███▄    ████▄     ▐██▌  ▐▓▄ ▀▓▓▄  ▄▓▓▀ ▄▓▌
     ███    ▐██▌      ███   ███▌      ███▌   ██████    ▐██▌   ▀▓▓▄ ▀▓▓▓▓▀ ▄▓▓▀
     ███    ▐██▌    ▄████  ▐███▌      ▐██▌   ███ ███▄  ▐██▌     ▀▓▓▄ ▀▀ ▄▓▓▀
     ███    ▐█████████▀▀   ▐███▌      ▐██▌   ███  ▀███ ▐██▌      ▓▓▓    ▓▓▓
     ███    ▐██▌   ▀███     ███▌      ███▌   ███    ██████▌   ▄▓▓▀ ▄▓▓▓▓▄ ▓▓▓▄
     ███    ▐██▌     ███    ▀███▄▄▄▄▄████    ███     ▀████▌  ▐▓▀ ▄▓▓▀  ▀▓▓▄ ▀▓▌
     ███    ▐██▌      ███     ▀▀██████▀▀     ███       ███▌    ▄▓▓▀      ▀▓▓▄
                  ▄▄▄█████▄▄▄▄
             ▄▄█▓▓▓▓▓█▀▀▀▀█▓▓▓▓▓█▄
           ▄▓▓▓█▀▀            ▀▀█▓▓█▄
         ▓▓▓█▀                    ▀▓▓█▄
       ▄▓▓▓▀                        ▀▓▓█
      ▄▓▓█                            █▓▓
      ▓▓▓                    ▄██▄     ▐▓▓█
     ▓▓▓                   ▄█▓▓▀       ▐▓▓▌
     ▓▓▓                 ▄█▓▓▀          ▓▓▓
     ▓▓▓       ▓▓▓▄    ▓▓▓▓▀            ▓▓▓
     ▓▓▓        ▀▓▓▓▄█▓▓▓▀             ▐▓▓▌
     ▀▓▓▓         ▀█▓▓█▀               █▓▓
      ▓▓▓▄                            ▓▓▓▌
       ▓▓▓█                         ▄█▓▓▀
        ▀▓▓█▄                     ▄▓▓▓█▀
          ▀▓▓▓█▄               ▄▄█▓▓█▀
            ▀▀█▓▓▓█▄▄▄▄▄▄▄▄▄▄█▓▓▓█▀
                ▀▀██▓▓▓▓▓▓▓███▀▀
xtraelv
Hero Member
*****
Offline Offline

Activity: 518
Merit: 898



View Profile
August 28, 2018, 06:18:55 AM
 #9

if i understood correctly they are mixing letter from different alphabets, this could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then check manually the message

I mean i don't see this going very far with this trick

I think Quickseller was hinting at an automated program that checks each language section for valid and invalid characters.

Personally I favor posting unpleasant messages on ICOs that employ Bots and shills to promote their product. (Like I have done before)
I make them all different so I can't be reported for multi posts.

If others start doing that then eventually it will be pointless to use shills to promote ICOs.

I read the white paper. I’ve also stayed around after the countless delays and dates being changed

I think that the potential of the Кrios to take advantage of the computing power of the entire Internet destroys the fictitious belief that the cryptocurrency has no value, is a bubble or the latest fashion of technology.

Кrios provides new forms of financing to companies wishing to raise funds for their startup projects. The Кrios platform includes a centralized exchange of listings with decentralized interconnection.

This is a great project, bringing great benefits to the community. Not only that, it also brings new development for all of us. It is pride and happiness to be able to participate and I've known about ico for a long time, but this is probably the first time I've been so surprised to see the benefits and the advancement of your ideas.

Go for it guys this is a great project!! This is an amazing project, When I read your white sheet, I was totally delighted with how this would change our life. I think that it depends on each of us.

Just some of the fake comments by new shills posting on this thread.

A shill is a confidence trickster or swindler who poses as a genuine customer to entice or encourage others.




Is it wise to trust a ICO or coin that uses dishonesty to attract investors ?



We are surrounded by legends on this forum. Phenomenal successes and catastrophic failures. Then there are the scams. This forum is a digital museum.  
* The most iconic historic bitcointalk threads.* Satoshi * Cypherpunks*MtGox*Bitcointalk hacks*pHiShInG* Silk Road*Pirateat40*Knightmb*Miner shams*Forum scandals*BBCode*
Piggy
Hero Member
*****
Offline Offline

Activity: 546
Merit: 995



View Profile WWW
August 28, 2018, 06:29:43 AM
 #10

if i understood correctly they are mixing letter from different alphabets, this could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then check manually the message

I mean i don't see this going very far with this trick

I think Quickseller was hinting maybe on an automated program that can check in the different language sections for the valid and invalid characters.


What I meant was: if your message contains a small percentage of Cyrillic characters, because some ordinary characters were substituted to hide the plagiarism, that can be spotted quite easily by checking the text.

Regular expressions, if I remember correctly, can do that quite easily for characters from other languages.
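
A minimal Python sketch of that check, assuming a post is suspicious when only a small share of its letters are Cyrillic. The 20% threshold and the function names are arbitrary choices for illustration:

```python
import re

CYRILLIC = re.compile(r"[\u0400-\u04FF]")  # the basic Cyrillic block
LETTER = re.compile(r"[^\W\d_]")           # roughly: any Unicode letter


def cyrillic_ratio(text):
    """Share of the letters in text that come from the Cyrillic block."""
    letters = LETTER.findall(text)
    if not letters:
        return 0.0
    cyr = sum(1 for ch in letters if CYRILLIC.fullmatch(ch))
    return cyr / len(letters)


def looks_spoofed(text, high=0.2):
    """Flag text that is mostly non-Cyrillic but contains a few Cyrillic letters."""
    return 0.0 < cyrillic_ratio(text) <= high


suspicious = looks_spoofed("a v\u0435ry good project")  # one Cyrillic e
```

Genuine Russian text has a ratio close to 1.0 and is not flagged, which keeps ordinary off-topic posts apart from disguised plagiarism.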




iasenko
Sr. Member
****
Offline Offline

Activity: 378
Merit: 620


Bitcointalk Memes,post yours here> topic=4937275.0


View Profile WWW
August 28, 2018, 01:18:26 PM
 #11

if i understood correctly they are mixing letter from different alphabets, this could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then check manually the message

I mean i don't see this going very far with this trick

I think Quickseller was hinting maybe on an automated program that can check in the different language sections for the valid and invalid characters.


What i meant was, if your message contains a small % of cyrillic caracthers, because some ordinary character was substituted to hide the plagiarism, that can quite easily spotted by checking the the text.
 
Regular expressions, if i remember correctly, can do that quite easily for other languages characters.



The easiest way to spot it is by searching for a single Cyrillic character, like for example "a", and excluding the local sections.
Then you get all the posts listed, often there are posts in Russian which I also report. I wish I had a report button from the search results but.. no.


Great, thanks everyone for the help. Now we sit and wait for a reaction from headquarters.

A bump to attract theymos' attention. I think I have to hire a bumping bot here :D jk.

LoyceV
Legendary
*
Offline Offline

Activity: 1302
Merit: 2260


Self-made Legendary!


View Profile WWW
August 28, 2018, 01:59:36 PM
 #12

Some of these are "legit" symbols in various languages, correct? For example Russian and I believe Hebrew use different symbols than English does.
Correct. That's why theymos wants to auto-replace them only on the English boards.
I'm not sure if that's going to help though, plagiarism by homograph attacks is much easier to detect than plagiarism through text spinners.

vlad230
Sr. Member
****
Offline Offline

Activity: 336
Merit: 275



View Profile
August 28, 2018, 02:29:21 PM
 #13

It's good that you guys created a list with them but to be realistic, I don't think theymos will fix any of these.

I think he has more important items on his agenda.
iasenko
Sr. Member
****
Offline Offline

Activity: 378
Merit: 620


Bitcointalk Memes,post yours here> topic=4937275.0


View Profile WWW
August 28, 2018, 05:56:38 PM
 #14

It's good that you guys created a list with them but to be realistic, I don't think theymos will fix any of these.

I think he has more important items on his agenda.

He said that if the homograph problem became more serious, he would implement this "fix". I think 80 homographs per day is serious.

bitart
Hero Member
*****
Offline Offline

Activity: 1036
Merit: 593


Vires in Numeris


View Profile
August 28, 2018, 10:02:40 PM
 #15

Some of these are "legit" symbols in various languages, correct? For example Russian and I believe Hebrew use different symbols than English does.
Correct. That's why theymos wants to auto-replace them only on the English boards.
I'm not sure if that's going to help though, plagiarism by homograph attacks is much easier to detect than plagiarism through text spinners.
Is it also possible to auto-replace other strings like 'good project' with something like 'please ban me I'm a bounty hunter'? :D
Also, you have to wait: report badges were first in line to be implemented :)

To be more serious:
Don't the spammers who use these special characters realize that mods will easily find those posts and delete them? Their activity will decrease and they won't get paid. It's the kind of thing that can be spotted easily, so it isn't worth the effort, but this is just my opinion. Do you think they don't read the Meta section at all?


iasenko
Sr. Member
****
Offline Offline

Activity: 378
Merit: 620


Bitcointalk Memes,post yours here> topic=4937275.0


View Profile WWW
August 28, 2018, 10:16:44 PM
 #16


To be more serious:
These spammers who use these special characters don't think that mods will easily find those posts and will delete them? Their activity will decrease and they won't get paid... It's a kind of thing that can be spotted easily so it doesn't worth the effort, but this is just my opinion.... Do you think they don't read the Meta section at all?

I asked many times to have this added to the rules but got no support from theymos. They get away with deleted posts instead of a ban, because they are hiding plagiarism and it's difficult to prove.
Oh, they read Meta for sure. When I suggested banning everyone who has more than one changed character in a post, they started posting with only one homograph: the popular "a very" spam.

theymos
Administrator
Legendary
*
Offline Offline

Activity: 3206
Merit: 3935


View Profile
August 29, 2018, 01:30:03 AM
Merited by malevolent (1)
 #17

Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.
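
To illustrate why a display-time replacement is retroactive, here is a rough Python sketch. It is not the forum's actual code; the mapping, the `render_post` function and the board labels are all assumptions made for the example. The stored text is never modified, so every old post is affected the next time it is rendered, and only on English boards.

```python
# Illustrative subset of look-alikes: Cyrillic a, e, o, c, p -> Latin.
HOMOGRAPHS = {"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0441": "c", "\u0440": "p"}
TABLE = str.maketrans(HOMOGRAPHS)


def render_post(stored_text, board_language):
    """Return the text to display; the stored text itself is left untouched."""
    if board_language != "english":
        return stored_text  # local sections keep their Cyrillic as written
    return stored_text.translate(TABLE)


stored = "gr\u0435\u0430t project"           # as saved in the database
shown_en = render_post(stored, "english")    # homographs replaced on display
shown_ru = render_post(stored, "russian")    # unchanged outside English boards
```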

1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD
Foxpup
Legendary
*
Offline Offline

Activity: 2366
Merit: 1187



View Profile
August 29, 2018, 04:04:57 AM
 #18

Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.
What does this mean for Russian text that is legitimately posted in English sections?

For reference, the correct translation of "ктo-нибyдь" is "someone" or "somebody", not "who - нибyдь". Come on, even Google Translate gets that one right. ::)
Nope, Google Translate can't make heads or tails of it now. :( This could be a problem (though whether it's a bigger problem than plagiarism remains to be seen).
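
The side effect can be reproduced with a small Python sketch. It assumes, based on how the word appears above, that Cyrillic "о" and "у" are among the replaced letters. The genuine Russian word ends up as a mix of two scripts, which is why a translator no longer recognizes it.

```python
# Assumed substitutions: Cyrillic o (U+043E) and u (U+0443) -> Latin o and y.
SUBSTITUTIONS = str.maketrans({"\u043e": "o", "\u0443": "y"})

russian = "\u043a\u0442\u043e-\u043d\u0438\u0431\u0443\u0434\u044c"  # "someone"
displayed = russian.translate(SUBSTITUTIONS)


def scripts_used(word):
    """Classify each letter as Latin or Cyrillic by its code point range."""
    kinds = set()
    for ch in word:
        if "\u0400" <= ch <= "\u04ff":
            kinds.add("cyrillic")
        elif ch.isascii() and ch.isalpha():
            kinds.add("latin")
    return kinds


before = scripts_used(russian)
after = scripts_used(displayed)
```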

Will pretend to do unverifiable things (while actually eating an enchilada-style burrito) for bitcoins: 1K6d1EviQKX3SVKjPYmJGyWBb1avbmCFM4
Quickseller
Copper Member
Legendary
*
Offline Offline

Activity: 1596
Merit: 1206

in 2 min-groin injury, dildo on field, & 6-9 score


View Profile WWW
August 29, 2018, 04:30:21 AM
 #19

The English sections should only contain English. If a post is posted in Russian in one of the English sections it would be off topic and should be reported

3PjXm2XYDKLV5mN3oiKzNTyVvSkqP3ujeq <-- tipping address
iasenko
Sr. Member
****
Offline Offline

Activity: 378
Merit: 620


Bitcointalk Memes,post yours here> topic=4937275.0


View Profile WWW
August 29, 2018, 04:43:53 AM
 #20

The English sections should only contain English. If a post is posted in Russian in one of the English sections it would be off topic and should be reported

Yes, I report every post I find written in other languages than English.

Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.

Great, I'll be monitoring the next few days to see how it goes :)
