Bitcoin Forum
November 12, 2024, 02:13:04 AM *
Author Topic: Sockpuppet-Detection Algorithms  (Read 2026 times)
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 29, 2012, 11:49:38 PM
 #1

I'm going to be helping someone set up a forum, but we haven't decided on what software yet.

Anyway, it got me curious about how sockpuppets and alt accounts can be detected, with varying degrees of accuracy.

What ingeniously-complicated algorithms already exist to do this? And what could be coded better?

Some obvious metrics to consider:

  • IP address
  • Language
  • Grammar
  • Vocabulary
  • Profile preferences, e.g. timezone
  • Regular login and post days/times
  • Use of smileys or images

What else?

Is software to do this built in to the major forum scripts or is this kind of thing studied separately by mods?
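As a rough illustration of one metric from the list above (regular post days/times), here is a minimal sketch that compares two accounts by the hours at which they post. The function names are hypothetical, and cosine similarity is just one possible choice of comparison; nothing here is taken from an existing forum package.

```python
import math
from collections import Counter

def hour_histogram(post_hours):
    """Return a 24-bin histogram (as fractions) of posting hours (0-23)."""
    total = len(post_hours)
    if total == 0:
        return [0.0] * 24
    counts = Counter(post_hours)
    return [counts.get(hour, 0) / total for hour in range(24)]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def posting_time_similarity(hours_a, hours_b):
    """Score in [0, 1]: how alike two accounts' posting-hour profiles are."""
    return cosine_similarity(hour_histogram(hours_a), hour_histogram(hours_b))
```

A high score only says that two accounts are active at similar hours, which is also true of any two strangers in the same timezone, so it would need to be combined with other metrics.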
MysteryMiner
Legendary
*
Offline Offline

Activity: 1512
Merit: 1049


Death to enemies!


View Profile
October 30, 2012, 12:08:16 AM
 #2

I guess many accounts will be detected as sockpuppets of total strangers. With more registered and active users, there will be more false positives.
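To put a rough number on this, here is a small sketch that assumes the detector compares every pair of accounts and wrongly flags each unrelated pair with some fixed probability. Both the pairwise model and the rate used below are assumptions for illustration, not measurements.

```python
def pair_count(num_accounts):
    """Number of distinct account pairs: n * (n - 1) / 2."""
    return num_accounts * (num_accounts - 1) // 2

def expected_false_positives(num_accounts, false_positive_rate):
    """Expected number of unrelated pairs wrongly flagged as the same person."""
    return pair_count(num_accounts) * false_positive_rate

# Hypothetical per-pair false-positive rate of 0.1%.
for n in (100, 1_000, 10_000):
    print(n, pair_count(n), expected_false_positives(n, 0.001))
```

Because the number of pairs grows with the square of the number of accounts, ten times as many users means roughly a hundred times as many false matches at the same per-pair error rate.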

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 12:44:54 AM
 #3

I guess many accounts will be detected as sockpuppets of total strangers. With more registered and active users, there will be more false positives.

Agreed, but if it were score-based, an automated system could at least rank users from 1 (very unlikely to be an alt) to 100 (highly likely to be an alt), and then a human could keep an eye on the higher-scoring accounts.

I'm sure that absolute certainty would be impossible; however, I'm still interested in what metrics could be used to come up with such a score. And the pursuit of accuracy through refining the algorithms is rather intriguing to me.
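The 1-to-100 ranking idea could be sketched as a weighted average of per-metric similarity values, each assumed to be already normalised to the range 0 to 1. The metric names and weights below are made up for illustration; choosing real weights is the hard part and is not addressed here.

```python
def alt_score(similarities, weights):
    """Combine per-metric similarities (each in [0, 1]) into a score from 1 to 100.

    similarities: dict mapping metric name -> similarity in [0, 1]
    weights: dict mapping metric name -> non-negative weight (missing = 0)
    """
    total_weight = sum(weights.get(name, 0) for name in similarities)
    if total_weight == 0:
        return 1  # no usable evidence, so lowest suspicion
    weighted = sum(
        value * weights.get(name, 0) for name, value in similarities.items()
    ) / total_weight
    return round(1 + 99 * weighted)

def rank_accounts(scores):
    """Sort (account, score) pairs so the most suspicious come first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical weights and similarity values.
weights = {"ip": 3, "timezone": 1}
scores = {
    "account_a": alt_score({"ip": 1.0, "timezone": 0.0}, weights),
    "account_b": alt_score({"ip": 0.0, "timezone": 0.0}, weights),
}
print(rank_accounts(scores))
```

A moderator would then only review the accounts at the top of the ranking rather than treating any score as proof.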
P_Shep
Legendary
*
Offline Offline

Activity: 1795
Merit: 1208


This is not OK.


View Profile
October 30, 2012, 12:53:55 AM
 #4

Language is pretty easy.

Form histograms of:
Syllables per word
Words per sentence
Sentences per paragraph.
Vocabulary.
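A minimal sketch of the four histograms listed above, using only the standard library. The syllable counter is a crude vowel-group heuristic (an assumption, not a linguistic standard), and the sentence and paragraph splitting is deliberately simple.

```python
import re
from collections import Counter

def count_syllables(word):
    """Rough estimate: number of vowel groups in the word, at least 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def style_histograms(text):
    """Build the four histograms for one account's combined post text."""
    sentences_per_paragraph = Counter()
    words_per_sentence = Counter()
    syllables_per_word = Counter()
    vocabulary = Counter()

    # Paragraphs are separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    for paragraph in paragraphs:
        # Sentences end at runs of '.', '!' or '?'.
        sentences = [s for s in re.split(r"[.!?]+", paragraph) if s.strip()]
        sentences_per_paragraph[len(sentences)] += 1
        for sentence in sentences:
            words = re.findall(r"[A-Za-z']+", sentence)
            if not words:
                continue
            words_per_sentence[len(words)] += 1
            for word in words:
                syllables_per_word[count_syllables(word)] += 1
                vocabulary[word.lower()] += 1

    return {
        "syllables_per_word": syllables_per_word,
        "words_per_sentence": words_per_sentence,
        "sentences_per_paragraph": sentences_per_paragraph,
        "vocabulary": vocabulary,
    }
```

Two accounts could then be compared histogram by histogram, for example by normalising the counts and measuring how much the distributions overlap.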
paraipan
In memoriam
Legendary
*
Offline Offline

Activity: 924
Merit: 1004


Firstbits: 1pirata


View Profile WWW
October 30, 2012, 12:57:35 AM
 #5

Language is pretty easy.

Form histograms of:
Syllables per word
Words per sentence
Sentences per paragraph.
Vocabulary.

And the use of CRs (carriage returns) in every post too.

BTCitcoin: An Idea Worth Saving - Q&A with bitcoins on rugatu.com - Check my rep
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 01:01:04 AM
 #6

Language is pretty easy.

Form histograms of:
Syllables per word
Words per sentence
Sentences per paragraph.
Vocabulary.

And ratio of exclamation marks to word count!!!

Thanks, keep 'em coming :)
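The two habits mentioned here (line breaks per post and exclamation marks relative to word count) are cheap to measure. A minimal sketch, with a hypothetical function name and with words simply taken to be whitespace-separated tokens:

```python
def punctuation_features(post):
    """Return simple formatting habits for a single post."""
    # Treat Windows (CR LF) and old Mac (CR) line endings the same as LF.
    normalised = post.replace("\r\n", "\n").replace("\r", "\n")
    word_count = len(normalised.split())
    exclamations = normalised.count("!")
    return {
        "line_breaks": normalised.count("\n"),
        "exclamations_per_word": exclamations / word_count if word_count else 0.0,
    }

print(punctuation_features("Wow!!! Great post!\nThanks"))
```

Averaging these values over all of an account's posts would give a per-account profile that can be compared with other accounts.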
MysteryMiner
Legendary
*
Offline Offline

Activity: 1512
Merit: 1049


Death to enemies!


View Profile
October 30, 2012, 01:57:10 AM
 #7

The average IQ level across all posts would also help with sockpuppet detection. The problem might be that most sockpuppet masters might fall in the lower 20% of the IQ scale.

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 02:05:35 AM
 #8

The average IQ level across all posts would also help with sockpuppet detection. The problem might be that most sockpuppet masters might fall in the lower 20% of the IQ scale.

That would probably depend on how easily they were detected!

The dumb or careless puppet masters would be more obvious while the high-IQ alts would probably be more conscious of their traits and hence be better at flying under the radar.
paraipan
In memoriam
Legendary
*
Offline Offline

Activity: 924
Merit: 1004


Firstbits: 1pirata


View Profile WWW
October 30, 2012, 02:06:34 AM
 #9

The average IQ level across all posts would also help with sockpuppet detection. The problem might be that most sockpuppet masters might fall in the lower 20% of the IQ scale.

Proof or STFU

They obviously have a higher-than-average IQ and suffer from some multiple personality disorder, so they can keep up a good level of post quality if they want to, or if the agenda requires it.

BTCitcoin: An Idea Worth Saving - Q&A with bitcoins on rugatu.com - Check my rep
MysteryMiner
Legendary
*
Offline Offline

Activity: 1512
Merit: 1049


Death to enemies!


View Profile
October 30, 2012, 02:08:48 AM
 #10

If the algorithm detects that the masters have low IQs but the sockpuppets have high IQs, then we have a problem LOL
The average IQ level across all posts would also help with sockpuppet detection. The problem might be that most sockpuppet masters might fall in the lower 20% of the IQ scale.

Proof or STFU
Are you a sockpuppet master?

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 02:21:35 AM
 #11

Similar signatures/same website on profile might also help.
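A quick sketch of how this check might look, using Python's standard difflib for fuzzy signature matching and urllib.parse to compare profile websites by host name. The function names are hypothetical, and comparing hosts only (ignoring paths and a leading "www.") is an assumption about what "same website" should mean.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

def signature_similarity(sig_a, sig_b):
    """Ratio in [0, 1] of how similar two signatures are, ignoring case."""
    return SequenceMatcher(None, sig_a.lower(), sig_b.lower()).ratio()

def _host(url):
    """Extract the lower-cased host from a URL, with or without a scheme."""
    netloc = urlparse(url if "://" in url else "//" + url).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc

def same_website(url_a, url_b):
    """True if both profile URLs point at the same (non-empty) host."""
    host_a, host_b = _host(url_a), _host(url_b)
    return bool(host_a) and host_a == host_b
```

Signatures that advertise the same popular service will match for many unrelated users, so this is better treated as weak evidence than as a match on its own.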

MysteryMiner
Legendary
*
Offline Offline

Activity: 1512
Merit: 1049


Death to enemies!


View Profile
October 30, 2012, 02:51:58 AM
 #12

Similar signatures/same website on profile might also help.
Quote
most sockpuppet masters might fall in the lower 20% of the IQ scale.
Is this the case?

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 03:24:13 AM
 #13

Similar signatures/same website on profile might also help.
Quote
most sockpuppet masters might fall in the lower 20% of the IQ scale.
Is this the case?

If this, then that.

helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 03:34:20 AM
 #14

Here's a conundrum:

I'd assume that alt accounts are more likely to agree with points raised by their primary account (although some would also be set up to argue), but what would the percentages be? e.g. 99% of alts align their views with their primary account, and 1% argue opposing points, or would it be closer to say, 60/40?

You could probably write an entire thesis (and more) on the topic, and if I were still in college, perhaps that's what I'd do.
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 03:41:39 AM
 #15

Here's a conundrum:

I'd assume that alt accounts are more likely to agree with points raised by their primary account (although some would also be set up to argue), but what would the percentages be? e.g. 99% of alts align their views with their primary account, and 1% argue opposing points, or would it be closer to say, 60/40?

You could probably write an entire thesis (and more) on the topic, and if I were still in college, perhaps that's what I'd do.


Hmm, I can see the title now.

"Detecting Recurring Patterns Between Accounts To Find Individual Users with Multiple Accounts"

Maybe compare the users' introduction posts in the "Introduce Yourself" thread.

helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 03:57:10 AM
 #16

Here's a conundrum:

I'd assume that alt accounts are more likely to agree with points raised by their primary account (although some would also be set up to argue), but what would the percentages be? e.g. 99% of alts align their views with their primary account, and 1% argue opposing points, or would it be closer to say, 60/40?

You could probably write an entire thesis (and more) on the topic, and if I were still in college, perhaps that's what I'd do.


Hmm, I can see the title now.

"Detecting Recurring Patterns Between Accounts To Find Individual Users with Multiple Accounts"

Maybe compare the users' introduction posts in the "Introduce Yourself" thread.

Except accounts on internet forums would be just a small subset of the research.

Detectives already do similar stuff IRL when someone forges a signature, or steals an identity. The signature may look okay to the naked eye, but up close it has very specific traits that can identify the real writer.

Although I am interested in the programming aspect, it's not really an I.T. topic at heart. It's probably more psychological / social.
myrkul
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500


FIAT LIBERTAS RVAT CAELVM


View Profile WWW
October 30, 2012, 04:38:13 AM
 #17

I'm sure this thread will help you test your algorithms. The inevitable pony-themed companion thread will help, too.

BTC1MYRkuLv4XPBa6bGnYAronz55grPAGcxja
Need Dispute resolution? Public Key ID: 0x11D341CF
No person has the right to initiate force, threat of force, or fraud against another person or their property. VIM VI REPELLERE LICET
niko
Hero Member
*****
Offline Offline

Activity: 756
Merit: 501


There is more to Bitcoin than bitcoins.


View Profile
October 30, 2012, 05:01:28 AM
 #18

Whatever algorithm you come up with, if you describe it in public it will quit working from that point on.

They're there, in their room.
Your mining rig is on fire, yet you're very calm.
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 05:12:16 AM
 #19

Whatever algorithm you come up with, if you describe it in public it will quit working from that point on.

I very much doubt that, although its effectiveness might drop slightly.

The public description of Ponzi schemes does not stop people falling for them.

And police catch criminals using fingerprint matching. This is well-known, and still works despite everyone knowing that they do this.

Still though, you bring up another dimension to the issue: Would researching and designing a system be made more difficult by those who would rather such detection methods remain secret?
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 05:27:57 AM
 #20

Whatever algorithm you come up with, if you describe it in public it will quit working from that point on.

I very much doubt that, although its effectiveness might drop slightly.

The public description of Ponzi schemes does not stop people falling for them.

And police catch criminals using fingerprint matching. This is well-known, and still works despite everyone knowing that they do this.

Still though, you bring up another dimension to the issue: Would researching and designing a system be made more difficult by those who would rather such detection methods remain secret?


I think this is similar to the security concerns that surround Open Source software. Sure, proprietary companies keep their source a secret, but when it's open source, the community can push out a fix instead of waiting for the company to do it. Not to mention everyone can improve upon it.
