Bitcoin Forum
May 10, 2024, 02:21:22 PM *
Author Topic: Sockpuppet-Detection Algorithms  (Read 1995 times)
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 29, 2012, 11:49:38 PM
 #1

I'm going to be helping someone set up a forum, but we haven't decided on what software yet.

Anyway, it got me curious about how sockpuppets and alt accounts can be detected, with varying degrees of accuracy.

What ingeniously-complicated algorithms already exist to do this? And what could be coded better?

Some obvious metrics to consider:

  • IP address
  • Language
  • Grammar
  • Vocabulary
  • Profile preferences, e.g. timezone
  • Regular login and post days/times
  • Use of smileys or images

What else?

Is software to do this built in to the major forum scripts or is this kind of thing studied separately by mods?
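
As a rough illustration of the "regular login and post days/times" metric, here is a minimal sketch that turns each account's post timestamps into an hour-of-day histogram and compares two accounts with cosine similarity. The function names and the choice of cosine similarity are assumptions made for this example, not something taken from any forum package.

```python
from collections import Counter
from datetime import datetime
from math import sqrt


def hour_histogram(timestamps):
    """Return a 24-slot list: the fraction of posts made in each hour of the day."""
    counts = Counter(ts.hour for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24
    return [counts.get(hour, 0) / total for hour in range(24)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example data: two accounts and their post times.
account_a = [datetime(2012, 10, 29, 23, 49), datetime(2012, 10, 30, 0, 44)]
account_b = [datetime(2012, 10, 30, 5, 1)]
similarity = cosine_similarity(hour_histogram(account_a), hour_histogram(account_b))
```

A high similarity only says two accounts are active at the same hours, which is also true of any two strangers in the same timezone, so it would have to be combined with other metrics.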
MysteryMiner
Legendary
*
Offline Offline

Activity: 1470
Merit: 1029


Show middle finger to system and then destroy it!


View Profile
October 30, 2012, 12:08:16 AM
 #2

I guess many accounts will be flagged as sockpuppets of total strangers. With more registered and active users, there will be more false positives.

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 12:44:54 AM
 #3

I guess many accounts will be flagged as sockpuppets of total strangers. With more registered and active users, there will be more false positives.

Agreed, but if it were score-based, an automated system could at least rank users from 1 (very unlikely to be an alt) to 100 (highly likely to be an alt), and then a human could keep an eye on the higher-scoring accounts.

I'm sure that absolute certainty would be impossible; however, I'm still interested in what metrics could be used to come up with such a score. And the pursuit of accuracy through refining the algorithms is rather intriguing to me.
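
To make the 1-to-100 ranking idea concrete, here is a minimal sketch that assumes each metric has already been reduced to a similarity between 0.0 and 1.0 for a pair of accounts. The metric names, the weights, and the weighted-average formula are all hypothetical choices for illustration.

```python
def alt_score(similarities, weights):
    """Combine per-metric similarities (each 0.0 to 1.0) into a score from 1 to 100."""
    total_weight = sum(weights[name] for name in similarities)
    if total_weight == 0:
        return 1
    weighted = sum(similarities[name] * weights[name] for name in similarities)
    return 1 + round(99 * weighted / total_weight)


def rank_pairs(pair_similarities, weights):
    """Return (score, pair) tuples, highest score first, for a human to review."""
    scored = [(alt_score(sims, weights), pair) for pair, sims in pair_similarities.items()]
    return sorted(scored, reverse=True)


# Hypothetical weights: a shared IP counts three times as much as shared vocabulary.
weights = {"ip": 3.0, "vocabulary": 1.0}
ranked = rank_pairs(
    {
        ("user_a", "user_b"): {"ip": 1.0, "vocabulary": 0.0},
        ("user_a", "user_c"): {"ip": 0.0, "vocabulary": 1.0},
    },
    weights,
)
```

The score is only a way to order accounts for a moderator's attention; picking sensible weights is the hard part and would need real data.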
P_Shep
Legendary
*
Offline Offline

Activity: 1795
Merit: 1198


This is not OK.


View Profile
October 30, 2012, 12:53:55 AM
 #4

Language is pretty easy.

Form histograms of:
Syllables per word
Words per sentence
Sentences per paragraph.
Vocabulary.
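
A minimal sketch of those histograms, using only the standard library. The syllable counter is a crude vowel-run heuristic and the sentence splitter just breaks on ., ! and ?, so both are assumptions that a real system would want to replace.

```python
import re
from collections import Counter


def count_syllables(word):
    """Crude estimate: count runs of vowels (a heuristic, not a real syllabifier)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def style_histograms(text):
    """Build the four histograms for one user's text; paragraphs are split on blank lines."""
    sentences_per_paragraph = Counter()
    words_per_sentence = Counter()
    syllables_per_word = Counter()
    vocabulary = Counter()
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    for paragraph in paragraphs:
        sentences = [s for s in re.split(r"[.!?]+", paragraph) if s.strip()]
        sentences_per_paragraph[len(sentences)] += 1
        for sentence in sentences:
            words = re.findall(r"[A-Za-z']+", sentence)
            if not words:
                continue
            words_per_sentence[len(words)] += 1
            for word in words:
                syllables_per_word[count_syllables(word)] += 1
                vocabulary[word.lower()] += 1
    return {
        "syllables_per_word": syllables_per_word,
        "words_per_sentence": words_per_sentence,
        "sentences_per_paragraph": sentences_per_paragraph,
        "vocabulary": vocabulary,
    }


histograms = style_histograms("Language is pretty easy. Form histograms.\n\nThanks!")
```

Two accounts could then be compared histogram by histogram, though short posting histories will give very noisy results.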
paraipan
In memoriam
Legendary
*
Offline Offline

Activity: 924
Merit: 1004


Firstbits: 1pirata


View Profile WWW
October 30, 2012, 12:57:35 AM
 #5

Language is pretty easy.

Form histograms of:
Syllables per word
Words per sentence
Sentences per paragraph.
Vocabulary.

And use of CR's in every post too

Bitcoin: An Idea Worth Saving - Q&A with bitcoins on rugatu.com - Check my rep
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 01:01:04 AM
 #6

Language is pretty easy.

Form histograms of:
Syllables per word
Words per sentence
Sentences per paragraph.
Vocabulary.

And ratio of exclamation marks to word count!!!

Thanks, keep 'em coming :)
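
Two of the cheaper suggestions so far, the exclamation-mark ratio and the number of line breaks per post, could be sketched like this. The function name and the returned keys are made up for the example, and words are simply whitespace-separated tokens.

```python
def punctuation_features(post):
    """Return the exclamation marks per word and the number of line breaks in one post."""
    word_count = len(post.split())
    return {
        "exclamations_per_word": post.count("!") / word_count if word_count else 0.0,
        "line_breaks": post.count("\n"),
    }


features = punctuation_features("And ratio of exclamation marks to word count!!!")
```

Averaged over all of a user's posts, these give two more numbers to feed into whatever comparison is used.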
MysteryMiner
Legendary
*
Offline Offline

Activity: 1470
Merit: 1029


Show middle finger to system and then destroy it!


View Profile
October 30, 2012, 01:57:10 AM
 #7

The average IQ level for all posts will also help at suckpuppet detection. The problem might be that most suckpuppet masters might fall on lower 20% on IQ scale.

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 02:05:35 AM
 #8

The average IQ level for all posts will also help at suckpuppet detection. The problem might be that most suckpuppet masters might fall on lower 20% on IQ scale.

That would probably depend on how easily they were detected!

The dumb or careless puppet masters would be more obvious while the high-IQ alts would probably be more conscious of their traits and hence be better at flying under the radar.
paraipan
In memoriam
Legendary
*
Offline Offline

Activity: 924
Merit: 1004


Firstbits: 1pirata


View Profile WWW
October 30, 2012, 02:06:34 AM
 #9

The average IQ level for all posts will also help at suckpuppet detection. The problem might be that most suckpuppet masters might fall on lower 20% on IQ scale.

Proof or STFU

They obviously have a higher than the average IQ and suffer some multiple personality disorder, so they can keep a good level of post quality if they wanted to, or the agenda requires it.

Bitcoin: An Idea Worth Saving - Q&A with bitcoins on rugatu.com - Check my rep
MysteryMiner
Legendary
*
Offline Offline

Activity: 1470
Merit: 1029


Show middle finger to system and then destroy it!


View Profile
October 30, 2012, 02:08:48 AM
 #10

If the algorithm detects that masters have low IQ but the stuckpuppets have high IQ then we have a problem LOL
The average IQ level for all posts will also help at suckpuppet detection. The problem might be that most suckpuppet masters might fall on lower 20% on IQ scale.

Proof or STFU
Are You a stuckpuppet master?

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 02:21:35 AM
 #11

Similar signatures/same website on profile might also help.
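
A minimal sketch of that check, assuming signatures are plain strings and profile websites are full URLs with a scheme. It uses difflib.SequenceMatcher for a fuzzy signature match and urllib.parse.urlparse to compare hostnames; the function names are hypothetical.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse


def signature_similarity(sig_a, sig_b):
    """Return a ratio from 0.0 to 1.0 of how alike two signatures are, ignoring case."""
    return SequenceMatcher(None, sig_a.lower(), sig_b.lower()).ratio()


def same_website(url_a, url_b):
    """True if both URLs parse to the same hostname (a URL without a scheme has none)."""
    host_a = urlparse(url_a).hostname
    host_b = urlparse(url_b).hostname
    return host_a is not None and host_a == host_b


ratio = signature_similarity("Check my rep", "check my rep")
shared_site = same_website("http://example.com/about", "http://example.com/blog")
```

Plenty of unrelated users link to the same popular site, so a shared hostname is weak evidence on its own.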

MysteryMiner
Legendary
*
Offline Offline

Activity: 1470
Merit: 1029


Show middle finger to system and then destroy it!


View Profile
October 30, 2012, 02:51:58 AM
 #12

Similar signatures/same website on profile might also help.
Quote
most suckpuppet masters might fall on lower 20% on IQ scale.
Is this the case?

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 03:24:13 AM
 #13

Similar signatures/same website on profile might also help.
Quote
most suckpuppet masters might fall on lower 20% on IQ scale.
Is this the case?

If this, then that.

helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 03:34:20 AM
 #14

Here's a conundrum:

I'd assume that alt accounts are more likely to agree with points raised by their primary account (although some would also be set up to argue), but what would the percentages be? e.g. 99% of alts align their views with their primary account, and 1% argue opposing points, or would it be closer to say, 60/40?

You could probably write an entire thesis (and more) on the topic, and if I were still in college, perhaps that's what I'd do.
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 03:41:39 AM
 #15

Here's a conundrum:

I'd assume that alt accounts are more likely to agree with points raised by their primary account (although some would also be set up to argue), but what would the percentages be? e.g. 99% of alts align their views with their primary account, and 1% argue opposing points, or would it be closer to say, 60/40?

You could probably write an entire thesis (and more) on the topic, and if I were still in college, perhaps that's what I'd do.


Hmm, I can see the title now.

"Detecting Recurring Patterns Between Accounts To Find Individual Users with Multiple Accounts"

Maybe compare the users' introduction posts in the "Introduce Yourself" thread.

helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 03:57:10 AM
 #16

Here's a conundrum:

I'd assume that alt accounts are more likely to agree with points raised by their primary account (although some would also be set up to argue), but what would the percentages be? e.g. 99% of alts align their views with their primary account, and 1% argue opposing points, or would it be closer to say, 60/40?

You could probably write an entire thesis (and more) on the topic, and if I were still in college, perhaps that's what I'd do.


Hmm, I can see the title now.

"Detecting Recurring Patterns Between Accounts To Find Individual Users with Multiple Accounts"

Maybe compare the users' introduction posts in the "Introduce Yourself" thread.

Except accounts on internet forums would be just a small subset of the research.

Detectives already do similar stuff IRL when someone forges a signature, or steals an identity. The signature may look okay to the naked eye, but up close it has very specific traits that can identify the real writer.

Although I am interested in the programming aspect, it's not really an I.T. topic at heart. It's probably more psychological / social.
myrkul
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500


FIAT LIBERTAS RVAT CAELVM


View Profile WWW
October 30, 2012, 04:38:13 AM
 #17

I'm sure this thread will help you test your algorithms. The inevitable pony-themed companion thread will help, too.

BTC1MYRkuLv4XPBa6bGnYAronz55grPAGcxja
Need Dispute resolution? Public Key ID: 0x11D341CF
No person has the right to initiate force, threat of force, or fraud against another person or their property. VIM VI REPELLERE LICET
niko
Hero Member
*****
Offline Offline

Activity: 756
Merit: 501


There is more to Bitcoin than bitcoins.


View Profile
October 30, 2012, 05:01:28 AM
 #18

Whatever algorithm you come up with, if you describe it in public it will quit working from that point on.

They're there, in their room.
Your mining rig is on fire, yet you're very calm.
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 05:12:16 AM
 #19

Whatever algorithm you come up with, if you describe it in public it will quit working from that point on.

I very much doubt that, although its effectiveness might reduce slightly.

The public description of ponzi schemes does not stop people falling for them.

And police catch criminals using fingerprint matching. This is well-known, and still works despite everyone knowing that they do this.

Still though, you bring up another dimension to the issue: Would researching and designing a system be made more difficult by those who would rather such detection methods remain secret?
caffeinewriter
Hero Member
*****
Offline Offline

Activity: 532
Merit: 500



View Profile
October 30, 2012, 05:27:57 AM
 #20

Whatever algorithm you come up with, if you describe it in public it will quit working from that point on.

I very much doubt that, although its effectiveness might reduce slightly.

The public description of ponzi schemes does not stop people falling for them.

And police catch criminals using fingerprint matching. This is well-known, and still works despite everyone knowing that they do this.

Still though, you bring up another dimension to the issue: Would researching and designing a system be made more difficult by those who would rather such detection methods remain secret?


I think this is similar to the security concerns that surround Open Source software. Sure, proprietary companies keep their source a secret, but when it's open source, the community can push out a fix instead of waiting for the company to do it. Not to mention everyone can improve upon it.

niko
Hero Member
*****
Offline Offline

Activity: 756
Merit: 501


There is more to Bitcoin than bitcoins.


View Profile
October 30, 2012, 05:48:53 AM
 #21

I think this is similar to the security concerns that surround Open Source software. Sure, proprietary companies keep their source a secret, but when it's open source, the community can push out a fix instead of waiting for the company to do it. Not to mention everyone can improve upon it.
Yep, everyone can improve upon it :D including the perpetrator!

They're there, in their room.
Your mining rig is on fire, yet you're very calm.
MysteryMiner
Legendary
*
Offline Offline

Activity: 1470
Merit: 1029


Show middle finger to system and then destroy it!


View Profile
October 30, 2012, 05:50:24 AM
 #22

Can someone find my alternate profiles?

And no, Atlas is not my alter-ego!

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 06:33:29 AM
 #23

Can someone find my alternate profiles?

Keep in mind most of us only have access to half the data (what's public).

The other data set would be stuff like IP addresses, access logs (which pages were viewed, which links were clicked, and at what time, etc).
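
For the private half of the data, the simplest check is probably finding accounts that have logged in from the same IP address. Here is a minimal sketch, assuming the access log can be reduced to (username, ip_address) pairs; that log format and the function name are assumptions for the example.

```python
from collections import defaultdict
from itertools import combinations


def shared_ip_pairs(log_entries):
    """Map each pair of usernames to the set of IP addresses both have used."""
    users_by_ip = defaultdict(set)
    for username, ip_address in log_entries:
        users_by_ip[ip_address].add(username)
    shared = defaultdict(set)
    for ip_address, users in users_by_ip.items():
        # sorted() makes each pair come out in a consistent (a, b) order
        for pair in combinations(sorted(users), 2):
            shared[pair].add(ip_address)
    return dict(shared)


# Hypothetical log entries using documentation-range IP addresses.
log = [
    ("alice", "203.0.113.5"),
    ("bob", "203.0.113.5"),
    ("carol", "198.51.100.7"),
    ("alice", "198.51.100.9"),
]
pairs = shared_ip_pairs(log)
```

Shared IPs also show up for households, offices, and proxy or Tor users, so this is another metric that would produce false positives by itself.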
helloworld (OP)
Sr. Member
****
Offline Offline

Activity: 266
Merit: 250



View Profile
October 30, 2012, 06:42:41 AM
 #24

I think this is similar to the security concerns that surround Open Source software. Sure, proprietary companies keep their source a secret, but when it's open source, the community can push out a fix instead of waiting for the company to do it. Not to mention everyone can improve upon it.
Yep, everyone can improve upon it :D including the perpetrator!

But that's the same for forensic science. A perpetrator could spend years studying police investigation methods in order to fool them or evade detection and they may well be successful, but the other 99.99% of perpetrators won't spend years studying this, and will get caught out.

Likewise, the publishing of every known sockpuppetry detection method would only affect the minority of puppet masters that choose to study first and become experts at fooling the algorithms. I'm guessing that would be a very small minority as most wouldn't take the time to do this.