Out of interest I threw a couple of threads into Wordle to see what keyword frequency could say:
visacoin: (lol 'scam' 'op' 'ipo')
amerocoin: ('addnode' having wallet troubles on the launch?)
best $25,000 gpu for the job: (270x, 280x, 290x, khash)
catcoin: (meow
Based purely on word frequency, some words do ring alarm bells (e.g. visacoin thread: scam, ipo, whitepaper). As a thread gets longer, these keywords still show up strongly (e.g. scam, coingen, expensive, america). A high frequency of specific, technical words (e.g. best video card thread: 280x, kwh, hash) indicates a more focused topic. Weasel words/marketing words also stick out like a sore thumb (e.g. value, relaunch, everyone). I was worried that serial quotes or quote pyramids would skew the distribution of the usernames, but the usernames still show up in proportion to each user's activity in the thread. Of interest are the threads where wallet addresses show up in the wordle. There's a lot of interesting info to be had from researching distribution and frequency alone.
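For anyone who wants the raw numbers behind a word cloud, here is a rough sketch of the counting step in Python. It is not how Wordle itself works. The function name `keyword_counts`, the tiny stop-word list, and the sample posts are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "is", "it", "this", "to", "of", "and", "for", "on"}

def keyword_counts(posts, top_n=10):
    """Count how often each word appears across a list of post strings."""
    counts = Counter()
    for post in posts:
        # Lowercase, then keep alphanumeric tokens so '280x' stays in one piece.
        words = re.findall(r"[a-z0-9]+", post.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Made-up posts, purely for illustration (not taken from any real thread).
sample_posts = [
    "this is a scam, the ipo is a scam",
    "scam or not, where is the whitepaper",
]
top_words = keyword_counts(sample_posts, top_n=3)
print(top_words)
```

Counting per thread and comparing the top lists side by side should give roughly the same picture as the clouds, just as plain numbers.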
Someone could throw the words into a t-test to determine whether there is a statistically significant correlation between pairs of words. It would also be interesting to track correlation over time (e.g. 'scam' appears from these dates onwards, or this user appears for this number of days only) to see if there are trends in posting, posters, opinion, or information.
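One way this could look, as a sketch only: mark each post with 1 or 0 depending on whether a word occurs in it, compute the Pearson correlation r between two words, and turn it into a t statistic with t = r * sqrt((n - 2) / (1 - r^2)), which has n - 2 degrees of freedom. The helper names (`presence`, `pearson_t`, `first_seen`) and the input format (a list of `(date, text)` pairs with ISO dates) are assumptions for the example, not anything from the threads.

```python
import math
import re

def tokens(text):
    """Lowercase alphanumeric tokens of a post."""
    return re.findall(r"[a-z0-9]+", text.lower())

def presence(posts, word):
    """For each post: 1 if the word occurs in it, else 0."""
    return [1 if word in tokens(post) else 0 for post in posts]

def pearson_t(x, y):
    """Return (r, t): Pearson correlation and its t statistic (n - 2 d.o.f.)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a word that appears in all or no posts has no correlation")
    r = cov / math.sqrt(var_x * var_y)
    denom = 1 - r * r
    if denom <= 0:
        # Perfect correlation: the t statistic is unbounded.
        return r, math.copysign(math.inf, r)
    return r, r * math.sqrt((n - 2) / denom)

def first_seen(dated_posts, word):
    """Earliest ISO date ('YYYY-MM-DD') on which the word shows up, or None."""
    dates = [date for date, text in dated_posts if word in tokens(text)]
    return min(dates) if dates else None
```

Usage would be something like `pearson_t(presence(posts, "scam"), presence(posts, "ipo"))`, then comparing t against a t-distribution table for n - 2 degrees of freedom. Presence flags are only 0 or 1, so the result is a rough indication rather than a rigorous test.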