Bitcoin Forum

Bitcoin => Project Development => Topic started by: TiagoTiago on September 27, 2011, 08:27:55 PM



Title: [IDEA] PseudoWebBot - time-weighted wordcloud of the contents of the forum(s)
Post by: TiagoTiago on September 27, 2011, 08:27:55 PM
Before anything, let me clear something up: I don't have the knowledge or skills required to make this myself. I'm just posting the idea in the hope that someone will pick it up and make something of it.





Alright, here's the idea: using an automated crawler, the system collects statistics on what is posted in the forum (or in more than one forum, and perhaps Bitcoin-related news sites). More specifically, it counts words and common groups of words (discarding useless things like the word "do" alone, expressions like "what does", etc.) and displays the results as a wordcloud. The final "weight" of each word or expression would be defined by how frequently it is seen, and the frequency calculation would be weighted by how recent the post (or the last edit to the post) was, so old things won't count as much as newer things.
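
To make the weighting a bit more concrete, here is a rough Python sketch of how it could work. Everything in it is an assumption for illustration only: the exponential decay with a 30-day half-life, the tiny STOPWORDS list, the regex tokenizer, limiting "groups of words" to pairs, and the function names. The crawler itself is left out; posts are just (timestamp, text) pairs.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical stopword list; a real one would be far longer.
STOPWORDS = {"do", "does", "what", "the", "a", "is", "to", "of", "and"}

def terms(text):
    """Yield single words and two-word expressions from a post.

    Single stopwords are dropped, and so is any pair that contains
    a stopword (so both "do" and "what does" are discarded).
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    for word in words:
        if word not in STOPWORDS:
            yield word
    for first, second in zip(words, words[1:]):
        if first not in STOPWORDS and second not in STOPWORDS:
            yield first + " " + second

def time_weight(post_time, now, half_life_days=30.0):
    """Assumed decay: a post loses half its weight every half_life_days."""
    age_days = max((now - post_time).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)

def weighted_counts(posts, now, half_life_days=30.0):
    """Sum time-decayed occurrences of each term.

    posts is an iterable of (timestamp, text); the timestamp should be
    the post time or the time of its last edit.
    """
    totals = Counter()
    for timestamp, text in posts:
        weight = time_weight(timestamp, now, half_life_days)
        for term in terms(text):
            totals[term] += weight
    return totals

if __name__ == "__main__":
    now = datetime(2011, 9, 27)
    posts = [
        (now, "Bitcoin mining"),
        (now - timedelta(days=30), "What does bitcoin price do"),
    ]
    for term, weight in weighted_counts(posts, now).most_common():
        print(term, round(weight, 3))
```

The resulting totals could be fed straight into whatever draws the cloud as font sizes. A shorter half-life makes the cloud react faster to new topics; a longer one makes it smoother.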


Even better if it's something interactive, letting the user specify time periods, pick moments other than "now" as the point where things weigh the most, restrict the cloud to certain subsections of the forum, etc. Something like Google's Ngram Viewer (http://ngrams.googlelabs.com/) for seeing the evolution of different words and expressions over time would be great as well. The ability to "play" the wordcloud, gradually moving the top-weight time at a user-specified rate, would be interesting too (it would make it easy to see when topics explode in popularity and then fade).
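
The "play" part could be sketched roughly like this in Python. Again, this is only one guess at how it might be done: the two-sided decay around a movable peak time, the 7-day half-life, the naive whitespace tokenizer (no stopword filtering here), and the names peak_weight, cloud_at and play are all made up for illustration. Posts are assumed to be (timestamp, section, text) tuples.

```python
from collections import Counter
from datetime import datetime, timedelta

def peak_weight(post_time, peak, half_life_days=7.0):
    """Assumed weighting: full weight at the peak time, halving for
    every half_life_days of distance before or after it."""
    distance_days = abs((post_time - peak).total_seconds()) / 86400.0
    return 0.5 ** (distance_days / half_life_days)

def cloud_at(posts, peak, sections=None, half_life_days=7.0, top=5):
    """Return the top weighted words with the weight centred on peak.

    sections, if given, restricts the cloud to those forum subsections.
    """
    totals = Counter()
    for timestamp, section, text in posts:
        if sections is not None and section not in sections:
            continue
        weight = peak_weight(timestamp, peak, half_life_days)
        for word in text.lower().split():
            totals[word] += weight
    return totals.most_common(top)

def play(posts, start, end, step, **options):
    """Yield (peak_time, cloud) frames, moving the peak from start
    to end in increments of step."""
    peak = start
    while peak <= end:
        yield peak, cloud_at(posts, peak, **options)
        peak += step

if __name__ == "__main__":
    day0 = datetime(2011, 9, 1)
    posts = [
        (day0, "Mining", "pools"),
        (day0 + timedelta(days=14), "Economics", "mtgox"),
    ]
    for peak, cloud in play(posts, day0, day0 + timedelta(days=14),
                            timedelta(days=7)):
        print(peak.date(), cloud)
```

Rendering each frame in sequence would show a word growing as the peak approaches the posts that mention it and shrinking afterwards. Plotting a single word's weight across the frames gives something close to the Ngram-style view.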