Bitcoin Forum

Other => Meta => Topic started by: Aero Blue on August 09, 2019, 06:00:07 PM



Title: Stats you would like to see?
Post by: Aero Blue on August 09, 2019, 06:00:07 PM
I'm currently messing around with Python to scrape data for later analysis. I know a lot of people already have scrapers that do this, but I'm looking for a more distinctive approach. So far I've been focusing mainly on "user stats", i.e. gathering information from all of a user's posts and plotting it.

Here are some examples of stats I am currently able to obtain (these are from my profile):



At this point I'm kind of stuck on what exactly to do next, and I'm sure a lot of you have some great ideas! Also, tell me your thoughts on the current stats; I'm always looking for ways to improve and for different ways of displaying info.



Title: Re: Stats you would like to see?
Post by: actmyname on August 09, 2019, 06:48:33 PM
Since user stats have been disabled, it's good to see selective scraping.

An average of characters/post would be good. Same thing with posts/day. If we target users with typical spammer statistics, we can gather a list of them.
I've been reporting tons of spammers; however, a high number of the most egregious one-liner burst-posting spam megathread posters have slipped through my fingers.
How about "average time between posts"?


Title: Re: Stats you would like to see?
Post by: dkbit98 on August 09, 2019, 07:57:13 PM
I would be interested to see which topics are most active for each Bitcointalk rank, separately.


Title: Re: Stats you would like to see?
Post by: angel55 on August 09, 2019, 07:58:11 PM
Can you find what user has the most posts without receiving a single merit?  Don't count airdropped merits.


Title: Re: Stats you would like to see?
Post by: Aero Blue on August 09, 2019, 10:15:54 PM
Quote from: actmyname on August 09, 2019, 06:48:33 PM
Since user stats have been disabled, it's good to see selective scraping.

An average of characters/post would be good. Same thing with posts/day. If we target users with typical spammer statistics, we can gather a list of them.
I've been reporting tons of spammers; however, a high number of the most egregious one-liner burst-posting spam megathread posters have slipped through my fingers.
How about "average time between posts"?

Yes, I already have the ability to do average char count. When you say posts/day, what timeframe would you like for that? I could do from when the account first posted but I'm assuming you would like a more recent value, maybe for the past month or the past week? I'm working on average time between posts right now, probably going to put the timeframe at about 1 week or so and then plot posts that are less than 30 minutes apart, etc.
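
A minimal Python sketch of the two calculations described above: posts per day over a recent window, and the gaps between consecutive posts with a count of those under 30 minutes. It assumes the post timestamps have already been scraped into `datetime` objects; the function name `recent_post_stats` and its parameters are hypothetical, not taken from the actual scraper.

```python
from datetime import datetime, timedelta


def recent_post_stats(timestamps, now, window_days=7, burst_minutes=30):
    """Return (posts_per_day, mean_gap_minutes, burst_count) for the window.

    timestamps: iterable of datetime objects, one per post (assumed input).
    now: the reference time; the window is the window_days before it.
    """
    cutoff = now - timedelta(days=window_days)
    recent = sorted(t for t in timestamps if t >= cutoff)

    # Gaps between consecutive posts, in minutes.
    gaps = [
        (later - earlier).total_seconds() / 60
        for earlier, later in zip(recent, recent[1:])
    ]

    posts_per_day = len(recent) / window_days
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    # Number of posts made less than burst_minutes after the previous one.
    burst_count = sum(1 for gap in gaps if gap < burst_minutes)
    return posts_per_day, mean_gap, burst_count
```

Changing `window_days` to 30 would give the "past month" variant; passing a very large window approximates "since the account first posted".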

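Quote from: dkbit98 on August 09, 2019, 07:57:13 PM
I would be interested to see which topics are most active for each Bitcointalk rank, separately.

Quote from: angel55 on August 09, 2019, 07:58:11 PM
Can you find what user has the most posts without receiving a single merit?  Don't count airdropped merits.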

I will not be doing merit-related things, as Loyce already has that covered. I am only interested in user-related stats (excluding merit), i.e. things I can derive from post history or the profile page.


Title: Re: Stats you would like to see?
Post by: philipma1957 on August 09, 2019, 10:38:25 PM
Check my posts out. Should be:

mining sha 256
marketplace
alt coins
meta


Title: Re: Stats you would like to see?
Post by: dkbit98 on August 09, 2019, 10:55:04 PM
Quote

I will not be doing merit related things as Loyce already has that covered. I am interested in only focusing on user related stats (excluding merit), so things I am able to derive from post history / profile page.

I never asked for anything merit related:

''I would be interested to see What Topics are most active for every Bitcointalk rank in separate.''

Thanks


Title: Re: Stats you would like to see?
Post by: Aero Blue on August 09, 2019, 11:56:01 PM
Quote from: dkbit98 on August 09, 2019, 10:55:04 PM
Quote

I will not be doing merit related things as Loyce already has that covered. I am interested in only focusing on user related stats (excluding merit), so things I am able to derive from post history / profile page.

I never asked for anything merit related:

''I would be interested to see What Topics are most active for every Bitcointalk rank in separate.''

Thanks

Yes; however, you did ask for something related to "board statistics", meaning that I would have to scrape thousands of profiles, which is not "user specific". Maybe in the future I will be able to collect stats like that, but right now it's not feasible. Unless there is a way around what I've just described, I can't do it for now.

Quote from: philipma1957 on August 09, 2019, 10:38:25 PM
check my posts out   should be

mining sha 256
marketplace
alt coins
meta

Well, after crashing a few times due to your ridiculous 30k posts:

https://i.postimg.cc/MGR4HyjV/pie.png


Title: Re: Stats you would like to see?
Post by: tranthidung on August 10, 2019, 03:52:10 AM
I would like to have (if you can):
- Median and interquartile range of merits per post in each board.
- Mean and standard deviations of merits per post in each board.
Even without statistics, I would guess that the figures for serious boards are several times higher than for spam-heavy boards (like Bitcoin Discussion, Altcoin Discussion, etc.), but it would be interesting if you could retrieve those stats with your skills.
Here are some references for you (you might get some ideas from them):
Time Series Analysis on Distributed Merits in the forum (daily, weekly, monthly) (https://bitcointalk.org/index.php?topic=5069140.240)
Time Series on monthly statistics of forum (new users, new topics, new posts) (https://bitcointalk.org/index.php?topic=5071903.0)
Assumed monthly statistics on registered accounts of bitcointalk.org (2009-2019) (https://bitcointalk.org/index.php?topic=5168990.0)
Observation on interquartile range of intra-day merits with time series plot (https://bitcointalk.org/index.php?topic=5129273.0)
Some stats of forum in the WO thread (Oct. 2017 - Jul. 2019) Monthly update (https://bitcointalk.org/index.php?topic=5171307.0)
Bitcointalk Merit Dashboard (https://bitcointalk.org/index.php?topic=4428616.0) https://public.tableau.com/profile/ddmrddmr#!/
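
The per-board median, interquartile range, mean and standard deviation requested above could be sketched in Python roughly as follows. This assumes the data is already available as `(board, merits)` pairs, one per post; that input shape and the name `merit_summary` are hypothetical. Note that `statistics.quantiles` uses the "exclusive" method by default, which is only one of several common quartile definitions.

```python
from collections import defaultdict
from statistics import mean, median, quantiles, stdev


def merit_summary(posts):
    """Summarise merits per post for each board.

    posts: iterable of (board_name, merits_received) pairs (assumed input).
    """
    by_board = defaultdict(list)
    for board, merits in posts:
        by_board[board].append(merits)

    summary = {}
    for board, values in by_board.items():
        if len(values) < 2:
            # stdev() needs at least two data points, so skip tiny boards.
            continue
        q1, _, q3 = quantiles(values, n=4)  # quartile cut points
        summary[board] = {
            "median": median(values),
            "iqr": q3 - q1,
            "mean": mean(values),
            "stdev": stdev(values),  # sample standard deviation
        }
    return summary
```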


Title: Re: Stats you would like to see?
Post by: o_e_l_e_o on August 10, 2019, 07:29:06 AM
Quote from: angel55 on August 09, 2019, 07:58:11 PM
Can you find what user has the most posts without receiving a single merit?  Don't count airdropped merits.
You can find this pretty easily by using the "Most Posts" list on Vod's BPIP: https://bpip.org/report.aspx?r=mostposts

The account with the most posts but no merit is ChartBuddy (https://bitcointalk.org/index.php?action=profile;u=110685) at 21804 posts. They are, however, a bot account that hasn't made a single post since the introduction of the merit system.

The highest-posting non-bot account with no merit is notlist3d (https://bitcointalk.org/index.php?action=profile;u=105355) at 15110 posts, but again, they've only made 3 posts since the introduction of merit.


Title: Re: Stats you would like to see?
Post by: LoyceV on August 10, 2019, 08:47:12 AM
Quote from: actmyname on August 09, 2019, 06:48:33 PM
How about "average time between posts"?
The average will just be one number (based on the total number of posts since registration). A "burst post" graph could be interesting to show the distribution of time between posts. Say:
-number of posts within less than 2 minutes: x
-number of posts within 2-5 minutes: y
-5-10 minutes: z
-10-30 minutes: a
-30-120 minutes: b
-120-720 minutes: c

You catch my drift :)
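
The buckets listed above could be counted with a short Python sketch like this one. It assumes the post timestamps are already available as `datetime` objects; the names `EDGES`, `LABELS` and `gap_distribution` are hypothetical, and the final "720+" bucket is an addition so that longer gaps are not silently dropped.

```python
from bisect import bisect_right
from datetime import datetime

# Upper edges of the buckets, in minutes. A gap exactly on an edge
# (e.g. 2 minutes) falls into the next bucket up ("2-5").
EDGES = [2, 5, 10, 30, 120, 720]
LABELS = ["<2", "2-5", "5-10", "10-30", "30-120", "120-720", "720+"]


def gap_distribution(timestamps):
    """Count gaps between consecutive posts, grouped into the buckets above."""
    ordered = sorted(timestamps)
    counts = dict.fromkeys(LABELS, 0)
    for earlier, later in zip(ordered, ordered[1:]):
        minutes = (later - earlier).total_seconds() / 60
        counts[LABELS[bisect_right(EDGES, minutes)]] += 1
    return counts
```

The resulting dictionary keeps the bucket order, so it can be passed more or less directly to a bar chart.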


Title: Re: Stats you would like to see?
Post by: Aero Blue on August 10, 2019, 08:24:59 PM
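Quote from: LoyceV on August 10, 2019, 08:47:12 AM
Quote from: actmyname on August 09, 2019, 06:48:33 PM
How about "average time between posts"?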
The average will just be one number (based on the total number of posts since registration). A "burst post" graph could be interesting to show the distribution of time between posts. Say:
-number of posts within less than 2 minutes: x
-number of posts within 2-5 minutes: y
-5-10 minutes: z
-10-30 minutes: a
-30-120 minutes: b
-120-720 minutes: c

You catch my drift :)

Here is an example of what I could come up with:

https://i.postimg.cc/0yPXKWh9/post-dist.png

It gets super complex really quickly when talking about distributions. This is the best I can do, as my knowledge of stats is limited. Hopefully I can figure out something that looks better, but this is at least something to look at.

Edit: Added legend so it's a bit more readable.


Title: Re: Stats you would like to see?
Post by: PrimeNumber7 on August 10, 2019, 09:19:31 PM
You can try to measure a user's interest in the threads they post in, using the percentage of those threads in which they made exactly one post, two or more posts, and five or more posts.

Depending on your skill level, you can also scrape each thread a person has posted in, and count the number of times they were quoted and the number of times they subsequently posted after being quoted. Or you could count the number of times a person posted in a thread, at least one other person then posted, and the person posted in the thread again. Both of these should measure engagement.

You could also measure how many times a person is quoted in a thread after they post in a thread. This should measure how interesting their posts are.
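
Two of the measures above can be sketched in Python as follows. The input shapes are assumptions: `thread_ids` is one thread ID per post by the user, and `authors` is the ordered list of poster names in a single thread. The function names are hypothetical. Counting quotes would additionally require parsing post bodies, which is not shown here.

```python
from collections import Counter


def thread_interest(thread_ids):
    """Percentage of a user's threads with exactly 1, 2+ and 5+ posts by them."""
    per_thread = Counter(thread_ids)
    total = len(per_thread)
    if total == 0:
        return {}

    def share(predicate):
        matching = sum(1 for n in per_thread.values() if predicate(n))
        return 100 * matching / total

    return {
        "exactly_1": share(lambda n: n == 1),
        "2_or_more": share(lambda n: n >= 2),
        "5_or_more": share(lambda n: n >= 5),
    }


def returns_after_reply(authors, user):
    """Count how often `user` posts again after someone else has posted."""
    returns = 0
    user_has_posted = False
    other_posted_since = False
    for author in authors:
        if author == user:
            if user_has_posted and other_posted_since:
                returns += 1
            user_has_posted = True
            other_posted_since = False
        elif user_has_posted:
            other_posted_since = True
    return returns
```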


Title: Re: Stats you would like to see?
Post by: tranthidung on August 11, 2019, 02:30:08 AM
Looks nice! Can you get some of my statistics, please?
- Average posts per day
- Average merits per merited post
It would be nice if you could make plots and share the underlying dataset (then I can play around with my own data).
Thanks in advance, fella.