Title: Daily merits over local boards
Post by: tranthidung on September 04, 2019, 01:16:46 PM

Initially, I wanted to make a thread covering two statistics: activity and the number of merits distributed over local boards on a daily basis.
I have had this idea for a long time, but LoyceV found it difficult to give me such data (or perhaps he did not catch my idea, I am not sure). Then, a few days ago, I saw the topic [Stats] Local Boards Activity (https://bitcointalk.org/index.php?topic=5181338.0), which brought back the idea of making my own thread on the subject. Thanks to @DTalk for that great topic, and to @DdmrDdmr for the data dump (https://docs.google.com/spreadsheets/d/1RRHuH6k4hRn6mYuXNahvRta4GW438y49ruimYXNHrFE/edit?usp=sharing), the great explanation (https://bitcointalk.org/index.php?topic=5181705.msg52358308#msg52358308), and the link to the Wayback Machine (https://web.archive.org/web/*/https://bitcointalk.org/index.php). I will look further into the Wayback Machine for future updates of this thread.

Notes:
- Updates will be made in the latest posts, not in the OP.
- Daily post counts are not covered, because acceptably precise data are lacking.

OBJECTIVES
- Statistics on the daily merit distribution over boards: median, interquartile range, mean, and standard deviation
- Box plots of the daily merit distribution over boards

ABSTRACT
Observed period: 01/07/2019 - 29/08/2019
(1) The median of daily merits over all subsections is 2, and the interquartile range (IQR) is 0 to 9.
(2) The subsection with the highest median of daily merits is Russian, at 52; the second and third highest are Turkish (at 11) and German (at 10). Other subsections with a notable median of daily merits are Arabic and Pilipinas (at 9), French (at 8), Indonesian (at 7), and Italian (at 6).
(3) The subsections with the lowest median of daily merits are Chinese, Dutch, Greek, Japanese, Romanian, and Others, at 0.
(4) The minimum and maximum of daily merits (over all subsections) are 0 and 231; the maximum was found in the Russian board, while the maximum of daily merits in the German board is a little lower, at 201.
(5) Over 60 days, the top 5 subsections are Russian (3440, 35.1%), German (1070, 10.9%), Turkish (852, 8.7%), Pilipinas (821, 8.4%), and French (708, 7.2%).
Figures displayed in sum and percent, respectively.

Converted dataset:
* Part 1 (https://bitcointalk.org/index.php?topic=5181705.msg52363000#msg52363000)
* Part 2 (https://bitcointalk.org/index.php?topic=5181705.msg52363020#msg52363020)
* Part 3 (https://bitcointalk.org/index.php?topic=5181705.msg52364976#msg52364976)

Statistics:
Variables:
Code:
. tabstat nmerits, s(n mean sd p50 p25 p75 min max) format(%9.1f) by(subsection)

Box plots
Outliers not displayed
Outliers displayed (in red circles)

Total merits of each subsection during 60 days:
Code:
. list subsection nmerits pmerit

Pie chart

For other statistical threads, please visit: tranthidung's statistical threads on bitcointalk (https://bitcointalk.org/index.php?topic=5181068.0)

Title: Re: Local boards' activity & merit (daily)
Post by: DdmrDdmr on September 04, 2019, 02:53:14 PM

<…> If someone can help, please help me with data for around 60 days or 90 days.

I guess you want to start off with 60 days of retrospective data points. I doubt anyone is manually keeping track of the number of posts created on a daily basis (except for inner forum statistics). There are two alternatives as far as I can see:

a) Do what @Dtalk already does, and manually tabulate the data from here on, on a daily basis. Since the granularity is daily, the data should be retrieved at roughly the same time each day (ideally at the end of the day).

b) Use the WayBackMachine (https://web.archive.org/web/*/https://bitcointalk.org/index.php) or alternatives, and manually tabulate the data for each available date (this is what I do, but on a monthly basis).

Drawbacks:
- There is data for only 13 distinct days for August and 17 for July 2019 on the WayBackMachine.
- The time at which each snapshot is taken differs, meaning that the reading could be early in the day for one date and late at night for another. For a granularity such as a “day”, that is significant (and not too great). I take monthly readings, so this effect is pretty small and tolerable with a wider granularity in the data.

In addition, I try to go down one level of childboards (I haven’t always) in order to be able to separate Altcoin childboard threads from the rest (due to their weight in the total number of posts).
That is a bit of a PITA, since it takes the granularity for local boards one level further down, but it makes the data more interpretable.

Going back to the daily granularity per board, the number of merits is easier to obtain: https://docs.google.com/spreadsheets/d/1RRHuH6k4hRn6mYuXNahvRta4GW438y49ruimYXNHrFE/edit?usp=sharing. This data could shift a bit retrospectively if posts are deleted or moved (this latter effect should be pretty small on local boards).

Title: Re: Local boards' activity & merit (daily)
Post by: RocketSingh on September 04, 2019, 05:11:10 PM

Along with local boards, it would be great to keep track of the pinned local threads under Other languages/locations as well.
Title: Re: Local boards' activity & merit (daily)
Post by: Upgrade00 on September 04, 2019, 05:14:31 PM

..
I would love to see the statistics on this. This would give us an idea of how much activity goes on in those threads and which local languages should get their own board for better discussions.

Title: Re: Daily merits over local boards
Post by: tranthidung on September 05, 2019, 01:36:59 AM

It seems to be difficult to deal with deleted posts, and honestly, for now I don't have the skills to do this.
Let's play with the data from @DdmrDdmr (the site he gave is great, and I'm going to dig into it more deeply later).

Merit distribution in subsections from 01/7/2019 to 29/8/2019 (dd/mm/yyyy).

Converted dataset:
* Part 1
Code:
. list id date subsection nmerits in 1/400

Box plots
Outliers not displayed
Outliers displayed (in red circles)

Title: Re: Daily merits over local boards
Post by: tranthidung on September 05, 2019, 01:41:11 AM

* Part 2 of converted dataset
Code:
. list id date subsection nmerits in 401/800

Title: Re: Daily merits over local boards
Post by: tranthidung on September 05, 2019, 07:59:08 AM

* Part 3 of converted dataset
Code:
. list id date subsection nmerits in 801/1140
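
For readers without Stata, the per-subsection summary requested by the tabstat command in the opening post (n, mean, sd, p50, p25, p75, min, max), together with the sum and percent share per board, could be approximated in Python roughly as follows. This is only a sketch under assumptions, not the method used in this thread: the function name summarize_by_subsection, the (subsection, nmerits) row layout, and the sample rows are all hypothetical, and the numbers in the sample are made up for illustration rather than taken from the dataset. Quartile conventions also differ between tools, so p25 and p75 may not match Stata's output exactly, while the median and the sample standard deviation should agree.

```python
import statistics
from collections import defaultdict


def summarize_by_subsection(rows):
    """Summarize daily merit counts per subsection.

    rows: iterable of (subsection, nmerits) pairs, one pair per board per day
    (a hypothetical layout mirroring the 'subsection' and 'nmerits' variables).
    Returns {subsection: stats}, where 'pct' is the board's share of all
    merits in percent.
    """
    by_board = defaultdict(list)
    for subsection, nmerits in rows:
        by_board[subsection].append(nmerits)

    grand_total = sum(sum(values) for values in by_board.values())
    summary = {}
    for board, values in by_board.items():
        if len(values) >= 2:
            # Default 'exclusive' method; Stata may use a different rule.
            p25, p50, p75 = statistics.quantiles(values, n=4)
            sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
        else:
            # A single observation has no spread to estimate.
            p25 = p50 = p75 = float(values[0])
            sd = float("nan")
        board_sum = sum(values)
        summary[board] = {
            "n": len(values),
            "mean": statistics.mean(values),
            "sd": sd,
            "p50": p50,
            "p25": p25,
            "p75": p75,
            "min": min(values),
            "max": max(values),
            "sum": board_sum,
            "pct": 100.0 * board_sum / grand_total if grand_total else 0.0,
        }
    return summary


if __name__ == "__main__":
    # Made-up sample rows, NOT values from the thread's dataset.
    sample_rows = [
        ("BoardA", 0), ("BoardA", 2), ("BoardA", 4), ("BoardA", 10),
        ("BoardB", 1), ("BoardB", 1), ("BoardB", 3), ("BoardB", 3),
    ]
    for board, stats in summarize_by_subsection(sample_rows).items():
        print(board, {k: round(v, 1) for k, v in stats.items()})
```

With the real data, each row of the converted dataset (Parts 1 to 3) would supply one (subsection, nmerits) pair, so a board observed for 60 days contributes 60 values to its list.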