Readable merit dataset for your own evaluations

ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Readable merit dataset for your own evaluations

September 28, 2019, 04:39:39 PM
Last edit: February 02, 2020, 11:22:56 AM by ptrk

Merited by Welsh (4), dbshck (4), bones261 (4), tyz (3), Halab (2), DdmrDdmr (2), Daniel91 (1), LoyceV (1), nutildah (1), 1miau (1), YOSHIE (1), hd49728 (1), aundroid (1)

I have read that some would like to perform merit evaluations themselves, but the data provided by theymos is not very readable. That's why I wrote a script that provides the same data with readable date, time, category path, thread name and usernames (from & to).

I make the data freely available for everyone on GitHub. Have fun with the data analysis.

The data was created automatically, so there is no guarantee the data is consistent. It is based on raw data provided by theymos and LoyceV.

Full History (24th January 2018 to 31st January 2020)
299935 merit records
Github

Subset History (23rd May 2019 to 20th September 2019)
41948 merit records
Github

October 2019 History
11912 merit records
Github

November 2019 History
18228 merit records
Github

December 2019 History
14734 merit records
Github

January 2020 History
16086 merit records
Github

nutildah

Legendary

Offline

Activity: 3052
Merit: 8202

Re: Readable merit dataset for your own evaluations

September 28, 2019, 05:00:00 PM

I sent you my last merit. Could you potentially do this for the entire history of the merit system? I'd like to compile my own database in a single Excel file. I know DdmrDdmr and LoyceV have done similar things but I do appreciate being able to import the data directly into Excel.


ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Re: Readable merit dataset for your own evaluations

September 28, 2019, 05:02:57 PM

Quote from: nutildah on September 28, 2019, 05:00:00 PM

I sent you my last merit. Could you potentially do this for the entire history of the merit system?

Thank you!
Sure, where do I get the entire history? I only found the data set of the last four months.

LoyceV

Legendary

Offline

Activity: 3374
Merit: 17017

Thick-Skinned Gang Leader and Golden Feather 2021

Re: Readable merit dataset for your own evaluations

September 28, 2019, 05:15:47 PM

Quote from: ptrk on September 28, 2019, 05:02:57 PM

Sure, where do I get the entire history?

http://loyce.club/Merit/merit.all.txt (updated weekly, usually at the end of Friday).


DdmrDdmr

Legendary

Offline

Activity: 2380
Merit: 10876

There are lies, damned lies and statistics. MTwain

Re: Readable merit dataset for your own evaluations

September 28, 2019, 06:23:29 PM

Quote from: ptrk on September 28, 2019, 04:39:39 PM

<...>

Currently, I publish all the sMerit TXs here: https://fusiontables.google.com/DataSource?docid=1wM2Op6_ol8_0iP0sDEemIGr9weKvIeLPvKsKMpFy#rows:id=1. The data is downloadable as a CSV (File -> Download) and is updated every Friday. The only issue is that I may not continue feeding that tool from December onwards, since it is going to be discontinued.

Prior to publishing the TXs there, I upload them to internal Google Sheets such as these:
https://docs.google.com/spreadsheets/d/1GTngeRJlWSEg1bFY-z0S6nqZxwSGUsUTeaYREnOQieI/edit?usp=sharing (Part I)
https://docs.google.com/spreadsheets/d/1V7kW7q-dHIK-dJj7byUbBE1PLyVjmG5lQb_wJHxgUIU/edit?usp=sharing (Part II)

The above are Google Sheets and can be exported to Excel and CSV, amongst others. The file is split only to make it easier to load into the Fusion Tables structure; otherwise I would just feed a single Google spreadsheet.

Data is derived from the accumulated merit.txt files that the forum publishes every Friday. It is not a simple merge, since consecutive files share TXs: approximately 113 of the 120 days covered by each merit.txt overlap with the previous file. All the data is kept in a single database, applying each week's cumulative file beforehand.

That is not all the info, though, since the forum structure tied to each message is obtained separately.
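
To illustrate the overlap problem described above, here is a minimal Python sketch of merging weekly dumps so that each record is kept once. It is not DdmrDdmr's actual process; the function name and the sample lines other than the first are made up, and it assumes that two byte-identical records are the same transaction, which may not always hold.

```python
def merge_weekly_dumps(dumps):
    """Merge overlapping weekly dumps, keeping each distinct record once.

    `dumps` is an iterable of iterables of text lines, oldest file first.
    Assumption: each data line is whitespace-separated and starts with a
    Unix timestamp; anything else (e.g. a header line) is skipped.
    """
    seen = set()
    merged = []
    for lines in dumps:
        for line in lines:
            record = tuple(line.split())
            if not record or not record[0].isdigit():
                continue  # blank line or header
            if record in seen:
                continue  # already loaded from an earlier weekly file
            seen.add(record)
            merged.append(record)
    # Sort by timestamp so the merged history is chronological.
    merged.sort(key=lambda r: int(r[0]))
    return merged


# Two hypothetical weekly files that share one record.
week_1 = [
    "1537167732 1 5025631.msg45813886 2093373 1053767",
    "1537167800 2 111.msg222 10 20",  # made-up record
]
week_2 = [
    "1537167800 2 111.msg222 10 20",  # same made-up record again
    "1537772600 1 333.msg444 30 40",  # made-up record
]
history = merge_weekly_dumps([week_1, week_2])
print(len(history))  # 3
```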

ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Re: Readable merit dataset for your own evaluations

October 01, 2019, 08:35:47 AM

Quote from: nutildah on September 28, 2019, 05:00:00 PM

I sent you my last merit. Could you potentially do this for the entire history of the merit system?

I uploaded the full history (from 25th January 2018 to 27th September 2019).

ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Re: Readable merit dataset for your own evaluations

November 02, 2019, 05:08:55 PM

Data sets were updated.

Full history file now contains merit records from 24th January 2018 to 31st October 2019.

Also, there is a file which contains merit records from October 2019 only.

PrimeNumber7

Copper Member
Legendary

Offline

Activity: 1638
Merit: 1899

Amazon Prime Member #7

Re: Readable merit dataset for your own evaluations

November 02, 2019, 09:25:00 PM
Last edit: November 03, 2019, 08:35:53 AM by PrimeNumber7

Merited by ptrk (2)

Quote from: ptrk on September 28, 2019, 04:39:39 PM

Full History (24th January 2018 to 31st October 2019)
250887 merit records
Github

I noticed that some of the data is comma-delimited while other data is semicolon-delimited, which makes it more difficult to analyze. It is also best practice to use ID numbers instead of user-generated names in CSV files, because thread names or usernames may contain the delimiter.

You can map names into your dataset after you analyze it for display purposes. This also helps remove any biases you may have with regards to what you are trying to prove.

I put the entire merit dataset into a comma delimited CSV file with a header row and uploaded it here.
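
As a rough illustration of the delimiter problem (a sketch, not the code used for either file): if free-text fields must be kept, Python's standard csv module quotes any field that contains the delimiter, so the row still reads back intact. The thread title below is hypothetical and the helper names are my own.

```python
import csv
import io


def write_rows(rows, delimiter=";"):
    """Serialize rows, quoting only fields that contain the delimiter."""
    buf = io.StringIO()
    writer = csv.writer(
        buf, delimiter=delimiter, quoting=csv.QUOTE_MINIMAL, lineterminator="\n"
    )
    writer.writerows(rows)
    return buf.getvalue()


def read_rows(text, delimiter=";"):
    """Parse text written by write_rows back into lists of fields."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


rows = [[
    "2018-09-17", "07:02:12", "1",
    "Hypothetical; thread title",  # made-up title containing the delimiter
    "nkampala", "BITSSA : BITCOIN EXCHANGE",
]]
text = write_rows(rows)

naive = text.strip().split(";")  # splitting by hand breaks the title in two
parsed = read_rows(text)         # a CSV parser honours the quotes

print(len(naive))      # 7
print(len(parsed[0]))  # 6
```

Using numeric IDs instead of names, as suggested above, avoids the problem altogether; quoting only helps when the names have to stay in the file.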

edit: as an example, I believe the following line in your CSV file contains an incorrect name:

Quote

2018-09-17;07:02:12;1;;nkampala;BITSSA

I haven't looked, but I suspect there are also issues with the transactions involving the following UIDs:
['1053767', '1187433', '2307758', '2471646', '2471831']

ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Re: Readable merit dataset for your own evaluations

November 08, 2019, 09:12:12 PM

Quote from: PrimeNumber7 on November 02, 2019, 09:25:00 PM

edit: as an example, I believe the following line in your CSV file contains an incorrect name:

Quote

2018-09-17;07:02:12;1;;nkampala;BITSSA

Thanks for letting me know about the issue.

I took a closer look. This is the corresponding raw data set.

Quote

1537167732 1 5025631.msg45813886 2093373 1053767

The issue with the double ; appears when the thread cannot be found. It seems the thread or post was deleted. In this case it is this thread, https://bitcointalk.org/index.php?topic=5025631.msg45813886.0, which is missing.

The second issue in the data record is the receiver's username. It is recorded as BITSSA, but the actual username is BITSSA : BITCOIN EXCHANGE. I will adjust my script so that a username containing a colon (a very rare case) is recorded in full.
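
For anyone who wants to work from the raw file directly, the raw record quoted above can be turned into the readable date and time like this. This is only a sketch, not my actual script: the field order (timestamp, merit amount, topic.msgID, sender UID, receiver UID) is an assumption, and MeritRecord and parse_raw_line are names chosen for this example. Usernames and thread titles are not in the raw line and would have to be looked up separately.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class MeritRecord(NamedTuple):
    date: str
    time: str
    amount: int
    topic_id: int
    message_id: int
    uid_from: int
    uid_to: int


def parse_raw_line(line):
    """Split one raw merit line into typed fields.

    Assumed layout: Unix timestamp, merit amount, "<topic>.msg<message>",
    sender UID, receiver UID, separated by whitespace.
    """
    ts, amount, post, uid_from, uid_to = line.split()
    topic, _, message = post.partition(".msg")
    when = datetime.fromtimestamp(int(ts), tz=timezone.utc)  # interpret as UTC
    return MeritRecord(
        date=when.strftime("%Y-%m-%d"),
        time=when.strftime("%H:%M:%S"),
        amount=int(amount),
        topic_id=int(topic),
        message_id=int(message),
        uid_from=int(uid_from),
        uid_to=int(uid_to),
    )


record = parse_raw_line("1537167732 1 5025631.msg45813886 2093373 1053767")
print(record.date, record.time)  # 2018-09-17 07:02:12
```

Interpreting the timestamp as UTC reproduces the date and time in the CSV line quoted above.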

ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Re: Readable merit dataset for your own evaluations

December 07, 2019, 07:28:33 PM


Data sets were updated.

Full history file now contains merit records from 24th January 2018 to 30th November 2019.

Also, there is a file which contains merit records from November 2019 only.

Find all files at https://github.com/ptrk01/bitcointalkorg_meritdata

ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Re: Readable merit dataset for your own evaluations

January 04, 2020, 03:12:10 PM


Data sets were updated. New file added with December 2019 merit data.

Full history file now contains merit records from 24th January 2018 to 31st December 2019.

Find all files at https://github.com/ptrk01/bitcointalkorg_meritdata

ptrk (OP)

Member

Offline

Activity: 140
Merit: 90

Re: Readable merit dataset for your own evaluations

February 02, 2020, 11:20:15 AM


Data sets were updated. New file added with January 2020 merit data.

Full history file now contains merit records from 24th January 2018 to 31st January 2020.

Find all files at https://github.com/ptrk01/bitcointalkorg_meritdata
