Bitcoin Forum
Author Topic: Readable merit dataset for your own evaluations  (Read 449 times)
ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
September 28, 2019, 04:39:39 PM
Last edit: February 02, 2020, 11:22:56 AM by ptrk
Merited by Welsh (4), dbshck (4), bones261 (4), tyz (3), Halab (2), DdmrDdmr (2), Daniel91 (1), LoyceV (1), nutildah (1), 1miau (1), YOSHIE (1), hd49728 (1), aundroid (1)
 #1

I have read that some users would like to run merit evaluations themselves, but the data provided by theymos is not very readable. That's why I wrote a script that provides the same data with a readable date, time, category path, thread name and usernames (from & to).

I make the data freely available for everyone on Github. Have fun with the data analysis.

The data was created automatically, so there is no guarantee the data is consistent. It is based on raw data provided by theymos and LoyceV.



Full History (24th January 2018 to 31st January 2020)
299935 merit records
Github

Subset History (23rd May 2019 to 20th September 2019)
41948 merit records
Github

October 2019 History
11912 merit records
Github

November 2019 History
18228 merit records
Github

December 2019 History
14734 merit records
Github

January 2020 History
16086 merit records
Github
nutildah
Legendary
*
Offline Offline

Activity: 2982
Merit: 7978



View Profile WWW
September 28, 2019, 05:00:00 PM
 #2

I sent you my last merit. Could you potentially do this for the entire history of the merit system? I'd like to compile my own database in a single Excel file. I know DdmrDdmr and LoyceV have done similar things but I do appreciate being able to import the data directly into Excel.

ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
September 28, 2019, 05:02:57 PM
 #3

I sent you my last merit. Could you potentially do this for the entire history of the merit system?

Thank you!
Sure, where do I get the entire history? I only found the data set of the last four months.
LoyceV
Legendary
*
Offline Offline

Activity: 3304
Merit: 16618


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
September 28, 2019, 05:15:47 PM
 #4

Sure, where do I get the entire history?
http://loyce.club/Merit/merit.all.txt (updated weekly, usually at the end of Friday).

DdmrDdmr
Legendary
*
Offline Offline

Activity: 2310
Merit: 10759


There are lies, damned lies and statistics. MTwain


View Profile WWW
September 28, 2019, 06:23:29 PM
 #5

<...>
Currently, I publish all the sMerit TXs here: https://fusiontables.google.com/DataSource?docid=1wM2Op6_ol8_0iP0sDEemIGr9weKvIeLPvKsKMpFy#rows:id=1. The data is downloadable (File -> Download) as a CSV and is updated every Friday. The only issue is that I may not continue feeding that tool from December onwards, since it is going to be discontinued.

Prior to publishing the TXs there, I upload them to internal Google Sheets such as these:
https://docs.google.com/spreadsheets/d/1GTngeRJlWSEg1bFY-z0S6nqZxwSGUsUTeaYREnOQieI/edit?usp=sharing (Part I)
https://docs.google.com/spreadsheets/d/1V7kW7q-dHIK-dJj7byUbBE1PLyVjmG5lQb_wJHxgUIU/edit?usp=sharing (Part II)

The above are Google Sheets and can be exported to Excel and CSV, amongst others. The file is split to make it easier to load into the Fusion Tables structure; otherwise I would just feed a single Google spreadsheet.

Data is derived from the cumulative merit.txt files that the forum publishes every Friday. It's not a simple merge, since consecutive files have TXs in common: approx. 113 of the 120 days covered by each merit.txt overlap. All the data is kept in a single database, applying each week's cumulative file beforehand.
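
The overlap handling described above could be sketched roughly like this. It is only an illustration, not the actual process used here: it assumes each weekly file has one whitespace-separated transaction per line (timestamp, amount, post, sender UID, receiver UID, as in the raw record quoted further down the thread), and it treats two lines with identical fields as the same transaction. The name `merge_weekly_dumps` and the second and third sample records are made up.

```python
def merge_weekly_dumps(dumps):
    """Merge overlapping weekly dumps, keeping each transaction once.

    `dumps` is an iterable of iterables of raw lines. Assumption: a
    transaction is identified by all of its fields taken together.
    """
    seen = set()
    merged = []
    for dump in dumps:
        for line in dump:
            fields = tuple(line.split())
            if not fields or fields in seen:
                continue  # skip blank lines and already-known transactions
            seen.add(fields)
            merged.append(fields)
    # Order chronologically by the Unix timestamp in the first column.
    merged.sort(key=lambda f: int(f[0]))
    return merged


# Two hypothetical weekly files that share one transaction.
week1 = [
    "1537167732\t1\t5025631.msg45813886\t2093373\t1053767",
    "1537200000\t2\t100.msg200\t11\t22",
]
week2 = [
    "1537200000\t2\t100.msg200\t11\t22",
    "1537300000\t1\t100.msg201\t22\t11",
]
merged = merge_weekly_dumps([week1, week2])
print(len(merged))
```

One caveat of keying on the whole record: two genuinely separate transactions with the same second, post, sender, receiver and amount would be collapsed into one.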

That is not all the info, though, since the forum structure tied to each message is obtained separately.
ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
October 01, 2019, 08:35:47 AM
 #6

I sent you my last merit. Could you potentially do this for the entire history of the merit system?

I uploaded the full history (from 25th January 2018 to 27th September 2019).
ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
November 02, 2019, 05:08:55 PM
 #7

Data sets were updated.

Full history file now contains merit records from 24th January 2018 to 31st October 2019.

Also, there is a file which contains merit records from October 2019 only.
PrimeNumber7
Copper Member
Legendary
*
Offline Offline

Activity: 1624
Merit: 1899

Amazon Prime Member #7


View Profile
November 02, 2019, 09:25:00 PM
Last edit: November 03, 2019, 08:35:53 AM by PrimeNumber7
Merited by ptrk (2)
 #8

Full History (24th January 2018 to 31st October 2019)
250887 merit records
Github
I noticed that some data has a delimiter of a comma, while other data has a semi-colon as a delimiter. This makes it more difficult to analyze. It is also a best practice to use ID numbers instead of user generated names in CSV files because thread names, or usernames may contain the delimiter.

You can map names into your dataset after you analyze it for display purposes. This also helps remove any biases you may have with regards to what you are trying to prove.  

I put the entire merit dataset into a comma delimited CSV file with a header row and uploaded it here.
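
To illustrate the delimiter point: a minimal sketch using Python's standard `csv` module, which quotes any field that contains the delimiter so the file can be read back unambiguously. The header names and the sample row (including the thread title) are made up for the example and are not taken from either dataset.

```python
import csv
import io


def write_rows(rows, delimiter=";"):
    """Serialize rows to CSV text, quoting fields only where needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, quoting=csv.QUOTE_MINIMAL)
    # Hypothetical header row; the real files may use other column names.
    writer.writerow(["date", "time", "amount", "thread", "from", "to"])
    writer.writerows(rows)
    return buf.getvalue()


def read_rows(text, delimiter=";"):
    """Parse CSV text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# A made-up record whose thread title contains the delimiter itself.
rows = [["2018-09-17", "07:02:12", "1", "Some thread; with a semicolon", "alice", "bob"]]
text = write_rows(rows)
print(read_rows(text)[1] == rows[0])
```

Splitting the same line naively on `;` would yield seven fields instead of six, which is the kind of breakage that using numeric IDs avoids altogether.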

edit: as an example, I believe the following line in your CSV file contains an incorrect name:
Quote
2018-09-17;07:02:12;1;;nkampala;BITSSA

I haven't looked, but I suspect there are also issues with the transactions involving the following UIDs:
['1053767', '1187433', '2307758', '2471646', '2471831']
ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
November 08, 2019, 09:12:12 PM
 #9

edit: as an example, I believe the following line in your CSV file contains an incorrect name:
Quote
2018-09-17;07:02:12;1;;nkampala;BITSSA

Thanks for letting me know about the issue.

I took a closer look. This is the corresponding raw data set.
Quote
1537167732   1   5025631.msg45813886   2093373   1053767

The issue with the double ; appears when the thread cannot be found. It seems the thread or post was deleted. In this case it concerns this thread, which is missing: https://bitcointalk.org/index.php?topic=5025631.msg45813886.0

The second issue in the data record is the receiver's username. It is documented as BITSSA but instead it is BITSSA : BITCOIN EXCHANGE. I will adjust my script accordingly so that if there is a colon in the username (which is a very rare case), the complete username is documented.
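
For anyone who wants to reproduce the conversion, here is a rough sketch of how a raw record maps to a readable line. It is not my actual script. It assumes the raw columns are timestamp, amount, post, sender UID and receiver UID, and that timestamps are rendered in UTC (which reproduces the date and time of the record quoted above). The function name and the lookup dictionaries are hypothetical; the UID-to-username pairs are inferred from the two quoted records.

```python
from datetime import datetime, timezone


def to_readable(raw_line, usernames, threads):
    """Convert one raw merit record into readable fields.

    `usernames` maps UID -> username and `threads` maps topic ID -> title;
    both are hypothetical lookup tables that must be built separately.
    """
    ts, amount, post, from_uid, to_uid = raw_line.split()
    when = datetime.fromtimestamp(int(ts), tz=timezone.utc)
    topic_id = post.split(".")[0]  # "5025631.msg45813886" -> "5025631"
    return [
        when.strftime("%Y-%m-%d"),
        when.strftime("%H:%M:%S"),
        amount,
        threads.get(topic_id, ""),  # empty when the thread was deleted
        usernames.get(from_uid, from_uid),  # fall back to the UID
        usernames.get(to_uid, to_uid),
    ]


usernames = {"2093373": "nkampala", "1053767": "BITSSA : BITCOIN EXCHANGE"}
threads = {}  # the thread for this record can no longer be found
raw = "1537167732   1   5025631.msg45813886   2093373   1053767"
print(";".join(to_readable(raw, usernames, threads)))
# prints: 2018-09-17;07:02:12;1;;nkampala;BITSSA : BITCOIN EXCHANGE
```

The username is taken whole from the lookup table, so a colon inside it is not treated as a separator.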
ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
December 07, 2019, 07:28:33 PM
 #10

Data sets were updated.

Full history file now contains merit records from 24th January 2018 to 30th November 2019.

Also, there is a file which contains merit records from November 2019 only.

Find all files at https://github.com/ptrk01/bitcointalkorg_meritdata
ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
January 04, 2020, 03:12:10 PM
 #11

Data sets were updated. New file added with December 2019 merit data.

Full history file now contains merit records from 24th January 2018 to 31st December 2019.

Find all files at https://github.com/ptrk01/bitcointalkorg_meritdata
ptrk (OP)
Member
**
Offline Offline

Activity: 137
Merit: 90


View Profile
February 02, 2020, 11:20:15 AM
 #12

Data sets were updated. New file added with January 2020 merit data.

Full history file now contains merit records from 24th January 2018 to 31st January 2020.

Find all files at https://github.com/ptrk01/bitcointalkorg_meritdata