Full History (24th January 2018 to 31st October 2019)
250887 merit records
Github
I noticed that some of the data is comma-delimited, while other data is semicolon-delimited, which makes it more difficult to analyze. It is also best practice to use ID numbers instead of user-generated names in CSV files, because thread names or usernames may contain the delimiter.
You can map names onto your dataset after you analyze it, for display purposes. This also helps remove any biases you may have with regard to what you are trying to prove.
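To illustrate the ID-first approach, here is a minimal sketch in Python. The column names, UIDs, and usernames are all made up for the example and are not taken from the real dataset. The analysis only ever touches the numeric IDs, and names are attached at the very end.

```python
import csv
import io

# Hypothetical ID-only CSV; the column names are an assumption for this sketch.
csv_text = """date,time,amount,from_uid,to_uid
2018-09-17,07:02:12,1,101,202
2018-09-18,11:30:00,2,202,303
"""

# Hypothetical UID -> display name lookup, kept outside the CSV.
# Note that one name contains the ';' character.
names = {101: "alice", 202: "bob;the;builder", 303: "carol"}

# The analysis works on IDs only: total merit received per UID.
received = {}
for rec in csv.DictReader(io.StringIO(csv_text)):
    uid = int(rec["to_uid"])
    received[uid] = received.get(uid, 0) + int(rec["amount"])

# Names are mapped in only when the result is displayed.
report = {names.get(uid, str(uid)): total for uid, total in received.items()}
print(report)
```

Because the name with semicolons never enters the CSV, it cannot shift any columns, no matter which delimiter the file uses.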
I put the entire merit dataset into a comma-delimited CSV file with a header row and uploaded it here.
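For anyone who wants to do the same kind of conversion, the following is a rough sketch of one way to merge mixed-delimiter chunks into a single comma-delimited file with a header row. It is not necessarily how the uploaded file was produced. The helper names (`detect_delimiter`, `read_rows`, `to_comma_csv`), the header, and the sample rows are all assumptions for illustration, and the delimiter guess simply counts characters in the first line of each chunk.

```python
import csv
import io

def detect_delimiter(first_line, candidates=(",", ";")):
    # Guess the delimiter as the candidate that occurs most often in the first line.
    return max(candidates, key=first_line.count)

def read_rows(text):
    # Parse one chunk of text using whichever delimiter its first line suggests.
    lines = text.splitlines()
    if not lines:
        return []
    return list(csv.reader(lines, delimiter=detect_delimiter(lines[0])))

def to_comma_csv(texts, header):
    # Write all chunks into one comma-delimited CSV with a single header row.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for text in texts:
        writer.writerows(read_rows(text))
    return out.getvalue()

# Hypothetical sample data: one chunk per delimiter style.
semi_text = "2018-01-24;10:00:00;1;101;202\n"
comma_text = "2018-01-25,11:00:00,2,202,303\n"

merged = to_comma_csv(
    [semi_text, comma_text],
    ["date", "time", "amount", "from_uid", "to_uid"],
)
print(merged)
```

This only works cleanly when the fields are IDs and numbers. If a chunk still contains free-text names, the delimiter guess and the column count can both go wrong.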
edit: as an example, I believe the following line in your CSV file contains an incorrect name:
2018-09-17;07:02:12;1;;nkampala;BITSSA
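One quick way to look for lines like this is to flag every row whose field count differs from the most common count in the file. The sketch below assumes a semicolon-delimited file. The function name `find_suspect_rows` is made up, and the first and third sample lines are hypothetical, with only the middle one being the line quoted above.

```python
from collections import Counter

def find_suspect_rows(lines, delimiter=";"):
    # Count the fields on each line and treat the most common count as "expected".
    counts = [len(line.split(delimiter)) for line in lines]
    expected = Counter(counts).most_common(1)[0][0]
    # Return (1-based line number, line) for every row that deviates.
    return [
        (i, line)
        for i, (line, n) in enumerate(zip(lines, counts), start=1)
        if n != expected
    ]

sample = [
    "2018-09-17;07:01:00;1;alice;bob",         # hypothetical row
    "2018-09-17;07:02:12;1;;nkampala;BITSSA",  # the line quoted above
    "2018-09-17;07:03:30;2;carol;dave",        # hypothetical row
]

suspects = find_suspect_rows(sample)
print(suspects)
```

A check like this only catches rows where the extra delimiter changes the number of fields. It will not notice a name that was silently truncated.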
I haven't looked, but I suspect there are also issues with the transactions involving the following UIDs:
['1053767', '1187433', '2307758', '2471646', '2471831']