Bitcoin Forum
May 14, 2024, 04:17:43 PM *
Author Topic: [ANN] All bitcointalk users table project  (Read 339 times)
~DefaultTrust (OP)
Copper Member
Sr. Member
****
Offline Offline

Activity: 1554
Merit: 489

Stop the war!


View Profile
January 16, 2020, 11:19:24 AM
Last edit: February 01, 2020, 03:50:37 PM by ~DefaultTrust
Merited by AB de Royse777 (5), OgNasty (2), TheBeardedBaby (1)
 #1

I am working on a script that collects all Bitcointalk user profiles. It is a simple parser with a simple frontend. The project is temporarily hosted on this domain: http://forbtt.tk

The data is now fully parsed and ready to use. You can test it by sorting users by ID, name, posts, etc. In addition, you can use filters such as Minimum posts and Minimum merits. For now, a request is limited to 3000 users per page.

I would be grateful for feedback and suggestions.
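
For readers who want to picture the sorting and filtering described in this post, here is a minimal sketch. It is not the project's actual code (which is not shown in the thread); the `User` fields, the `query_users` function and its parameter names are all assumptions made for illustration. Only the two filters (minimum posts, minimum merits), the sortable columns, and the 3000-user limit come from the post.

```python
from dataclasses import dataclass
from typing import List

# Limit mentioned in the post; the name of the constant is hypothetical.
MAX_USERS_PER_PAGE = 3000

# Hypothetical record layout; the real site stores more columns.
@dataclass
class User:
    id: int
    name: str
    posts: int
    merit: int

SORTABLE_FIELDS = {"id", "name", "posts", "merit"}

def query_users(users: List[User], min_posts: int = 0, min_merit: int = 0,
                sort_by: str = "id", descending: bool = False,
                limit: int = MAX_USERS_PER_PAGE) -> List[User]:
    """Filter by minimum posts and merit, sort by one column, cap the result."""
    if sort_by not in SORTABLE_FIELDS:
        raise ValueError(f"cannot sort by {sort_by!r}")
    matching = [u for u in users
                if u.posts >= min_posts and u.merit >= min_merit]
    matching.sort(key=lambda u: getattr(u, sort_by), reverse=descending)
    return matching[:limit]
```

For example, `query_users(users, min_posts=100, sort_by="merit", descending=True)` would return at most 3000 users with at least 100 posts, highest merit first.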





Do not trust bitcointalk fascists: leonello; Snork1979; ivan1975
AB de Royse777
Legendary
*
Offline Offline

Activity: 2478
Merit: 3895


Hire Bitcointalk Camp. Manager @ r7promotions.com


View Profile WWW
January 16, 2020, 11:23:01 AM
 #2

Whatever the link was in your post, it has been removed. As for the project, we have LoyceV, DdmrDdmr and some other users who scrape forum data, and they also have the data you have scraped so far. I hope you are aware of the http://loyce.club site.

Anyway, let's see what you come up with using the data you are collecting. Good luck.

Edit: Just found the IP. First impression is good. I like that I can filter by number of posts and merits, and also the sorting option for the table data.

Quote from: ~DefaultTrust on January 16, 2020, 11:19:24 AM
I would be grateful for the feedback and suggestions.

Give users the option to choose how many rows to see on one page. Right now everything is on one page, and once you have a huge amount of data I am sure the page will take ages to load.
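
The suggestion above, letting the user choose how many rows appear per page, can be sketched roughly as follows. This is only an illustration under assumptions: the `paginate` function, its parameters, and the default page size of 100 are hypothetical and do not come from the thread.

```python
import math
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

def paginate(rows: Sequence[T], page: int, per_page: int = 100) -> Tuple[List[T], int]:
    """Return the rows for a 1-based page number and the total page count."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    # An empty result set is still reported as a single (empty) page.
    total_pages = max(1, math.ceil(len(rows) / per_page))
    start = (page - 1) * per_page
    return list(rows[start:start + per_page]), total_pages
```

With a database backend the same idea would normally be pushed into the query itself, so that only one page of rows is ever loaded, rather than slicing a full list in memory.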

~DefaultTrust (OP)
Copper Member
Sr. Member
****
Offline Offline

Activity: 1554
Merit: 489

Stop the war!


View Profile
January 16, 2020, 11:26:55 AM
 #3

Quote from: AB de Royse777 on January 16, 2020, 11:23:01 AM
Whatever the link was in your post, it has been removed. As for the project, we have LoyceV, DdmrDdmr and some other users who scrape forum data, and they also have the data you have scraped so far. I hope you are aware of the http://loyce.club site.

Anyway, let's see what you come up with using the data you are collecting. Good luck.

Thank you. It seems that the forum does not like links to free .tk domains. forbtt[dot]tk

AB de Royse777
Legendary
*
Offline Offline

Activity: 2478
Merit: 3895


Hire Bitcointalk Camp. Manager @ r7promotions.com


View Profile WWW
January 16, 2020, 11:36:28 AM
 #4

Quote from: ~DefaultTrust on January 16, 2020, 11:26:55 AM
Thank you. It seems that the forum does not like links to free .tk domains. forbtt[dot]tk

Yeah, got that. Great job on the site, and I hope you will start adding more features too, with more stuff like trust, flags, etc. Post, Activity, Last Active, Merit, Local time, Website and BTC address are dynamic columns. How are you going to keep them updated? How often are you checking each user?

By the way, nice username you have there :-P

Edit:
You really need to PM theymos and request to change the username to something else. From your trust page I see it has already got attention, and I think all those users are correct in their inputs. Your creation (the site) seems exciting, and I really hope you will respect the forum's values too.
PS: I left you 5 merits as encouragement for your work.

LoyceV
Legendary
*
Offline Offline

Activity: 3304
Merit: 16658


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
January 16, 2020, 11:42:54 AM
 #5

Quote from: ~DefaultTrust on January 16, 2020, 11:26:55 AM
It seems that the forum does not like links to free .tk domains. forbtt[dot]tk

That's spam protection for Newbies. Dot.tk links are fine:

Quote from: AB de Royse777 on January 16, 2020, 11:36:28 AM
You really need to PM theymos and request to change the username to something else. From your trust page I see it has already got attention, and I think all those users are correct in their inputs.

This is a troll account (some say it's owned by banned user korner), created right after theymos DefaultTrust.

~DefaultTrust (OP)
Copper Member
Sr. Member
****
Offline Offline

Activity: 1554
Merit: 489

Stop the war!


View Profile
January 16, 2020, 11:43:15 AM
 #6

Quote from: AB de Royse777 on January 16, 2020, 11:36:28 AM
Quote from: ~DefaultTrust on January 16, 2020, 11:26:55 AM
Thank you. It seems that the forum does not like links to free .tk domains. forbtt[dot]tk

Yeah, got that. Great job on the site, and I hope you will start adding more features too, with more stuff like trust, flags, etc. Post, Activity, Last Active, Merit, Local time, Website and BTC address are dynamic columns. How are you going to keep them updated? How often are you checking each user?


I started it only 16 hours ago, and so far it has parsed 55000 users. So I think parsing all 3000000 users will take about 870 hours (roughly one month).
After that I will parse again from the beginning, skipping deleted users. It should be faster.
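
The estimate in this post is a straight-line extrapolation from the parsing rate so far. A minimal sketch of that arithmetic, using the figures from the post (55000 users in 16 hours, 3000000 users in total); the function name is hypothetical, and the calculation assumes the rate stays constant:

```python
def estimate_total_hours(parsed_users: int, elapsed_hours: float,
                         total_users: int) -> float:
    """Extrapolate total run time, assuming a constant parsing rate."""
    users_per_hour = parsed_users / elapsed_hours
    return total_users / users_per_hour

hours = estimate_total_hours(55_000, 16, 3_000_000)
print(f"{hours:.0f} hours, about {hours / 24:.0f} days")
```

This works out to about 873 hours, or roughly 36 days, which matches the "about 870 hours" figure in the post.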

DdmrDdmr
Legendary
*
Offline Offline

Activity: 2310
Merit: 10759


There are lies, damned lies and statistics. MTwain


View Profile WWW
January 17, 2020, 11:05:33 AM
 #7

If I recall correctly, the last full profile DB published by a forum member was @piggy’s Open scraped data of all the users - SQL Lite DB - 2.481.270 users.

That was published over a year ago now, and at the time it took him just over 5 days to obtain a full DB dump (different IPs running parallel processes). It seemed to take up quite a lot of personal time, and was thus done only a couple of times. The good thing about getting the data in those five days is that the dataset is more internally consistent in its time-dependent values than one collected over a longer period (such as a month, which is what it would probably take me too).
~DefaultTrust (OP)
Copper Member
Sr. Member
****
Offline Offline

Activity: 1554
Merit: 489

Stop the war!


View Profile
January 17, 2020, 11:22:40 AM
 #8

Quote from: DdmrDdmr on January 17, 2020, 11:05:33 AM
If I recall correctly, the last full profile DB published by a forum member was @piggy’s Open scraped data of all the users - SQL Lite DB - 2.481.270 users.

That was published over a year ago now, and at the time it took him just over 5 days to obtain a full DB dump (different IPs running parallel processes). It seemed to take up quite a lot of personal time, and was thus done only a couple of times. The good thing about getting the data in those five days is that the dataset is more internally consistent in its time-dependent values than one collected over a longer period (such as a month, which is what it would probably take me too).

I don’t understand how he managed to circumvent the CloudFlare defense.

I am parsing with one process and one IP address, with no parallel requests. But even at such a low speed, my bot was banned twice yesterday by CloudFlare. It took a long time to solve the problem, and I am not sure I have solved it completely.

TryNinja
Legendary
*
Offline Offline

Activity: 2828
Merit: 6989


Crypto Swap Exchange


View Profile WWW
January 17, 2020, 11:28:44 AM
 #9

When I make a search with the reg. date set to the range 1 January 2014 - 17 January 2020, I get just an “error”. I was trying to find myself. Is this because you haven’t parsed users from that date and forward?

~DefaultTrust (OP)
Copper Member
Sr. Member
****
Offline Offline

Activity: 1554
Merit: 489

Stop the war!


View Profile
January 17, 2020, 11:32:36 AM
 #10

Quote from: TryNinja on January 17, 2020, 11:28:44 AM
When I make a search with the reg. date set to the range 1 January 2014 - 17 January 2020, I get just an “error”. I was trying to find myself. Is this because you haven’t parsed users from that date and forward?

Yes. The parser is still running right now, and it has parsed only up to April 2013.

DdmrDdmr
Legendary
*
Offline Offline

Activity: 2310
Merit: 10759


There are lies, damned lies and statistics. MTwain


View Profile WWW
January 17, 2020, 11:59:44 AM
 #11

<…>
Not sure how he did it either (I think you can achieve it with multiple VMs, but I have not done it myself).
I assume you’ve set your scraper script to intervals no shorter than 1 second between queries.
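
The "no shorter than 1 second between queries" rule mentioned above can be sketched as a small throttle object. This is an illustration only, not anyone's actual scraper: the `Throttle` class and its parameters are hypothetical, and the clock and sleep functions are injectable purely so the behaviour can be checked without real waiting.

```python
import time
from typing import Callable, Optional

class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A sequential scraper would call `throttle.wait()` immediately before each request, so the spacing holds no matter how long the previous page took to fetch and parse.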
~DefaultTrust (OP)
Copper Member
Sr. Member
****
Offline Offline

Activity: 1554
Merit: 489

Stop the war!


View Profile
January 17, 2020, 12:07:40 PM
 #12

I will probably try parsing from multiple addresses.

~DefaultTrust (OP)
Copper Member
Sr. Member
****
Offline Offline

Activity: 1554
Merit: 489

Stop the war!


View Profile
February 01, 2020, 03:47:31 PM
 #13

Ready to use! All profiles are parsed. Added a filter for banned users and some stats.



