PytagoraZ (OP)
Sr. Member
Offline
Activity: 350
Merit: 343
Jolly? I think I've heard that name before. hmm
|
|
July 15, 2023, 11:02:30 AM |
|
I'm curious, mate. Where does the data owned by DdmrDdmr, LoyceV, and ninjastic.space come from? Is there an API from bitcointalk?
I am not an expert in the field of websites. I'm just a blogger.
|
|
|
|
Nwada001
|
|
July 15, 2023, 11:11:40 AM |
|
Those above-mentioned users are all programmers or developers; they develop their own tools used for specific things. Those tools are being used to scrape data from the forum, but it requires some kind of approval from the forum administrator (some sort of IP whitelisting) in order to grant their tool access to the data they seek. That is how I understand it. I am not a programmer either.
|
|
|
|
PX-Z
|
|
July 15, 2023, 11:20:30 AM |
|
In simple terms, those three scrape data from bitcointalk (posts/replies, users, merits, trust), save it on their servers, and offer a public API to access that data. At least ninjastic.space has one [1]; I don't know about the other two.
[1] https://bitcointalk.org/index.php?topic=5273824.0
|
|
|
|
hugeblack
Legendary
Offline
Activity: 2674
Merit: 3921
|
|
July 15, 2023, 11:24:55 AM |
|
Where does the data owned by DdmrDdmr, LoyceV, and ninjastic.space come from? Is there an API from bitcointalk?
I don't know how LoyceV and ninjastic.space collect data, but you can fetch topics/posts from pages like unread posts since last visit ---> https://bitcointalk.org/index.php?action=unread
Then all you need to do is analyze, collect, and organize the data, whether using public.tableau.com or any personal tool. You will only need to have your IP whitelisted by the admin because a high number of requests may be blocked by Cloudflare.
This code [1] is old, but it can explain the idea to you. Some data, like trust, is updated weekly, but this is a good opportunity for them to tell us how they do it, or to help anyone who wants to learn how to do data analysis like that.
[1] https://github.com/mprep-btc/Unofficial-Bitcointalk-API
|
|
|
|
LoyceV
Legendary
Offline
Activity: 3472
Merit: 17515
Thick-Skinned Gang Leader and Golden Feather 2021
|
Where does the data owned by DdmrDdmr, LoyceV, and ninjastic.space come from?
Depending on what data I need, I use Patrol, Recent, data dumps or some of the "normal" pages on the forum (such as the Merit page, user profile or just pages in a topic).
Is there an API from bitcointalk?
No.
it requires some kind of approval from the forum administrator (some sort of IP whitelisting) in order to grant their tool access to the data they seek.
That's not true. Anyone can scrape the forum, as long as they keep it under 1 request per second. The IP whitelisting is only needed when Cloudflare becomes very active against DDOS.
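A minimal Python sketch of the "under 1 request per second" rule mentioned above. This is not LoyceV's actual setup (he mentions wget and scripting); the `Throttle` class, the `fetch_all` helper, and the 1.1-second default gap are all assumptions made for illustration. The clock and sleep functions are injectable so the pacing logic can be checked without touching the network.

```python
import time
from typing import Callable, Iterable, List, Optional


class Throttle:
    """Hypothetical helper: enforce a minimum gap between consecutive calls."""

    def __init__(
        self,
        min_interval: float = 1.1,  # assumed margin above the 1-second limit
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> float:
        """Sleep just long enough to respect min_interval; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    throttle: Throttle,
) -> List[str]:
    """Download each URL in turn, pausing between requests via the throttle."""
    pages = []
    for url in urls:
        throttle.wait()
        pages.append(fetch(url))
    return pages
```

In real use, `fetch` could be any function that takes a URL and returns the page, for example a thin wrapper around `urllib.request.urlopen`; a single shared `Throttle` keeps the whole script below the limit.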
|
|
|
|
Synchronice
|
|
July 15, 2023, 12:03:34 PM |
|
I once asked a similar question in the Bpip.org ANN thread, and LoyceV's answer was that he uses wget to get data from the website.
|
|
|
|
LoyceV
Legendary
Offline
Activity: 3472
Merit: 17515
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
July 15, 2023, 12:31:29 PM |
|
LoyceV's answer was that he uses wget to get data from website.
That's the answer to "how", not to "where". There are many more command line tools for downloading, but wget is the easiest.
|
|
|
|
PytagoraZ (OP)
Sr. Member
Offline
Activity: 350
Merit: 343
Jolly? I think I've heard that name before. hmm
|
|
July 15, 2023, 12:36:55 PM |
|
Depending on what data I need, I use Patrol, Recent, data dumps or some of the "normal" pages on the forum (such as the Merit page, user profile or just pages in a topic).
So what are you all doing from outside the forum? I mean, you are like other members and don't have special access to the forum? I'm not a programmer, but I've tried to learn programming languages by myself, but failed because of my busy life. Can you tell me what mechanism you use? I'm honestly curious how it could work
but wget is the easiest.
Can you reference the website to study this?
|
|
|
|
LoyceV
Legendary
Offline
Activity: 3472
Merit: 17515
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
July 15, 2023, 01:30:23 PM |
|
you are like other members and don't have special access to the forum?
Correct.
Can you tell me what mechanism you use? I'm honestly curious how it could work
I just use some scripting.
Can you reference the website to study this?
There's Wget - GNU Project, but that won't help you much. As a Linux user, you can do many things once you learn how to use the command line. But anything else that works for you will do, downloading from the internet is no rocket science.
I have to ask: what are you trying to accomplish?
|
|
|
|
RickDeckard
Legendary
Offline
Activity: 1148
Merit: 3117
|
|
July 15, 2023, 10:44:33 PM |
|
The only API that I remember seeing is the one developed by TryNinja[1] for his Ninjastic.space[2] project. Do note that this isn't an official API for the forum, but since TryNinja already scrapes the forum, he set up this API for users who might need to interact directly with some of the data that is collected. Depending on your needs, you can talk with TryNinja to see if he's able to help you out:
API: If you have a cool project or project idea that requires any posts/addresses data, I can help you with my REST API. Here is the documentation: https://docs.ninjastic.space
[1] https://docs.ninjastic.space
[2] https://bitcointalk.org/index.php?topic=5273824.0
|
|
|
|
TryNinja
Legendary
Offline
Activity: 2996
Merit: 7411
Top Crypto Casino
|
|
July 16, 2023, 06:28:56 AM Last edit: July 16, 2023, 10:26:27 AM by TryNinja Merited by LoyceV (2), Pmalek (2), ABCbits (1) |
|
I also unofficially scrape the forum. I mostly use JavaScript's fetch to make requests and cheerio to parse most of the data. My code is open source, so no secrets there: https://github.com/ninjastic/bitcointalk-supernotifier-v2
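To illustrate the "download the page, then parse the HTML" pattern described above, here is a rough Python analogue using only the standard library's `html.parser`. It is not TryNinja's code, which uses JavaScript's fetch and cheerio; the `TopicLinkParser` class, the `extract_topic_links` function, and the idea of picking out links whose URL contains `topic=` are assumptions made for this sketch, and the real forum markup may need different rules.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class TopicLinkParser(HTMLParser):
    """Hypothetical parser: collect (href, link text) for links to topics."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None  # set while inside a matching <a>
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Assumption: topic links can be recognised by "topic=" in the URL.
            if "topic=" in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Accumulate text only while inside a matching link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_topic_links(html: str) -> List[Tuple[str, str]]:
    """Return every (url, title) pair for topic links found in the HTML."""
    parser = TopicLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

The input would be the HTML text of a page that was already downloaded; the function returns plain tuples that can then be stored or analysed.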
|
|
|
|
Pmalek
Legendary
Offline
Activity: 2926
Merit: 7517
Playgram - The Telegram Casino
|
|
July 16, 2023, 06:49:28 AM |
|
That's not true. Anyone can scrape the forum, as long as they keep it under 1 request per second. The IP whitelisting is only needed when Cloudflare becomes very active against DDOS.
How did the scraping work last week when Cloudflare was acting up and Bitcointalk was running slowly or not at all for certain actions? Some users reported they couldn't post, others submitted multiple posts in a row, I had problems editing and previewing posts, etc. It wasn't DDOS-ing but still a Cloudflare issue. Did it affect any of your regular scraping work?
|
|
|
|
|
|
|
TryNinja
Legendary
Offline
Activity: 2996
Merit: 7411
Top Crypto Casino
|
|
July 16, 2023, 06:58:11 AM |
|
How did the scraping work last week when Cloudflare was acting up and Bitcointalk was running slowly or not at all for certain actions? Some users reported they couldn't post, others submitted multiple posts in a row, I had problems editing and previewing posts, etc. It wasn't DDOS-ing but still a Cloudflare issue. Did it affect any of your regular scraping work?
Scraping is pretty much impossible when Cloudflare is cranked up. Requests are blocked (403 error) and there are captchas everywhere. My bot was down for almost 2 full days (until theymos apparently whitelisted our IPs so we could bypass it).
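A hedged Python sketch of how a scraper might react to the 403 responses described above: back off and retry later instead of hammering the site. This does not bypass Cloudflare (per the post, only whitelisting did that), and it is not the bot's actual logic; `fetch_with_backoff`, its parameters, and the convention that `fetch` returns a `(status, body)` tuple are assumptions made for illustration.

```python
import time
from typing import Callable, Optional, Tuple


def fetch_with_backoff(
    url: str,
    fetch: Callable[[str], Tuple[int, str]],  # assumed to return (status, body)
    max_attempts: int = 4,
    base_delay: float = 60.0,  # assumed starting pause, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the page body, or None if every attempt was blocked with 403."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if status != 403:
            # Anything other than "blocked" is treated as a real error here.
            raise RuntimeError(f"unexpected HTTP status {status} for {url}")
        if attempt < max_attempts:
            sleep(delay)  # wait before trying again
            delay *= 2    # double the pause after each blocked attempt
    return None
```

A `None` result means the block outlasted all retries, which matches the situation in the post where scraping was down for almost two days.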
|
|
|
|
LoyceV
Legendary
Offline
Activity: 3472
Merit: 17515
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
July 16, 2023, 07:21:59 AM |
|
How did the scraping work last week when Cloudflare was acting up
Scraping didn't work.
It wasn't DDOS-ing but still a Cloudflare issue.
Cloudflare does that because of a DDOS.
My bot was down for almost 2 full days (until theymos apparently whitelisted our IPs so we could bypass it).
I figured I'd ask theymos. His previous whitelist attempt (last December) didn't work, and now he fixed it.
|
|
|
|
joker_josue
Legendary
Offline
Activity: 1820
Merit: 4908
**In BTC since 2013**
|
|
July 16, 2023, 08:54:18 AM |
|
I do not scrape the site. But, some time ago, I wanted to collect some information, and I used the Octoparse software, which worked perfectly for what I wanted.
Therefore, anyone can scrape the forum without major problems. You just have to know what you want and use the right tools for it.
|
|
|
|
PytagoraZ (OP)
Sr. Member
Offline
Activity: 350
Merit: 343
Jolly? I think I've heard that name before. hmm
|
|
July 16, 2023, 03:21:17 PM |
|
I have to ask: what are you trying to accomplish?
No, no. I don't have a specific goal. I just want to know how this process works. I know I won't become an expert by just reading on the internet, especially since this mechanism is too difficult for someone who really doesn't understand coding.
Since another member advised me not to get involved in reputation boards, I was confused about what to do in the forum, so I studied your tool a bit, loyce.club. I also tried using ninjastic and DdmrDdmr's Tableau. From there my curiosity emerged. Is this method also effective for business, like spying on web competitors?
|
|
|
|
KingsDen
Legendary
Offline
Activity: 1260
Merit: 1077
Goodnight, o_e_l_e_o 🌹
|
|
July 17, 2023, 08:04:27 PM |
|
Those above-mentioned users are all programmers or developers; they develop their own tools used for specific things. Those tools are being used to scrape data from the forum, but it requires some kind of approval from the forum administrator (some sort of IP whitelisting) in order to grant their tool access to the data they seek. That is how I understand it. I am not a programmer either.
I was going to say that they don't need special permission from theymos to scrape the forum, but LoyceV already said so. It takes dedication; if it is something you want to do, you can do it. Many people are doing data scraping, and they are doing great.
Since another member advised me not to get involved in reputation boards, I was confused about what to do in the forum, so I studied your tool a bit, loyce.club. I also tried using ninjastic and DdmrDdmr's Tableau. From there my curiosity emerged.
No one should take away your freedom. There is no restriction in the forum, you can contribute anywhere you wish.
|
|
|
|
|
|
|
NotATether
Legendary
Offline
Activity: 1764
Merit: 7330
Top Crypto Casino
|
|
July 18, 2023, 12:21:14 PM |
|
There's no Bitcointalk API. All of the data you see floating around comes from the official Simple Machines Forum endpoints (which also power the forum frontend), and all information can be obtained from the path index.php?action=blablabla;more=parameters;follow=here. It's just that you will get a ton of HTML in response to each request, which needs to be filtered and parsed. But as you probably figured, there is unofficial rate-limiting on the whole website; you can't make more than one request per second, or theymos blocks your IP address.
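The URL pattern above (an `action` followed by semicolon-separated parameters) can be sketched in Python like this. The `smf_url` helper and the `BASE_URL` constant are hypothetical names made up for illustration; the example parameter values are placeholders, and only the `action=unread` URL appears elsewhere in this thread.

```python
from urllib.parse import quote

# Forum entry point; every page is addressed through index.php.
BASE_URL = "https://bitcointalk.org/index.php"


def smf_url(action: str, **params: object) -> str:
    """Hypothetical helper: build index.php?action=...;key=value;key=value."""
    parts = [f"action={quote(str(action), safe='')}"]
    for key, value in params.items():
        # Percent-encode keys and values so a stray ';' cannot split a parameter.
        parts.append(f"{quote(str(key), safe='')}={quote(str(value), safe='')}")
    # SMF-style URLs separate parameters with ';' rather than '&'.
    return BASE_URL + "?" + ";".join(parts)
```

For example, `smf_url("unread")` yields the unread-posts URL quoted earlier in the thread, and extra keyword arguments are appended after the action in the order given. The response is still a full HTML page that has to be parsed.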
|
|
|
|
joker_josue
Legendary
Offline
Activity: 1820
Merit: 4908
**In BTC since 2013**
|
|
July 18, 2023, 12:51:58 PM |
|
No, no. I don't have a specific goal. I just want to know how this process works. I know I won't become an expert by just reading on the internet, especially since this mechanism is too difficult for someone who really doesn't understand coding.
Sometimes you don't even need to know much about coding. Using the program I mentioned, you can get almost any information you want that is public on the forum. Now, of course, it makes no sense to collect data if you don't have a specific objective; otherwise you're just going to waste time.
|
|
|
|
LoyceV
Legendary
Offline
Activity: 3472
Merit: 17515
Thick-Skinned Gang Leader and Golden Feather 2021
|
|
July 18, 2023, 02:00:05 PM |
|
Since another member advised me not to get involved in reputation boards
No one should take away your freedom. There is no restriction in the forum, you can contribute anywhere you wish.
It's (obviously) not forbidden, but as a Newbie, I stayed away from any Reputation drama. Bitcointalk looked like a scary place where users got tagged for the smallest things they did. Staying out of that is what earned me the nickname "Switzerland".
|
|
|
|
|