Talksearch.io - Advanced Bitcointalk Search Engine

Bitcoin Forum

November 03, 2025, 04:54:29 AM

Welcome, Guest. Please login or register.

News: Pumpkin carving contest

Home

Help

Bitcoin Forum > Other > Meta > Talksearch.io - Advanced Bitcointalk Search Engine

Poll

Question:

Should I create translations of the website?

Yes	24 (72.7%)
No	9 (27.3%)

Total Voters: 33

Pages: « 1 2 3 4 [5] 6 7 8 9 10 » All

« previous topic next topic »

Author

Topic: Talksearch.io - Advanced Bitcointalk Search Engine (Read 4097 times)

Hossain Risfa

Jr. Member

Offline

Activity: 51
Merit: 22

WO Buddy!!!!!

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 01, 2025, 05:38:54 PM

#81

I've done my translation in our Bangla local board thank you very much for giving me permission to translate the post. I was quite nervous and I don't know that am I able to translate accurately. But after post my translation in our local board some senior brothers told me that I've done great translation and my translation skill is good and they also give me some advice. Thank you @NotaTether for give me permission and thankyou for giving me a chance to translate it give me experience and all over as e newbie you think thak I may be able to do and sir I try to do my best. My post translational link .

Talksearch.io - Advanced Bitcointalk Search Engine Translated in Bangla local board by Hossain Risfa

NotATether (OP)

Legendary

Offline

Activity: 2142
Merit: 9078

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 02, 2025, 02:30:52 PM

#82

Well, it looks like I've hit another snag during uploading. Thankfully, this has nothing to do with Elasticsearch, but with my scraping server.

As you might be aware, I scrape the posts on my server before processing them. The processing involves splitting up posts by quotes, which create a series of chunks for each posts. Usually 1-3. This is saved to the disk, and then another part of the program reads them into memory, and after that these are uploaded to Elasticsearch.

It seems that the splitting process has created so many chunks that I simply cannot create any more in that folder. Any attempts to do so lead to an error.

It might have something to do with the fact that there are tens of millions of these files (inodes) in the filesystem, but I don't know if ext4 has such a limitation. And I'm definitely not out of disk space (though the Elasticsearch server could be a different story when this is all uploaded), as not even 50% of the disk space is used so far. (Strangely, I'm not out of inodes either.)

One solution to this problem could be to avoid saving these chunks to the disk all together and run the processing and upload as one step. This is what I was doing for several days, but then I had to diagnose performance issues on the cluster so it got interrupted. Performance was bad after that though, because I was reading already-uploaded chunks form the disk.

Another solution would be to simply avoid processing low-quality posts, e.g. gambling discussion. This will make for a smaller set, but it will take vastly less space. I estimate that around 15% of all Bitcointalk posts are made on Gambling Discussion. This is mostly sig spam that nobody wants to read, so there's no use returning that in search results. As a side effect of this, it will bring features resembling Google de-indexing to Talksearch, but I will never knowingly de-index posts I don't agree with. There will still be an index containing all existing forum posts, but that will be reserved for detailed search and API only.

▄▄███▄▄
▄▄███████████▄▄
▄██████████████████▄
▄█████▀▀▀█████▀▀▀█████▄
████▌░░░░░░░░░░░░░▐████
████▌░░░░░░░░░░░░░▐████
████▀░░░▄▄░░░▄▄░░░▀████
████░░░██▀░░░▀██░░░████
████▄░░░░░▀█▀░░░░░▄████
▀█████▄▄▄▄▄▄▄▄▄▄▄█████▀
▀██████████████████▀
▀▀███████████▀▀
▀▀███▀▀

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

▄███████████████████████▄
█████████████████████████
█████████████████████████
████████▀▀▀▀▀▀███████████
████▀▀▀█░▀▀░░░░░░▄███████
████░▄▄█▄▄▀█▄░░░█▄░▄█████
████▀██▀░▄█▀░░░█▀░░██████
██████░░▄▀░░░░▐░░░▐█▄████
██████▄▄█░▀▀░░░█▄▄▄██████
█████████████████████████
█████████████████████████
█████████████████████████
▀███████████████████████▀

▄███████████████████████▄
█████████████████████████
██████████▀░░░▀██████████
█████████░░░░░░░█████████
████████░░░░░░░░░████████
████████░░░░░░░░░████████
█████████▄░░░░░▄█████████
███████▀▀▀█▄▄▄█▀▀▀███████
██████░░░░▄░▄░▄░░░░██████
██████░░░░█▀█▀█░░░░██████
██████░░░░░░░░░░░░░██████
█████████████████████████
▀███████████████████████▀

▄███████████████████████▄
█████████████████████████
██████████▀▀▀▀▀▀█████████
███████▀▀░░░░░░░░░███████
██████▀░░░░░░░░░░░░▀█████
██████░░░░░░░░░░░░░░▀████
██████▄░░░░░░▄▄░░░░░░████
████▀▀▀▀▀░░░█░░█░░░░░████
████░▀░▀░░░░░▀▀░░░░░█████
████░▀░▀▄░░░░░░▄▄▄▄██████
█████░▀░█████████████████
█████████████████████████
▀███████████████████████▀

SLOT GAMES
....SPORTS....
LIVE CASINO

│

▄░░▄█▄░░▄
▀█▀░▄▀▄░▀█▀
▄▄▄▄▄▄▄▄▄▄▄
█████████████
█░░░░░░░░░░░█
█████████████
▄▀▄██▀▄▄▄▄▄███▄▀▄
▄▀▄██▄███▄█▄██▄▀▄
▄▀▄█▐▐▌███▐▐▌█▄▀▄
▄▀▄██▀█████▀██▄▀▄
▄▀▄█████▀▄████▄▀▄
▀▄▀▄▀█████▀▄▀▄▀
▀▀▀▄█▀█▄▀▄▀▀

Regional Sponsor of the
Argentina National Team

LoyceV

Legendary

Offline

Activity: 3850
Merit: 20282

Thick-Skinned Gang Leader and Golden Feather 2021

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 02, 2025, 03:29:34 PM
Last edit: May 02, 2025, 05:42:50 PM by LoyceV

#83

Quote from: NotATether on May 02, 2025, 02:30:52 PM

It might have something to do with the fact that there are tens of millions of these files (inodes) in the filesystem, but I don't know if ext4 has such a limitation. And I'm definitely not out of disk space (though the Elasticsearch server could be a different story when this is all uploaded), as not even 50% of the disk space is used so far. (Strangely, I'm not out of inodes either.)

I have some experience dealing with tens of millions of files, and apart from making a directory view terribly slow, it works fine as long as you have enough inodes.
On ext4, with default settings, it looks like a ten times larger disk does not get ten times more inodes. I checked a few disks, and typical limits are tens to hundreds of millions of inodes per disk.
Just enter df -hi and it tells you want you need to know.

¡uʍop ǝpᴉsdn pɐǝɥ ɹnoʎ ɥʇᴉʍ ʎuunɟ ʞool no⅄

signature for rent

Z_MBFM

Sr. Member

Offline

Activity: 924
Merit: 429

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 02, 2025, 06:52:33 PM

#84

Although I used to search Google to see if there was a related topic on this forum before I thought about something, I could have found information there too, but since Google is a search engine, there would have been many more search results besides the forum related.

However, I found using talksearch that it could make our forum related search much smoother. However, it is quite effective. nice job op

betpanda.io

│

ANONYMOUS & INSTANT
ONLINE CASINO

│

SLOT GAMES
SPORTS
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

NotATether (OP)

Legendary

Offline

Activity: 2142
Merit: 9078

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 03, 2025, 09:10:01 AM

#85

Quote from: LoyceV on May 02, 2025, 03:29:34 PM

About 18% of my inodes are used.

ls ran for a horribly long time but I finally got output:

Code:

zenulabidin@zerstrorer ~ % ls -l /opt/talksearch/processed_chunks | wc -l
30240178
command ls --color=auto -v -l /opt/talksearch/processed_chunks  435.80s user 1776.56s system 2% cpu 21:08:55.04 total
wc -l  1.46s user 2.14s system 0% cpu 21:08:54.10 total

So about 30 million files. Thank goodness for zsh, otherwise I wouldn't have known the run time of this. I'll see if this long directory listing time is the cause of "No space left on device" bailing-out in the filesystem code and/or the kernel.

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

SLOT GAMES
....SPORTS....
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

LoyceV

Legendary

Offline

Activity: 3850
Merit: 20282

Thick-Skinned Gang Leader and Golden Feather 2021

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 03, 2025, 09:24:43 AM
Last edit: May 03, 2025, 10:44:34 AM by LoyceV

Merited by NotATether (2)

#86

Quote from: NotATether on May 03, 2025, 09:10:01 AM

About 18% of my inodes are used.
~
So about 30 million files. ~ I'll see if this long directory listing time is the cause of "No space left on device" bailing-out in the filesystem code and/or the kernel.

As far as I know, there are no limits to the number of files per directory on ext4, so this is weird. I'm pretty sure I've had more files in one directory before I added subdirectories for faster listings.

I'm going to test it

I don't want this many files on my own system, so I use a temporary server:

Code:

16GB PKVM
$100/mo ($0.15/hr)
4 CPU, 16GB RAM, 400GB NVMe

Running this as a user:

Code:

i=1; while test $i -le 40000000; do echo "Hello world!" > $i; i=$((i+1)); done

This takes a while Tongue

I'll be damned: No space left on device!
I got to 29,272,362 files with 22M inodes free.

Filesystem:

Code:

/dev/vda1 on / type ext4 (rw,relatime,discard,errors=remount-ro,commit=30

It gets weirder: I can still create new files, just not all of them:

Code:

i=100000000; time while test $i -le 110000000; do echo "Hello world!" > $i; i=$((i+1)); done
-bash: 100000040: No space left on device
-bash: 100000145: No space left on device
-bash: 100002253: No space left on device
-bash: 100002567: No space left on device
-bash: 100002715: No space left on device
-bash: 100002827: No space left on device
-bash: 100003033: No space left on device
-bash: 100003445: No space left on device
-bash: 100003749: No space left on device
-bash: 100003997: No space left on device
-bash: 100004406: No space left on device
-bash: 100007839: No space left on device

That's 12 out of 7840 files that couldn't be created, the rest is fine:

Code:

ls 10000282*
10000282   100002821  100002823  100002825  100002828
100002820  100002822  100002824  100002826  100002829

Root command dmesg shows this:

Code:

[ 2024.349441] EXT4-fs warning: 598 callbacks suppressed
[ 2024.349450] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2024.349477] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2024.349503] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2024.349505] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2024.349524] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2024.349526] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2024.349545] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2024.349547] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2024.363790] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2024.363797] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2050.577162] EXT4-fs warning: 118 callbacks suppressed
[ 2050.577169] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2050.577175] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2050.582961] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2050.582965] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2050.582990] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2050.582992] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2050.583012] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2050.583014] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2050.598773] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2050.598778] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2078.294090] EXT4-fs warning: 302 callbacks suppressed
[ 2078.294097] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2078.294103] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2078.296589] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2078.296594] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2078.296638] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2078.296641] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2078.296659] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2078.296661] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem
[ 2078.302125] EXT4-fs warning (device vda1): ext4_dx_add_entry:2592: Directory (ino: 295416) index full, reach max htree level :2
[ 2078.302130] EXT4-fs warning (device vda1): ext4_dx_add_entry:2596: Large directory feature is not enabled on this filesystem

Solution
Enabling ext4 large_dir seems to fix it:

Code:

tune2fs -O large_dir /dev/nvme2n1

Quote from: https://www.phoronix.com/news/EXT4-Largedir-Linux-4.13

The EXT4 "largedir" feature overcomes the current limit of around ten million entires allowed within a directory on EXT4. Now, EXT4 directories can support around two billion directory entries. However, you are likely to hit performance bottlenecks before hitting this new EXT4 limitation.

It looks like the safe limit is about 10 million files per directory, although it may work up to around 30 million files, but you shouldn't get anywhere near that number without enabling large_dir because things start failing.

I completed my test at over 51 million files in a single directory. No more errors until I actually ran out of inodes.

¡uʍop ǝpᴉsdn pɐǝɥ ɹnoʎ ɥʇᴉʍ ʎuunɟ ʞool no⅄

signature for rent

NotATether (OP)

Legendary

Offline

Activity: 2142
Merit: 9078

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 03, 2025, 12:35:16 PM

#87

Quote from: LoyceV on May 03, 2025, 09:24:43 AM

~
Solution
Enabling ext4 large_dir seems to fix it:

Code:

tune2fs -O large_dir /dev/nvme2n1

Quote from: https://www.phoronix.com/news/EXT4-Largedir-Linux-4.13

Amazing work!

The forum should hire you as a consultant

I can restart the chunks processing now, but it's going to be starting from the first topic because I lost track of which topics failed to write. Fortunately it is much faster than upload at the moment - I was actually processing topics from 2023 when I noticed this issue.

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

SLOT GAMES
....SPORTS....
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

joker_josue

Legendary

Offline

Activity: 2198
Merit: 6360

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 04, 2025, 06:34:05 AM

#88

Quote from: NotATether on May 03, 2025, 12:35:16 PM

Don't run the system all at once!
Make it run in cycles, for example 1 year at a time. This way, if there is a failure in any cycle, you know to what extent everything is fine and you won't have to start from scratch.

You do this manually by running the script in each cycle. Or you can set up the script so that it runs in cycles and keeps a log of the events. Whenever a cycle ends, it informs you of the result, if everything is ok. This way you can follow the process.

.^Winna.com..

│

░░░░░░░▄▀▀▀
░░█
█
█▒█
▐▌▒▐▌
▄▄▄█▒▒▒█▄▄▄
█████████████
█████████████
▀███▀▒▀███▀

▄▄▄▄▄▄▄▄

██████████████
█████████████▄
█████▄████████
███▄███▄█████▌
███▀▀█▀▀██████
████▀▀▀█████▌█
██████████████
███████████▌██
█████▀▀▀██████

▄▄▄▄▄▄▄▄

THE ULTIMATE CRYPTO
...CASINO & SPORTSBOOK...
───── ♠ ♥ ♣ ♦ ─────

▄▄██▄▄
▄▄████████▄▄
▄██████████████▄
████████████████
████████████████
████████████████
▀██████████████▀
▀██████████▀
▀████▀

▄▄▄▄▄▄▄▄

▄▄▀███▀▄▄
▄███████████▄
███████████████
███▄▄█▄███▄█▄▄███
█████▀█████▀█████
█████████████████
███████████████
▀███████████▀
▀▀█████▀▀

▄▄▄▄▄▄▄▄

│

►

►

.....INSTANT.....
WITHDRAWALS

...UP TO 30%...
LOSSBACK

│

TRANSFER
.VIP STATUS.
NOW ►

│

^{^{^PLAY NOW}}

GazetaBitcoin

Legendary

Offline

Activity: 2226
Merit: 8933

Fully-fledged Merit Cycler|Spambuster'23|Pie Baker

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 04, 2025, 12:18:06 PM

#89

Hey NotATether, please be aware that 1 more translation was made for your topic by AOBT:

Ukrainian translation, made by DrBeer

Cheers!

^bustabit

██████▄█▀
███▄███▀
██▀██▄▄▄
▄███▀▀▀
▀█▄█▄▄▄
█▀██▀▀
█▄▐▌▄█
▀████▀
██▀▀███▄
██▄▄█████▄
█▀██████▄██
██████████▀
████▀███▀▀▄

THE ORIGINAL
CRASH GAME
.....S I N C E 2 0 1 4.....

▀█▄██████
▀███▄███
▄▄▄██▀██
▀▀▀███▄
▄▄▄█▄█▀
▀▀██▀█
█▄▐▌▄█
▀████▀
▄██▄█▀██
▄███▄██▄██
██▄██████▀█
▀██████████
▄▀▀███▀████

█
█
█
█
█
█
█
█
▀

HOUSE EDGE
1%

█
█
█
█
█
█
█
█
▀

WAGERED
^{^BTC200M+}

█
█
█
█
█
█
█
█
▀

MAX PROFIT
BTC5+

█
█
█
█
█
█
█
█
▀

^PLAY NOW

Mahiyammahi

Sr. Member

Offline

Activity: 448
Merit: 281

⇾ Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 09, 2025, 10:12:09 AM

#90

Hey NotATether, how about creating an AI Model only specific to the Bitcointalk forum? Since you have developed a search engine, it can scrape posts. Why not train an AI model using it? I don't know if it will be helpful for forum users. But if an AI model found that scraps all the answers from the Bitcointalk forum Topic, replies, I think it won't be bad. A user can get their answer within a few seconds rather than scraping all the data it had been on Bitcointalk. Other AI models like Chatgpt look everywhere for an answer. So, if an AI model specifically only looks at forum data, this would be great.

betpanda.io

│

ANONYMOUS & INSTANT
ONLINE CASINO

│

SLOT GAMES
SPORTS
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

$crypto$

Legendary

Offline

Activity: 2912
Merit: 1183

Smart is not enough, there must be skills

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 09, 2025, 12:00:18 PM

#91

Quote from: Mahiyammahi on May 09, 2025, 10:12:09 AM

There is an AI search engine (Bitcointalk) you can do some browsing there.

[AI Search Engine] Bitcointalk

Have tried asking questions on this AI search engine --- there are some answers that the AI gives are not accurate, and it takes a few seconds to give an answer.

▀▀▀▀▀▀▀██████▄▄
████████████████
▀▀▀▀█████▀▀▀█████
████████▌███▐████
▄▄▄▄█████▄▄▄█████
████████████████
▄▄▄▄▄▄▄██████▀▀

LLBIT

4,000+ GAMES

███████████████████
██████████▀▄▀▀▀████
████████▀▄▀██░░░███
██████▀▄███▄▀█▄▄▄██
███▀▀▀▀▀▀█▀▀▀▀▀▀███
██░░░░░░░░█░░░░░░██
██▄░░░░░░░█░░░░░▄██
███▄░░░░▄█▄▄▄▄▄████
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

█████████
▀████████
░░▀██████
░░░░▀████
░░░░░░███
▄░░░░░███
▀█▄▄▄████
░░▀▀█████
▀▀▀▀▀▀▀▀▀

█████████
░░░▀▀████
██▄▄▀░███
█░░█▄░░██
░████▀▀██
█░░█▀░░██
██▀▀▄░███
░░░▄▄████
▀▀▀▀▀▀▀▀▀

.
BONUS
BATTLES

$25,000
DAILY RACE

▄▄████▄▄
▀█▀▄▀▀▄▀█▀
▄▄░░▄█░██░█▄░░▄▄
▄▄█░▄▀█░▀█▄▄█▀░█▀▄░█▄▄
▀▄█░███▄█▄▄█▄███░█▄▀
▀▀█░░░▄▄▄▄░░░█▀▀
█░░██████░░█
█░░░░▀▀░░░░█
█▀▄▀▄▀▄▀▄▀▄█
▄░█████▀▀█████░▄
▄███████░██░███████▄
▀▀██████▄▄██████▀▀
▀▀████████▀▀

.
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
░▀▄░▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄░▄▀
███▀▄▀█████████████████▀▄▀
█████▀▄░▄▄▄▄▄███░▄▄▄▄▄▄▀
███████▀▄▀██████░█▄▄▄▄▄▄▄▄
█████████▀▄▄░███▄▄▄▄▄▄░▄▀
████████████░███████▀▄▀
████████████░██▀▄▄▄▄▀
████████████░▀▄▀
████████████▄▀
███████████▀

▄▄███████▄▄
▄████▀▀▀▀▀▀▀████▄
▄███▀▄▄███████▄▄▀███▄
▄██▀▄█▀▀▀█████▀▀▀█▄▀██▄
▄██▀▄███░░░▀████░███▄▀██▄
███░████░░░░░▀██░████░███
███░████░█▄░░░░▀░████░███
███░████░███▄░░░░████░███
▀██▄▀███░█████▄░░███▀▄██▀
▀██▄▀█▄▄▄██████▄██▀▄██▀
▀███▄▀▀███████▀▀▄███▀
▀████▄▄▄▄▄▄▄████▀
▀▀███████▀▀

OFFICIAL PARTNERSHIP
SOUTHAMPTON FC
FAZE CLAN
SSC NAPOLI

NotATether (OP)

Legendary

Offline

Activity: 2142
Merit: 9078

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 09, 2025, 01:32:16 PM

#92

Quote from: Mahiyammahi on May 09, 2025, 10:12:09 AM

I don't have a dev team, so this will take a very long time to implement. It is not a priority at the moment.

In fact, only about 5 million chunks out of almost a hundred million have been uploaded so far.

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

SLOT GAMES
....SPORTS....
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

hopenotlate

Legendary

Offline

Activity: 3836
Merit: 1272

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 09, 2025, 03:10:44 PM

#93

I had some free time, and as a sign of gratitude for the efforts you make to improve users experience of this forum I took the liberty of translating opening post into Italian as I noticed it hadn't been done yet, without even asking your permission to do it.
I hope you don't mind and please let me know if it's okay or if I should remove it.

Translation link : Talksearch.io - Motore di ricerca avanzato per Bitcointalk

^EARNBET

██
██
██
██
██
██
██
██
██
██
██
██
██

███████▄▄███████████
████▄██████████████████
██▄▀▀███████████████▀▀███
█▄████████████████████████
▄▄████████▀▀▀▀▀████████▄▄██
███████████████████████████
█████████▌████▀████████████
███████████████████████████
▀▀███████▄▄▄▄▄█████████▀▀██
█▀█████████████████████▀██
██▀▄▄███████████████▄▄███
████▀██████████████████
███████▀▀███████████

HIGHEST VIP REWARDS
✔ G U A R A N T E E D

██
██
██
██
██
██
██
██
██
██
██
██
██

▄▄▄
▄▄▄███████▐███▌███████▄▄▄
█████████████████████████
▀████▄▄▄███████▄▄▄████▀
█████████████████████
▐███████████████████▌
███████████████████
███████████████████
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

King of The Castle
$200,000 in prizes

██
██
██
██
██
██
██
██
██
██
██
██
██

^62.5%

RAKEBACK
BONUS

NotATether (OP)

Legendary

Offline

Activity: 2142
Merit: 9078

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 12, 2025, 06:01:34 AM

Merited by vapourminer (2)

#94

Quote from: hopenotlate on May 09, 2025, 03:10:44 PM

Anybody can make a translation of this topic without asking me. But to avoid duplicate efforts, people should make sure that a local translation doesn't already exist.

On an unrelated note - Google Cloud is so useful! It's like having a free VScode in the cloud that doesn't cost anything extra, along with a database and git integration. HTTP server URLs are practically free as well. I am even using it for other projects too.

It's too bad that Elasticsearch is not keeping up with the load Tongue

, I guess I will have to wait a while for the upload to complete.

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

SLOT GAMES
....SPORTS....
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

hopenotlate

Legendary

Offline

Activity: 3836
Merit: 1272

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 12, 2025, 09:26:18 AM

#95

Quote from: NotATether on May 12, 2025, 06:01:34 AM

Quote from: hopenotlate on May 09, 2025, 03:10:44 PM

Anybody can make a translation of this topic without asking me. But to avoid duplicate efforts, people should make sure that a local translation doesn't already exist.

-snip-

Glad to hear everything it's ok with it; maybe to avoid a duplicate you might want to add my translation link in opening post just for everyone to make sure at a first look it has already been done.

^EARNBET

██
██
██
██
██
██
██
██
██
██
██
██
██

HIGHEST VIP REWARDS
✔ G U A R A N T E E D

██
██
██
██
██
██
██
██
██
██
██
██
██

King of The Castle
$200,000 in prizes

██
██
██
██
██
██
██
██
██
██
██
██
██

^62.5%

RAKEBACK
BONUS

NotATether (OP)

Legendary

Offline

Activity: 2142
Merit: 9078

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 12, 2025, 09:29:43 AM

#96

Quote from: armex2345 on May 12, 2025, 09:20:12 AM

It’s missing a few key features though. Not being able to tweak the search text is a bit of a letdown since that’s pretty important for narrowing things down.

Can you elaborate on this? I don't really understand what you mean by tweaking.

Quote from: armex2345 on May 12, 2025, 09:20:12 AM

Would you like variations that are more professional, casual, or critical?

As in what? Sorry but just like the other part, I'm not very sure what you're asking for here.

I am working on automatically including synonyms and verb conjugations of search terms in order to capture additional relevant topics though. This is something I can do independently of the document upload.

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

SLOT GAMES
....SPORTS....
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

nutildah

Legendary

Offline

Activity: 3528
Merit: 10177

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 18, 2025, 10:20:08 AM

Merited by vapourminer (2), NotATether (2)

#97

Quote from: NotATether on May 12, 2025, 09:29:43 AM

Quote from: armex2345 on May 12, 2025, 09:20:12 AM

It’s missing a few key features though. Not being able to tweak the search text is a bit of a letdown since that’s pretty important for narrowing things down.

Can you elaborate on this? I don't really understand what you mean by tweaking.

Quote from: armex2345 on May 12, 2025, 09:20:12 AM

Would you like variations that are more professional, casual, or critical?

As in what? Sorry but just like the other part, I'm not very sure what you're asking for here.
...

The problem is you're talking with a bot, or a human emulating a bot, rather. This last part is the AI asking him if he want the output rephrased but he just copy/pasted it because, naturally, he's a maroon:

Quote from: armex2345 on May 12, 2025, 09:20:12 AM

Would you like variations that are more professional, casual, or critical?

Don't let the bots bring you down, NotATether! Cheesy

As a human, I for one applaud your efforts and think its great to see alternative resources being built around forum data. Will remember to add it to my arsenal the next time I am researching something.

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

SLOT GAMES
....SPORTS....
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

Wouter Mense

Newbie

Offline

Activity: 16
Merit: 12

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 19, 2025, 09:50:57 AM

Merited by vapourminer (4)

#98

Quote from: NotATether on April 18, 2025, 03:37:35 PM

The issue is, I currently don't have a reliable way to measure post quality.

Suggest to look at "user quality". Example post history.

A lot of this kind of user exists. Looked at recent unread topics and this one I found at my third try.

The patterns to look for in this case there are about 1200 posts that all "look" the same:
- Each post begins with a quote.
- Followed by one or two lines of text.

Other things to look for:
- All roughly the same total length.
- All roughly the same number of paragraphs, of the same length.
- Same number of sentences, of the same length.
- Each with for example one image.

All these are in my opinion the result of "forced" content generation. Usually with financial incentive I would assume.

Of course above metric can be gamed. The thing here is that this pattern is predictable. The next posts of above user will also look the same. Introucing more variety in post style will take more effort, and would possibly also be indicative of improved quality.

NotATether (OP)

Legendary

Offline

Activity: 2142
Merit: 9078

Trêvoid █ No KYC-AML Crypto Swaps

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 19, 2025, 10:53:33 AM

#99

Quote from: nutildah on May 18, 2025, 10:20:08 AM

Don't let the bots bring you down, NotATether! Cheesy

Thanks, I appreciate it.

Quote from: Wouter Mense on May 19, 2025, 09:50:57 AM

Quote from: NotATether on April 18, 2025, 03:37:35 PM

The issue is, I currently don't have a reliable way to measure post quality.

Noted. I do think, however, that post quality can be quantified somehow, so I'm going to look for some research on how that would be calculated. Probably it should be between 0 and 1.

Then the user quality can be set to the mean of all post qualities from that user, which is then used as a weight for search results, but will not dampen results too much compared to post quality.

.
betpanda.io

│

ANONYMOUS & INSTANT
.......ONLINE CASINO.......

│

SLOT GAMES
....SPORTS....
LIVE CASINO

│

Regional Sponsor of the
Argentina National Team

Wouter Mense

Newbie

Offline

Activity: 16
Merit: 12

Re: Talksearch.io - Advanced Bitcointalk Search Engine

May 19, 2025, 12:10:51 PM
Last edit: May 19, 2025, 12:43:18 PM by Wouter Mense

#100

I do assume a strong correlation between post and user quality but I don't have proof.

Also I totally ignored topic context.

Quote from: NotATether on May 19, 2025, 10:53:33 AM

post quality can be quantified somehow

Looking at just one post without context? I guess it would be less cpu time?

Quote

look for some research

After reading your post I did pose a few questions to ai chat with possibly interesting results. Queries (in order, with typos, and ai chat answers between each query):

quantify post quality of a forum post
specifically site is bitcointalk.org
indicate which of these an be measured with low computational cost
rearrange the low cost metrics from best to worst
adjust for the fact that accounts can be bought and sold
adjust to the fact that users may get paid for posting
same analisys for comments vs opening posts
which are most usefule without taking context from other posts

Offtopic, I hope you appreciate getting more questions instead of more answers. I do believe asking the right questions is more helpful to start your research. I can't vouch for the quality of ai answers, just that it looked interesting. I'm not a programmer, but it does offer to write your code as well.

Pages: « 1 2 3 4 [5] 6 7 8 9 10 » All

Bitcoin Forum > Other > Meta > Talksearch.io - Advanced Bitcointalk Search Engine

« previous topic next topic »

Jump to: