Bitcoin Forum
Author Topic: List of all Bitcoin addresses with a balance  (Read 9806 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic. (14 posts by 1+ user deleted.)
LoyceV (OP)
Legendary
*
Offline Offline

Activity: 3612
Merit: 18439


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
December 11, 2024, 10:17:05 AM
 #221

It's back!
My site addresses.loyce.club/ is back online. For now, there's only the most recent snapshot. During the next year, I'll keep more snapshots again.

Direct link to LATEST versions
blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz (currently 1.5GB)
Bitcoin_addresses_LATEST.txt.gz (currently 1.3GB)

Bandwidth
Starting December 2024, I have a new VPS. This server is allowed 16 TB bandwidth per month. Enjoy!



I've received a few PMs from people who missed my data. It's always good to see it fills a need.

LoyceV (OP)
January 03, 2025, 04:45:12 PM
 #222

It was brought to my attention that my "sort" is "different" now, and I got these results testing:

Code:
cat Bitcoin_addresses_LATEST.txt.gz | gunzip | sha256sum
df0baad2301e9b897a02bd3fccb115968c82eb3956143e2f5b4c3ad7b2c227bf  -

So far so good.
Now, this file is sorted on my server from a cronjob. But when I sort it on my local computer, I get this:
Code:
cat Bitcoin_addresses_LATEST.txt.gz | gunzip | sort -S20% | sha256sum
27c2541369d0546ec7c7e70d09d807d8fc6d39435f8857e5ebbf8386584be2d2  -

Has anyone else noticed an incompatible sorting method? Should I change this to a different sorting? Or would that break scripts from people who are currently using it?

pbies
Full Member
***
Offline Offline

Activity: 326
Merit: 196



View Profile
January 03, 2025, 05:18:29 PM
Merited by LoyceV (42), vapourminer (1)
 #223

We were talking about that in our private messages.

My suggestions:
  • Use pv instead of cat so you can see progress; it won't affect the result.
  • Sorting should probably set LC_ALL=C or LC_ALL=C.UTF-8 before the sort command, so that every system uses the same collation order.
  • Because systems, servers, and OSes differ, we should always state the locale (LC_ALL=...) with every sort command.
  • Changing this now could break people's scripts, but we should settle on one sort order for good; that's how it should be engineered.
  • The difference is visible on the first screenful of the sorted file: depending on the system or the LC_ALL value, the addresses are ordered differently (mainly, lowercase and uppercase letters sort in a different order).

BTC: bc1qmrexlspd24kevspp42uvjg7sjwm8xcf9w86h5k
LoyceV (OP)
January 03, 2025, 05:51:18 PM
 #224

Quote
We were talking about that in our private messages.
Thank you for pointing this out :)

Quote
Sorting should probably set LC_ALL=C or LC_ALL=C.UTF-8 before the sort command, so that every system uses the same collation order.
I'll wait and see whether anyone comes up with a good reason to keep things the way they are. If not, I think I'll go for LC_ALL=C.

Quote
Because systems, servers, and OSes differ, we should always state the locale (LC_ALL=...) with every sort command.
I agree. I just didn't know about the difference, and (before my dedicated server disappeared) never stumbled upon this problem.

Quote
Changing this now could break people's scripts, but we should settle on one sort order for good; that's how it should be engineered.
Let's give it 2 weeks. But I guess most people won't read this until after I've broken their script by changing things :P

Quote
The difference is visible on the first screenful of the sorted file: depending on the system or the LC_ALL value, the addresses are ordered differently (mainly, lowercase and uppercase letters sort in a different order).
Here's the difference:
Code:
11111111111111111111HV1eYjP
11111111111111111111HeBAGj
11111111111111111111QekFQw
11111111111111111111UpYBrS
11111111111111111111g4hiWR
11111111111111111111jGyPM8
11111111111111111111o9FmEC
11111111111111111111ufYVpS
vs:
Code:
11111111111111111111g4hiWR
11111111111111111111HeBAGj
11111111111111111111HV1eYjP
11111111111111111111jGyPM8
11111111111111111111o9FmEC
11111111111111111111QekFQw
11111111111111111111ufYVpS
11111111111111111111UpYBrS
That is annoying to deal with!
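
For reference, the first listing is plain byte order, where every uppercase letter sorts before every lowercase one; that is what LC_ALL=C produces. A minimal sketch to reproduce it, assuming GNU sort (the variable names are only for illustration):

```shell
# Sample addresses from the listings above, deliberately not in byte order.
addresses='11111111111111111111g4hiWR
11111111111111111111HeBAGj
11111111111111111111HV1eYjP
11111111111111111111jGyPM8
11111111111111111111o9FmEC
11111111111111111111QekFQw
11111111111111111111ufYVpS
11111111111111111111UpYBrS'

# LC_ALL=C compares raw bytes, so 'H' (0x48) sorts before 'g' (0x67).
c_sorted=$(printf '%s\n' "$addresses" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"

echo "---"

# No locale given: the order depends on the environment of the machine.
printf '%s\n' "$addresses" | sort
```

The second, unqualified sort follows whatever locale the shell happens to use, so its output can differ from one machine to the next.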



This can of course easily be avoided by sorting the data on your local system before using it. For this project, it's quite easy. But for all Bitcoin addresses ever used, it can take hours to sort the data.
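
Before spending hours on a re-sort, it may be enough to check whether a file is already in the order you expect: sort -c only verifies the order in a single pass and writes nothing. A sketch, assuming GNU sort and gunzip; is_c_sorted and the sample files are made up for illustration:

```shell
# Return 0 if the gzipped list is already in LC_ALL=C (byte) order.
is_c_sorted() {
    gunzip -c "$1" | LC_ALL=C sort -c 2>/dev/null
}

# Demo with two tiny stand-in files.
tmpdir=$(mktemp -d)
printf '%s\n' 1A 1B 1a 1b | gzip > "$tmpdir/byte_order.txt.gz"
printf '%s\n' 1a 1A 1b 1B | gzip > "$tmpdir/other_order.txt.gz"

for f in "$tmpdir/byte_order.txt.gz" "$tmpdir/other_order.txt.gz"; do
    if is_c_sorted "$f"; then
        echo "${f##*/}: already in byte order"
    else
        echo "${f##*/}: needs LC_ALL=C sort"
    fi
done
rm -rf "$tmpdir"
```

Note that the function only reports the exit status of sort; a failing gunzip would look like an empty (and therefore sorted) file, so test the archive first if in doubt.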

15052000bitcoin
Newbie
*
Offline Offline

Activity: 1
Merit: 0


View Profile
January 29, 2025, 04:38:40 AM
 #225

Oh wow!

Thank you very much.
pbies
January 29, 2025, 04:50:50 AM
Last edit: January 29, 2025, 05:02:24 AM by pbies
Merited by LoyceV (8), vapourminer (1)
 #226

@LoyceV

So I've made some further tests.

And it seems everything is OK!

LC_ALL=C should be set explicitly on systems where the default locale is something else.

To test, I've put LC_ALL=C before the sort command and before the compare command (my cmn script, which uses the comm program):

sort-u-mt:
Code:
#!/usr/bin/env bash
FILESIZE=$(stat -c%s "$1")
time pv -cN input "$1" | dos2unix -f | LC_ALL=C sort -u -S 20% --parallel=16 | pv -cN output -s $FILESIZE > "$1.sorted~"
if [[ -s "$1.sorted~" ]]
then
mv "$1.sorted~" "$1"
echo Done.
else
echo Error!
fi
>&2 echo -ne "\a"

cmn:
Code:
#!/usr/bin/env bash
time LC_ALL=C comm -12 <(pv -cN in1 "$1") <(pv -cN in2 "$2") | (pv -cN out) > "$3"
echo -e "\nResult file has $(wc -l < "$3") lines, head:"
head "$3"
>&2 echo -ne "\a"

So your files are sorted with LC_ALL=C, but the files I sorted with my own script were not.

If we add LC_ALL=C before sort and comm we get the expected results.

So in my opinion there is no change needed.

LoyceV (OP)
January 29, 2025, 08:20:43 AM
 #227

Quote
I've put LC_ALL=C before the sort command
Why not just put it at the start of your script once?
Code:
export LC_ALL=C
I'll add this to the OP. And I'll add it to my own script, so it no longer depends on the server I'm using. And right when I wanted to do this, I realized it's there already:
Code:
export LC_ALL=C   # This makes "sort" a few percent faster
The reason I added this a long time ago is in the comment. I completely forgot this was in there. I'll also add it to "all addresses ever used".

Quote
Code:
sort -u -S 20% --parallel=16
Are you sure this makes it faster? When I tested it (on a server with HDD), adding more CPU threads only helped if the data fit in RAM. With more threads, sort needs more memory, so you don't want that if it means writing more to /tmp.
Without the --parallel setting, sort already uses many cores, so I used this setting to limit it.

pbies
January 29, 2025, 02:38:15 PM
Last edit: January 29, 2025, 03:10:52 PM by pbies
 #228

Quote
Quote
Code:
sort -u -S 20% --parallel=16
Are you sure this makes it faster? When I tested it (on a server with HDD), adding more CPU threads only helped if the data fit in RAM. With more threads, sort needs more memory, so you don't want that if it means writing more to /tmp.
Without the --parallel setting, sort already uses many cores, so I used this setting to limit it.

Well, I'd need to test different memory percentages and thread counts to find the fastest combination.

I think sort was single-threaded when no --parallel setting was given.

LoyceV (OP)
January 29, 2025, 02:45:17 PM
 #229

Quote
I think sort was single-threaded when no --parallel setting was given.
Mine uses all cores until available memory becomes a limitation.
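
One way to settle this for a particular machine is to time the same sort with different --parallel values. A rough harness, assuming GNU coreutils (sort --parallel, seq, mktemp); the generated data is tiny and only a stand-in, so point input at a real address file for numbers that mean anything:

```shell
export LC_ALL=C                 # byte-order collation, the same on every system

input=$(mktemp)                 # stand-in data: 100000 numbers, descending
seq 100000 -1 1 > "$input"

# Time the same sort with 1, 2 and 4 threads (whole seconds only).
for threads in 1 2 4; do
    start=$(date +%s)
    sort -u -S 20% --parallel="$threads" "$input" > "$input.sorted"
    end=$(date +%s)
    echo "--parallel=$threads: $((end - start)) s"
done

# Sanity check: -u must not drop anything, since all lines are unique.
result_lines=$(wc -l < "$input.sorted")
echo "sorted lines: $result_lines"
rm -f "$input" "$input.sorted"
```

To see the effect described above, the interesting runs are the ones where the -S buffer is smaller than the input, because that is when sort starts writing temporary files.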

timon174174
Newbie
*
Offline Offline

Activity: 2
Merit: 2


View Profile
March 03, 2025, 07:21:02 PM
Merited by LoyceV (2)
 #230

Hello, I can't download blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz

Forbidden
LoyceV (OP)
March 03, 2025, 07:36:13 PM
Last edit: March 04, 2025, 08:49:18 AM by LoyceV
Merited by timon174174 (1)
 #231

Quote
I can't download blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz
Forbidden
Thanks for reporting this. I'm not sure what went wrong. I've started an update, but it will take a while to complete. Please check again tomorrow.

Update: it works again :)

timon174174
March 04, 2025, 09:17:10 AM
 #232

Yes it works, thank you for what you do
pbies
March 08, 2025, 05:07:48 AM
 #233

It seems there is still something wrong with the downloaded files.

For a few days the file hasn't changed; I keep downloading the same file again and again...

LoyceV (OP)
March 08, 2025, 09:40:13 AM
 #234

Quote
For a few days the file hasn't changed; I keep downloading the same file again and again...
Thanks for letting me know. My last manual fix didn't remove a temporary directory, which prevented the update from running again. It should be okay again (but it takes a while to update).

pbies
March 08, 2025, 03:03:57 PM
 #235

When are the dumps created?

At what time each day?

I'd like to make a crontab entry to download a fresh file each day...

LoyceV (OP)
March 08, 2025, 04:32:11 PM
Merited by vapourminer (1)
 #236

Quote
When are the dumps created?
At what time each day?
There's no fixed time: Blockchair's source data is sometimes delayed, which is why I check the file date before proceeding. If there's no new file, I try again an hour later.

Quote
I'd like to make a crontab entry to download a fresh file each day...
Instead of downloading LATEST, you could download "today's date": http://addresses.loyce.club/Bitcoin_addresses_March_08_2025.txt.gz
If it doesn't exist, try again after an hour. That avoids duplicate downloads.
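
A sketch of that approach, assuming GNU date and curl. The function names are hypothetical, and the date comes from the local clock, which is an assumption: the server may name a snapshot by its own date or time zone.

```shell
base_url="http://addresses.loyce.club"

# Build the dated URL, e.g. .../Bitcoin_addresses_March_08_2025.txt.gz
# $1 is an optional date understood by GNU date -d (default: today).
snapshot_url() {
    day=$(LC_ALL=C date -d "${1:-today}" +%B_%d_%Y)
    printf '%s/Bitcoin_addresses_%s.txt.gz\n' "$base_url" "$day"
}

# Download the snapshot once: skip it if the file is already here, and
# return non-zero if the server has not published it yet.
fetch_snapshot() {
    url=$(snapshot_url "${1:-}")
    file=${url##*/}
    [ -e "$file" ] && return 0
    curl -sfI "$url" > /dev/null && curl -sfO "$url"
}

snapshot_url 2025-03-08
# fetch_snapshot    # uncomment to download; run hourly until it succeeds
```

Because the dated file name changes every day, a successful download is never repeated, and a failed attempt simply leaves nothing behind for the next hourly run.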

pbies
March 08, 2025, 06:00:21 PM
 #237

Bash script below, meant to be run every hour from /etc/crontab.

It downloads a new file when the local and remote file sizes differ.

In the crontab entry, first cd to the folder that holds the script and the target file.

Code:
#!/usr/bin/env bash
# Requires curl, pv and aria2 (apt install aria2)
url=http://addresses.loyce.club/Bitcoin_addresses_LATEST.txt.gz
file=Bitcoin_addresses_LATEST.txt.gz
echo Checking file size...
# Remote size from the Content-Length header (header names are case-insensitive)
remote=$(curl -sI "$url" | grep -i '^Content-Length:' | grep -oE '[0-9]+')
# Local size, or 0 if the file has not been downloaded yet
have=$(stat -c%s "$file" 2>/dev/null || echo 0)
if [ -z "$remote" ]; then echo "Could not read remote size!" ; exit 1 ; fi
if [ "$have" -eq "$remote" ]; then echo Duplicate! ; exit 0 ; fi
echo Removing old file...
rm -f "./$file"
echo Downloading...
aria2c -x4 "$url"
echo Unpacking...
pv "$file" | gunzip > addrs-with-bal.txt
echo Done!
>&2 echo -ne "\a"

AliBah
Newbie
*
Offline Offline

Activity: 42
Merit: 0


View Profile
March 13, 2025, 07:17:51 AM
 #238

The latest file downloaded correctly, but when I try to use or extract it, I get this error:

    D:\blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv.gz: Checksum error in D:\blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv\blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv. The file is corrupt
LoyceV (OP)
March 13, 2025, 07:49:02 AM
 #239

Quote
The latest file downloaded correctly, but when I try to use or extract it, I get this error:

    D:\blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv.gz: Checksum error in D:\blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv\blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv. The file is corrupt
I tested the file:
Code:
gzip -t blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv.gz
This doesn't give any errors. Just download it again.
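
If a download keeps failing, it can help to test the archive and compare a hash with someone who has a working copy. A small sketch, assuming gzip and sha256sum are installed; verify_gz and the sample files are made up for illustration:

```shell
# Print "OK <sha256>" if the archive passes gzip's integrity test,
# otherwise print "CORRUPT" and return non-zero.
verify_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK $(sha256sum "$1" | cut -d' ' -f1)"
    else
        echo "CORRUPT"
        return 1
    fi
}

# Demo: one intact archive and one truncated copy of it.
demo=$(mktemp -d)
printf 'hello\n' | gzip > "$demo/good.gz"
head -c 15 "$demo/good.gz" > "$demo/truncated.gz"

verify_gz "$demo/good.gz"
verify_gz "$demo/truncated.gz" || echo "download it again"
rm -rf "$demo"
```

If two people get different hashes for the same dated file, the problem is in the transfer rather than in the file on the server.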

AliBah
March 13, 2025, 08:58:46 AM
 #240

SalaR@PC-User MINGW64 /d
# gzip -t blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv.gz

gzip: blockchair_bitcoin_addresses_and_balance_March_12_2025.tsv.gz: invalid compressed data--format violated


I redownloaded it and still get the error.

I also tried this file: blockchair_bitcoin_addresses_and_balance_March_11_2025.tsv.gz
and that one is OK!