Bitcoin Forum
July 17, 2024, 11:58:37 PM *
Author Topic: All used addresses  (Read 688 times)
naufragus
July 21, 2020, 12:44:45 AM
Last edit: July 21, 2020, 02:05:32 AM by naufragus
 #21

Please donate to me instead (LOL), I have never received a donation lol
No, I am not selling data.
[It will take me a few days to upload everything to git as of today.]

And I think this is much stronger proof that we are NOT
really interested in money, though some private companies may be.


only one liners,  yeah oh... go learn some more
@loyce you didnt post the resulting lists, how can anybody be sure about; 
@windows/osx ;people please remove windows or mac os
LoyceV
August 03, 2020, 08:27:11 AM
 #22

I made List of all Bitcoin addresses ever used.

one feature of my lists is i tried to keep the original order in which addresses first appeared in the blockchair dumps..
It works with awk:
Code:
awk '!a[$0]++'
But this requires far too much memory. I can use this on data per day, but not on all data.
So for now, I gave up trying to keep addresses in chronological order. I'll keep the original data in case I find a different solution (or enough RAM) later.

naufragus
August 23, 2020, 10:06:59 PM
Last edit: August 24, 2020, 12:21:59 PM by naufragus
 #23

I made List of all Bitcoin addresses ever used.

one feature of my lists is i tried to keep the original order in which addresses first appeared in the blockchair dumps..
It works with awk:
Code:
awk '!a[$0]++'
But this requires far too much memory. I can use this on data per day, but not on all data.
So for now, I gave up trying to keep addresses in chronological order. I'll keep the original data in case I find a different solution (or enough RAM) later.

Hey, that is very nice, bro.
You have the means, you work hard, and you are very good at it.
I knew that awk one-liner you wrote, though I tried using perl because I thought it might need less RAM.
The sort command can keep chronological order without using ram, but it needs a large temp directory (/tmp will not work if it is RAM-backed, because it is then limited by the system RAM).
ok, cheers!
LoyceV
August 24, 2020, 09:59:41 AM
 #24

The sort command can keep chronological order without using ram
How? I haven't found that option.

bob123
August 24, 2020, 10:10:57 AM
 #25

The sort command can keep chronological order without using ram
How? I haven't found that option.

By default, the sort command only takes roughly 50% of your available RAM.
If you are running out of memory using sort on a large file, most likely there isn't enough space on your hard drive.

When the file is larger than your available RAM, sort creates temporary files on the hard drive, which are then merge-sorted at the end.
So the overall disk capacity needed is the size of the file times 3 (if you keep the original one) or 2 (if you overwrite the original file).
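
A minimal sketch of steering that behaviour, assuming GNU sort and its -S (buffer size) and -T (temporary directory) options. The file names, the 1M buffer and the generated test data are made up for illustration only:

```shell
# Hypothetical demo: paths and sizes are illustrative only.
workdir=$(mktemp -d)
mkdir "$workdir/sorttmp"

# Build a small input file in descending order.
seq 20000 -1 1 > "$workdir/input.txt"

# -S caps sort's in-memory buffer; -T chooses where temporary files go
# if the input does not fit in that buffer.
LC_ALL=C sort -n -S 1M -T "$workdir/sorttmp" "$workdir/input.txt" > "$workdir/sorted.txt"

head -n 3 "$workdir/sorted.txt"
```

On a real data set you would point -T at a disk with enough free space for the temporary files.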


I might have just misunderstood the problem, might want to elaborate the actual issue?

LoyceMobile
LoyceV on the road. Or couch.


August 24, 2020, 10:43:44 AM
 #26

I might have just misunderstood the problem, might want to elaborate the actual issue?
I want to remove duplicate lines without changing the order, so only keeping the first occurrence.

Say:
A
G
D
A
B
C
D

I want to keep:
A
G
D
B
C
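
For reference, a small sketch of the awk one-liner from earlier in the thread applied to this sample. It gives the wanted result, but the array a[] holds every distinct line in memory, which is the RAM problem on the full data set. The variable names are only for illustration:

```shell
# Sample input from the post above.
sample=$(printf 'A\nG\nD\nA\nB\nC\nD\n')

# awk prints a line only the first time it is seen;
# a[] grows by one entry per distinct line.
result=$(printf '%s\n' "$sample" | awk '!a[$0]++')

printf '%s\n' "$result"
```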

bob123
August 24, 2020, 11:13:58 AM
Merited by LoyceV (20), ABCbits (3), o_solo_miner (1)
 #27

I want to remove duplicate lines without changing the order, so only keeping the first occurrence.

If i am not mistaken, the following should work:
Code:
cat -n input.txt | sort -uk2 | sort -nk1 | cut -f2- > output.txt

None of these commands needs to hold the file in memory all at once.
But as mentioned previously, sort does need quite some disk space to create temporary files. So that might be a bottleneck, depending on your system specs.
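
To make the steps visible, here is a sketch of the same pipeline split into stages on the small sample from the previous post, assuming GNU coreutils. The intermediate variable names are only for illustration:

```shell
input=$(printf 'A\nG\nD\nA\nB\nC\nD\n')

# Step 1: prefix every line with its line number (number, tab, line).
numbered=$(printf '%s\n' "$input" | cat -n)

# Step 2: sort on the original text (field 2 onward) and keep
# only one line per distinct text.
unique=$(printf '%s\n' "$numbered" | sort -uk2)

# Step 3: restore the original order by line number, then strip the number.
output=$(printf '%s\n' "$unique" | sort -nk1 | cut -f2-)

printf '%s\n' "$output"
```

Step 2 relies on sort -u keeping the first line of each group of equal keys, which is what the GNU documentation describes.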

LoyceV
August 24, 2020, 11:44:38 AM
 #28

Code:
cat -n input.txt | sort -uk2 | sort -nk1 | cut -f2- > output.txt
This looks genius in its simplicity! It worked on a small sample, and I'm currently transferring and extracting 31 GB of data to do the full test. There's no way the double sort will fit on the 100 GB VPS, but this should work without using a lot of RAM.
I'll continue posting my test results in my own topic: List of all Bitcoin addresses ever used.

naufragus
August 24, 2020, 11:44:49 AM
Last edit: August 24, 2020, 01:08:13 PM by naufragus
Merited by LoyceV (6), suchmoon (4), ABCbits (3)
 #29

I described the methodology for my address list in more detail in my GitHub repo.
Here is how I did the sorting:

Code:
$ export TMPDIR='/large/tmp/dir'
$ export LC_ALL=C
$ nl concat.txt | sort -k2 -u | sort -n | cut -f2 > final.txt

Note that using LC_ALL=C will greatly speed up sorting!
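
A slightly more defensive variant, only as a sketch and assuming GNU coreutils: nl does not number empty lines by default and cut -f2 drops anything after a second tab, so this version uses cat -n, an explicit tab delimiter and cut -f2- instead. The function name dedup_keep_first and the demo file names are hypothetical; -s is added just to make the keep-first behaviour explicit.

```shell
# Hypothetical helper, not from the original post.
# $1 = input file, $2 = output file, $3 = directory for sort's temp files
dedup_keep_first() {
    tab=$(printf '\t')
    cat -n "$1" \
        | LC_ALL=C TMPDIR="$3" sort -s -t "$tab" -k2 -u \
        | LC_ALL=C TMPDIR="$3" sort -n \
        | cut -f2- > "$2"
}

# Small demo on the sample from earlier in the thread.
work=$(mktemp -d)
printf 'A\nG\nD\nA\nB\nC\nD\n' > "$work/concat.txt"
dedup_keep_first "$work/concat.txt" "$work/final.txt" "$work"
cat "$work/final.txt"
```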