Bitcoin Forum
Topic: python script compare lines in 2 text files and output matches
crofrihosl (OP)
September 10, 2019, 12:42:29 PM
Merited by LoyceV (1)
 #1

Compare every line in file1 with the lines in file2.
Compare the whole string, not only the first character.

For example:

file1
Code:
1FFYY4EGHTVBWHEQbPcceME9YA6BWnEJxK
1GYeVf48v55hWHwynqgpXSnP84A96K9JxJ
1Ji25E8DaLpsgekWhkQk4UG5L6pz468EKy
1K5MT7BbKvCj4YeALeoEQr5sK2bH2uZdWi
1KRQjx2T31HC5boSoj9h3eMxHPkTFVtcJX

file2
Code:
1C1wxy5pcFj9KBFDFFnVyUYr7puT8abHaW	
1K5MT7BbKvCj4YeALeoEQr5sK2bH2uZdWi
1Ly8X7xSoJdM6nfZSi1HDQuBjMjiuiev1r
12ux1FpMq5iJ14wycDV2DpBcqxHTTGPSjC
16jw8vgKjA8DThTwpBb3pfk6tGbMHWnz6x

output
Code:
1K5MT7BbKvCj4YeALeoEQr5sK2bH2uZdWi
mocacinno
September 10, 2019, 01:01:32 PM
Merited by bones261 (2), ABCbits (1)
 #2

In Python, that's relatively easy.
I wrote this code from memory (and copy/pasted 2 lines from the source I mention below). It should work, but there may be typos.

Code:
firstfile= [line.rstrip('\n') for line in open("textfile_containing_first_list.txt")]
secondfile= [line.rstrip('\n') for line in open("textfile_containing_second_list.txt")]
for firstline in firstfile:
  if firstline in secondfile:
    print(firstline)

Part of the source: https://qiita.com/visualskyrim/items/1922429a07ca5f974467 (I was too lazy to write a loop over a file handle from memory)

crofrihosl (OP)
September 10, 2019, 01:41:40 PM
 #3

5 STARS Grin Grin Grin
thank you so much
crofrihosl (OP)
September 10, 2019, 01:50:07 PM
 #4

I tried to add a line redirecting the print output to 3.txt.

I need the correct line.
 
Code:
 print(firstline)	, file=open("3.txt", "a"))
Errors:
>>IndentationError: unexpected indent
>>SyntaxError: invalid syntax
mocacinno
September 10, 2019, 01:56:13 PM
 #5

Make sure your indentation is correct: tabs and spaces are not interchangeable.

As for writing to a file:
Code:
file = open("outputfile.txt", "a+")
file.write("key %s\n" % firstline)  # text mode already translates "\n" to the platform line ending
file.close()

I'm heading home for the day. If you have more questions, don't hesitate to ask; I'll answer them tomorrow (or somebody else will probably help you out in my absence).
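
For reference, here is a minimal sketch of the print(..., file=...) form attempted in the previous post. It assumes Python 3 (Python 2 needs "from __future__ import print_function" first). The name write_matches and its parameters are made up for this illustration and do not come from the thread. The output file is opened once, before the loop, and each match is written inside the if.

```python
def write_matches(first_lines, second_lines, out_path):
    """Append every line of first_lines that also occurs in second_lines to out_path.

    write_matches is an illustrative name, not something from the thread.
    """
    matches = []
    # Open the output file once; "a" appends, "w" would overwrite instead.
    with open(out_path, "a") as out:
        for line in first_lines:
            if line in second_lines:
                # file= is a keyword argument inside the print() call itself.
                print(line, file=out)
                matches.append(line)
    return matches
```

Calling write_matches(firstfile, secondfile, "3.txt") with the two lists from the earlier code would append the matches to 3.txt.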

crofrihosl (OP)
September 10, 2019, 02:13:20 PM
 #6

It prints the correct result,
but outputfile.txt always contains only the last line of file1.

Have a good day.


For now I am using this batch file as a temporary solution:
Code:
@echo off
comprs.py >> 3.txt
exit
bob123
September 10, 2019, 04:19:49 PM
 #7

it print correct result
but in outputfile.txt file always containing the last line in file1 Undecided

Then you are most probably calling it in the wrong place.
You need to write it to the file when you are checking (and printing) the line which is present in both files.

If we take the code from above:

Code:
# open the output file in append mode, in this example as "file"
file = open("outputfile.txt", "a")

firstfile= [line.rstrip('\n') for line in open("textfile_containing_first_list.txt")]
secondfile= [line.rstrip('\n') for line in open("textfile_containing_second_list.txt")]
for firstline in firstfile:
  if firstline in secondfile:
    print(firstline)
    file.write(firstline+"\n")  # write each match as it is found, inside the if
file.close()

crofrihosl (OP)
September 10, 2019, 08:04:00 PM
 #8


can you correct this code:

Code:
from coinkit.keypair import BitcoinKeypair

with open("prvkey.txt","r") as f:
    in_prvkey = f.readlines()
in_prvkey = [x.strip() for x in in_prvkey]
f.close()
#print  in_prvkey

outfile = open("prvkey2add.txt","w")
for x in in_prvkey:
  k = BitcoinKeypair(x)
  print k
 
outfile.write(k.address(x)+"\n")
outfile.close()


The output-file part is what doesn't work.

(Kindly also suggest a library that is up to date and supports all private key formats.) It should:
- read private keys from a file
- output the public address
mocacinno
September 11, 2019, 09:18:44 AM
 #9

To tell you the truth, I've never used the "BitcoinKeypair" class from coinkit. That being said, I do see a potential problem with the indentation of the code you've shared:
On line 10, you start looping over all private keys in the list in_prvkey, but line 14 is not indented, so the line "outfile.write(k.address(x)+"\n")" runs only once, after the loop has finished. It will write just one address to your file (the one generated from the last private key in in_prvkey).
If your input file happens to end with a blank line, the last element of the list may be empty, so the address function might fail on it.

Anyway, if you see errors, my gut feeling is that it's either wrong indentation combined with an error on the last line of the input file, or a misuse of BitcoinKeypair. Can you post the exact error message?
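
The indentation point can be sketched as follows. This is only an illustration under stated assumptions: derive_address is a hypothetical stand-in for whatever key-to-address call your library provides (it is not coinkit's API), and convert_keys is a name made up here. The write sits inside the loop, and blank lines are skipped before they reach the address function.

```python
def derive_address(private_key):
    # Hypothetical placeholder: swap in the real key-to-address call from
    # the library you use. It only tags the input so that the sketch runs.
    return "addr-for-" + private_key


def convert_keys(in_path, out_path, to_address=derive_address):
    """Write one address per non-empty input line; return how many were written."""
    written = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for raw in src:
            key = raw.strip()
            if not key:
                # Skip blank lines so they never reach to_address.
                continue
            # Indented under the for loop, so it runs once per key.
            dst.write(to_address(key) + "\n")
            written += 1
    return written
```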

bob123
September 11, 2019, 09:27:45 AM
 #10

can you correct this code:
~snip~
the output file part Huh

If it prints the correct key but writes the wrong one (or nothing) to the file, my solution has already been posted.
Just read the code and at least try to understand it. I explained how to fix it and gave an example integrated into the code from above.

crofrihosl (OP)
September 11, 2019, 01:01:58 PM
 #11

I tried it the minute you posted it.
It didn't work either! No need for an explanation.

Thank you bob123 and mocacinno; I sent you both positive feedback.
crofrihosl (OP)
September 11, 2019, 01:09:07 PM
 #12

1- save this as comprs.py
Code:
firstfile= [line.rstrip('\n') for line in open("file1.txt")]
secondfile= [line.rstrip('\n') for line in open("file2.txt")]
for firstline in firstfile:
  if firstline in secondfile:
    print(firstline)

2- save this as result.bat (batch file)
Code:
@echo off
comprs.py >> file3.txt
start file3.txt
exit

3-you only need to run result.bat
crofrihosl (OP)
September 24, 2019, 02:31:44 PM
 #13

#edit

Code:
# these imports are not used by the comparison below
import ecdsa, ecdsa.der, ecdsa.util, hashlib
import os, re, struct
import bitcoin

f1=open("output","a") # append; use open("output","w") to overwrite instead
firstfile= [line.rstrip('\n') for line in open("file1t")]
secondfile= [line.rstrip('\n') for line in open("file2")]

for firstline in firstfile:
  if firstline in secondfile:
    print >>f1, (firstline)  # Python 2 syntax; in Python 3 use print(firstline, file=f1)
f1.close()
LoyceV
September 24, 2019, 03:57:19 PM
 #14

Does it have to be python? Bash command comm does exactly what you need:
Code:
comm -12 <(sort file1) <(sort file2)

I don't know how fast a Python loop would be, but the above code takes about 0.05 seconds for 2 files with 50,000 lines each.

odolvlobo
September 24, 2019, 04:57:54 PM
Last edit: September 24, 2019, 05:11:54 PM by odolvlobo
 #15

Code:
...
for firstline in firstfile:
  if firstline in secondfile:
    print >>f1, (firstline)
For a small number of lines, that might be ok. But for a large number of lines, it would be faster to sort the files first, and then compare. It's O(n log n) vs. O(n²).

Like this:

Does it have to be python? Bash command comm does exactly what you need:
Code:
comm -12 <(sort file1) <(sort file2)

I don't know how fast a Python loop would be, but the above code takes about 0.05 seconds for 2 files with 50,000 lines each.

Comparison pseudocode looks like this:
Code:
e1 = file1.begin()
e2 = file2.begin()
while e1 ≠ file1.end() and e2 ≠ file2.end()
    if *e1 < *e2
        ++e1
    else if *e1 > *e2
        ++e2
    else
        print *e1
        ++e1
        ++e2

Computer science FTW.
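
The pseudocode above translates fairly directly to Python. This is a minimal sketch, assuming both inputs are already sorted lists of strings. The name sorted_intersection is made up for this example, and the sample data are the file1 and file2 addresses from the first post.

```python
def sorted_intersection(a, b):
    """Return the items present in both sorted lists a and b (two-pointer merge)."""
    i, j = 0, 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1          # a[i] cannot be in b any more
        elif a[i] > b[j]:
            j += 1          # b[j] cannot be in a any more
        else:
            common.append(a[i])
            i += 1
            j += 1
    return common


file1 = sorted([
    "1FFYY4EGHTVBWHEQbPcceME9YA6BWnEJxK",
    "1GYeVf48v55hWHwynqgpXSnP84A96K9JxJ",
    "1Ji25E8DaLpsgekWhkQk4UG5L6pz468EKy",
    "1K5MT7BbKvCj4YeALeoEQr5sK2bH2uZdWi",
    "1KRQjx2T31HC5boSoj9h3eMxHPkTFVtcJX",
])
file2 = sorted([
    "1C1wxy5pcFj9KBFDFFnVyUYr7puT8abHaW",
    "1K5MT7BbKvCj4YeALeoEQr5sK2bH2uZdWi",
    "1Ly8X7xSoJdM6nfZSi1HDQuBjMjiuiev1r",
    "12ux1FpMq5iJ14wycDV2DpBcqxHTTGPSjC",
    "16jw8vgKjA8DThTwpBb3pfk6tGbMHWnz6x",
])
print(sorted_intersection(file1, file2))
# prints ['1K5MT7BbKvCj4YeALeoEQr5sK2bH2uZdWi']
```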

crofrihosl (OP)
September 24, 2019, 05:30:52 PM
 #16

Does it have to be python? Bash command comm does exactly what you need:
Code:
comm -12 <(sort file1) <(sort file2)

I don't know how fast a Python loop would be, but the above code takes about 0.05 seconds for 2 files with 50,000 lines each.

I use Windows, so yes, it has to be Python.
And yes, it is very slow for files above 1,000,000 lines.

When I wanted to compare a file with 10,000,000+ lines against another with 2,000,000 lines, I had to cancel and close the script.

I sorted the first file and then split it into many smaller files of 100k lines each, and that was very slow too.

At some point Python could not meet my requirements.
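
One option that stays in Python is to load the smaller file into a set and stream the larger one, since set membership is a hash lookup rather than a scan through a list. This is a minimal sketch, assuming the smaller file fits in memory. The name matching_lines and the file names in the usage note are illustrative, and no timing is claimed for it here.

```python
def matching_lines(small_path, large_path):
    """Yield each line of large_path that also appears in small_path.

    matching_lines is an illustrative name, not something from the thread.
    """
    # Hold only the smaller file in memory, as a set of stripped lines.
    with open(small_path) as small:
        wanted = {line.rstrip("\n") for line in small}
    # Stream the larger file line by line instead of loading it whole.
    with open(large_path) as large:
        for line in large:
            line = line.rstrip("\n")
            if line in wanted:
                yield line
```

For example, "for match in matching_lines("file2.txt", "file1.txt"): print(match)" would print the shared lines in the order they appear in file1.txt.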

LoyceV
September 24, 2019, 06:16:07 PM
 #17

i use windows so yes it has to be python
You can use bash utilities on Windows too. Or boot your computer from an Ubuntu LIVE DVD just to do this.

Quote
yes it is very very slow for files above 1,000,000 lines

when i want to compare a files with +10,000,000 lines to other 2,000,000 lines (i had to cancel and close the script Undecided )
It took me a while to create 2 test-files with 50,000 Bitcoin addresses, so I just copied the same addresses to the same files to make it 10 million and 2 million lines per file.
The comm command above took 9 seconds to find all matches (and my PC is not very fast). I strongly suggest using the proper tools for the job.

Update: it used 1.5 GB RAM to do this. If you have much more data to compare, it might reduce memory load if you sort the files first.
