Bitcoin Forum
Topic: linux question  (Read 950 times)
lolwut (OP)
Legendary
*
Offline Offline

Activity: 1148
Merit: 1000



View Profile
June 09, 2012, 05:14:20 AM
 #1

how can i search for a text string (*string*) inside every file on /mnt/raid?
CIYAM
Legendary
*
Offline Offline

Activity: 1890
Merit: 1075


Ian Knowles - CIYAM Lead Developer


View Profile WWW
June 09, 2012, 05:18:03 AM
 #2

grep -r "string" /mnt/raid/*

With CIYAM anyone can create 100% generated C++ web applications in literally minutes.

GPG Public Key | 1ciyam3htJit1feGa26p2wQ4aw6KFTejU
lolwut (OP)
June 09, 2012, 05:27:33 AM
 #3

grep -r "string" /mnt/raid/*


damn, its really that simple?
CIYAM
June 09, 2012, 05:59:56 AM
 #4

Pretty much that simple. If your string contains characters that are regexp metacharacters then you'll need to escape them, or pass the -F option so that grep treats the pattern as a fixed string rather than a regexp.

Binary files are a separate matter: GNU grep does search them by default but only reports that the binary file matches, so you may need the -a option if you want them treated as text.
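
For what it's worth, grep does have a fixed-string option, -F. Here is a small sketch of the difference; the temporary directory and the two sample files are made up for the demo and simply stand in for /mnt/raid:

```shell
#!/bin/sh
# Illustrative only: a throwaway directory stands in for /mnt/raid.
demo_dir=$(mktemp -d)
printf 'price is 1.50 today\n' > "$demo_dir/a.txt"
printf 'price is 1x50 today\n' > "$demo_dir/b.txt"

# As a regexp, "." matches any character, so both files are listed.
regex_hits=$(grep -rl "1.50" "$demo_dir" | wc -l | tr -d ' ')

# With -F the pattern is a fixed string, so only a.txt is listed.
fixed_hits=$(grep -rlF "1.50" "$demo_dir" | wc -l | tr -d ' ')

echo "regexp: $regex_hits file(s), fixed string: $fixed_hits file(s)"
rm -rf "$demo_dir"
```

The -l flag only lists the names of matching files, which is handy when the goal is to find out which files contain the string.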

lolwut (OP)
June 09, 2012, 06:26:31 AM
 #5

I have 4 external drives and an 8tb nas. I'm trying to take all of the data from the 4 externals and compare it and create one master folder of all unique files.

Any suggestions on how to accomplish this easily? Some files may have the same filename, but different content.
CIYAM
June 09, 2012, 06:50:10 AM
 #6

I see - a little tricky but you might find the following useful for this job:

Code:
find . -type f -print0 | xargs -0 -n1 md5sum | sort >x

Change . to, for example, /mnt/raid, and run this once for each of your drives, writing to a differently named file instead of x each time.

If you then check the contents of each 'x' file you should see something like the following:

Code:
cat x
0f4fa2bf42a91febbde52b7a32495f94  ./sub1/usage.bat
1bd3cb2ef387818ebe0fc318c232e27d  ./test.txt
2f8ff6fabf4b2936197b8a93702461f9  ./y.txt
8594592b8f830139b266c2d167a6fc5c  ./test.bun.gz
8c917a3450a4969d7e32e8da71e176ab  ./sub2/menu.bat
d41d8cd98f00b204e9800998ecf8427e  ./x
f258dcd6600d3ebf238662f8445b5e4a  ./sub1/sub1.1/hello.txt

If you compare ("diff") the various x files then you should be able to find any identical MD5 hashes. A matching hash doesn't strictly guarantee that two files are identical, but it is overwhelmingly likely that they are.
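
One possible way to do that comparison without eyeballing a diff is to reduce each list to its hash column and let comm report the hashes present in both. This is only a sketch: the two lists below are invented sample data in the same "hash  path" format, and the names x1, x2, h1 and h2 are placeholders.

```shell
#!/bin/sh
work=$(mktemp -d)

# Two hypothetical hash lists, one per drive.
cat > "$work/x1" <<'EOF'
1bd3cb2ef387818ebe0fc318c232e27d  ./test.txt
2f8ff6fabf4b2936197b8a93702461f9  ./y.txt
EOF
cat > "$work/x2" <<'EOF'
2f8ff6fabf4b2936197b8a93702461f9  ./backup/y_copy.txt
8594592b8f830139b266c2d167a6fc5c  ./test.bun.gz
EOF

# Keep only the hash column, sorted and de-duplicated (comm needs sorted input).
cut -d' ' -f1 "$work/x1" | sort -u > "$work/h1"
cut -d' ' -f1 "$work/x2" | sort -u > "$work/h2"

# comm -12 prints only the lines that appear in both files.
common_hashes=$(comm -12 "$work/h1" "$work/h2")
echo "hashes present on both drives: $common_hashes"
rm -rf "$work"
```

Any hash it prints can then be looked up with grep in the original lists to see which paths carry the same content.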

notme
Legendary
*
Offline Offline

Activity: 1904
Merit: 1002


View Profile
June 09, 2012, 06:53:50 AM
 #7

If you can program, just get a list of all the files, hash it, check your hash list.  If the hash is already in it, move on, otherwise add the hash and copy the file to the destination folder.  If you can't program, hire a programmer.
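
As a rough sketch of that idea in shell rather than a full program: hash everything, keep the first file seen for each hash, and copy only those. The temporary directories and file names are made up for the demo, md5sum is assumed to be installed, and file names containing newlines are not handled.

```shell
#!/bin/sh
src=$(mktemp -d)
dest=$(mktemp -d)
printf 'alpha\n' > "$src/one.txt"
printf 'alpha\n' > "$src/one_copy.txt"   # same content as one.txt
printf 'beta\n'  > "$src/two.txt"

# Hash every file, then let awk pass through only the first line seen per hash.
find "$src" -type f -exec md5sum {} + | sort | awk '!seen[$1]++' |
while read -r hash file; do
  # "file" receives the rest of the line, so paths with spaces stay intact.
  cp "$file" "$dest/"
done

copied=$(ls "$dest" | wc -l | tr -d ' ')
echo "copied $copied unique file(s)"
rm -rf "$src" "$dest"
```

Here awk's seen array plays the role of the hash list: a line is printed only the first time its hash shows up.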

https://www.bitcoin.org/bitcoin.pdf
While no idea is perfect, some ideas are useful.
lolwut (OP)
June 09, 2012, 07:02:49 AM
 #8

If you can program, just get a list of all the files, hash it, check your hash list.  If the hash is already in it, move on, otherwise add the hash and copy the file to the destination folder.  If you can't program, hire a programmer.

I'd be willing to send someone a few btc to write a script for me. It will have to work using mac binaries.
notme
June 09, 2012, 07:08:39 AM
 #9

If you can program, just get a list of all the files, hash it, check your hash list.  If the hash is already in it, move on, otherwise add the hash and copy the file to the destination folder.  If you can't program, hire a programmer.

I'd be willing to send someone a few btc to write a script for me. It will have to work using mac binaries.

I would think something like python would run fine on a Mac, but there's likely programmers around who could do it without you having to install an interpreter.  If you post your requirements here you should get a decent response: https://bitcointalk.org/index.php?board=52.0.  If not, I can do it for you in python, but it will take me a couple days since I'm busy with a lot of other stuff.

CIYAM
June 09, 2012, 07:12:06 AM
 #10

This could be fairly simply done just as a bash script (assuming you can run bash scripts on a Mac) and I'd be willing to write it for you for 2 BTC, however, there are a couple of things I would need to know first:

1) Are you happy for MD5 (or SHA1) to be the decision that the files are identical?

2) Can the destination files simply go to one directory or if not then how to determine which directory to copy them to?

lolwut (OP)
June 09, 2012, 07:17:08 AM
 #11

This could be fairly simply done just as a bash script (assuming you can run bash scripts on a Mac) and I'd be willing to write it for you for 2 BTC, however, there are a couple of things I would need to know first:

1) Are you happy for MD5 (or SHA1) to be the decision that the files are identical?

2) Can the destination files simply go to one directory or if not then how to determine which directory to copy them to?


I would say that md5 is sufficient. It's going to be 98% text files and 2% other text-based files (.sql,.sqlite,etc).

And destination DIR can be one directory, yes.
CIYAM
June 09, 2012, 07:57:59 AM
 #12

Okay - I've done this using two scripts. The first is called 'copy_files' and does the "find" and "md5sum" work:

Code:
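#!/bin/bash
# copy_files: hash every file under the source directory and hand each one to copy_new_file.
if [ $# -lt 2 ]; then
 echo "Usage: copy_files [source directory] [destination directory]"
else
 if [ ! -d "$2/hashes" ]; then
  echo "Error: Did not find $2/hashes directory (create it and re-run if $2 is the correct destination)."
 else
  # md5sum prints "<hash>  <path>"; read puts the hash in one variable and the
  # rest of the line in the other, so paths containing spaces stay in one piece.
  find "$1" -type f -exec md5sum {} + | while read -r hash file; do
   ./copy_new_file "$2" "$hash" "$file"
  done
 fi
fi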

and the second is called "copy_new_file" which will copy the file to the destination unless a file with the same hash has already been copied before:

Code:
#!/bin/bash
# copy_new_file: copy the source file unless a file with the same hash was copied before.
if [ $# -lt 3 ]; then
 echo "Usage: copy_new_file [destination directory] [hash value] [source file]"
else
 if [ ! -d "$1/hashes" ]; then
  echo "Error: Did not find $1/hashes directory (create it and re-run if $1 is the correct destination)."
 else
  # A marker file named after the hash records that this content has been copied.
  if [ ! -f "$1/hashes/$2" ]; then
   echo "$2 $3" > "$1/hashes/$2"
   cp "$3" "$1"
  fi
 fi
fi

To use first make sure you have execute permissions on both scripts:
Code:
chmod a+x copy_files copy_new_file

Now it is as simple as:
Code:
./copy_files source_dir dest_dir

This does have one problem: if two (or more) files have the same name but different hashes, the later files will simply overwrite the earlier ones. If this is going to be an issue for you then I'll work out a way around it, perhaps by prefixing the filename with the hash.
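
To get a feel for whether name clashes are likely before copying anything, something along these lines could list the file names that occur more than once. It is only a sketch: the drive1 and drive2 directories and their files are made up for the demo, and names containing newlines are not handled.

```shell
#!/bin/sh
src=$(mktemp -d)
mkdir -p "$src/drive1" "$src/drive2"
printf 'first version\n'  > "$src/drive1/notes.txt"
printf 'second version\n' > "$src/drive2/notes.txt"
printf 'only once\n'      > "$src/drive2/todo.txt"

# Strip the directory part of every path, then report names seen more than once.
clashing_names=$(find "$src" -type f | sed 's|.*/||' | sort | uniq -d)
echo "names that would collide: $clashing_names"
rm -rf "$src"
```

A name on that list is not necessarily a problem, since the copies may turn out to have identical content and therefore the same hash.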

lolwut (OP)
June 09, 2012, 08:01:13 AM
 #13

This does have a problem that if you have two (or more) files that have the same name but have different hashes as the subsequent files will just overwrite the earlier ones. If this is going to be an issue for you then I'll work out a way to perhaps prefix the filename with the hash.


Yeah that could be a problem. If you can just have it rename one of the files, that would be superb.
CIYAM
June 09, 2012, 08:22:38 AM
 #14

Okay - no problem - this is an updated "copy_new_file" script:

Code:
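#!/bin/bash
# copy_new_file: copy the source file unless its hash is already recorded;
# if the file name is already taken in the destination, prefix it with the hash.
if [ $# -lt 3 ]; then
 echo "Usage: copy_new_file [destination directory] [hash value] [source file]"
else
 if [ ! -d "$1/hashes" ]; then
  echo "Error: Did not find $1/hashes directory (create it and re-run if $1 is the correct destination)."
 else
  if [ ! -f "$1/hashes/$2" ]; then
   echo "$2 $3" > "$1/hashes/$2"
   # Strip the directory part so only the file name is checked against the destination.
   fname="${3##*/}"
   if [ -f "$1/$fname" ]; then
    cp "$3" "$1/$2.$fname"
   else
    cp "$3" "$1"
   fi
  fi
 fi
fi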

Now if a file with the same name already exists in the destination then the new file's name will have the MD5 hash prefixed (e.g. y.txt becomes 2f8ff6fabf4b2936197b8a93702461f9.y.txt).
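
The renaming rule on its own can be illustrated as follows. The path ./sub1/y.txt and the variable names are just examples for the demo; the point is that the directory part is dropped and the hash goes in front of the bare file name.

```shell
#!/bin/sh
# Example values only: a hash and a source path as copy_files would pass them.
hash=2f8ff6fabf4b2936197b8a93702461f9
src_file=./sub1/y.txt

# "##*/" removes everything up to and including the last slash.
fname=${src_file##*/}

# On a name clash the copy would be stored under this name instead.
dest_name="$hash.$fname"
echo "$dest_name"
```

Because the hash is unique per content, two different files called y.txt can never end up with the same prefixed name.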

CIYAM
June 09, 2012, 08:41:59 AM
 #15

Assuming that these scripts do actually accomplish what you are trying to do, payment to 16grCc2rdtfRvnY2tKStaJDN3xgUHA4gjy would be much appreciated.


Cheers,

Ian.

lolwut (OP)
June 09, 2012, 08:49:37 AM
 #16

Assuming that these scripts do actually accomplish what you are trying to do payment to 16grCc2rdtfRvnY2tKStaJDN3xgUHA4gjy would be much appreciated.


Cheers,

Ian.

http://blockchain.info/address/16grCc2rdtfRvnY2tKStaJDN3xgUHA4gjy

I can barely keep my eyes open, but I sent your payment anyway. I trust that if I have issues with the script you will support me :)

i'll message you later on, thanks for your help!
CIYAM
June 09, 2012, 08:53:19 AM
 #17

I can barely keep my eyes open, but I sent your payment anyway. I trust that if I have issues with the script you will support me :)

Payment received (thanks) and sure if you have any problems with it just PM me.

:)
