manage files *en masse* from the command line



I've been having difficulty figuring out how to set nuanced parameters for managing large sets of files.

For example, let’s say that I have a project with 600 files in an active working directory, and I wish to copy these files to a backup storage device. I don’t want to just overwrite everything in the storage directory, since sometimes an archived version of a file will contain different information or edits that I wish to keep. (This happens sometimes when collaborating on a file: someone emails back a changed version using the same name as the original, resulting in different versions of a document with the same name, each of which I want to preserve.)

The usual bash tools don't quite get me where I want to be. The command cp -a -u -i sourcedirectory/* targetdirectory/ copies each file with all of its attributes, such as permissions and ownership (-a), only when the source file is newer than the destination file or the destination file is missing (-u), and asks me to confirm before overwriting anything (-i). While this gives me a lot of control to ensure that I am not accidentally overwriting different files with the same name, it requires that I manually validate every overwrite. For a folder of 600 files, this is a big chore.

What I would like to do is to move files from one folder to another, replacing the existing contents only if the timestamps (both date created and date modified) are identical. If they are not, I would like to keep both copies, and to be able to define a custom naming rule for the conflicts.

Thus, let’s say I have two files like this:

                 | date created | date modified
examplefile.txt  | 01-Jan-2012  | 01-Feb-2012
examplefile.txt  | 01-Jan-2012  | 15-Feb-2012

I would like, during the copy, to be either alerted of the conflict and prompted to make a renaming decision (and to verify that I am not entering a name that already exists), or simply to define in the script how the renaming action will occur, so that the final outcome would be this:

examplefile-conflict.txt
-or-
examplefile-feb.txt
-or-
examplefile-whatever.txt
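
The scripted variant of this idea can be sketched in a few lines of shell. This is only a rough illustration, not a finished tool: it assumes GNU stat, compares modification times only (creation time is not reliably available on many Linux filesystems), handles regular files in a single directory, and the function name copy_keep_conflicts and the default "-conflict" suffix are made up for the example.

```shell
# Sketch only: assumes GNU stat (-c %Y prints the modification time in seconds).
# copy_keep_conflicts and the "-conflict" default are hypothetical names.
copy_keep_conflicts() (
    src_dir=$1
    dst_dir=$2
    suffix=${3:--conflict}

    for src in "$src_dir"/*; do
        [ -f "$src" ] || continue              # regular files only, no recursion
        name=${src##*/}
        dst=$dst_dir/$name

        if [ ! -e "$dst" ] ||
           [ "$(stat -c %Y -- "$src")" = "$(stat -c %Y -- "$dst")" ]; then
            cp -p -- "$src" "$dst"             # new file, or same mtime: plain copy
            continue
        fi

        # Different mtime: keep both. Put the suffix before the extension.
        case $name in
            ?*.*) base=${name%.*}; ext=.${name##*.} ;;
            *)    base=$name;      ext= ;;
        esac
        candidate=$dst_dir/$base$suffix$ext
        n=1
        while [ -e "$candidate" ]; do          # never reuse an existing name
            candidate=$dst_dir/$base$suffix-$n$ext
            n=$((n + 1))
        done
        cp -p -- "$src" "$candidate"
    done
)

# Demo in a scratch directory
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
echo "new edits" > "$work/src/examplefile.txt"
echo "old edits" > "$work/dst/examplefile.txt"
touch -t 201202150000 "$work/src/examplefile.txt"
touch -t 201202010000 "$work/dst/examplefile.txt"
echo "same" > "$work/src/same.txt"
cp -p "$work/src/same.txt" "$work/dst/same.txt"

copy_keep_conflicts "$work/src" "$work/dst" -conflict
ls "$work/dst"
```

In this sketch the archived file keeps its name and the incoming file gets the suffix. Running it a second time would add another conflict copy (examplefile-conflict-1.txt), so a real script would probably also want to compare contents before copying.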


MINOR NOTES on cp:

Basic command options

[Note: It's a good idea to always read man pages, especially for coreutils].

To recursively copy without following symlinks and without altering attributes such as permissions and ownership of the source files, the command is:

cp -a - or - cp -dr --preserve=all

[Normally, when copying, the permissions and owner of the new file are determined by the user performing the copy.]
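
A quick way to see the difference is to compare modification times after a plain copy and an archive copy. This is a small sketch assuming GNU coreutils (stat -c is GNU-specific); the scratch directory and file names are invented for the demonstration.

```shell
# Assumes GNU cp and GNU stat; all paths are throwaway examples.
attr=$(mktemp -d)
echo data > "$attr/orig"
touch -t 201201010000 "$attr/orig"     # give the original an old timestamp

cp    "$attr/orig" "$attr/plain"       # plain copy: mtime becomes "now"
cp -a "$attr/orig" "$attr/archive"     # archive copy: mtime is preserved

stat -c '%n  %y' "$attr"/*             # print name and modification time
```

This matters for any timestamp comparison between a working directory and a backup: it only tells you something if the backup was made with the timestamps preserved.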

To prevent overwriting an existing file, the command is:

cp -n - or - cp --no-clobber

[This option overrides a previous -i option]

To confirm before overwriting a file, the option is:

cp -i

Working with backups

Some GNU tools, including cp, ln, mv, install and tar, support file backup and allow optional control of backup behavior. Without a backup option, the original versions are destroyed.

To make a backup of each file that would otherwise be overwritten or removed, the command is:

-b --backup[=method]

When the backup option is used but method is not specified, the value of the VERSION_CONTROL environment variable is used; if VERSION_CONTROL is not set, the default backup type is existing. Likewise, the suffix for simple backups is taken from the SIMPLE_BACKUP_SUFFIX environment variable, and defaults to ~ if that is not set.
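
As a hedged illustration of the environment variables, the following sketch sets them for a single command each. It assumes GNU cp; the scratch directory, the file names and the .orig suffix are arbitrary choices for the example.

```shell
# Assumes GNU cp; paths and the ".orig" suffix are illustrative only.
envdemo=$(mktemp -d)
echo "v1" > "$envdemo/bar"
echo "v2" > "$envdemo/foo"

# Simple backup, with the suffix taken from the environment:
# the old bar is kept as bar.orig
SIMPLE_BACKUP_SUFFIX=.orig VERSION_CONTROL=simple \
    cp --backup "$envdemo/foo" "$envdemo/bar"

# Numbered backup selected through the environment:
# the previous bar is kept as bar.~1~
echo "v3" > "$envdemo/foo"
VERSION_CONTROL=numbered cp --backup "$envdemo/foo" "$envdemo/bar"

ls "$envdemo"
```

Exporting the variables in a shell profile would make the same behavior the default for every --backup call, which may or may not be what you want.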

Note that the short form of this option, -b, does not accept any argument. Using -b is equivalent to using --backup=existing. The -S option appends a suffix of your choosing to each backup file made with -b.

Each method has two equivalent names, and any unique abbreviation is accepted:

none, off        Never make backups (even if --backup is given).

numbered, t      Always make numbered backups.

existing, nil    Make numbered backups of files that already have numbered backups, simple backups of the others.

simple, never    Always make simple backups. Please note ‘never’ is not to be confused with ‘none’.

-S suffix, --suffix=suffix    Use suffix in place of the default ~ for simple backups.

Thus, if a file 'foo' exists in both the source and target:

cp -r --backup source target    #   rename foo → foo~
cp -r --backup=t source target  #   rename foo → foo.~1~ (or foo.~2~, etc)

Further, when copying foo to bar with the backup option, if there is already a file called bar, the existing bar will be renamed. After the copy, bar will contain the contents of foo. By default, the old bar is renamed to bar~.
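
The backup options come fairly close to the "keep both copies under a custom name" behavior described at the top. The sketch below, which assumes GNU cp and uses invented scratch paths and an invented -conflict suffix, shows a simple backup with a custom suffix followed by a numbered backup.

```shell
# Assumes GNU cp; directory names and the "-conflict" suffix are examples.
demo=$(mktemp -d)
mkdir "$demo/source" "$demo/target"
echo "incoming" > "$demo/source/foo"
echo "archived" > "$demo/target/foo"

# Simple backup with a custom suffix: the existing target/foo
# is renamed to foo-conflict before the new foo is written.
cp --backup=simple --suffix=-conflict "$demo/source/foo" "$demo/target/"

# Numbered backup: this overwrite leaves the previous foo as foo.~1~
echo "incoming again" > "$demo/source/foo"
cp --backup=numbered "$demo/source/foo" "$demo/target/"

ls "$demo/target"
```

Two limitations are worth keeping in mind: the suffix is appended after the whole file name (examplefile.txt-conflict rather than examplefile-conflict.txt), and the backup is made for every overwritten file, whether or not the timestamps differ.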

