Premise: I often get files from colleagues that I need to work on. These files often have spaces in their names, which makes working with them at the command line or in scripts tedious.

Possible solutions: With the rename program (on *nix systems), I can easily rename these files, e.g.:

$ rename --sanitize --lower-case *

I recently found that rename can actually just create a link to the original file, leaving the original filename unchanged:

$ rename --sanitize --lower-case --symlink *
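For comparison, here is a rough Python sketch of the symlink approach. It only approximates the command above (lower-casing and collapsing whitespace into underscores; the exact rules of `--sanitize` may differ), and the helper names `sanitized_name` and `symlink_sanitized` are hypothetical. It also returns the link-to-original mapping, which could be saved alongside the data for later reference.

```python
import os
import re

def sanitized_name(name):
    """Lower-case `name` and collapse whitespace runs into underscores.

    This is an approximation for illustration, not rename's exact rules.
    """
    return re.sub(r"\s+", "_", name.strip()).lower()

def symlink_sanitized(directory):
    """Create sanitized symlinks next to the originals.

    Returns a dict mapping each new link name to the original filename.
    """
    mapping = {}
    for name in sorted(os.listdir(directory)):
        link = sanitized_name(name)
        link_path = os.path.join(directory, link)
        if link == name or os.path.lexists(link_path):
            continue  # already clean, or the sanitized name is taken
        os.symlink(name, link_path)  # relative link: survives moving the directory
        mapping[link] = name
    return mapping
```

The original files are left untouched, so the names my colleagues know still exist on disk.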

Question: What are the (potential) pros/cons of these two methods?

For example, it seems that creating a bunch of symlinks has the drawback of making my filesystem 'messier', whereas renaming the files makes it harder to match the files I'm using with the ones my colleagues have (whether I'm redistributing my code or just communicating "I did [analysis] on [file_x.csv]").

Additional info:

Generally I'm the only person actively working on these files, but it is important that my work be archived so that other people can refer back to it or re-analyze the data in any way of their choosing. I work in an academic setting, so, in principle, the raw data and my methods should be archived indefinitely.

John
  • Are you the only one performing analyses? Or do other people work with these files too? – kbrose Aug 14 '18 at 14:09
  • I'm generally the only person working on these files. I updated the question with more details. – John Aug 14 '18 at 16:29

1 Answer

I understand your pain!

Can you not simply use tools to read the filenames and filter them as required, thereby letting your programming language do all the escaping etc. for you?

In Python, for example, that'd mean using something like: os.walk() or os.listdir().
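As a minimal sketch of that idea: `os.listdir()` hands back filenames as plain strings, so spaces in them need no escaping. The helper name `list_data_files` and the `.csv` filter are assumptions for illustration only.

```python
import os

def list_data_files(directory, extension=".csv"):
    """Return full paths of the files in `directory` that end with `extension`.

    Names are sorted so the result does not depend on the arbitrary
    order in which os.listdir() returns entries.
    """
    paths = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        # Spaces in `name` are just characters in a string here.
        if os.path.isfile(full_path) and name.lower().endswith(extension):
            paths.append(full_path)
    return paths
```

Each returned path can be passed straight to `open()` or a CSV reader without quoting.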

n1k31t4
  • That's a great point, I like that. I believe os.walk and os.listdir return lists or other iterables. I can imagine that as I add files to the directory, the resulting iterables are going to change, so I can't hardcode a reference to a file as, e.g., os.listdir()[0] if I anticipate re-running that script in the future. Are there any workarounds for this? – John Aug 14 '18 at 17:25
  • @John - They do return iterables: os.listdir returns a list (in arbitrary order) and os.walk returns a generator. You can sort by creation time etc. (probably by any metadata associated with the files), then hard-code positions; however, I generally advise against hard-coding positions. Otherwise, you can hardcode the names of the files or use regex patterns. – n1k31t4 Aug 14 '18 at 17:53
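A minimal sketch of the regex option from the last comment, assuming each file of interest can be identified by a pattern in its name. The helper `find_one` and the example pattern are hypothetical. Selecting by pattern rather than by list position means that adding unrelated files to the directory does not change which file is picked, and an ambiguous pattern fails loudly.

```python
import re
from pathlib import Path

def find_one(directory, pattern):
    """Return the single file in `directory` whose name matches the regex `pattern`.

    Raises ValueError if zero or several files match, so a re-run never
    silently picks a different file.
    """
    regex = re.compile(pattern)
    matches = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and regex.search(p.name)
    )
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one match for {pattern!r}, found {len(matches)}"
        )
    return matches[0]
```

For example, `find_one("data", r"^survey results .*\.csv$")` would pick out a file with spaces in its name, wherever it falls in the directory listing.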