Usul: non-POSIX tabular Unix utilities

Today I pushed usul to GitHub. Usul is a suite of non-POSIX replacements for the standard Unix command line utilities, designed to fix a number of problems POSIX enforces in the name of backwards compatibility. As an example, let’s take ls, which happens to be one of the very few utilities I’ve written so far. POSIX demands that the output of ls -Rl look something like this:

total 16
drwxr-xr-x  3 cls  cls   102  4 Jan  2012 colors
-rw-r--r--  1 cls  cls   261 28 Jun 23:04 gvimrc
-rw-r--r--  1 cls  cls  1104 28 Jun 23:10 vimrc

./colors:
total 16
-rw-r--r--  1 cls  cls  7765  4 Jan  2012 molokai.vim

This is absolutely terrible if you want to do something with this information, like pipe it somewhere. We have these pointless “total 16” headers which we have to grep -v out of the way to begin with, then we have two different formats for timestamps depending on whether they’re older than six months or not, and… don’t even get me started on the “./colors:” format for recursing directories. Seriously. Now, Usul has output that is much more useful for piping around:

drwxr-xr-x  cls  cls  2012-01-04 14:36  colors/
-rw-r--r--  cls  cls  2012-01-04 14:36  colors/molokai.vim
-rw-r--r--  cls  cls  2012-06-28 23:04  gvimrc
-rw-r--r--  cls  cls  2012-06-28 23:10  vimrc

Each of those double spaces is actually a tab. Tabs are the field separators and line feeds are the record separators, à la awk.
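To give a feel for how little work a consumer has to do, here is a rough Python sketch that reads a listing in the format above. The function name parse_listing and the dictionary keys are my own inventions for illustration, and the column order (mode, owner, group, timestamp, path) is simply read off the sample output; none of this is part of Usul itself.

```python
from datetime import datetime

# The sample listing from above, with real tabs between the fields.
SAMPLE = (
    "drwxr-xr-x\tcls\tcls\t2012-01-04 14:36\tcolors/\n"
    "-rw-r--r--\tcls\tcls\t2012-01-04 14:36\tcolors/molokai.vim\n"
    "-rw-r--r--\tcls\tcls\t2012-06-28 23:04\tgvimrc\n"
    "-rw-r--r--\tcls\tcls\t2012-06-28 23:10\tvimrc\n"
)


def parse_listing(text, fs="\t", rs="\n"):
    """Split text into records on rs and each record into fields on fs.

    Assumes five fields per record, in the order seen in the sample.
    """
    rows = []
    for record in text.split(rs):
        if not record:
            continue  # skip the empty piece after the final separator
        mode, owner, group, stamp, path = record.split(fs)
        rows.append({
            "mode": mode,
            "owner": owner,
            "group": group,
            # One timestamp format only, so a single strptime pattern suffices.
            "modified": datetime.strptime(stamp, "%Y-%m-%d %H:%M"),
            "path": path,
        })
    return rows


if __name__ == "__main__":
    for row in parse_listing(SAMPLE):
        print(row["modified"].isoformat(), row["path"])
```

There is no header to skip, no second date format to detect, and no directory banner to track, because every record already carries its full relative path.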

Well, it turns out that we have a totally separate utility called find, with an argument syntax totally unlike any other, largely for the reasons shown above. So we can use find to recursively list those directories, but if the paths contain a line feed we have to use find -print0 (which isn’t even in POSIX) to separate files with NUL bytes instead of line feeds, and then we’ll need xargs -0 to read it again. Few other tools support that, so we’ll probably end up creating some monstrous find query, and the Unix philosophy is dead.
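The line feed problem is easy to reproduce. The Python sketch below uses made-up file names (one of them contains a line feed) and a hypothetical helper, split_records, to compare splitting the same list of paths on line feeds and on NUL bytes. It only illustrates the ambiguity; it does not call find or xargs.

```python
def split_records(stream, rs):
    """Split a stream on the record separator, ignoring the trailing one."""
    records = stream.split(rs)
    if records and records[-1] == "":
        records.pop()
    return records


# Hypothetical file names; the middle one contains a line feed.
PATHS = ["vimrc", "odd\nname", "colors/molokai.vim"]

# The same names written out with each kind of record terminator.
line_stream = "".join(path + "\n" for path in PATHS)
nul_stream = "".join(path + "\0" for path in PATHS)

by_line = split_records(line_stream, "\n")
by_nul = split_records(nul_stream, "\0")

if __name__ == "__main__":
    print(len(by_line), "records when splitting on line feeds")
    print(len(by_nul), "records when splitting on NUL bytes")
```

Splitting on line feeds breaks the awkward name into two bogus records, while splitting on NUL bytes gives back exactly the original three paths, since a path can never contain a NUL byte.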

All Usul utilities read the environment variables FS and RS, the field and record separators, named after the awk variables that do the same job. find . -print0 is just the same as ls -R with FS= RS=, but every utility (e.g., grep), not just xargs, would understand that the records are separated by NUL bytes. They are all ‘tabular’, so you don’t have to waste time coercing the output of one program into the input of another.
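Here is a minimal Python sketch of how a utility might honour those variables. It rests on two assumptions of mine that the text above does not confirm: that the defaults are a tab for FS and a line feed for RS (as in the ls output earlier), and that an empty value means a NUL byte, since an environment variable cannot hold one. The function names separators and grep_records are hypothetical and are not taken from the Usul source.

```python
import os


def separators(env=None):
    """Return (fs, rs) as taken from the environment.

    Assumptions: unset means tab / line feed; empty means NUL.
    """
    env = os.environ if env is None else env
    fs = env.get("FS", "\t")
    rs = env.get("RS", "\n")
    return (fs or "\0", rs or "\0")


def grep_records(stream, needle, env=None):
    """Return the records containing needle, split on the configured RS."""
    _fs, rs = separators(env)
    records = stream.split(rs)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after the final separator
    return [record for record in records if needle in record]


if __name__ == "__main__":
    # NUL-separated names, one of which contains a line feed.
    names = "vimrc\0odd\nname\0gvimrc\0"
    print(grep_records(names, "vimrc", {"FS": "", "RS": ""}))
```

The point of the design is that the separator choice lives in one place, the environment, so every tool in a pipeline agrees on it without needing its own -0 or -z style flag.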

Another thing Usul does well is just having generally sane behaviour. ls -Rd will do what you expect, ls -R will recurse into hidden directories only when given -a, and so on. It’s also very pedantic about avoiding race conditions. Anyway, I’ll be quietly getting on with these. It’s mostly for my own benefit, but someone else might find them useful, too.