Tuesday, October 6, 2009

Coding styles that make me twitch, part 4

All too often, I see code like this:

$md5 = `md5sum $filename.new | awk '{ print $1 }'`;
$md5 =~ s/\n$//;
if ($md5 ne $origmd5) {
system("mv $filename $filename.old");
system("mv $filename.new $filename");

}
Well, you get the idea.

Sure, Perl can pretend to be a glorified shell script, but there are perfectly well-functioning internal functions for these things.

Let's ignore the non-portability of the md5sum command for the time being, and also avoid shaving that awk call down with cut and an enclosing echo -n $(…) - we're here for the Perl, not the shell, right?

The above system() calls can easily be replaced with the following to save yourself a few unnecessary forks:

rename($filename,"$filename.old") and
rename("$filename.new",$filename);
Note the slight refinement of using and to avoid clobbering the file. But the documentation for rename() suggests that we use move() from File::Copy instead:

use File::Copy qw(mv);
mv($filename,"$filename.old") and
mv("$filename.new",$filename);
Similarly, there are nice built-in functions for many other frequent victims of system(), e.g.:

chmod
chgrp + chown
link
mkdir
rmdir
symlink
unlink
And of course there is a bunch of more or less helpful modules on CPAN (in addition to the already mentioned File::Copy), e.g. File::Path.

2 comments:

Unknown said...

I have a couple of toolkitty like things I keep for doing system maintenance stuff.

I actually like the use of operating system mv functions like that sometimes.

If the reason to inline it is to reduce the need to fork, the reason you KEEP using them is you want absolutely certainly that the operation will retain the full system semantics, which the Perl equivalent reimplementation might accidentally break.

Of course, sometimes you DO want to have the Perl versions and have more control.

Just depends on the situation.

(The md5sum on the other hand, should probably be Perl)

bakkushan said...

Yes, this is clearly a YMMV situation.

In the cases where I've found a need for specific OS semantics, it's usually been better to implement the job in a shell script; I prefer something close to POSIX shell.