I run a lot of scripts (these days using slurm) and, as is often the case in bioinformatics, they fail more often than I would wish. To be confident that the rest of the pipeline won’t continue whenever one of my commands returns a non-zero exit code, which suggests a problem, I use the following two lines at the beginning of my scripts:
set -e (causes the shell to exit if any subcommand or pipeline returns a non-zero status; read more here about when it is a good idea to use it and when it is not)
set -x (each command gets printed right before it is executed, so the logfiles become very verbose and arguably easier to debug)
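As a minimal sketch of how these two options behave together, the snippet below runs a tiny child script in which `false` stands in for a failing pipeline step. The child script and its step names are made up for illustration; only `set -e` and `set -x` come from the text above.

```shell
#!/bin/bash
# Run the demo in a child bash so that set -e does not terminate this shell.
status=0
output=$(bash -c '
set -e   # exit as soon as a command returns a non-zero status
set -x   # print each command to stderr right before it runs
echo "step 1"
false            # stands in for a failing pipeline step
echo "step 2"    # never reached because of set -e
' 2>&1) || status=$?

echo "$output"
echo "child exit status: $status"
```

The captured output should contain the trace lines and "step 1" but never "step 2", and the child exits with the status of the failing command.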
However, this time an interesting thing happened. I was running Tandem Repeats Finder and my script kept crashing, although the output looked completely normal and all sequences in my input fasta file had been processed. I subsampled my input into a very tiny file and it happened again, with return code 63, which doesn’t have any traditional meaning. I cleaned my fasta headers, removed all whitespace and prepared a very neat fasta file, and the script crashed again, leaving hundreds of dot characters in my logfile. Interestingly, when I ran the whole thing from the command line, Tandem Repeats Finder finished by saying “Done” and there were no nasty dots in the output.
This time I submitted a script with only 5 sequences, and the returned exit code was 5.
I did the same thing with 2 sequences, and the returned exit code was 2.
I ran the command again from the command line (it finished with a graceful “Done.” and no dots) and typed echo $?
$? stores the exit status of the last executed command, and it was surprisingly non-zero. So although it “seemed” that everything was fine from the command line, it actually wasn’t.
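The check itself is easy to reproduce. In this sketch, `count_records` is a hypothetical stand-in (not the real tool) that prints “Done.” and then exits with whatever number it is given, mimicking a program that looks fine on screen but reports a count through its exit status.

```shell
# Hypothetical stand-in: prints a friendly message, then returns its argument
# as the exit status, the way a tool might report a count instead of 0.
count_records() {
    echo "Done."
    return "$1"
}

status=0
count_records 5 || status=$?   # $? holds the exit status of count_records
echo "exit status: $status"
```

Capturing the status with `|| status=$?` rather than on a separate line also keeps the snippet working when `set -e` is active.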
Or was it?
It turns out everything was fine; the software just decided, for some reason, to return the number of sequences as its exit code instead of 0, as one would expect.
Going back to the manual, I found out that with the -ngs parameter, Tandem Repeats Finder finally returns 0 when everything is fine. So this was one of those rare moments when I was being unjustly mistrustful:-)
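If a flag like this is not available, one possible workaround under set -e is to capture the exit status explicitly and compare it with the value you expect. The sketch below is only an illustration: `fake_tool` is a hypothetical stand-in that always exits with 2, and `expected` is assumed to be the number of sequences in the input.

```shell
set -e

# Hypothetical stand-in for a tool that exits with the number of sequences processed.
fake_tool() {
    return 2
}

expected=2   # assumption: the number of sequences in the input file
status=0
fake_tool || status=$?   # the "||" keeps set -e from aborting the script here

if [ "$status" -ne "$expected" ]; then
    echo "unexpected exit status: $status" >&2
    exit 1
fi
echo "tool finished with the expected status $status"
```

Any other exit status still stops the pipeline, so the safety net of set -e is kept for genuine failures.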
The usual convention is to use exit codes to report important errors, not runtime features. There is a set of reserved exit codes with special meanings, as described here. If you decide to use exit codes differently, it is good practice to mention how you use them in the manual.