Parsing options in bash
2017-11-12
So, you're writing a shell script and you've come to realize that it needs to be able to perform several different functions. You could separate each of those into a separate script... but they're closely tied together; they complement each other. It only makes sense to make them available via the same command, just activated using different options.
Doing it manually
Obviously, the simplest option is employing a while loop, a case statement (or, alternatively, ifs) and doing it by yourself.
#!/bin/bash

OPTION_A=0
OPTION_B=""
OPTION_DOUBLE=0

while [ $# -gt 0 ]; do
    case "$1" in
        -a)
            OPTION_A=1
            ;;
        -b)
            # -b takes a value, which is the next parameter
            shift
            if [ $# -gt 0 ]; then
                OPTION_B=$1
            else
                echo "Option -b requires an argument" >&2
                exit 1
            fi
            ;;
        --double)
            OPTION_DOUBLE=1
            ;;
        -*)
            if [ "$1" == "--" ]; then
                # end of options marker
                shift
                break
            else
                echo "Unknown option '$1'" >&2
                exit 1
            fi
            ;;
        *)
            # first non-option: stop parsing
            break
            ;;
    esac
    shift
done
While relatively simple, this approach has a couple of drawbacks:
- If you have a lot of options, the amount of code balloons quite quickly.
- Checking for option arguments requires even more code, and there's a risk you'll make a mistake somewhere.
- Did you remember to support -- as an "end of options" marker?
Employing getopts (bash built-in)
Parsing options is a relatively common problem, and as usually happens with common problems, there's a library for that.
If you're using a POSIX-conformant shell, you can use the shell builtin getopts for option parsing.
#!/bin/bash

while getopts "ab:d" OPTNAME; do
    case "$OPTNAME" in
        a)
            OPTION_A=1
            ;;
        b)
            OPTION_B=$OPTARG
            ;;
        d)
            OPTION_DOUBLE=1
            ;;
        \?)
            # getopts has already printed an error message
            exit 1
            ;;
    esac
done
getopts will go through the script arguments, parsing options and their arguments (values). The exit status is zero while there are options left to parse, and non-zero once the options run out or a non-option is encountered, which makes it rather natural to stick the whole thing into a while loop.
Okay, but how do we actually specify the supported options? This is done via the first argument to getopts, which is usually called an option string. It is basically a list of possible single-letter options, each of them optionally followed by a colon (:), which serves as a "this option expects an argument" marker.
As getopts processes the script options, it puts the index of the next argument to process in the OPTIND variable, and the option argument (if the option takes one) in the OPTARG variable. The variable which stores the actual option name can be controlled by the user and is the second argument to getopts; in the example above, I use OPTNAME, as it's a descriptive name that also fits nicely with the other two variables.
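One practical use of OPTIND is getting at whatever is left after the options. The sketch below is only an illustration: the parse_args function and the REMAINING_COUNT and FIRST_REMAINING variables are names made up for this example, and the option string is a cut-down version of the one used above.

```bash
#!/bin/bash

# Hypothetical helper: parses -a and -b VALUE, then records what is left over.
parse_args() {
    OPTIND=1          # reset getopts state so the function can be called again
    OPTION_A=0
    OPTION_B=""
    while getopts "ab:" OPTNAME; do
        case "$OPTNAME" in
            a) OPTION_A=1 ;;
            b) OPTION_B=$OPTARG ;;
            *) return 1 ;;
        esac
    done
    # OPTIND now points at the first argument that getopts did not consume,
    # so dropping OPTIND - 1 parameters leaves only the non-options.
    shift $((OPTIND - 1))
    REMAINING_COUNT=$#
    FIRST_REMAINING=${1-}
}

parse_args -a -b value file1 "file two"
echo "a=$OPTION_A b=$OPTION_B remaining=$REMAINING_COUNT first=$FIRST_REMAINING"
```

In a real script you would more likely run the loop at the top level and simply keep using "$@" after the shift; the function wrapper is here only to keep the example self-contained.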
Error handling with getopts
What about error handling, you may ask? getopts does that for you, too! If an unknown option is encountered, getopts will print an error message, and the selected variable (OPTNAME in the example above) will be set to ? (a question mark character).
If you want more control over error handling, you may prepend your option string with a colon (so in the example above, it would become ":ab:d").
#!/bin/bash

while getopts ":ab:d" OPTNAME; do
    case "$OPTNAME" in
        a)
            OPTION_A=1
            ;;
        b)
            OPTION_B=$OPTARG
            ;;
        d)
            OPTION_DOUBLE=1
            ;;
        :)
            # an option that needs an argument did not get one
            echo "Option '-$OPTARG' requires an argument, ya dingus" >&2
            exit 1
            ;;
        \?)
            # an option that is not in the option string
            echo "I don't know what '-$OPTARG' is!" >&2
            exit 1
            ;;
    esac
done
When you do this, the getopts behaviour changes in a few ways.
- First, it won't automatically print any error messages.
- Second, upon encountering an unknown option, OPTNAME will be set to ? (as in the standard scenario), and the unknown option will be put into the OPTARG variable.
- Third, when an option requiring an argument is missing said argument, OPTNAME will be set to : (a colon) and said option will be put into the OPTARG variable.
Employing getopt (standalone binary)
One downside of getopts that was hinted at by the examples above is, unfortunately, the lack of support for long options. If we need to support these, we can use the separate getopt program.
getopt --name 'mytestscript' --options 'ab:' --longoptions 'double' -- "$@"
Hmm... This doesn't really look like option parsing, now does it? So what does getopt do, really? Basically, it performs three functions for us:
- Error handling: like the shell builtin, it will print error messages when an error is encountered: an unknown option, or an option missing a parameter. You can also control this behaviour with the prepend-with-colon trick; should you do that, errors will be silently swallowed.
- Shuffling the arguments: while POSIX mandates that options may not follow non-options, many implementations of getopt(3) (the libc function) allow for mingling the two (so you can do something like chmod u+x -R directory/). The most common example of this is glibc (the GNU C Library), commonly found on Linux. getopt(1) inherits this behaviour. If you don't want this, you can prepend the option string with + (a plus sign). Alternatively, you can set the environment variable POSIXLY_CORRECT, although this has the downside of altering the behaviour of many other programs.
- Marking the end of options: the output of getopt will always contain a -- to tell us where the options end.
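To see the shuffling and the + prefix side by side, here is a small sketch. It assumes the enhanced getopt from util-linux (the one usually found on Linux) and does nothing on other systems; the SHUFFLED and STRICT variable names are made up for this example. If POSIXLY_CORRECT happens to be set in your environment, both calls will behave like the second one.

```bash
#!/bin/bash

# "getopt --test" exits with status 4 on the enhanced util-linux version.
getopt --test >/dev/null 2>&1
if [ $? -eq 4 ]; then
    # Default: -a is still picked up as an option although it follows "file".
    SHUFFLED=$(getopt --options 'ab:' -- file -a)
    # Leading '+': parsing stops at the first non-option, so -a is left alone.
    STRICT=$(getopt --options '+ab:' -- file -a)
    echo "default: $SHUFFLED"
    echo "with +:  $STRICT"
fi
```

In the first line of output -a should end up before the -- marker, in the second one after it.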
Okay, but that still doesn't answer the question: what does getopt actually DO? It's a separate program, so it can't set any variables inside our shell.
As hinted above, getopt outputs a reformatted version of the argument list on stdout. For example:
user $ ./mytestscript nonoption1 -b argument --double 'non option with spaces'
 -b 'argument' --double -- 'nonoption1' 'non option with spaces'
Where do we go from here? Well, we need to somehow put the output from getopt into our positional parameters ($1 and so on). To do this, we can use the set builtin.
There's one problem, though: getopt, by default, quotes the encountered non-options and arguments, and if we pass the output as-is to set, said quotes will make it to our parameters. We can work around this by using eval, which will cause the shell to properly process the quotes first.
OPTIONS=`getopt --name 'mytestscript' --options 'ab:' --longoptions 'double' -- "$@"`
eval set -- "$OPTIONS"
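If you'd like to see what eval buys us here, the following sketch compares the two ways of feeding the string to set. The OPTIONS value is hard-coded to look like the getopt output shown earlier, so the example does not depend on which getopt is installed; the PLAIN_* and EVAL_* variables are names made up for this illustration.

```bash
#!/bin/bash

# A string shaped like the getopt output shown earlier.
OPTIONS=" -b 'argument' --double -- 'nonoption1' 'non option with spaces'"

# Without eval: the string is only word-split, so the quote characters stay
# in the parameters and the last non-option falls apart.
set -- $OPTIONS
PLAIN_COUNT=$#
PLAIN_SECOND=$2

# With eval: the shell re-parses the string, so the quotes group the words
# and are then removed.
eval set -- "$OPTIONS"
EVAL_COUNT=$#
EVAL_SECOND=$2
EVAL_LAST=$6

echo "without eval: $PLAIN_COUNT parameters, second is [$PLAIN_SECOND]"
echo "with eval:    $EVAL_COUNT parameters, second is [$EVAL_SECOND], last is [$EVAL_LAST]"
```

Without eval we end up with 9 parameters, and the second one still carries its quotes; with eval there are 6, and 'non option with spaces' survives as a single parameter.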
Now that our positional parameters are all set, we can go back and copy most of the code from the first approach.
OPTIONS=`getopt --name 'mytestscript' --options 'ab:' --longoptions 'double' -- "$@"`
# getopt has already printed an error message if it failed
[ "$?" -ne 0 ] && exit 1
eval set -- "$OPTIONS"

while [ $# -gt 0 ]; do
    case "$1" in
        -a)
            OPTION_A=1
            ;;
        -b)
            # getopt guarantees that the argument is there
            shift
            OPTION_B=$1
            ;;
        --double)
            OPTION_DOUBLE=1
            ;;
        --)
            shift
            break
            ;;
    esac
    shift
done
While the symbol soup around the getopt call may look a bit terrifying, the script itself is quite nice and readable.
Why not to use getopt(1)
Unfortunately, as often happens, nice things have their drawbacks, and getopt(1) is no different.
The main problem with said program is possible differences in behaviour between different platforms. For example, on some Unices, getopt(1) doesn't support long options... which was pretty much the only reason we considered using it over the shell builtin!
The other issue is the possibly non-POSIX-conformant behaviour. This one heavily depends on our use case; if we're writing a script for personal use, and our system exhibits the glibc behaviour, the ability to intertwine options and non-options may be comfortable. On the other hand, if we want to redistribute the script, it may cause portability issues.
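If you decide to use getopt(1) anyway, you can at least detect which flavour you've got. The enhanced util-linux version documents a --test option that prints nothing and exits with status 4. The sketch below relies on that; the have_enhanced_getopt function is a name made up for this example, and what to do in the fallback branch is up to you.

```bash
#!/bin/bash

# Hypothetical helper: succeeds only if getopt(1) is the enhanced version.
have_enhanced_getopt() {
    getopt --test >/dev/null 2>&1
    # 4 means enhanced; older versions exit with 0, a missing binary with 127
    [ $? -eq 4 ]
}

if have_enhanced_getopt; then
    echo "enhanced getopt found, long options should work"
else
    echo "no enhanced getopt here, better stick to getopts or manual parsing"
fi
```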
Manual parsing vs. getopts
That being said, we're left with the first two approaches. Which way to go?
Personally, I think that the answer is "it depends": if you're only using short options, using getopts would be the better way, since not only are you guaranteed that the option parsing behaviour follows a standard, but the users of the script are also guaranteed that their parameters will be parsed in a certain way.
Should you need to support long options, or optional arguments, you're pretty much bound to write the code yourself. And that doesn't automatically make it a bad thing! Just be sure to test your code thoroughly, so that your users won't have to spend their time wrestling with your option parsing code instead of actually enjoying the script's features.
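As a starting point for the hand-written route, here is a minimal sketch of a loop that accepts a long option with a value, both as --output VALUE and as --output=VALUE. Everything in it is made up for illustration (the parse_long function, the --verbose and --output options, the REST_COUNT variable), so adapt it to your own script rather than treating it as a reference implementation.

```bash
#!/bin/bash

# Hypothetical helper: parses --verbose and --output VALUE / --output=VALUE.
parse_long() {
    VERBOSE=0
    OUTPUT=""
    while [ $# -gt 0 ]; do
        case "$1" in
            --verbose)
                VERBOSE=1
                ;;
            --output=*)
                OUTPUT=${1#--output=}    # strip the "--output=" prefix
                ;;
            --output)
                if [ $# -lt 2 ]; then
                    echo "Option --output requires an argument" >&2
                    return 1
                fi
                OUTPUT=$2
                shift
                ;;
            --)
                shift
                break
                ;;
            -*)
                echo "Unknown option '$1'" >&2
                return 1
                ;;
            *)
                break
                ;;
        esac
        shift
    done
    REST_COUNT=$#    # number of non-options left after parsing
}

parse_long --verbose --output=result.txt -- input.txt
echo "verbose=$VERBOSE output=$OUTPUT rest=$REST_COUNT"
```

Note that the order of the case patterns matters: the specific ones (--output=*, --) have to come before the catch-all -* pattern.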