Got Parameters? Just Use Docopt
Written by J. David Smith
Published on 07 September 2017
It's one of those days where I am totally unmotivated to accomplish anything (despite the fact that I technically already have – the first draft of my qual survey is done!). So, here's a brief aside that's been in the back of my mind for a few months now.
It is extremely common for the simulations in my line of work (or ours; hi,
fellow student!) to have a large set of parameters. The way that this is
handled varies from person to person, and at this point I feel as though I've
seen everything: I've seen simple getopt usage, I've seen home-grown
command-line parsers, I've seen compile-time #defines used to switch models!
Worse, proper documentation on what the parameters mean and what valid inputs
are is as inconsistent as the implementations themselves. Enough. There is a
better way.
Docopt is a library, available in basically any language you care about (this includes C, C++, Python, Rust, R, and even Shell; language is not an excuse for skipping this), that parses a documentation string for your command line interface and automatically builds a parser from it. Take, for example, this CLI that I used for a re-implementation of my work on Socialbots (see that work for context on what the parameters mean, aside from ζ, which has never actually been used):
    Simulation for <conference>.

    Usage:
      recon <graph> <inst> <k> (--etc | --hmnm | --zeta <zeta>) [options]
      recon (-h | --help)

    Options:
      -h --help             Show this screen.
      --etc                 Expected triadic closure acceptance.
      --etc-zeta <zeta>     Expected triadic closure acceptance with ζ.
      --zeta <zeta>         HM + ζ acceptance.
      --hmnm                Non-Monotone HM acceptance.
      --degree-incentive    Enable degree incentive in acceptance function.
      --wi                  Use the WI delta function.
      --fof-scale <scale>   Set B_fof(u) = <scale> B_f(u). [default: 0.5]
      --log <log>           Log to write output to.
This isn't a simple set of parameters, but it is far from the most complex I've
worked with. Just in this example, we have positional arguments
(<graph> <inst> <k>) followed by mutually-exclusive settings
(--etc | --hmnm | ...) followed by optional parameters ([options]). Here is how
you'd parse this with the Rust version of Docopt:
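    use docopt::Docopt;
    use serde::Deserialize;

    const USAGE: &str = ""; // the docstring above

    // Field names follow Docopt's convention: arg_* for positional
    // arguments and flag_* for options.
    #[derive(Deserialize)]
    struct Args {
        // parameter types, e.g.
        arg_graph: String,
        arg_k: usize,
        flag_wi: bool,
        // ...
    }

    fn main() {
        // Build the parser from USAGE, parse the command line, and
        // deserialize into Args; on any error, print it and exit.
        let args: Args = Docopt::new(USAGE)
            .and_then(|d| d.deserialize())
            .unwrap_or_else(|e| e.exit());
    }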
This brief incantation:
- Parses the documentation string, making sure it can be interpreted.
- Correctly handles using recon -h and recon --help to print the docstring.
- Automatically deserializes every given parameter.
- Exits with a descriptive (if sometimes esoteric, in this implementation) error message if a parameter is missing or of the wrong type.
The same thing, but in C++, is:
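    #include <map>
    #include <string>

    #include "docopt.h"

    static const char USAGE[] = R"()"; // the docstring above goes inside R"( )"

    int main(int argc, char* argv[]) {
        // Parse everything after the program name.
        std::map<std::string, docopt::value> args
            = docopt::docopt(USAGE,
                             {argv + 1, argv + argc},
                             true,
                             "Version 0.1");
    }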
Although in this version type validation must be done manually (e.g. if you
expect a number but the user provides an arbitrary string, you must check that
the given value can actually be converted to a number), this is still
dramatically simpler than any parsing code I've seen in the wild. Even better:
your docstring is always up to date with the parameters that you actually take.
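To make the manual check concrete, here is a minimal sketch of validating a value like <k> once you have its raw text in hand. The helper name parse_count is my own invention for illustration and is not part of docopt; it uses only the C++17 standard library, and I'm assuming you have already pulled the argument out of the parsed map as a std::string.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper (not part of docopt): convert the raw text of an
// argument such as <k> into a non-negative integer. Returns std::nullopt
// if the text is not, in its entirety, a valid number.
std::optional<std::size_t> parse_count(const std::string& raw) {
    std::size_t value = 0;
    const char* first = raw.data();
    const char* last = raw.data() + raw.size();
    // from_chars stops at the first character that cannot be part of the number.
    auto [ptr, ec] = std::from_chars(first, last, value);
    // Reject empty input, out-of-range values, and trailing junk like "10abc".
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

With something like this, a bad <k> can be reported right after parsing (print the usage string and exit) instead of surfacing as a confusing failure halfway through a simulation run.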
Of course, certain amounts of bitrot are always possible. For example, you
could add a parameter but never implement handling for it. However, you can't
accidentally add or rename a flag and then never add it to the docstring, which
is far more common in my experience. So, for your sanity and mine, please just
use Docopt (or another CLI-parsing library) to read your parameters. These
libraries are easy to statically link into your code (to avoid .dll/.so not
found issues), and so your code remains easy to move from machine to machine in
compiled form. Please. You won't regret it.