Better CLI option parsing in Scala

April 04, 2012

Command-line interfaces have been with us almost as long as computers themselves - the first CLIs were created in the early 1960s, only two decades after the first general-purpose computer was presented (in 1946). Ever since, the command line has been a trusted and loyal friend of the programmer.

Fast forward to modern times. Now, half a century later, graphical user interfaces seem to be triumphing over all other interfaces. GUI evangelists love to present pretty charts that demonstrate that CLIs are dying. But is the CLI really dead, or even planning to die?

No way, of course. The whole UNIX philosophy is built around small, orthogonal utilities, each doing its small piece of work perfectly (think wc, cat, find, bash pipes). And, obviously, this effect is almost impossible to achieve using pretty graphical systems with lots of buttons, checkboxes, sliders, knobs and whatnot.

The "almost" part of that statement is what keeps the momentum in GUI development - Gnome, Ubuntu, KDE, Apple all search for the elusive solution that could unite the world of interfaces once and for all.

The biggest problem with their effort is the heterogeneity of the user base.

At one end of the scale, we have people who are seeing a computer for the first time in their lives - so the interface must be immediately obvious to them. This is the main focus of Apple's development, for example - they build stunning, intuitive, easy-to-use interfaces, which are a pleasure to use. A focus on users who "see the computer for the first time in their life" may be good for marketing, but not for real computer use - just try to do any real-world system scripting task via a GUI. OK, you can even forget scripting - just copy a file from one directory to another, and you are already losing precious time if you are doing it with a GUI.

Which brings us to the other end of the scale - hardcore programmers. These people use nothing but the command line, and they are extremely productive with it (I can't back up this claim with any hard data, only my observations - can you do better?). Again, this approach has problems - it's hard to learn and requires some specialized knowledge. So the command line will probably remain the weapon of choice for "elite" users, while the mainstream stays in GUI land.

But since most programmers sit closer to the "hardcore" end of the distribution, products for them surely should expose some command-line interface. And exposing that interface must be as convenient as possible for the developer - ideally, the perfect CLI framework should be baked into the language or easily available.

So, here begins my story :)

Some time ago, I needed to add a CLI to one of my simple programs. I immediately turned to the often-mentioned Apache Commons CLI library. Since the Apache community usually produces something awesome, I expected that the library would quickly relieve me of my problems, and that I would be able to move on to other tasks.

I was hugely disappointed.

First, the default way of defining options is crippled - to get anything meaningful, you need to use OptionBuilder. And that option builder looks as if it was written as an example of how not to architect a builder - and then the developer suddenly committed it to the master branch (by mistake, of course). I have many questions about the whole concept of mutable builders, but creating a mutable static builder?! And the fact that constructing a simple option (say, an option that takes a single string argument) takes at least 5 lines of code is a bit overwhelming.
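
To make the complaint about mutable static builders concrete, here is a minimal sketch in plain Scala. Every name in it (StaticBuilder, OptBuilder, BuilderDemo) is hypothetical and invented for illustration; it is not the actual Commons CLI code, only an assumption about the general shape of the problem: with a shared, mutable singleton, state set for one option can silently leak into the next one.

```scala
// Hypothetical names throughout; this is an illustration, not Commons CLI code.

// A mutable, globally shared builder: settings survive from one use to the next.
object StaticBuilder {
  private var hasArg = false
  def withArg(): StaticBuilder.type = { hasArg = true; this }
  // Nothing resets the state here, so the next caller inherits it.
  def create(name: String): String = s"$name(hasArg=$hasArg)"
}

// An immutable builder: every call returns a new value, nothing is shared.
final case class OptBuilder(hasArg: Boolean = false) {
  def withArg: OptBuilder = copy(hasArg = true)
  def create(name: String): String = s"$name(hasArg=$hasArg)"
}

object BuilderDemo {
  // The second option was meant to be a plain flag, but inherits hasArg = true.
  def leaky(): (String, String) = {
    val file = StaticBuilder.withArg().create("file")
    val verbose = StaticBuilder.create("verbose")
    (file, verbose)
  }

  // The same definitions with the immutable builder stay independent.
  def safe(): (String, String) = {
    val base = OptBuilder()
    (base.withArg.create("file"), base.create("verbose"))
  }
}
```

In this sketch, BuilderDemo.leaky() reports hasArg=true for both options, while BuilderDemo.safe() reports hasArg=false for the flag. An immutable builder can also be stored and reused as a starting point without any defensive resetting.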

My next stop was scopt, a popular utility for parsing options in Scala. It looks good, but it is unable to parse options that take a list of arguments (e.g. -a 1 2 3). And there is no way to extend it to get those lists (except by forking the library).
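
For illustration, the multi-argument case mentioned above can be sketched in a few lines of plain Scala. This is only a sketch under assumptions: MultiArgSketch, isOption and group are hypothetical names, it does not show how any real library does it, and the rule "a token starting with '-' that is not a number is an option" is a simplification.

```scala
object MultiArgSketch {
  // Assumption: a token is an option name if it starts with '-' and is not a negative number.
  private def isOption(tok: String): Boolean =
    tok.startsWith("-") && tok.length > 1 && !tok.drop(1).forall(c => c.isDigit || c == '.')

  /** Pairs each option with every value that follows it, up to the next option. */
  def group(args: List[String]): List[(String, List[String])] = args match {
    case Nil => Nil
    case head :: tail if isOption(head) =>
      val (values, rest) = tail.span(t => !isOption(t))
      (head, values) :: group(rest)
    case _ :: tail => group(tail) // a stray value before any option is skipped in this sketch
  }
}
```

With this sketch, MultiArgSketch.group(List("-a", "1", "2", "3", "-v")) pairs "-a" with the three values and "-v" with an empty list; converting the collected strings to typed values would be a separate step.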

So I set off on a journey to create "The Better Option Parser". I was inspired by the option definition syntax in Ruby's Trollop library and by the way options are extracted in configrity.

The journey was quite a success - I recently released the library, called Scallop.

It features:

POSIX-style option parsing - capable of parsing short and long options
Property arguments - most famous from Ant (-Dkey=value key2=value2)
Extraction of flags, single-argument and multiple-argument options
Default and required options
Careful and powerful parsing of trailing arguments
Completely immutable option builder - you can reuse it, delegate option definitions to submodules, etc.

On top of that, Scallop is easily extendable with new argument types.

Enough talking! Let me show you some code:

import org.rogach.scallop._
 
val opts = Scallop(List("-d","--num-limbs","1"))
  .version("test 1.2.3 (c) 2012 Mr S") // --version option is provided for you;
                                       // if it is passed, the "verify" stage prints this message and exits
  .banner("""Usage: test [OPTION]... [pet-name]
            |test is an awesome program, which does something funny      
            |Options:
            |""".stripMargin) // --help is also provided;
                              // it also exits after printing version, banner, and options usage
  .opt[Boolean]("donkey", descr = "use donkey mode") // simple flag option
  .opt("monkeys", default = Some(2), short = 'm') // you can specify a default value;
                                                  // the type will be inferred
  .opt[Int]("num-limbs", 'k', 
    "number of libms", required = true) // you can override the default short-option character
  .opt[List[Double]]("params") // default converters are provided for all primitives
                               // and for lists of primitives
  .props('D',"some key-value pairs")
  .args(List("-Dalpha=1","-D","betta=2","gamma=3", "Pigeon")) // you can add parameters a bit later
  .trailArg[String]("pet name") // you can specify what you want to get from the end of
                                // the args list
  .verify
 
opts.get[Boolean]("donkey") should equal (Some(true))
opts[Int]("monkeys") should equal (2)
opts[Int]("num-limbs") should equal (1)
opts.prop('D',"alpha") should equal (Some("1"))
opts.prop('E',"gamma") should equal (None)
opts[String]("pet name") should equal ("Pigeon")
intercept[WrongTypeRequest] {
  opts[Double]("monkeys") // this will throw an exception at runtime
                          // because the wrong type is requested
}
 
println(opts.help) // returns options description
println(opts.summary) // returns summary of parser status (with current arg values)

If you run this option setup with the "--help" option, you will see:

test 1.2.3 (c) 2012 Mr S
Usage: test [OPTION]... [pet-name]
test is an awesome program, which does something funny      
Options:
 
-Dkey=value [key=value]...
    some key-value pairs
-d, --donkey  
    use donkey mode
-m, --monkeys  
-k, --num-limbs  
    number of libms
-p, --params  ...

Scallop has extensive support for parsing trailing arguments, which can be used for simple things:

val opts = Scallop(List("first","second"))
  .trailArg[String]("required file")
  .trailArg[String]("optional file", required = false)
  .verify
opts[String]("required file") should equal ("first")
opts.get[String]("optional file") should equal (Some("second"))

...and for complex things. For example, Scallop's parser is clever enough to handle the following case correctly:

val opts = Scallop(List("-Ekey1=value1", "key2=value2", "key3=value3",
                        "first", "1","2","3","second","4","5","6"))
  .props('E')
  .trailArg[String]("first list name")
  .trailArg[List[Int]]("first list values")
  .trailArg[String]("second list name")
  .trailArg[List[Double]]("second list values")
  .verify
opts.propMap('E') should equal ((1 to 3).map(i => ("key"+i,"value"+i)).toMap)
opts[String]("first list name") should equal ("first")
opts[String]("second list name") should equal ("second")
opts[List[Int]]("first list values") should equal (List(1,2,3))
opts[List[Double]]("second list values") should equal (List[Double](4,5,6))
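
One way such matching could work is greedy assignment with backtracking. The sketch below is an assumption made for illustration, not Scallop's actual algorithm, and all names in it (TrailingArgsSketch, Slot, assign, demo) are hypothetical. Each slot takes either exactly one token or a run of tokens, and a list slot gives tokens back when the slots after it cannot otherwise be filled.

```scala
object TrailingArgsSketch {
  // A slot takes exactly one token, or (if many = true) a run of one or more tokens.
  final case class Slot(name: String, many: Boolean, accepts: String => Boolean)

  /** Assigns tokens to slots in order; returns None if no assignment fits. */
  def assign(slots: List[Slot], toks: List[String]): Option[Map[String, List[String]]] =
    slots match {
      case Nil => if (toks.isEmpty) Some(Map.empty) else None
      case slot :: rest =>
        // Longest run of leading tokens this slot could take.
        val run = toks.takeWhile(slot.accepts).length
        val maxTake = if (slot.many) run else math.min(run, 1)
        // Try the longest run first, then back off one token at a time.
        (maxTake to 1 by -1).iterator
          .map(n => assign(rest, toks.drop(n)).map(_ + (slot.name -> toks.take(n))))
          .collectFirst { case Some(found) => found }
    }

  private val any: String => Boolean = _ => true
  private val isInt: String => Boolean = s => s.nonEmpty && s.forall(_.isDigit)

  // Greedy matching alone would let "files" swallow "out"; backtracking gives it back.
  def demo: Option[Map[String, List[String]]] =
    assign(
      List(
        Slot("count", many = true, isInt),
        Slot("files", many = true, any),
        Slot("target", many = false, any)),
      List("1", "2", "a", "b", "out"))
}
```

In this sketch, TrailingArgsSketch.demo assigns "1" and "2" to count, "a" and "b" to files, and "out" to target. Trying the longest run first keeps list arguments as long as possible, and backing off is what lets a later required argument still receive its value.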

And last but not least, you can easily extend it to support your own argument types:

case class Person(name:String, phone:String)
val personConverter = new ValueConverter[Person] {
  val nameRgx = """([A-Za-z]*)""".r
  val phoneRgx = """([0-9\-]*)""".r
  // parse is a method that takes a list of the arguments to every invocation of the option:
  // for example, "-a 1 2 -a 3 4 5" would produce List(List(1,2),List(3,4,5)).
  // parse returns Left if there was an error while parsing;
  // if the option was not found, it returns Right(None),
  // and if the option was found, it returns Right(Some(...))
  def parse(s:List[List[String]]):Either[Unit,Option[Person]] = 
    s match {
      case ((nameRgx(name) :: phoneRgx(phone) :: Nil) :: Nil) => 
        Right(Some(Person(name,phone))) // successfully found our person
      case Nil => Right(None) // no person found
      case _ => Left(()) // error when parsing
    }
    }
  val manifest = implicitly[Manifest[Person]] // some magic to make typing work
  val argType = org.rogach.scallop.ArgType.LIST
}
val opts = Scallop(List("--person", "Pete", "123-45"))
  .opt[Person]("person")(personConverter)
  .verify
opts[Person]("person") should equal (Person("Pete", "123-45"))

The code is hosted on GitHub - suggestions, bug reports, and pull requests are all welcome!

Comments

scalable, April 5, 2012 at 4:35 PM
I'm using this in production at Klout already, it even parses some big data arguments from Hadoop as a Scoobi jar! --Alexy
brian, June 8, 2012 at 7:38 AM
if you wanted to leverage something better than ApacheCLI on the jvm - args4j or jcommander
Rogach, June 8, 2012 at 9:37 AM
brian:
I was fully aware of those projects at that moment. But I felt that they lacked Scala-specific features and proper type safety. Thus I decided to create the "better" option parser, and I feel that now (long after this post :) Scallop eclipses both args4j and jcommander in terms of features and code conciseness. You can read more in the Scallop documentation - https://github.com/Rogach/scallop/wiki.
Unknown, August 9, 2012 at 7:52 AM
Hey Rogach,

Scallop is awesome. For the most part it allows a very clean specification of the command-line interface and allows me to declare configuration variables in exactly one place. Another favorite feature of mine is the summary method, which I just discovered this morning. I have three feature requests / suggestions if you are up for that:

1) We would really like to avoid having to dereference Options with parens whenever we read data from the Config object (i.e. we want to be able to write config.numIterations rather than config.numIterations()). The parens significantly reduce readability, as they suggest to someone reading the code that some method is being called and that the method may have side effects (you may disagree here). Anyway, to avoid the parens, we've resorted to something like the following:

val _numIterations = opt[Int]("num-iterations", default = Some(Int.MaxValue)); lazy val numIterations = _numIterations()

Which has the effect that we want, but looks uglier than we'd like. It would be great if those semantics could be matched (however, access to the _numIterations variable isn't necessary) without the ugly syntax. Any ideas?

2) (config.summary almost entirely achieves this) I have appreciated CL toolkits that print default values in the automatic help

3) (this last one is not essential) often the variable name is redundant with the command line parameter name; it would be cool if there were a way to have a default for the commandline parameter name be a function of the variable name (maybe inferred somehow via reflection?)

Anyway, thanks again for great work on an awesome commandline parser for scala!
Anonymous, September 13, 2012 at 9:13 PM
Hi rogach, scallop looks nice, congrats!

How do you compare it to argot (http://software.clapper.org/argot/)?
Nick, December 30, 2021 at 12:20 AM
Thanks for sharing the article.


Rogach on Scala
