Rusting My (Academic) Code

Written by J. David Smith
Published on 23 February 2017

A couple of weeks ago I mentioned several methods that I'd been (trying) to use to help improve the verifiability of my research. This is the first in an (eventual) series on the subject. Today's topic? Rust.

One of the reasons it took me so long to write this post is the difficulties inherent in describing why I chose Rust over another, more traditional language like Java or C/C++. Each previous time, I motivated the selection by first covering the problems of those languages. I have since come to realize that that is not a productive approach – each iteration invariably devolved into opining about the pros and cons of various trade-offs made in language design. In this post, I am instead going to describe the factors that led to me choosing Rust, and what practical gains it has given me over the past two projects I've worked on.

In the Beginning...

First, some background on who I am, what I do, and what constraints I have on my work. I am a graduate student at the University of Florida studying Optimization/Security on Online Social Networks under Dr. My T. Thai. Most problems I work on are framed as (stochastic) graph optimization problems, which are almost universally NP-hard. We typically employ approximation algorithms to address this, but even then the resulting algorithm can still be rather slow to experiment with due to a combination of difficult problems, large datasets and the number of repetitions needed to establish actual performance.

This leads to two often-conflicting constraints: the implementations musts be performant to allow us to both meet publication deadlines and compete with previous implementations, and the implementations must also be something we can validate as correct. Well, ideally. Often, validating code is only done by the authors and code is not released. Most code in my field is in either C or C++, with the occasional outlier in Java when performance is less of a concern, in order to satisfy that first constraint. However, after repeatedly having to work in other's rushed C/C++ codebases, I got fed up. Enough of this! I thought, There must be something better!

I set about re-implementing the state-of-the-art approximation algorithm (SSA) for the Influence Maximization problem. Hung Nguyen, Thang Dinh, My Thai.
“Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing in Billion-Scale Networks.”
In the Proceedings of SIGMOD 2016. This method has an absurdly-optimized C++ implementation, using all sorts of tricks to eek out every possible bit of performance. Solving this problem – even approximately – on a network with 40+ million nodes is no small feat, and the implementation shows the effort they put into getting there. This implementation formed my baseline both for performance and correctness: every language I rewrote it in should produce (roughly) identical solutions, The algorithm uses non-deterministic sampling, so there is room for some error after all the big contributors have been found. and every feasible language would have to get in the same ballpark in terms of memory usage and performance.

In my spare time over the space of a couple of months, I implemented SSA in Chez Scheme, Clojure, Haskell, OCaml and, eventually, Rust. My bent towards functional programming shows clearly in this choice of languages. I'm a lisp weenie at heart, but unfortunately no Scheme implementation I tried nor Clojure could remotely compete with the C++ implementation. Haskell and OCaml shared the problem that graph processing in purely-functional languages is just an enormous pain in the ass, in addition to not hitting my performance goals Some of this is due to my unfamiliarity with optimizing in these languages. However, I didn't want to get into that particular black magic just to use a language that was already painful for use with graphs. – though they got much closer. Then I tried Rust. Holy. Shit.

The difference was immediately apparent. Right off the bat, I got performance in the ballpark of the baseline (once I remembered to compile in release mode). It was perhaps 50% slower than the C++ implementation. After a few rounds of optimization, I got that down to about 20-25%. Unfortunately, I no longer have the timing info. As that isn't the focus of this post, I also haven't re-run these experiments. Had I been willing to break more safety guarantees, I could have applied several of the optimizations from the baseline to make further improvement. Some of this performance is due to the ability to selectively use mutability to speed up critical loops, and some is due to the fact that certain graph operations are simply easier to express in an imperative fashion, leading me to write less wasteful code. What's more: it also has a similar type system – with its strong guarantees – to the system of ML-family languages that I adore, combined with a modern build/packaging tool and a thriving ecosystem. It seemed that I'd found a winner. Now, for the real test.

The First Paper: Batched Stochastic Optimization

I was responsible for writing the implementation for the next paper I worked on. The code – and paper – aren't out yet. I'll hear back in early March as to whether it was accepted. We were extending a previous work Xiang Li, J David Smith, Thang Dinh, My T. Thai. “Privacy Issues in Light of Reconnaissance Attacks with Incomplete Information.”
In the Proceedings of WI 2016. to support batched updates. As the problem was again NP-hard with dependencies on the ordering of choices made, this was no small task. Our batching method ended up being exponential in complexity, but with a strict upper bound on the size of the search space so that it remained feasible. This was my first testbed for Rust as a language for my work.

Immediately, it paid dividends. I was able to take advantage of a number of wonderful libraries that allowed me to jump straight from starting work to implementing our method. Performance was excellent, parallelism was easy, and I was able to easily log in a goddamn parseable format. It was wonderful. Even the time I spent fighting the borrow checker A phase I will note was very temporary; I rarely run into it anymore. failed to outrun the benefits gained by being able to build on others' work. I had a real testing framework, which caught numerous bugs in my batching code. Even parallelism, which generally isn't much of a problem for such small codebases, was no harder than applying OpenMP. Indeed, since I needed read-write locks it was in fact easier as OpenMP lacks that feature (you have to use pthread instead). It was glorious, and I loved it.

Still do, to be honest.

The Second Work: Generic Analysis

The next paper I worked on was theory driven. I showed a new method to estimate a lower bound on approximation quality for a previously unstudied Not exactly new, but general bounds for this class of problem had not really been considered. As a result, the greedy approximation algorithm hadn't seen much use on it. class of problems. The general idea is that we're maximizing an objective function f, subject to some constraints where we know that all maximal solutions have the same size. I needed to be able to transparently plug in different values of f, operating on different kinds of data, with different constraints. Further, for efficiency's sake I also needed a way to represent the set of elements that would need updating after each step.

Rust gave me the tools to do this. I defined a trait Objective to abstract over the different possible values of f. Each implementor could handle building their own internal representation, and with associated types I could easily allow each kind to operate on their own kind of data.

It worked. Really well, actually.

I wrote a completely generic greedy algorithm in terms of this trait, along with some pretty fancy analysis. Everything just...worked, and with static dispatch I paid very little at runtime for this level of indirection. At, least, as soon as it compiled.

Not All Fun & Games

As great as my time with Rust has been, there are still a few flaws I feel compelled to point out.

Borrow Checking is Sometimes Extremely Painful

While in most cases the borrow checker became something I dealt with subconsciously, in a few it was still an extraordinary pain in the ass. The first, and what seems to be the most common, is the case of having callbacks on structs. Some of this comes down to confusion over the correct syntax (e.g. Fn(...) -> X or fn(...) -> X), but I mentally recoil at the thought of trying to make a struct with callbacks on it. This was not a pleasant experience.

The second arose in writing the constructor for an InfMax objective using the SSA sampling method. There are multiple different diffusion models for this problem, each represented as a struct implementing Iterator. I wanted my InfMax struct to own the Graph object it operated on, and pass a reference of this to the sampling iterator. The borrow checker refused.

To this day, I still don't know how to get that to work. I ultimately caved and had InfMax store a reference with lifetime 'a, giving the sampler a reference with a lifetime 'b such that 'a: 'b (that is, 'a is at least as long as 'b). While this workaround took only a moderate amount of time to find, I still lost quite a bit trying to get it to work as I wanted. Giving InfMax a reference never caused any issues, but I still find it annoying.

Trait Coherence

In order to use the bit-set library, I wanted to force the Element associated type of each Objective to satisfy Into<usize>. The NodeIndex type from petgraph does not, but has a method fn index(self) -> usize. Due to the rules about when you can implement traits (namely, that you can't implement traits when both the trait and the type are non-local), this was impossible without forking or getting a PR into the petgraph repository.

Forking was my ultimate solution, and hopefully a temporary one. There isn't a clear way to work around this Implementing a helper trait was frustrating because you can't implement both From<T: Into<usize>> for Helper and From<NodeIndex> for Helper, presumably because at a future point NodeIndex could implement From/Into<usize>? , and yet there also isn't a clear way to make this work without allowing crates to break each other.

Fortran FFI & Absent Documentation

I almost feel bad complaining about this, because it is so, so niche. But I'm going to. I had to interface with a bit (read: nearly 4000 lines) of Fortran code. In theory, since the Fortran ABI is sufficiently similar to C's, I ought to be able to just apply the same techniques as I use for C FFI. As it happens, this is mostly correct. Mostly.

Fortran FFI is poorly documented for C, so as one might imagine instructions for interfacing from Rust were nearly absent. It turns out that, in Fortran, everything is a pointer. Even elementary types like int. Of course, I also had the added complexity that matrices are stored in column-major order (as opposed to the row-major order used everywhere else). This led to a lengthy period of confusion as to whether I was simply encoding my matrices wrong, wasn't passing the pointers correctly, or was failing in some other way (like passing f32s instead of f64s).

It turns out that you simply need to flatten all the matrices, pass everything by pointer, and ensure that no call is made to functions with unspecified-dimensional matrices. Calling functions with matrices of the form REAL(N, M) works provided N and M are either variables in scope or parameters of the function.

Numeric-Cast-Hell

The inability to multiply numbers of different types together without as is immensely frustrating. I know why this is – I've been bitten by a / b doing integer division before – but formulae littered with x as f64 / y as f64 are simply difficult to read. Once you've figured out the types everything needs to be, this can be largely remedied by ensuring everything coming into the function has correct type and pre-casting everything else. This helps dramatically, though in the end I often found myself just making everything f32 to save myself the trouble (as that was what most values already were). The inverted notation for things like pow and log (written x.pow(y) and x.log(y) in Rust) further hinders readability.

Verifiability & The Big Wins for Academic Use

I began this post by referencing the verifiability of simulations (and other code). Through all this, I have focused more on the usability wins rather than the verifiability. Why is that? Fundamentally, it is because code is only as verifiable as it is written to be. The choice of language only impacts this in how well it constrains authors to “clearly correct” territory. In comparison to C/C++, Rust has clear advantages. There is no risk of accidental integer division, very low risk of parallelism bugs due to the excellent standard primitives, and nearly-free documentation. Some commenting required. Accidentally indexing outside of bounds (a problem I've run into before) is immediately detected, and error handling is rather sane.

Further, the structure of Rust encourages modularization and allows formerly-internal modules to be pulled out into their own crates, which can be independently verified for correctness and re-used with ease – even without uploading them to crates.io. A concrete example of this is my Reverse Influence Sampling Iterators, which were originally (and still are, unfortunately) a local module from my most recent paper and have subsequently found use in the (re-)implementation of several recent influence maximization algorithms.

This modularity is, I think, the biggest win. While reviewing and verifying a several-thousand-line codebase associated with a paper is unlikely to ever be practical, a research community building a set of common libraries could not only reduce the need to continually re-invent the wheel, but also limit the scope of any individual implementation to only the novel portion. This may reduce codebase sizes to the point that review could become practical.

Case in Point: the original implementation of the TipTop algorithm was nearly 1500 lines of C++. My more recent Rust implementation is 253. Admittedly, this is without the weighted sampling allowed in the C++ implementation. However, that [isn't hard to implement][weighted].

(Please forgive my lack of comments, that was tight paper deadline.) This gain is due to the fact that I didn't have to include a graph representation or parsing, sampling, logging (the logs are machine-readable!), or command-line parsing.

In Conclusion...

For a while, I was worried that my exploration of alternate languages would be fruitless, that I'd wasted my time and would be stuck using C++ for the remainder of grad school. Words cannot describe how happy I am that this is not the case. Despite its flaws, using Rust has been an immeasurable improvement over C++ both in terms of productivity and sanity. Further, while verifiability is very much secondary in my choice of language, I believe the tools and safety that Rust provides are clearly advantageous in this area.