Rusting My (Academic) Code
Written by J. David Smith
Published on 23 February 2017
A couple of weeks ago I mentioned several methods that I'd been trying to use to improve the verifiability of my research. This is the first in an (eventual) series on the subject. Today's topic? Rust.
One of the reasons it took me so long to write this post is the difficulty inherent in describing why I chose Rust over another, more traditional language like Java or C/C++. In each previous attempt, I motivated the selection by first covering the problems of those languages. I have since come to realize that this is not a productive approach – each iteration invariably devolved into opining about the pros and cons of various trade-offs made in language design. In this post, I am instead going to describe the factors that led to me choosing Rust, and the practical gains it has given me over the past two projects I've worked on.
In the Beginning...
First, some background on who I am, what I do, and what constraints I have on my work. I am a graduate student at the University of Florida studying Optimization/Security on Online Social Networks under Dr. My T. Thai. Most problems I work on are framed as (stochastic) graph optimization problems, which are almost universally NP-hard. We typically employ approximation algorithms to address this, but even then the resulting algorithms can still be rather slow to experiment with due to a combination of difficult problems, large datasets, and the number of repetitions needed to establish actual performance.
This leads to two often-conflicting constraints: the implementations must be performant, to allow us to both meet publication deadlines and compete with previous implementations, and the implementations must also be something we can validate as correct. Well, ideally. Often, validation is done only by the authors, and code is not released. To satisfy that first constraint, most code in my field is in either C or C++, with the occasional outlier in Java when performance is less of a concern. However, after repeatedly having to work in others' rushed C/C++ codebases, I got fed up. Enough of this! I thought. There must be something better!
I set about re-implementing the state-of-the-art approximation algorithm (SSA) for the Influence Maximization problem (Hung Nguyen, Thang Dinh, My Thai. “Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing in Billion-Scale Networks.” In the Proceedings of SIGMOD 2016). This method has an absurdly-optimized C++ implementation, using all sorts of tricks to eke out every possible bit of performance. Solving this problem – even approximately – on a network with 40+ million nodes is no small feat, and the implementation shows the effort they put into getting there. This implementation formed my baseline both for performance and correctness: every language I rewrote it in should produce (roughly) identical solutions (the algorithm uses non-deterministic sampling, so there is room for some error after all the big contributors have been found), and every feasible language would have to get in the same ballpark in terms of memory usage and performance.
In my spare time over the space of a couple of months, I implemented SSA in Chez Scheme, Clojure, Haskell, OCaml, and, eventually, Rust. My bent towards functional programming shows clearly in this choice of languages. I'm a lisp weenie at heart, but unfortunately neither Clojure nor any Scheme implementation I tried could remotely compete with the C++ implementation. Haskell and OCaml shared the problem that graph processing in purely-functional languages is just an enormous pain in the ass, in addition to not hitting my performance goals (some of this is due to my unfamiliarity with optimizing in these languages; however, I didn't want to get into that particular black magic just to use a language that was already painful for use with graphs) – though they got much closer. Then I tried Rust. Holy. Shit.
The difference was immediately apparent. Right off the bat, I got performance in the ballpark of the baseline (once I remembered to compile in release mode). It was perhaps 50% slower than the C++ implementation. After a few rounds of optimization, I got that down to about 20-25%. (Unfortunately, I no longer have the timing info. As that isn't the focus of this post, I also haven't re-run these experiments.) Had I been willing to break more safety guarantees, I could have applied several of the optimizations from the baseline to make further improvements. Some of this performance is due to the ability to selectively use mutability to speed up critical loops, and some is due to the fact that certain graph operations are simply easier to express in an imperative fashion, leading me to write less wasteful code. What's more, Rust has a type system – with its strong guarantees – similar to that of the ML-family languages I adore, combined with a modern build/packaging tool and a thriving ecosystem. It seemed that I'd found a winner. Now, for the real test.
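To make the "selective mutability" point concrete, here's a minimal sketch – not code from the actual port – of the kind of pattern I mean: one reusable buffer inside the hot sampling loop instead of a fresh allocation per sample.

```rust
// A minimal sketch (not the actual SSA code): reuse one mutable buffer
// across iterations of a hot loop instead of allocating per sample.
fn total_sampled(samples: usize) -> usize {
    let mut visited: Vec<u32> = Vec::with_capacity(1024);
    let mut total = 0;
    for s in 0..samples {
        visited.clear(); // keeps the allocation, drops the contents
        // Stand-in for one round of sampling; a real sample would push
        // every node reached during the random process.
        visited.push(s as u32);
        total += visited.len();
    }
    total
}

fn main() {
    println!("{}", total_sampled(10));
}
```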
The First Paper: Batched Stochastic Optimization
I was responsible for writing the implementation for the next paper I worked on. (The code – and paper – aren't out yet. I'll hear back in early March as to whether it was accepted.) We were extending a previous work (Xiang Li, J. David Smith, Thang Dinh, My T. Thai. “Privacy Issues in Light of Reconnaissance Attacks with Incomplete Information.” In the Proceedings of WI 2016) to support batched updates. As the problem was again NP-hard with dependencies on the ordering of choices made, this was no small task. Our batching method ended up being exponential in complexity, but with a strict upper bound on the size of the search space so that it remained feasible. This was my first testbed for Rust as a language for my work.
Immediately, it paid dividends. I was able to take advantage of a number of wonderful libraries that allowed me to jump straight from starting work to implementing our method. Performance was excellent, parallelism was easy, and I was able to easily log in a goddamn parseable format. It was wonderful. Even the time I spent fighting the borrow checker (a phase I will note was very temporary; I rarely run into it anymore) failed to outweigh the benefits gained by being able to build on others' work. I had a real testing framework, which caught numerous bugs in my batching code. Even parallelism, which generally isn't much of a problem for such small codebases, was no harder than applying OpenMP. Indeed, since I needed read-write locks it was in fact easier, as OpenMP lacks that feature (you have to use pthreads instead). It was glorious, and I loved it. Still do, to be honest.
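For flavor, here's a minimal sketch of what read-write locking looks like in safe Rust, using only std::sync::RwLock and std::thread. The structure is illustrative, not the paper's code:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared state, read often and written rarely.
    let scores = Arc::new(RwLock::new(vec![0.0f64; 1024]));

    let handles: Vec<_> = (0..4)
        .map(|t| {
            let scores = Arc::clone(&scores);
            thread::spawn(move || {
                // Many cheap reads can proceed concurrently...
                let sum: f64 = scores.read().unwrap().iter().sum();
                // ...and the occasional write takes the exclusive lock.
                scores.write().unwrap()[t] = sum;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
}
```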
The Second Work: Generic Analysis
The next paper I worked on was theory-driven. I showed a new method to estimate a lower bound on approximation quality for a previously unstudied class of problems. (Not exactly new, but general bounds for this class of problem had not really been considered. As a result, the greedy approximation algorithm hadn't seen much use on it.) The general idea is that we're maximizing an objective function f, subject to some constraints, where we know that all maximal solutions have the same size. I needed to be able to transparently plug in different values of f, operating on different kinds of data, with different constraints. Further, for efficiency's sake I also needed a way to represent the set of elements that would need updating after each step.
Rust gave me the tools to do this. I defined a trait Objective to abstract over the different possible values of f. Each implementor could handle building their own internal representation, and with associated types I could easily allow each kind to operate on their own kind of data.
It worked. Really well, actually.
I wrote a completely generic greedy algorithm in terms of this trait, along with some pretty fancy analysis. Everything just...worked, and with static dispatch I paid very little at runtime for this level of indirection. At least, as soon as it compiled.
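The actual code isn't public, but the shape was roughly the following. Only the names Objective and Element come from the real thing; the method names and the budget parameter k are stand-ins.

```rust
// A sketch of the shape, not the code from the paper.
trait Objective {
    type Element: Copy;

    /// Candidate elements still available for selection.
    fn candidates(&self) -> Vec<Self::Element>;
    /// Marginal gain of adding `e` to the current solution.
    fn gain(&self, e: Self::Element) -> f64;
    /// Commit `e`, updating internal state (and anything invalidated).
    fn insert(&mut self, e: Self::Element);
}

/// A generic greedy loop written once, monomorphized per objective,
/// so the indirection costs almost nothing at runtime.
fn greedy<O: Objective>(obj: &mut O, k: usize) -> Vec<O::Element> {
    let mut solution = Vec::with_capacity(k);
    for _ in 0..k {
        let best = obj
            .candidates()
            .into_iter()
            .max_by(|a, b| obj.gain(*a).partial_cmp(&obj.gain(*b)).unwrap());
        match best {
            Some(e) => {
                obj.insert(e);
                solution.push(e);
            }
            None => break,
        }
    }
    solution
}
```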
Not All Fun & Games
As great as my time with Rust has been, there are still a few flaws I feel compelled to point out.
Borrow Checking is Sometimes Extremely Painful
While in most cases the borrow checker became something I dealt with subconsciously, in a few it was still an extraordinary pain in the ass. The first, and what seems to be the most common, is the case of having callbacks on structs. Some of this comes down to confusion over the correct syntax (e.g. Fn(...) -> X or fn(...) -> X), but I mentally recoil at the thought of trying to make a struct with callbacks on it. This was not a pleasant experience.
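For reference, the distinction that kept tripping me up, sketched with hypothetical names: fn(...) -> X is a bare function pointer, while Fn(...) -> X is a trait, so storing a stateful callback means boxing a trait object.

```rust
// Two ways to hang a callback off a struct (names illustrative).
struct Solver {
    // fn(...) -> X is a function pointer: no captured state allowed.
    on_step: fn(usize) -> f64,
    // Fn(...) -> X is a trait; a boxed trait object can capture state.
    on_done: Box<dyn Fn(f64) -> f64>,
}

fn main() {
    let scale = 2.0;
    let s = Solver {
        on_step: |i| i as f64,                 // coerces to a fn pointer
        on_done: Box::new(move |x| x * scale), // closure capturing `scale`
    };
    let y = (s.on_done)((s.on_step)(21));
    println!("{}", y); // 42
}
```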
The second arose in writing the constructor for an InfMax objective using the SSA sampling method. There are multiple different diffusion models for this problem, each represented as a struct implementing Iterator. I wanted my InfMax struct to own the Graph object it operated on, and pass a reference of this to the sampling iterator. The borrow checker refused.
To this day, I still don't know how to get that to work. I ultimately caved and had InfMax store a reference with lifetime 'a, giving the sampler a reference with a lifetime 'b such that 'a: 'b (that is, 'a is at least as long as 'b). While this workaround took only a moderate amount of time to find, I still lost quite a bit trying to get it to work as I wanted. Giving InfMax a reference never caused any issues, but I still find it annoying.
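In sketch form – with stand-in Graph and Sampler types, not the real ones – the workaround looks like this:

```rust
// The workaround: InfMax borrows the graph instead of owning it, and
// hands the sampler a (possibly shorter-lived) reference.
struct Graph;

struct InfMax<'a> {
    graph: &'a Graph,
}

struct Sampler<'b> {
    graph: &'b Graph,
}

impl<'a> InfMax<'a> {
    // 'a: 'b says the graph borrow outlives the sampler's borrow.
    fn sampler<'b>(&'b self) -> Sampler<'b>
    where
        'a: 'b,
    {
        Sampler { graph: self.graph }
    }
}

fn main() {
    let g = Graph;
    let inf = InfMax { graph: &g };
    let _s = inf.sampler();
}
```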
Trait Coherence
In order to use the bit-set library, I wanted to force the Element associated type of each Objective to satisfy Into<usize>. The NodeIndex type from petgraph does not, but has a method fn index(self) -> usize. Due to the rules about when you can implement traits (namely, that you can't implement a trait when both the trait and the type are non-local), this was impossible without forking or getting a PR into the petgraph repository.
Forking was my ultimate solution, and hopefully a temporary one. There isn't a clear way to work around this (implementing a helper trait was frustrating because you can't implement both From<T: Into<usize>> for Helper and From<NodeIndex> for Helper, presumably because NodeIndex could implement From/Into<usize> at some future point), and yet there also isn't a clear way to make this work without allowing crates to break each other.
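The coherence-safe pattern that does work is sketched below, with a hypothetical Key newtype: implement conversions on a local wrapper. The frustration described above appears as soon as you also want the blanket impl alongside it.

```rust
use petgraph::graph::NodeIndex;

// A coherence-safe sketch: `Key` is a hypothetical local wrapper. The
// conflict described above appears if you *also* write a blanket
// impl<T: Into<usize>> From<T> for Key, which the compiler rejects
// because NodeIndex might implement Into<usize> in the future.
struct Key(usize);

impl From<NodeIndex> for Key {
    fn from(n: NodeIndex) -> Key {
        Key(n.index()) // fn index(self) -> usize, as noted above
    }
}

fn main() {
    let k: Key = NodeIndex::new(3).into();
    println!("{}", k.0);
}
```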
Fortran FFI & Absent Documentation
I almost feel bad complaining about this, because it is so, so niche. But I'm going to. I had to interface with a bit (read: nearly 4000 lines) of Fortran code. In theory, since the Fortran ABI is sufficiently similar to C's, I ought to be able to just apply the same techniques as I use for C FFI. As it happens, this is mostly correct. Mostly.
Fortran FFI is poorly documented for C, so as one might imagine, instructions for interfacing from Rust were nearly absent. It turns out that, in Fortran, everything is a pointer. Even elementary types like int. Of course, I also had the added complexity that matrices are stored in column-major order (as opposed to the row-major order used everywhere else). This led to a lengthy period of confusion as to whether I was simply encoding my matrices wrong, wasn't passing the pointers correctly, or was failing in some other way (like passing f32s instead of f64s).
It turns out that you simply need to flatten all the matrices, pass everything by pointer, and ensure that no call is made to functions with unspecified-dimensional matrices. (Calling functions with matrices of the form REAL(N, M) works provided N and M are either variables in scope or parameters of the function.)
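Put together, the working recipe looks roughly like this. The subroutine name is hypothetical, and linking the compiled Fortran object is assumed:

```rust
// Everything by pointer, matrices flattened column-major. `scale_` is
// a hypothetical Fortran subroutine; gfortran appends a trailing
// underscore to external symbol names.
extern "C" {
    fn scale_(n: *const i32, m: *const i32, a: *mut f64);
}

fn main() {
    let (n, m) = (3i32, 2i32);
    // Flattened, column-major: element (i, j) lives at a[j * n + i].
    let mut a = vec![0.0f64; (n * m) as usize];
    unsafe { scale_(&n, &m, a.as_mut_ptr()) };
}
```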
Numeric-Cast-Hell
The inability to multiply numbers of different types together without as is immensely frustrating. I know why this is – I've been bitten by a / b doing integer division before – but formulae littered with x as f64 / y as f64 are simply difficult to read. Once you've figured out the types everything needs to be, this can be largely remedied by ensuring everything coming into the function has the correct type and pre-casting everything else. This helps dramatically, though in the end I often found myself just making everything f32 to save myself the trouble (as that was what most values already were).
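A trivial example of what I mean (illustrative, not from my codebase):

```rust
// Without the `as f64` casts, this would be rejected outright; with
// two integer arguments and no casts, `x / y` would silently perform
// integer division.
fn ratio(x: usize, y: usize) -> f64 {
    x as f64 / y as f64
}

fn main() {
    assert_eq!(ratio(1, 2), 0.5);
    println!("{}", ratio(1, 2));
}
```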
The inverted notation for things like pow and log (written x.pow(y) and x.log(y) in Rust) further hinders readability.
Verifiability & The Big Wins for Academic Use
I began this post by referencing the verifiability of simulations (and other code). Through all of this, I have focused more on the usability wins than on verifiability. Why is that? Fundamentally, it is because code is only as verifiable as it is written to be. The choice of language only impacts this in how well it constrains authors to “clearly correct” territory. In comparison to C/C++, Rust has clear advantages. There is no risk of accidental integer division, very low risk of parallelism bugs due to the excellent standard primitives, and nearly-free documentation (some commenting required). Accidentally indexing out of bounds (a problem I've run into before) is immediately detected, and error handling is rather sane.
Further, the structure of Rust encourages modularization and allows formerly-internal modules to be pulled out into their own crates, which can be independently verified for correctness and re-used with ease – even without uploading them to crates.io. A concrete example of this is my Reverse Influence Sampling Iterators, which were originally (and still are, unfortunately) a local module from my most recent paper and have subsequently found use in the (re-)implementation of several recent influence maximization algorithms.
This modularity is, I think, the biggest win. While reviewing and verifying a several-thousand-line codebase associated with a paper is unlikely to ever be practical, a research community building a set of common libraries could not only reduce the need to continually re-invent the wheel, but also limit the scope of any individual implementation to only the novel portion. This may reduce codebase sizes to the point that review could become practical.
Case in point: the original implementation of the TipTop algorithm was nearly 1500 lines of C++. My more recent Rust implementation is 253. (Admittedly, this is without the weighted sampling allowed in the C++ implementation. However, that isn't hard to implement. Please also forgive my lack of comments; that was a tight paper deadline.) This gain is due to the fact that I didn't have to include a graph representation or parsing, sampling, logging (the logs are machine-readable!), or command-line parsing.
In Conclusion...
For a while, I was worried that my exploration of alternate languages would be fruitless, that I'd wasted my time and would be stuck using C++ for the remainder of grad school. Words cannot describe how happy I am that this is not the case. Despite its flaws, using Rust has been an immeasurable improvement over C++ both in terms of productivity and sanity. Further, while verifiability is very much secondary in my choice of language, I believe the tools and safety that Rust provides are clearly advantageous in this area.