Written by J. David Smith Published on 25 July 2020
A bit over one week ago, on July 16th, we launched the inaugural Gladiator
league using my LeagueBot. Despite some initial difficulties with JMP quirks
(multiple cards share the same set code and collector number; for example,
Goblin Oriflamme and Rhox Faithmender both get exported as 1 ..... (JMP) 130),
the launch was fairly smooth. In this short time we've had 251 league
completions (only decks with the full 5 matches played are counted here),
which has been very gratifying for me. You see: while the LeagueBot may appear
to have sprung into existence fully formed, this couldn't be further from the
truth.
I want to take this opportunity to walk through the history of the bot. Its
history is closely tied to my own involvement in singleton formats on Arena, and
so at the same time I'm going to cover some Gladiator pre-history.
In the Beginning...
The first Singleton event I can find record of on MTG Arena took place from
September 27th–October 1st, 2018 (cards in format: 1,006). The event offered a
play queue for 60-card no-banlist singleton games. Wizards would repeat this
event later that year from December 3rd–7th and again two months later from
February 14th–18th, 2019. This is where my involvement begins. I created a
Discord server for Singleton shortly after the Valentine's Day event ended and
posted about it on the MagicArena subreddit.
At this point in time, I stuck with the 60-card format due to the small number
of cards in the format (cards in format: 1,526). Given that Petitioners and
Rats were the two most popular (and most annoying) decks in each Singleton
event, I might have also banned them to try to encourage people to give it a
chance, but I honestly don't remember.
This discord never really took off, and while I will never know for sure why
that happened, I do know one reason. This is also the point at which I built the
precursor to the LeagueBot we all know and love: the Sharktocrab
Bot. I'd intended to build it out
into a LeagueBot inspired by Penny Dreadful,
but ended up being overrun with work shortly after and never quite got around to
it. At this point in time, I had initial plans for how to record matches in a
safe manner, but hadn't figured out some details like deck submission (would we
use the rising star AetherHub? the old guard Goldfish?). However, I also made a
fatal mistake with this bot.
Those of you that've used Discord for a while have probably seen the "React to
get speaking permissions" gatekeeping bots. That is one of the few features I
got implemented into Sharktocrab—and it worked! Except I would occasionally
get DMs about how it didn't—DMs that I could never reproduce. Later, I
realized the likely culprit was my use of Heroku's free hosting, which, though
free, comes with the caveat that your app may be put to sleep when
there is no activity. While I can't know for sure, I believe that this is the
cause of those unreproducible bug reports.
Enter: Arena Highlander
Between my own lack of time and the obvious failure to start, the server rotted
away for the better part of the year. Then, on September 28th, 2019, corporat
created the Historic Canadian Highlander discord, and advertised it on reddit
a short time later.
Petitioners and Rats were both on the initial points list, effectively
banning those decks for much the same reason that I had previously. The first
tournament ran that weekend, a round-robin with 6 participants that was won by
Oogablast with a 4-1.
Within two weeks, a pool of regulars was already forming. You'll
likely recognize some of them, like Roseface and GoblinMatron, from the Gladiator
discord. Wheeler joined fairly early on, but wasn't an active participant in the
format til much later. I joined about 3 weeks into the format, and won
a tournament for the first (and only) time playing Abzan Field of the Dead.
The deck wasn't actually very good, but it and tournament results around this
time highlight the problems with 100-card singleton with a small card pool: at
some point you just need good stuff to put in your deck. Midrange was clearly
very good, with a couple of aggro decks with good top-ends (Gruul, Boros,
Mono-Black) also putting up results.
Anyway, this core of players formed the bulk of tournament participation over
the next few months. Participation ebbed and flowed, but never really took off.
To corporat's credit, he was actively trying to generate & maintain interest in
the format with both weekly tournaments and periodic posts about said
tournaments on reddit. While the weekly tournaments certainly helped maintain
interest, there was a chronic lack of pick-up games during the week that, in my
opinion, were both a cause for and caused by the limited number of players.
The Wheeler Effect and the Birth of the LeagueBot
All that changed on April 19th, 2020 (cards in format: 2,622), when Wheeler
streamed Arena Highlander (then called Historic Highlander)
for the first time. The format exploded. The discord gained hundreds of users
basically overnight. At the same time, I had just graduated, COVID-19 had
frustrated my job search, and I suddenly found myself with copious free time.
With the sudden influx of players, I decided to finally finish what I'd started
more than a year prior and build the LeagueBot.
During that year, designs for the bot had been in the back of my mind with some
regularity. I had mapped out the flows for match reporting, and had ideas for
the registration process. Putting those designs into practice was mostly
straightforward, with Discord itself giving me the ultimate assist by making
uploads so seamless that I could use them for deck registration. One week later,
the week-long April 2020 Test
League went live—but not before
the format split.
I was not involved in many of the conversations preceding the split between
Arena Highlander and Gladiator, so I'll refrain from commenting on it. I
will say, though, that the timing was incredibly frustrating. I began work on
the bot on April 20th, and had made substantial progress on it by the time that
the split happened on the 22nd. While I'd hoped that the launch of the League
on the 26th would help drive continued interest in Arena Highlander, it became
apparent within about a month that the bulk of the new blood had followed
Wheeler to Gladiator.
To his credit: there was a clear effort to
accommodate the existing Arena Highlander format. For example, Gladiator
tournaments were deliberately not run at the same time as the existing weekly
Arena Highlander tournament, despite Saturday afternoon being an excellent
time for tournaments across time zones.
Transitioning to Gladiator
Around this time, I took a break. I found work, played other games, and let my
frustration with the failure of the league and with Magic in general
(WAR–IKO was an awful time to be playing Magic) subside. I
popped into the Gladiator discord in early June when there was discussion of
starting a league, but aside from that was largely uninvolved.
Then, on July 7th I (finally) reached out to Roseface about moving the
LeagueBot over to Gladiator. The only thing that really needed changing was a
few names that read "Arena Highlander" and the domain name it used. Two days
later, I heard back: 👍. The plan was to launch on July 16th to coincide with
the release of Jumpstart—giving just over a week to switch things over.
This was plenty of time, and I had the opportunity to add in a couple of new
features like the API. Aside
from the previously mentioned issues with Jumpstart card exports, launch went
smoothly. Seeing the Gladiator community jump into the league and play more
than I'd ever expected was…nice. I'll take it.
You might have noticed that throughout this I've tracked the number of cards in
the format. In just over a year, the number of cards on Arena has doubled. With remastered Pioneer sets, it may well double again by next year.
I think this is relevant to understanding the evolution of the greater
Arena-singleton format. For instance: prior to ELD, a 100-card singleton deck
would include anywhere from 3-5% of all cards on Arena (60/2,038
is about 3%, 100/2,038 is about 5%; a typical Gladiator deck has around 60
non-lands, and 3c decks may legitimately play 90+ nonbasic cards), which
means you would often dip into borderline (or even outright) draft chaff for
playables, especially in mono-colored decks. This means that your card quality
was relatively low compared to now (when one plays 2-3% of all cards), and led
to an abundance of 2-3c Goodstuff decks along with legitimate concerns that the
consistency of Petitioners / Rats decks could be problematic despite the
obviously low card quality. I'm glad those days are behind us, and am looking
forward to seeing how the increasing card pool lets the format further evolve.
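Those proportions are easy to sanity-check. As a quick sketch using only the counts quoted above (the pool size and deck counts come from this post, not from any external data):

```python
# Back-of-the-envelope check of the card-pool percentages quoted above.
cards_pre_eld = 2_038   # approximate Arena card pool before ELD
full_deck = 100         # a full 100-card singleton list
nonlands = 60           # typical non-land count in a Gladiator deck

assert round(full_deck / cards_pre_eld * 100) == 5  # ~5% of the pool
assert round(nonlands / cards_pre_eld * 100) == 3   # ~3% of the pool
```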
Written by J. David Smith Published on 07 May 2019
In late January, I started working on a new research project. Without getting
into too much detail, it involves network analysis on blockchain-based
currencies (my opinion on blockchain tech has not noticeably
improved). After some initial results, my advisor & I decided
to target ASONAM 2019 for publication, which has submissions in mid-late April.
The exact, final deadline fell on the 28th this year.
I didn't make it.
I would like to briefly explore the reasons that I missed the deadline. Like
prior deadlines I've missed, there is some element of research being simply
unpredictable. However, there are larger errors that I made that would be
valuable to not repeat.
I Was Indecisive
The first, and in my opinion largest, mistake boils down to indecision. While
the project started in late January, I never committed to a single research
question. Rather, I continued with various bits of exploratory analysis up
until very shortly before the deadline.
While much of this exploratory analysis holds small bits of value, it doesn't
come together as a cohesive whole. It isn't a paper. I knew this,
experientially, from the project I spent most of last year on. In that project,
I did a lot of initial analysis that never ended up in the paper despite
informing my understanding of the problem space.
Some of that will
be going back in as I finish dealing with revisions for journal publication in
the next couple of weeks. Despite knowing that much of my
exploratory work would find its final resting place on the cutting room floor,
I persisted with it up until about 12 days before the submission deadline.
Some of these results were quite strong (one of our key results is
that prior network analysis had missed a key behavior native to Bitcoin,
one-time-use change addresses, which skews the "user" network substantially
unless corrected for), which gave me hope that I'd still be
able to put together something compelling. Had the small extension I spent
several of those 12 days on panned out, I might be writing a different post.
However: it didn't, and I'm writing this post because those results were not
enough, and I didn't commit to this idea until it was too late to fully realize
it.
I Was Passive
As I worked through my analysis and began forming the project, it became clear
(to me) that the initial direction that I'd worked out with Dr. Thai wasn't
going to be viable. There were several alternatives, and earlier in the
semester I suggested each of them in turn to Dr. Thai in our meetings.
However, I didn't do so forcefully. I'm saying now that the direction was
not viable, but in early March I was not nearly so clear. I danced around the
point, and when she pushed back against it I wilted. Had I been clear,
I could easily have spent those six-to-eight weeks fleshing out one of these
more promising alternatives. It may still not have panned out, but it at least
wouldn't have felt like as much of a waste.
I Did Not Understand the Limitations of My Tools
Over the past year, I have spent a lot of time working within the
pandas ecosystem. It's been great! Pandas is
a great library for many tasks, with a flexible API that has allowed me to do
a lot of analysis much more quickly than I would've been able to
otherwise—especially when paired with
plotnine to quickly generate
complex visualizations. However, pandas has a bit of a problem with large data.
Specifically: pandas' memory usage is highly variable, difficult to predict,
and impossible to control. An operation may have minimal memory overhead and
take less than a second to compute—but a small modification to it may instead
take hours to run and result in a deep copy of some or all of the data. When the
first copy of the data clocks in at 120GB, an automatic deep copy
slows things to a crawl, and very quickly leads to OOMing the server. The most
common culprit was the .groupby(...) method, though I had issues with some
chained aggregations via .apply(...) as well. Unfortunately, groupby(...)
is a fundamental operation necessary for my work, so many of my attempts to
finalize results in the final days before the deadline simply fell apart.
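As a toy illustration of the kind of trap involved (toy data here, not the actual 120GB workload): the built-in cythonized aggregation and a Python-level .apply(...) compute the same totals, but the latter materializes every group as its own object along the way, which is exactly where memory and time can balloon.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data: one (user, amount) row per transfer.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "user": rng.integers(0, 1_000, size=100_000),
    "amount": rng.random(size=100_000),
})

# Fast path: a built-in, cythonized aggregation.
totals_fast = df.groupby("user")["amount"].sum()

# Slow path: the same result via a Python-level apply, which
# materializes each group as a separate object along the way.
totals_slow = df.groupby("user")["amount"].apply(lambda s: s.sum())

assert np.allclose(totals_fast.values, totals_slow.values)
```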
Long nights and wasted hours could've been avoided had I realized the cause
of these memory issues earlier. While I'd encountered performance
issues with pandas before, I had largely attributed them to hitting slow paths
in what is ultimately a Python library. During this project, I stumbled upon
this post by
Pandas creator Wes McKinney that hits on the reasons for many of the issues
I've faced. As useful as Pandas is, it became clear to me that it currently
isn't going to be a viable option for analysis of this particular dataset.
Not that I'm giving up on Pandas. In fact, I still use it
heavily. Rather, I am now better-equipped to identify which problems it is
ill-suited for. This one in particular has (a) a large memory footprint, and
(b) a heavy reliance on groupby(...) operations for reasons intrinsic to the
data. The combination of these two means that pandas is simply not the right tool.
Ultimately, I ended up rewriting many of these bits of analysis in Rust
and doing the aggregations manually, then loading and plotting the results.
These simple Rust programs were not too difficult to write thanks to the
hdf5 and csv libraries, and
even with repeated data loading they are substantially faster than my
Python/pandas code. This let me complete part of my analysis, but ultimately
I lost too much time to struggling with MemoryErrors to be able to complete
all of it.
Despite this failure, I'm not particularly upset. I am frustrated, but am
trying to channel this frustration into dealing with the problems I faced
productively. I am particularly glad that I have had an excuse to basically
ignore research work for the past week, between grading exams and preparing to
teach a summer class. It has given me time to reflect on the factors that led
to this failure and realize that—even though it is my fault—it is
something that I can learn from and improve on in subsequent papers.
Written by J. David Smith Published on 12 February 2018
I rerolled from my Hunter (MM lyfe) to a Brewmaster shortly after the Nighthold released, and holy hell I didn't expect this ride. I was going to just play my already-110 BDK, but our other tank was a BDK and after the incident with Cenarius (a bug/weird interaction on Cenarius removed Marrowrend stacks much faster than they could be applied, making tanking Cenarius pre-fix on a BDK really awkward and more difficult than necessary) the raid team was kind of sketchy on having two of the same tank. So, I powerleveled my formerly-Windwalker Monk from 88 to 110 over a weekend and started tanking.
Since then, I've become active in the #brewlounge and #brewcraft channels in the Peak of Serenity Discord, tanked a bit of Mythic (5/9M T20, less than I'd have liked), and picked up maintenance of the WoWAnalyzer modules for Brewmaster. All in the space of less than a year. Whew.
Now BfA has been announced and changes are rolling out on the alpha. Since Brewmasters have received no super noteworthy changes yet, it's the perfect time for a wishlist! Before we get to that, however, I'd like to break down what the Brewmaster kit looks like right now so we have some context for my BfA wishlist.
What We Have
Stagger & Mastery
While Guardian Druids have held the title of "Dodge Tank" in the past, it now firmly belongs to Brewmasters. It is not uncommon to see people reach 70% or more dodge rates in practice thanks to our Mastery: Elusive Brawler. EB provides a stacking bonus to our dodge chance each time we either fail to dodge a dodgeable ability or cast certain spells.
The current list is Blackout Strike, Breath of Fire, and (with Blackout Combo and rarely used) Blackout Strike + Purifying Brew. This mastery performs three functions: first, it boosts our overall dodge percentage; second, it allows us to consistently deal with dodgeable mechanics through proactive play (banking EB-generating abilities); third, it mitigates the downsides of randomness-driven mitigation.
That last point is fundamental. Historically, dodge-based mitigation has been much inferior to reduction-based mitigation due to sheer randomness: if you fail to dodge several hits in a row, you can simply die.
Those more familiar with pre-MoP tanking may be confused by this: dodge was a good stat! It is more useful to compare Dodge for Brewmasters with Defense prior to its removal: defense capping prevented crushing blows from randomly spiking you, and reaching the defense cap was considered mandatory. Brewmasters (along with VDH) have by far the least armor of any tank (I have 2,940 armor without trinkets, while a similarly geared DK has 5k+ and a Paladin has 7k+), meaning that pre-Stagger melee hits are much larger. Mastery prevents this from happening altogether: at current gear levels, many BrMs are guaranteed to dodge at least every 3rd hit without taking any actions. With the low cooldown of Blackout Strike, this gets close to a guaranteed every-other-hit dodge rate.
While Elusive Brawler significantly smooths our damage intake from melees, without further mitigation we would still be extremely spikey.
Smooth damage intake can be healed much more efficiently, lowering the strain on your healer's mana and reaction times. Stagger, our second "mitigation" method, not only addresses this but has in fact led us to be the smoothest tank in terms of damage intake despite our reliance on the inherently random nature of dodge. Stagger causes every hit to be partially absorbed (currently 75% with Ironskin Brew up and no specific talents or gear), with the absorbed damage spread over the next ten seconds.
The exact mechanics of it aren't quite straightforward. emptyrivers has an excellent overview of how it works on the Peak of Serenity. Functionally, this means that every would-be spike of damage instead is translated into a much lower damage-taken-per-second effect.
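A toy model makes the smoothing concrete, using the 75% absorb and ten-second window from above. Uniform one-second ticks are my simplification here; the in-game implementation ticks differently.

```python
def stagger(hits, absorb=0.75, window=10):
    """Toy model: each hit lands partly now, with the staggered
    portion dealt evenly over the next `window` one-second ticks."""
    taken = [0.0] * (len(hits) + window)
    for t, hit in enumerate(hits):
        taken[t] += hit * (1 - absorb)              # immediate portion
        for dt in range(1, window + 1):
            taken[t + dt] += hit * absorb / window  # staggered dribble
    return taken

# A 1000-damage spike becomes a 250 hit plus a 75-per-tick dribble.
profile = stagger([1000.0])
assert profile[0] == 250.0 and profile[1] == 75.0
assert abs(sum(profile) - 1000.0) < 1e-9  # no damage is reduced, only delayed
```

The last assertion is the key point of the section: total damage taken is unchanged; only the shape of the intake curve is.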
Despite not reducing total damage taken, Stagger enabled ludicrous cheesing
(creative strategies that make otherwise-difficult mechanics nearly non-existent) early in the expansion. A cap on the total staggered damage (at most 10x player HP) was patched in after Tier 19 ended. Though it doesn't affect normal gameplay, it does mean that previous cases of stagger cheesing would now be much riskier if not impractical.
On the whole, Elusive Brawler and Stagger together give Brewmasters a unique way of smoothing damage taken – turning what would otherwise be the spikiest tank into the smoothest. In particular, Elusive Brawler gives rise to gameplay that no other tank presently has.
If you haven't known the joy of tanking Tyrannical Blackrook Hold as a BrM: Smashspite's charge can be dodged. It still applies stacks, and will rapidly reach the point of one-shotting you through any and all cooldowns. This leads to an intense do-or-die minigame.
Brews & CDR
The second, active, half of Brewmasters' mitigation toolkit comes in the form of Ironskin Brew & Purifying Brew. Ironskin Brew increases the amount of damage absorbed by Stagger from 40% to 75%. Note again that this does not mitigate anything. Our active mitigation is instead packed into Purifying Brew, which removes 40% (without traits or talents) of the damage that has been absorbed by Stagger but not yet dealt over time. These two abilities share 3 charges with a cooldown of 21 seconds each (reduced by haste). However, Ironskin Brew only lasts for 8 seconds, meaning that based on these charges alone we are not capable of maintaining the buff – let alone Purifying.
However, this cooldown is reduced significantly by correctly performing the conventional rotation since key rotational abilities (Tiger Palm and Keg Smash) reduce the cooldown of brews by a significant amount. This situation is further helped by the presence of the (mandatory) Black Ox Brew talent, which restores all 3 charges on use, on an effective 40-45 second cooldown.
BoB benefits from the cooldown reduction of TP and KS. Combined, this gives each Brew charge an effective cooldown of around 6-7 seconds. Or, put another way: exactly enough to keep Ironskin Brew up 100% of the time and have occasional charges to spend on Purification.
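A quick sketch of that uptime arithmetic, using only the numbers above (and the fact that WoW charges recharge serially, one at a time):

```python
isb_duration = 8.0    # seconds of buff per Ironskin Brew cast
base_recharge = 21.0  # seconds per charge with no rotational CDR
cdr_recharge = 6.5    # ~6-7 s effective cooldown with TP/KS/BoB CDR

# Without CDR, a charge comes back every 21 s but only buys 8 s of
# buff: maintaining Ironskin Brew unaided is impossible.
assert isb_duration / base_recharge < 1.0

# With rotational CDR, each charge buys more buff time than it costs,
# leaving spare charges for Purifying Brew.
assert isb_duration / cdr_recharge > 1.0
```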
This system sounds great. It leaves us with manageable DTPS over the course of a fight, while allowing us to mitigate large spikes such as Fel Claws with Purifying Brew. Unfortunately, it suffers from a fundamental flaw: prior to Tier 20, there was no cap on the duration of the Ironskin Brew buff. This led to effective BrMs ending fights with several minutes of time remaining on the buff, while still having plentiful brews to spend on-demand for Purification. In response to this, Blizzard implemented a cap of three times the buff duration (24 seconds with traits). This turned the previously-comfortable Brew system into one with an annoying buff to maintain and another button you rarely use. And when I say rarely, I mean rarely: pre-Mythic it is possible to go entire fights without needing to Purify due to low damage intake, and on Mythic bosses Purification may only occur a couple of times a minute. This leads to an unsatisfying situation where we have to jump through a bunch of hoops to hit 100% ISB uptime – which is nearly mandatory to avoid being DK- or DH-levels of spikey without the self-sustain, cooldowns, or utility to compensate – but then have excess charges that we worked to generate but can't use effectively.
On the whole, I like the Brew/CDR system. However, its flaws are noticeable and have no easy solutions. In conjunction with this system, Stagger has been on the razor's edge between hopelessly busted and borderline useless since its value fundamentally scales with the amount of incoming damage. On high damage fights, Purifying Brew provides excellent mitigation while Ironskin Brew keeps our damage intake smooth and easily healable. Below that point, Purifying Brew becomes worthless because we cannot reach a high enough level of staggered damage for purifying to be worth more than pressing Ironskin Brew again.
I won't spend long on this, but the current UI situation for Brewmasters is preeeeettty bad. The maintenance buff of Ironskin Brew is difficult to track on the default UI – a situation made worse by the fact that dropping it can lead to a quick, untimely demise. Worse: the Stagger bar caps out at 100% of your HP, which is quite frankly nothing. It is common for experienced brewmasters on #brew_questions to recommend not purifying at less than 100% of your HP staggered.
...Brewmasters are really good. They're a strong tank class for challenging content, fun to play, and have a distinct playstyle not shared with other classes. While flaws exist, the spec works despite them. There is still room for improvement, however, so now we get on to the part you're really reading this for: what I want in Battle for Azeroth.
What I Want
1. Death to Ironskin
The biggest complaint I have with the playstyle of BrM is the annoyance that is ISB. It is only interactive insofar as it requires not pressing Purify too much and executing your rotation well. Pressing the ISB button is merely an annoyance that is necessary to confirm you've done the other two things.
As we saw in Nighthold, we clearly can't leave it uncapped either. So, what do we do? An oft-suggested option is to make ISB baseline and reduce the CDR available for Purifying Brew. However, this seems unlikely to be effective either given how infrequently we have to Purify. It is clear from the Legion Beta posts that they intended for Ironskin Brew to be used as an active mitigation ability like Shield Block, but doing so left Brewmasters uncomfortably spikey in the interim.
My main issue with ISB as it stands is the needlessness of pressing that extra button during an already high-APM rotation (I have logs with negative overall downtime since brews are off the GCD). What if Ironskin Brew were just removed, an equivalent and appropriately-themed buff given to Tiger Palm / Keg Smash, and Purifying Brew's cooldown reduced to compensate? While clearly not a perfect solution (it doesn't solve the fundamental balance issues with Stagger, for example), it does eliminate my major annoyance with current Brewmastery.
2. Better Utility
You might notice that in the What We Have segment I never once mentioned utility. While we have some unique utility currently, it mostly boils down to (a) cheesing fights with Dodge & Stagger, and (b) Ring of Peace – both of which are highly situational. This is in stark contrast to Meatball's Savage Roar, Protadin's Healing & Bubbles, and Blood DKs. My wish for BfA here is just to have a bit to set us apart from other tanks.
On the BfA Alpha, we're getting a bit of "extra" utility in the form of Leg Sweep being made baseline. This is supported by AoE stuns being stripped from other classes, which means that the presence of Leg Sweep in our kit becomes more noticeable – especially if we are allowed to go into BfA with both Ring of Peace and Leg Sweep available together.
3. Revisiting Dead Talents
On the whole, Brewmasters don't have many talents that are just totally completely unusable garbage. Even Chi Wave, which is used in neither dungeons nor raid content, can be a decent talent out in the world. I've used every talent we have at some point for a reason beyond just trying it out – every talent, that is, except for the Level 45 Row.
The Level 45 row has exactly 1 talent in it: Black Ox Brew. Don't @ me. I will fight you. BoB is better, both empirically and in feelycraft, than either of the alternatives in every feasible situation. Going into BfA, I'd like to see that change. That may mean buffing the alternatives (I'd love to have access to proper self-sustain through a buffed GotM), but more realistically it means that BoB just needs to get hit. The cooldown reduction it provides is simply too high: it is both immensely powerful at smoothing an entire fight due to its low cooldown, and excellent for dealing with specific mechanics like KJ's Fel Claws.
4. (Just for Me) Better Transparency
As I mentioned in the intro, I maintain the WoWAnalyzer modules for Brewmaster. This has been a lot of fun: the dev team is fantastic, #brewcraft has been super helpful in filling in the gaps in my knowledge, and the codebase is generally nice to work with. That said, it has not been without frustrations. Stagger in particular has been an immense nuisance. I rewrote what the previous maintainer (WOPR) had done so that I could add some extra features that were infeasible before, but that implementation sometimes breaks down for no apparent reason.
In BfA, I'd like to see the amount of staggered damage reported in the logs. There is already a classResources field that records things like energy, runes, mana, etc. While there may be technical limitations to adding Stagger to that list, I'd very much like to see it happen. It'd make my job significantly easier.
Wrapping It Up
Playing Brewmaster during Legion has been one of the highlights of my WoW career, and I'm excited to see where we end up in BfA. Although I'm obviously hoping that not much changes since BrM is currently a lot of fun, I am hopeful that we can have some wrinkles ironed out in the transition to the next expansion.
Written by J. David Smith Published on 07 September 2017
It's one of those days where I am totally unmotivated to accomplish anything
(despite the fact that I technically already have – the first draft of my qual
survey is done!). So, here's a brief aside that's been in the back of my mind
for a few months now.
It is extremely common for the simulations in my line of work (or ours: hi,
fellow student!) to have a large set of parameters. The way that this is
handled varies from person to person, and at this point I feel as though I've
seen everything; I've seen simple getopt usage, I've seen home-grown
command-line parsers, I've seen compile-time #defines used to switch models!
[Fig 1: Me, reacting to #ifdef PARAM modelA #else modelB #endif]
Worse, proper documentation on what the parameters mean and what valid inputs
are is as inconsistent as the implementations themselves. Enough. There is a
better way.
Docopt is a library, available in basically any
language you care about (R and even Shell included; language is not an excuse
for skipping on this), that parses a documentation string for your command
line interface and automatically builds a parser from it. Take, for example,
this CLI that I used for a re-implementation of my work on Socialbots; see
here for context on what the parameters mean (aside from ζ, which has never
actually been used):
  Simulation for <conference>.

  Usage:
    recon <graph> <inst> <k> (--etc | --hmnm | --zeta <zeta>) [options]
    recon (-h | --help)

  Options:
    -h --help            Show this screen.
    --etc                Expected triadic closure acceptance.
    --etc-zeta <zeta>    Expected triadic closure acceptance with ζ.
    --zeta <zeta>        HM + ζ acceptance.
    --hmnm               Non-Monotone HM acceptance.
    --degree-incentive   Enable degree incentive in acceptance function.
    --wi                 Use the WI delta function.
    --fof-scale <scale>  Set B_fof(u) = <scale> B_f(u). [default: 0.5]
    --log <log>          Log to write output to.
This isn't a simple set of parameters, but it is far from the most complex I've
worked with. Just in this example, we have positional arguments (<graph> <inst> <k>)
followed by mutually-exclusive settings (--etc | --hmnm | ...)
followed by optional parameters ([options]). Here is how you'd parse this
with the Rust version of Docopt:
Although in this version type validation must be done manually (e.g. if you
expect a number but the user provides a string, you must check that the given
string can actually be parsed as a number), this is still dramatically simpler
than any parsing code I've seen in the wild. Even better: your docstring is
always up to date with the parameters that you actually take. Of course,
certain amounts of bitrot are always possible. For example, you could add a
parameter but never implement handling for it. However, you can't accidentally
add or rename a flag and then never add it to the docstring, which is far more
common in my experience. So – for your sanity and mine –
please just use Docopt (or another CLI-parsing library) to read your
parameters. These libraries are easy to statically link into your code (to
avoid .dll/.so not found issues), and so your code remains easy to move
from machine to machine in compiled form. Please. You won't regret it.
Written by J. David Smith Published on 23 February 2017
A couple of weeks ago I mentioned several methods that I'd been (trying) to use to
help improve the verifiability of my research. This is the first in an
(eventual) series on the subject. Today's topic? Rust.
One of the reasons it took me so long to write this post is the difficulties
inherent in describing why I chose Rust over another, more traditional
language like Java or C/C++. Each previous time, I motivated the selection by
first covering the problems of those languages. I have since come to realize
that that is not a productive approach – each iteration invariably devolved
into opining about the pros and cons of various trade-offs made in language
design. In this post, I am instead going to describe the factors that led to me
choosing Rust, and what practical gains it has given me over the past two
projects I've worked on.
In the Beginning...
First, some background on who I am, what I do, and what constraints I have on
my work. I am a graduate student at the University of Florida studying
Optimization/Security on Online Social Networks under Dr. My T. Thai.
Most problems I work on are framed as (stochastic) graph optimization problems,
which are almost universally NP-hard. We typically employ
approximation algorithms to address this, but even then the resulting
algorithm can still be rather slow to experiment with due to a combination of
difficult problems, large datasets and the number of repetitions needed to
establish actual performance.
This leads to two often-conflicting constraints: the implementations must be
performant to allow us to both meet publication deadlines and compete with
previous implementations, and the implementations must also be something we can
validate as correct.
Well, ideally. Often, validation is done only
by the authors, and code is not released. Most code in my
field is in either C or C++, with the occasional outlier in Java when
performance is less of a concern, in order to satisfy that first constraint.
However, after repeatedly having to work in others' rushed C/C++ codebases,
I got fed up. Enough of this! I thought, There must be something better!
I set about re-implementing the state-of-the-art approximation algorithm (SSA)
for the Influence Maximization problem.
Hung Nguyen, Thang Dinh,
My Thai. “Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing
in Billion-Scale Networks.” In the Proceedings of SIGMOD 2016. This method has an absurdly-optimized C++ implementation, using
all sorts of tricks to eke out every possible bit of performance. Solving this
problem – even approximately – on a network with 40+ million nodes is no
small feat, and the implementation shows the effort they put into getting
there. This implementation formed my baseline both for performance and
correctness: every language I rewrote it in should produce (roughly) identical
results.
The algorithm uses non-deterministic sampling, so there
is room for some error after all the big contributors have been found. Every
feasible language would also have to get in the same
ballpark in terms of memory usage and performance.
In my spare time over the space of a couple of months, I implemented SSA in
Chez Scheme, Clojure, Haskell, OCaml and,
eventually, Rust. My bent towards functional programming shows clearly
in this choice of languages. I'm a lisp weenie at heart, but unfortunately no
Scheme implementation I tried nor Clojure could remotely compete with the C++
implementation. Haskell and OCaml shared the problem that graph processing in
purely-functional languages is just an enormous pain in the ass, in addition to
not hitting my performance goals
Some of this is due to my
unfamiliarity with optimizing in these languages. However, I didn't want to get
into that particular black magic just to use a language that was already
painful for use with graphs. – though they got much closer.
Then I tried Rust. Holy. Shit.
The difference was immediately apparent. Right off the bat, I got performance
in the ballpark of the baseline (once I remembered to compile in release
mode). It was perhaps 50% slower than the C++ implementation. After a few
rounds of optimization, I got that down to about 20-25%.
Unfortunately, I no longer have the timing info. As that isn't the focus of
this post, I also haven't re-run these experiments. Had I been
willing to break more safety guarantees, I could have applied several of the
optimizations from the baseline to make further improvement. Some of this
performance is due to the ability to selectively use mutability to speed up
critical loops, and some is due to the fact that certain graph operations are
simply easier to express in an imperative fashion, leading me to write less
wasteful code. What's more: Rust also has a type system similar – in its
strong guarantees – to those of the ML-family languages that I adore,
combined with a modern build/packaging tool and a thriving ecosystem. It seemed
that I'd found a winner. Now, for the real test.
The First Paper: Batched Stochastic Optimization
I was responsible for writing the implementation for the next paper I worked on.
The code – and paper – aren't out yet. I'll hear back in
early March as to whether it was accepted. We were extending
a previous work
Xiang Li, J David Smith, Thang Dinh, My T.
Thai. “Privacy Issues in Light of Reconnaissance Attacks with Incomplete
Information.” In the Proceedings of WI 2016. to support
batched updates. As the problem was again NP-hard with dependencies on the
ordering of choices made, this was no small task. Our batching method ended up
being exponential in complexity, but with a strict upper bound on the size of
the search space so that it remained feasible. This was my first testbed for
Rust as a language for my work.
Immediately, it paid dividends. I was able to take advantage of existing
libraries that allowed me to jump straight from starting work to implementing our method.
Performance was excellent, parallelism was easy, and I was able to
easily log in a goddamn parseable format. It was wonderful. Even the
time I spent fighting the borrow checker
A phase I will note was
very temporary; I rarely run into it anymore. failed to outrun
the benefits gained by being able to build on others' work. I had a real
testing framework, which caught numerous bugs in my batching code. Even
parallelism, which generally isn't much of a problem for such small codebases,
was no harder than applying OpenMP. Indeed, since I needed
read-write locks it was in fact easier as OpenMP lacks that feature
(you have to use pthread instead). It was glorious, and I loved it.
Still do, to be honest.
The Second Work: Generic Analysis
The next paper I worked on was theory-driven. I showed a new method to
estimate a lower bound on approximation quality for a previously unstudied
class of problems.
Not exactly new, but general bounds for this class of problem had
not really been considered. As a result, the greedy approximation algorithm
hadn't seen much use on it. The general
idea is that we're maximizing an objective function f, subject to some
constraints where we know that all maximal solutions have the same size.
I needed to be able to transparently plug in different values of f, operating
on different kinds of data, with different constraints. Further, for
efficiency's sake I also needed a way to represent the set of elements that
would need updating after each step.
Rust gave me the tools to do this. I defined a trait Objective to
abstract over the different possible values of f. Each implementor
could handle building their own internal representation, and with associated
types I could easily allow each kind to operate on their own kind of data.
It worked. Really well, actually.
I wrote a completely generic greedy algorithm in terms of this trait,
along with some pretty fancy analysis. Everything just...worked, and
with static dispatch I paid very little at runtime for this level of
indirection. At least, as soon as it compiled.
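To make the shape of this concrete, here is a minimal sketch of such a trait plus a generic greedy routine. The names (Objective, gain, candidates) and the toy WeightSum objective are hypothetical stand-ins, not the actual code from the paper:

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Hypothetical sketch: each implementor picks its own Element type via
// an associated type and builds whatever internal state it needs.
trait Objective {
    type Element: Copy + Eq + Hash;

    /// Marginal gain of adding `e` to the current solution.
    fn gain(&self, solution: &HashSet<Self::Element>, e: Self::Element) -> f64;

    /// Candidate elements still worth considering.
    fn candidates(&self) -> Vec<Self::Element>;
}

// A completely generic greedy algorithm written once against the trait;
// monomorphization (static dispatch) makes the indirection nearly free.
fn greedy<O: Objective>(obj: &O, k: usize) -> HashSet<O::Element> {
    let mut solution = HashSet::new();
    for _ in 0..k {
        let best = obj
            .candidates()
            .into_iter()
            .filter(|e| !solution.contains(e))
            .max_by(|a, b| {
                obj.gain(&solution, *a)
                    .partial_cmp(&obj.gain(&solution, *b))
                    .unwrap()
            });
        match best {
            Some(e) => {
                solution.insert(e);
            }
            None => break,
        }
    }
    solution
}

// Toy objective: pick the elements with the largest weights.
struct WeightSum(Vec<f64>);

impl Objective for WeightSum {
    type Element = usize;
    fn gain(&self, _sol: &HashSet<usize>, e: usize) -> f64 {
        self.0[e]
    }
    fn candidates(&self) -> Vec<usize> {
        (0..self.0.len()).collect()
    }
}

fn main() {
    let sol = greedy(&WeightSum(vec![1.0, 5.0, 3.0]), 2);
    assert!(sol.contains(&1) && sol.contains(&2));
}
```

Swapping in a different f is just another impl of the trait; greedy itself never changes.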
Not All Fun & Games
As great as my time with Rust has been, there are still a few flaws I feel
compelled to point out.
Borrow Checking is Sometimes Extremely Painful
While in most cases the borrow checker became something I dealt with
subconsciously, in a few it was still an extraordinary pain in the ass. The
first, and what seems to be the most common, is the case of having callbacks on
structs. Some of this comes down to confusion over the correct syntax (e.g.
Fn(...) -> X or fn(...) -> X), but I mentally recoil at the thought of
trying to make a struct with callbacks on it. This was not a pleasant
experience.
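For illustration, here is the distinction in miniature (the Hooks struct and its field names are made up): fn(...) -> X is a plain function pointer, while Fn(...) -> X is a trait, so storing an arbitrary closure requires a trait object such as Box<dyn Fn(...) -> X>:

```rust
// Hypothetical struct holding two flavors of callback.
struct Hooks {
    // A plain function pointer: cheap, but cannot capture state.
    plain: fn(u32) -> u32,
    // A boxed closure trait object: may capture its environment,
    // at the cost of a heap allocation and dynamic dispatch.
    fancy: Box<dyn Fn(u32) -> u32>,
}

fn double(x: u32) -> u32 {
    x * 2
}

fn main() {
    let offset = 10;
    let hooks = Hooks {
        plain: double,
        // A capturing closure (`offset`) cannot coerce to a plain `fn`.
        fancy: Box::new(move |x| x + offset),
    };
    assert_eq!((hooks.plain)(3), 6);
    assert_eq!((hooks.fancy)(3), 13);
}
```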
The second arose in writing the constructor for an InfMax
objective using the SSA sampling method. There are multiple different
diffusion models for this problem, each represented as a struct implementing
Iterator. I wanted my InfMax struct to own the Graph object it operated
on, and pass a reference of this to the sampling iterator. The borrow checker
refused.
To this day, I still don't know how to get that to work. I ultimately caved and
had InfMax store a reference with lifetime 'a, giving the sampler
a reference with a lifetime 'b such that 'a: 'b (that is, 'a is at least
as long as 'b). While this workaround took only a moderate amount of time to
find, I still lost quite a bit trying to get it to work as I wanted. Giving
InfMax a reference never caused any issues, but I still find it annoying.
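A stripped-down sketch of that workaround (all type names here are simplified stand-ins for the real ones): the objective holds &'a Graph and hands its sampler a reference with a shorter lifetime 'b, with 'a: 'b guaranteeing the graph outlives the sampler:

```rust
// Simplified stand-ins for the real Graph / InfMax / sampler types.
struct Graph {
    nodes: usize,
}

struct InfMax<'a> {
    graph: &'a Graph,
}

struct Sampler<'b> {
    graph: &'b Graph,
    next: usize,
}

impl<'a> InfMax<'a> {
    // The explicit `'a: 'b` bound says 'a lives at least as long as 'b,
    // so the sampler's reference can never outlive the graph.
    fn sampler<'b>(&'b self) -> Sampler<'b>
    where
        'a: 'b,
    {
        Sampler {
            graph: self.graph,
            next: 0,
        }
    }
}

impl<'b> Iterator for Sampler<'b> {
    type Item = usize;
    fn next(&mut self) -> Option<usize> {
        // Trivial "sampling": walk the node ids once.
        if self.next < self.graph.nodes {
            self.next += 1;
            Some(self.next - 1)
        } else {
            None
        }
    }
}

fn main() {
    let g = Graph { nodes: 3 };
    let inf = InfMax { graph: &g };
    let samples: Vec<usize> = inf.sampler().collect();
    assert_eq!(samples, vec![0, 1, 2]);
}
```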
In order to use the bit-set library, I wanted to force the
Element associated type of each Objective to satisfy Into<usize>. The
NodeIndex type from petgraph does not, but it has a method
fn index(self) -> usize. Due to the rules about when you can implement
traits (namely, that you can't implement a trait when both the trait and the
type are non-local), this was impossible without forking or getting a PR into
the petgraph repository.
Forking was my ultimate solution, and hopefully a temporary one. There isn't
a clear way to work around this.
Implementing a helper trait was frustrating
because you can't implement both From<T: Into<usize>> for Helper and
From<NodeIndex> for Helper, presumably because at a future point NodeIndex
could itself implement From/Into<usize>. And yet there also isn't
a clear way to make this work without allowing crates to break each other.
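Another common escape hatch (distinct from the helper trait I tried) is the newtype pattern: wrap the foreign type in a local one and implement the conversion on the wrapper. The sketch below uses a local stand-in module in place of petgraph, so the names are illustrative only:

```rust
// Local stand-in for the foreign crate; imagine this is petgraph's
// NodeIndex, which has `index()` but no Into<usize> impl.
mod foreign {
    #[derive(Clone, Copy)]
    pub struct NodeIndex(pub usize);

    impl NodeIndex {
        pub fn index(self) -> usize {
            self.0
        }
    }
}

// The orphan rule forbids `impl From<foreign::NodeIndex> for usize` in
// our crate when NodeIndex really is foreign (trait and type both
// non-local), but a local newtype wrapper side-steps it.
#[derive(Clone, Copy)]
struct Ix(foreign::NodeIndex);

impl From<Ix> for usize {
    fn from(ix: Ix) -> usize {
        ix.0.index()
    }
}

fn main() {
    let ix = Ix(foreign::NodeIndex(7));
    let as_usize: usize = ix.into();
    assert_eq!(as_usize, 7);
}
```

The cost is ergonomic: every call site has to wrap and unwrap, which is exactly the friction that pushed me towards forking instead.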
Fortran FFI & Absent Documentation
I almost feel bad complaining about this, because it is so, so niche. But
I'm going to. I had to interface with a bit (read: nearly 4000 lines) of Fortran
code. In theory, since the Fortran ABI is sufficiently similar to C's, I ought
to be able to just apply the same techniques as I use for C FFI. As it happens,
this is mostly correct. Mostly.
Fortran FFI is poorly documented even for C, so as one might imagine, instructions
for interfacing from Rust were nearly absent. It turns out that, in Fortran,
everything is a pointer. Even elementary types like int. Of course, I also
had the added complexity that matrices are stored in column-major order (as
opposed to the row-major order used everywhere else). This led to a lengthy
period of confusion as to whether I was simply encoding my matrices wrong,
wasn't passing the pointers correctly, or was failing in some other way (like
passing f32s instead of f64s).
It turns out that you simply need to flatten all the matrices, pass everything
by pointer, and ensure that no call is made to functions with
assumed-shape matrix arguments.
Calling functions with
matrices of the form REAL(N, M) works provided N and M are either
variables in scope or parameters of the function.
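A sketch of what that looks like from the Rust side (the solve_ subroutine is hypothetical): every argument, scalars included, is passed as a pointer, and matrices get flattened into column-major order first:

```rust
// Hypothetical Fortran subroutine; with most Fortran compilers the
// linker symbol is the lowercased name plus a trailing underscore.
#[allow(dead_code)]
extern "C" {
    fn solve_(n: *const i32, m: *const i32, a: *const f64, out: *mut f64);
}

// Column-major flattening: element (i, j) of a rows x cols matrix goes
// to index j * rows + i, not the row-major i * cols + j.
fn to_column_major(rows: usize, cols: usize, a: &[Vec<f64>]) -> Vec<f64> {
    let mut flat = vec![0.0; rows * cols];
    for i in 0..rows {
        for j in 0..cols {
            flat[j * rows + i] = a[i][j];
        }
    }
    flat
}

fn main() {
    let m = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    // Columns are laid out contiguously: [1, 3], then [2, 4].
    assert_eq!(to_column_major(2, 2, &m), vec![1.0, 3.0, 2.0, 4.0]);
}
```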
The inability to multiply numbers of different types together without as is
immensely frustrating. I know why this is – I've been bitten by a / b doing
integer division before – but formulae littered with x as f64 / y as f64 are
simply difficult to read. Once you've figured out the types everything needs to
be, this can be largely remedied by ensuring that everything coming into the
function has the correct type and pre-casting everything else. This helps
dramatically, though in the end I often found myself just making everything
f32 to save myself the trouble (as that was what most values already were).
The inverted notation for things like pow and log (written x.pow(y) and
x.log(y) in Rust) further hinders readability.
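A tiny illustration of the pre-casting remedy (the function names are made up):

```rust
// Inline casts make the formula itself hard to read:
fn mean_noisy(total: u64, count: u64) -> f64 {
    total as f64 / count as f64
}

// Pre-casting at the top keeps the actual formula clean, which matters
// far more once the expression involves pow/log and several variables.
fn mean_clean(total: u64, count: u64) -> f64 {
    let (t, n) = (total as f64, count as f64);
    t / n
}

fn main() {
    assert_eq!(mean_noisy(3, 4), 0.75);
    assert_eq!(mean_clean(3, 4), 0.75);
}
```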
Verifiability & The Big Wins for Academic Use
I began this post by referencing the verifiability of simulations (and other
code). Through all this, I have focused more on the usability wins rather than
the verifiability. Why is that? Fundamentally, it is because code is only as
verifiable as it is written to be. The choice of language only impacts this in
how well it constrains authors to “clearly correct” territory. In comparison to
C/C++, Rust has clear advantages. There is no risk of accidental integer
division, very low risk of parallelism bugs due to the excellent
standard primitives, and nearly-free documentation.
Some commenting required. Accidentally indexing outside of
bounds (a problem I've run into before) is immediately detected, and error
handling is rather sane.
Further, the structure of Rust encourages modularization and allows
formerly-internal modules to be pulled out into their own crates, which can be
independently verified for correctness and re-used with ease – even without
uploading them to crates.io. A concrete example of this is
my Reverse Influence Sampling Iterators, which were originally (and
still are, unfortunately) a local module from my most recent paper and have
subsequently found use in the (re-)implementation of several recent influence
maximization algorithms.
This modularity is, I think, the biggest win. While reviewing and verifying
a several-thousand-line codebase associated with a paper is unlikely to ever be
practical, a research community building a set of common libraries could not
only reduce the need to continually re-invent the wheel, but also limit the
scope of any individual implementation to only the novel portion. This may
reduce codebase sizes to the point that review could become practical.
Case in point: the original implementation of the TipTop
algorithm was nearly 1500 lines of C++. My more recent Rust implementation
is substantially smaller.
Admittedly, this is without the weighted sampling allowed
in the C++ implementation. However, that isn't hard to implement.
For a while, I was worried that my exploration of alternate languages would be
fruitless, that I'd wasted my time and would be stuck using C++ for the
remainder of grad school. Words cannot describe how happy I am that this is not
the case. Despite its flaws, using Rust has been an immeasurable improvement
over C++ both in terms of productivity and sanity. Further, while verifiability
is very much secondary in my choice of language, I believe the tools and safety
that Rust provides are clearly advantageous in this area.