This is a selection of some of the projects I have worked on over the course of my career. It is broken down into three sections: Research, Industry, and Hobby Projects.
Projects done during the course of my PhD work at the University of Florida under Dr. My T. Thai. I studied approximate methods for discrete optimization, with a focus on their applications in social network analysis.
In the study of modern social networks, we must often work with networks that sit behind very restrictive access controls. My goal with this project was to develop an unsupervised method for measuring community structure that functions despite such limitations.
To accomplish this, I developed a novel method to calculate the local sparsity of a connection on the network, a quantity closely related to community structure. Using state-of-the-art Monte Carlo estimation methods, my approach is query-efficient enough to run on Twitter in a reasonable timeframe. Further, I developed a scalable, highly parallel Rust implementation to calculate the distribution of this quantity on very large networks—and explored what this summarization can tell us about the network structure.
Most optimization problems can be divided into two pieces: the constraint system you must satisfy, and the function you optimize under those constraints. Many traditional problems pair a simple size constraint with a function that has diminishing returns (formally, submodularity)—a property that enables very powerful results in approximate optimization.
We developed an algorithm to optimize a much broader class of functions over a more powerful constraint system known as the integer lattice. I built an efficient Rust implementation of our method, which we used to show that our approach is faster than prior work—both overall and in the number of function queries required—without any drop in solution quality.
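As a loose illustration of the marginal-gain greedy idea such methods build on (this is a toy sketch, not the algorithm from our paper), a lattice solution is a vector of non-negative integers rather than a set, and greedy spends a budget one unit at a time. The objective and weights below are invented for illustration:

```rust
// Toy sketch: greedy marginal-gain maximization on the integer lattice
// under a total-budget constraint. `f` is assumed to have diminishing
// returns (DR-submodularity), which is what makes the greedy choice a
// reasonable heuristic.
fn lattice_greedy(n: usize, budget: u32, f: impl Fn(&[u32]) -> f64) -> Vec<u32> {
    let mut x = vec![0u32; n];
    for _ in 0..budget {
        let base = f(&x);
        // Find the single-unit increment with the largest marginal gain.
        let (best, _gain) = (0..n)
            .map(|i| {
                x[i] += 1;
                let gain = f(&x) - base;
                x[i] -= 1;
                (i, gain)
            })
            .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
            .unwrap();
        x[best] += 1;
    }
    x
}

fn main() {
    // Toy DR-submodular objective: a concave function of a weighted sum.
    let f = |x: &[u32]| (1.0 + 3.0 * x[0] as f64 + x[1] as f64).sqrt();
    let x = lattice_greedy(2, 4, f);
    println!("{:?}", x); // all budget goes to the higher-weight coordinate
}
```

Because the objective is concave in each coordinate, every greedy step prefers the heavier-weighted coordinate, so the whole budget lands there.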
One of the core functions of social networks is to spread information. It is natural, then, to ask how we can maximize such spread. Unfortunately, even evaluating the expected spread exactly is #P-hard, so the best known exact methods take exponential time. Nonetheless, in this project we developed an efficient, approximately optimal algorithm to solve this problem.
We used a sampling-based approach to estimate the optimization surface efficiently. In fact, the number of samples required ended up being so low that we switched from the traditional greedy algorithm to an Integer Programming approach, improving our approximation guarantee to at least 1 - ε times the optimal. I built the implementations we used: first in C++ with library code from previous projects, and later a full rewrite in Rust along with safe wrappers for the C APIs of the CPLEX and Gurobi solvers to make the implementation obviously correct.
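To give a flavor of the sampling side (a generic sketch, not our actual estimator), here is a minimal Monte Carlo estimate of expected spread under the Independent Cascade model; the graph, probabilities, and the tiny xorshift RNG are toy stand-ins:

```rust
// Toy sketch: Monte Carlo estimation of expected influence spread under
// the Independent Cascade model. Each simulation flips a coin for every
// edge reached and walks outward from the seed set; the estimate is the
// mean number of activated nodes. A small xorshift RNG avoids external
// dependencies.
struct XorShift(u64);
impl XorShift {
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Estimate E[spread] of `seeds` on a directed graph given as adjacency
/// lists of (neighbor, activation probability), using `trials` simulations.
fn estimate_spread(adj: &[Vec<(usize, f64)>], seeds: &[usize], trials: u32) -> f64 {
    let mut rng = XorShift(0x9E3779B97F4A7C15);
    let mut total = 0usize;
    for _ in 0..trials {
        let mut active = vec![false; adj.len()];
        let mut stack: Vec<usize> = seeds.to_vec();
        for &s in seeds { active[s] = true; }
        while let Some(u) = stack.pop() {
            for &(v, p) in &adj[u] {
                if !active[v] && rng.next_f64() < p {
                    active[v] = true;
                    stack.push(v);
                }
            }
        }
        total += active.iter().filter(|&&a| a).count();
    }
    total as f64 / trials as f64
}

fn main() {
    // Tiny 3-node line graph: 0 -> 1 -> 2, each edge live with probability 0.5,
    // so the true expected spread from seed {0} is 1 + 0.5 + 0.25 = 1.75.
    let adj = vec![vec![(1, 0.5)], vec![(2, 0.5)], vec![]];
    let est = estimate_spread(&adj, &[0], 10_000);
    println!("estimated spread: {est:.2}");
}
```

The #P-hardness lives in computing this expectation exactly; sampling sidesteps it at the cost of a controlled estimation error.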
Now-public projects done during my internships at IBM in the summers of 2014 and 2015.
In software engineering, the task of finding and effectively fixing security vulnerabilities is daunting—especially in a codebase with hundreds of thousands or millions of lines of code. Many classes of vulnerabilities are caused by malicious input flowing to unintended parts of the software, making code analysis a powerful tool for identifying possible vulnerabilities.
However, even an automatically constructed list of vulnerabilities still leaves the developer with the task of finding an appropriate fix location. At IBM, I built an interactive visualization tool for the AppScan Source static code analysis product that presented these results as an explorable map. Developers could use it to identify points where one fix could address multiple vulnerabilities in a single change.
IBM has a network traffic management appliance (DataPower) that has historically been focused on use in enterprise networks. As part of their cloud push, they wanted to expand this to include use on their platform-as-a-service offering. My team designed and developed a proof-of-concept system that exposed the full power of the appliance to users of the IBM Cloud (called Bluemix at the time).
A key challenge we faced was the single-tenant design of the DataPower appliance: it assumed that anyone with write-access could define a policy to manage any traffic, which does not match the situation in a multi-tenant cloud system. To address this, we designed a traffic policy description JSON schema that allowed users to define policies that only affected their own application while maintaining most of the power and flexibility of the underlying hardware.
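The schema itself is not public; a hypothetical policy in its spirit—every field name here is invented for illustration—might scope a rule to a single application like this:

```json
{
  "application": "my-app",
  "rules": [
    {
      "match": { "path": "/api/*", "method": "POST" },
      "action": { "type": "rate-limit", "requestsPerMinute": 600 }
    }
  ]
}
```

Scoping each policy to a named application is what lets a multi-tenant system accept policy documents from untrusted users without letting them touch anyone else's traffic.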
We built a user-friendly cloud interface (running Angular.js on the front-end and Node.js on the back-end) for querying traffic patterns from Elasticsearch, checking policies against that traffic pre-deployment, and subsequently deploying those policies to the appliance. This proof-of-concept was deployed to one of the cloud datacenters for demonstration, and parts were later incorporated into the DataPower product under the hood.
Projects done for personal enjoyment, or to solve problems in the communities I am involved in.
As an officer for Occasional Excellence's Weekend raid team in World of Warcraft, one of my responsibilities is to review combat logs mid- and post-raid to identify issues hindering our progress and propose solutions. On longer bosses like Queen Azshara or N'Zoth, the Corruptor, these reviews can become tedious, as the same queries are often repeated over many nights of progression.
To address this, I built a visualization platform in TypeScript to perform two functions: display aggregated results over an entire night of progression (something that WarcraftLogs does poorly), and save and repeat queries automatically on future nights.
Visualizations are written on the client in a grammar of graphics via the Vega-Lite library, allowing me to iterate quickly and identify macro problems in our raid performance. This also allows them to be shared with other users, though at this time I only know of one other person using this tool.
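For a sense of what the grammar-of-graphics approach looks like in practice, a minimal Vega-Lite specification is just a JSON document (the data and field names here are hypothetical, not from the tool):

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": { "values": [ { "pull": 1, "deaths": 3 }, { "pull": 2, "deaths": 1 } ] },
  "mark": "bar",
  "encoding": {
    "x": { "field": "pull", "type": "ordinal" },
    "y": { "field": "deaths", "type": "quantitative" }
  }
}
```

Because a chart is plain data rather than imperative drawing code, specs like this are easy to store, tweak, and share between users.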
One of the challenges in learning to play World of Warcraft is how opaque the causes of your success and failure in-game can be. WoWAnalyzer helps address this by breaking down a player's performance, grading them on a number of metrics and providing concrete suggestions for improvement.
I maintain the Brewmaster Monk and Protection Paladin analysis code, along with many of our testing tools. My goal has been to provide two main pieces of analysis: introductory-level evaluation of performance suitable for new players, and detailed evaluations of item, trinket and talent values suitable for advanced players.
The Brewmaster analysis has been my focus for much of my time helping on WoWAnalyzer, and I've implemented a number of advanced features to support my goals. For example, instead of a fixed threshold for defensive ability use—which may be wildly incorrect on some encounters—my code automatically determines an appropriate threshold from time-varying player stats and random effects outside of player control.
I have been a user of tiling window managers on Linux systems for over a decade, and decided to try my hand at writing one as the Wayland stack matured. A tiling window manager differs from the manual window management of most desktop environments in that it aims to automatically place windows in a structured manner.
The core is implemented in C, building on the wlc compositing library. It exposes the windowing primitives to Guile Scheme, where the main window management logic is implemented. A key feature I implemented was layout composition, which allows layout functions to be composed to create more complex layouts dynamically.
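The real logic lives in Guile Scheme, but the idea of layout composition can be sketched in Rust: treat a layout as a function from a screen rectangle and a window count to a list of window rectangles, then build combinators over those functions. All names here are illustrative, not from the actual code:

```rust
// Illustrative sketch of layout composition: a layout maps a rectangle
// and a window count to per-window rectangles, and a combinator splits
// the screen and delegates each region to a sub-layout.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect { x: u32, y: u32, w: u32, h: u32 }

type Layout = Box<dyn Fn(Rect, usize) -> Vec<Rect>>;

/// Stack `n` windows vertically in equal rows.
fn rows() -> Layout {
    Box::new(|r, n| {
        let h = r.h / n as u32;
        (0..n)
            .map(|i| Rect { x: r.x, y: r.y + i as u32 * h, w: r.w, h })
            .collect()
    })
}

/// Compose two layouts: the first window fills the left half under
/// `left`; the remaining windows share the right half under `right`.
fn split(left: Layout, right: Layout) -> Layout {
    Box::new(move |r, n| {
        if n <= 1 { return left(r, n); }
        let lw = r.w / 2;
        let mut out = left(Rect { w: lw, ..r }, 1);
        out.extend(right(Rect { x: r.x + lw, w: r.w - lw, ..r }, n - 1));
        out
    })
}

fn main() {
    let layout = split(rows(), rows());
    let screen = Rect { x: 0, y: 0, w: 1920, h: 1080 };
    for rect in layout(screen, 3) {
        println!("{rect:?}");
    }
}
```

Because layouts are ordinary first-class functions, combinators like `split` can nest arbitrarily, which is what makes dynamically composed layouts cheap to express.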