Graph-based Semi-Supervised & Active Learning for Edge Flows

We develop a graph-based semi-supervised learning (SSL) method for learning edge flows defined on a graph. Specifically, we are given flow measurements on a subset of edges and want to predict the flows on the remaining edges. To this end, we develop a computational framework that imposes certain constraints on the overall flows, such as (approximate) flow conservation. These constraints render our approach different from classical graph-based SSL for vertex labels, which posits that tightly connected nodes share similar labels and leverages the graph structure accordingly to extrapolate from a few vertex labels to the unlabeled vertices. We derive bounds for our method’s reconstruction error and demonstrate its strong performance on synthetic and real-world flow networks from transportation, physical infrastructure, and the Web. Computationally, our method boils down to a sparse linear least-squares problem, which we can efficiently solve with iterative methods. Furthermore, we provide two active learning algorithms for selecting informative edges on which to measure flow, which has applications for optimal sensor deployment.
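To make the least-squares formulation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the objective is the squared flow divergence at every node plus a small ridge penalty on the unlabeled flows, with the measured flows held fixed. The helper names `incidence_matrix` and `infer_flows` and the parameter `reg` are hypothetical, introduced only for illustration; the sparse solve uses SciPy's `lsqr`, an iterative method.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import lsqr


def incidence_matrix(n_nodes, edges):
    """Node-edge incidence matrix B (hypothetical helper).

    For edge j = (u, v), B[u, j] = -1 and B[v, j] = +1, so (B @ f)[i]
    is the net flow into node i (its divergence).
    """
    rows, cols, vals = [], [], []
    for j, (u, v) in enumerate(edges):
        rows += [u, v]
        cols += [j, j]
        vals += [-1.0, 1.0]
    return csc_matrix((vals, (rows, cols)), shape=(n_nodes, len(edges)))


def infer_flows(n_nodes, edges, labeled, reg=0.1):
    """Fill in unlabeled edge flows by approximate flow conservation.

    labeled: dict mapping edge index -> measured flow (kept fixed).
    reg:     assumed ridge weight on the unlabeled flows.

    Solves  min_x ||B_U x + B_L f_L||^2 + reg^2 ||x||^2,
    a sparse linear least-squares problem in the unlabeled flows x.
    """
    B = incidence_matrix(n_nodes, edges)
    m = len(edges)
    lab_idx = np.array(sorted(labeled), dtype=int)
    unl_idx = np.array([j for j in range(m) if j not in labeled], dtype=int)

    f = np.zeros(m)
    f[lab_idx] = [labeled[j] for j in lab_idx]
    if unl_idx.size == 0:
        return f  # every edge is already measured

    # Divergence contributed by the measured flows, moved to the right-hand side.
    rhs = -(B[:, lab_idx] @ f[lab_idx])
    # lsqr minimizes ||A x - b||^2 + damp^2 ||x||^2.
    f[unl_idx] = lsqr(B[:, unl_idx], rhs, damp=reg)[0]
    return f


if __name__ == "__main__":
    # Directed 4-cycle with a single measured edge carrying flow 2.
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(infer_flows(4, cycle, {0: 2.0}, reg=1e-3))
```

On this 4-cycle, conservation at each node propagates the single measurement around the cycle, so for small `reg` all four inferred flows come out close to 2. A larger `reg` shrinks the unlabeled flows toward zero and trades conservation for a smaller norm.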