Home | Research | Contact | Fun |

### Computer Science

A Market Framework for Eliciting Private Data,
NIPS
2015.
B Waggoner, RM F, J Abernethy.

[ abstract + ]

We propose a mechanism for purchasing information from a sequence of participants. The participants may simply hold data points they wish to sell, or may have more sophisticated information; either way, they are incentivized to participate as long as they believe their data points are representative or their information will improve the mechanism's future prediction on a test set. The mechanism, which draws on the principles of prediction markets, has a bounded budget and minimizes generalization error for Bregman divergence loss functions. We then show how to modify this mechanism to preserve the privacy of participants' information: At any given time, the current prices and predictions of the mechanism reveal almost no information about any one participant, yet in total over all participants, information is accurately aggregated.

On Elicitation Complexity,
NIPS
2015.
RM F, IA Kash.

[ abstract + ]
[ arXiv ]

Elicitation is the study of statistics or properties which are computable via empirical risk minimization. While several recent papers have approached the general question of which properties are elicitable, we suggest that this is the wrong question---all properties are elicitable by first eliciting the entire distribution or data set, and thus the important question is how elicitable. Specifically, what is the minimum number of regression parameters needed to compute the property?

Building on previous work, we introduce a new notion of elicitation complexity and lay the foundations for a calculus of elicitation. We establish several general results and techniques for proving upper and lower bounds on elicitation complexity. These results provide tight bounds for eliciting the Bayes risk of any loss, a large class of properties which includes spectral risk measures and several new properties of interest.

Convergence Analysis of Prediction Markets via Randomized Subspace Descent,
NIPS
2015.
RM F, MD Reid.

[ abstract + ]
[ pdf ]
[ arXiv ]
[ talk ]

Prediction markets are economic mechanisms for aggregating information about future events through sequential interactions with traders. The pricing mechanisms in these markets are known to be related to optimization algorithms in machine learning and through these connections we have some understanding of how equilibrium market prices relate to the beliefs of the traders in a market. However, little is known about rates and guarantees for the convergence of these sequential mechanisms, and two recent papers cite this as an important open question.

In this paper we show how some previously studied prediction market trading models can be understood as a natural generalization of randomized coordinate descent which we call randomized subspace descent (RSD). We establish convergence rates for RSD and leverage them to prove rates for the two prediction market models above, answering the open questions. Our results extend beyond standard centralized markets to arbitrary trade networks.

Vector-Valued Property Elicitation,
COLT
2015.
RM F, IA Kash.

[ abstract + ]
[ pdf ]
[ talk ]

The elicitation of a statistic, or property of a distribution, is the task of devising proper scoring rules, equivalently proper losses, which incentivize an agent or algorithm to truthfully estimate the desired property of the underlying probability distribution.

Exploiting the connection of such elicitation to convex analysis, we address the vector-valued property case, which has received relatively little attention in the literature but is relevant for machine learning applications. We first provide a very general characterization of linear and ratio-of-linear properties, which resolves an open problem by unifying several previous characterizations in machine learning and statistics. We then ask which vectors of properties admit nonseparable scores, which cannot be expressed as a sum of scores for each coordinate separately, a natural desideratum for machine learning. We show that linear and ratio-of-linear do admit nonseparable scores, and provide evidence for the conjecture that these are the only such properties (up to link functions). Finally, we provide a general method for producing identification functions and address an open problem by showing that convex maximal level sets are insufficient for elicitability in general.

Generalized Mixability via Entropic Duality,
COLT
2015.
MD Reid, RM F, RC Williamson, N Mehta.

[ abstract + ]
[ arXiv ]

Mixability is a property of a loss which characterizes when fast convergence is possible in the game of prediction with expert advice. We show that a key property of mixability generalizes, and the exp and log operations present in the usual theory are not as special as one might have thought. In doing this we introduce a more general notion of Φ-mixability where Φ is a general entropy (i.e., any convex function on probabilities). We show how a property shared by the convex dual of any such entropy yields a natural algorithm (the minimizer of a regret bound) which, analogous to the classical aggregating algorithm, is guaranteed a constant regret when used with Φ-mixable losses. We characterize precisely which Φ have Φ-mixable losses and put forward a number of conjectures about the optimality and relationships between different choices of entropy.

Elicitation for Aggregation,
AAAI
2015.
RM F, Y Chen, IA Kash.

[ abstract + ]
[ pdf ]
[ arXiv ]

We study the problem of eliciting and aggregating probabilistic information from multiple agents. In order to successfully aggregate the predictions of agents, the principal needs to elicit some notion of confidence from agents, capturing how much experience or knowledge led to their predictions. To formalize this, we consider a principal who wishes to elicit predictions about a random variable from a group of Bayesian agents, each of whom have privately observed some independent samples of the random variable, and hopes to aggregate the predictions as if she had directly observed the samples of all agents. Leveraging techniques from Bayesian statistics, we represent confidence as the number of samples an agent has observed, which is quantified by a hyperparameter from a conjugate family of prior distributions. This then allows us to show that if the principal has access to a few samples, she can achieve her aggregation goal by eliciting predictions from agents using proper scoring rules. In particular, if she has access to one sample, she can successfully aggregate the agents' predictions if and only if every posterior predictive distribution corresponds to a unique value of the hyperparameter. Furthermore, this uniqueness holds for many common distributions of interest. When this uniqueness property does not hold, we construct a novel and intuitive mechanism where a principal with two samples can elicit and optimally aggregate the agents' predictions.

On Risk Measures, Market Making, and Exponential Families,
SIGecom Exchanges
2014.
J Abernethy, RM F, S Kutty.

[ abstract + ]
[ link ]

In this note we elaborate on an emerging connection between three areas of research: (a) the concept of a risk measure developed within financial mathematics for reasoning about risk attitudes of agents under uncertainty, (b) the design of automated market makers for prediction markets, and (c) the family of probability distributions known as exponential families.

General Truthfulness Characterizations via Convex Analysis,
WINE
2014.
RM F, IA Kash.

[ abstract + ]
[ slides ]
[ pdf ]
[ arXiv ]
[ link ]
[ poster ]

We present a model of truthful elicitation which generalizes and extends mechanisms, scoring rules, and a number of related settings that do not quite qualify as one or the other. Our main result is a characterization theorem, yielding characterizations for all of these settings, including a new characterization of scoring rules for non-convex sets of distributions. We generalize this model to eliciting some property of the agent's private information, and provide the first general characterization for this setting. We also show how this yields a new proof of a result in mechanism design due to Saks and Yu.

Randomized Subspace Descent,
OPT
2014.
RM F, MD Reid.

[ abstract + ]
[ link ]

We develop a generalization of randomized coordinate descent for smooth convex problems, where the coordinates specify arbitrary subspaces, and derive standard O(1/epsilon) and O(1/log epsilon) rates. For the special case of overlapping 2-block subspaces (i.e. graphs), which has received attention in the literature recently, we derive a convergence rate on a given graph in terms of its algebraic connectivity. Using this connection, we introduce bounds for graph topologies not previously considered. We conclude with preliminary progress toward the interesting open question: what is the best network structure for a given optimization problem?

Market Making with Decreasing Utility for Information,
UAI
2014.
M Dudik, RM F, JW Vaughan.

[ abstract + ]
[ arXiv ]
[ link ]

We study information elicitation in cost-function-based combinatorial prediction markets when the market maker’s utility for information decreases over time. In the sudden revelation setting, it is known that some piece of information will be revealed to traders, and the market maker wishes to prevent guaranteed profits for trading on the sure information. In the gradual decrease setting, the market maker’s utility for (partial) information decreases continuously over time. We design adaptive cost functions for both settings which: (1) preserve the information previously gathered in the market; (2) eliminate (or diminish) rewards to traders for the publicly revealed information; (3) leave the reward structure unaffected for other information; and (4) maintain the market maker’s worst-case loss. Our constructions utilize mixed Bregman divergence, which matches our notion of utility for information.

A General Volume-Parameterized Market Making Framework,
EC
2014.
J Abernethy, RM F, JW Vaughan, X Li.

[ abstract + ]
[ pdf ]
[ link ]

We introduce a framework for automated market making for prediction markets, the volume parameterized market (VPM), in which securities are priced based on the market maker's current liabilities as well as the total volume of trade in the market. We provide a set of mathematical tools that can be used to analyze markets in this framework, and show that many existing market makers (including cost-function based markets, profit-charging markets, and buy-only markets) all fall into this framework as special cases. Using the framework, we design a new market maker, the perspective market, that satisfies four desirable properties (worst-case loss, no arbitrage, increasing liquidity, and shrinking spread) in the complex market setting, but fails to satisfy information incorporation. However, we show that the sacrifice of information incorporation is unavoidable: we prove an impossibility result showing that any market maker that prices securities based only on the trade history cannot satisfy all five properties simultaneously. Instead, we show that perspective markets may satisfy a weaker notion that we call center-price information incorporation.

Convex Foundations for Generalized MaxEnt Models,
MaxEnt
2013.
RM F, MD Reid.

[ abstract + ]
[ slides ]
[ pdf ]

We present an approach to maximum entropy models that highlights the convex geometry and duality of Generalized Exponential Families (GEFs) and their connection to Bregman divergences. Using our framework, we are able to resolve a puzzling aspect of the bijection of Banerjee et al. (2005) between classical exponential families and what they call regular Bregman divergences. Their regularity condition rules out all but Bregman divergences generated from log-convex generators. We recover their bijection and show that a much broader class of divergences correspond to GEFs via two key observations: 1) Like classical exponential families, GEFs have a ``cumulant’’ C whose subdifferential contains the mean; 2) Generalized relative entropy is a C-Bregman divergence between parameters, which becomes the KL divergence for Shannon entropy. We also show that every incomplete market with cost function C (see Abernethy et al. (2011)) can be expressed as a complete market, where the prices are constrained to be a GEF with cumulant C. This provides an entirely new interpretation of prediction markets, relating their design back to the principle of maximum entropy.

How to Hedge an Option Against an Adversary: Black-Scholes Pricing is Minimax Optimal,
NIPS
2013.
J Abernethy, P Bartlett, RM F, A Wibisono.

[ abstract + ]
[ pdf ]
[ link ]

We consider a popular problem in finance, option pricing, through the lens of an online learning game between Nature and an Investor. In the Black-Scholes option pricing model from 1973, the Investor can continuously hedge the risk of an option by trading the underlying asset, assuming that the asset's price fluctuates according to Geometric Brownian Motion (GBM). We consider a worst-case model, in which Nature chooses a sequence of price fluctuations under a cumulative quadratic volatility constraint, and the Investor can make a sequence of hedging decisions. Our main result is to show that the value of our proposed game, which is the "regret" of the hedging strategy, converges to the Black-Scholes option price. We use significantly weaker assumptions than previous work---for instance, we allow large jumps in the asset price---and show that the Black-Scholes hedging strategy is near-optimal for the Investor even in this non-stochastic framework.

Eliciting Private Information from Selfish Agents,
Ph.D. Thesis, UC Berkeley
2013.
RM F.

[ abstract + ]
[ printable ]
[ compact ]

Ever since the Internet opened the floodgates to millions of users, each looking after their own interests, modern algorithm design has needed to be increasingly robust to strategic manipulation. Often, it is the inputs to these algorithms which are provided by strategic agents, who may lie for their own benefit, necessitating the design of algorithms which incentivize the truthful revelation of private information -- but how can this be done? This is a fundamental question with answers from many disciplines, from mechanism design to scoring rules and prediction markets. Each domain has its own model with its own assumptions, yet all seek schemes to gather private information from selfish agents, by leveraging incentives. Together, we refer to such models as elicitation.

This dissertation unifies and advances the theory of incentivized information elicitation. Using tools from convex analysis, we introduce a new model of elicitation with a matching characterization theorem which together encompass mechanism design, scoring rules, prediction markets, and other models. This lays a firm foundation on which the rest of the dissertation is built.

It is natural to consider a setting where agents report some alternate representation of their private information, called a property, rather than reporting it directly. We extend our model and characterization to this setting, revealing even deeper ties to convex analysis and convex duality, and we use these connections to prove new results for linear, smooth nonlinear, and finite-valued properties. Exploring the linear case further, we show a new four-fold equivalence between scoring rules, prediction markets, Bregman divergences, and generalized exponential families.

Applied to mechanism design, our framework offers a new perspective. By focusing on the (convex) consumer surplus function, we simplify a number of existing results, from the classic revenue equivalence theorem, to more recent characterizations of mechanism implementability.

Finally, we follow a line of research on the interpretation of prediction markets, relating a new stochastic framework to the classic Walrasian equilibrium and to stochastic mirror descent, thereby strengthening ties between prediction markets and machine learning.

Parallel Boosting with Momentum,
ECML
2013.
K Canini, RM F, I Mukherjee, Y Singer.

[ abstract + ]
[ link ]

We describe a new, simplifed, and general analysis of a fusion of Nesterov's accelerated gradient with parallel coordinate descent. The resulting algorithm, which we call BOOM, for boosting with momentum, enjoys the merits of both techniques. Namely, BOOM retains the momentum and convergence properties of the accelerated gradient method while taking into account the curvature of the objective function. We describe an distributed implementation of BOOM which is suitable for massive high dimensional datasets. We show experimentally that BOOM is especially effective in large scale learning problems with rare yet informative features.

Interpreting prediction markets: a stochastic approach,
NIPS
2012.
RM F, N Della Penna, MD Reid.

[ abstract + ]
[ slides ]
[ pdf ]
[ link ]

We strengthen recent connections between prediction markets and learning by showing that a natural class of market makers can be understood as performing stochastic mirror descent when trader demands are sequentially drawn from a fixed distribution. This provides new insights into how market prices (and price paths) may be interpreted as a summary of the market's belief distribution by relating them to the optimization problem being solved. In particular, we show that under certain conditions the stationary point of the stochastic process of prices generated by the market is equal to the market's Walrasian equilibrium of classic market analysis. Together, these results suggest how traditional market making mechanisms might be replaced with general purpose learning algorithms while still retaining guarantees about their behaviour.

A Characterization of Scoring Rules for Linear Properties,
COLT
2012.
J Abernethy, RM F.

[ abstract + ]
[ slides ]
[ pdf ]
[ link ]

We consider the design of

proper scoring rules, equivalentlyproper losses, when the goal is to elicit some function, known as aproperty, of the underlying distribution. We provide a full characterization of the class of proper scoring rules when the property is linear as a function of the input distribution. A key conclusion is that any such scoring rule can be written in the form of aBregman divergencefor some convex function. We also apply our results to the design of prediction market mechanisms, showing a strong equivalence between scoring rules for linear properties and automated prediction market makers.

Minimax Option Pricing Meets Black-Scholes in the Limit,
STOC
2012.
J Abernethy, RM F, A Wibisono.

[ abstract + ]
[ slides ]
[ pdf ]
[ link ]
[ poster ]

Option contracts are a type of financial derivative that allow investors to hedge risk and speculate on the variation of an asset's future market price. In short, an option has a particular payout that is based on the market price for an asset on a given date in the future. In 1973, Black and Scholes proposed a valuation model for options that essentially estimates the tail risk of the asset price under the assumption that the price will fluctuate according to geometric Brownian motion. More recently, DeMarzo et al., among others, have proposed more robust valuation schemes, where we can even assume an adversary chooses the price fluctuations. This framework can be considered as a sequential two-player zero-sum game between the investor and Nature. We analyze the value of this game in the limit, where the investor can trade at smaller and smaller time intervals. Under weak assumptions on the actions of Nature (an adversary), we show that the minimax option price asymptotically approaches exactly the Black-Scholes valuation. The key piece of our analysis is showing that Nature's minimax optimal dual strategy converges to geometric Brownian motion in the limit.

Social Learning in a Changing World,
WINE
2011.
RM F, G Schoenebeck, O Tamuz.

[ abstract + ]
[ pdf ]
[ arXiv ]
[ link ]

We study a model of learning on social networks in dynamic environments, describing a group of agents who are each trying to estimate an underlying state that varies over time, given access to weak signals and the estimates of their social network neighbors.

We study three models of agent behavior. In the "fixed response" model agents use a fixed linear combination to incorporate information from their peers into their own estimate. This can be thought of as an extension of the DeGroot model to a dynamic setting. In the "best response" model players calculate minimum variance linear estimators of the underlying state.

We show that regardless of the initial configuration, fixed response dynamics converge to a steady state, and that the same holds for best response on the complete graph. We show that best response dynamics can, in the long term, lead to estimators with higher variance than is achievable using well chosen fixed responses.

The "penultimate prediction" model is an elaboration of the best response model. While this model only slightly complicates the computations required of the agents, we show that in some cases it greatly increases the efficiency of learning, and on the complete graphs is in fact optimal, in a strong sense.

A Collaborative Mechanism for Crowdsourcing Prediction Problems,
NIPS
2011.
J Abernethy, RM F.

[ abstract + ]
[ pdf ]
[ arXiv ]
[ link ]
[ poster ]

Machine Learning competitions such as the Netflix Prize have proven reasonably successful as a method of "crowdsourcing" prediction tasks. But these competitions have a number of weaknesses, particularly in the incentive structure they create for the participants. We propose a new approach, called a Crowdsourced Learning Mechanism, in which participants collaboratively "learn" a hypothesis for a given prediction task. The approach draws heavily from the concept of a prediction market, where traders bet on the likelihood of a future event. In our framework, the mechanism continues to publish the current hypothesis, and participants can modify this hypothesis by wagering on an update. The critical incentive property is that a participant profits exactly how much her update improves the performance of the hypothesis on a released test set.

On learning algorithms for nash equilibria,
Algorithmic Game Theory
Springer 2010.
C Daskalakis, RM F, CH Papadimitriou, G Pierrakos, G Valiant.

[ abstract + ]
[ pdf ]
[ link ]

Can learning algorithms find a Nash equilibrium? This is a natural question for several reasons. Learning algorithms resemble the behavior of players in many naturally arising games, and thus results on the convergence or non-convergence properties of such dynamics may inform our understanding of the applicability of Nash equilibria as a plausible solution concept in some settings. A second reason for asking this question is in the hope of being able to prove an impossibility result, not dependent on complexity assumptions, for computing Nash equilibria via a restricted class of reasonable algorithms. In this work, we begin to answer this question by considering the dynamics of the standard multiplicative weights update learning algorithms (which are known to converge to a Nash equilibrium for zero-sum games). We revisit a 3x3 game defined by Shapley in the 1950s in order to establish that fictitious play does not converge in general games. For this simple game, we show via a potential function argument that in a variety of settings the multiplicative updates algorithm impressively fails to find the unique Nash equilibrium, in that the cumulative distributions of players produced by learning dynamics actually drift away from the equilibrium.

Online hypergraph matching: hiring teams of secretaries,
Masters Thesis, Cornell University
2008.
RM F, advised by R. Kleinberg.

[ abstract + ]
[ pdf ]

The goal of this paper is to find a competitive algorithm for the following problem. We are given a hypergraph G = (X, E), |E| = n, with weighted edges and maximum edge size k. We wish to maximize the weight of a matching M ⊆ E of G, given that edges are revealed in a random order, and that when an edge e is revealed the algorithm must decide irrevocably whether or not e ∈ M. Note that this is a generalization of the classic secretary problem.

### Mathematics

On the Hardness of State Amalgamation,
In preparation.
RM F.

[ abstract + ]

It is well-known that any strong shift equivalence between subshifts of finite type can be expressed as a sequence of symbol splittings followed by a sequence of symbol amalgamations. We explore the simpler question of when a strong shift equivalence can be obtained using only amalgamations. Specifically, we consider the problem of determining the largest number of consecutive amalgamations of a given vertex shift, and show that it is NP-hard by reduction from the hitting set problem.

Topological entropy bounds for hyperbolic plateaus of the Hénon map,
SIAM Undergraduate Research Online
2014.
RM F.

[ abstract + ]
[ pdf ]
[ arXiv ]
[ link ]

Combining two existing rigorous computational methods, for verifying hyperbolicity [Arai, 2007] and for computing topological entropy bounds [Day et al., 2008], we prove lower bounds on topological entropy for 43 hyperbolic plateaus of the Hénon map. We also examine the 16 area-preserving plateaus studied by Arai and compare our results with related work. Along the way, we augment the entropy algorithms of Day et al. with routines to optimize the algorithmic parameters and simplify the resulting semi-conjugate subshift.

Efficient automation of index pairs in computational Conley index theory,
SIAM Journal on Applied Dynamical Systems
11:82-109, 2012.
RM F, R Treviño.

[ abstract + ]
[ pdf ]
[ arXiv ]
[ link ]

We present new methods of automating the construction of index pairs, essential ingredients of discrete Conley index theory. These new algorithms are further steps in the direction of automating computer-assisted proofs of semi-conjugacies from a map on a manifold to a subshift of finite type. We apply these new algorithms to the standard map at different values of the perturbative parameter

and obtain rigorous lower bounds for its topological entropy forεin [.7, 2].ε

Algorithms for rigorous entropy bounds and symbolic dynamics,
SIAM Journal on Applied Dynamical Systems
7:1477-1506, 2008.
S Day, RM F, R Treviño.

[ abstract + ]
[ slides ]
[ pdf ]
[ link ]

The aim of this paper is to introduce a method for computing rigorous lower bounds for topological entropy. The topological entropy of a dynamical system measures the number of trajectories that separate in finite time and quantifies the complexity of the system. Our method relies on extending existing computational Conley index techniques for constructing semiconjugate symbolic dynamical systems. Besides offering a description of the dynamics, the constructed symbol system allows for the computation of a lower bound for the topological entropy of the original system. Our overall goal is to construct symbolic dynamics that yield a high lower bound for entropy. The method described in this paper is algorithmic and, although it is computational, yields mathematically rigorous results. For illustration, we apply the method to the Hénon map, where we compute a rigorous lower bound of 0.4320 for topological entropy.

Symmetric fractal trees in three dimensions,
Chaos Solitons Fractals
Elsevier 32(2):284-295, 2007.
RM F, E Lock, DA Brown.

[ abstract + ]
[ pdf ]
[ link ]

In this paper, we classify and describe a method for constructing fractal trees in three dimensions. We explore certain aspects of these trees, such as space-filling and self-contact.