Scholarly Works (99 results)

Thesis
Peer Reviewed

Towards Socially and Economically Beneficial Machine Learning

Guo, Wenshuo
Advisor(s): Jordan, Michael

UC Berkeley Electronic Theses and Dissertations (2022)

From digital platforms and automated transportation to healthcare, the rapid deployment of machine learning has in many ways changed our everyday life. However, when learning systems are deployed in the real world, they immediately face a complex social and economic context which poses feasibility constraints, drives the underlying dynamics, and influences the kinds of data that the systems can actually obtain. Optimizing a single offline objective in isolation from these contexts can lead to severe unintended consequences at deployment and hinder the improvement of social welfare that the system has the potential to bring.

In this thesis, I summarize my research on developing learning algorithms that incorporate such social and economic contexts into the design from three aspects: (i) Learning with noisy input data; (ii) Learning with bandit-type user feedback; (iii) Learning under causal dynamics. I will situate each of these within particular applications of machine learning to fair classification, resource allocation, and auction and platform design.

Thesis
Peer Reviewed

Learning Beyond the Standard Model (of Data)

Tripuraneni, Nilesh
Advisor(s): Jordan, Michael

UC Berkeley Electronic Theses and Dissertations (2022)

Classically, most machine learning (ML) methodology has made an innocuous modeling assumption: data drawn from both the training and test sets has been independently sampled from a pair of identical distributions with nice properties. Yet, in the situations modern ML methods must confront, deviations from this idealized setting are quickly becoming the norm, not the exception. In this thesis, we address the challenges arising in understanding the often unexpected phenomenology in these settings by developing theory in two areas of interest: transfer learning and robust learning. In particular, we focus on identifying what structural conditions and techniques are needed to permit sample-efficient learning in these new settings, in order to answer questions such as why pretraining is so effective and what the limits of learning are for extremely heavy-tailed distributions.

Article
Peer Reviewed

Attractor dynamics and parallelism in a connectionist sequential machine

Jordan, Michael I.

Proceedings of the Annual Meeting of the Cognitive Science Society, Volume 8 (1986)

Fluent human sequential behavior, such as that observed in speech production, is characterized by a high degree of parallelism, fuzzy boundaries, and insensitivity to perturbations. In this paper, I consider a theoretical treatment of sequential behavior which is based on data from speech production. A network is discussed which is essentially a sequential machine built out of connectionist components. The network relies on distributed representations and a high degree of parallelism at the level of the component processing units. These properties lead to parallelism at the level at which whole output vectors arise, and constraints must be imposed to make the performance of the network more sequential. The sequential trajectories that are realized by the network have dynamic properties that are analogous to those observed in networks with point attractors (Hopfield, 1982): learned trajectories generalize, and attractors such as limit cycles can arise.

Thesis
Peer Reviewed

Parallel Machine Learning Using Concurrency Control

Pan, Xinghao
Advisor(s): Jordan, Michael I

UC Berkeley Electronic Theses and Dissertations (2017)

Many machine learning algorithms iteratively process datapoints and transform global model parameters. It has become increasingly impractical to serially execute such iterative algorithms as processor speeds fail to catch up to the growth in dataset sizes.

To address these problems, the machine learning community has turned to two parallelization strategies: bulk synchronous parallel (BSP) and coordination-free. BSP algorithms partition computational work among workers, with occasional synchronization at global barriers, but have only been applied to ‘embarrassingly parallel’ problems where work is trivially factorizable. Coordination-free algorithms simply allow concurrent processors to execute in parallel, interleaving transformations and possibly introducing inconsistencies. Theoretical analysis is then required to prove that the coordination-free algorithm produces a reasonable approximation to the desired outcome, under assumptions on the problem and system.

In this dissertation, we propose and explore a third approach by applying concurrency control to manage parallel transformations in machine learning algorithms. We identify points of possible interference between parallel iterations by examining the semantics of the serial algorithm. Coordination is then introduced to either avoid or resolve such conflicts, whereas non-conflicting transformations are allowed to execute concurrently. Our parallel algorithms are thus engineered to produce the same exact output as the serial machine learning algorithm, preserving the serial algorithm’s theoretical guarantees of correctness while maximizing concurrency.

We demonstrate the feasibility of our approach to parallelizing a variety of machine learning algorithms, including nonparametric unsupervised learning, graph clustering, discrete optimization, and sparse convex optimization. We theoretically prove and empirically verify that our parallel algorithms produce equivalent output to their serial counterparts. We also theoretically analyze the expected concurrency of our parallel algorithms, and empirically demonstrate their scalability.

Thesis
Peer Reviewed

Incorporating Supervision for Visual Recognition and Segmentation

Shyr, Alex
Advisor(s): Jordan, Michael I

UC Berkeley Electronic Theses and Dissertations (2011)

Unsupervised algorithms which do not make use of labels are commonly found in computer vision and are widely applicable to all problem settings. In the presence of expert-labeled ground truth information, however, these algorithms are not optimal. Altering the unsupervised models to include labels is not always a straightforward modification. In this dissertation, we explore various ways to incorporate human supervision.

We start with the task of visual sequence recognition and demonstrate ways to effectively make use of temporal information. Next, we tackle the problem of scene segmentation and devise a novel framework to discriminatively train a generative hierarchical model with nonparametric Bayesian priors; the methodology can be easily applied to other nonparametric Bayesian models. Finally, we approach the difficult problem of object segmentation and describe how shape priors can be infused into a generative Bayesian segmentation model. We demonstrate the effectiveness of our models and algorithms on datasets which are widely used by the research community and universally regarded as difficult. The dissertation concludes with active avenues for future research.

Thesis
Peer Reviewed

Modeling Events in Time using Cascades of Poisson Processes

Simma, Aleksandr
Advisor(s): Jordan, Michael I

UC Berkeley Electronic Theses and Dissertations (2010)

For many applications, the data of interest can be best thought of as events: entities that occur at a particular moment in time, have features, and may in turn trigger the occurrence of other events. This thesis presents techniques for modeling the temporal dynamics of events by making each event induce an inhomogeneous Poisson process of others following it. The collection of all events observed is taken to be a draw from the superposition of the induced Poisson processes, as well as a baseline process for some of the initial triggers. The magnitude and shape of the induced Poisson processes control the number, timing, and features of the triggered events. We provide techniques for parameterizing these processes and present efficient, scalable techniques for inference.

The framework is then applied to three different domains that demonstrate the power of the approach. First, we consider the problem of identifying dependencies in a computer network through passive observation and provide a technique based on hypothesis testing for accurately discovering interactions between machines. Then, we look at the relationships between Twitter messages about stocks, using the application as a test-bed to experiment with different parameterizations of induced processes. Finally, we apply these tools to build a model of the revision history of Wikipedia, identifying how the community propagates edits from a page to its neighbors and demonstrating the scalability of our approach to very large datasets.
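The generative picture in this abstract (a baseline process plus a Poisson process induced by each event) can be illustrated with a small simulation. This is only a sketch under assumptions that do not come from the thesis: a homogeneous baseline, a Poisson-distributed number of triggered events per parent, exponentially distributed delays, and no event features. The function name and all parameter names are hypothetical.

```python
import numpy as np

def simulate_cascade(base_rate, horizon, mean_children, delay_scale, rng):
    """Draw event times on [0, horizon) from a toy cascade of Poisson processes.

    Assumptions (illustrative only): the baseline is a homogeneous Poisson
    process; each event triggers Poisson(mean_children) later events whose
    delays are exponential with mean delay_scale.
    """
    # Baseline events: Poisson count, then uniform placement on the window.
    n_base = rng.poisson(base_rate * horizon)
    pending = list(rng.uniform(0.0, horizon, size=n_base))
    events = []
    while pending:
        t = pending.pop()
        events.append(t)
        # Process induced by the event at time t: a finite-mass inhomogeneous
        # Poisson process, sampled as a Poisson count plus i.i.d. delays.
        n_children = rng.poisson(mean_children)
        child_times = t + rng.exponential(delay_scale, size=n_children)
        # Keep only triggered events that fall inside the observation window.
        pending.extend(ct for ct in child_times if ct < horizon)
    return np.sort(np.array(events))

rng = np.random.default_rng(0)
events = simulate_cascade(base_rate=2.0, horizon=10.0,
                          mean_children=0.5, delay_scale=1.0, rng=rng)
```

The returned array is the superposition of the baseline events and all triggered events; inference, as described in the abstract, would go in the opposite direction and recover the parameters from such a sequence.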

Thesis
Peer Reviewed

Structure-Driven Algorithm Design in Optimization and Machine Learning

Lin, Tianyi
Advisor(s): Jordan, Michael I

UC Berkeley Electronic Theses and Dissertations (2023)

A textbook property of optimization algorithms is their ability to solve problems under generic regularity conditions. Two examples are the simplex method and the gradient descent (GD) method. However, the performance of these fundamental and general-purpose optimization algorithms is often unsatisfactory; they often run slowly and may return suboptimal solutions in generic settings. In my view, this is the price of their generality; indeed, the generic algorithms are an achievement, but for many problems, the gains from leveraging special structure can be huge. A basic question then arises: how can we harness problem-specific structure within our algorithms to obtain fast, practical algorithms with strong performance guarantees? As more structured data-driven decision-making models emerge, this question has become increasingly pressing and relevant to practitioners.

For example, GD is known to get stuck at suboptimal saddle points in nonconvex optimization. Nonetheless, a line of recent work has shown that random initialization or perturbation changes the dynamics of GD and makes it provably converge to a globally optimal solution. In addition, both Markov decision process (MDP) and discrete optimal transport (OT) problems can be solved using large-scale linear programs. Rather than using generic LP algorithms, policy iteration and the Sinkhorn iteration exploit special structures in MDP and OT and thus perform better in practice. Adapting algorithms to problem-specific structure is generally referred to as structure-driven algorithm design.

Although this line of research, which has been studied extensively for over 70 years, has enjoyed widespread success, the machine-learning success stories have introduced new formulations ripe for deep theoretical analysis and remarkable practical impact. My research pushes this frontier by identifying special structure in reliable machine learning (minimax optimization) and multi-agent machine learning (high-order optimization and beyond) and designing optimal algorithms for computing the appropriately defined optimal solutions, as well as in other structured problems, such as efficient entropic regularized optimal transport, gradient-free nonsmooth nonconvex optimization, and adaptive and doubly optimal learning in games.
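As an illustration of the Sinkhorn iteration named in this abstract, the following is a minimal sketch for entropic-regularized discrete optimal transport. It is a generic textbook form, not code from the thesis; the function name, the regularization value, the fixed iteration count, and the toy 2x2 problem are all assumptions made for the example.

```python
import numpy as np

def sinkhorn(cost, r, c, reg=0.1, n_iters=500):
    """Approximate an entropic-regularized OT plan by alternating scaling.

    cost: (n, m) cost matrix; r: (n,) source marginal; c: (m,) target marginal.
    reg and n_iters are illustrative defaults, not tuned values.
    """
    K = np.exp(-cost / reg)      # Gibbs kernel built from the cost matrix
    u = np.ones_like(r)
    for _ in range(n_iters):
        v = c / (K.T @ u)        # rescale to match the column marginals
        u = r / (K @ v)          # rescale to match the row marginals
    return u[:, None] * K * v[None, :]

# Toy problem: moving mass between two points is cheap only on the diagonal.
cost = np.array([[0.0, 1.0], [1.0, 0.0]])
r = np.array([0.5, 0.5])
c = np.array([0.5, 0.5])
plan = sinkhorn(cost, r, c)
```

Each iteration needs only matrix-vector products with the kernel, which is the kind of structure a generic LP solver applied to the same problem would not exploit.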

Thesis
Peer Reviewed

Statistical models for analyzing human genetic variation

Sankararaman, Sriram
Advisor(s): Jordan, Michael I

UC Berkeley Electronic Theses and Dissertations (2010)

Advances in sequencing and genomic technologies are providing new opportunities to understand the genetic basis of phenotypes such as diseases. Translating the large volumes of heterogeneous, often noisy, data into biological insights presents challenging problems of statistical inference. In this thesis, we focus on three important statistical problems that arise in our efforts to understand the genetic basis of phenotypic variation in humans.

At the molecular level, we focus on the problem of identifying the amino acid residues in a protein that are important for its function. Identifying functional residues is essential to understanding the effect of genetic variation on protein function as well as to understanding protein function itself. We propose computational methods that predict functional residues using evolutionary information as well as a combination of evolutionary and structural information. We demonstrate that these methods can accurately predict catalytic residues in enzymes. Case studies on well-studied enzymes show that these methods can be useful in guiding future experiments.

At the population level, discovering the link between genetic and phenotypic variation requires an understanding of the genetic structure of human populations. A common form of population structure is that found in admixed groups formed by the intermixing of several ancestral populations, such as African-Americans and Latinos. We describe a Bayesian hidden Markov model of admixture and propose efficient algorithms to infer the fine-scale structure of admixed populations. We show that the fine-scale structure of these populations can be inferred even when the ancestral populations are unknown or extinct. Further, the inference algorithm can run efficiently on genome-scale datasets. This model is well-suited to estimate other parameters of biological interest such as the allele frequencies of ancestral populations which can be used, in turn, to reconstruct extinct populations.

Finally, we address the problem of sharing genomic data while preserving the privacy of individual participants. We analyze the problem of detecting an individual genotype from the summary statistics of single nucleotide polymorphisms (SNPs) released in a study. We derive upper bounds on the power of detection as a function of the study size, number of exposed SNPs and the false positive rate, thereby providing guidelines as to which set of SNPs can be safely exposed.

Thesis
Peer Reviewed

The Dynamics of Recommender Systems

Krauth, Karl M
Advisor(s): Jordan, Michael

UC Berkeley Electronic Theses and Dissertations (2022)

Over the past three decades, the reach of recommender systems has grown exponentially. Today, recommender systems are deployed on all major internet platforms, influencing our opinions, decisions, careers, and relationships. However, despite their far-reaching impact, these algorithms and their consequences are still poorly understood. In this thesis, we argue that this is due to the challenging dynamics of the recommendation problem. We outline four problems that distinguish the dynamics of recommendation from other dynamical systems, making them particularly hard to reason about: (1) direct measurement and experimentation are often infeasible, (2) feedback effects make it difficult to reason about cause and effect, (3) the scale of internet platforms requires increased algorithmic complexity, and (4) incentives created by recommender systems cause users to behave strategically. We build the foundations necessary to understand and remedy these four problems, paving the way for a complete understanding of the dynamics of recommender systems and their consequences.

Article
Peer Reviewed

Learning in Multi-Stage Decentralized Matching Markets

UCLA Previously Published Works (2021)

Creative Commons 'BY' version 4.0 license