
M. Ramgraber, R. Baptista, D. McLaughlin and Y. M. Marzouk, Ensemble transport smoothing. Part 2: nonlinear updates, Preprint, (2022).

M. Ramgraber, R. Baptista, D. McLaughlin and Y. M. Marzouk, Ensemble transport smoothing. Part 1: unified framework, Preprint, (2022).

S. Wang and Y. M. Marzouk, On minimax density estimation via measure transport, Preprint, (2022).

X. Zhang, J. Blanchet, Y. M. Marzouk, V. A. Nguyen and S. Wang, Distributionally robust Gaussian process regression and Bayesian inverse problems, Preprint, (2022).

R. Baptista, Y. M. Marzouk and O. Zahm, Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective, Preprint, (2022).

R. Baptista, Y. M. Marzouk and O. Zahm, On the representation and learning of monotone triangular transport maps, Preprint, (2022).

R. Baptista, L. Cao, J. Chen, O. Ghattas, F. Li, Y. M. Marzouk and J. T. Oden, Bayesian model calibration for block copolymer self-assembly: Likelihood-free inference and expected information gain computation via measure transport, Preprint, (2022).

B. Zhang, T. Sahai and Y. M. Marzouk, Computing eigenfunctions of the multidimensional Ornstein-Uhlenbeck operator, Preprint, (2021).

A. Scarinci, M. Fehler and Y. M. Marzouk, Bayesian inference under model misspecification using transport-Lagrangian distances: an application to seismic inversion, Preprint, (2021).

R. Baptista, Y. M. Marzouk, R. Morrison and O. Zahm, Learning non-Gaussian graphical models via Hessian scores and triangular transport, Preprint, (2021).

N. Kovachki, R. Baptista, B. Hosseini and Y. M. Marzouk, Conditional sampling with monotone GANs, Preprint, (2020).

C. Feng and Y. M. Marzouk, A layered multiple importance sampling scheme for focused optimal Bayesian experimental design, Preprint, (2019).

F. Augustin and Y. M. Marzouk, A trust region method for derivative-free nonlinear constrained stochastic optimization, Preprint, (2017).

X. Huan and Y. M. Marzouk, Sequential Bayesian optimal experimental design via approximate dynamic programming, Preprint, (2016).

N. Lowry, R. Mangoubi, M. Desai, Y. M. Marzouk and P. Sammak, Bayesian level sets for image segmentation, Preprint, (2015).

F. Augustin and Y. M. Marzouk, NOWPAC: A provably convergent derivative-free nonlinear optimizer with path-augmented constraints, Preprint, (2014).
Ensemble transport smoothing. Part 2: nonlinear updates
Smoothing is a specialized form of Bayesian inference for state-space models that characterizes the posterior distribution of a collection of states given an associated sequence of observations. Our companion manuscript proposes a general framework for transport-based ensemble smoothing, which includes linear Kalman-type smoothers as special cases. Here, we build on this foundation to realize and demonstrate nonlinear backward ensemble transport smoothers. We discuss parameterization and regularization of the associated transport maps, and then examine the performance of these smoothers for nonlinear and chaotic dynamical systems that exhibit non-Gaussian behavior. In these settings, our nonlinear transport smoothers yield lower estimation error than conventional linear smoothers and state-of-the-art iterative ensemble Kalman smoothers, for comparable numbers of model evaluations.
BibTeX entry
@article{ramgraber-smoothingparttwo-2022, author = "M Ramgraber and R Baptista and D McLaughlin and Y M Marzouk", journal = "Preprint", title = "Ensemble transport smoothing. Part 2: nonlinear updates", year = "2022", abstract = "Smoothing is a specialized form of Bayesian inference for state-space models that characterizes the posterior distribution of a collection of states given an associated sequence of observations. Our companion manuscript proposes a general framework for transport-based ensemble smoothing, which includes linear Kalman-type smoothers as special cases. Here, we build on this foundation to realize and demonstrate nonlinear backward ensemble transport smoothers. We discuss parameterization and regularization of the associated transport maps, and then examine the performance of these smoothers for nonlinear and chaotic dynamical systems that exhibit non-Gaussian behavior. In these settings, our nonlinear transport smoothers yield lower estimation error than conventional linear smoothers and state-of-the-art iterative ensemble Kalman smoothers, for comparable numbers of model evaluations.", keywords = "Data assimilation, smoothing, ensemble methods, triangular transport." }
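The linear Kalman-type smoothers that this framework recovers as special cases apply an affine backward update built from ensemble sample covariances. A minimal numpy sketch of that affine backward pass, as our own illustration of the special case (function and variable names are ours; the paper's nonlinear transport maps generalize this update):

```python
import numpy as np

def linear_backward_update(Xf_t, Xf_tp1, Xs_tp1):
    """Affine (Kalman-type) backward smoothing update for an ensemble.

    Xf_t   : (n, d) filtering ensemble at time t
    Xf_tp1 : (n, d) one-step forecast ensemble at time t+1
    Xs_tp1 : (n, d) smoothed ensemble at time t+1
    Returns the smoothed ensemble at time t.
    """
    n = Xf_t.shape[0]
    At = Xf_t - Xf_t.mean(axis=0)            # centered anomalies at t
    Atp1 = Xf_tp1 - Xf_tp1.mean(axis=0)      # centered anomalies at t+1
    C_cross = At.T @ Atp1 / (n - 1)          # sample cov(x_t, x_{t+1})
    C_tp1 = Atp1.T @ Atp1 / (n - 1)          # sample cov(x_{t+1})
    K = C_cross @ np.linalg.inv(C_tp1)       # backward gain
    return Xf_t + (Xs_tp1 - Xf_tp1) @ K.T
```

When the smoothed ensemble at t+1 coincides with the forecast ensemble, the correction vanishes and the filtering ensemble at t is returned unchanged, as expected.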
Ensemble transport smoothing. Part 1: unified framework
Smoothers are algorithms for Bayesian time series re-analysis. Most operational smoothers rely either on affine Kalman-type transformations or on sequential importance sampling. These strategies occupy opposite ends of a spectrum that trades computational efficiency and scalability for statistical generality and consistency: non-Gaussianity renders affine Kalman updates inconsistent with the true Bayesian solution, while the ensemble size required for successful importance sampling can be prohibitive. This paper revisits the smoothing problem from the perspective of measure transport, which offers the prospect of consistent prior-to-posterior transformations for Bayesian inference. We leverage this capacity by proposing a general ensemble framework for transport-based smoothing. Within this framework, we derive a comprehensive set of smoothing recursions based on nonlinear transport maps and detail how they exploit the structure of state-space models in fully non-Gaussian settings. We also describe how many standard Kalman-type smoothing algorithms emerge as special cases of our framework. A companion paper explores the implementation of nonlinear ensemble transport smoothers in greater depth.
BibTeX entry
@article{ramgraber-smoothingpartone-2022, author = "M Ramgraber and R Baptista and D McLaughlin and Y M Marzouk", journal = "Preprint", title = "Ensemble transport smoothing. Part 1: unified framework", year = "2022", abstract = "Smoothers are algorithms for Bayesian time series re-analysis. Most operational smoothers rely either on affine Kalman-type transformations or on sequential importance sampling. These strategies occupy opposite ends of a spectrum that trades computational efficiency and scalability for statistical generality and consistency: non-Gaussianity renders affine Kalman updates inconsistent with the true Bayesian solution, while the ensemble size required for successful importance sampling can be prohibitive. This paper revisits the smoothing problem from the perspective of measure transport, which offers the prospect of consistent prior-to-posterior transformations for Bayesian inference. We leverage this capacity by proposing a general ensemble framework for transport-based smoothing. Within this framework, we derive a comprehensive set of smoothing recursions based on nonlinear transport maps and detail how they exploit the structure of state-space models in fully non-Gaussian settings. We also describe how many standard Kalman-type smoothing algorithms emerge as special cases of our framework. A companion paper explores the implementation of nonlinear ensemble transport smoothers in greater depth.", keywords = "Data assimilation, smoothing, ensemble methods, triangular transport" }
On minimax density estimation via measure transport
We study the convergence properties, in Hellinger and related distances, of nonparametric density estimators based on measure transport. These estimators represent the measure of interest as the pushforward of a chosen reference distribution under a transport map, where the map is chosen via a maximum likelihood objective (equivalently, minimizing an empirical Kullback-Leibler loss) or a penalized version thereof. We establish concentration inequalities for a general class of penalized measure transport estimators, by combining techniques from M-estimation with analytical properties of the transport-based density representation. We then demonstrate the implications of our theory for the case of triangular Knothe-Rosenblatt (KR) transports on the $d$-dimensional unit cube, and show that both penalized and unpenalized versions of such estimators achieve minimax optimal convergence rates over Hölder classes of densities. Specifically, we establish optimal rates for unpenalized nonparametric maximum likelihood estimation over bounded Hölder-type balls, and then for certain Sobolev-penalized estimators and sieved wavelet estimators.
BibTeX entry
@article{wang-tde-2022, author = "S Wang and Y M Marzouk", journal = "Preprint", title = "On minimax density estimation via measure transport", year = "2022", abstract = "We study the convergence properties, in Hellinger and related distances, of nonparametric density estimators based on measure transport. These estimators represent the measure of interest as the pushforward of a chosen reference distribution under a transport map, where the map is chosen via a maximum likelihood objective (equivalently, minimizing an empirical Kullback-Leibler loss) or a penalized version thereof. We establish concentration inequalities for a general class of penalized measure transport estimators, by combining techniques from M-estimation with analytical properties of the transport-based density representation. We then demonstrate the implications of our theory for the case of triangular Knothe-Rosenblatt (KR) transports on the $d$-dimensional unit cube, and show that both penalized and unpenalized versions of such estimators achieve minimax optimal convergence rates over Hölder classes of densities. Specifically, we establish optimal rates for unpenalized nonparametric maximum likelihood estimation over bounded Hölder-type balls, and then for certain Sobolev-penalized estimators and sieved wavelet estimators." }
Distributionally robust Gaussian process regression and Bayesian inverse problems
We study a distributionally robust optimization formulation (i.e., a min-max game) for two representative problems in Bayesian nonparametric estimation: Gaussian process regression and, more generally, linear inverse problems. Our formulation seeks the best mean-squared error predictor, in an infinite-dimensional space, against an adversary who chooses the worst-case model in a Wasserstein ball around a nominal infinite-dimensional Bayesian model. The transport cost is chosen to control features such as the degree of roughness of the sample paths that the adversary is allowed to inject. We show that the game has a well-defined value (i.e., strong duality holds in the sense that max-min equals min-max) and that there exists a unique Nash equilibrium which can be computed by a sequence of finite-dimensional approximations. Crucially, the worst-case distribution is itself Gaussian. We explore properties of the Nash equilibrium and the effects of hyperparameters through a set of numerical experiments, demonstrating the versatility of our modeling framework.
BibTeX entry
@article{zhang-dro-2022, author = "X Zhang and J Blanchet and Y M Marzouk and V A Nguyen and S Wang", journal = "Preprint", title = "Distributionally robust Gaussian process regression and Bayesian inverse problems", year = "2022", abstract = "We study a distributionally robust optimization formulation (i.e., a min-max game) for two representative problems in Bayesian nonparametric estimation: Gaussian process regression and, more generally, linear inverse problems. Our formulation seeks the best mean-squared error predictor, in an infinite-dimensional space, against an adversary who chooses the worst-case model in a Wasserstein ball around a nominal infinite-dimensional Bayesian model. The transport cost is chosen to control features such as the degree of roughness of the sample paths that the adversary is allowed to inject. We show that the game has a well-defined value (i.e., strong duality holds in the sense that max-min equals min-max) and that there exists a unique Nash equilibrium which can be computed by a sequence of finite-dimensional approximations. Crucially, the worst-case distribution is itself Gaussian. We explore properties of the Nash equilibrium and the effects of hyperparameters through a set of numerical experiments, demonstrating the versatility of our modeling framework." }
Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective
We consider the problem of reducing the dimensions of parameters and data in non-Gaussian Bayesian inference problems. Our goal is to identify an "informed" subspace of the parameters and an "informative" subspace of the data so that a high-dimensional inference problem can be approximately reformulated in low-to-moderate dimensions, thereby improving the computational efficiency of many inference techniques. To do so, we exploit gradient evaluations of the log-likelihood function. Furthermore, we use an information-theoretic analysis to derive a bound on the posterior error due to parameter and data dimension reduction. This bound relies on logarithmic Sobolev inequalities, and it reveals the appropriate dimensions of the reduced variables. We compare our method with classical dimension reduction techniques, such as principal component analysis and canonical correlation analysis, on applications ranging from mechanics to image processing.
BibTeX entry
@article{dimred2022, author = "R Baptista and Y M Marzouk and O Zahm", journal = "Preprint", title = "Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective", year = "2022", abstract = "We consider the problem of reducing the dimensions of parameters and data in non-Gaussian Bayesian inference problems. Our goal is to identify an ``informed'' subspace of the parameters and an ``informative'' subspace of the data so that a high-dimensional inference problem can be approximately reformulated in low-to-moderate dimensions, thereby improving the computational efficiency of many inference techniques. To do so, we exploit gradient evaluations of the log-likelihood function. Furthermore, we use an information-theoretic analysis to derive a bound on the posterior error due to parameter and data dimension reduction. This bound relies on logarithmic Sobolev inequalities, and it reveals the appropriate dimensions of the reduced variables. We compare our method with classical dimension reduction techniques, such as principal component analysis and canonical correlation analysis, on applications ranging from mechanics to image processing.", keywords = "Bayesian inference, gradient-based dimension reduction, logarithmic Sobolev inequalities, conditional mutual information, low-dimensional subspaces, coordinate selection" }
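In its simplest form, the gradient-based diagnostic behind this approach averages outer products of log-likelihood gradients over prior samples and keeps the leading eigenvectors as the informed parameter subspace. A schematic numpy sketch, with our own simplifications (plain Euclidean form, no prior preconditioning, hypothetical names; the paper works with preconditioned operators and certified error bounds):

```python
import numpy as np

def informed_subspace(grad_loglik, r):
    """Estimate an 'informed' parameter subspace from gradient samples.

    grad_loglik : (N, d) array of log-likelihood gradients at prior samples
    r           : target subspace dimension
    Returns (U_r, eigvals): the r leading eigenvectors of the diagnostic
    matrix H = E[grad grad^T], and its full spectrum (descending).
    """
    H = grad_loglik.T @ grad_loglik / grad_loglik.shape[0]  # diagnostic matrix
    eigvals, U = np.linalg.eigh(H)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]       # flip to descending
    return U[:, order[:r]], eigvals[order]
```

If the likelihood only varies along one direction, the gradients concentrate on that direction and the leading eigenvector recovers it; the spectral decay indicates how many directions are worth keeping.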
On the representation and learning of monotone triangular transport maps
Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps—approximations of the Knothe–Rosenblatt (KR) rearrangement—are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.
BibTeX entry
@article{baptista-atm-2020, author = "R. Baptista and Y. M. Marzouk and O. Zahm", journal = "Preprint", title = "On the representation and learning of monotone triangular transport maps", year = "2022", abstract = "Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps—approximations of the Knothe–Rosenblatt (KR) rearrangement—are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.", keywords = "Transportation of measure, Knothe–Rosenblatt rearrangement, normalizing flows, monotone functions, infinite-dimensional optimization, adaptive approximation, multivariate polynomials, wavelets, density estimation." }
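The "invertible transformation of smooth functions" idea can be illustrated in one dimension: pass the derivative of an arbitrary smooth f through a strictly positive function g and integrate, which guarantees a strictly increasing map no matter what f is. A sketch under our own choices of g (softplus) and quadrature rule:

```python
import numpy as np

def softplus(z):
    # numerically stable log(1 + exp(z)), strictly positive
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0.0)

def monotone_map(f0, df_dx, x, n_quad=257):
    """S(x) = f(0) + int_0^x softplus(f'(t)) dt, via the trapezoid rule.

    Because the integrand is strictly positive, S is strictly increasing
    for ANY smooth f: this rectification is the idea behind representing
    monotone triangular map components through unconstrained functions.
    """
    t = np.linspace(0.0, 1.0, n_quad)[None, :] * x[:, None]   # nodes on [0, x]
    g = softplus(df_dx(t))
    dt = np.diff(t, axis=1)
    return f0 + np.sum(0.5 * (g[:, 1:] + g[:, :-1]) * dt, axis=1)
```

Even a highly non-monotone f such as sin(3x) is rectified into an increasing map, which is what makes the associated learning problem unconstrained.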
Bayesian model calibration for block copolymer self-assembly: Likelihood-free inference and expected information gain computation via measure transport
We consider the Bayesian calibration of models describing the phenomenon of block copolymer (BCP) self-assembly using image data produced by microscopy or X-ray scattering techniques. To account for the random long-range disorder in BCP equilibrium structures, we introduce auxiliary variables to represent this aleatory uncertainty. These variables, however, result in an integrated likelihood for high-dimensional image data that is generally intractable to evaluate. We tackle this challenging Bayesian inference problem using a likelihood-free approach based on measure transport together with the construction of summary statistics for the image data. We also show that expected information gains (EIGs) from the observed data about the model parameters can be computed with no significant additional cost. Lastly, we present a numerical case study based on the Ohta–Kawasaki model for diblock copolymer thin film self-assembly and top-down microscopy characterization. For calibration, we introduce several domain-specific energy- and Fourier-based summary statistics, and quantify their informativeness using EIG. We demonstrate the power of the proposed approach to study the effect of data corruptions and experimental designs on the calibration results.
BibTeX entry
@article{baptista-copoly-2022, author = "R Baptista and L Cao and J Chen and O Ghattas and F Li and Y M Marzouk and J T Oden", journal = "Preprint", title = "Bayesian model calibration for block copolymer self-assembly: Likelihood-free inference and expected information gain computation via measure transport", year = "2022", doi = "10.48550/arXiv.2206.11343", abstract = "We consider the Bayesian calibration of models describing the phenomenon of block copolymer (BCP) self-assembly using image data produced by microscopy or X-ray scattering techniques. To account for the random long-range disorder in BCP equilibrium structures, we introduce auxiliary variables to represent this aleatory uncertainty. These variables, however, result in an integrated likelihood for high-dimensional image data that is generally intractable to evaluate. We tackle this challenging Bayesian inference problem using a likelihood-free approach based on measure transport together with the construction of summary statistics for the image data. We also show that expected information gains (EIGs) from the observed data about the model parameters can be computed with no significant additional cost. Lastly, we present a numerical case study based on the Ohta--Kawasaki model for diblock copolymer thin film self-assembly and top-down microscopy characterization. For calibration, we introduce several domain-specific energy- and Fourier-based summary statistics, and quantify their informativeness using EIG. We demonstrate the power of the proposed approach to study the effect of data corruptions and experimental designs on the calibration results.", keywords = "block copolymers, material self-assembly, uncertainty quantification, likelihood-free inference, summary statistics, measure transport, expected information gain, Ohta–Kawasaki model" }
Computing eigenfunctions of the multidimensional Ornstein-Uhlenbeck operator
We discuss approaches to computing eigenfunctions of the Ornstein–Uhlenbeck (OU) operator in more than two dimensions. While the spectrum of the OU operator and theoretical properties of its eigenfunctions have been well characterized in previous research, the practical computation of general eigenfunctions has not been resolved. We review special cases for which the eigenfunctions can be expressed exactly in terms of commonly used orthogonal polynomials. Then we present a tractable approach for computing the eigenfunctions in general cases and comment on its dimension dependence.
BibTeX entry
@article{zhang-eigenfunctions-2021, author = "B. Zhang and T. Sahai and Y. M. Marzouk", journal = "Preprint", title = "Computing eigenfunctions of the multidimensional Ornstein-Uhlenbeck operator", year = "2021", doi = "10.48550/arXiv.2110.09229", abstract = "We discuss approaches to computing eigenfunctions of the Ornstein--Uhlenbeck (OU) operator in more than two dimensions. While the spectrum of the OU operator and theoretical properties of its eigenfunctions have been well characterized in previous research, the practical computation of general eigenfunctions has not been resolved. We review special cases for which the eigenfunctions can be expressed exactly in terms of commonly used orthogonal polynomials. Then we present a tractable approach for computing the eigenfunctions in general cases and comment on its dimension dependence." }
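One of the exactly solvable special cases mentioned in the abstract: for the one-dimensional OU generator (A f)(x) = f''(x) - x f'(x) (this normalization is our assumption for the sketch), the probabilists' Hermite polynomials He_n are eigenfunctions with eigenvalue -n. A quick numerical check:

```python
import numpy as np
from numpy.polynomial import hermite_e as He

def ou_residual(n, x):
    """Residual of the 1D OU eigenproblem  f'' - x f' + n f = 0
    evaluated for the probabilists' Hermite polynomial He_n,
    which is an eigenfunction of A f = f'' - x f' with eigenvalue -n."""
    c = np.zeros(n + 1)
    c[n] = 1.0                                # coefficients of He_n
    f = He.hermeval(x, c)
    fp = He.hermeval(x, He.hermeder(c, 1))    # first derivative
    fpp = He.hermeval(x, He.hermeder(c, 2))   # second derivative
    return fpp - x * fp + n * f
```

The residual vanishes to machine precision for every n, confirming the eigenpair; the multidimensional and non-normal cases treated in the paper do not reduce to such closed forms.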
Bayesian inference under model misspecification using transport-Lagrangian distances: an application to seismic inversion
Model misspecification constitutes a major obstacle to reliable inference in many inverse problems. Inverse problems in seismology, for example, are particularly affected by misspecification of wave propagation velocities. In this paper, we focus on a specific seismic inverse problem, full-waveform moment tensor inversion, and develop a Bayesian framework that seeks robustness to velocity misspecification. A novel element of our framework is the use of transport-Lagrangian (TL) distances between observed and model-predicted waveforms to specify a loss function, and the use of this loss to define a generalized belief update via a Gibbs posterior. The TL distance naturally disregards certain features of the data that are more sensitive to model misspecification, and therefore produces less biased or dispersed posterior distributions in this setting. To make the latter notion precise, we use several diagnostics to assess the quality of inference and uncertainty quantification, namely continuous ranked probability scores and rank histograms. We interpret these diagnostics in the Bayesian setting and compare the results to those obtained using more typical Gaussian noise models and squared-error loss, under various scenarios of misspecification. Finally, we discuss potential generalizability of the proposed framework to a broader class of inverse problems affected by model misspecification.
BibTeX entry
@article{scarinci-tl-2021, author = "A. Scarinci and M. Fehler and Y. M. Marzouk", journal = "Preprint", title = "Bayesian inference under model misspecification using transport-Lagrangian distances: an application to seismic inversion", year = "2021", abstract = "Model misspecification constitutes a major obstacle to reliable inference in many inverse problems. Inverse problems in seismology, for example, are particularly affected by misspecification of wave propagation velocities. In this paper, we focus on a specific seismic inverse problem, full-waveform moment tensor inversion, and develop a Bayesian framework that seeks robustness to velocity misspecification. A novel element of our framework is the use of transport-Lagrangian (TL) distances between observed and model-predicted waveforms to specify a loss function, and the use of this loss to define a generalized belief update via a Gibbs posterior. The TL distance naturally disregards certain features of the data that are more sensitive to model misspecification, and therefore produces less biased or dispersed posterior distributions in this setting. To make the latter notion precise, we use several diagnostics to assess the quality of inference and uncertainty quantification, namely continuous ranked probability scores and rank histograms. We interpret these diagnostics in the Bayesian setting and compare the results to those obtained using more typical Gaussian noise models and squared-error loss, under various scenarios of misspecification. Finally, we discuss potential generalizability of the proposed framework to a broader class of inverse problems affected by model misspecification." }
Learning non-Gaussian graphical models via Hessian scores and triangular transport
Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. Here we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm SING estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.
BibTeX entry
@article{baptista-lng-2021, author = "R. Baptista and Y. M. Marzouk and R. Morrison and O. Zahm", journal = "Preprint", title = "Learning non-Gaussian graphical models via Hessian scores and triangular transport", year = "2021", abstract = "Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. Here we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm SING estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.", keywords = "Undirected graphical models, structure learning, non-Gaussian distributions, conditional mutual information, transport map, sparsity" }
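In its simplest form, a Hessian-based conditional-independence score can be written as Omega_ij = E[ |d^2 log pi / dx_i dx_j| ]. For a Gaussian target, this mixed partial is minus the (constant) precision matrix, so the score's sparsity pattern matches the graph exactly. A toy Monte Carlo sketch of that score (our illustration only; the paper estimates the density itself with a triangular transport map, which this sketch bypasses by assuming the log-density Hessian is available):

```python
import numpy as np

def hessian_score(hess_logpdf, samples):
    """Omega_ij = E[ |d^2 log pi / dx_i dx_j| ], estimated over samples from pi.

    hess_logpdf : callable x -> (d, d) Hessian of log pi at x
    samples     : (N, d) samples from pi
    Zero off-diagonal entries indicate conditional independence of i and j
    (exactly so for Gaussians, where the Hessian is minus the precision).
    """
    return np.mean([np.abs(hess_logpdf(x)) for x in samples], axis=0)
```

With a tridiagonal precision matrix (a chain graph), the estimated score is tridiagonal as well, so non-adjacent variables get a score of exactly zero.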
Conditional sampling with monotone GANs
We present a new approach for sampling conditional measures that enables uncertainty quantification in supervised learning tasks. We construct a mapping that transforms a reference measure to the probability measure of the output conditioned on new inputs. The mapping is trained via a modification of generative adversarial networks (GANs), called monotone GANs, that imposes monotonicity constraints and a block triangular structure. We present theoretical results, in an idealized setting, that support our proposed method as well as numerical experiments demonstrating the ability of our method to sample the correct conditional measures in applications ranging from inverse problems to image in-painting.
BibTeX entry
@article{kovachki-monotonegans-2020, author = "N. Kovachki and R. Baptista and B. Hosseini and Y. M. Marzouk", journal = "Preprint", title = "Conditional sampling with monotone GANs", year = "2020", abstract = "We present a new approach for sampling conditional measures that enables uncertainty quantification in supervised learning tasks. We construct a mapping that transforms a reference measure to the probability measure of the output conditioned on new inputs. The mapping is trained via a modification of generative adversarial networks (GANs), called monotone GANs, that imposes monotonicity constraints and a block triangular structure. We present theoretical results, in an idealized setting, that support our proposed method as well as numerical experiments demonstrating the ability of our method to sample the correct conditional measures in applications ranging from inverse problems to image in-painting." }
A layered multiple importance sampling scheme for focused optimal Bayesian experimental design
We develop a new computational approach for "focused" optimal Bayesian experimental design with nonlinear models, with the goal of maximizing expected information gain in targeted subsets of model parameters. Our approach considers uncertainty in the full set of model parameters, but employs a design objective that can exploit learning trade-offs among different parameter subsets. We introduce a new layered multiple importance sampling scheme that provides consistent estimates of expected information gain in this focused setting. This sampling scheme yields significant reductions in estimator bias and variance for a given computational effort, making optimal design more tractable for a wide range of computationally intensive problems.
BibTeX entry
@article{art_77, author = "C. Feng and Y. M. Marzouk", journal = "Preprint", title = "A layered multiple importance sampling scheme for focused optimal Bayesian experimental design", year = "2019", abstract = "We develop a new computational approach for ``focused'' optimal Bayesian experimental design with nonlinear models, with the goal of maximizing expected information gain in targeted subsets of model parameters. Our approach considers uncertainty in the full set of model parameters, but employs a design objective that can exploit learning trade-offs among different parameter subsets. We introduce a new layered multiple importance sampling scheme that provides consistent estimates of expected information gain in this focused setting. This sampling scheme yields significant reductions in estimator bias and variance for a given computational effort, making optimal design more tractable for a wide range of computationally intensive problems.", keywords = "optimal experimental design, Bayesian inference, Monte Carlo methods, multiple importance sampling, expected information gain, mutual information" }
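For context, the baseline that layered multiple importance sampling improves upon is the standard nested Monte Carlo estimator of expected information gain, which estimates the evidence p(y) for each simulated outcome by an inner average over fresh prior draws. A sketch of that baseline, not of the paper's layered scheme (all names are ours):

```python
import numpy as np

def eig_nested_mc(prior_sample, simulate_y, loglik, n_outer=2000, n_inner=2000, seed=0):
    """Baseline nested Monte Carlo estimate of expected information gain,
    EIG = E_{theta,y}[ log p(y|theta) - log p(y) ].

    prior_sample : (rng, n) -> n draws from the prior
    simulate_y   : (rng, thetas) -> one outcome per theta
    loglik       : (y, theta) -> log p(y|theta), vectorized over theta
    """
    rng = np.random.default_rng(seed)
    thetas = prior_sample(rng, n_outer)
    ys = simulate_y(rng, thetas)
    inner = prior_sample(rng, n_inner)       # fresh draws for the evidence
    total = 0.0
    for th, y in zip(thetas, ys):
        log_evidence = np.logaddexp.reduce(loglik(y, inner)) - np.log(n_inner)
        total += loglik(y, th) - log_evidence
    return total / n_outer
```

For the linear-Gaussian model y = theta + noise with unit prior and noise variances, the exact EIG is 0.5 log 2, which the estimator recovers up to its (well-known, upward) nested-MC bias and Monte Carlo noise.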
A trust region method for derivative-free nonlinear constrained stochastic optimization
We present the algorithm SNOWPAC for derivative-free constrained stochastic optimization. The algorithm builds on a model-based approach for deterministic nonlinear constrained derivative-free optimization that introduces an "inner boundary path" to locally convexify the feasible domain and ensure feasible trial steps. We extend this deterministic method via a generalized trust region approach that accounts for noisy evaluations of the objective and constraints. To reduce the impact of noise, we fit consistent Gaussian processes to past objective and constraint evaluations. Our approach incorporates a wide variety of probabilistic risk or deviation measures in both the objective and the constraints. Numerical benchmarking demonstrates SNOWPAC's efficiency and highlights the accuracy of the optimization solutions found.
BibTeX entry
@article{art_55, author = "F. Augustin and Y. M. Marzouk", journal = "Preprint", title = "A trust region method for derivative-free nonlinear constrained stochastic optimization", year = "2017", abstract = "We present the algorithm SNOWPAC for derivative-free constrained stochastic optimization. The algorithm builds on a model-based approach for deterministic nonlinear constrained derivative-free optimization that introduces an ``inner boundary path'' to locally convexify the feasible domain and ensure feasible trial steps. We extend this deterministic method via a generalized trust region approach that accounts for noisy evaluations of the objective and constraints. To reduce the impact of noise, we fit consistent Gaussian processes to past objective and constraint evaluations. Our approach incorporates a wide variety of probabilistic risk or deviation measures in both the objective and the constraints. Numerical benchmarking demonstrates SNOWPAC's efficiency and highlights the accuracy of the optimization solutions found." }
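The idea of smoothing noisy black-box evaluations with a Gaussian process surrogate can be sketched in a few lines of plain Python. The kernel, length scale, noise level, and the quadratic test objective below are all illustrative assumptions, not SNOWPAC's actual models:

```python
import math
import random

def rbf(x1, x2, ell=0.5):
    # Squared-exponential kernel with length scale ell
    return math.exp(-0.5 * ((x1 - x2) / ell) ** 2)

def solve(A, b):
    # Gaussian elimination with partial pivoting (small dense systems only)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior_mean(xs, ys, xq, noise_var=0.01):
    # GP regression mean: m(xq) = k(xq, X) [K + noise_var * I]^{-1} y
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, ys)
    return sum(rbf(xq, xs[i]) * alpha[i] for i in range(n))

rng = random.Random(1)
xs = [i / 10 for i in range(21)]                          # past evaluation points
ys = [(x - 1.0) ** 2 + rng.gauss(0.0, 0.1) for x in xs]   # noisy objective values
# The smoothed surrogate recovers the minimum near x = 1 despite the noise
x_best = min(xs, key=lambda x: gp_posterior_mean(xs, ys, x))
```

Minimizing the raw noisy values instead of the surrogate can land on a spurious point; averaging information across neighboring evaluations is what makes the trust-region steps reliable under noise.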
Sequential Bayesian optimal experimental design via approximate dynamic programming
The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dynamic program. Batch and greedy designs are shown to result from special cases of this formulation. We then focus on sOED for parameter inference, adopting a Bayesian formulation with an information theoretic design objective. To make the problem tractable, we develop new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces. We approximate the optimal policy by using backward induction with regression to construct and refine value function approximations in the dynamic program. The proposed algorithm iteratively generates trajectories via exploration and exploitation to improve approximation accuracy in frequently visited regions of the state space. Numerical results are verified against analytical solutions in a linear-Gaussian setting. Advantages over batch and greedy design are then demonstrated on a nonlinear source inversion problem where we seek an optimal policy for sequential sensing.
BibTeX entry
@article{huan-sequential-2016, author = "X. Huan and Y. M. Marzouk", journal = "Preprint", title = "Sequential Bayesian optimal experimental design via approximate dynamic programming", year = "2016", abstract = "The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dynamic program. Batch and greedy designs are shown to result from special cases of this formulation. We then focus on sOED for parameter inference, adopting a Bayesian formulation with an information theoretic design objective. To make the problem tractable, we develop new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces. We approximate the optimal policy by using backward induction with regression to construct and refine value function approximations in the dynamic program. The proposed algorithm iteratively generates trajectories via exploration and exploitation to improve approximation accuracy in frequently visited regions of the state space. Numerical results are verified against analytical solutions in a linear-Gaussian setting. Advantages over batch and greedy design are then demonstrated on a nonlinear source inversion problem where we seek an optimal policy for sequential sensing.", keywords = "sequential experimental design, Bayesian experimental design, approximate dynamic programming, feedback control policy, lookahead, approximate value iteration, information gain" }
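The dynamic-programming formulation can be illustrated on a toy sequential-design problem with a scalar state (the posterior precision of a Gaussian inference problem), solved here by tabular backward induction on a grid. The design set, cost, and grid are illustrative assumptions; the paper works with continuous spaces and uses regression-based value function approximation rather than a grid:

```python
import math

DESIGNS = [0.0, 1.0, 2.0]   # hypothetical choices = Fisher information added per experiment
COST = 0.3                   # hypothetical cost per unit of experimental effort
GRID = [0.5 + 0.1 * i for i in range(96)]   # grid of posterior precisions p

def interp(vals, p):
    # Piecewise-linear interpolation of a value function on GRID (spacing 0.1)
    if p <= GRID[0]:
        return vals[0]
    if p >= GRID[-1]:
        return vals[-1]
    i = int((p - GRID[0]) / 0.1)
    t = (p - GRID[i]) / (GRID[i + 1] - GRID[i])
    return (1.0 - t) * vals[i] + t * vals[i + 1]

def backward_induction(horizon):
    # V_N = 0;  V_k(p) = max_d [ 0.5*ln((p+d)/p) - COST*d + V_{k+1}(p+d) ]
    # i.e. immediate information gain minus cost, plus the value-to-go.
    V = [0.0] * len(GRID)
    for _ in range(horizon):
        V = [max(0.5 * math.log((p + d) / p) - COST * d + interp(V, p + d)
                 for d in DESIGNS)
             for p in GRID]
    return V

V1, V3 = backward_induction(1), backward_induction(3)
```

Even in this caricature the DP structure is visible: the value of experimentation decreases as the prior precision grows (there is less left to learn), and a longer horizon can never be worth less than a shorter one, since doing nothing (d = 0) is allowed.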
Bayesian level sets for image segmentation
This paper presents a new algorithm for image segmentation and classification, Bayesian Level Sets (BLS). BLS harnesses the advantages of two well-known algorithms: variational level sets and finite mixture model EM (FMM-EM). Like FMM-EM, BLS has a simple, probabilistic implementation which natively extends to an arbitrary number of classes. Via a level set-inspired geometric prior, BLS returns smooth, regular segmenting contours that are robust to noise. In practice, BLS is also observed to be robust to fairly lenient initial conditions. A comparative analysis of the three algorithms (BLS, level set, FMM-EM) is presented, and the advantages of BLS are quantitatively demonstrated on realistic applications such as pluripotent stem cell colonies, brain MRI phantoms, and stem cell nuclei.
BibTeX entry
@article{art_39, author = "N. Lowry and R. Mangoubi and M. Desai and Y. M. Marzouk and P. Sammak", journal = "Preprint", title = "Bayesian level sets for image segmentation", year = "2015", abstract = "This paper presents a new algorithm for image segmentation and classification, Bayesian Level Sets (BLS). BLS harnesses the advantages of two well-known algorithms: variational level sets and finite mixture model EM (FMM-EM). Like FMM-EM, BLS has a simple, probabilistic implementation which natively extends to an arbitrary number of classes. Via a level set-inspired geometric prior, BLS returns smooth, regular segmenting contours that are robust to noise. In practice, BLS is also observed to be robust to fairly lenient initial conditions. A comparative analysis of the three algorithms (BLS, level set, FMM-EM) is presented, and the advantages of BLS are quantitatively demonstrated on realistic applications such as pluripotent stem cell colonies, brain MRI phantoms, and stem cell nuclei." }
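The combination of a pixel-wise probabilistic classifier with a geometric smoothness prior can be caricatured in 1D. The Gaussian class models and the local majority-vote smoothing below are crude hypothetical stand-ins for the level-set-inspired prior, meant only to show why smoothing suppresses isolated noise-driven misclassifications:

```python
import math
import random

def gauss_logpdf(x, mu, sigma):
    # Log-density of N(mu, sigma^2), up to an additive constant
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)

def segment(signal, mus=(0.0, 1.0), sigma=0.3, passes=3):
    # Step 1: pixel-wise MAP class from the Gaussian class likelihoods
    labels = [0 if gauss_logpdf(v, mus[0], sigma) > gauss_logpdf(v, mus[1], sigma)
              else 1 for v in signal]
    # Step 2: crude stand-in for a geometric prior - local majority vote,
    # which removes isolated misclassified pixels and regularizes boundaries
    for _ in range(passes):
        labels = [round(sum(labels[max(0, i - 2): i + 3])
                        / len(labels[max(0, i - 2): i + 3]))
                  for i in range(len(labels))]
    return labels

rng = random.Random(0)
signal = ([0.0 + rng.gauss(0.0, 0.3) for _ in range(30)]
          + [1.0 + rng.gauss(0.0, 0.3) for _ in range(30)])
seg = segment(signal)   # should recover two clean contiguous regions
```

Purely pixel-wise classification misfires wherever the noise crosses the decision boundary; the spatial prior is what turns noisy per-pixel posteriors into smooth, regular segmenting contours.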
NOWPAC: A provably convergent derivative-free nonlinear optimizer with path-augmented constraints
This paper proposes the algorithm NOWPAC (Nonlinear Optimization With Path-Augmented Constraints) for nonlinear constrained derivative-free optimization. The algorithm uses a trust region framework based on fully linear models for the objective function and the constraints. A new constraint-handling scheme based on an inner boundary path allows for the computation of feasible trial steps using models for the constraints. We prove that the iterates computed by NOWPAC converge to a local first order critical point. We also discuss the convergence of NOWPAC in situations where evaluations of the objective function or the constraints are inexact, e.g., corrupted by numerical errors. For this, we determine a rate of decay that the magnitude of these numerical errors must satisfy, while approaching the critical point, to guarantee convergence. In settings where adjusting the accuracy of the objective or constraint evaluations is not possible, as is often the case in practical applications, we introduce an error indicator to detect these regimes and prevent deterioration of the optimization results.
BibTeX entry
@article{augustin-nowpac-2014, author = "F. Augustin and Y. M. Marzouk", journal = "Preprint", title = "NOWPAC: A provably convergent derivative-free nonlinear optimizer with path-augmented constraints", year = "2014", abstract = "This paper proposes the algorithm NOWPAC (Nonlinear Optimization With Path-Augmented Constraints) for nonlinear constrained derivative-free optimization. The algorithm uses a trust region framework based on fully linear models for the objective function and the constraints. A new constraint-handling scheme based on an inner boundary path allows for the computation of feasible trial steps using models for the constraints. We prove that the iterates computed by NOWPAC converge to a local first order critical point. We also discuss the convergence of NOWPAC in situations where evaluations of the objective function or the constraints are inexact, e.g., corrupted by numerical errors. For this, we determine a rate of decay that the magnitude of these numerical errors must satisfy, while approaching the critical point, to guarantee convergence. In settings where adjusting the accuracy of the objective or constraint evaluations is not possible, as is often the case in practical applications, we introduce an error indicator to detect these regimes and prevent deterioration of the optimization results." }
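The basic accept/shrink logic of a derivative-free trust-region iteration can be sketched in 1D. This toy uses a finite-difference linear model and no constraints, so it omits NOWPAC's fully linear interpolation models and the inner-boundary-path constraint handling; all parameters are illustrative:

```python
def trust_region_minimize(f, x0, radius=1.0, iters=50):
    # Toy derivative-free trust-region loop in 1D:
    # build a linear model of f, step to its minimizer on the trust region,
    # accept and grow the region on decrease, otherwise shrink the region.
    x, r = x0, radius
    for _ in range(iters):
        # model gradient from central finite differences (stand-in for a
        # fully linear interpolation model)
        g = (f(x + 1e-3 * r) - f(x - 1e-3 * r)) / (2e-3 * r)
        step = -r if g > 0 else r          # minimize the linear model on [x-r, x+r]
        if f(x + step) < f(x):
            x, r = x + step, min(2.0 * r, radius)   # accept, cautiously grow
        else:
            r *= 0.5                                # reject, shrink trust region
    return x

x_star = trust_region_minimize(lambda t: (t - 1.5) ** 2, x0=5.0)
```

The convergence theory in the paper hinges on the models being fully linear within the trust region; the finite-difference surrogate above is only a stand-in for that machinery.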
Announcements
May 2022
Congratulations to Ricardo Baptista for successfully defending his PhD thesis!
December 2021
Congratulations to Ben Zhang for successfully defending his PhD thesis!
September 2021
Welcome to new UQGroup graduate students Kate Fisher, Julien Luzzatto, and Danny Sharp!
September 2021
Congratulations to Andrea Scarinci for successfully defending his PhD thesis!
August 2021
Welcome to new postdocs Nisha Chandramoorthy and Matt Li, who both recently completed their PhD degrees at MIT!
June 2021
Welcome to new UQGroup graduate students Dimitris Konomis and Josh White!
June 2021
Welcome to new postdoc Dallas Foster, who is joining the group after finishing his PhD at Oregon State University!
More announcements