P B Rubio, Y M Marzouk and M Parno, A transport approach to sequential simulation-based inference, Preprint, (2023).
K Leung, D Thompson, J Susiluoto, J Jagalur-Mohan, A Braverman and Y M Marzouk, Evaluating the accuracy of Gaussian approximations in VSWIR imaging spectroscopy retrievals, Preprint, (2023).
B J Zhang, Y M Marzouk and K Spiliopoulos, Transport map unadjusted Langevin algorithms, Preprint, (2023).
A Maurais, T Alsup, B Peherstorfer and Y M Marzouk, Multifidelity Covariance Estimation via Regression on the Manifold of Symmetric Positive Definite Matrices, Preprint, (2023).
R Baptista, B Hosseini, N B Kovachki, Y M Marzouk and A Sagiv, An Approximation Theory Framework for Measure-Transport Sampling Algorithms, Preprint, (2023).
J Pidstrigach, Y M Marzouk, S Reich and S Wang, Infinite-Dimensional Diffusion Models for Function Spaces, Preprint, (2022).
N Chandramoorthy, F Schaefer and Y M Marzouk, A score-based operator Newton method for measure transport, Preprint, (2023).
M T C Li, Y M Marzouk and O Zahm, Principal Feature Detection via Φ-Sobolev Inequalities, Preprint, (2023).
M Ramgraber, R Baptista, D McLaughlin and Y M Marzouk, Ensemble transport smoothing. Part 2: nonlinear updates, Preprint, (2022).
M Ramgraber, R Baptista, D McLaughlin and Y M Marzouk, Ensemble transport smoothing. Part 1: unified framework, Preprint, (2022).
X Zhang, J Blanchet, Y M Marzouk, V A Nguyen and S Wang, Distributionally robust Gaussian process regression and Bayesian inverse problems, Preprint, (2022).
R Baptista, Y M Marzouk and O Zahm, Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective, Preprint, (2022).
R Baptista, Y M Marzouk and O Zahm, On the representation and learning of monotone triangular transport maps, Preprint, (2022).
R Baptista, L Cao, J Chen, O Ghattas, F Li, Y M Marzouk and J T Oden, Bayesian model calibration for block copolymer self-assembly: Likelihood-free inference and expected information gain computation via measure transport, Preprint, (2022).
B Zhang, T Sahai and Y M Marzouk, Computing eigenfunctions of the multidimensional Ornstein-Uhlenbeck operator, Preprint, (2021).
A Scarinci, M Fehler and Y M Marzouk, Bayesian inference under model misspecification using transport-Lagrangian distances: an application to seismic inversion, Preprint, (2021).
R Baptista, Y M Marzouk, R Morrison and O Zahm, Learning non-Gaussian graphical models via Hessian scores and triangular transport, Preprint, (2021).
N Kovachki, R Baptista, B Hosseini and Y M Marzouk, Conditional sampling with monotone GANs, Preprint, (2020).
C Feng and Y M Marzouk, A layered multiple importance sampling scheme for focused optimal Bayesian experimental design, Preprint, (2019).
F Augustin and Y M Marzouk, A trust region method for derivative-free nonlinear constrained stochastic optimization, Preprint, (2017).
X Huan and Y M Marzouk, Sequential Bayesian optimal experimental design via approximate dynamic programming, Preprint, (2016).
N Lowry, R Mangoubi, M Desai, Y M Marzouk and P Sammak, Bayesian level sets for image segmentation, Preprint, (2015).
F Augustin and Y M Marzouk, NOWPAC: A provably convergent derivative-free nonlinear optimizer with path-augmented constraints, Preprint, (2014).
A transport approach to sequential simulation-based inference
We present a new transport-based approach to efficiently perform sequential Bayesian inference of static model parameters. The strategy is based on the extraction of conditional distributions from the joint distribution of parameters and data, via the estimation of structured (e.g., block triangular) transport maps. This gives explicit surrogate models for the likelihood functions and their gradients, allowing gradient-based characterization of posterior densities via transport maps in a model-free, online phase. This framework is well suited to parameter estimation with complex noise models, including nuisance parameters, and when the forward model is only known as a black box. We demonstrate the method numerically on the characterization of ice thickness from conductivity measurements.
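The conditioning mechanism described above can be illustrated with a hedged, linear-Gaussian toy problem (entirely our construction, not the paper's numerical study): for jointly Gaussian samples of (data, parameters), a block-triangular transport map is given by a Cholesky factor of the joint covariance, and fixing its first component at the observed data yields posterior samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear-Gaussian toy problem (our assumption, not the paper's):
# theta ~ N(0, 1), y = theta + N(0, sigma^2)
sigma = 0.5
n = 200_000
theta = rng.normal(0.0, 1.0, n)
y = theta + rng.normal(0.0, sigma, n)

# Block-triangular (Knothe-Rosenblatt) map in the ordering (y, theta): for
# jointly Gaussian samples it is determined by the Cholesky factor of the
# joint covariance, S(y, theta) = L^{-1} ([y, theta] - mean).
X = np.column_stack([y, theta])
mu = X.mean(axis=0)
L = np.linalg.cholesky(np.cov(X, rowvar=False))

# Conditioning: fix the first map component at the observed y*, then push
# fresh standard-normal noise through the second component.
y_star = 1.0
z1 = (y_star - mu[0]) / L[0, 0]
z2 = rng.normal(0.0, 1.0, n)
theta_post = mu[1] + L[1, 0] * z1 + L[1, 1] * z2

# Analytic posterior for comparison: N(y*/(1+sigma^2), sigma^2/(1+sigma^2))
print(theta_post.mean(), theta_post.var())
```

In this Gaussian setting the map is exact; the paper's setting replaces the Cholesky factor with a learned, generally nonlinear, block-triangular map.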
BibTeX entry
@article{rubio-sequentialtransport-2023, author = "P B Rubio and Y M Marzouk and M Parno", journal = "Preprint", month = "8", title = "A transport approach to sequential simulation-based inference", year = "2023", doi = "10.48550/arXiv.2308.13940", abstract = "We present a new transport-based approach to efficiently perform sequential Bayesian inference of static model parameters. The strategy is based on the extraction of conditional distributions from the joint distribution of parameters and data, via the estimation of structured (e.g., block triangular) transport maps. This gives explicit surrogate models for the likelihood functions and their gradients, allowing gradient-based characterization of posterior densities via transport maps in a model-free, online phase. This framework is well suited to parameter estimation with complex noise models, including nuisance parameters, and when the forward model is only known as a black box. We demonstrate the method numerically on the characterization of ice thickness from conductivity measurements.", keywords = "transport, ssm, inference" }
Evaluating the accuracy of Gaussian approximations in VSWIR imaging spectroscopy retrievals
The joint retrieval of surface reflectances and atmospheric parameters in VSWIR imaging spectroscopy is a computationally challenging high-dimensional problem. With NASA's Surface Biology and Geology mission as the motivating context, quantifying the uncertainty associated with the retrievals is crucial for downstream environmental applications. Although Markov chain Monte Carlo (MCMC) is a Bayesian method ideal for uncertainty quantification, a full-dimensional implementation of MCMC for the retrieval is computationally intractable.
In this work, we develop a block Metropolis MCMC algorithm for the high-dimensional VSWIR surface reflectance retrieval that leverages the structure of the forward radiative transfer model to enable tractable, fully Bayesian computation. We use the posterior distribution from this MCMC algorithm to assess the limitations of optimal estimation, the state-of-the-art Bayesian algorithm in operational retrievals, which is more computationally efficient but uses a Gaussian approximation to characterize the posterior. Analyzing the differences between the posteriors computed by each method, we show that the MCMC algorithm gives more physically sensible results and reveals the non-Gaussian structure of the posterior, specifically in the atmospheric aerosol optical depth parameter and the low-wavelength surface reflectances.
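As a loose illustration of the block-update idea, here is a generic Metropolis-within-Gibbs sketch on an assumed toy Gaussian target (our stand-in, not the paper's radiative transfer model): the state is split into blocks that are proposed and accepted independently.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy target (an assumption for illustration): a correlated 4-d Gaussian,
# split into two blocks that are updated alternately, block-Metropolis style.
A = rng.normal(size=(4, 4))
P = A @ A.T + 4 * np.eye(4)          # precision matrix (positive definite)
log_target = lambda x: -0.5 * x @ P @ x

blocks = [np.array([0, 1]), np.array([2, 3])]
x = np.zeros(4)
chain = []
for _ in range(20_000):
    for idx in blocks:
        prop = x.copy()
        prop[idx] += 0.5 * rng.normal(size=idx.size)   # propose one block
        if np.log(rng.uniform()) < log_target(prop) - log_target(x):
            x = prop                                    # accept the block move
    chain.append(x)
chain = np.array(chain)
print(chain.mean(axis=0))   # should be near the target mean, zero
```

Block updates keep each proposal low-dimensional, which is the practical route to tractability that the abstract describes for the much higher-dimensional retrieval problem.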
BibTeX entry
@article{leung-spectroscopy-2023, author = "K Leung and D Thompson and J Susiluoto and J Jagalur-Mohan and A Braverman and Y M Marzouk", journal = "Preprint", month = "8", title = "Evaluating the accuracy of Gaussian approximations in VSWIR imaging spectroscopy retrievals", year = "2023", abstract = "The joint retrieval of surface reflectances and atmospheric parameters in VSWIR imaging spectroscopy is a computationally challenging high-dimensional problem. Using NASA's Surface Biology and Geology mission as the motivational context, the uncertainty associated with the retrievals is crucial for further application of the retrieved results for environmental applications. Although Markov chain Monte Carlo (MCMC) is a Bayesian method ideal for uncertainty quantification, the full-dimensional implementation of MCMC for the retrieval is computationally intractable. In this work, we developed a block Metropolis MCMC algorithm for the high-dimensional VSWIR surface reflectance retrieval that leverages the structure of the forward radiative transfer model to enable tractable fully Bayesian computation. We use the posterior distribution from this MCMC algorithm to assess the limitations of optimal estimation, the state-of-the-art Bayesian algorithm in operational retrievals which is more computationally efficient but uses a Gaussian approximation to characterize the posterior. Analyzing the differences in the posterior computed by each method, the MCMC algorithm was shown to give more physically sensible results and reveals the non-Gaussian structure of the posterior, specifically in the atmospheric aerosol optical depth parameter and the low-wavelength surface reflectances.", keywords = "mcmc" }
Transport map unadjusted Langevin algorithms
Langevin dynamics are widely used in sampling high-dimensional, non-Gaussian distributions whose densities are known up to a normalizing constant. In particular, there is strong interest in unadjusted Langevin algorithms (ULA), which directly discretize Langevin dynamics to estimate expectations over the target distribution. We study the use of transport maps that approximately normalize a target distribution as a way to precondition and accelerate the convergence of Langevin dynamics. We show that in continuous time, when a transport map is applied to Langevin dynamics, the result is a Riemannian manifold Langevin dynamics (RMLD) with metric defined by the transport map. This connection suggests more systematic ways of learning metrics, and also yields alternative discretizations of the RMLD described by the map, which we study. Moreover, we show that under certain conditions, when the transport map is used in conjunction with ULA, we can improve the geometric rate of convergence of the output process in the 2–Wasserstein distance. Illustrative numerical results complement our theoretical claims.
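A minimal sketch of the preconditioning idea, assuming a one-dimensional Gaussian target and the exact linear normalizing map (both our choices for illustration): ULA is run on the pullback density and the samples are pushed back through the map.

```python
import numpy as np

rng = np.random.default_rng(2)

# Target known up to a constant: here N(2, 4), with score s(x) = -(x-2)/4.
score = lambda x: -(x - 2.0) / 4.0

# A fixed transport map that (here, exactly) normalizes the target:
# T(z) = 2z + 2. For a linear map, grad log|T'| vanishes.
T = lambda z: 2.0 * z + 2.0
dT = 2.0

# ULA on the pullback density T^# pi, whose score is T'(z) * s(T(z)).
# 50,000 independent chains are run in parallel as a vector.
h = 0.1
z = np.zeros(50_000)
for _ in range(500):
    z = z + h * dT * score(T(z)) + np.sqrt(2 * h) * rng.normal(size=z.size)
x = T(z)   # push ULA samples back through the map

# ULA is biased at finite step size: the stationary variance here is
# slightly above the target's variance of 4.
print(x.mean(), x.var())
```

With the exact map the pullback is standard normal, so the preconditioned chain mixes at the fastest possible rate; the paper's interest is in maps that only approximately normalize the target.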
BibTeX entry
@article{zhang-transportula-2023, author = "B J Zhang and Y M Marzouk and K Spiliopoulos", journal = "Preprint", month = "2", title = "Transport map unadjusted Langevin algorithms", year = "2023", doi = "10.48550/arXiv.2302.07227", abstract = "Langevin dynamics are widely used in sampling high-dimensional, non-Gaussian distributions whose densities are known up to a normalizing constant. In particular, there is strong interest in unadjusted Langevin algorithms (ULA), which directly discretize Langevin dynamics to estimate expectations over the target distribution. We study the use of transport maps that approximately normalize a target distribution as a way to precondition and accelerate the convergence of Langevin dynamics. We show that in continuous time, when a transport map is applied to Langevin dynamics, the result is a Riemannian manifold Langevin dynamics (RMLD) with metric defined by the transport map. This connection suggests more systematic ways of learning metrics, and also yields alternative discretizations of the RMLD described by the map, which we study. Moreover, we show that under certain conditions, when the transport map is used in conjunction with ULA, we can improve the geometric rate of convergence of the output process in the 2--Wasserstein distance. Illustrative numerical results complement our theoretical claims.", keywords = "Langevin dynamics, transport maps, Bayesian inference, Markov chain Monte Carlo" }
Multifidelity Covariance Estimation via Regression on the Manifold of Symmetric Positive Definite Matrices
We introduce a multifidelity estimator of covariance matrices formulated as the solution to a regression problem on the manifold of symmetric positive definite matrices. The estimator is positive definite by construction, and the Mahalanobis distance minimized to obtain it possesses properties which enable practical computation. We show that our manifold regression multifidelity (MRMF) covariance estimator is a maximum likelihood estimator under a certain error model on manifold tangent space. More broadly, we show that our Riemannian regression framework encompasses existing multifidelity covariance estimators constructed from control variates. We demonstrate via numerical examples that our estimator can provide significant decreases, up to one order of magnitude, in squared estimation error relative to both single-fidelity and other multifidelity covariance estimators. Furthermore, preservation of positive definiteness ensures that our estimator is compatible with downstream tasks, such as data assimilation and metric learning, in which this property is essential.
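For contrast, here is a hedged sketch of the control-variate multifidelity covariance estimator that the abstract says the regression framework encompasses. The synthetic fidelities and the choice gamma = 1 are our assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_hi, n_lo = 3, 50, 5000

# Synthetic fidelities (an illustrative assumption): low-fidelity samples
# are cheap, noisy versions of the high-fidelity ones, so the two sample
# covariances are strongly correlated.
Sigma = np.eye(d) + 0.5 * (np.ones((d, d)) - np.eye(d))   # true covariance
X = rng.multivariate_normal(np.zeros(d), Sigma, n_lo)      # hi-fi draws
Y = X + 0.1 * rng.normal(size=X.shape)                     # lo-fi surrogate

cov = lambda Z: np.cov(Z, rowvar=False)

# Control-variate multifidelity estimator (gamma = 1): correct the
# small-sample hi-fi estimate with the lo-fi discrepancy.
Sigma_hi = cov(X[:n_hi])                       # uses only n_hi expensive draws
Sigma_mf = Sigma_hi + cov(Y) - cov(Y[:n_hi])   # lo-fi correction

frob_err = lambda S: np.linalg.norm(S - Sigma)
print(frob_err(Sigma_hi), frob_err(Sigma_mf))
```

Note that this linear combination of sample covariances is not guaranteed to be positive definite, which is precisely the shortcoming the manifold-regression (MRMF) estimator in the abstract is built to avoid.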
BibTeX entry
@article{maurais-covarianceregression-2023, author = "A Maurais and T Alsup and B Peherstorfer and Y M Marzouk", journal = "Preprint", month = "7", title = "Multifidelity Covariance Estimation via Regression on the Manifold of Symmetric Positive Definite Matrices", year = "2023", abstract = "We introduce a multifidelity estimator of covariance matrices formulated as the solution to a regression problem on the manifold of symmetric positive definite matrices. The estimator is positive definite by construction, and the Mahalanobis distance minimized to obtain it possesses properties which enable practical computation. We show that our manifold regression multifidelity (MRMF) covariance estimator is a maximum likelihood estimator under a certain error model on manifold tangent space. More broadly, we show that our Riemannian regression framework encompasses existing multifidelity covariance estimators constructed from control variates. We demonstrate via numerical examples that our estimator can provide significant decreases, up to one order of magnitude, in squared estimation error relative to both single-fidelity and other multifidelity covariance estimators. Furthermore, preservation of positive definiteness ensures that our estimator is compatible with downstream tasks, such as data assimilation and metric learning, in which this property is essential.", keywords = "covariance, regression, SPD" }
Diffusion map particle systems for generative modeling
We propose a novel diffusion map particle system (DMPS) for generative modeling, based on diffusion maps and Laplacian-adjusted Wasserstein gradient descent (LAWGD). Diffusion maps are used to approximate the generator of the Langevin diffusion process from samples, and hence to learn the underlying data-generating manifold. On the other hand, LAWGD enables efficient sampling from the target distribution given a suitable choice of kernel, which we construct here via a spectral approximation of the generator, computed with diffusion maps. Numerical experiments show that our method outperforms others on synthetic datasets, including examples with manifold structure.
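The diffusion-map ingredient can be sketched as follows (standard construction; the circle dataset, bandwidth, and sample size are our choices): a Gaussian kernel on the samples is density-normalized and row-normalized into a Markov matrix whose eigendecomposition approximates the spectrum of the generator.

```python
import numpy as np

rng = np.random.default_rng(4)

# Samples on a 1-d manifold (the unit circle) embedded in R^2.
t = rng.uniform(0, 2 * np.pi, 400)
pts = np.column_stack([np.cos(t), np.sin(t)])

# Diffusion-map construction: Gaussian kernel, density normalization
# (alpha = 1) to remove sampling-density bias, then row normalization
# to a Markov matrix approximating the Langevin generator's semigroup.
eps = 0.1
D2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
K = np.exp(-D2 / eps)
q = K.sum(1)
K = K / np.outer(q, q)
P = K / K.sum(1, keepdims=True)

evals = np.linalg.eig(P)[0]
evals = evals[np.argsort(-evals.real)].real
print(evals[:4])   # leading eigenvalue is 1; the rest decay with the spectrum
```

The spectral approximation of the generator obtained this way is what supplies the kernel for the LAWGD sampling step described in the abstract.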
BibTeX entry
@article{li-diffusionparticles-2023, author = "F Li and Y M Marzouk", journal = "Preprint", month = "4", title = "Diffusion map particle systems for generative modeling", year = "2023", doi = "10.48550/arXiv.2304.00200", abstract = "We propose a novel diffusion map particle system (DMPS) for generative modeling, based on diffusion maps and Laplacian-adjusted Wasserstein gradient descent (LAWGD). Diffusion maps are used to approximate the generator of the Langevin diffusion process from samples, and hence to learn the underlying data-generating manifold. On the other hand, LAWGD enables efficient sampling from the target distribution given a suitable choice of kernel, which we construct here via a spectral approximation of the generator, computed with diffusion maps. Numerical experiments show that our method outperforms others on synthetic datasets, including examples with manifold structure." }
An Approximation Theory Framework for Measure-Transport Sampling Algorithms
This article presents a general approximation-theoretic framework to analyze measure transport algorithms for probabilistic modeling. A primary motivating application for such algorithms is sampling – a central task in statistical inference and generative modeling. We provide a priori error estimates in the continuum limit, i.e., when the measures (or their densities) are given, but when the transport map is discretized or approximated using a finite-dimensional function space. Our analysis relies on the regularity theory of transport maps and on classical approximation theory for high-dimensional functions. A third element of our analysis, which is of independent interest, is the development of new stability estimates that relate the distance between two maps to the distance (or divergence) between the pushforward measures they define. We present a series of applications of our framework, where quantitative convergence rates are obtained for practical problems using Wasserstein metrics, maximum mean discrepancy, and Kullback–Leibler divergence. Specialized rates for approximations of the popular triangular Knothe-Rosenblatt maps are obtained, followed by numerical experiments that demonstrate and extend our theory.
BibTeX entry
@article{baptista-theorytransportsample-2023, author = "R Baptista and B Hosseini and N B Kovachki and Y M Marzouk and A Sagiv", journal = "Preprint", month = "2", title = "An Approximation Theory Framework for Measure-Transport Sampling Algorithms", year = "2023", doi = "10.48550/arXiv.2302.13965", abstract = "This article presents a general approximation-theoretic framework to analyze measure transport algorithms for probabilistic modeling. A primary motivating application for such algorithms is sampling -- a central task in statistical inference and generative modeling. We provide a priori error estimates in the continuum limit, i.e., when the measures (or their densities) are given, but when the transport map is discretized or approximated using a finite-dimensional function space. Our analysis relies on the regularity theory of transport maps and on classical approximation theory for high-dimensional functions. A third element of our analysis, which is of independent interest, is the development of new stability estimates that relate the distance between two maps to the distance~(or divergence) between the pushforward measures they define. We present a series of applications of our framework, where quantitative convergence rates are obtained for practical problems using Wasserstein metrics, maximum mean discrepancy, and Kullback--Leibler divergence. Specialized rates for approximations of the popular triangular Kn{ö}the-Rosenblatt maps are obtained, followed by numerical experiments that demonstrate and extend our theory.", keywords = "Transport map, generative models, stability analysis, approximation theory" }
Infinite-Dimensional Diffusion Models for Function Spaces
We define diffusion-based generative models in infinite dimensions, and apply them to the generative modeling of functions. By first formulating such models in the infinite-dimensional limit and only then discretizing, we are able to obtain a sampling algorithm that has \emph{dimension-free} bounds on the distance from the sample measure to the target measure. Furthermore, we propose a new way to perform conditional sampling in an infinite-dimensional space and show that our approach outperforms previously suggested procedures.
BibTeX entry
@article{pidstrigach-infinitedimdiffusion-2023, author = "J Pidstrigach and Y M Marzouk and S Reich and S Wang", journal = "Preprint", month = "2", title = "Infinite-Dimensional Diffusion Models for Function Spaces", year = "2022", doi = "10.48550/arXiv.2302.10130", abstract = "We define diffusion-based generative models in infinite dimensions, and apply them to the generative modeling of functions. By first formulating such models in the infinite-dimensional limit and only then discretizing, we are able to obtain a sampling algorithm that has \emph{dimension-free} bounds on the distance from the sample measure to the target measure. Furthermore, we propose a new way to perform conditional sampling in an infinite-dimensional space and show that our approach outperforms previously suggested procedures." }
A score-based operator Newton method for measure transport
Transportation of probability measures underlies many core tasks in statistics and machine learning, from variational inference to generative modeling. A typical goal is to represent a target probability measure of interest as the push-forward of a tractable source measure through a learned map. We present a new construction of such a transport map, given the ability to evaluate the score of the target distribution. Specifically, we characterize the map as a zero of an infinite-dimensional score-residual operator and derive a Newton-type method for iteratively constructing such a zero. We prove convergence of these iterations by invoking classical elliptic regularity theory for partial differential equations (PDE) and show that this construction enjoys rapid convergence, under smoothness assumptions on the target score. A key element of our approach is a generalization of the elementary Newton method to infinite-dimensional operators, other forms of which have appeared in nonlinear PDE and in dynamical systems. Our Newton construction, while developed in a functional setting, also suggests new iterative algorithms for approximating transport maps.
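In one dimension the construction admits a concrete sketch (the notation below is ours and elides the smoothness and regularity assumptions the paper requires). If $T$ pushes a source density $\rho$ forward to the target $\pi$, then $\rho(x) = \pi(T(x))\,T'(x)$; taking logarithms and differentiating yields a score-residual operator whose zero is the transport map.

```latex
\begin{align*}
% score-residual operator from the 1-d pushforward condition
R(T)(x) &= s_\pi\big(T(x)\big)\,T'(x) + \frac{T''(x)}{T'(x)} - s_\rho(x),
  && s_\pi = (\log\pi)', \quad s_\rho = (\log\rho)', \\
% Newton-type update: linearize R about the current iterate T_k
T_{k+1} &= T_k - \big[DR(T_k)\big]^{-1} R(T_k).
\end{align*}
```

Here $DR(T_k)$ denotes the linearization of $R$, a second-order linear differential operator; establishing its invertibility is where the elliptic regularity theory mentioned in the abstract would enter.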
BibTeX entry
@article{chandramoorthy-newtonmethodtransport-2023, author = "N Chandramoorthy and F Schaefer and Y M Marzouk", journal = "Preprint", month = "5", title = "A score-based operator Newton method for measure transport", year = "2023", doi = "10.48550/arXiv.2305.09792", abstract = "Transportation of probability measures underlies many core tasks in statistics and machine learning, from variational inference to generative modeling. A typical goal is to represent a target probability measure of interest as the push-forward of a tractable source measure through a learned map. We present a new construction of such a transport map, given the ability to evaluate the score of the target distribution. Specifically, we characterize the map as a zero of an infinite-dimensional score-residual operator and derive a Newton-type method for iteratively constructing such a zero. We prove convergence of these iterations by invoking classical elliptic regularity theory for partial differential equations (PDE) and show that this construction enjoys rapid convergence, under smoothness assumptions on the target score. A key element of our approach is a generalization of the elementary Newton method to infinite-dimensional operators, other forms of which have appeared in nonlinear PDE and in dynamical systems. Our Newton construction, while developed in a functional setting, also suggests new iterative algorithms for approximating transport maps.", keywords = "Transportation of measure, score-based modeling, optimal transport, Newton method, KAM iteration, elliptic PDE." }
Principal Feature Detection via Φ-Sobolev Inequalities
We investigate the approximation of high-dimensional target measures as low-dimensional updates of a dominating reference measure. This approximation class replaces the associated density with the composition of: (i) a feature map that identifies the leading principal components or features of the target measure, relative to the reference, and (ii) a low-dimensional profile function. When the reference measure satisfies a subspace ϕ-Sobolev inequality, we construct a computationally tractable approximation that yields certifiable error guarantees with respect to the Amari α-divergences. Our construction proceeds in two stages. First, for any feature map and any α-divergence, we obtain an analytical expression for the optimal profile function. Second, for linear feature maps, the principal features are obtained from eigenvectors of a matrix involving gradients of the log-density. Neither step requires explicit access to normalizing constants. Notably, by leveraging the ϕ-Sobolev inequalities, we demonstrate that these features universally certify approximation errors across the range of α-divergences α∈(0,1]. We then propose an application to Bayesian inverse problems and provide an analogous construction with approximation guarantees that hold in expectation over the data. We conclude with an extension of the proposed dimension reduction strategy to nonlinear feature maps.
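For linear feature maps, the second stage above reduces to an eigenvector computation. The sketch below uses a toy log-density ratio of our own choosing and estimates the diagnostic matrix with reference samples (one of several weighting variants); everything here is an illustrative assumption, not the paper's construction verbatim.

```python
import numpy as np

rng = np.random.default_rng(5)
d, n = 5, 20_000

# Toy setting (assumed for illustration): reference rho = N(0, I) and a
# target pi that differs from rho only along one unit direction a, so
# grad log(pi/rho)(x) = -(a.x)/2 * a.
a = np.ones(d) / np.sqrt(d)
grad_log_ratio = lambda X: (-(X @ a) / 2.0)[:, None] * a

# Diagnostic matrix H = E[ grad log(pi/rho) grad log(pi/rho)^T ],
# estimated with reference samples; its leading eigenvectors are the
# linear principal features.
X = rng.normal(size=(n, d))
G = grad_log_ratio(X)
H = G.T @ G / n

w, V = np.linalg.eigh(H)
feature = V[:, -1]         # leading principal feature
print(abs(feature @ a))    # alignment with the true direction
```

In this toy case H is exactly rank one, so the leading eigenvector recovers the direction a; no normalizing constants are needed, only gradients of the log-density ratio.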
BibTeX entry
@article{li-featuredetection-2023, author = "M T C Li and Y M Marzouk and O Zahm", journal = "Preprint", month = "5", title = "Principal Feature Detection via Φ-Sobolev Inequalities", year = "2023", doi = "10.48550/arXiv.2305.06172", abstract = "We investigate the approximation of high-dimensional target measures as low-dimensional updates of a dominating reference measure. This approximation class replaces the associated density with the composition of: (i) a feature map that identifies the leading principal components or features of the target measure, relative to the reference, and (ii) a low-dimensional profile function. When the reference measure satisfies a subspace ϕ-Sobolev inequality, we construct a computationally tractable approximation that yields certifiable error guarantees with respect to the Amari α-divergences. Our construction proceeds in two stages. First, for any feature map and any α-divergence, we obtain an analytical expression for the optimal profile function. Second, for linear feature maps, the principal features are obtained from eigenvectors of a matrix involving gradients of the log-density. Neither step requires explicit access to normalizing constants. Notably, by leveraging the ϕ-Sobolev inequalities, we demonstrate that these features universally certify approximation errors across the range of α-divergences α∈(0,1]. We then propose an application to Bayesian inverse problems and provide an analogous construction with approximation guarantees that hold in expectation over the data. We conclude with an extension of the proposed dimension reduction strategy to nonlinear feature maps.", keywords = "gradient-based dimension reduction, φ-Sobolev inequalities, Amari α-divergences, Bayesian inference, principal components, feature detection" }
Ensemble transport smoothing. Part 2: nonlinear updates
Smoothing is a specialized form of Bayesian inference for state-space models that characterizes the posterior distribution of a collection of states given an associated sequence of observations. Our companion manuscript proposes a general framework for transport-based ensemble smoothing, which includes linear Kalman-type smoothers as special cases. Here, we build on this foundation to realize and demonstrate nonlinear backward ensemble transport smoothers. We discuss parameterization and regularization of the associated transport maps, and then examine the performance of these smoothers for nonlinear and chaotic dynamical systems that exhibit non-Gaussian behavior. In these settings, our nonlinear transport smoothers yield lower estimation error than conventional linear smoothers and state-of-the-art iterative ensemble Kalman smoothers, for comparable numbers of model evaluations.
BibTeX entry
@article{ramgraber-smoothingparttwo-2022, author = "M Ramgraber and R Baptista and D McLaughlin and Y M Marzouk", journal = "Preprint", title = "Ensemble transport smoothing. Part 2: nonlinear updates", year = "2022", abstract = "Smoothing is a specialized form of Bayesian inference for state-space models that characterizes the posterior distribution of a collection of states given an associated sequence of observations. Our companion manuscript proposes a general framework for transport-based ensemble smoothing, which includes linear Kalman-type smoothers as special cases. Here, we build on this foundation to realize and demonstrate nonlinear backward ensemble transport smoothers. We discuss parameterization and regularization of the associated transport maps, and then examine the performance of these smoothers for nonlinear and chaotic dynamical systems that exhibit non-Gaussian behavior. In these settings, our nonlinear transport smoothers yield lower estimation error than conventional linear smoothers and state-of-the-art iterative ensemble Kalman smoothers, for comparable numbers of model evaluations.", keywords = "Data assimilation, smoothing, ensemble methods, triangular transport." }
Ensemble transport smoothing. Part 1: unified framework
Smoothers are algorithms for Bayesian time series re-analysis. Most operational smoothers rely either on affine Kalman-type transformations or on sequential importance sampling. These strategies occupy opposite ends of a spectrum that trades computational efficiency and scalability for statistical generality and consistency: non-Gaussianity renders affine Kalman updates inconsistent with the true Bayesian solution, while the ensemble size required for successful importance sampling can be prohibitive. This paper revisits the smoothing problem from the perspective of measure transport, which offers the prospect of consistent prior-to-posterior transformations for Bayesian inference. We leverage this capacity by proposing a general ensemble framework for transport-based smoothing. Within this framework, we derive a comprehensive set of smoothing recursions based on nonlinear transport maps and detail how they exploit the structure of state-space models in fully non-Gaussian settings. We also describe how many standard Kalman-type smoothing algorithms emerge as special cases of our framework. A companion paper explores the implementation of nonlinear ensemble transport smoothers in greater depth.
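The affine Kalman-type end of the spectrum described above can be sketched as a stochastic ensemble Kalman update viewed as an affine transport, on a toy scalar problem of our choosing (a single analysis step rather than a full smoothing recursion):

```python
import numpy as np

rng = np.random.default_rng(6)

# Linear-Gaussian toy: prior x ~ N(0, 1), observation y = x + N(0, 0.25).
n = 100_000
x_f = rng.normal(0.0, 1.0, n)                     # forecast ensemble
r = 0.25
y_star = 1.0

# Affine (Kalman-type) ensemble transport: the perturbed-observation EnKF
# update, the linear special case of the transport framework.
P = x_f.var()
K = P / (P + r)                                   # Kalman gain
y_pert = y_star + rng.normal(0.0, np.sqrt(r), n)  # perturbed observations
x_a = x_f + K * (y_pert - x_f)                    # analysis ensemble

print(x_a.mean(), x_a.var())   # analytic posterior here is N(0.8, 0.2)
```

This update is exact for linear-Gaussian models; the framework in the abstract generalizes the affine map x_f + K(y - x_f) to nonlinear transport maps for non-Gaussian settings.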
BibTeX entry
@article{ramgraber-smoothingpartone-2022, author = "M Ramgraber and R Baptista and D McLaughlin and Y M Marzouk", journal = "Preprint", title = "Ensemble transport smoothing. Part 1: unified framework", year = "2022", abstract = "Smoothers are algorithms for Bayesian time series re-analysis. Most operational smoothers rely either on affine Kalman-type transformations or on sequential importance sampling. These strategies occupy opposite ends of a spectrum that trades computational efficiency and scalability for statistical generality and consistency: non-Gaussianity renders affine Kalman updates inconsistent with the true Bayesian solution, while the ensemble size required for successful importance sampling can be prohibitive. This paper revisits the smoothing problem from the perspective of measure transport, which offers the prospect of consistent prior-to-posterior transformations for Bayesian inference. We leverage this capacity by proposing a general ensemble framework for transport-based smoothing. Within this framework, we derive a comprehensive set of smoothing recursions based on nonlinear transport maps and detail how they exploit the structure of state-space models in fully non-Gaussian settings. We also describe how many standard Kalman-type smoothing algorithms emerge as special cases of our framework. A companion paper explores the implementation of nonlinear ensemble transport smoothers in greater depth.", keywords = "Data assimilation, smoothing, ensemble methods, triangular transport" }
On minimax density estimation via measure transport
We study the convergence properties, in Hellinger and related distances, of nonparametric density estimators based on measure transport. These estimators represent the measure of interest as the pushforward of a chosen reference distribution under a transport map, where the map is chosen via a maximum likelihood objective (equivalently, minimizing an empirical Kullback-Leibler loss) or a penalized version thereof. We establish concentration inequalities for a general class of penalized measure transport estimators, by combining techniques from M-estimation with analytical properties of the transport-based density representation. We then demonstrate the implications of our theory for the case of triangular Knothe-Rosenblatt (KR) transports on the $d$-dimensional unit cube, and show that both penalized and unpenalized versions of such estimators achieve minimax optimal convergence rates over Hölder classes of densities. Specifically, we establish optimal rates for unpenalized nonparametric maximum likelihood estimation over bounded Hölder-type balls, and then for certain Sobolev-penalized estimators and sieved wavelet estimators.
BibTeX entry
@article{wang-tde-2022, author = "S Wang and Y M Marzouk", journal = "Preprint", title = "On minimax density estimation via measure transport", year = "2022", abstract = "We study the convergence properties, in Hellinger and related distances, of nonparametric density estimators based on measure transport. These estimators represent the measure of interest as the pushforward of a chosen reference distribution under a transport map, where the map is chosen via a maximum likelihood objective (equivalently, minimizing an empirical Kullback-Leibler loss) or a penalized version thereof. We establish concentration inequalities for a general class of penalized measure transport estimators, by combining techniques from M-estimation with analytical properties of the transport-based density representation. We then demonstrate the implications of our theory for the case of triangular Knothe-Rosenblatt (KR) transports on the $d$-dimensional unit cube, and show that both penalized and unpenalized versions of such estimators achieve minimax optimal convergence rates over Hölder classes of densities. Specifically, we establish optimal rates for unpenalized nonparametric maximum likelihood estimation over bounded Hölder-type balls, and then for certain Sobolev-penalized estimators and sieved wavelet estimators." }
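As a toy illustration of the transport-based estimator analyzed above (not the paper's KR construction): in the affine special case, the maximum-likelihood map reduces to a Gaussian fit, and the density estimate is the pushforward of a standard normal reference. All numbers below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.5, size=5000)    # samples from the unknown density

# Target-to-reference map S chosen by maximum likelihood.  With an affine map
# this is just a Gaussian fit; triangular KR maps generalize the idea.
m, s = data.mean(), data.std()
S = lambda x: (x - m) / s                 # pushes the data toward N(0, 1)

def density_estimate(x):
    # pushforward density: rho(S(x)) * S'(x), with reference rho = N(0, 1)
    return np.exp(-0.5 * S(x) ** 2) / np.sqrt(2 * np.pi) / s

xs = np.linspace(-6.0, 10.0, 2001)
mass = np.sum(density_estimate(xs)) * (xs[1] - xs[0])   # Riemann sum
print(mass)   # ≈ 1: the estimator is a proper density
```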
Distributionally robust Gaussian process regression and Bayesian inverse problems
We study a distributionally robust optimization formulation (i.e., a min-max game) for two representative problems in Bayesian nonparametric estimation: Gaussian process regression and, more generally, linear inverse problems. Our formulation seeks the best mean-squared error predictor, in an infinite-dimensional space, against an adversary who chooses the worst-case model in a Wasserstein ball around a nominal infinite-dimensional Bayesian model. The transport cost is chosen to control features such as the degree of roughness of the sample paths that the adversary is allowed to inject. We show that the game has a well-defined value (i.e., strong duality holds in the sense that max-min equals min-max) and that there exists a unique Nash equilibrium which can be computed by a sequence of finite-dimensional approximations. Crucially, the worst-case distribution is itself Gaussian. We explore properties of the Nash equilibrium and the effects of hyperparameters through a set of numerical experiments, demonstrating the versatility of our modeling framework.
BibTeX entry
@article{zhang-dro-2022, author = "X Zhang and J Blanchet and Y M Marzouk and V A Nguyen and S Wang", journal = "Preprint", title = "Distributionally robust Gaussian process regression and Bayesian inverse problems", year = "2022", abstract = "We study a distributionally robust optimization formulation (i.e., a min-max game) for two representative problems in Bayesian nonparametric estimation: Gaussian process regression and, more generally, linear inverse problems. Our formulation seeks the best mean-squared error predictor, in an infinite-dimensional space, against an adversary who chooses the worst-case model in a Wasserstein ball around a nominal infinite-dimensional Bayesian model. The transport cost is chosen to control features such as the degree of roughness of the sample paths that the adversary is allowed to inject. We show that the game has a well-defined value (i.e., strong duality holds in the sense that max-min equals min-max) and that there exists a unique Nash equilibrium which can be computed by a sequence of finite-dimensional approximations. Crucially, the worst-case distribution is itself Gaussian. We explore properties of the Nash equilibrium and the effects of hyperparameters through a set of numerical experiments, demonstrating the versatility of our modeling framework." }
Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective
We consider the problem of reducing the dimensions of parameters and data in non-Gaussian Bayesian inference problems. Our goal is to identify an "informed" subspace of the parameters and an "informative" subspace of the data so that a high-dimensional inference problem can be approximately reformulated in low-to-moderate dimensions, thereby improving the computational efficiency of many inference techniques. To do so, we exploit gradient evaluations of the log-likelihood function. Furthermore, we use an information-theoretic analysis to derive a bound on the posterior error due to parameter and data dimension reduction. This bound relies on logarithmic Sobolev inequalities, and it reveals the appropriate dimensions of the reduced variables. We compare our method with classical dimension reduction techniques, such as principal component analysis and canonical correlation analysis, on applications ranging from mechanics to image processing.
BibTeX entry
@article{dimred2022, author = "R Baptista and Y M Marzouk and O Zahm", journal = "Preprint", title = "Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective", year = "2022", abstract = "We consider the problem of reducing the dimensions of parameters and data in non-Gaussian Bayesian inference problems. Our goal is to identify an ``informed'' subspace of the parameters and an ``informative'' subspace of the data so that a high-dimensional inference problem can be approximately reformulated in low-to-moderate dimensions, thereby improving the computational efficiency of many inference techniques. To do so, we exploit gradient evaluations of the log-likelihood function. Furthermore, we use an information-theoretic analysis to derive a bound on the posterior error due to parameter and data dimension reduction. This bound relies on logarithmic Sobolev inequalities, and it reveals the appropriate dimensions of the reduced variables. We compare our method with classical dimension reduction techniques, such as principal component analysis and canonical correlation analysis, on applications ranging from mechanics to image processing.", keywords = "Bayesian inference, gradient-based dimension reduction, logarithmic Sobolev inequalities, conditional mutual information, low-dimensional subspaces, coordinate selection" }
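A rough numerical sketch of the gradient-based idea, simplified from the analysis above: average outer products of log-likelihood gradients over prior samples and take the leading eigenvectors as the informed subspace. The log-likelihood, its dimensions, and the direction `a` below are all invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
a = rng.normal(size=d); a /= np.linalg.norm(a)   # hypothetical informed direction
sigma, y = 0.5, 1.0

def grad_loglik(theta):
    # toy log-likelihood: log L(theta) = -0.5 * ((a @ theta - y) / sigma)^2
    return -(a @ theta - y) / sigma ** 2 * a

thetas = rng.normal(size=(2000, d))              # prior samples
G = np.array([grad_loglik(t) for t in thetas])
H = G.T @ G / len(G)                             # diagnostic matrix E[∇logL ∇logLᵀ]
eigval, eigvec = np.linalg.eigh(H)
u = eigvec[:, -1]                                # top eigenvector: informed subspace
print(abs(u @ a))                                # ≈ 1: the direction a is recovered
```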
On the representation and learning of monotone triangular transport maps
Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps—approximations of the Knothe–Rosenblatt (KR) rearrangement—are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.
BibTeX entry
@article{baptista-atm-2020, author = "R. Baptista and Y. M. Marzouk and O. Zahm", journal = "Preprint", title = "On the representation and learning of monotone triangular transport maps", year = "2022", abstract = "Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps—approximations of the Knothe–Rosenblatt (KR) rearrangement—are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.", keywords = "Transportation of measure, Knothe–Rosenblatt rearrangement, normalizing flows, monotone functions, infinite-dimensional optimization, adaptive approximation, multivariate polynomials, wavelets, density estimation." }
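One concrete instance of the kind of representation described above, under the assumptions that the positive "rectifier" is a softplus and the smooth function f is a polynomial (both choices are illustrative, not the paper's specific parameterization): the map S(x) = f(0) + ∫₀ˣ g(f′(t)) dt is monotone by construction because its derivative g(f′(x)) is strictly positive.

```python
import numpy as np

def softplus(z):                               # positive rectifier g, numerically stable
    return np.maximum(z, 0.0) + np.log1p(np.exp(-np.abs(z)))

def monotone_map(xs, coeffs, f0=0.0):
    # S(x) = f0 + \int_0^x g(f'(t)) dt, with g > 0, hence S'(x) = g(f'(x)) > 0
    dcoeffs = np.polyder(coeffs)               # f is a polynomial, so f' is analytic
    out = []
    for x in xs:
        t = np.linspace(0.0, x, 257)           # trapezoidal quadrature on [0, x]
        g = softplus(np.polyval(dcoeffs, t))
        out.append(f0 + np.sum((g[1:] + g[:-1]) * np.diff(t)) / 2.0)
    return np.array(out)

xs = np.linspace(-3.0, 3.0, 101)
S = monotone_map(xs, coeffs=[0.4, -1.0, 0.3, 2.0])   # arbitrary smooth f
assert np.all(np.diff(S) > 0)                  # strictly increasing by construction
```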
Bayesian model calibration for block copolymer self-assembly: Likelihood-free inference and expected information gain computation via measure transport
We consider the Bayesian calibration of models describing the phenomenon of block copolymer (BCP) self-assembly using image data produced by microscopy or X-ray scattering techniques. To account for the random long-range disorder in BCP equilibrium structures, we introduce auxiliary variables to represent this aleatory uncertainty. These variables, however, result in an integrated likelihood for high-dimensional image data that is generally intractable to evaluate. We tackle this challenging Bayesian inference problem using a likelihood-free approach based on measure transport together with the construction of summary statistics for the image data. We also show that expected information gains (EIGs) from the observed data about the model parameters can be computed with no significant additional cost. Lastly, we present a numerical case study based on the Ohta–Kawasaki model for diblock copolymer thin film self-assembly and top-down microscopy characterization. For calibration, we introduce several domain-specific energy- and Fourier-based summary statistics, and quantify their informativeness using EIG. We demonstrate the power of the proposed approach to study the effect of data corruptions and experimental designs on the calibration results.
BibTeX entry
@article{baptista-copoly-2022, author = "R Baptista and L Cao and J Chen and O Ghattas and F Li and Y M Marzouk and J T Oden", journal = "Preprint", title = "Bayesian model calibration for block copolymer self-assembly: Likelihood-free inference and expected information gain computation via measure transport", year = "2022", doi = "10.48550/arXiv.2206.11343", abstract = "We consider the Bayesian calibration of models describing the phenomenon of block copolymer (BCP) self-assembly using image data produced by microscopy or X-ray scattering techniques. To account for the random long-range disorder in BCP equilibrium structures, we introduce auxiliary variables to represent this aleatory uncertainty. These variables, however, result in an integrated likelihood for high-dimensional image data that is generally intractable to evaluate. We tackle this challenging Bayesian inference problem using a likelihood-free approach based on measure transport together with the construction of summary statistics for the image data. We also show that expected information gains (EIGs) from the observed data about the model parameters can be computed with no significant additional cost. Lastly, we present a numerical case study based on the Ohta--Kawasaki model for diblock copolymer thin film self-assembly and top-down microscopy characterization. For calibration, we introduce several domain-specific energy- and Fourier-based summary statistics, and quantify their informativeness using EIG. We demonstrate the power of the proposed approach to study the effect of data corruptions and experimental designs on the calibration results.", keywords = "block copolymers, material self-assembly, uncertainty quantification, likelihood-free inference, summary statistics, measure transport, expected information gain, Ohta–Kawasaki model" }
Computing eigenfunctions of the multidimensional Ornstein-Uhlenbeck operator
We discuss approaches to computing eigenfunctions of the Ornstein–Uhlenbeck (OU) operator in more than two dimensions. While the spectrum of the OU operator and theoretical properties of its eigenfunctions have been well characterized in previous research, the practical computation of general eigenfunctions has not been resolved. We review special cases for which the eigenfunctions can be expressed exactly in terms of commonly used orthogonal polynomials. Then we present a tractable approach for computing the eigenfunctions in general cases and comment on its dimension dependence.
BibTeX entry
@article{zhang-eigenfunctions-2021, author = "B. Zhang and T. Sahai and Y. M. Marzouk", journal = "Preprint", title = "Computing eigenfunctions of the multidimensional Ornstein-Uhlenbeck operator", year = "2021", doi = "10.48550/arXiv.2110.09229", abstract = "We discuss approaches to computing eigenfunctions of the Ornstein--Uhlenbeck (OU) operator in more than two dimensions. While the spectrum of the OU operator and theoretical properties of its eigenfunctions have been well characterized in previous research, the practical computation of general eigenfunctions has not been resolved. We review special cases for which the eigenfunctions can be expressed exactly in terms of commonly used orthogonal polynomials. Then we present a tractable approach for computing the eigenfunctions in general cases and comment on its dimension dependence." }
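The one-dimensional special case admits the classical closed form: for the standard OU generator (Lf)(x) = f″(x) − x f′(x), the probabilists' Hermite polynomials Heₙ are eigenfunctions with eigenvalue −n. A quick numerical check of that identity (the paper's contribution concerns the general multidimensional case, which this sketch does not address):

```python
import numpy as np
from numpy.polynomial.hermite_e import HermiteE

xs = np.linspace(-3.0, 3.0, 50)
x = HermiteE([0.0, 1.0])               # the polynomial "x" in the Hermite basis
for n in range(6):
    p = HermiteE.basis(n)              # probabilists' Hermite polynomial He_n
    Lp = p.deriv(2) - x * p.deriv(1)   # apply the OU generator L
    assert np.allclose(Lp(xs), -n * p(xs))   # L He_n = -n He_n
```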
Bayesian inference under model misspecification using transport-Lagrangian distances: an application to seismic inversion
Model misspecification constitutes a major obstacle to reliable inference in many inverse problems. Inverse problems in seismology, for example, are particularly affected by misspecification of wave propagation velocities. In this paper, we focus on a specific seismic inverse problem, full-waveform moment tensor inversion, and develop a Bayesian framework that seeks robustness to velocity misspecification. A novel element of our framework is the use of transport-Lagrangian (TL) distances between observed and model-predicted waveforms to specify a loss function, and the use of this loss to define a generalized belief update via a Gibbs posterior. The TL distance naturally disregards certain features of the data that are more sensitive to model misspecification, and therefore produces less biased or dispersed posterior distributions in this setting. To make the latter notion precise, we use several diagnostics to assess the quality of inference and uncertainty quantification, namely continuous rank probability scores and rank histograms. We interpret these diagnostics in the Bayesian setting and compare the results to those obtained using more typical Gaussian noise models and squared-error loss, under various scenarios of misspecification. Finally, we discuss potential generalizability of the proposed framework to a broader class of inverse problems affected by model misspecification.
BibTeX entry
@article{scarinci-tl-2021, author = "A. Scarinci and M. Fehler and Y. M. Marzouk", journal = "Preprint", title = "Bayesian inference under model misspecification using transport-Lagrangian distances: an application to seismic inversion", year = "2021", abstract = "Model misspecification constitutes a major obstacle to reliable inference in many inverse problems. Inverse problems in seismology, for example, are particularly affected by misspecification of wave propagation velocities. In this paper, we focus on a specific seismic inverse problem---full-waveform moment tensor inversion---and develop a Bayesian framework that seeks robustness to velocity misspecification. A novel element of our framework is the use of transport-Lagrangian (TL) distances between observed and model-predicted waveforms to specify a loss function, and the use of this loss to define a generalized belief update via a Gibbs posterior. The TL distance naturally disregards certain features of the data that are more sensitive to model misspecification, and therefore produces less biased or dispersed posterior distributions in this setting. To make the latter notion precise, we use several diagnostics to assess the quality of inference and uncertainty quantification, namely continuous rank probability scores and rank histograms. We interpret these diagnostics in the Bayesian setting and compare the results to those obtained using more typical Gaussian noise models and squared-error loss, under various scenarios of misspecification. Finally, we discuss potential generalizability of the proposed framework to a broader class of inverse problems affected by model misspecification." }
Learning non-Gaussian graphical models via Hessian scores and triangular transport
Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. Here we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm SING estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.
BibTeX entry
@article{baptista-lng-2021, author = "R. Baptista and Y. M. Marzouk and R. Morrison and O. Zahm", journal = "Preprint", title = "Learning non-Gaussian graphical models via Hessian scores and triangular transport", year = "2021", abstract = "Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. Here we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm SING estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.", keywords = "Undirected graphical models, structure learning, non-Gaussian distributions, conditional mutual information, transport map, sparsity" }
Conditional sampling with monotone GANs
We present a new approach for sampling conditional measures that enables uncertainty quantification in supervised learning tasks. We construct a mapping that transforms a reference measure to the probability measure of the output conditioned on new inputs. The mapping is trained via a modification of generative adversarial networks (GANs), called monotone GANs, that imposes monotonicity constraints and a block triangular structure. We present theoretical results, in an idealized setting, that support our proposed method as well as numerical experiments demonstrating the ability of our method to sample the correct conditional measures in applications ranging from inverse problems to image in-painting.
BibTeX entry
@article{kovachki-monotonegans-2020, author = "N. Kovachki and R. Baptista and B. Hosseini and Y. M. Marzouk", journal = "Preprint", title = "Conditional sampling with monotone GANs", year = "2020", abstract = "We present a new approach for sampling conditional measures that enables uncertainty quantification in supervised learning tasks. We construct a mapping that transforms a reference measure to the probability measure of the output conditioned on new inputs. The mapping is trained via a modification of generative adversarial networks (GANs), called monotone GANs, that imposes monotonicity constraints and a block triangular structure. We present theoretical results, in an idealized setting, that support our proposed method as well as numerical experiments demonstrating the ability of our method to sample the correct conditional measures in applications ranging from inverse problems to image in-painting." }
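For a bivariate Gaussian, the block-triangular conditional map is available in closed form, which gives a minimal sanity check of the idea: fixing the first block and pushing reference noise through a map that is monotone in its last component samples the conditional exactly. Monotone GANs learn such maps from samples; the parameters below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([0.0, 1.0])
s1, s2, rho = 1.0, 2.0, 0.7            # marginal std devs and correlation

def conditional_sampler(x, z):
    # Block-triangular map: the last component is monotone in the reference
    # noise z; conditioning on x amounts to fixing the first block.
    return mu[1] + rho * (s2 / s1) * (x - mu[0]) + s2 * np.sqrt(1 - rho**2) * z

x_star = 0.5
z = rng.normal(size=200_000)            # reference samples
ys = conditional_sampler(x_star, z)
print(ys.mean(), ys.var())              # ≈ analytic conditional mean 1.7, variance 2.04
```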
A layered multiple importance sampling scheme for focused optimal Bayesian experimental design
We develop a new computational approach for "focused" optimal Bayesian experimental design with nonlinear models, with the goal of maximizing expected information gain in targeted subsets of model parameters. Our approach considers uncertainty in the full set of model parameters, but employs a design objective that can exploit learning trade-offs among different parameter subsets. We introduce a new layered multiple importance sampling scheme that provides consistent estimates of expected information gain in this focused setting. This sampling scheme yields significant reductions in estimator bias and variance for a given computational effort, making optimal design more tractable for a wide range of computationally intensive problems.
BibTeX entry
@article{art_77, author = "C. Feng and Y. M. Marzouk", journal = "Preprint", title = "A layered multiple importance sampling scheme for focused optimal Bayesian experimental design", year = "2019", abstract = "We develop a new computational approach for ``focused'' optimal Bayesian experimental design with nonlinear models, with the goal of maximizing expected information gain in targeted subsets of model parameters. Our approach considers uncertainty in the full set of model parameters, but employs a design objective that can exploit learning trade-offs among different parameter subsets. We introduce a new layered multiple importance sampling scheme that provides consistent estimates of expected information gain in this focused setting. This sampling scheme yields significant reductions in estimator bias and variance for a given computational effort, making optimal design more tractable for a wide range of computationally intensive problems.", keywords = "optimal experimental design, Bayesian inference, Monte Carlo methods, multiple importance sampling, expected information gain, mutual information" }
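As a baseline for the estimators discussed above, the expected information gain of a linear-Gaussian model can be estimated by plain nested Monte Carlo and checked against the analytic mutual information. This sketch is not the paper's layered multiple importance sampling scheme, and the model (θ ~ N(0, τ²), y = θ + ε with ε ~ N(0, σ²)) is a standard illustrative example, not drawn from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
tau, sigma = 1.0, 0.5
N, M = 2000, 2000                      # outer / inner Monte Carlo sample sizes

def loglik(y, theta):
    return -0.5 * ((y - theta) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

theta = rng.normal(0.0, tau, size=N)   # prior draws
y = theta + rng.normal(0.0, sigma, size=N)
theta_in = rng.normal(0.0, tau, size=M)   # fresh prior draws for the evidence

# log p(y_i) ≈ logsumexp_j loglik(y_i, theta'_j) - log M
ll_inner = loglik(y[:, None], theta_in[None, :])            # N x M
log_evid = np.logaddexp.reduce(ll_inner, axis=1) - np.log(M)
eig_nmc = np.mean(loglik(y, theta) - log_evid)

eig_exact = 0.5 * np.log(1 + tau**2 / sigma**2)  # analytic mutual information
print(eig_nmc, eig_exact)
```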
A trust region method for derivative-free nonlinear constrained stochastic optimization
We present the algorithm SNOWPAC for derivative-free constrained stochastic optimization. The algorithm builds on a model-based approach for deterministic nonlinear constrained derivative-free optimization that introduces an "inner boundary path" to locally convexify the feasible domain and ensure feasible trial steps. We extend this deterministic method via a generalized trust region approach that accounts for noisy evaluations of the objective and constraints. To reduce the impact of noise, we fit consistent Gaussian processes to past objective and constraint evaluations. Our approach incorporates a wide variety of probabilistic risk or deviation measures in both the objective and the constraints. Numerical benchmarking demonstrates SNOWPAC's efficiency and highlights the accuracy of the optimization solutions found.
BibTeX entry
@article{art_55, author = "F. Augustin and Y. M. Marzouk", journal = "Preprint", title = "A trust region method for derivative-free nonlinear constrained stochastic optimization", year = "2017", abstract = "We present the algorithm SNOWPAC for derivative-free constrained stochastic optimization. The algorithm builds on a model-based approach for deterministic nonlinear constrained derivative-free optimization that introduces an ``inner boundary path'' to locally convexify the feasible domain and ensure feasible trial steps. We extend this deterministic method via a generalized trust region approach that accounts for noisy evaluations of the objective and constraints. To reduce the impact of noise, we fit consistent Gaussian processes to past objective and constraint evaluations. Our approach incorporates a wide variety of probabilistic risk or deviation measures in both the objective and the constraints. Numerical benchmarking demonstrates SNOWPAC's efficiency and highlights the accuracy of the optimization solutions found." }
Sequential Bayesian optimal experimental design via approximate dynamic programming
The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dynamic program. Batch and greedy designs are shown to result from special cases of this formulation. We then focus on sOED for parameter inference, adopting a Bayesian formulation with an information theoretic design objective. To make the problem tractable, we develop new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces. We approximate the optimal policy by using backward induction with regression to construct and refine value function approximations in the dynamic program. The proposed algorithm iteratively generates trajectories via exploration and exploitation to improve approximation accuracy in frequently visited regions of the state space. Numerical results are verified against analytical solutions in a linear-Gaussian setting. Advantages over batch and greedy design are then demonstrated on a nonlinear source inversion problem where we seek an optimal policy for sequential sensing.
BibTeX entry
@article{huan-sequential-2016, author = "X. Huan and Y. M. Marzouk", journal = "Preprint", title = "Sequential Bayesian optimal experimental design via approximate dynamic programming", year = "2016", abstract = "The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dynamic program. Batch and greedy designs are shown to result from special cases of this formulation. We then focus on sOED for parameter inference, adopting a Bayesian formulation with an information theoretic design objective. To make the problem tractable, we develop new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces. We approximate the optimal policy by using backward induction with regression to construct and refine value function approximations in the dynamic program. The proposed algorithm iteratively generates trajectories via exploration and exploitation to improve approximation accuracy in frequently visited regions of the state space. Numerical results are verified against analytical solutions in a linear-Gaussian setting. Advantages over batch and greedy design are then demonstrated on a nonlinear source inversion problem where we seek an optimal policy for sequential sensing.", keywords = "sequential experimental design, Bayesian experimental design, approximate dynamic programming, feedback control policy, lookahead, approximate value iteration, information gain" }
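Backward induction itself can be illustrated on a toy finite-horizon problem with known transitions and rewards. The paper's setting is far harder, with continuous parameter, design, and observation spaces handled by regression-based value function approximation; all sizes and distributions below are arbitrary.

```python
import numpy as np

# Tiny finite-horizon dynamic program solved exactly by backward induction:
# states 0..S-1, actions 0..A-1, transitions P[a, s, s'], rewards R[s, a].
rng = np.random.default_rng(2)
S, A, T = 4, 2, 5
P = rng.dirichlet(np.ones(S), size=(A, S))   # each P[a, s, :] is a distribution
R = rng.uniform(size=(S, A))                 # nonnegative one-step rewards

V = np.zeros(S)                              # terminal value function
policy = np.zeros((T, S), dtype=int)
for t in reversed(range(T)):
    # Q[s, a] = R[s, a] + E[ V(next state) | s, a ]
    Q = R + np.stack([P[a] @ V for a in range(A)], axis=1)
    policy[t] = np.argmax(Q, axis=1)         # optimal decision rule at stage t
    V = np.max(Q, axis=1)                    # value-to-go at stage t
print(V)   # optimal expected reward-to-go from each initial state
```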
Bayesian level sets for image segmentation
This paper presents a new algorithm for image segmentation and classification, Bayesian Level Sets (BLS). BLS harnesses the advantages of two well-known algorithms: variational level sets and finite mixture model EM (FMM-EM). Like FMM-EM, BLS has a simple, probabilistic implementation which natively extends to an arbitrary number of classes. Via a level set-inspired geometric prior, BLS returns smooth, regular segmenting contours that are robust to noise. In practice, BLS is also observed to be robust to fairly lenient initial conditions. A comparative analysis of the three algorithms (BLS, level set, FMM-EM) is presented, and the advantages of BLS are quantitatively demonstrated on realistic applications such as pluripotent stem cell colonies, brain MRI phantoms, and stem cell nuclei.
BibTeX entry
@article{art_39, author = "N. Lowry and R. Mangoubi and M. Desai and Y. M. Marzouk and P. Sammak", journal = "Preprint", title = "Bayesian level sets for image segmentation", year = "2015", abstract = "This paper presents a new algorithm for image segmentation and classification, Bayesian Level Sets (BLS). BLS harnesses the advantages of two well-known algorithms: variational level sets and finite mixture model EM (FMM-EM). Like FMM-EM, BLS has a simple, probabilistic implementation which natively extends to an arbitrary number of classes. Via a level set-inspired geometric prior, BLS returns smooth, regular segmenting contours that are robust to noise. In practice, BLS is also observed to be robust to fairly lenient initial conditions. A comparative analysis of the three algorithms (BLS, level set, FMM-EM) is presented, and the advantages of BLS are quantitatively demonstrated on realistic applications such as pluripotent stem cell colonies, brain MRI phantoms, and stem cell nuclei." }
NOWPAC: A provably convergent derivative-free nonlinear optimizer with path-augmented constraints
This paper proposes the algorithm NOWPAC (Nonlinear Optimization With Path-Augmented Constraints) for nonlinear constrained derivative-free optimization. The algorithm uses a trust region framework based on fully linear models for the objective function and the constraints. A new constraint-handling scheme based on an inner boundary path allows for the computation of feasible trial steps using models for the constraints. We prove that the iterates computed by NOWPAC converge to a local first order critical point. We also discuss the convergence of NOWPAC in situations where evaluations of the objective function or the constraints are inexact, e.g., corrupted by numerical errors. For this, we determine a rate of decay that the magnitude of these numerical errors must satisfy, while approaching the critical point, to guarantee convergence. In settings where adjusting the accuracy of the objective or constraint evaluations is not possible, as is often the case in practical applications, we introduce an error indicator to detect these regimes and prevent deterioration of the optimization results.
BibTeX entry
@article{augustin-nowpac-2014, author = "F. Augustin and Y. M. Marzouk", journal = "Preprint", title = "NOWPAC: A provably convergent derivative-free nonlinear optimizer with path-augmented constraints", year = "2014", abstract = "This paper proposes the algorithm NOWPAC (Nonlinear Optimization With Path-Augmented Constraints) for nonlinear constrained derivative-free optimization. The algorithm uses a trust region framework based on fully linear models for the objective function and the constraints. A new constraint-handling scheme based on an inner boundary path allows for the computation of feasible trial steps using models for the constraints. We prove that the iterates computed by NOWPAC converge to a local first order critical point. We also discuss the convergence of NOWPAC in situations where evaluations of the objective function or the constraints are inexact, e.g., corrupted by numerical errors. For this, we determine a rate of decay that the magnitude of these numerical errors must satisfy, while approaching the critical point, to guarantee convergence. In settings where adjusting the accuracy of the objective or constraint evaluations is not possible, as is often the case in practical applications, we introduce an error indicator to detect these regimes and prevent deterioration of the optimization results." }
Announcements
February 2024
Congratulations to Fengyi Li for a successful thesis defense!
January 2024
The UQ Group welcomes new postdocs Ayoub Belhadji, Timo Schorlepp, and Mirjeta Pasha!
September 2023
The UQ Group welcomes new graduate students Oliver Wang and Julie Zhu!
July 2023
Welcome to our new postdoc Mathieu Le Provost!
May 2023
Congratulations to Michael Brennan for successfully defending his PhD thesis!
September 2022
Welcome to Hannah Lu, joining the group on the heels of finishing her Ph.D. at Stanford!
May 2022
Congratulations to Ricardo Baptista for successfully defending his PhD thesis!