Physical phenomena are described by models that span a wide range of spatio-temporal scales (multi-scale) and scientific principles (multi-physics). Typical examples include geological flows, climate modeling, multi-scale material modeling and protein folding. Carrying out engineering tasks such as uncertainty quantification and design with the full complexity of these models runs into prohibitive computational barriers. The ultimate goal of my research is to make the routine use of complex, realistic models feasible for scientists and engineers. I believe that two seemingly different research areas are necessary to accomplish this goal. Firstly, new mathematical principles must be developed that allow us to bridge the gaps between model layers (coarse-graining). Secondly, computationally expensive models have to be replaced by cheaper alternatives. Both areas are summarized in the following paragraphs.
Bridging the Gaps¶
This problem is easier to visualize with an example from my current research. Suppose that our goal is to design, from first principles, an alloy with particular extreme properties. This is essentially an inverse problem: we set some target properties and look for the configuration of an alloy that realizes them, or comes close to doing so. The degrees of freedom of the complex model include the atomic lattice type plus a motif, the configuration of the alloy (which atom sits on which lattice site), and the vibrational and electronic degrees of freedom. Solving the inverse problem using all of these degrees of freedom is next to impossible. Therefore, we look for a coarse-grained description involving only the lattice type and the configuration. The question that arises is what the effective probabilistic description of the coarse-grained system is. What we seek is a well-founded mathematical principle that describes how information propagates from one model to the other.
Background and Experience¶
The relative entropy measures the discrepancy between two probability distributions. One may think of it as a generalization of the entropy (which, up to sign and an additive constant, measures how far a distribution is from the uniform one). The relative entropy principle states that the "best" probabilistic description of the coarse-grained system is the one, among a set of candidate descriptions, that minimizes the relative entropy with respect to the true description. Remarkably, the minimization problem has a unique solution under quite general settings. In the past, we have used this method to develop a biasing potential method for the calculation of free energies with respect to arbitrary reaction coordinates, and to find a coarse-grained description of an expensive water model. Others have used it to study protein folding and to develop effective peptide models. Most recently, we were able to apply it to the alloy problem described above.
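As a rough illustration of the principle, the sketch below fits a one-parameter candidate family to a small discrete "true" distribution by minimizing the relative entropy over a grid. Everything in it is an assumption made for illustration: the four-state distribution `p_true`, the scalar feature, the exponential form of the candidates, and the direction D(p || q) of the relative entropy. It is not the method used in the work described above.

```python
import numpy as np

def relative_entropy(p, q):
    """Relative entropy D(p || q) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # states with p = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def coarse_model(theta, features):
    """Candidate q_theta(x) proportional to exp(-theta . phi(x))."""
    logits = -features @ theta
    logits -= logits.max()  # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Hypothetical "true" coarse-grained distribution over four states.
p_true = np.array([0.1, 0.2, 0.3, 0.4])
# Hypothetical feature phi(x): one scalar per state.
features = np.array([[0.0], [1.0], [2.0], [3.0]])

# Brute-force search over the single parameter (fine for a toy problem).
thetas = np.linspace(-2.0, 2.0, 401)
kl_values = np.array(
    [relative_entropy(p_true, coarse_model(np.array([t]), features)) for t in thetas]
)
best_theta = thetas[np.argmin(kl_values)]
print("best theta:", best_theta, "relative entropy:", kl_values.min())
```

For this exponential family, the minimizer matches the expected feature value under the candidate to the one under `p_true`, which gives a simple sanity check on the result.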
The relative entropy principle is far more general than these applications suggest: it can provide the glue between any two model layers. In the next 3-5 years I plan to investigate it in greater depth. In addition to continuing my research on effective probabilistic descriptions of alloys, I will expand into two other application areas: 1) Estimating pseudo-potentials used in ab initio calculations and 2) Finding empirical potentials from ab initio calculations. These efforts will involve further elaboration of the following key ideas:
Coarse Model Selection¶
It is often the case that the coarse-grained description of a model is the subject of debate. For example, when searching for an effective empirical potential there is a wide variety of descriptions available. We can use non-bonded pairwise interactions, interactions between three atoms, explicit bonds or even an embedded atom approach. How does one pick the best candidate? On the one hand, the final choice is dictated by the computational burden that can be tolerated for the task for which the model is destined. On the other hand, any form of coarse-graining results in loss of information, and in any given set of coarse models with comparable complexity some might perform better than others. The relative entropy principle can be stated in a way that allows the selection of the coarse model. This aspect is a completely unexplored research area with potentially large impact.
The generality of the relative entropy principle does not come for free. The details of the optimization problem are application dependent and, as a general rule, the problem is notoriously difficult to solve. The complicating factor is that an accurate estimate of the relative entropy and its derivatives requires the ability to simulate the fine scale system extensively. Since this is not possible under most circumstances, we can observe only a noisy version of these quantities. So far, we have been successful in developing efficient stochastic optimization schemes for solving this problem, borrowing ideas from the Machine Learning community. However, there is still a lot of room for further mathematical developments: 1) Necessary and sufficient conditions for the convergence of the algorithms must be sought; 2) Convergence criteria must be developed; 3) Theoretical or numerical criteria for selecting any algorithmic parameters must be found.
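To make the role of noise concrete, here is a minimal sketch of a stochastic approximation (Robbins-Monro) iteration that minimizes the relative entropy using only small batches of fine-scale samples. It is an illustrative stand-in, not the scheme developed in our work. The four-state distribution, the feature phi(x) = x, the batch size and the step-size schedule are all assumptions. For a candidate q_theta(x) proportional to exp(-theta * phi(x)), the gradient of D(p || q_theta) is E_p[phi] - E_q[phi], and the first term is replaced by a noisy sample average.

```python
import numpy as np

rng = np.random.default_rng(0)

states = np.arange(4.0)                  # feature phi(x) = x on four states
p_true = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical fine-scale distribution

def model_mean(theta):
    """E_q[phi] under q_theta(x) proportional to exp(-theta * phi(x))."""
    logits = -theta * states
    weights = np.exp(logits - logits.max())
    q = weights / weights.sum()
    return float(q @ states)

def noisy_gradient(theta, batch_size=5):
    """Unbiased, noisy estimate of d/dtheta D(p || q_theta).

    Only a handful of fine-scale samples are drawn per call, mimicking
    an expensive fine-scale simulation.
    """
    samples = rng.choice(states, size=batch_size, p=p_true)
    return samples.mean() - model_mean(theta)

theta = 0.0
for k in range(5000):
    # Decaying steps: their sum diverges while the sum of squares converges.
    step = 1.0 / (k + 10)
    theta -= step * noisy_gradient(theta)

print("theta:", theta, "model mean:", model_mean(theta))
```

The open questions listed above show up directly in this toy: the step-size schedule and the stopping rule are chosen by hand here.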
Quantifying Lost Information¶
How does the epistemic uncertainty induced by the finite number of observations of the fine scale model transfer to the coarse-grained description? Can its impact be quantified? Answering these questions is important in at least two ways: 1) The result could be used to guide the selection of the fine scale states we observe (experimental design towards improving the model); 2) The uncertainty could be propagated further up the scales in a hierarchical manner and be reflected in the quantities of interest.
Uncertainty Quantification, Inverse Problems and Design under Uncertainty¶
Various research groups and national laboratories have spent many years developing sophisticated software for modeling realistic physical phenomena. Once the code matures, the ultimate goal is to apply it to common engineering tasks such as uncertainty quantification (studying the effect of uncertain inputs on the output), inverse problems (setting target properties and looking for inputs that realize them) and design under uncertainty (optimizing a quantity of interest while some inputs are beyond control). Two common factors make all these problems extremely difficult to solve: 1) A probabilistic model must be constructed for the input (stochastic input modeling); 2) The forward model might be very expensive to evaluate, even on modern petascale and exascale supercomputers.
To clarify these ideas, consider the problem of optimizing the output of an oil reservoir. For the sake of argument, assume that the only variables we can control are the locations of the wells. In addition to these, there are random high-dimensional variables beyond our control (e.g. the soil permeability). The randomness is not inherent, but represents our lack of knowledge. The first task is to construct an input model for the spatial random fields that are beyond our control. We do this by: 1) Constructing a flexible prior model; 2) Restricting it to satisfy any experimental data we might have available (e.g. measurements from exploration wells and seismic data). The first step is a difficult endeavor of its own, complicated by the high dimensionality of the fields. The second step hides an inverse problem (e.g. seismic inversion) involving a very expensive forward model. Once the first task is complete, a design problem under uncertainty must be solved. This may be carried out with stochastic optimization schemes, albeit with a very expensive objective function.
A realistic solution of the inverse problem in the first task and the design problem in the second task requires replacing the computer codes with cheap-to-evaluate surrogates.
Background and Experience¶
During the past three years I have had the opportunity to work extensively on the construction of computer surrogates for uncertainty quantification tasks. We worked with several models, the most successful of which were based on Gaussian Process regression and Probabilistic Graphical Models. We developed models that were able to capture discontinuities or sharp variations in the stochastic space, and to model the correlation between distinct stochastic variables as well as spatio-temporal ones. By formulating the surrogate construction problem in a Bayesian framework, we were able to develop techniques that adaptively choose the simulations to be performed while quantifying the effect of the finite number of observations on the quantities of interest. With regard to the stochastic input modeling problem, we used non-linear dimensionality reduction techniques and Probabilistic Graphical Models to address the high-dimensionality issue.
In the immediate future, I plan to continue my work on uncertainty quantification, inverse problems and design under uncertainty, driven by applications of interest to national laboratories and industry (physical modeling in random media/materials, complex interconnected systems such as aircraft engines, geological and reservoir modeling, etc.). Many of these tasks will benefit from collaboration with colleagues who provide mathematical and physical modeling expertise in these application areas. The two core areas of this research are elaborated below:
Data-driven Stochastic Input Modeling¶
Even though this part should not be considered independently from the surrogate creation problem, I treat it as a separate topic in order to emphasize the special developments it requires. It is a prerequisite to any engineering application under uncertainty. The major complicating factors are the scarcity of experimental observations in combination with high-dimensionality. The key aspects that need to be explored are dimensionality reduction and density estimation.
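The dimensionality reduction step can be sketched with plain linear principal component analysis on a handful of field realizations. This is a deliberately simple stand-in for the non-linear techniques mentioned earlier. The synthetic fields, the grid, the number of realizations and the 95% variance threshold are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 50 observed realizations of a 1D field on 200 grid points,
# built from 10 sinusoidal modes with decaying amplitudes.
n_samples, n_grid, n_modes = 50, 200, 10
x = np.linspace(0.0, 1.0, n_grid)
modes = np.array([np.sin(j * np.pi * x) / j**2 for j in range(1, n_modes + 1)])
fields = rng.standard_normal((n_samples, n_modes)) @ modes

# Linear PCA via the SVD of the centered data matrix.
mean_field = fields.mean(axis=0)
centered = fields - mean_field
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Keep the smallest number of components explaining at least 95% of the variance.
explained = np.cumsum(singular_values**2) / np.sum(singular_values**2)
n_keep = int(np.searchsorted(explained, 0.95) + 1)

# Reduced coordinates (the input to a later density estimation step)
# and the reconstruction of the fields from them.
reduced = centered @ components[:n_keep].T
reconstructed = mean_field + reduced @ components[:n_keep]
rel_error = np.linalg.norm(fields - reconstructed) / np.linalg.norm(fields)
print("components kept:", n_keep, "relative reconstruction error:", rel_error)
```

With scarce observations, the number of realizations bounds the number of directions that can be identified at all, which is one reason the density estimation on the reduced coordinates needs separate care.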
The problem of constructing a surrogate of a very expensive computer code faces several obstacles. A quick summary is as follows:
- Discontinuities and Localized Features: In general, a realistic physical model would exhibit discontinuities or localized features with respect to stochastic or control parameters. Furthermore, these regions are also the most important in engineering applications. Discontinuities affect the stability of the system, while localized features usually enclose the design solutions. Capturing them requires the development of non-stationary models (e.g. treed Gaussian Processes or Probabilistic Graphical Models with different potentials on each node) in combination with Active Learning techniques (see below).
- Limited Data: The most important difficulty in constructing a surrogate is the impact of the finite number of observations on the quantities of interest. This can be captured by formulating the problem in a fully Bayesian way. Contrary to traditional methodologies, we are not seeking a unique surrogate but a probability measure over the space of possible surrogates. The spread of this probability measure reflects the epistemic uncertainty induced by the finite number of observations.
- Active Learning: Which are the most informative simulations that we should perform? The answer to this question depends on the application for which the surrogate is destined. For example, in uncertainty quantification and in inverse problems, the decision must be weighted by the probability distribution of the inputs. In design problems, the answer also depends on the quantity being optimized. In general, non-stationary features should also affect the criterion we choose. The Bayesian formulation allows for a rigorous derivation of active learning schemes.
- High-Dimensionality: Even after dimensionality reduction, the stochastic input model will still involve a considerable number of variables. However, in many problems the stochastic input is a spatially varying random field whose impact on the response surface might be local. Such ideas have been used successfully by the Multi-scale Finite Element method for the study of flow through porous media. In order to exploit locality, I plan to develop spatially restricted surrogates that communicate globally.
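The Limited Data and Active Learning items above can be illustrated with a small Gaussian Process surrogate whose posterior variance, weighted by the input density, decides which simulation to run next. This is a minimal sketch under stated assumptions: a stationary squared-exponential kernel with fixed hyperparameters (not the non-stationary models discussed above), a cheap one-dimensional function `expensive_model` standing in for a costly code, and a hypothetical Gaussian-shaped input density.

```python
import numpy as np

def sq_exp_kernel(a, b, length=0.2, variance=1.0):
    """Squared-exponential covariance between two sets of 1D points."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    K = sq_exp_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = sq_exp_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = sq_exp_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def expensive_model(x):
    """Stand-in for a costly simulation (hypothetical)."""
    return np.sin(6.0 * x) + 0.5 * x

def input_density(x):
    """Hypothetical, unnormalized density of the uncertain input."""
    return np.exp(-0.5 * ((x - 0.5) / 0.2) ** 2)

candidates = np.linspace(0.0, 1.0, 101)
x_train = np.array([0.0, 1.0])
y_train = expensive_model(x_train)

for _ in range(8):
    _, var = gp_posterior(x_train, y_train, candidates)
    # Acquisition: epistemic uncertainty weighted by how likely the input is.
    score = var * input_density(candidates)
    x_next = candidates[np.argmax(score)]
    x_train = np.append(x_train, x_next)
    y_train = np.append(y_train, expensive_model(x_next))

mean, var = gp_posterior(x_train, y_train, candidates)
max_error = np.max(np.abs(mean - expensive_model(candidates)))
print("simulations used:", len(x_train), "max surrogate error:", max_error)
```

The posterior variance plays the role of the probability measure over surrogates: it shrinks where simulations have been run, and the density weighting steers new runs towards inputs that matter for uncertainty quantification. A design problem would call for a different acquisition rule.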
Discovering hidden parts of the natural world and solving problems that can contribute to the scientific and technological miracle of the last two centuries are among the most enjoyable activities I have taken part in. My long-term goal is to study multi-scale/multi-physics systems in much greater depth. I am intrigued by the sheer number of fundamental questions that remain open. During the next decade, I expect a gradual increase in multi-scale models that are powered by both first principles and experimental data. In a Bayesian setting, the prior would be an uncertain coarse model conditioned on observations. Furthermore, as computational power increases, multi-scale models will involve more and more scales in a hierarchical manner (e.g. from pseudo-potentials, to molecular dynamics, to microstructure evolution, to continuum mechanics). Information loss on each level would have to be quantified and propagated to the next one. Experimental data might be available on some levels and not on others. The main obstacle will still be high-dimensionality, but at a completely different order of magnitude. The advent of exascale computing will dramatically increase the size of the problems we can solve, although it will also change the way we practice computational science (e.g. communication models would have to be redesigned, since the probability of failure is non-negligible when using millions of cores). My basic training in Applied Mathematics, in combination with my experience working in an engineering materials group during my PhD, allows me to act as a communication channel between various disciplines. This is an asset that I am eager to exploit by seeking collaborations with scientists from diverse disciplines (e.g. computational sustainability and medical applications).