Guiding Global Placement With Reinforcement Learning
Subtitled arXiv preprint arXiv:2109.02631, R. Kirby, K. Nottingham, R. Roy, S. Godil, B. Catanzaro. 2021.
Recent advances in GPU accelerated global and detail placement have reduced the time to solution by an order of magnitude. This advancement allows us to leverage data driven optimization (such as Reinforcement Learning) in an effort to improve the final quality of placement results. In this work we augment state-of-the-art, force-based global placement solvers with a reinforcement learning agent trained to improve the final detail placed Half Perimeter Wire Length (HPWL). We propose novel control schemes with either global or localized control of the placement process. We then train reinforcement learning agents to use these controls to guide placement to improved solutions. In both cases, the augmented optimizer finds improved placement solutions. Our trained agents achieve an average 1% improvement in final detail place HPWL across a range of academic benchmarks and more than 1% in global place HPWL on real industry designs.
Non-Dissipative and Structure-Preserving Emulators via Spherical Optimization
Subtitled arXiv:2108.12053, D. Dai, Y. Epshteyn, A. Narayan. 2021.
Approximating a function with a finite series, eg, involving polynomials or trigonometric functions, is a critical tool in computing and data analysis. The construction of such approximations via now-standard approaches like least squares or compressive sampling does not ensure that the approximation adheres to certain convex linear structural constraints, such as positivity or monotonicity. Existing approaches that ensure such structure are norm-dissipative and this can have a deleterious impact when applying these approaches, eg, when numerical solving partial differential equations. We present a new framework that enforces via optimization such structure on approximations and is simultaneously norm-preserving. This results in a conceptually simple convex optimization problem on the sphere, but the feasible set for such problems can be very complex. We establish well-posedness of the optimization problem through results on spherical convexity and design several spherical-projection-based algorithms to numerically compute the solution. Finally, we demonstrate the effectiveness of this approach through several numerical examples.
|Robust topology optimization with low rank approximation using artificial neural networks,
V. Keshavarzzadeh, R. M. Kirby, A. Narayan. In Computational Mechanics, 2021.
We present a low rank approximation approach for topology optimization of parametrized linear elastic structures. The parametrization is considered on loading and stiffness of the structure. The low rank approximation is achieved by identifying a parametric connection among coarse finite element models of the structure (associated with different design iterates) and is used to inform the high fidelity finite element analysis. We build an Artificial Neural Network (ANN) map between low resolution design iterates and their corresponding interpolative coefficients (obtained from low rank approximations) and use this surrogate to perform high resolution parametric topology optimization. We demonstrate our approach on robust topology optimization with compliance constraints/objective functions and develop error bounds for the the parametric compliance computations. We verify these parametric computations with more challenging quantities of interest such as the p-norm of von Mises stress. To conclude, we use our approach on a 3D robust topology optimization and show significant reduction in computational cost via quantitative measures.
N. Truong, C. Yuksel, C. Watcharopas, J. A. Levine, R. M. Kirby. In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2021.
Robustly handling collisions between individual particles in a large particle-based simulation has been a challenging problem. We introduce particle merging-and-splitting, a simple scheme for robustly handling collisions between particles that prevents inter-penetrations of separate objects without introducing numerical instabilities. This scheme merges colliding particles at the beginning of the time-step and then splits them at the end of the time-step. Thus, collisions last for the duration of a time-step, allowing neighboring particles of the colliding particles to influence each other. We show that our merging-and-splitting method is effective in robustly handling collisions and avoiding penetrations in particle-based simulations. We also show how our merging-and-splitting approach can be used for coupling different simulation systems using different and otherwise incompatible integrators. We present simulation tests …
Multifidelity Modeling for Physics-Informed Neural Networks (PINNs)
Subtitled arXiv preprint arXiv:2106.13361, M. Penwarden, S. Zhe, A. Narayan, R. M. Kirby. 2021.
Multifidelity simulation methodologies are often used in an attempt to judiciously combine low-fidelity and high-fidelity simulation results in an accuracy-increasing, cost-saving way. Candidates for this approach are simulation methodologies for which there are fidelity differences connected with significant computational cost differences. Physics-informed Neural Networks (PINNs) are candidates for these types of approaches due to the significant difference in training times required when different fidelities (expressed in terms of architecture width and depth as well as optimization criteria) are employed. In this paper, we propose a particular multifidelity approach applied to PINNs that exploits low-rank structure. We demonstrate that width, depth, and optimization criteria can be used as parameters related to model fidelity, and show numerical justification of cost differences in training due to fidelity parameter choices. We test our multifidelity scheme on various canonical forward PDE models that have been presented in the emerging PINNs literature.
Facilitating Data Discovery for Large-scale Science Facilities using Knowledge Networks
Y. Qin, I. Rodero, M. Parashar. In 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), IEEE, pp. 651-660. 2021.
Large-scale multiuser scientific facilities, such as geographically distributed observatories, remote instruments, and experimental platforms, represent some of the largest national investments and can enable dramatic advances across many areas of science. Recent examples of such advances include the detection of gravitational waves and the imaging of a black hole’s event horizon. However, as the number of such facilities and their users grow, along with the complexity, diversity, and volumes of their data products, finding and accessing relevant data is becoming increasingly challenging, limiting the potential impact of facilities. These challenges are further amplified as scientists and application workflows increasingly try to integrate facilities’ data from diverse domains. In this paper, we leverage concepts underlying recommender systems, which are extremely effective in e-commerce, to address these data-discovery and data-access challenges for large-scale distributed scientific facilities. We first analyze data from facilities and identify and model user-query patterns in terms of facility location and spatial localities, domain-specific data models, and user associations. We then use this analysis to generate a knowledge graph and develop the collaborative knowledge-aware graph attention network (CKAT) recommendation model, which leverages graph neural networks (GNNs) to explicitly encode the collaborative signals through propagation and combine them with knowledge associations. Moreover, we integrate a knowledge-aware neural attention mechanism to enable the CKAT to pay more attention to key information while reducing irrelevant noise, thereby increasing the accuracy of the recommendations. We apply the proposed model on two real-world facility datasets and empirically demonstrate that the CKAT can effectively facilitate data discovery, significantly outperforming several compelling state-of-the-art baseline models.
Facilitating Staging-based Unstructured Mesh Processing to Support Hybrid In-Situ Workflows
Z. Wang, P. Subedi, M. Dorier, P.E. Davis, M. Parashar. In 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 960-964. 2021.
In-situ and in-transit processing alleviate the gap between the computing and I/O capabilities by scheduling data analytics close to the data source. Hybrid in-situ processing splits data analytics into two stages: the data processing that runs in-situ aims to extract regions of interest, which are then transferred to staging services for further in-transit analytics. To facilitate this type of hybrid in-situ processing, the data staging service needs to support complex intermediate data representations generated/consumed by the in-situ tasks. Unstructured (or irregular) mesh is one such derived data representation that is typically used and bridges simulation data and analytics. However, how staging services efficiently support unstructured mesh transfer and processing remains to be explored. This paper investigates design options for transferring and processing unstructured mesh data using staging services. Using polygonal mesh data as an example, we show that hybrid in-situ workflows with staging-based unstructured mesh processing can effectively support hybrid in-situ workflows, and can significantly decrease data movement overheads.
|Investigating In Situ Reduction via Lagrangian Representations for Cosmology and Seismology Applications,
S. Sane, C. R. Johnson, H. Childs. In Computational Science -- ICCS 2021, Springer International Publishing, pp. 436--450. 2021.
Although many types of computational simulations produce time-varying vector fields, subsequent analysis is often limited to single time slices due to excessive costs. Fortunately, a new approach using a Lagrangian representation can enable time-varying vector field analysis while mitigating these costs. With this approach, a Lagrangian representation is calculated while the simulation code is running, and the result is explored after the simulation. Importantly, the effectiveness of this approach varies based on the nature of the vector field, requiring in-depth investigation for each application area. With this study, we evaluate the effectiveness for previously unexplored cosmology and seismology applications. We do this by considering encumbrance (on the simulation) and accuracy (of the reconstructed result). To inform encumbrance, we integrated in situ infrastructure with two simulation codes, and evaluated on representative HPC environments, performing Lagrangian in situ reduction using GPUs as well as CPUs. To inform accuracy, our study conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. In all, we demonstrate effectiveness for both cosmology and seismology—time-varying vector fields from these domains can be reduced to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 80% of our experiments.
Structure-preserving Nonlinear Filtering for Continuous and Discontinuous Galerkin Spectral/hp Element Methods
Subtitled arXiv preprint arXiv:2106.08316, V. Zala, R. M. Kirby, A. Narayan. 2021.
Finite element simulations have been used to solve various partial differential equations (PDEs) that model physical, chemical, and biological phenomena. The resulting discretized solutions to PDEs often do not satisfy requisite physical properties, such as positivity or monotonicity. Such invalid solutions pose both modeling challenges, since the physical interpretation of simulation results is not possible, and computational challenges, since such properties may be required to advance the scheme. We, therefore, consider the problem of computing solutions that preserve these structural solution properties, which we enforce as additional constraints on the solution. We consider in particular the class of convex constraints, which includes positivity and monotonicity. By embedding such constraints as a postprocessing convex optimization procedure, we can compute solutions that satisfy general types of convex constraints. For certain types of constraints (including positivity and monotonicity), the optimization is a filter, i.e., a norm-decreasing operation. We provide a variety of tests on one-dimensional time-dependent PDEs that demonstrate the method's efficacy, and we empirically show that rates of convergence are unaffected by the inclusion of the constraints.
Leveraging user access patterns and advanced cyberinfrastructure to accelerate data delivery from shared-use scientific observatories
Y. Qin, I. Rodero, A. Simonet, C. Meertens, D. Reiner, J. Riley, M. Parashar. In Future Generation Computer Systems, North-Holland, pp. 14-27. 2021.
With the growing number and increasing availability of shared-use instruments and observatories, observational data is becoming an essential part of application workflows and contributor to scientific discoveries in a range of disciplines. However, the corresponding growth in the number of users accessing these facilities coupled with the expansion in the scale and variety of the data, is making it challenging for these facilities to ensure their data can be accessed, integrated, and analyzed in a timely manner, and is resulting significant demands on their cyberinfrastructure (CI). In this paper, we present the design of a push-based data delivery framework that leverages emerging in-network capabilities, along with data pre-fetching techniques based on a hybrid data management model. Specifically, we analyze data access traces for two large-scale observatories, Ocean Observatories Initiative (OOI) and Geodetic Facility for the Advancement of Geoscience (GAGE), to identify typical user access patterns and to develop a model that can be used for data pre-fetching. Furthermore, we evaluate our data pre-fetching model and the proposed framework using a simulation of the Virtual Data Collaboratory (VDC) platform that provides in-network data staging and processing capabilities. The results demonstrate that the ability of the framework to significantly improve data delivery performance and reduce network traffic at the observatories’ facilities.
The Exascale Framework for High Fidelity coupled Simulations (EFFIS): Enabling whole device modeling in fusion science
E. Suchyta, S. Klasky, N. Podhorszki, M. Wolf, A. Adesoji, C.S. Chang, J. Choi, P. E. Davis, J. Dominski, S. Ethier, I. Foster, K. Germaschewski, B. Geveci, C. Harris, K. A. Huck, Q. Liu, J. Logan, K. Mehta, G. Merlo, S. V. Moore, T. Munson, M. Parashar, D. Pugmire, M. S. Shephard, C. W. Smith, P. Subedi, L. Wan, R. Wang, S. Zhang. In The International Journal of High Performance Computing Applications, SAGE Publications, pp. 10943420211019119. 2021.
We present the Exascale Framework for High Fidelity coupled Simulations (EFFIS), a workflow and code coupling framework developed as part of the Whole Device Modeling Application (WDMApp) in the Exascale Computing Project.EFFIS consists of a library, command line utilities, and a collection of run-time daemons. Together, these software products enable users to easily compose and execute workflows that include: strong or weak coupling, in situ (or offline)analysis/visualization/monitoring, command-and-control actions, remote dashboard integration, and more. We describe WDMApp physics coupling cases and computer science requirements that motivate the design of the EFFIS framework. Furthermore, we explain the essential enabling technology that EFFIS leverages: ADIOS for performant data movement, PerfStubs/TAU for performance monitoring, and an advanced COUPLER for transforming coupling data from its native format to the representation needed by another application. Finally, we demonstrate EFFIS using coupled multi-simulation WDMApp workflows and exemplify how the framework supports the project’s needs. We show that EFFIS and its associated services for data movement, visualization, and performance collection does not introduce appreciable overhead to the WDMApp workflow and that the resource-dominant application’s idle time while waiting for data is minimal.
Sensitivity analysis of random linear differential–algebraic equations using system norms
R. Pulch, A. Narayan, T. Stykel. In Journal of Computational and Applied Mathematics, North-Holland, pp. 113666. 2021.
We consider linear dynamical systems composed of differential–algebraic equations (DAEs), where a quantity of interest (QoI) is assigned as output. Physical parameters of a system are modelled as random variables to quantify uncertainty, and we investigate a variance-based sensitivity analysis of the random QoI. Based on expansions via generalised polynomial chaos, the stochastic Galerkin method yields a new deterministic system of DAEs of high dimension. We define sensitivity measures by system norms, ie, the H∞-norm of the transfer function associated with the Galerkin system for different combinations of outputs. To ameliorate the enormous computational effort required to compute norms of high-dimensional systems, we apply balanced truncation, a particular method of model order reduction (MOR), to obtain a low-dimensional linear dynamical system that produces approximations of system norms …
Adaptive Density Tracking by Quadrature for Stochastic Differential Equations
Subtitled arXiv preprint arXiv:2105.08148, R. A. Moore, A. Narayan. 2021.
Density tracking by quadrature (DTQ) is a numerical procedure for computing solutions to Fokker-Planck equations that describe probability densities for stochastic differential equations (SDEs). In this paper, we extend upon existing tensorized DTQ procedures by utilizing a flexible quadrature rule that allows for unstructured, adaptive meshes. We propose and describe the procedure for -dimensions, and demonstrate that the resulting adaptive procedure is significantly more efficient than a tensorized approach. Although we consider two-dimensional examples, all our computational procedures are extendable to higher dimensional problems.
Budget-limited distribution learning in multifidelity problems
Subtitled arXiv preprint arXiv:2105.04599, Y. Xu, A. Narayan. 2021.
Multifidelity methods are widely used for statistical estimation of quantities of interest (QoIs) in uncertainty quantification using simulation codes of differing costs and accuracies. Many methods approximate numerical-valued statistics that represent only limited information of the QoIs. In this paper, we introduce a semi-parametric approach that aims to effectively describe the distribution of a scalar-valued QoI in the multifidelity setup. Under a linear model hypothesis, we propose an exploration-exploitation strategy to reconstruct the full distribution of a scalar-valued QoI using samples from a subset of low-fidelity regressors. We derive an informative asymptotic bound for the mean 1-Wasserstein distance between the estimator and the true distribution, and use it to adaptively allocate computational budget for parametric estimation and non-parametric reconstruction. Assuming the linear model is correct, we prove that such a procedure is consistent, and converges to the optimal policy (and hence optimal computational budget allocation) under an upper bound criterion as the budget goes to infinity. A major advantage of our approach compared to several other multifidelity methods is that it is automatic, and its implementation does not require a hierarchical model setup, cross-model information, or \textita priori known model statistics. Numerical experiments are provided in the end to support our theoretical analysis.
|Kernel optimization for Low-Rank Multi-Fidelity Algorithms,
M. Razi, M. Kirby, A. Narayan. In International Journal for Uncertainty Quantification, Begel House Inc., pp. 31-54. 2021.
One of the major challenges for low-rank multi-fidelity (MF) approaches is the assumption that low-fidelity (LF) and high-fidelity (HF) models admit``similar''low-rank kernel representations. Low-rank MF methods have traditionally attempted to exploit low-rank representations of\emph linear kernels. However, such linear kernels may not be able to capture low-rank behavior, and they may admit LF and HF kernels that are not similar. Such a situation renders a naive approach to low-rank MF procedures ineffective. In this paper, we propose a novel approach for the selection of a near-optimal kernel function for use in low-rank MF methods. The proposed framework is a two-step strategy wherein:(1) hyperparameters of a library of kernel functions are optimized, and (2) a particular combination of of the optimized kernels is selected, through either a convex mixture (Additive Kernel Approach) or through a data-driven …
Randomized weakly admissible meshes
Subtitled arXiv preprint arXiv:2101.04043, Y. Xu, A. Narayan. 2021.
A weakly admissible mesh (WAM) on a continuum real-valued domain is a sequence of discrete grids such that the discrete maximum norm of polynomials on the grid is comparable to the supremum norm of polynomials on the domain. The asymptotic rate of growth of the grid sizes and of the comparability constant must grow in a controlled manner. In this paper we generalize the notion of a WAM to a hierarchical subspaces of not necessarily polynomial functions, and we analyze particular strategies for random sampling as a technique for generating WAMs. Our main results show that WAM's and their stronger variant, admissible meshes, can be generated by random sampling, and our analysis provides concrete estimates for growth of both the meshes and the discrete-continuum comparability constants.
Multilevel Designed Quadrature for Partial Differential Equations with Random Inputs
V. Keshavarzzadeh, R. M. Kirby, A. Narayan. In SIAM Journal on Scientific Computing, Vol. 43, No. 2, Society for Industrial and Applied Mathematics, pp. A1412-A1440. 2021.
We introduce a numerical method, multilevel designed quadrature for computing the statistical solution of partial differential equations with random input data. Similar to multilevel Monte Carlo methods, our method relies on hierarchical spatial approximations in addition to a parametric/stochastic sampling strategy. A key ingredient in multilevel methods is the relationship between the spatial accuracy at each level and the number of stochastic samples required to achieve that accuracy. Our sampling is based on flexible quadrature points that are designed for a prescribed accuracy, which can yield less overall computational cost compared to alternative multilevel methods. We propose a constrained optimization problem that determines the number of samples to balance the approximation error with the computational budget. We further show that the optimization problem is convex and derive analytic formulas for the optimal number of points at each level. We validate the theoretical estimates and the performance of our multilevel method via numerical examples on a linear elasticity and a steady state heat diffusion problem.
On the computation of recurrence coefficients for univariate orthogonal polynomials
Subtitled arXiv preprint arXiv:2101.11963, Z. Liu, A. Narayan. 2021.
Associated to a finite measure on the real line with finite moments are recurrence coefficients in a three-term formula for orthogonal polynomials with respect to this measure. These recurrence coefficients are frequently inputs to modern computational tools that facilitate evaluation and manipulation of polynomials with respect to the measure, and such tasks are foundational in numerical approximation and quadrature. Although the recurrence coefficients for classical measures are known explicitly, those for nonclassical measures must typically be numerically computed. We survey and review existing approaches for computing these recurrence coefficients for univariate orthogonal polynomial families and propose a novel" predictor-corrector" algorithm for a general class of continuous measures. We combine the predictor-corrector scheme with a stabilized Lanczos procedure for a new hybrid algorithm that computes recurrence coefficients for a fairly wide class of measures that can have both continuous and discrete parts. We evaluate the new algorithms against existing methods in terms of accuracy and efficiency.
An efficient method of calculating composition-dependent inter-diffusion coefficients based on compressed sensing method
Y. Qin, A. Narayan, K. Cheng, P. Wang. In Computational Materials Science, Vol. 188, Elsevier, pp. 110145. 2021.
Composition-dependent inter-diffusion coefficients are key parameters in many physical processes. Due to the under-determinedness of the governing diffusion equations, numerical methods either impose strict physical conditions on the samples or require a computationally onerous amount of data. To address such problems, we propose a novel inverse framework to recover the diffusion coefficients using a compressed sensing method, which in principle can be extended to alloy systems with arbitrary number of species. Comparing to conventional methods, the new approach does not impose any priori assumptions on the functional relationship between diffusion coefficients and concentrations, nor any preference on the locations of the samples, as long as it is in the diffused zone. It also requires much less data compared to least-squares approaches. Through a few numerical examples of ternary and quandary systems, we demonstrate the accuracy and robustness of the new method.
Fast Barycentric-Based Evaluation Over Spectral/hp Elements
Subtitled arXiv preprint arXiv:2103.03594, E. Laughton, V. Zala, A. Narayan, R. M. Kirby, D. Moxey. 2021.
As the use of spectral/hp element methods, and high-order finite element methods in general, continues to spread, community efforts to create efficient, optimized algorithms associated with fundamental high-order operations have grown. Core tasks such as solution expansion evaluation at quadrature points, stiffness and mass matrix generation, and matrix assembly have received tremendousattention. With the expansion of the types of problems to which high-order methods are applied, and correspondingly the growth in types of numerical tasks accomplished through high-order methods, the number and types of these core operations broaden. This work focuses on solution expansion evaluation at arbitrary points within an element. This operation is core to many postprocessing applications such as evaluation of streamlines and pathlines, as well as to field projection techniques such as mortaring. We expand barycentric interpolation techniques developed on an interval to 2D (triangles and quadrilaterals) and 3D (tetrahedra, prisms, pyramids, and hexahedra) spectral/hp element methods. We provide efficient algorithms for their implementations, and demonstrate their effectiveness using the spectral/hp element library Nektar++.