Efficient Implementation of Smoothness-Increasing Accuracy-Conserving (SIAC) Filters for Discontinuous Galerkin Solutions|
H. Mirzaee, J.K. Ryan, R.M. Kirby. In Journal of Scientific Computing, pp. (in press). 2011.
The discontinuous Galerkin (DG) methods provide a high-order extension of the finite volume method in much the same way as high-order or spectral/hp elements extend standard finite elements. However, lack of inter-element continuity is often contrary to the smoothness assumptions upon which many post-processing algorithms such as those used in visualization are based. Smoothness-increasing accuracy-conserving (SIAC) filters were proposed as a means of ameliorating the challenges introduced by the lack of regularity at element interfaces by eliminating the discontinuity between elements in a way that is consistent with the DG methodology; in particular, high-order accuracy is preserved and in many cases increased. The goal of this paper is to explicitly define the steps to efficient computation of this filtering technique as applied to both structured triangular and quadrilateral meshes. Furthermore, as the SIAC filter is a good candidate for parallelization, we provide, for the first time, results that confirm anticipated performance scaling when parallelized on a shared-memory multi-processor machine.
Numerical Solution of Linear Volterra Integral Equations of the Second Kind with Sharp Gradients|
S.A. Isaacson, R.M. Kirby. In Journal of Computational and Applied Mathematics, Vol. 235, No. 14, pp. 4283--4301. 2011.
Collocation methods are a well-developed approach for the numerical solution of smooth and weakly singular Volterra integral equations. In this paper, we extend these methods through the use of partitioned quadrature based on the qualocation framework, to allow the efficient numerical solution of linear, scalar Volterra integral equations of the second kind with smooth kernels containing sharp gradients. In this case, the standard collocation methods may lose computational efficiency despite the smoothness of the kernel. We illustrate how the qualocation framework can allow one to focus computational effort where necessary through improved quadrature approximations, while keeping the solution approximation fixed. The computational performance improvement introduced by our new method is examined through several test examples. The final example we consider is the original problem that motivated this work: the problem of calculating the probability density associated with a continuous-time random walk in three dimensions that may be killed at a fixed lattice site. To demonstrate how separating the solution approximation from quadrature approximation may improve computational performance, we also compare our new method to several existing Gregory, Sinc, and global spectral methods, where quadrature approximation and solution approximation are coupled.
Towards the Development on an h-p-Refinement Strategy Based Upon Error Estimate Sensitivity|
P.K. Jimack, R.M. Kirby. In Computers and Fluids, Vol. 46, No. 1, pp. 277--281. 2011.
The use of (a posteriori) error estimates is a fundamental tool in the application of adaptive numerical methods across a range of fluid flow problems. Such estimates are incomplete however, in that they do not necessarily indicate where to refine in order to achieve the most impact on the error, nor what type of refinement (for example h-refinement or p-refinement) will be best. This paper extends preliminary work of the authors (Comm Comp Phys, 2010;7:631–8), which uses adjoint-based sensitivity estimates in order to address these questions, to include application with p-refinement to arbitrary order and the use of practical a posteriori estimates. Results are presented which demonstrate that the proposed approach can guide both the h-refinement and the p-refinement processes, to yield improvements in the adaptive strategy compared to the use of more orthodox criteria.
To CG or to HDG: A Comparative Study|
R.M. Kirby, B. Cockburn, S.J. Sherwin. In Journal of Scientific Computing, Note: published online, 2011.
Hybridization through the border of the elements (hybrid unknowns) combined with a Schur complement procedure (often called static condensation in the context of continuous Galerkin linear elasticity computations) has in various forms been advocated in the mathematical and engineering literature as a means of accomplishing domain decomposition, of obtaining increased accuracy and convergence results, and of algorithm optimization. Recent work on the hybridization of mixed methods, and in particular of the discontinuous Galerkin (DG) method, holds the promise of capitalizing on the three aforementioned properties; in particular, of generating a numerical scheme that is discontinuous in both the primary and flux variables, is locally conservative, and is computationally competitive with traditional continuous Galerkin (CG) approaches. In this paper we present both implementation and optimization strategies for the Hybridizable Discontinuous Galerkin (HDG) method applied to two dimensional elliptic operators. We implement our HDG approach within a spectral/hp element framework so that comparisons can be done between HDG and the traditional CG approach.
We demonstrate that the HDG approach generates a global trace space system for the unknown that although larger in rank than the traditional static condensation system in CG, has significantly smaller bandwidth at moderate polynomial orders. We show that if one ignores set-up costs, above approximately fourth-degree polynomial expansions on triangles and quadrilaterals the HDG method can be made to be as efficient as the CG approach, making it competitive for time-dependent problems even before taking into consideration other properties of DG schemes such as their superconvergence properties and their ability to handle hp-adaptivity.
Formal Specification of MPI 2.0: Case Study in Specifying a Practical Concurrent Programming API|
G. Li, R. Palmer, M. DeLisi, G. Gopalakrishnan, R.M. Kirby. In Science of Computer Programming, Vol. 76, pp. 65--81. 2011.
We describe the first formal specification of a non-trivial subset of MPI, the dominant communication API in high performance computing. Engineering a formal specification for a non-trivial concurrency API requires the right combination of rigor, executability, and traceability, while also serving as a smooth elaboration of a pre-existing informal specification. It also requires the modularization of reusable specification components to keep the length of the specification in check. Long-lived APIs such as MPI are not usually 'textbook minimalistic' because they support a diverse array of applications, a diverse community of users, and have efficient implementations over decades of computing hardware. We choose the TLA+ notation to write our specifications, and describe how we organized the specification of around 200 of the 300 MPI 2.0 functions. We detail a handful of these functions in this paper, and assess our specification with respect to the aforementioned requirements. We close with a description of possible approaches that may help render the act of writing, understanding, and validating the specifications of concurrency APIs much more productive.
Direct Isosurface Visualization of Hex-Based High-Order Geometry and Attribute Representations|
T. Martin, E. Cohen, R.M. Kirby. In IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. PP, No. 99, pp. 1--14. 2011.
In this paper, we present a novel isosurface visualization technique that guarantees the accuarate visualization of isosurfaces with complex attribute data defined on (un-)structured (curvi-)linear hexahedral grids. Isosurfaces of high-order hexahedralbased finite element solutions on both uniform grids (including MRI and CT scans) and more complex geometry represent a domain of interest that can be rendered using our algorithm. Additionally, our technique can be used to directly visualize solutions and attributes in isogeometric analysis, an area based on trivariate high-order NURBS (Non-Uniform Rational B-splines) geometry and attribute representations for the analysis. Furthermore, our technique can be used to visualize isosurfaces of algebraic functions. Our approach combines subdivision and numerical root-finding to form a robust and efficient isosurface visualization algorithm that does not miss surface features, while finding all intersections between a view frustum and desired isosurfaces. This allows the use of view-independent transparency in the rendering process. We demonstrate our technique through a straightforward CPU implementation on both complexstructured and complex-unstructured geometry with high-order simulation solutions, isosurfaces of medical data sets, and isosurfaces of algebraic functions.
Finite element implementation of mechanochemical phenomena in neutral deformable porous media under finite deformation|
G.A. Ateshian, M.B. Albro, S.A. Maas, J.A. Weiss. In Journal of Biomechanical Engineering, Vol. 133, No. 8, 2011.
Biological soft tissues and cells may be subjected to mechanical as well as chemical (osmotic) loading under their natural physiological environment or various experimental conditions. The interaction of mechanical and chemical effects may be very significant under some of these conditions, yet the highly nonlinear nature of the set of governing equations describing these mechanisms poses a challenge for the modeling of such phenomena. This study formulated and implemented a finite element algorithm for analyzing mechanochemical events in neutral deformable porous media under finite deformation. The algorithm employed the framework of mixture theory to model the porous permeable solid matrix and interstitial fluid, where the fluid consists of a mixture of solvent and solute. A special emphasis was placed on solute-solid matrix interactions, such as solute exclusion from a fraction of the matrix pore space (solubility) and frictional momentum exchange that produces solute hindrance and pumping under certain dynamic loading conditions. The finite element formulation implemented full coupling of mechanical and chemical effects, providing a framework where material properties and response functions may depend on solid matrix strain as well as solute concentration. The implementation was validated using selected canonical problems for which analytical or alternative numerical solutions exist. This finite element code includes a number of unique features that enhance the modeling of mechanochemical phenomena in biological tissues. The code is available in the public domain, open source finite element program FEBio (http://mrl.sci.utah.edu/software). [DOI: 10.1115/1.4004810]
From h to p Efficiently: Selecting the Optimal Spectral/hp Discretisation in Three Dimensions|
C.D. Cantwell, S.J. Sherwin, R.M. Kirby, P.H.J. Kelly. In Mathematical Modelling of Natural Phenomena, Vol. 6, No. 3, pp. 84--96. 2011.
Finding consistent strain distributions in the glenohumeral capsule between two subjects: Implications for development of physical examinations|
N.J. Drury, B.J. Ellis, J.A. Weiss, P.J. McMahon, R.E. Debski. In Journal of Biomechanics, Vol. 44, No. 4, pp. 607-613. February, 2011.
The anterior-inferior glenohumeral capsule is the primary passive stabilizer to the glenohumeral joint during anterior dislocation. Physical examinations following dislocation are crucial for proper diagnosis of capsule pathology; however,they are not standardized for joint position which may lead to misdiagnoses and poor outcomes. To suggest joint positions for physical examinations where the stability provided by the capsule may be consistent among patients, the objective of this study was to evaluate the distribution of maximum principal strain on the anterior-inferior capsule using two validated subject-specific finite element models of the glenohumeral joint at clinically relevant joint positions. The joint positions with 25 N anterior load applied at 60° of glenohumeral abduction and 10°, 20°, 30° and 40° of external rotation resulted in distributions of strain that were similar between shoulders(r2≥0.7). Furthermore, those positions with 20-40° of external rotation resulted in capsule strains on the glenoid side of the anterior band of the inferior glenohumeral ligament that were significantly greater than in all other capsule regions. These findings suggest that anterior stability provided by the anterior-inferior capsule may be consistent among subjects at joint positions with 60° of glenohumeral abduction and a mid-range (20-40°) of external rotation, and that the glenoid side has the greatest contribution to stability at these joint positions. Therefore, it may be possible to establish standard joint positions for physical examinations that clinicians can use to effectively diagnose pathology in the anterior-inferior capsule following dislocation and lead to improved outcomes.
The capsule's contribution to total hip construct stability - a finite element analysis|
J.M. Elkins, J.S. Stroud, M.J. Rudert, Y. Tochigi, D.R. Pedersen, B.J. Ellis, J.J. Callaghan, J.A. Weiss, T.D. Brown. In Journal of Orthopedic Research, Vol. 29, No. 11, Note: William Harris, MD Award, pp. 1642--1648. November, 2011.
Instability is a significant concern in total hip arthroplasty (THA), particularly when there is structural compromise of the capsule due to pre-existing pathology or due to necessities of surgical approach. An experimentally grounded fiber-direction-based finite element model of the hip capsule was developed, and was integrated with an established three-dimensional model of impingement/dislocation. Model validity was established by close similarity to results from a cadaveric experiment in a servohydraulic hip simulator. Parametric computational runs explored effects of graded levels of capsule thickness, of regional detachment from the capsule's femoral or acetabular insertions, of surgical incisions of capsule substance, and of capsule defect repairs. Depending strongly upon the specific site, localized capsule defects caused varying degrees of construct stability compromise, with several specific situations involving over 60\% decrement in dislocation resistance. Construct stability was returned substantially toward intact-capsule levels following well-conceived repairs, although the suture sites involved were often at substantial risk of failure. These parametric model results underscore the importance of retaining or robustly repairing capsular structures in THA, in order to maximize overall construct stability. © 2011 Orthopaedic Research Society. Published by Wiley Periodicals, Inc. J Orthop Res 29:1642–1648, 2011
Defect Sampling in Global Error Estimation for ODEs and Method-Of-Lines PDEs Using Adjoint Methods|
SCI Technical Report, L.T. Tran, M. Berzins. No. UUSCI-2011-006, SCI Institute, University of Utah, 2011.
A Conservered Developmental Patterning Network Produces Quantitatively Different Output in Multiple Species of Drosophila|
C. Fowlkes, K. Eckenrode, M. Bragdon, M.D. Meyer, Z. Wunderlich, L. Simirenko, C. Luengo, S. Keranen, C. Henriquez, D. Knowles, M. Biggin, M. Eisen, A. DePace. In PLoS Genetics, Vol. 7, No. 10:e1002346, pp. 17 pages. October, 2011.
Differences in the level, timing, or location of gene expression can contribute to alternative phenotypes at the molecular and organismal level. Understanding the origins of expression differences is complicated by the fact that organismal morphology and gene regulatory networks could potentially vary even between closely related species. To assess the scope of such changes, we used high-resolution imaging methods to measure mRNA expression in blastoderm embryos of Drosophila yakuba and Drosophila pseudoobscura and assembled these data into cellular resolution atlases, where expression levels for 13 genes in the segmentation network are averaged into species-specific, cellular resolution morphological frameworks. We demonstrate that the blastoderm embryos of these species differ in their morphology in terms of size, shape, and number of nuclei. We present an approach to compare cellular gene expression patterns between species, while accounting for varying embryo morphology, and apply it to our data and an equivalent dataset for Drosophila melanogaster. Our analysis reveals that all individual genes differ quantitatively in their spatio-temporal expression patterns between these species, primarily in terms of their relative position and dynamics. Despite many small quantitative differences, cellular gene expression profiles for the whole set of genes examined are largely similar. This suggests that cell types at this stage of development are conserved, though they can differ in their relative position by up to 3-4 cell widths and in their relative proportion between species by as much as 5-fold. Quantitative differences in the dynamics and relative level of a subset of genes between corresponding cell types may reflect altered regulatory functions between species. Our results emphasize that transcriptional networks can diverge over short evolutionary timescales and that even small changes can lead to distinct output in terms of the placement and number of equivalent cells.
A fast iterative method for solving the Eikonal equation on triangulated surfaces|
Z. Fu, W.-K. Jeong, Y. Pan, R.M. Kirby, R.T. Whitaker. In SIAM Journal of Scientific Computing, Vol. 33, No. 5, pp. 2468--2488. 2011.
PubMed Central ID: PMC3360588
This paper presents an efficient, fine-grained parallel algorithm for solving the Eikonal equation on triangular meshes. The Eikonal equation, and the broader class of Hamilton–Jacobi equations to which it belongs, have a wide range of applications from geometric optics and seismology to biological modeling and analysis of geometry and images. The ability to solve such equations accurately and efficiently provides new capabilities for exploring and visualizing parameter spaces and for solving inverse problems that rely on such equations in the forward model. Efficient solvers on state-of-the-art, parallel architectures require new algorithms that are not, in many cases, optimal, but are better suited to synchronous updates of the solution. In previous work [W. K. Jeong and R. T. Whitaker, SIAM J. Sci. Comput., 30 (2008), pp. 2512–2534], the authors proposed the fast iterative method (FIM) to efficiently solve the Eikonal equation on regular grids. In this paper we extend the fast iterative method to solve Eikonal equations efficiently on triangulated domains on the CPU and on parallel architectures, including graphics processors. We propose a new local update scheme that provides solutions of first-order accuracy for both architectures. We also propose a novel triangle-based update scheme and its corresponding data structure for efficient irregular data mapping to parallel single-instruction multiple-data (SIMD) processors. We provide detailed descriptions of the implementations on a single CPU, a multicore CPU with shared memory, and SIMD architectures with comparative results against state-of-the-art Eikonal solvers.
Formal Analysis of MPI-Based Parallel Programs: Present and Future|
G. Gopalakrishnan, R.M. Kirby, S. Siegel, R. Thakur, W. Gropp, E. Lusk, B.R. de Supinski, M. Schultz, G. Bronevetsky. In Communications of the ACM, pp. (accepted). 2011.
Role of the acetabular labrum in load support across the hip joint|
C.R. Henak, B.J. Ellis, M.D. Harris, A.E. Anderson, C.L. Peters, J.A. Weiss. In Journal of Biomechanics, Vol. 44, No. 12, pp. 2201-2206. 2011.
The relatively high incidence of labral tears among patients presenting with hip pain suggests that the acetabular labrum is often subjected to injurious loading in vivo. However, it is unclear whether the labrum participates in load transfer across the joint during activities of daily living. This study examined the role of the acetabular labrum in load transfer for hips with normal acetabular geometry and acetabular dysplasia using subject-specific finite element analysis. Models were generated from volumetric CT data and analyzed with and without the labrum during activities of daily living. The labrum in the dysplastic model supported 4–11\% of the total load transferred across the joint, while the labrum in the normal model supported only 1–2\% of the total load. Despite the increased load transferred to the acetabular cartilage in simulations without the labrum, there were minimal differences in cartilage contact stresses. This was because the load supported by the cartilage correlated with the cartilage contact area. A higher percentage of load was transferred to the labrum in the dysplastic model because the femoral head achieved equilibrium near the lateral edge of the acetabulum. The results of this study suggest that the labrum plays a larger role in load transfer and joint stability in hips with acetabular dysplasia than in hips with normal acetabular geometry.
Interpreting Performance Data Across Intuitive Domains|
M. Schulz, J.A. Levine, P.-T. Bremer, T. Gamblin, V. Pascucci. In International Conference on Parallel Processing, Taipei, Taiwan, IEEE, pp. 206--215. 2011.
GPU-Based Interactive Cut-Surface Extraction From High-0rder Finite Element Fields|
B. Nelson, R. Haimes, R.M. Kirby. In IEEE Transactions on Visualization and Computer Graphics (IEEE Visualization Issue), Vol. 17, No. 12, pp. 1803--1811. 2011.
We present a GPU-based ray-tracing system for the accurate and interactive visualization of cut-surfaces through 3D simulations of physical processes created from spectral/hp high-order finite element methods. When used by the numerical analyst to debug the solver, the ability for the imagery to precisely reflect the data is critical. In practice, the investigator interactively selects from a palette of visualization tools to construct a scene that can answer a query of the data. This is effective as long as the implicit contract of image quality between the individual and the visualization system is upheld. OpenGL rendering of scientific visualizations has worked remarkably well for exploratory visualization for most solver results. This is due to the consistency between the use of first-order representations in the simulation and the linear assumptions inherent in OpenGL (planar fragments and color-space interpolation). Unfortunately, the contract is broken when the solver discretization is of higher-order. There have been attempts to mitigate this through the use of spatial adaptation and/or texture mapping. These methods do a better job of approximating what the imagery should be but are not exact and tend to be view-dependent. This paper introduces new rendering mechanisms that specifically deal with the kinds of native data generated by high-order finite element solvers. The exploratory visualization tools are reassessed and cast in this system with the focus on image accuracy. This is accomplished in a GPU setting to ensure interactivity.
A Toolkit for Forward/Inverse Problems in Electrocardiography within the SCIRun Problem Solving Environment|
B.M. Burton, J.D. Tate, B. Erem, D.J. Swenson, D.F. Wang, D.H. Brooks, P.M. van Dam, R.S. MacLeod. In Proceedings of the 2011 IEEE Int. Conf. Engineering and Biology Society (EMBC), pp. 267--270. 2011.
PubMed ID: 22254301
PubMed Central ID: PMC3337752
Computational modeling in electrocardiography often requires the examination of cardiac forward and inverse problems in order to non-invasively analyze physiological events that are otherwise inaccessible or unethical to explore. The study of these models can be performed in the open-source SCIRun problem solving environment developed at the Center for Integrative Biomedical Computing (CIBC). A new toolkit within SCIRun provides researchers with essential frameworks for constructing and manipulating electrocardiographic forward and inverse models in a highly efficient and interactive way. The toolkit contains sample networks, tutorials and documentation which direct users through SCIRun-specific approaches in the assembly and execution of these specific problems.
Morse Set Classification and Hierarchical Refinement using Conley Index|
Guoning Chen, Qingqing Deng, Andrzej Szymczak, Robert S. Laramee, and Eugene Zhang. In IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. 18, No. 5, pp. 767--782. June, 2011.
PubMed ID: 21690641
Morse decomposition provides a numerically stable topological representation of vector fields that is crucial for their rigorous interpretation. However, Morse decomposition is not unique, and its granularity directly impacts its computational cost. In this paper, we propose an automatic refinement scheme to construct the Morse Connection Graph (MCG) of a given vector field in a hierarchical fashion. Our framework allows a Morse set to be refined through a local update of the flow combinatorialization graph, as well as the connection regions between Morse sets. The computation is fast because the most expensive computation is concentrated on a small portion of the domain. Furthermore, the present work allows the generation of a topologically consistent hierarchy of MCGs, which cannot be obtained using a global method. The classification of the extracted Morse sets is a crucial step for the construction of the MCG, for which the Poincaré index is inadequate. We make use of an upper bound for the Conley index, provided by the Betti numbers of an index pair for a translation along the flow, to classify the Morse sets. This upper bound is sufficiently accurate for Morse set classification and provides supportive information for the automatic refinement process. An improved visualization technique for MCG is developed to incorporate the Conley indices. Finally, we apply the proposed techniques to a number of synthetic and real-world simulation data to demonstrate their utility.
|A wildland fire modeling and visualization environment,
J. Mandel, J.D. Beezley, A. Kochanski, V.Y. Kondratenko, L. Zhang, E. Anderson, J. Daniels II, C.T. Silva, C.R. Johnson. In Proceedings of the Ninth Symposium on Fire and Forest Meteorology, pp. (published online). 2011.