Image Analysis

SCI's imaging work addresses fundamental questions in 2D and 3D image processing, including filtering, segmentation, surface reconstruction, and shape analysis. In low-level image processing, this effort has produced new nonparametric methods for modeling image statistics, which have resulted in better algorithms for denoising and reconstruction. Work with particle systems has led to new methods for visualizing and analyzing 3D surfaces. Our work in image processing also includes applications of advanced computing to 3D images, which has resulted in new parallel algorithms and real-time implementations on graphics processing units (GPUs). Application areas include medical image analysis, biological image processing, defense, environmental monitoring, and oil and gas.



Ross Whitaker

Segmentation

Sarang Joshi

Shape Statistics
Segmentation
Brain Atlasing

Tolga Tasdizen

Image Processing
Machine Learning

Chris Johnson

Diffusion Tensor Analysis

Shireen Elhabian

Image Analysis
Computer Vision


Funded Research Projects:



Publications in Image Analysis:


Morphology of uranium oxides reduced from magnesium and sodium diuranate
A.M. Chalifoux, L. Gibb, K.N. Wurth, T. Tenner, T. Tasdizen, L. MacDonald. In Radiochimica Acta, Vol. 112, No. 2, pp. 73-84. 2024.

Morphological analysis of uranium materials has proven to be a key signature for nuclear forensic purposes. This study examines the morphological changes to magnesium diuranate (MDU) and sodium diuranate (SDU) during reduction in a 10% hydrogen atmosphere with and without steam present. Impurity concentrations of the materials were also examined pre and post reduction using energy dispersive X-ray spectroscopy combined with scanning electron microscopy (SEM-EDX). The structures of the MDU, SDU, and UOx samples were analyzed using powder X-ray diffraction (p-XRD). Using this method, UOx from MDU was found to be a mixture of UO2, U4O9, and MgU2O6 while UOx from SDU were combinations of UO2, U4O9, U3O8, and UO3. By SEM, the MDU and UOx from MDU had identical morphologies comprised of large agglomerates of rounded particles in an irregular pattern. SEM-EDX revealed pockets of high U and high Mg content distributed throughout the materials. The SDU and UOx from SDU had slightly different morphologies. The SDU consisted of massive agglomerates of platy sheets with rough surfaces. The UOx from SDU was comprised of massive agglomerates of acicular and sub-rounded particles that appeared slightly sintered. Backscatter images of SDU and related UOx materials showed sub-rounded dark spots indicating areas of high Na content, especially in UOx materials created in the presence of steam. SEM-EDX confirmed the presence of high sodium concentration spots in the SDU and UOx from SDU. Elemental compositions were found to not change between pre and post reduction of MDU and SDU indicating that reduction with or without steam does not affect Mg or Na concentrations. The identification of Mg and Na impurities using SEM analysis presents a readily accessible tool in nuclear material analysis with high Mg and Na impurities likely indicating processing via MDU or SDU, respectively.
Machine learning using convolutional neural networks (CNNs) found that the MDU and SDU had unique morphologies compared to previous publications and that there are distinguishing features between materials created with and without steam.



Leveraging computer vision for predicting collision risks: a cross-sectional analysis of 2019–2021 fatal collisions in the USA
Q.C. Nguyen, M. Alirezaei, X. Yue, H. Mane, D. Li, L. Zhao, T.T. Nguyen, R. Patel, W. Yu, M. Hu, D. Quistberg, T. Tasdizen. In Injury Prevention, BMJ, 2024.

Objective The USA has higher rates of fatal motor vehicle collisions than most high-income countries. Previous studies examining the role of the built environment were generally limited to small geographic areas or single cities. This study aims to quantify associations between built environment characteristics and traffic collisions in the USA.

Methods Built environment characteristics were derived from Google Street View images and summarised at the census tract level. Fatal traffic collisions were obtained from the 2019–2021 Fatality Analysis Reporting System. Fatal and non-fatal traffic collisions in Washington DC were obtained from the District Department of Transportation. Adjusted Poisson regression models examined whether built environment characteristics are related to motor vehicle collisions in the USA, controlling for census tract sociodemographic characteristics.

Results Census tracts in the highest tertile of sidewalks, single-lane roads, streetlights and street greenness had 70%, 50%, 30% and 26% fewer fatal vehicle collisions compared with those in the lowest tertile. Street greenness and single-lane roads were associated with 37% and 38% fewer pedestrian-involved and cyclist-involved fatal collisions. Analyses with fatal and non-fatal collisions in Washington DC found streetlights and stop signs were associated with fewer pedestrian-involved and cyclist-involved vehicle collisions while road construction had an adverse association.

Conclusion This study demonstrates the utility of using data algorithms that can automatically analyse street segments to create indicators of the built environment to enhance understanding of large-scale patterns and inform interventions to decrease road traffic injuries and fatalities.
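The percentages reported above come from Poisson regression models, which use a log link. As a minimal sketch (not the authors' code), the snippet below shows how a coefficient on the log scale maps to a rate ratio and a "percent fewer collisions" figure. The function name and the example coefficient are illustrative assumptions; the coefficient is chosen only so that it reproduces the 70% figure quoted in the Results.

```python
import math


def percent_change_from_coefficient(beta: float) -> float:
    """Convert a log-link (Poisson) regression coefficient to a percent change in the rate."""
    rate_ratio = math.exp(beta)  # exponentiating the coefficient gives the rate ratio
    return (rate_ratio - 1.0) * 100.0


# Hypothetical coefficient: a rate ratio of 0.30 corresponds to 70% fewer events.
beta = math.log(0.30)
pct = percent_change_from_coefficient(beta)
print(f"rate ratio = {math.exp(beta):.2f}, percent change = {pct:.0f}%")
```

A negative percent change means fewer collisions in the highest tertile than in the lowest (reference) tertile.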



Examining the role of passive design indicators in energy burden reduction: Insights from a machine learning and deep learning approach
S. Ghorbany, M. Hu, S. Yao, C. Wang, Q.C. Nguyen, X. Yue, M. Alirezaei, T. Tasdizen, M. Sisk. In Building and Environment, Elsevier, 2024.

Passive design characteristics (PDC) play a pivotal role in reducing the energy burden on households without imposing additional financial constraints on project stakeholders. However, the scarcity of PDC data has posed a challenge in previous studies when assessing their energy-saving impact. To tackle this issue, this research introduces an innovative approach that combines deep learning-powered computer vision with machine learning techniques to examine the relationship between PDC and energy burden in residential buildings. In this study, we employ a convolutional neural network computer vision model to identify and measure key indicators, including window-to-wall ratio (WWR), external shading, and operable window types, using Google Street View images within the Chicago metropolitan area as our case study. Subsequently, we utilize the derived passive design features in conjunction with demographic characteristics to train and compare various machine learning methods. These methods encompass Decision Tree Regression, Random Forest Regression, and Support Vector Regression, culminating in the development of a comprehensive model for energy burden prediction. Our framework achieves a 74.2 % accuracy in forecasting the average energy burden. These results yield invaluable insights for policymakers and urban planners, paving the way toward the realization of smart and sustainable cities.
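To make the window-to-wall ratio (WWR) indicator concrete, here is a minimal sketch of how it could be computed from a per-pixel facade segmentation. The label values, the toy mask, and the function name are assumptions for illustration, and WWR is taken here as window area divided by gross wall area (wall plus window pixels); the paper's actual pipeline may differ.

```python
# Hypothetical label convention for a segmented facade image.
BACKGROUND, WALL, WINDOW = 0, 1, 2


def window_to_wall_ratio(mask):
    """Return window pixels divided by gross wall pixels (wall + window)."""
    window = sum(row.count(WINDOW) for row in mask)
    wall = sum(row.count(WALL) for row in mask)
    gross_wall = window + wall
    if gross_wall == 0:
        raise ValueError("mask contains no wall or window pixels")
    return window / gross_wall


# Toy 4x6 mask: a facade with one 2x2 window.
mask = [
    [0, 1, 1, 1, 1, 0],
    [0, 1, 2, 2, 1, 0],
    [0, 1, 2, 2, 1, 0],
    [0, 1, 1, 1, 1, 0],
]
wwr = window_to_wall_ratio(mask)
print(wwr)
```

Pixel counts stand in for area only when the facade is viewed roughly head-on; a street-level image taken at an angle would need perspective correction first.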



Improving uranium oxide pathway discernment and generalizability using contrastive self-supervised learning
J. Johnson, L. McDonald, T. Tasdizen. In Computational Materials Science, Vol. 223, Elsevier, 2024.

In the field of Nuclear Forensics, there exists a plethora of different tools to aid investigators when performing analysis of unknown nuclear materials. Many of these tools offer visual representations of the uranium ore concentrate (UOC) materials that include complementary and contrasting information. In this paper, we present a novel technique drawing from state-of-the-art machine learning methods that allows information from scanning electron microscopy images (SEM) to be combined to create digital encodings of the material that can be used to determine the material’s processing route. Our technique can classify UOC processing routes with greater than 96% accuracy in a fraction of a second and can be adapted to unseen samples at similarly high accuracy. The technique’s high accuracy and speed allow forensic investigators to quickly get preliminary results, while generalization allows the model to be adapted to new materials or processing routes quickly without the need for complete retraining of the model.



HistoEM: A Pathologist-Guided and Explainable Workflow Using Histogram Embedding for Gland Classification
A. Ferrero, E. Ghelichkhan, H. Manoochehri, M.M. Ho, D.J. Albertson, B.J. Brintz, T. Tasdizen, R.T. Whitaker, B. Knudsen. In Modern Pathology, Vol. 37, No. 4, 2024.

Pathologists have, over several decades, developed criteria for diagnosing and grading prostate cancer. However, this knowledge has not, so far, been included in the design of convolutional neural networks (CNN) for prostate cancer detection and grading. Further, it is not known whether the features learned by machine-learning algorithms coincide with diagnostic features used by pathologists. We propose a framework that enforces algorithms to learn the cellular and subcellular differences between benign and cancerous prostate glands in digital slides from hematoxylin and eosin–stained tissue sections. After accurate gland segmentation and exclusion of the stroma, the central component of the pipeline, named HistoEM, utilizes a histogram embedding of features from the latent space of the CNN encoder. Each gland is represented by 128 feature-wise histograms that provide the input into a second network for benign vs cancer classification of the whole gland. Cancer glands are further processed by a U-Net structured network to separate low-grade from high-grade cancer. Our model demonstrates similar performance compared with other state-of-the-art prostate cancer grading models with gland-level resolution. To understand the features learned by HistoEM, we first rank features based on the distance between benign and cancer histograms and visualize the tissue origins of the 2 most important features. A heatmap of pixel activation by each feature is generated using Grad-CAM and overlaid on nuclear segmentation outlines. We conclude that HistoEM, similar to pathologists, uses nuclear features for the detection of prostate cancer. Altogether, this novel approach can be broadly deployed to visualize computer-learned features in histopathology images.
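The histogram-embedding idea described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the HistoEM implementation: per-pixel feature values are assumed to lie in [0, 1], the bin count is arbitrary, and the L1 distance used for ranking is an assumption (the abstract does not say which distance is used). All function names are hypothetical.

```python
def feature_histograms(features, n_bins=8, lo=0.0, hi=1.0):
    """Build one normalized histogram per feature from per-pixel feature vectors.

    `features` is a list of vectors, one per pixel inside a gland.
    """
    n_features = len(features[0])
    hists = [[0.0] * n_bins for _ in range(n_features)]
    width = (hi - lo) / n_bins
    for vec in features:
        for j, value in enumerate(vec):
            # Clamp so out-of-range values fall into the first or last bin.
            b = min(max(int((value - lo) / width), 0), n_bins - 1)
            hists[j][b] += 1.0
    n_pixels = len(features)
    return [[count / n_pixels for count in h] for h in hists]


def rank_features(benign_hists, cancer_hists):
    """Rank feature indices by L1 distance between benign and cancer histograms."""
    dists = [
        sum(abs(a - b) for a, b in zip(hb, hc))
        for hb, hc in zip(benign_hists, cancer_hists)
    ]
    return sorted(range(len(dists)), key=lambda j: dists[j], reverse=True)


# Toy data: two features per pixel; only feature 0 separates the classes.
benign = [[0.1, 0.5], [0.2, 0.5]]
cancer = [[0.9, 0.5], [0.8, 0.5]]
ranking = rank_features(
    feature_histograms(benign, n_bins=4),
    feature_histograms(cancer, n_bins=4),
)
print(ranking)
```

In the paper each gland yields 128 such histograms from the CNN encoder's latent space; the toy above uses two features only to keep the ranking easy to follow.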



A Longitudinal Analysis of Pre-and Post-Operative Dysmorphology in Metopic Craniosynostosis
J.W. Beiriger, W. Tao, Z. Irgebay, J. Smetona, L. Dvoracek, N. Kass, A. Dixon, C. Zhang, M. Mehta, R. Whitaker, J. Goldstein. In The Cleft Palate Craniofacial Journal, Sage, 2024.
DOI: 10.1177/10556656241237605

Objective

The purpose of this study is to objectively quantify the degree of overcorrection in our current practice and to evaluate longitudinal morphological changes using CranioRate™, a novel machine learning skull morphology assessment tool.

Design

Retrospective cohort study across multiple time points.

Setting

Tertiary care children's hospital.

Patients

Patients with preoperative and postoperative CT scans who underwent fronto-orbital advancement (FOA) for metopic craniosynostosis.

Main Outcome Measures

We evaluated preoperative, postoperative, and two-year follow-up skull morphology using CranioRate™ to generate a Metopic Severity Score (MSS), a measure of degree of metopic dysmorphology, and Cranial Morphology Deviation (CMD) score, a measure of deviation from normal skull morphology.

Results

Fifty-five patients were included; average age at surgery was 1.3 years. Sixteen patients underwent follow-up CT imaging at an average of 3.1 years. Preoperative MSS was 6.3 ± 2.5 (CMD 199.0 ± 39.1), immediate postoperative MSS was −2.0 ± 1.9 (CMD 208.0 ± 27.1), and longitudinal MSS was 1.3 ± 1.1 (CMD 179.8 ± 28.1). MSS approached normal at two-year follow-up (defined as MSS = 0). There was a significant relationship between preoperative MSS and follow-up MSS (R² = 0.70).

Conclusions

MSS quantifies overcorrection and normalization of head shape, as patients with negative values were less “metopic” than normal postoperatively and approached 0 at 2-year follow-up. CMD worsened postoperatively due to postoperative bony changes associated with surgical displacements following FOA. All patients had similar postoperative metopic dysmorphology, with no significant association with preoperative severity. More severe patients had worse longitudinal dysmorphology, reinforcing that regression to the metopic shape is a postoperative risk which increases with preoperative severity.



MASSM: An End-to-End Deep Learning Framework for Multi-Anatomy Statistical Shape Modeling Directly From Images
J. Ukey, T. Kataria, S.Y. Elhabian. arXiv preprint arXiv:2403.11008, 2024.

Statistical Shape Modeling (SSM) is an effective method for quantitatively analyzing anatomical variations within populations. However, its utility is limited by the need for manual segmentations of anatomies, a task that relies on the scarce expertise of medical professionals. Recent advances in deep learning have provided a promising approach that automatically generates statistical representations from unsegmented images. Once trained, these deep learning-based models eliminate the need for manual segmentation for new subjects. Nonetheless, most current methods still require manual pre-alignment of image volumes and specifying a bounding box around the target anatomy prior to inference, resulting in a partially manual inference process. Recent approaches facilitate anatomy localization but only estimate statistical representations at the population level. However, they cannot delineate anatomy directly in images and are limited to modeling a single anatomy. Here, we introduce MASSM, a novel end-to-end deep learning framework that simultaneously localizes multiple anatomies in an image, estimates population-level statistical representations, and delineates each anatomy. Our findings emphasize the crucial role of local correspondences, showcasing their indispensability in providing superior shape information for medical imaging tasks.



StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining
T. Kataria, B. Knudsen, S.Y. Elhabian. arXiv preprint arXiv:2403.11340, 2024.

Hematoxylin and Eosin (H&E) staining is the most commonly used stain for disease diagnosis and tumor recurrence tracking. Hematoxylin excels at highlighting nuclei, whereas eosin stains the cytoplasm. However, H&E stain lacks details for differentiating different types of cells relevant to identifying the grade of the disease or response to specific treatment variations. Pathologists require special immunohistochemical (IHC) stains that highlight different cell types. These stains help in accurately identifying different regions of disease growth and their interactions with the cell’s microenvironment. The advent of deep learning models has made Image-to-Image (I2I) translation a key research area, reducing the need for expensive physical staining processes. Pix2Pix and CycleGAN are still the most commonly used methods for virtual staining applications. However, both suffer from hallucinations or staining irregularities when H&E stain has less discriminative information about the underlying cells IHC needs to highlight (e.g., CD3 lymphocytes). Diffusion models are currently the state-of-the-art models for image generation and conditional generation tasks. However, they require extensive and diverse datasets (millions of samples) to converge, which is less feasible for virtual staining applications. Inspired by the success of multitask deep learning models for limited dataset size, we propose StainDiffuser, a novel multitask dual diffusion architecture for virtual staining that converges under a limited training budget. StainDiffuser trains two diffusion processes simultaneously: (a) generation of cell-specific IHC stain from H&E and (b) H&E-based cell segmentation using coarse segmentation only during training. Our results show that StainDiffuser produces high-quality results for easier (CK8/18, epithelial marker) and difficult stains (CD3, lymphocytes).



Estimation and Analysis of Slice Propagation Uncertainty in 3D Anatomy Segmentation
R. Nihalaani, T. Kataria, J. Adams, S.Y. Elhabian. arXiv preprint arXiv:2403.12290, 2024.

Supervised methods for 3D anatomy segmentation demonstrate superior performance but are often limited by the availability of annotated data. This limitation has led to a growing interest in self-supervised approaches in tandem with the abundance of available unannotated data. Slice propagation has emerged as a self-supervised approach that leverages slice registration as a self-supervised task to achieve full anatomy segmentation with minimal supervision. This approach significantly reduces the need for domain expertise, time, and the cost associated with building fully annotated datasets required for training segmentation networks. However, this shift toward reduced supervision via deterministic networks raises concerns about the trustworthiness and reliability of predictions, especially when compared with more accurate supervised approaches. To address this concern, we propose the integration of calibrated uncertainty quantification (UQ) into slice propagation methods, providing insights into the model’s predictive reliability and confidence levels. Incorporating uncertainty measures enhances user confidence in self-supervised approaches, thereby improving their practical applicability. We conducted experiments on three datasets for 3D abdominal segmentation using five UQ methods. The results illustrate that incorporating UQ improves not only model trustworthiness, but also segmentation accuracy. Furthermore, our analysis reveals various failure modes of slice propagation methods that might not be immediately apparent to end-users. This study opens up new research avenues to improve the accuracy and trustworthiness of slice propagation methods.
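The abstract does not name the five UQ methods, so the following is only a generic illustration of one common way to turn several stochastic predictions (for example from an ensemble or repeated sampling) into a per-pixel uncertainty map: average the foreground probabilities and take the binary entropy of the mean. The sample values and function names are assumptions, and calibration is not covered here.

```python
import math


def mean_prediction(samples):
    """Average per-pixel foreground probabilities over several stochastic predictions."""
    n_samples = len(samples)
    n_pixels = len(samples[0])
    return [sum(s[i] for s in samples) / n_samples for i in range(n_pixels)]


def binary_entropy(p, eps=1e-12):
    """Entropy (in bits) of a Bernoulli distribution with foreground probability p."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# Three hypothetical predictions for three pixels.
samples = [
    [0.90, 0.5, 0.10],
    [0.95, 0.2, 0.05],
    [0.85, 0.8, 0.15],
]
mean = mean_prediction(samples)
uncertainty = [binary_entropy(p) for p in mean]
print(uncertainty)
```

The middle pixel, where the predictions disagree and average to 0.5, receives the highest entropy, which is the kind of signal that can flag unreliable regions to an end user.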



EfficientMorph: Parameter-Efficient Transformer-Based Architecture for 3D Image Registration
A.Z.B. Aziz, M.S.T. Karanam, T. Kataria, S.Y. Elhabian. arXiv preprint arXiv:2403.11026, 2024.

Transformers have emerged as the state-of-the-art architecture in medical image registration, outperforming convolutional neural networks (CNNs) by addressing their limited receptive fields and overcoming gradient instability in deeper models. Despite their success, transformer-based models require substantial resources for training, including data, memory, and computational power, which may restrict their applicability for end users with limited resources. In particular, existing transformer-based 3D image registration architectures face three critical gaps that challenge their efficiency and effectiveness. Firstly, while mitigating the quadratic complexity of full attention by focusing on local regions, window-based attention mechanisms often fail to adequately integrate local and global information. Secondly, feature similarities across attention heads that were recently found in multi-head attention architectures indicate a significant computational redundancy, suggesting that the capacity of the network could be better utilized to enhance performance. Lastly, the granularity of tokenization, a key factor in registration accuracy, presents a trade-off; smaller tokens improve detail capture at the cost of higher computational complexity, increased memory demands, and a risk of overfitting. Here, we propose EfficientMorph, a transformer-based architecture for unsupervised 3D image registration. It optimizes the balance between local and global attention through a plane-based attention mechanism, reduces computational redundancy via cascaded group attention, and captures fine details without compromising computational efficiency, thanks to a Hi-Res tokenization strategy complemented by merging operations. We compare the effectiveness of EfficientMorph on two public datasets, OASIS and IXI, against other state-of-the-art models. Notably, EfficientMorph sets a new benchmark for performance on the OASIS dataset with ∼16-27× fewer parameters.



Aardvark: Composite Visualizations of Trees, Time-Series, and Images
D. Lange, R. Judson-Torres, T.A. Zangle, A. Lex. In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2024.

How do cancer cells grow, divide, proliferate and die? How do drugs influence these processes? These are difficult questions that we can attempt to answer with a combination of time-series microscopy experiments, classification algorithms, and data visualization. However, collecting this type of data and applying algorithms to segment and track cells and construct lineages of proliferation is error-prone; and identifying the errors can be challenging since it often requires cross-checking multiple data types. Similarly, analyzing and communicating the results necessitates synthesizing different data types into a single narrative. State-of-the-art visualization methods for such data use independent line charts, tree diagrams, and images in separate views. However, this spatial separation requires the viewer of these charts to combine the relevant pieces of data in memory. To simplify this challenging task, we describe design principles for weaving cell images, time-series data, and tree data into a cohesive visualization. Our design principles are based on choosing a primary data type that drives the layout and integrates the other data types into that layout. We then introduce Aardvark, a system that uses these principles to implement novel visualization techniques. Based on Aardvark, we demonstrate the utility of each of these approaches for discovery, communication, and data debugging in a series of case studies.



Road Traffic Injuries and the Built Environment in Bogotá, Colombia, 2015–2019: A Cross-Sectional Analysis
H.Y. Zewdie, O.L. Sarmiento, J.D. Pinzón, M.A. Wilches-Mogollon, P.A. Arbelaez, L. Baldovino-Chiquillo, D. Hidalgo, L. Guzman, S.J. Mooney, Q.C. Nguyen, T. Tasdizen, D.A. Quistberg. In Journal of Urban Health, Springer, 2024.

Nine in 10 road traffic deaths occur in low- and middle-income countries (LMICs). Despite this disproportionate burden, few studies have examined built environment correlates of road traffic injury in these settings, including in Latin America. We examined road traffic collisions in Bogotá, Colombia, occurring between 2015 and 2019, and assessed the association between neighborhood-level built environment features and pedestrian injury and death. We used descriptive statistics to characterize all police-reported road traffic collisions that occurred in Bogotá between 2015 and 2019. Cluster detection was used to identify spatial clustering of pedestrian collisions. Adjusted multivariate Poisson regression models were fit to examine associations between several neighborhood-built environment features and rate of pedestrian road traffic injury and death. A total of 173,443 police-reported traffic collisions occurred in Bogotá between 2015 and 2019. Pedestrians made up about 25% of road traffic injuries and 50% of road traffic deaths in Bogotá between 2015 and 2019. Pedestrian collisions were spatially clustered in the southwestern region of Bogotá. Neighborhoods with more street trees (RR, 0.90; 95% CI, 0.82–0.98), traffic signals (0.89, 0.81–0.99), and bus stops (0.89, 0.82–0.97) were associated with lower pedestrian road traffic deaths. Neighborhoods with greater density of large roads were associated with higher pedestrian injury. Our findings highlight the potential for pedestrian-friendly infrastructure to promote safer interactions between pedestrians and motorists in Bogotá and in similar urban contexts globally.



DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading
M.M. Ho, E. Ghelichkhan, Y. Chong, Y. Zhou, B.S. Knudsen, T. Tasdizen. arXiv:2404.13097, 2024.

Latent Diffusion Models (LDMs) can generate high-fidelity images from noise, offering a promising approach for augmenting histopathology images for training cancer grading models. While previous works successfully generated high-fidelity histopathology images using LDMs, the generation of image tiles to improve prostate cancer grading has not yet been explored. Additionally, LDMs face challenges in accurately generating admixtures of multiple cancer grades in a tile when conditioned by a tile mask. In this study, we train specific LDMs to generate synthetic tiles that contain multiple Gleason Grades (GGs) by leveraging pixel-wise annotations in input tiles. We introduce a novel framework named Self-Distillation from Separated Conditions (DISC) that generates GG patterns guided by GG masks. Finally, we deploy a training framework for pixel-level and slide-level prostate cancer grading, where synthetic tiles are effectively utilized to improve the cancer grading performance of existing models. As a result, this work surpasses previous works in two domains: 1) our LDMs enhanced with DISC produce more accurate tiles in terms of GG patterns, and 2) our training scheme, incorporating synthetic data, significantly improves the generalization of the baseline model for prostate cancer grading, particularly in challenging cases of rare GG5, demonstrating the potential of generative models to enhance cancer grading when data is limited.



Neighborhood built environment, obesity, and diabetes: A Utah siblings study
Q.C. Nguyen, T. Tasdizen, M. Alirezaei, H. Mane, X. Yue, J.S. Merchant, W. Yu, L. Drew, D. Li, T.T. Nguyen. In SSM - Population Health, Vol. 26, 2024.

Background

This study utilizes innovative computer vision methods alongside Google Street View images to characterize neighborhood built environments across Utah.

Methods

Convolutional Neural Networks were used to create indicators of street greenness, crosswalks, and building type on 1.4 million Google Street View images. The demographic and medical profiles of Utah residents came from the Utah Population Database (UPDB). We implemented hierarchical linear models with individuals nested within zip codes to estimate associations between neighborhood built environment features and individual-level obesity and diabetes, controlling for individual- and zip code-level characteristics (n = 1,899,175 adults living in Utah in 2015). Sibling random effects models were implemented to account for shared family attributes among siblings (n = 972,150) and twins (n = 14,122).

Results

Consistent with prior neighborhood research, the variance partition coefficients (VPC) of our unadjusted models nesting individuals within zip codes were relatively small (0.5%–5.3%), except for HbA1c (VPC = 23%), suggesting a small percentage of the outcome variance is at the zip code-level. However, proportional change in variance (PCV) attributable to zip codes after the inclusion of neighborhood built environment variables and covariates ranged between 11% and 67%, suggesting that these characteristics account for a substantial portion of the zip code-level effects. Non-single-family homes (indicator of mixed land use), sidewalks (indicator of walkability), and green streets (indicator of neighborhood aesthetics) were associated with reduced diabetes and obesity. Zip codes in the third tertile for non-single-family homes were associated with a 15% reduction (PR: 0.85; 95% CI: 0.79, 0.91) in obesity and a 20% reduction (PR: 0.80; 95% CI: 0.70, 0.91) in diabetes. This tertile was also associated with a BMI reduction of −0.68 kg/m² (95% CI: −0.95, −0.40).

Conclusion

We observe associations between neighborhood characteristics and chronic diseases, accounting for biological, social, and cultural factors shared among siblings in this large population-based study.
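For readers unfamiliar with the two variance summaries reported in the Results, the sketch below uses their standard definitions for a two-level linear model: the VPC is the share of total variance at the zip code level, and the PCV is the relative drop in zip code-level variance after covariates are added. The variance values are made-up illustrations, not figures from the study, and the function names are hypothetical.

```python
def variance_partition_coefficient(between_var: float, within_var: float) -> float:
    """Share of total outcome variance that lies between zip codes."""
    return between_var / (between_var + within_var)


def proportional_change_in_variance(null_between_var: float, adjusted_between_var: float) -> float:
    """Relative reduction in zip code-level variance after adding covariates."""
    return (null_between_var - adjusted_between_var) / null_between_var


# Hypothetical variance components, for illustration only.
vpc = variance_partition_coefficient(between_var=0.5, within_var=9.5)
pcv = proportional_change_in_variance(null_between_var=0.5, adjusted_between_var=0.3)
print(f"VPC = {vpc:.1%}, PCV = {pcv:.0%}")
```

A small VPC together with a large PCV, as in the study, means zip codes explain little of the total variance, but the measured neighborhood features account for much of what zip code-level variance there is.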



F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation
M.M. Ho, S. Dubey, Y. Chong, B. Knudsen, T. Tasdizen. arXiv:2404.12650v1, 2024.

The Frozen Section (FS) technique is a rapid and efficient method, taking only 15-30 minutes to prepare slides for pathologists' evaluation during surgery, enabling immediate decisions on further surgical interventions. However, the FS process often introduces artifacts and distortions like folds and ice-crystal effects. In contrast, these artifacts and distortions are absent in the higher-quality formalin-fixed paraffin-embedded (FFPE) slides, which require 2-3 days to prepare. While Generative Adversarial Network (GAN)-based methods have been used to translate FS to FFPE images (F2F), they may leave morphological inaccuracies with remaining FS artifacts or introduce new artifacts, reducing the quality of these translations for clinical assessments. In this study, we benchmark recent generative models, focusing on GANs and Latent Diffusion Models (LDMs), to overcome these limitations. We introduce a novel approach that combines LDMs with Histopathology Pre-Trained Embeddings to enhance restoration of FS images. Our framework leverages LDMs conditioned by both text and pre-trained embeddings to learn meaningful features of FS and FFPE histopathology images. Through diffusion and denoising techniques, our approach not only preserves essential diagnostic attributes like color staining and tissue morphology but also proposes an embedding translation mechanism to better predict the targeted FFPE representation of input FS images. As a result, this work achieves a significant improvement in classification performance, with the Area Under the Curve rising from 81.99% to 94.64%, accompanied by an advantageous CaseFD. This work establishes a new benchmark for FS to FFPE image translation quality, promising enhanced reliability and accuracy in histopathology FS image analysis.



SCorP: Statistics-Informed Dense Correspondence Prediction Directly from Unsegmented Medical Images
K. Iyer, J. Adams, S.Y. Elhabian. arXiv preprint arXiv:2404.17967, 2024.

Statistical shape modeling (SSM) is a powerful computational framework for quantifying and analyzing the geometric variability of anatomical structures, facilitating advancements in medical research, diagnostics, and treatment planning. Traditional methods for shape modeling from imaging data demand significant manual and computational resources. Additionally, these methods necessitate repeating the entire modeling pipeline to derive shape descriptors (e.g., surface-based point correspondences) for new data. While deep learning approaches have shown promise in streamlining the construction of SSMs on new data, they still rely on traditional techniques to supervise the training of the deep networks. Moreover, the predominant linearity assumption of traditional approaches restricts their efficacy, a limitation also inherited by deep learning models trained using optimized/established correspondences. Consequently, representing complex anatomies becomes challenging. To address these limitations, we introduce SCorP, a novel framework capable of predicting surface-based correspondences directly from unsegmented images. By leveraging the shape prior learned directly from surface meshes in an unsupervised manner, the proposed model eliminates the need for an optimized shape model for training supervision. The strong shape prior acts as a teacher and regularizes the feature learning of the student network to guide it in learning image-based features that are predictive of surface correspondences. The proposed model streamlines the training and inference phases by removing the supervision for the correspondence prediction task while alleviating the linearity assumption. Experiments on the LGE MRI left atrium dataset and Abdomen CT-1K liver datasets demonstrate that the proposed technique enhances the accuracy and robustness of image-driven SSM, providing a compelling alternative to current fully supervised methods.



Arterial Input Function (AIF) Correction Using AIF Plus Tissue Inputs with a Bi-LSTM Network
Q. Huang, J. Le, S. Joshi, J. Mendes, G. Adluru, E. DiBella. In Tomography, Vol. 10, pp. 660-673. 2024.

Background: The arterial input function (AIF) is vital for myocardial blood flow quantification in cardiac MRI to indicate the input time–concentration curve of a contrast agent. Inaccurate AIFs can significantly affect perfusion quantification. Purpose: When only saturated and biased AIFs are measured, this work investigates multiple ways of leveraging tissue curve information, including using AIF + tissue curves as inputs and optimizing the loss function for deep neural network training. Methods: Simulated data were generated using a 12-parameter AIF mathematical model for the AIF. Tissue curves were created from true AIFs combined with compartment-model parameters from a random distribution. Using Bloch simulations, a dictionary was constructed for a saturation-recovery 3D radial stack-of-stars sequence, accounting for deviations such as flip angle, T2* effects, and residual longitudinal magnetization after the saturation. A preliminary simulation study established the optimal tissue curve number using a bidirectional long short-term memory (Bi-LSTM) network with just AIF loss. Further optimization of the loss function involves comparing just AIF loss, AIF with compartment-model-based parameter loss, and AIF with compartment-model tissue loss. The optimized network was examined with both simulation and hybrid data, which included in vivo 3D stack-of-star datasets for testing. The AIF peak value accuracy and ktrans results were assessed. Results: Increasing the number of tissue curves can be beneficial when added tissue curves can provide extra information. Using just the AIF loss outperforms the other two proposed losses, including adding either a compartment-model-based tissue loss or a compartment-model parameter loss to the AIF loss. With the simulated data, the Bi-LSTM network reduced the AIF peak error from −23.6 ± 24.4% of the AIF using the dictionary method to 0.2 ± 7.2% (AIF input only) and 0.3 ± 2.5% (AIF + ten tissue curve inputs) of the network AIF. The corresponding ktrans error was reduced from −13.5 ± 8.8% to −0.6 ± 6.6% and 0.3 ± 2.1%. With the hybrid data (simulated data for training; in vivo data for testing), the AIF peak error was 15.0 ± 5.3% and the corresponding ktrans error was 20.7 ± 11.6% for the AIF using the dictionary method. The hybrid data revealed that using the AIF + tissue inputs reduced errors, with peak error (1.3 ± 11.1%) and ktrans error (−2.4 ± 6.7%). Conclusions: Integrating tissue curves with AIF curves into network inputs improves the precision of AI-driven AIF corrections. This result was seen both with simulated data and with applying the network trained only on simulated data to a limited in vivo test dataset.



Point2SSM++: Self-Supervised Learning of Anatomical Shape Models from Point Clouds
Subtitled “arXiv:2405.09707v1,” J. Adams, S. Elhabian. 2024.

Correspondence-based statistical shape modeling (SSM) stands as a powerful technology for morphometric analysis in clinical research. SSM facilitates population-level characterization and quantification of anatomical shapes such as bones and organs, aiding in pathology and disease diagnostics and treatment planning. Despite its potential, SSM remains under-utilized in medical research due to the significant overhead associated with automatic construction methods, which demand complete, aligned shape surface representations. Additionally, optimization-based techniques rely on bias-inducing assumptions or templates and have prolonged inference times as the entire cohort is simultaneously optimized. To overcome these challenges, we introduce Point2SSM++, a principled, self-supervised deep learning approach that directly learns correspondence points from point cloud representations of anatomical shapes. Point2SSM++ is robust to misaligned and inconsistent input, providing SSM that accurately samples individual shape surfaces while effectively capturing population-level statistics. Additionally, we present principled extensions of Point2SSM++ to adapt it for dynamic spatiotemporal and multi-anatomy use cases, demonstrating the broad versatility of the Point2SSM++ framework. Through extensive validation across diverse anatomies, evaluation metrics, and clinically relevant downstream tasks, we demonstrate Point2SSM++’s superiority over existing state-of-the-art deep learning models and traditional approaches. Point2SSM++ substantially enhances the feasibility of SSM generation and significantly broadens its array of potential clinical applications.



Weakly Supervised Bayesian Shape Modeling from Unsegmented Medical Images
Subtitled “arXiv:2405.09697v1,” J. Adams, K. Iyer, S. Elhabian. 2024.

Anatomical shape analysis plays a pivotal role in clinical research and hypothesis testing, where the relationship between form and function is paramount. Correspondence-based statistical shape modeling (SSM) facilitates population-level morphometrics but requires a cumbersome, potentially bias-inducing construction pipeline. Recent advancements in deep learning have streamlined this process in inference by providing SSM prediction directly from unsegmented medical images. However, the proposed approaches are fully supervised and require utilizing a traditional SSM construction pipeline to create training data, thus inheriting the associated burdens and limitations. To address these challenges, we introduce a weakly supervised deep learning approach to predict SSM from images using point cloud supervision. Specifically, we propose reducing the supervision associated with the state-of-the-art fully Bayesian variational information bottleneck DeepSSM (BVIB-DeepSSM) model. BVIB-DeepSSM is an effective, principled framework for predicting probabilistic anatomical shapes from images with quantification of both aleatoric and epistemic uncertainties. Whereas the original BVIB-DeepSSM method requires strong supervision in the form of ground truth correspondence points, the proposed approach utilizes weak supervision via point cloud surface representations, which are more readily obtainable. Furthermore, the proposed approach learns correspondence in a completely data-driven manner without prior assumptions about the expected variability in shape cohort. Our experiments demonstrate that this approach yields similar accuracy and uncertainty estimation to the fully supervised scenario while substantially enhancing the feasibility of model training for SSM construction.



Refining Skewed Perceptions in Vision-Language Models through Visual Representations
Subtitled “arXiv preprint arXiv:2405.14030,” H. Dai, S. Joshi. 2024.

Large vision-language models (VLMs), such as CLIP, have become foundational, demonstrating remarkable success across a variety of downstream tasks. Despite their advantages, these models, akin to other foundational systems, inherit biases from the disproportionate distribution of real-world data, leading to misconceptions about the actual environment. Prevalent datasets like ImageNet are often riddled with non-causal, spurious correlations that can diminish VLM performance in scenarios where these contextual elements are absent. This study presents an investigation into how a simple linear probe can effectively distill task-specific core features from CLIP’s embedding for downstream applications. Our analysis reveals that the CLIP text representations are often tainted by spurious correlations inherited from the biased pre-training dataset. Empirical evidence suggests that relying on visual representations from CLIP, as opposed to text embeddings, is more practical to refine the skewed perceptions in VLMs, emphasizing the superior utility of visual representations in overcoming embedded biases.
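
For readers unfamiliar with the term, the following is a minimal sketch of what a linear probe is: a single linear classifier fitted on top of frozen feature vectors. It is not the authors' code or experimental setup. The two-dimensional toy vectors stand in for CLIP image embeddings, and every name here (`make_toy_embeddings`, `train_linear_probe`, `predict`) is hypothetical and chosen only for illustration.

```python
import math
import random


def make_toy_embeddings(n=200, seed=0):
    """Toy stand-in for frozen embeddings (an assumption, not real CLIP output).

    Dimension 0 carries the class signal ("core" feature);
    dimension 1 is label-independent noise ("nuisance" feature).
    """
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        core = (1.0 if y else -1.0) + rng.gauss(0, 0.3)
        nuisance = rng.gauss(0, 1.0)
        features.append([core, nuisance])
        labels.append(y)
    return features, labels


def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent; features stay fixed."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y                     # gradient of the log-loss w.r.t. z
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted class (0 or 1) for one feature vector."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


features, labels = make_toy_embeddings()
weights, bias = train_linear_probe(features, labels)
accuracy = sum(
    predict(weights, bias, x) == y for x, y in zip(features, labels)
) / len(labels)
print(f"weights={weights}, bias={bias:.3f}, train accuracy={accuracy:.3f}")
```

In this toy setting the probe places most of its weight on the signal-carrying dimension, which is the sense in which a linear probe can pick out task-relevant directions in a fixed embedding space.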