Caleydo: Design and Evaluation of a Visual Analysis Framework for Gene Expression Data in its Biological Context
A. Lex, M. Streit, E. Kruijff, D. Schmalstieg. In Proceedings of the IEEE Symposium on Pacific Visualization (PacificVis '10), pp. 57--64. 2010.
The goal of our work is to support experts in the process of hypothesis generation concerning the roles of genes in diseases. For a deeper understanding of the complex interdependencies between genes, it is important to bring gene expressions (measurements) into context with pathways. Pathways, which are models of biological processes, are available in online databases. In these databases, large networks are decomposed into small sub-graphs for better manageability. This simplification results in a loss of context, as pathways are interconnected and genes can occur in multiple instances scattered over the network. Our main goal is therefore to present all relevant information, i.e., gene expressions, the relations between expression and pathways, and the relations between multiple pathways, in a simple yet effective way. To achieve this, we employ two different multiple-view approaches. Traditional multiple views are used for large datasets or highly interactive visualizations, while a 2.5D technique is employed to support seamless navigation of multiple pathways, simultaneously linked to the expression of the contained genes. This approach facilitates the understanding of the interconnection of pathways and enables a non-distracting relation to gene expression data. We evaluated Caleydo with a group of users from the life science community. Users were asked to perform three tasks: pathway exploration, gene expression analysis, and information comparison with and without visual links, which had to be conducted in four different conditions. Evaluation results show that the system can considerably improve the process of understanding the complex network of pathways and the individual effects of gene expression regulation. In particular, the quality of the available contextual information and the spatial organization were rated as good for the presented 2.5D setup.
MulteeSum: A Tool for Comparative Spatial and Temporal Gene Expression Data
M.D. Meyer, T. Munzner, A. DePace, H. Pfister. In IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis 2010), Vol. 16, No. 6, pp. 908--917. 2010.
Cells in an organism share the same genetic information in their DNA, but have very different forms and behavior because of the selective expression of subsets of their genes. The widely used approach of measuring gene expression over time from a tissue sample using techniques such as microarrays or sequencing does not provide information about the spatial position within the tissue where these genes are expressed. In contrast, we are working with biologists who use techniques that measure gene expression in every individual cell of entire fruit fly embryos over an hour of their development, and do so for multiple closely related subspecies of Drosophila. These scientists are faced with the challenge of integrating temporal gene expression data with the spatial location of cells and, moreover, comparing this data across multiple related species. We have worked with these biologists over the past two years to develop MulteeSum, a visualization system that supports inspection and curation of data sets showing gene expression over time, in conjunction with the spatial location of the cells where the genes are expressed. It is the first tool to support comparisons across multiple such data sets. MulteeSum is part of a general and flexible framework we developed with our collaborators that is built around multiple summaries for each cell, allowing the biologists to explore the results of computations that mix spatial information, gene expression measurements over time, and data from multiple related species or organisms. We justify our design decisions based on specific descriptions of the analysis needs of our collaborators, and provide anecdotal evidence of the efficacy of MulteeSum through a series of case studies.
Pathline: A Tool for Comparative Functional Genomics
M.D. Meyer, B. Wong, M. Styczynski, T. Munzner, H. Pfister. In Computer Graphics Forum, Vol. 29, No. 3, Wiley-Blackwell, pp. 1043--1052. Aug, 2010.
Biologists pioneering the new field of comparative functional genomics attempt to infer the mechanisms of gene regulation by looking for similarities and differences of gene activity over time across multiple species. They use three kinds of data: functional data such as gene activity measurements, pathway data that represent a series of reactions within a cellular process, and phylogenetic relationship data that describe the relatedness of species. No existing visualization tool can visually encode the biologically interesting relationships between multiple pathways, multiple genes, and multiple species. We tackle the challenge of visualizing all aspects of this comparative functional genomics dataset with a new interactive tool called Pathline. In addition to the overall characterization of the problem and design of Pathline, our contributions include two new visual encoding techniques. One is a new method for linearizing metabolic pathways that provides appropriate topological information and supports the comparison of quantitative data along the pathway. The second is the curvemap view, a depiction of time series data for comparison of gene activity and metabolite levels across multiple species. Pathline was developed in close collaboration with a team of genomic scientists. We validate our approach with case studies of the biologists' use of Pathline and report on how they use the tool to confirm existing findings and to discover new scientific insights.
Genome-wide synteny through highly sensitive sequence alignment: Satsuma
M. Grabherr, P. Russell, M.D. Meyer, E. Mauceli, J. Alföldi, F. Di Palma, K. Lindblad-Toh. In Bioinformatics, Vol. 26, No. 9, pp. 1145--1151. 2010.
Motivation: Comparative genomics heavily relies on alignments of large and often complex DNA sequences. From an engineering perspective, the problem here is to provide maximum sensitivity (to find all there is to find), specificity (to only find real homology) and speed (to accommodate the billions of base pairs of vertebrate genomes).
Results: Satsuma addresses all three issues through novel strategies: (i) cross-correlation, implemented via fast Fourier transform; (ii) a match scoring scheme that eliminates almost all false hits; and (iii) an asynchronous 'battleship'-like search that allows for aligning two entire fish genomes (470 and 217 Mb) in 120 CPU hours using 15 processors on a single machine.
Availability: Satsuma is part of the Spines software package, implemented in C++ on Linux. The latest version of Spines can be freely downloaded under the LGPL license from http://www.broadinstitute.org/science/programs/genome-biology/spines/ Contact: email@example.com
Metrics for Uncertainty Analysis and Visualization of Diffusion Tensor Images
F. Jiao, J.M. Phillips, J.G. Stinstra, J. Krüger, R. Varma, E. Hsu, J. Korenberg, C.R. Johnson. In Proceedings of the 5th International Conference on Medical Imaging and Augmented Reality (MIAR), Beijing, China, Springer-Verlag, Berlin, Heidelberg, pp. 179--190. September, 2010.
Visual Exploration of High Dimensional Scalar Functions
S. Gerber, P.-T. Bremer, V. Pascucci, R.T. Whitaker. In IEEE Transactions on Visualization and Computer Graphics, Vol. 16, No. 6, IEEE, pp. 1271--1280. Nov, 2010.
PubMed ID: 20975167
PubMed Central ID: PMC3099238
Topology Verification for Isosurface Extraction
T. Etiene, L.G. Nonato, C.E. Scheidegger, J. Tierny, T.J. Peters, V. Pascucci, R.M. Kirby, C.T. Silva. SCI Technical Report, No. UUSCI-2010-003, SCI Institute, University of Utah, 2010.
Analysis of Recurrent Patterns in Toroidal Magnetic Fields
A.R. Sanderson, G. Chen, X. Tricoche, D. Pugmire, S. Kruger, J. Breslau. In IEEE Transactions on Visualization and Computer Graphics, Vol. 16, No. 6, IEEE, pp. 1431--1440. Nov, 2010.
Visualizing Summary Statistics and Uncertainty
K. Potter, J.M. Kniss, R. Riesenfeld, C.R. Johnson. In Computer Graphics Forum, Vol. 29, No. 3, Wiley-Blackwell, pp. 823--831. Aug, 2010.
Caleydo: Connecting Pathways with Gene Expression
M. Streit, A. Lex, M. Kalkusch, K. Zatloukal, D. Schmalstieg. In Bioinformatics, Vol. 25, No. 20, pp. 2760--2761. 2009.
Understanding the relationships between pathways and the altered expression of their components in disease conditions can be addressed in a visual data analysis process. Caleydo uses novel visualization techniques to support life science experts in their analysis of gene expression data in the context of pathways and functions of individual genes. Pathways and gene expression visualizations are placed in a 3D scene where selected entities (i.e. genes) are visually connected. This allows Caleydo to seamlessly integrate interactive gene expression visualization with cross-database pathway exploration.
Connecting Genes with Diseases
H. Müller, R. Reihs, S. Sauer, K. Zatloukal, M. Streit, A. Lex, B. Schlegl, D. Schmalstieg. In Information Visualisation, 2009 13th International Conference, pp. 323--330. July, 2009.
This paper presents a visual data mining approach using the combination of clinical data, pathways and gene-expression data. The visual exploration of medical data using pathways to navigate and filter the data allows a more systematic and efficient investigation of problems in modern life science. A multiplicity of hypotheses can be evaluated in the same period of time, enabling a much better exploitation of the data. We present a system for data preprocessing and automatic classification, a set of visualization views and finally the integration in the Caleydo visualization framework, which enables the "coupling" of genetic data and a broad spectrum of clinical data. With the help of the Caleydo framework the medical expert can identify connections between genetic parameters, patient subgroups, and drug responses.
Gaze-Based Focus Adaption in an Information Visualization System
M. Streit, A. Lex, H. Müller, D. Schmalstieg. In Proceedings of the Conference on Computer Graphics and Visualization and Image Processing (CGVCVIP '09), 2009.
As the complexity and amount of real-world data continue to grow, modern visualization systems are changing. Traditional information visualization techniques are often not sufficient to allow an in-depth visual data exploration process. Multiple view systems combined with linking & brushing are only one building block of a successful InfoVis system. In this paper we propose the incorporation of cheap and simple gaze-based interaction. We employ the tracking information not for selecting data (i.e., mouse interaction) but for an intelligent adaption of 2D and 3D visualizations. Derived from the focus+context paradigm, we call this gaze-focus. The proposed methods are demonstrated by means of three different visualizations.
MizBee: A Multiscale Synteny Browser
M.D. Meyer, T. Munzner, H. Pfister. In IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis 2009), Vol. 15, No. 6, Note: Honorable Mention for Best Paper Award, pp. 897--904. 2009.
In the field of comparative genomics, scientists seek to answer questions about evolution and genomic function by comparing the genomes of species to find regions of shared sequences. Conserved syntenic blocks are an important biological data abstraction for indicating regions of shared sequences. The goal of this work is to show multiple types of relationships at multiple scales in a way that is visually comprehensible in accordance with known perceptual principles. We present a task analysis for this domain where the fundamental questions asked by biologists can be understood by a characterization of relationships into the four types of proximity/location, size, orientation, and similarity/strength, and the four scales of genome, chromosome, block, and genomic feature. We also propose a new taxonomy of the design space for visually encoding conservation data. We present MizBee, a multiscale synteny browser with the unique property of providing interactive side-by-side views of the data across the range of scales supporting exploration of all of these relationship types. We conclude with case studies from two biologists who used MizBee to augment their previous automatic analysis work flow, providing anecdotal evidence about the efficacy of the system for the visualization of syntenic data, the analysis of conservation relationships, and the communication of scientific insights.
Microscopic Computed Tomography–Based Virtual Histology for Visualization and Morphometry of Atherosclerosis in Diabetic Apolipoprotein E Mutant Mice
H. Martinez, S. Prajapati, C. Estrada, F. Jimenez, I. Wu, A. Bahadur, A. Sanderson, C.R. Johnson, M. Shim, C. Keller, S. Ahuja. In Circulation, Vol. 120, No. 9, pp. 821--822. 2009.
Ensemble-Vis: A Framework for the Statistical Visualization of Ensemble Data
K. Potter, A. Wilson, P.-T. Bremer, D. Williams, C. Doutriaux, V. Pascucci, C.R. Johnson. In Proceedings of the 2009 IEEE International Conference on Data Mining Workshops, pp. 233--240. 2009.
A Framework for Exploring Numerical Solutions of Advection Reaction Diffusion Equations using a GPU Based Approach
A.R. Sanderson, M.D. Meyer, R.M. Kirby, C.R. Johnson. In Journal of Computing and Visualization in Science, Vol. 12, pp. 155--170. 2009.
Subject-specific, multiscale simulation of electrophysiology: a software pipeline for image-based models and application examples
R.S. MacLeod, J.G. Stinstra, S. Lew, R.T. Whitaker, D.J. Swenson, M.J. Cole, J. Krüger, D.H. Brooks, C.R. Johnson. In Philosophical Transactions of The Royal Society A, Mathematical, Physical & Engineering Sciences, Vol. 367, No. 1896, pp. 2293--2310. 2009.
Microscopic Computed Tomography Based Virtual Histology for Visualization and Morphometry of Atherosclerosis in Diabetic Apolipoprotein E Mutant Mice
H.G. Martinez, S.I. Prajapati, C.A. Estrada, F. Jimenez, M.P. Quinones, I. Wu, A. Bahadur, A. Sanderson, C.R. Johnson, M. Shim, C. Keller, S.S. Ahuja. In Circulation: Journal of the American Heart Association, Vol. 120, No. 9, pp. 821--822. 2009.
Visualization for Data-Intensive Science
C.D. Hansen, C.R. Johnson, V. Pascucci, C.T. Silva. In The Fourth Paradigm: Data-Intensive Science, Edited by S. Tansley and T. Hey and K. Tolle, Microsoft Research, pp. 153--164. 2009.
Occam's Razor and Petascale Visual Data Analysis
E.W. Bethel, C.R. Johnson, S. Ahern, J. Bell, P.-T. Bremer, H. Childs, E. Cormier-Michel, M. Day, E. Deines, P.T. Fogal, C. Garth, C.G.R. Geddes, H. Hagen, B. Hamann, C.D. Hansen, J. Jacobsen, K.I. Joy, J. Krüger, J. Meredith, P. Messmer, G. Ostrouchov, V. Pascucci, K. Potter, Prabhat, D. Pugmire, O. Rubel, A.R. Sanderson, C.T. Silva, D. Ushizima, G.H. Weber, B. Whitlock, K. Wu. In Journal of Physics: Conference Series, Vol. 180, No. 012084 (published online). 2009.
One of the central challenges facing visualization research is how to effectively enable knowledge discovery. An effective approach will likely combine application architectures that are capable of running on today's largest platforms to address the challenges posed by large data with visual data analysis techniques that help find, represent, and effectively convey scientifically interesting features and phenomena.