M. Meyer, B. Wong, M. Styczynski, T. Munzner, and H. Pfister.
Eurographics/IEEE-VGTC Symposium on Visualization, 29: 1-10 (2010)
Biologists pioneering the new field of comparative functional genomics attempt to infer the mechanisms of gene regulation by looking for similarities and differences of gene activity over time across multiple species. They use three kinds of data: functional data such as gene activity measurements, pathway data that represent a series of reactions within a cellular process, and phylogenetic relationship data that describe the relatedness of species. No existing visualization tool can visually encode the biologically interesting relationships between multiple pathways, multiple genes, and multiple species. We tackle the challenge of visualizing all aspects of this comparative functional genomics dataset with a new interactive tool called Pathline. In addition to the overall characterization of the problem and design of Pathline, our contributions include two new visual encoding techniques. One is a new method for linearizing metabolic pathways that provides appropriate topological information and supports the comparison of quantitative data along the pathway. The second is the curvemap view, a depiction of time series data for comparison of gene activity and metabolite levels across multiple species. Pathline was developed in close collaboration with a team of genomic scientists. We validate our approach with case studies of the biologists’ use of Pathline and report on how they use the tool to confirm existing findings and to discover new scientific insights.
Linearizing a pathway. (a) The node-link representation of the directed graph includes both a branch and cycle. (b) Loops are unrolled and branches are disconnected. (c) Branches are reinserted just above their reconnection points. (d) The pathway is represented as a grey segment, with genes encoded spatially with points and metabolites as lines. Short breaks in the pathway segment indicate branch points, along with stylized marks to the left of the blocks. Cycle start points are also shown to the left with another mark.
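Steps (a) through (c) of the caption above can be illustrated with a small sketch. This is not Pathline's implementation; it is a minimal illustration under stated assumptions: the pathway is given as an adjacency dictionary, the first listed successor of a node is taken to continue the main path, and any further successors start branches. The function name `linearize` and the example graph are hypothetical.

```python
def linearize(graph, start):
    """Flatten a directed pathway graph into a single ordered list.

    Assumption (not from the paper): the first successor of each node
    continues the main path; other successors start side branches.
    Returns (order, cycle_starts, branch_points).
    """
    main = []            # main path, top to bottom
    on_main = set()
    cycle_starts = []    # nodes that a loop returns to
    pending = []         # (branch point, first node of the branch)

    # Walk the main path, unrolling loops and disconnecting branches.
    node = start
    while node is not None:
        main.append(node)
        on_main.add(node)
        nxt = None
        for succ in graph.get(node, []):
            if succ in on_main:
                # Back edge: cut it and remember where the cycle starts.
                cycle_starts.append(succ)
            elif nxt is None:
                nxt = succ
            else:
                pending.append((node, succ))
        node = nxt

    # Reinsert each branch just above the node where it reconnects.
    order = list(main)
    branch_points = []
    for origin, first in pending:
        chain, seen = [], set()
        node = first
        while node is not None and node not in on_main and node not in seen:
            chain.append(node)
            seen.add(node)
            succs = graph.get(node, [])
            node = succs[0] if succs else None
        if not chain:
            continue  # the edge only skips ahead on the main path
        branch_points.append(origin)
        if node in on_main:
            idx = order.index(node)
            order[idx:idx] = chain   # place just above the reconnection point
        else:
            order.extend(chain)      # branch never rejoins: append at the end

    return order, cycle_starts, branch_points


# Hypothetical pathway with one branch (B -> X -> Y -> D) and one cycle (D -> B).
example = {
    "A": ["B"],
    "B": ["C", "X"],
    "C": ["D"],
    "D": ["E", "B"],
    "X": ["Y"],
    "Y": ["D"],
}
order, cycle_starts, branch_points = linearize(example, "A")
```

For the example graph, the branch `X, Y` ends up directly above `D`, its reconnection point, while `B` is recorded both as a branch point and as a cycle start. A drawing layer could use those two lists to place the stylized marks described in (d).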
Whole genome duplication event. (a) The known post-duplication shift in activity patterns in the first five rows between the g1 and g2 genes is immediately obvious in Pathline, where the curves clearly have mirror symmetry. (b) The mirror symmetry is much less apparent in a conventional heatmap view showing the same data.