# glueck2018phenolines

# PhenoLines: Phenotype Comparison Visualizations for Disease Subtyping via Topic Models


Figure 1 shows PhenoLines, a visual analysis tool for the interpretation of disease subtypes built by Glueck et al., it aims to support the comparison of phenotype prevalence within and across disease subtype topics [1]. Topic models enable the user to mine cross-sectional patient's comorbidity data from electronic health records(EHRs) via a machine learning approach called topic modeling.

Glueck et al. develop PhenoLines with domain experts with both a machine learning and medical background.


The authors interview medical experts together with machine learning experts as the first step to pre-process EHRs. Disease subtypes are characterized and validated during the stage. Topic models are then trained and refined. Visualization tasks are abstracted using Brehmer & Munzner’s Multi-Level Task Typology [2].

Figure 1 shows an overview of PhenoStacks. Section A shows the setting panel for user options such as comparing, sorting, filtering and aggregating of data. Section B shows the detail panel with a summary table for all topics and a set of juxtaposed timeline charts. Section C shows the topics panel, each topic comes with a sunburst chart representing the semantic relationships between phenotypes, a scatterplot displaying correlations between summary measures, a measure of the temporal evolution of phenotype probabilities. Section C also contains a rank-ordered list summarizing the phenotypes with highest probability in each topic. Section D shows the search panel that supports the direct look-up of phenotype terms.

Related Work

Glueck et al. review previous work on integrating interactive visualization with topic modeling such as TIARA [3], ParallelTopics [4], Textflow [5]. To address scalability issues, HierarchicalTopics [6], RoseRiver [7] and TopicPanorama [8] provide approaches rely on both automated and semi-automated clustering of topics based on word similarity.

The Topic Browser [9], Termite [10] and LDAvis [11] provide techniques on verifying model quality through visual comparisons.

Visualizations have also been developed to investigate data from longitudinal cohort studies, such as CAVA [12], COQUITO [13] and CoCo [14].

Data Characteristics

  • Data source: Boston Children’s Hospital
  • Size: 13,337 patients, 66,275 documents
  • Spatial dimensionality: 2D
  • Temporal dimensionality: static
  • Type: multivariate

Visualization Techniques

  • Sunburst chart
  • Scatterplot

Papers Cited


Years Spanned


Application Domain

  • Electronic Health Record Visualization
  • Medical focus: Human Phenotype Ontology - Disease Subtyping


  1. Glueck, M., Naeini, M. P., Doshi-Velez, F., Chevalier, F., Khan, A., Wigdor, D., & Brudno, M. (2018). PhenoLines: Phenotype Comparison Visualizations for Disease Subtyping via Topic Models. IEEE Transactions on Visualization and Computer Graphics, 24(1), 371–381. https://doi.org/10.1109/TVCG.2017.2745118 ↩︎

  2. Brehmer, M., & Munzner, T. (2013). A multi-level typology of abstract visualization tasks. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2013.124 ↩︎

  3. Liu, S., Zhou, M. X., Pan, S., Song, Y., Qian, W., Cai, W., & Lian, X. (2012). TIARA: Interactive, topic-based visual text summarization and analysis. ACM Transactions on Intelligent Systems and Technology. https://doi.org/10.1145/2089094.2089101 ↩︎

  4. Dou, W., Wang, X., Chang, R., & Ribarsky, W. (2011). ParallelTopics: A probabilistic approach to exploring document collections. VAST 2011 - IEEE Conference on Visual Analytics Science and Technology 2011, Proceedings. https://doi.org/10.1109/VAST.2011.6102461 ↩︎

  5. Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z., … Qu, H. (2011). Textflow: Towards better understanding of evolving topics in text. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2011.239 ↩︎

  6. Dou, W., Yu, L., Wang, X., Ma, Z., & Ribarsky, W. (2013). HierarchicalTopics: Visually exploring large text collections using topic hierarchies. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2013.162 ↩︎

  7. Cui, W., Liu, S., Wu, Z., & Wei, H. (2014). How hierarchical topics evolve in large text corpora. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2014.2346433 ↩︎

  8. Wang, X., Liu, S., Liu, J., Chen, J., Zhu, J., & Guo, B. (2016). TopicPanorama: A Full Picture of Relevant Topics. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2016.2515592 ↩︎

  9. Gardner, M. J., Lutes, J., Lund, J., Hansen, J., Walker, D., Ringger, E., & Seppi, K. (2010). The Topic Browser: An Interactive Tool for Browsing Topic Models. Proceedings of the Workshop on Challenges of Data Visualization, Held in Conjunction with the 24th Annual Conference on Neural Information Processing Systems (NIPS 2010). ↩︎

  10. Chuang, J., Manning, C. D., & Heer, J. (2012). Termite : Visualization Techniques for Assessing Textual Topic Models Categories and Subject Descriptors. International Conference on Advanced Visual Interfaces (AVI). https://doi.org/10.1145/2254556.2254572 ↩︎

  11. Sievert, C., & Shirley, K. (2015). LDAvis: A method for visualizing and interpreting topics. https://doi.org/10.3115/v1/w14-3110 ↩︎

  12. Zhang, Z., Gotz, D., & Perer, A. (2015). Iterative cohort analysis and exploration. Information Visualization. https://doi.org/10.1177/1473871614526077 ↩︎

  13. Krause, J., Perer, A., & Stavropoulos, H. (2016). Supporting Iterative Cohort Construction with Visual Temporal Queries. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2015.2467622 ↩︎

  14. Malik, S., Du, F., Monroe, M., Onukwugha, E., Plaisant, C., & Shneiderman, B. (2015). Cohort Comparison of Event Sequences with Balanced Integration of Visual Analytics and Statistics. Proceedings of the 20th International Conference on Intelligent User Interfaces - IUI ’15. https://doi.org/10.1145/2678025.2701407 ↩︎

🔄 Last Updated: 7/8/2020, 9:48:18 PM