# dingen2018regression

# RegressionExplorer: Interactive Exploration of Logistic Regression Models with Subgroup Analysis


This paper presents a visual analytics tool, RegressionExplorer, for the interactive exploration of logistic regression models in the field of Clinical Biostatistics. RegressionExplorer enables experts to explore patients in clinical data in order to formulate new hypotheses. This is achieved by allowing quick generation, evaluation and comparison of logistic models trained with clinical settings as shown in Figure 1.


The clinical settings used for training logistic models are discussed and chosen with domain experts. The process begins by selecting a statistically appropriate responder from a given population.

Univariate analysis is then conducted to identify covariates, which are fed into the training process, covariates are associated with the mediator, the confounder, the moderator and the responder as shown in Figure 2.

Multivariate analysis and diagnostics check the responder against all covariates for irregularities.

Domain experts then verify the model using the predicted outcomes from a selected subgroup of a population.

Related Work

The work is built upon the survey of interactive visualization systems of EHR by Rind et al. [1]. The authors also review visualization systems built by Bernard et al. [2], Loorak et al. [3] and Malik et al. [4].

Predictive analytics survey papers from Lu et al. [5] and Wang et al. [6] also inspired the authors.

Data Characteristics

Two datasets are used for this research.

  1. Cardiac conduction disorder

    • Size: 1,031 patients
    • Spatial dimensionality: 2D
    • Temporal dimensionality: static
    • Type: multivariate, 61 variables
  2. Hypernatremia

    • Size: 6,554 patients
    • Spatial dimensionality: 2D
    • Temporal dimensionality: static
    • Type: multivariate, 70 variables

Visualization Techniques

  • Matrix plot
  • Histogram

Papers Cited


Years Spanned


Application Domain

  • Clinical Biostatistics
  • Predictive Analytics
  • Machine Learning
  • Medical focus: Cardiology - Cardiac conduction disorder, Internal Medicine - Hypernatremia


  1. Rind, A., Wang, T. D., Aigner, W., Miksch, S., Wongsuphasawat, K., Plaisant, C., & Shneiderman, B. (2011). Interactive Information Visualization to Explore and Query Electronic Health Records. Foundations and Trends® in Human–Computer Interaction, 5(3), 207–298. https://doi.org/10.1561/1100000039 ↩︎

  2. Bernard, J., Sessler, D., May, T., Schlomm, T., Pehrke, D., & Kohlhammer, J. (2015). A visual-interactive system for prostate cancer cohort analysis. IEEE Computer Graphics and Applications. https://doi.org/10.1109/MCG.2015.49 ↩︎

  3. Loorak, M. H., Perin, C., Kamal, N., Hill, M., & Carpendale, S. (2016). TimeSpan: Using Visualization to Explore Temporal Multi-dimensional Data of Stroke Patients. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2015.2467325 ↩︎

  4. Malik, S., Du, F., Monroe, M., Onukwugha, E., Plaisant, C., & Shneiderman, B. (2015). Cohort Comparison of Event Sequences with Balanced Integration of Visual Analytics and Statistics. Proceedings of the 20th International Conference on Intelligent User Interfaces - IUI ’15. https://doi.org/10.1145/2678025.2701407 ↩︎

  5. Lu, Y., Garcia, R., Hansen, B., Gleicher, M., & Maciejewski, R. (2017). The State-of-the-Art in Predictive Visual Analytics. Computer Graphics Forum. https://doi.org/10.1111/cgf.13210 ↩︎

  6. Wang, X. M., Zhang, T. Y., Ma, Y. X., Xia, J., & Chen, W. (2016). A Survey of Visual Analytic Pipelines. Journal of Computer Science and Technology. https://doi.org/10.1007/s11390-016-1663-1 ↩︎

🔄 Last Updated: 7/8/2020, 9:48:18 PM