Scatter plots with marginal densities and repelling labels

Author

Michael Stadler, Charlotte Soneson

Summary

This document illustrates how to use ggExtra and ggrepel together with ggplot2 to:

  • make a scatter plot that shows densities of points in the margins
  • add labels to a scatter plot that do not overlap each other (repelling labels)

Prepare data

Run the following code to prepare the data used in this document:

suppressPackageStartupMessages({
    library(tibble)
})

# built-in `mtcars` dataset (see ?mtcars)
# ... add a `model` column and convert some columns to factors
mycars <- mtcars
mycars$model <- rownames(mycars)
mycars$cyl <- factor(mycars$cyl, levels = c("4","6","8"))
mycars$engine_shape <- factor(c("0" = "V-shaped", "1" = "straight")[as.character(mycars$vs)])
mycars$transmission <- factor(c("0" = "automatic", "1" = "manual")[as.character(mycars$am)])

tibble(mycars)
# A tibble: 32 × 14
     mpg cyl    disp    hp  drat    wt  qsec    vs    am  gear  carb model      
   <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>      
 1  21   6      160    110  3.9   2.62  16.5     0     1     4     4 Mazda RX4  
 2  21   6      160    110  3.9   2.88  17.0     0     1     4     4 Mazda RX4 …
 3  22.8 4      108     93  3.85  2.32  18.6     1     1     4     1 Datsun 710 
 4  21.4 6      258    110  3.08  3.22  19.4     1     0     3     1 Hornet 4 D…
 5  18.7 8      360    175  3.15  3.44  17.0     0     0     3     2 Hornet Spo…
 6  18.1 6      225    105  2.76  3.46  20.2     1     0     3     1 Valiant    
 7  14.3 8      360    245  3.21  3.57  15.8     0     0     3     4 Duster 360 
 8  24.4 4      147.    62  3.69  3.19  20       1     0     4     2 Merc 240D  
 9  22.8 4      141.    95  3.92  3.15  22.9     1     0     4     2 Merc 230   
10  19.2 6      168.   123  3.92  3.44  18.3     1     0     4     4 Merc 280   
# ℹ 22 more rows
# ℹ 2 more variables: engine_shape <fct>, transmission <fct>
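
The recoding of `vs` and `am` above relies on indexing a named character vector by the codes converted to strings. As a minimal sketch of that idiom (the vector `vs_codes` below is made up for illustration and not part of the original data), the same result can also be obtained with `factor()` and explicit `levels` and `labels`:

```r
# lookup vector: names are the numeric codes (as strings), values are the labels
shape_labels <- c("0" = "V-shaped", "1" = "straight")

# hypothetical codes, standing in for a column such as mtcars$vs
vs_codes <- c(0, 1, 1, 0)

# index the lookup vector by name, then convert to a factor
engine_shape <- factor(unname(shape_labels[as.character(vs_codes)]))

# alternative: map codes to labels directly in factor()
engine_shape2 <- factor(vs_codes, levels = c(0, 1),
                        labels = c("V-shaped", "straight"))

# both approaches assign the same label to each element
identical(as.character(engine_shape), as.character(engine_shape2))
```

The second form also fixes the order of the factor levels explicitly, whereas the first one orders them alphabetically.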

Create figure

Load packages

library(ggplot2)
library(ggExtra)
library(ggrepel)

Plot

Let’s start with a simple scatter plot that we want to annotate with marginal distributions and labels.

# create base plot
p0 <- ggplot(data = mycars, mapping = aes(x = mpg, y = hp)) +
    geom_point() +
    labs(x = "Fuel efficiency (miles/gallon)", y = "Gross horsepower") +
    theme_bw(20) +
    theme(panel.grid = element_blank(),
          legend.position = "bottom")
p0

Now we add marginal histograms to show the distribution of the observations when projected onto the x- or y-axis.

# ... with marginal histograms
ggMarginal(p0, type = "histogram", margins = "both", size = 4, bins = 7)
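
The call above applies the same settings to both margins. As a hedged sketch of two variations, the `margins` argument can restrict the marginal plot to one axis, and the `xparams` and `yparams` arguments of `ggMarginal()` take lists of settings for each margin separately. The plot `p` and the bin counts below are chosen only for illustration:

```r
library(ggplot2)
library(ggExtra)

# small stand-alone scatter plot (not the styled `p0` from above)
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
    geom_point()

# histogram only along the x-axis (top margin)
pm_x <- ggMarginal(p, type = "histogram", margins = "x", bins = 7)

# different numbers of bins for the two margins
pm_xy <- ggMarginal(p, type = "histogram", margins = "both",
                    xparams = list(bins = 10),
                    yparams = list(bins = 5))
pm_xy
```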

We can also use density plots instead of histograms, and group our data, so that the distribution of each group is shown separately.

# ... with marginal density plots by number of cylinders
ggMarginal(p0 +
               aes(fill = cyl, col = cyl) +
               labs(fill = "#cylinders:",
                    color = "#cylinders:"),
           type = "density", margins = "both", size = 4,
           groupFill = TRUE)

Finally, we use violin plots to show marginal distributions and add labels to our observations.

# ... with marginal violin and labelled data points
ggMarginal(p0 +
               geom_text_repel(mapping = aes(label = model),
                               color = "black") +
               aes(fill = cyl, col = cyl) +
               labs(fill = "#cylinders:",
                    color = "#cylinders:"),
           type = "violin", margins = "both", size = 4,
           groupFill = TRUE)
Warning: ggrepel: 7 unlabeled data points (too many overlaps). Consider
increasing max.overlaps
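
The warning indicates that ggrepel dropped labels it could not place without too many overlaps. Two possible ways to deal with this are sketched below, assuming a plain scatter plot rather than the full figure: raising `max.overlaps` so that all labels are drawn, or labelling only a subset of the points. The threshold `hp > 200`, the text size and the `seed` value are arbitrary choices for illustration:

```r
library(ggplot2)
library(ggrepel)

cars <- mtcars
cars$model <- rownames(cars)

# option 1: never drop labels, regardless of how many overlaps remain
# (a fixed seed makes the label placement reproducible)
p_all <- ggplot(cars, aes(x = mpg, y = hp, label = model)) +
    geom_point() +
    geom_text_repel(max.overlaps = Inf, size = 3, seed = 42)

# option 2: label only selected points (here: cars with more than 200 hp)
p_some <- ggplot(cars, aes(x = mpg, y = hp)) +
    geom_point() +
    geom_text_repel(data = subset(cars, hp > 200),
                    mapping = aes(label = model), seed = 42)

p_all
p_some
```

Either plot can be passed to `ggMarginal()` in the same way as `p0` above.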

Remarks

  • ggMarginal() needs either a ggplot2 plot object that contains a geom_point() layer (argument p, as used above), or the arguments data, x and y, from which it creates the scatter plot itself.
  • ggMarginal supports different types of marginals (type argument): density, histogram, boxplot, violin, densigram.
  • ggrepel provides two main functions for adding labels to a ggplot2 figure: geom_text_repel() (used above) adds only the text, whereas geom_label_repel() additionally draws a rectangle underneath the text, which may make the labels easier to read on crowded plots.
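
The first and last remarks can be illustrated with a short sketch. It assumes the unmodified `mtcars` data with an added `model` column (the object name `cars` is chosen for this example only) and uses the `data`, `x` and `y` arguments of `ggMarginal()`, with column names given as strings, as well as `geom_label_repel()`:

```r
library(ggplot2)
library(ggExtra)
library(ggrepel)

cars <- mtcars
cars$model <- rownames(cars)

# no plot object: ggMarginal() builds the scatter plot from data, x and y
# (any of the supported marginal types can be used instead of "boxplot")
pm_data <- ggMarginal(data = cars, x = "mpg", y = "hp", type = "boxplot")
pm_data

# labels drawn inside rectangles instead of plain text
p_label <- ggplot(cars, aes(x = mpg, y = hp, label = model)) +
    geom_point() +
    geom_label_repel(max.overlaps = Inf, size = 3)
p_label
```

A plot created through the `data` interface uses default styling, so passing a customized plot object via `p` is usually the more flexible route.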

Session info

sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.5.2 (2025-10-31)
 os       macOS Sequoia 15.7.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Zurich
 date     2025-11-03
 pandoc   3.6.4 @ /usr/local/bin/ (via rmarkdown)
 quarto   1.7.28 @ /usr/local/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 package      * version date (UTC) lib source
 cli            3.6.5   2025-04-23 [1] CRAN (R 4.5.0)
 dichromat      2.0-0.1 2022-05-02 [1] CRAN (R 4.5.0)
 digest         0.6.37  2024-08-19 [1] CRAN (R 4.5.0)
 dplyr          1.1.4   2023-11-17 [1] CRAN (R 4.5.0)
 evaluate       1.0.5   2025-08-27 [1] CRAN (R 4.5.1)
 farver         2.1.2   2024-05-13 [1] CRAN (R 4.5.0)
 fastmap        1.2.0   2024-05-15 [1] CRAN (R 4.5.0)
 generics       0.1.4   2025-05-09 [1] CRAN (R 4.5.0)
 ggExtra      * 0.11.0  2025-09-01 [1] CRAN (R 4.5.0)
 ggplot2      * 4.0.0   2025-09-11 [1] CRAN (R 4.5.0)
 ggrepel      * 0.9.6   2024-09-07 [1] CRAN (R 4.5.0)
 glue           1.8.0   2024-09-30 [1] CRAN (R 4.5.0)
 gtable         0.3.6   2024-10-25 [1] CRAN (R 4.5.0)
 htmltools      0.5.8.1 2024-04-04 [1] CRAN (R 4.5.0)
 htmlwidgets    1.6.4   2023-12-06 [1] CRAN (R 4.5.0)
 httpuv         1.6.16  2025-04-16 [1] CRAN (R 4.5.0)
 jsonlite       2.0.0   2025-03-27 [1] CRAN (R 4.5.0)
 knitr          1.50    2025-03-16 [1] CRAN (R 4.5.0)
 labeling       0.4.3   2023-08-29 [1] CRAN (R 4.5.0)
 later          1.4.4   2025-08-27 [1] CRAN (R 4.5.1)
 lifecycle      1.0.4   2023-11-07 [1] CRAN (R 4.5.0)
 magrittr       2.0.4   2025-09-12 [1] CRAN (R 4.5.1)
 mime           0.13    2025-03-17 [1] CRAN (R 4.5.0)
 miniUI         0.1.2   2025-04-17 [1] CRAN (R 4.5.0)
 otel           0.2.0   2025-08-29 [1] CRAN (R 4.5.0)
 pillar         1.11.1  2025-09-17 [1] CRAN (R 4.5.0)
 pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 4.5.0)
 promises       1.5.0   2025-11-01 [1] CRAN (R 4.5.0)
 R6             2.6.1   2025-02-15 [1] CRAN (R 4.5.0)
 RColorBrewer   1.1-3   2022-04-03 [1] CRAN (R 4.5.0)
 Rcpp           1.1.0   2025-07-02 [1] CRAN (R 4.5.0)
 rlang          1.1.6   2025-04-11 [1] CRAN (R 4.5.0)
 rmarkdown      2.30    2025-09-28 [1] CRAN (R 4.5.0)
 rstudioapi     0.17.1  2024-10-22 [1] CRAN (R 4.5.0)
 S7             0.2.0   2024-11-07 [1] CRAN (R 4.5.0)
 scales         1.4.0   2025-04-24 [1] CRAN (R 4.5.0)
 sessioninfo    1.2.3   2025-02-05 [1] CRAN (R 4.5.0)
 shiny          1.11.1  2025-07-03 [1] CRAN (R 4.5.0)
 tibble       * 3.3.0   2025-06-08 [1] CRAN (R 4.5.0)
 tidyselect     1.2.1   2024-03-11 [1] CRAN (R 4.5.0)
 utf8           1.2.6   2025-06-08 [1] CRAN (R 4.5.0)
 vctrs          0.6.5   2023-12-01 [1] CRAN (R 4.5.0)
 withr          3.0.2   2024-10-28 [1] CRAN (R 4.5.0)
 xfun           0.54    2025-10-30 [1] CRAN (R 4.5.0)
 xtable         1.8-4   2019-04-21 [1] CRAN (R 4.5.0)
 yaml           2.3.10  2024-07-26 [1] CRAN (R 4.5.0)

 [1] /Users/stadler/Library/R/arm64/4.5/library/__bioc322
 [2] /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library
 * ── Packages attached to the search path.

──────────────────────────────────────────────────────────────────────────────