Let’s start with a simple scatter plot that we want to annotate with marginal distributions and labels.
Code
# create base plot
p0 <- ggplot(data = mycars, mapping = aes(x = mpg, y = hp)) +
  geom_point() +
  labs(x = "Fuel efficiency (miles/gallon)", y = "Gross horsepower") +
  theme_bw(20) +
  theme(panel.grid = element_blank(),
        legend.position = "bottom")
p0
Now we add marginal histograms to show the distribution of the observations projected onto the x- and y-axes.
Code
# ... with marginal histograms
ggMarginal(p0, type = "histogram", margins = "both", size = 4, bins = 7)
We can also use density plots instead of histograms, and group our data, so that the distribution of each group is shown separately.
Code
# ... with marginal density plots by number of cylinders
ggMarginal(p0 +
             aes(fill = cyl, col = cyl) +
             labs(fill = "#cylinders:", color = "#cylinders:"),
           type = "density", margins = "both", size = 4,
           groupFill = TRUE)
Finally, we use violin plots to show marginal distributions and add labels to our observations.
Code
# ... with marginal violin and labelled data points
ggMarginal(p0 +
             geom_text_repel(mapping = aes(label = model),
                             color = "black") +
             aes(fill = cyl, col = cyl) +
             labs(fill = "#cylinders:", color = "#cylinders:"),
           type = "violin", margins = "both", size = 4,
           groupFill = TRUE)
Warning: ggrepel: 7 unlabeled data points (too many overlaps). Consider
increasing max.overlaps
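The warning means that ggrepel dropped some labels rather than draw them on top of each other. A minimal sketch of one way to keep every label is to raise `max.overlaps`. The built-in `mtcars` data set is used here as a stand-in for `mycars` (an assumption, with the model names taken from its row names), and `cars` and `p_labelled` are illustrative names.

```r
library(ggplot2)
library(ggrepel)

# Assumption: mtcars stands in for mycars; its row names hold the model names
cars <- data.frame(model = rownames(mtcars), mtcars)

p_labelled <- ggplot(cars, aes(x = mpg, y = hp)) +
  geom_point() +
  # max.overlaps = Inf asks ggrepel to keep all labels, however crowded
  geom_text_repel(aes(label = model), max.overlaps = Inf)
p_labelled
```

On a dense plot this trades the warning for more clutter, so a finite value larger than the default may be the better compromise.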
Remarks
ggMarginal() accepts either a ggplot2 scatter plot, i.e. a plot object with a geom_point() layer (argument p, as used above), or the arguments data, x and y, from which it builds the scatter plot itself.
ggMarginal supports different types of marginals (type argument): density, histogram, boxplot, violin, densigram.
ggrepel provides two main functions for adding labels to a ggplot2 figure: geom_text_repel() (used above) adds only the label text, while geom_label_repel() additionally draws a rectangle underneath the text, which can make the labels easier to read on crowded plots.
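The remarks above can be sketched in a few lines. As an assumption, the built-in `mtcars` data set again stands in for `mycars`, and the object names (`cars`, `m_data`, `p_box`, `m_box`) are illustrative only.

```r
library(ggplot2)
library(ggExtra)
library(ggrepel)

# Assumption: mtcars stands in for mycars; its row names hold the model names
cars <- data.frame(model = rownames(mtcars), mtcars)

# No plot object: pass the data and the column names (as strings) instead.
# "densigram" overlays a density curve on a histogram.
m_data <- ggMarginal(data = cars, x = "mpg", y = "hp", type = "densigram")
m_data

# geom_label_repel() draws a box underneath each label
p_box <- ggplot(cars, aes(x = mpg, y = hp)) +
  geom_point() +
  geom_label_repel(aes(label = model), max.overlaps = Inf)

# Marginal boxplots on the labelled scatter plot
m_box <- ggMarginal(p_box, type = "boxplot")
m_box
```

Because no bin settings are given, the densigram call falls back to the default histogram bins and ggplot2 prints a message suggesting a better value.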