ggplot2 3.2.0

May 19, 2019

We’re thrilled to announce the release of ggplot2 3.2.0 on CRAN. ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

The 3.2.0 release is a minor release which focuses on performance improvements, but also includes a few new features and a range of bug fixes. There are no breaking changes in this release, though a few changes may affect the visual appearance of plots in minor ways, noted below. Further, there are a few changes that developers of extension packages may need to take into account, but which won’t affect regular users. For a full overview of the changes, please see the release notes.

This release also includes a range of contributions made during our tidyverse developer day in Austin, many from first-time contributors. We hope these contributors have been inspired to continue taking part in the development of ggplot2.

Lastly, this release also sees the entrance of Hiroaki Yutani (yutannihilation on both GitHub and Twitter) into the core developer team. Hiroaki has been amazing in tackling large and small issues, and we are very lucky to have him.

Performance

A large part of this release relates to internal changes intended to speed up ggplot2. Most of these are general and will affect the rendering speed of all the plots you make, but there has also been a specific focus on geom_sf(). This is all part of a larger effort to make plotting in R more performant, and also includes changes in gtable, sf, and R itself. Make sure to use the latest version of all of these packages to get the full benefit of the improvements.

geom_sf

While most changes are general, geom_sf() has received special attention. Most of this comes down to how geom_sf() converted its input to grobs, which was done row by row. This meant that plotting 10,000 ST_POINT objects would create 10,000 point grobs instead of a single grob containing 10,000 points. The same was true for all other data types. With the new release, ggplot2 will try to pack all objects into a single grob, which is possible when all objects are of the same type (MULTI* types can be mixed with scalar types). Packing polygons into a single grob is only possible with R 3.6.0 and upwards, but all other types are backwards compatible with older versions of R. If the data contains a mix of types, ggplot2 will fall back to creating a grob for each row, but this is a much less frequent situation, and can usually be remedied by creating multiple sf layers instead.

The other big performance bottleneck in plotting sf data was the normalization of the coordinates. As sf stores its data in nested lists, the standard vectorization in R doesn’t apply, which led to much worse performance compared to normalizing data stored in standard data frame format. The latest release of sf includes optimized functions for these operations implemented in C which ggplot2 now uses, so plotting performance has improved immensely.
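For the mixed-type case mentioned above, splitting the data into one sf layer per geometry type could look roughly like the sketch below. The `mixed` object and its contents are made up for illustration, and no timings are claimed here; the point is only to show how each layer ends up holding a single geometry type.

```r
library(ggplot2)
library(sf)

# Illustrative data only: an sf object that mixes points and a line
mixed <- st_sf(
  name = c("p1", "p2", "l1"),
  geometry = st_sfc(
    st_point(c(0, 0)),
    st_point(c(1, 1)),
    st_linestring(rbind(c(0, 1), c(1, 0)))
  )
)

# Split the rows by geometry type so each layer contains one type only
types <- st_geometry_type(mixed)
pts <- mixed[types == "POINT", ]
lns <- mixed[types == "LINESTRING", ]

# One geom_sf() layer per geometry type
p <- ggplot() +
  geom_sf(data = lns) +
  geom_sf(data = pts)
```

Printing `p` draws the lines first and the points on top of them.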

New features

With this release, ggplot2 gains the ability to plot polygons with holes (only in R 3.6 and later). Rather than providing a new geom, the functionality is built into geom_polygon() through a new subgroup aesthetic. Much as the group aesthetic separates polygons in the data, the subgroup aesthetic separates parts of each polygon. The first occurring subgroup should always be the outer ring of the polygon and any subsequent subgroup will describe holes in this (or, potentially, polygons within the holes and so on). There will not be any checks performed on the position of the subgroups, so it is the responsibility of the user to make sure they are inside the outer ring.

library(ggplot2)
library(tibble)
radians <- seq(0, 2 * pi, length.out = 101)[-1]
circle <- tibble(
  x = c(cos(radians), cos(radians) * 0.5),
  y = c(sin(radians), sin(radians) * 0.5),
  subgroup = rep(c(1, 2), each = 100)
)
ggplot(circle) + 
  geom_polygon(
    aes(x = x, y = y, subgroup = subgroup), 
    fill = "firebrick", 
    colour = "black"
  )
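Since subgroup separates the parts of each polygon while group separates the polygons themselves, the two can be combined. The following is a minimal sketch with two ring-shaped polygons; the `ring()` helper and the `donuts` data are hypothetical names introduced for this example, and rendering the holes still requires R 3.6 or later.

```r
library(ggplot2)

radians <- seq(0, 2 * pi, length.out = 101)[-1]

# Hypothetical helper: the outline of a circle centred at (cx, 0)
ring <- function(cx, r, group, subgroup) {
  data.frame(
    x = cx + r * cos(radians),
    y = r * sin(radians),
    group = group,
    subgroup = subgroup
  )
}

# Within each group, the outer ring comes first and the hole second
donuts <- rbind(
  ring(0, 1.0, "a", 1), ring(0, 0.5, "a", 2),
  ring(3, 1.0, "b", 1), ring(3, 0.5, "b", 2)
)

p <- ggplot(donuts) +
  geom_polygon(
    aes(x = x, y = y, group = group, subgroup = subgroup, fill = group),
    colour = "black"
  ) +
  coord_fixed()
```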

The other major new feature is the ability to modify the guide representation of a layer through a new key_glyph argument in all geoms. While the defaults are often fine, there are situations where a different look serves the visualisation:

ggplot(economics, aes(date, psavert, color = "savings rate")) + 
  geom_line(key_glyph = "timeseries")
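As a further sketch, the glyph can be swapped on other geoms too. The example below assumes, as described in the ggplot2 documentation for key_glyph, that the argument accepts either the name of a `draw_key_*()` function without its prefix or the function itself; the choice of the `mpg` data and the rect glyph is only for illustration.

```r
library(ggplot2)

# Glyph given by name: "rect" refers to draw_key_rect()
p_string <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point(key_glyph = "rect")

# The same glyph given as a function
p_function <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point(key_glyph = draw_key_rect)
```

Both plots should show filled rectangles in the legend in place of the default point keys.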

geom_rug() has seen a range of improvements to give users more control over the appearance of the rug lines. The length of the rug lines can now be controlled, and the lines can also be placed outside the plotting region:

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() + 
  geom_rug(sides = "tr", length = unit(1, "mm"), outside = TRUE) + 
  # Need to turn clipping off if rug is outside plot area
  coord_cartesian(clip = "off")

Aesthetics will now accept a function returning NULL, and treat it as setting the aesthetic to NULL. This can make it easier to program with ggplot2 e.g. by catching errors from non-existing variables:

df <- data.frame(x = 1:10, y = 1:10)
wrap <- function(...) tryCatch(..., error = function(e) NULL)

ggplot(df, aes(x, y, colour = wrap(no_such_column))) +
  geom_point()

Lastly, stat_function() gains the ability to accept purrr-style lambda functions. The use of the formula notation for creating lambda functions has become widespread, and it is only natural that ggplot2 accepts it as well:

df <- data.frame(x = 1:10, y = (1:10)^2)
ggplot(df, aes(x, y)) + 
  geom_point() +
  stat_function(fun = ~ .x^2)
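For readers less familiar with the formula notation, the lambda above is shorthand for an ordinary one-argument function. The sketch below uses `rlang::as_function()` to show the conversion; that ggplot2 performs the conversion in exactly this way internally is an assumption here, and `square` is a name introduced for the example.

```r
library(ggplot2)

df <- data.frame(x = 1:10, y = (1:10)^2)

# The formula ~ .x^2 corresponds to this regular function
square <- function(x) x^2

# rlang::as_function() turns a formula lambda into a callable function
lambda <- rlang::as_function(~ .x^2)

# Both layers should draw the same curve
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  stat_function(fun = square) +
  stat_function(fun = ~ .x^2, linetype = "dashed")
```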

Minor fixes and improvements

coord_sf() now behaves the same as the other coords in relation to how it draws grid lines. This means that the aesthetics of the grid match those of the other coordinate systems, and that the grid can be turned off through the theming system. You may see slight visual changes when using geom_sf(), as the default grid lines were slightly thicker in coord_sf() prior to this change.

library(sf) 
#> Linking to GEOS 3.6.1, GDAL 2.1.3, PROJ 4.9.3

nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE) 

ggplot(nc) +
  geom_sf(aes(fill = AREA)) +
  theme_void()

The automatic naming of scales has been refined and no longer contains back-ticks when the scale name is based on a complex aesthetic expression (e.g. aes(x = a + b)). Again, this may result in slight changes to the visual appearance of plots, but only on a very superficial level.
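As a small illustration, the sketch below maps an expression to the x aesthetic, once relying on the automatic scale name and once overriding it with labs(). The variables come from the built-in `mtcars` data and the label text is arbitrary.

```r
library(ggplot2)

# The x axis title is derived from the expression hp / wt
p_default <- ggplot(mtcars, aes(x = hp / wt, y = mpg)) +
  geom_point()

# An explicit name sidesteps the automatic naming altogether
p_named <- p_default +
  labs(x = "Horsepower per 1000 lbs")
```

Setting the name explicitly is also a simple way to keep axis titles stable if the automatic naming changes again in a later release.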

Acknowledgements

Thank you to the 171 people who contributed issues, code and comments to this release: @abl0719, @agila5, @ahmohamed, @amysheep, @andhamel, @anthonytw, @Atomizer15, @atusy, @baderstine, @bakaburg1, @batpigandme, @bdschwartzkopf, @beckymaust, @behrman, @bfgray3, @billdenney, @Bisaloo, @bjreisman, @blueskypie, @brfitzpatrick, @brianwdavis, @brodieG, @caayala, @cderv, @chambox, @clairemcwhite, @clauswilke, @codetrainee, @ColinFay, @connorlewis, @coolbutuseless, @courtiol, @cschwarz-stat-sfu-ca, @cystein, @dan-booth, @daniel-wells, @DanielReedOcean, @danielsjf, @daranzolin, @dariyasydykova, @dempseynoel, @dipterix, @dirkschumacher, @dkahle, @dominicroye, @dongzhuoer, @dpseidel, @dvcv, @efehandanisman, @Eisit, @Eli-Berkow, @eliocamp, @ellessenne, @eoppe1022, @felipegerard, @fereshtehizadi, @flying-sheep, @fmmattioni, @foehnwind, @fostvedt, @gagnagaman, @gangstertiny, @GBuchanon, @gdkrmr, @gdmcdonald, @ghost, @gibran-ali, @hadley, @HaoLi111, @has2k1, @heavywatal, @Henrik-P, @HenrikBengtsson, @hlendway, @hoasxyz, @hwt, @hyiltiz, @idavydov, @Ilia-Kosenkov, @IndrajeetPatil, @ismayc, @JamesCuster, @jarauh, @JayVii, @jmarshallnz, @JohnMount, @jonoyuan, @jprice80, @jpritikin, @jrnold, @jsekamane, @jtelleria, @jugularvein, @karawoo, @Katiedaisey, @katossky, @kdarras, @kendonB, @kilterwind, @LDalby, @linzi-sg, @lionel-, @llendway, @llrs, @lpantano, @LuffyLuffy, @lwjohnst86, @lz1nwm, @m-macaskill, @malcolmbarrett, @martin-ueding, @matthewParksViome, @maxheld83, @MaximOtt, @mbertolacci, @mcguinlu, @MCM-Math, @miguelmorin, @mine-cetinkaya-rundel, @mitchelloharawild, @mluerig, @moodymudskipper, @njtierney, @noahmotion, @npjc, @oranwutang, @osorensen, @pachamaltese, @paleolimbot, @PaulLantos, @peterhurford, @phmarek, @powerbilayeredmap, @ppanko, @ptoche, @puhachov, @rajkstats, @richierocks, @sabahzero, @sahilseth, @SavasAli, @sctyner, @sebneus, @shauyin520, @sjackman, @skanskan, @slowkow, @smouksassi, @sn248, @sowla, @sschloss1, @StefanBRas, @steffilazerte, @tcastrosantos, @thomasp85, @topepo, @Torvaney, @touala, @tungmilan, @vanzanden, @vbuchhold, @W-L, @wongjingping, @wrightaprilm, @x1o, @yosukefk, @yudong862, @yutannihilation, @zeehio, @zlskidmore, and @zuccaval.
