Unnest expands a list-column containing data frames into rows and columns.
Usage
unnest(
  data,
  cols,
  ...,
  keep_empty = FALSE,
  ptype = NULL,
  names_sep = NULL,
  names_repair = "check_unique",
  .drop = deprecated(),
  .id = deprecated(),
  .sep = deprecated(),
  .preserve = deprecated()
)
Arguments
- data
A data frame.
- cols
<tidy-select> List-columns to unnest.
When selecting multiple columns, values from the same row will be recycled to their common size.
- ...
Deprecated: previously you could write df %>% unnest(x, y, z). Convert to df %>% unnest(c(x, y, z)).
If you previously created a new variable in unnest() you'll now need to do it explicitly with mutate(). Convert df %>% unnest(y = fun(x, y, z)) to df %>% mutate(y = fun(x, y, z)) %>% unnest(y).
- keep_empty
By default, you get one row of output for each element of the list that you are unchopping/unnesting. This means that if there's a size-0 element (like NULL or an empty data frame or vector), then that entire row will be dropped from the output. If you want to preserve all rows, use keep_empty = TRUE to replace size-0 elements with a single row of missing values.
- ptype
Optionally, a named list of column name-prototype pairs to coerce cols to, overriding the default that will be guessed from combining the individual values. Alternatively, a single empty ptype can be supplied, which will be applied to all cols.
- names_sep
If NULL, the default, the outer names will come from the inner names. If a string, the outer names will be formed by pasting together the outer and the inner column names, separated by names_sep.
- names_repair
Used to check that the output data frame has valid names. Must be one of the following options:
"minimal": no name repair or checks, beyond basic existence,
"unique": make sure names are unique and not empty,
"check_unique": (the default), no name repair, but check they are unique,
"universal": make the names unique and syntactic,
a function: apply custom name repair.
tidyr_legacy: use the name repair from tidyr 0.8.
a formula: a purrr-style anonymous function (see rlang::as_function())
See vctrs::vec_as_names() for more details on these terms and the strategies used to enforce them.
- .drop, .preserve
Deprecated: all list-columns are now preserved. If there are any that you don't want in the output, use select() to remove them prior to unnesting.
- .id
Deprecated: convert df %>% unnest(x, .id = "id") to df %>% mutate(id = names(x)) %>% unnest(x).
- .sep
Deprecated: use names_sep instead.
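The effect of names_sep and names_repair is easiest to see when an inner column name clashes with an outer one. The sketch below is illustrative only: the data frame df and its column names are made up for the example, and it assumes tidyr 1.0.0 or later is installed.

```r
library(tidyr)

# Hypothetical data: the outer column `a` clashes with the inner column `a`.
df <- tibble(
  a = 1:2,
  y = list(
    tibble(a = 10, b = 20),
    tibble(a = c(30, 31), b = c(40, 41))
  )
)

# names_sep: inner names are prefixed with the outer name, so no clash occurs.
with_sep <- df %>% unnest(y, names_sep = "_")
names(with_sep)

# names_repair = "unique": the duplicated names are made unique instead.
repaired <- df %>% unnest(y, names_repair = "unique")
names(repaired)

# The default, names_repair = "check_unique", performs no repair and
# signals an error for the duplicated name.
default_fails <- inherits(try(unnest(df, y), silent = TRUE), "try-error")
default_fails
```

Using names_sep keeps the origin of each column visible in its name, whereas names_repair only guarantees that the resulting names are valid.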
New syntax
tidyr 1.0.0 introduced a new syntax for nest() and unnest() that's designed to be more similar to other functions. Converting to the new syntax should be straightforward (guided by the message you'll receive) but if you just need to run an old analysis, you can easily revert to the previous behaviour using nest_legacy() and unnest_legacy() as follows:
library(tidyr)
nest <- nest_legacy
unnest <- unnest_legacy
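For code that should move to the new syntax instead, the conversions described under Arguments might look like the following sketch. The data frame df, its columns, and the use of base R's Map() to build the new variable are assumptions made for illustration; dplyr is assumed to be installed for mutate() and select().

```r
library(tidyr)
library(dplyr)  # assumed available; provides mutate() and select()

# Hypothetical data with list-columns whose elements have matching sizes.
df <- tibble(
  g = c("a", "b"),
  x = list(1:2, 3L),
  y = list(c("p", "q"), "r"),
  extra = list(letters, LETTERS)  # a list-column we do not want in the output
)

# Old syntax: df %>% unnest(x, y)
# New syntax: wrap the columns in c(); unwanted list-columns are removed
# with select() first, because all list-columns are now preserved.
both <- df %>%
  select(-extra) %>%
  unnest(c(x, y))

# Old syntax: df %>% unnest(z = Map(paste, x, y))
# New syntax: create the variable with mutate(), then unnest it.
created <- df %>%
  mutate(z = Map(paste, x, y)) %>%
  select(g, z) %>%
  unnest(z)
```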
See also
Other rectangling: hoist(), unnest_longer(), unnest_wider()
Examples
# unnest() is designed to work with lists of data frames
df <- tibble(
  x = 1:3,
  y = list(
    NULL,
    tibble(a = 1, b = 2),
    tibble(a = 1:3, b = 3:1, c = 4)
  )
)
# unnest() recycles input rows for each row of the list-column
# and adds a column for each column
df %>% unnest(y)
#> # A tibble: 4 × 4
#>       x     a     b     c
#>   <int> <dbl> <dbl> <dbl>
#> 1     2     1     2    NA
#> 2     3     1     3     4
#> 3     3     2     2     4
#> 4     3     3     1     4
# input rows with 0 rows in the list-column will usually disappear,
# but you can keep them (generating NAs) with keep_empty = TRUE:
df %>% unnest(y, keep_empty = TRUE)
#> # A tibble: 5 × 4
#>       x     a     b     c
#>   <int> <dbl> <dbl> <dbl>
#> 1     1    NA    NA    NA
#> 2     2     1     2    NA
#> 3     3     1     3     4
#> 4     3     2     2     4
#> 5     3     3     1     4
# Multiple columns ----------------------------------------------------------
# You can unnest multiple columns simultaneously
df <- tibble(
  x = 1:2,
  y = list(
    tibble(a = 1, b = 2),
    tibble(a = 3:4, b = 5:6)
  ),
  z = list(
    tibble(c = 1, d = 2),
    tibble(c = 3:4, d = 5:6)
  )
)
df %>% unnest(c(y, z))
#> # A tibble: 3 × 5
#>       x     a     b     c     d
#>   <int> <dbl> <dbl> <dbl> <dbl>
#> 1     1     1     2     1     2
#> 2     2     3     5     3     5
#> 3     2     4     6     4     6
# Compare with unnesting one column at a time, which generates
# the Cartesian product
df %>%
  unnest(y) %>%
  unnest(z)
#> # A tibble: 5 × 5
#>       x     a     b     c     d
#>   <int> <dbl> <dbl> <dbl> <dbl>
#> 1     1     1     2     1     2
#> 2     2     3     5     3     5
#> 3     2     3     5     4     6
#> 4     2     4     6     3     5
#> 5     2     4     6     4     6
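The difference between the two results above comes down to row counts. As a rough check, the sketch below rebuilds the same data and compares the number of output rows with the sizes of the nested data frames; the helper names sizes_y, sizes_z, together and one_at_a_time are introduced here for illustration only.

```r
library(tidyr)

df <- tibble(
  x = 1:2,
  y = list(tibble(a = 1, b = 2), tibble(a = 3:4, b = 5:6)),
  z = list(tibble(c = 1, d = 2), tibble(c = 3:4, d = 5:6))
)

# Number of rows in each nested data frame, per input row.
sizes_y <- vapply(df$y, nrow, integer(1))
sizes_z <- vapply(df$z, nrow, integer(1))

together <- df %>% unnest(c(y, z))
one_at_a_time <- df %>% unnest(y) %>% unnest(z)

# Unnesting together pairs rows up, so each input row contributes its
# common size (here y and z have equal sizes in every row).
nrow(together) == sum(sizes_y)

# Unnesting one column at a time repeats the remaining list-column for
# every new row, so each input row contributes the product of the sizes.
nrow(one_at_a_time) == sum(sizes_y * sizes_z)
```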